# Universities and machine code



## Deleted member 53988 (Aug 8, 2018)

Hi,

graudeejs said the following about machine code in 2014:

"graudeejs wrote: Binary coding is used when assembler doesn't support some
instructions of CPU i.e. assembler is not new enough. So I wouldn't
say binary coding is dead. 

You basically code in asm and then write instruction as byte sequence. "

"apple wrote:While Herbert Schildt says Assembly creates other
problems, people say that nobody uses nowadays binary code, many of
you contradicted these people".

"graudeejs wrote: Who cares what While Herbert Schildt says? Not me.

However he is true, that in 99.999999% real world stuff binary code /
assembly is not used".

"apple wrote:If it is true that Assembly does not create too many
problems, binary code is not dead and other answers you have posted on
this topic which also contradicts several programming statements
including statements made by Herbert Schildt",

"apple wrote: perhaps the knowledge of computer programming you learned
is a the best in the world".

"graudeejs wrote: I seriously doubt".

"apple wrote:I asked twice where you learned computer programming and
unfortunately did not respond".

"graudeejs wrote: You didn't (at least to me")

"apple wrote:I do not regret not having responded because both appeared
graudeejs saying that binary code is not dead".

"graudeejs wrote: In my university there are still assembly lessons. I love assembly. In some other universities there are even binary coding lessons".

"apple wrote: I beg you to tell me where you learned computer
programming, for example, Assembly, C, etc".?

"graudeejs wrote: At home, in front of my computer. Where else? Not in university... lol"

Reference: http://archive-org.com/page/3491382...eebsd.org/viewtopic.php?f=34&t=42856&start=25

Is binary coding still used when the assembler doesn't support some CPU instructions, i.e. when the assembler is not new enough?

Is Herbert Schildt still right that binary code / assembly is not used in 99.999999% of real-world work?

Are there still binary coding lessons at some universities?

I wonder what Oko would say about it.


----------



## Crivens (Aug 8, 2018)

Apple is a well known troll around here. Citing him as a reference does no good.


----------



## Deleted member 53988 (Aug 8, 2018)

Crivens said:


> Apple is a well known troll around here. Citing him as a reference does no good.



EDIT: I also quoted graudeejs.


----------



## Crivens (Aug 8, 2018)

Ninja_Root said:


> EDIT: I also quoted graudeejs.


True, but quoting a notorious troll like apple can be considered meta-trolling. And when it comes to that guy, a lot of staff is quite trigger happy. This is just a PSA, please make your case without referencing that source.


----------



## Deleted member 53988 (Aug 8, 2018)

Crivens said:


> True, but quoting a notorious troll like apple can be considered meta-trolling. And when it comes to that guy, a lot of staff is quite trigger happy. This is just a PSA, please make your case without referencing that source.



Crivens,

My case is this:

Is binary coding still used when the assembler doesn't support some CPU instructions, i.e. when the assembler is not new enough?

Is Herbert Schildt still right that binary code / assembly is not used in 99.999999% of real-world work?

Are there still binary coding lessons at some universities?

I wonder what Oko would say about it.


----------



## Crivens (Aug 8, 2018)

Please do not mix up the generation of programs with their execution. Sometimes you need, even in assembler decks, to write an instruction in binary because the assembler does not know it.
Executing anything BUT binary is beyond a CPU, so every program execution involves binary, no matter how the program was built.


----------



## Deleted member 53988 (Aug 8, 2018)

Crivens said:


> Please do not mix up the generation of programs with their execution. Sometimes you need, even in assembler decks, to write an instruction in binary because the assembler does not know it.
> Executing anything BUT binary is beyond a CPU, so every program execution involves binary, no matter how the program was built.



Crivens,

Are there instructions in binary or machine code in FreeBSD development?

If yes, please quote examples.

Machine code is fun.


----------



## Crivens (Aug 8, 2018)

Terminology, this is not the same. Machine code is binary, not all binary is machine code. Hence the 'illegal instruction trap'.

The nearest to binary would be assembler source, with a 1:1 ratio of source statements to machine instructions.


----------



## Deleted member 53988 (Aug 8, 2018)

Crivens said:


> Terminology, this is not the same. Machine code is binary, not all binary is machine code. Hence the 'illegal instruction trap'.



Crivens,

Sorry, I meant binary code.

EDIT:



Crivens said:


> Please do not mix up the generation of programs with their execution. Sometimes you need, even in assembler decks, to write an instruction in binary because the assembler does not know it.
> Executing anything BUT binary is beyond a CPU, so every program execution involves binary, no matter how the program was built.



Crivens,

Are there instructions in *binary code* or *machine code* in FreeBSD development?

If yes, please quote examples.

Machine code is fun.


----------



## Beastie (Aug 8, 2018)

Ninja_Root said:


> Is binary coding still used when the assembler doesn't support some CPU instructions, i.e. when the assembler is not new enough?


This is very rare. Any up-to-date assembler (e.g. fasm, nasm, gas, etc.) should (theoretically, at least) support the complete set of instructions in its target architecture, including the most peculiar and rarely-used instructions. Modern operating systems like FreeBSD include modern and up-to-date toolchains.



Ninja_Root said:


> Is Herbert Schildt still right that binary code / assembly is not used in 99.999999% of real-world work?


In the end, everything from the bootloader to your interpreted Python scripts (or rather the interpreter itself) is made of assembly instructions, which translate into bits of zeroes and ones, which are basically made of streams of electric pulses.

Even if you build some C++ code, under the hood there's a compiler-assembler tandem that's converting the high-level language functions to machine code, even though the developer may not know the slightest thing about assembly.



Ninja_Root said:


> Are there still binary coding lessons at some universities?


Most provide a single course in assembly at best if any at all. IT is such a vast field that you could spend an entire lifetime learning. Universities have priorities to meet and must cater to the ever-evolving requirements of labor markets.



Ninja_Root said:


> Are there instructions in binary or machine code in FreeBSD development?
> 
> If yes, please quote examples.


Binary, I very much doubt. If you want to see assembly code, run `find /usr/src -name "*.s"` (quote the pattern so the shell doesn't expand it).

In modern times you'll find only a few operating systems written entirely in assembly (e.g. KolibriOS). Usually assembly is only used for things that cannot be done in higher-level languages or that need to be highly optimized (setting up protected/long mode, setting up the IDT, GDT and other such structures, fast memory moves, cryptographic computations, etc.). The rest is done in one or more high-level languages (e.g. C).


----------



## Freakbeat (Aug 8, 2018)

Ninja_Root Are you a bot?


----------



## Deleted member 53988 (Aug 9, 2018)

Freakbeat said:


> Ninja_Root Are you a bot?



Freakbeat, 

I am not a bot.

Why did you ask me that?


----------



## Maelstorm (Aug 9, 2018)

I guess that I will add my two cents.

Assembly is shorthand for machine code.  Machine code is just numbers, here viewed in hexadecimal.  Look at the following code:


```
0804a780 <__do_global_ctors_aux>:
 804a780:       55                      push   %ebp
 804a781:       89 e5                   mov    %esp,%ebp
 804a783:       53                      push   %ebx
 804a784:       83 ec 04                sub    $0x4,%esp
 804a787:       a1 00 b0 04 08          mov    0x804b000,%eax
 804a78c:       83 f8 ff                cmp    $0xffffffff,%eax
 804a78f:       74 12                   je     804a7a3 <__do_global_ctors_aux+0x23>
 804a791:       31 db                   xor    %ebx,%ebx
 804a793:       ff d0                   call   *%eax
 804a795:       8b 83 fc af 04 08       mov    0x804affc(%ebx),%eax
 804a79b:       83 eb 04                sub    $0x4,%ebx
 804a79e:       83 f8 ff                cmp    $0xffffffff,%eax
 804a7a1:       75 f0                   jne    804a793 <__do_global_ctors_aux+0x13>
 804a7a3:       83 c4 04                add    $0x4,%esp
 804a7a6:       5b                      pop    %ebx
 804a7a7:       5d                      pop    %ebp
 804a7a8:       c3                      ret
 804a7a9:       90                      nop
 804a7aa:       90                      nop
 804a7ab:       90                      nop
```

This was obtained by using `objdump -d` on a binary file.

The first column is the linear address of the actual instruction in hexadecimal.

The second column, with the variable number of hex digits, is the 32-bit x86 instructions in machine code shown as hexadecimal (if you want to see binary, just directly convert from hex to binary).  In general, the first byte is the opcode, the instruction itself.  Some simple instructions are one or two bytes.  Some instructions are longer because they are more complex (usually because of memory operands).  Here's a little tidbit: Intel x64 CPUs can have instructions which are 15 bytes long!  This is because Intel x86/x64 machines (AMD included) are CISC (Complex Instruction Set Computing) machines.  This is in contrast to RISC (Reduced Instruction Set Computing) machines such as SPARC, MIPS, ARM, AVR, etc., which use fixed fields in the instruction, so the instructions are a fixed size.  A MIPS32 machine always has 32-bit instructions.  This actually simplifies the fetch and decode logic in the CPU's execution datapath.

The third column is the human readable instruction.

The fourth column is the operands (parameters, data) to said instruction.

Using the file command on the same program gives this result:

```
binary: ELF 32-bit LSB executable, Intel 80386,
version 1 (FreeBSD), dynamically linked, interpreter /libexec/ld-elf.so.1,
for FreeBSD 9.0 (900044), not stripped
```
So the program is in machine code, even though it was written in C.

When it comes down to it, all CPUs, no matter what manufacturer, language, brand, family, instruction set, etc., speak binary: 1s and 0s, which in turn translate into electrical impulses that move through the logic circuits that perform the actual instructions.  So when adding two numbers together, the instruction decoder sets the gates to route the data through an adder (usually a carry-lookahead adder) that physically adds the numbers together.

Now when it comes to compilers (I took a course in compilers... interesting stuff): for x86/x64, compilers use only about 25 instructions or so.  That is interesting because Intel CPUs and their clones (AMD) support over 1,000 instructions.
Here's a short list:


nop
push/pop
call/ret
mov
add/sub/mul/div
and/or/not/xor
shl/shr/rol/ror
jmp
jnz/jz/je/jne/ja/jae/jb/jbe/js/jns/and a few more that I can't remember....
cmp/test
That's about it, I think.  FreeBSD is written mostly in C, but there is some assembler because C does not support certain things that the CPU might support.  For Intel x86/x64 CPUs, there is a separately addressable I/O bus which uses the IN and OUT instructions, which cannot be generated by the C compiler; you have to use assembly (either inline or in a separate .asm or .S file, platform dependent).  Another one is the CMPXCHG instruction, which is used for making spin locks (gcc and clang both have special builtin functions that will generate that instruction).

Hopefully this will answer your questions.


----------



## Deleted member 53988 (Aug 9, 2018)

Maelstorm said:


> Hopefully this will answer your questions.



Maelstorm,

This does not answer the following questions:

Is Herbert Schildt still right that binary code / assembly is not used in 99.999999% of real-world work?

Are there still binary coding lessons at some universities?


----------



## Crivens (Aug 9, 2018)

Maybe some still teach coding in binary, mostly in compiler construction. You have to emit binary at some stage. 

And please  don't try to ride that dead horse apple left when he got the troll treatment. Get sources of your own for these claims. Then we may discuss if Herbert is full of core dump or not. Now, we are only discussing if we shall close threads.


----------



## Freakbeat (Aug 9, 2018)

Ninja_Root said:


> Freakbeat,
> 
> I am not a bot.
> 
> Why did you ask me that?




You answer like a bot.


----------



## Deleted member 53988 (Aug 9, 2018)

Crivens said:


> And please  don't try to ride that dead horse apple left when he got the troll treatment. Get sources of your own for these claims. Then we may discuss if Herbert is full of core dump or not. Now, we are only discussing if we shall close threads.



It is written in the book C: The Complete Reference, Third Edition, by Herbert Schildt:

"As a general rule: do not use assembler, creates too many problems".

But is Herbert Schildt right that binary code / assembly is not used in 99.999999% of real-world work?


----------



## Crivens (Aug 9, 2018)

Depends... without the few lines of assembler in crt0.s, nothing out of a C compiler would work. So no python, sh, ...

He may be right for things done completely in assembler, but without some lines of it 99.999999% of code would not work.

I know of exactly one self-hosting OS which can do without assembler at all.


----------



## Maelstorm (Aug 10, 2018)

Crivens said:


> Maybe some still teach coding in binary, mostly in compiler construction. You have to emit binary at some stage.



Not true.  You only emit assembly language; the assembler makes the transition from what the compiler outputs to machine code.  Compilers have multiple stages of code analysis.  The lexical analyzer and parser are the front end.  They walk through the source code, generating the AST (abstract syntax tree), which is a series of interconnected nodes and lists.  Nodes are used for breaking down statements, such as assignments and such.  Lists are used for series of statements.  Once you have that, you walk down the tree and generate an internal representation of the code.  In our case, we used something called ILOC (Intermediate Language for Optimizing Compilers).  Then you do live-range analysis and register assignment.  Optimization can happen at any stage.  The professor who taught the class is one of the people who does optimization research using LLVM.



Ninja_Root said:


> It is written in the book C: The Complete Reference, Third Edition, by Herbert Schildt:
> 
> "As a general rule: do not use assembler, creates too many problems".
> 
> But is Herbert Schildt right that binary code / assembly is not used in 99.999999% of real-world work?



He's wrong, because as I stated previously, everything goes to binary code or the CPU will not understand it.  I have taken an assembly language class.  It's lower-division and it's a requirement for a degree in computer science.


----------



## Crivens (Aug 10, 2018)

Interesting, Maelstorm ... the C++ compiler I was involved with compiled directly to binary. If you needed an assembler file, that was done by disassembling the internal object code stream. Compilation is _much_ faster that way. The code generator/peephole optimizer is a bit more complex and harder to debug, but it is worth it. Does anyone remember 'draco' from Cris Gray, or the old Turbo Pascal systems? Try them in a VM, on modern hardware, and enjoy a compiler+IDE living in L1 cache.


----------



## drhowarddrfine (Aug 10, 2018)

I've often read what a terrible book the Schildt book is. I own it but have never really delved into it.

I also used to bootstrap minicomputers using toggle switches and programmed EPROMs with wires. I did assembly programming exclusively on embedded processors for many years. I would not consider any school that doesn't teach assembly to CS students trustworthy or respectable.


----------



## Deleted member 53988 (Aug 10, 2018)

Maelstorm,

Are there even binary coding lessons at some universities?


----------



## Crivens (Aug 10, 2018)

Please make up your mind if you mean binary or assembly.


----------



## Deleted member 53988 (Aug 10, 2018)

Crivens said:


> Please make up your mind if you mean binary or assembly.



Crivens,

I meant *binary coding.*

I wrote:

Are there *even binary coding lessons* at some universities?


----------



## Crivens (Aug 10, 2018)

Ok, those are usually part of VLSI design or computer architecture. With VLSI design, you will see binary code on a completely new level.


----------



## connchri (Aug 10, 2018)

Ninja_Root said:


> Maelstorm,
> 
> Are there even binary coding lessons at some universities?



I can't comment from a Computer Science course POV.  But I can say that, where I did my undergrad, we had a few lessons in assembly; I've still got the old book for the x86, which I bought second hand in 2008.  The ultimate aim, however, was to familiarise ourselves with it, along with C++, for when we went on to program Programmable Logic Controllers, debug C code, access registers that represent hardware inputs/outputs, etc.  It's not something I've looked at in great detail since, but if they offered this in an Electronic and Electrical Engineering course, I would hazard a guess that it would be covered in greater detail, including byte code, in a Computer Science degree.


----------



## Deleted member 53988 (Aug 10, 2018)

Crivens said:


> Please make up your mind if you mean binary or assembly.



Crivens,

I meant *binary coding.*

I wrote:

Are there *even binary coding lessons* at some universities?

*EDIT: Machine code is binary code, hexadecimal code, octal code...*


----------



## Crivens (Aug 10, 2018)

I remember being able to read Z80 code in hex dumps, and a friend of mine wrote 6502 code directly as hex. That is where you might start. Modern CPUs have too many instructions, sometimes more than the basic vocabulary of a schoolchild... there is no way you could read that fluently within a year, and there will be no courses at that level. There are plenty of 8-bit computers around to train with, and it will be fun.


----------



## Deleted member 53988 (Aug 11, 2018)

Crivens said:


> Terminology, this is not the same. Machine code is binary, not all binary is machine code. Hence the 'illegal instruction trap'.



Crivens,

You say that machine code is binary and at the same time say that not all binary is machine code.

This is a contradiction.


----------



## Crivens (Aug 11, 2018)

No. This is called logic.


----------



## Deleted member 53988 (Aug 11, 2018)

Crivens said:


> No. This is called logic.



Crivens,

Why is this called logic?

Please explain.


----------



## ralphbsz (Aug 11, 2018)

I've been writing software professionally for ~25 years now.  Typically in groups of anywhere from 5 to 300 people.

I've seen about 0.1 cases where we actually coded in binary (meaning we wrote instructions that the CPU executed, and we didn't use assembly but emitted the numeric instructions).  That was a project where the only way to get the required speed (image processing on an i386, which lacked sufficient registers) was to generate a routine on the fly and execute it, stuffing the constants and pointer offsets into the instruction stream.  And even this was not done by actually coding whole instruction sequences in binary.  Instead we wrote sample code in C++, compiled it with an assembly listing, then ran the assembly listing through awk to generate a version of the executable code that could be copied into a second C++ program as an array of integer constants (bytes that were the instructions), then modified that array programmatically, and executed it.

I've never heard of a case where the assembler is incapable of generating specific instructions.  That's just insane.  If that happens, deal with the people who wrote or sold the assembler harshly.

I've seen about 20 or 30 cases where we actually had to code in assembly.  This only happens extremely rarely, for bizarre performance optimizations (like having to use vector instructions that the optimizer doesn't want to use, because we know better), or for using atomic instruction primitives.  Even for those, we usually had compiler macros.

I have no idea whether and how instruction execution in binary (not assembly) would be taught.  It may happen in EE classes, as part of processor design (which is typically a VHDL/Verilog class).  It may happen for a few homework problems in a computer architecture class.  Even then, it will probably not use a real-world instruction set (those are way too complex for teaching), but an old or hypothetical instruction set (like IBM 360, Intel 808x or Z80, or MIX/MMIX).  In the 1980s, as part of the "operating systems" class, I had to do one or two homework problems in binary IBM 360 and Cyber 6xxx instructions, and a half dozen in IBM 360 assembly.

And if ninja_root asks another repetitive and inane question, I'll get seriously upset at him.


----------



## Maelstorm (Aug 11, 2018)

Yeah, I took a computer architecture course, and the advanced one as well.  In our semester project, we had to develop a 16-bit RISC CPU that was pipelined.  The class is long over with, so I'll attach the SVG file for the block diagram....or not since the forum does not allow SVG extension files...  But yeah, binary coding was something to see because of the signals.


----------



## Maelstorm (Aug 11, 2018)

Ninja_Root said:


> Crivens,
> 
> You say that machine code is binary and at the same time say that not all binary is machine code.
> 
> This is a contradiction.



Actually, it's not a contradiction because Crivens is right.  All machine code is binary, but the reverse is not true.  There is binary data....


----------



## Deleted member 53988 (Aug 11, 2018)

To end this topic:

Are there currently octal coding and decimal coding?


----------



## Crivens (Aug 11, 2018)

No. Those are *en*codings. Babylonians used base 60, if my memory serves me right. You may use any base you want, but the computer uses base 2. Only Intel uses 1.95 internally, in the Pentium FPU and P4 pipeline.


----------



## Maelstorm (Aug 11, 2018)

Consider this: any and all programs that a computer runs, and all data that a computer processes, are in binary, because a computer cannot use anything else.  So anything that gets onto a computer must be in binary format or the computer cannot make sense of it.  Even on modern computers, temperature and voltage readings (which are analog quantities) must be converted to digital (binary) format before the computer can understand them.  The characters in this post are all numbers: if ASCII is used, then A = 65.  It's how computers work.  Everything is a binary number.  No exceptions.  Since humans don't deal with binary too well, the programs on the computer convert the binary data into a format that humans can easily read.  I have written some of the software that does that.

You mentioned octal.  Octal is an old binary-type format where the bits are arranged into groups of three.  So you have the following if you use 421 binary:


```
000        0
001        1
010        2
011        3
100        4
101        5
110        6
111        7
```

Hexadecimal is basically the same thing as octal with grouping of 4 bits (a nibble) instead of three so the coding is 8421 binary:


```
0000        0        1000        8
0001        1        1001        9
0010        2        1010        A (10)
0011        3        1011        B (11)
0100        4        1100        C (12)       
0101        5        1101        D (13)
0110        6        1110        E (14)
0111        7        1111        F (15)
```

For decimal, you basically add up the position values where there's a 1 bit, and that's the number.  So 1010 has 1s in bit positions 2 and 4 (counting from 1 at the right), which correspond to the values 2 and 8, so 2 + 8 = 10, which is A in hex.


Note: Analog signals are continuously varying voltages/currents.  Digital is fixed at either Vcc or Gnd which is voltage or no voltage with respect to ground.  Devices known as ADCs (Analog to Digital Converters) convert a continuously varying analog signal into a digital number of so many bits.  A 12-bit ADC has 4096 steps between 0 and whatever Vcc is.  So when the input voltage is close to a step, the ADC reports the number that is associated with that step.


----------



## ralphbsz (Aug 11, 2018)

Crivens said:


> No. Those are *en*codings. Babylonians used base 6, if my memory serves me right. You may use any base you want, but the computer uses base 2. Only intel uses 1.95 internally in the pentium fpu and p4 pipeline.



At times like this, it would be good if the forum allowed us not just to "like" a post by giving it a thumbs-up, but also to give a post a laughing smiley.

On a serious note: a former colleague of mine was advocating that computers should stop working in binary and instead use trinary (each digit, a "trit", can store or process the values 0, 1 and 2).  From a hardware point of view, this is doable but extremely hard: every capacitor (memory cell) and transistor (gate, switch) would need to handle three voltages; flash has demonstrated that it can be done, although the circuit elements would become a little larger.  His argument was, however, not about electrical and space efficiency, but purely theoretical, and involves computer arithmetic: in number theory, there are lots of theorems that are true for odd primes (3, 5, 7, ...), and using one of those as the base of the number system would allow a lot of cross-checking and correctness proving in the arithmetic operations.  Sadly, he is a former colleague, because he lost his two battles with cancer.  Fortunately, he had a pleasant life (with friends, wine, ...) until near the end.


----------



## Crivens (Aug 11, 2018)

ralphbsz
Did you know that this FDIV thing was completely pointless? It was done to shrink the die and increase yield, but it failed because the die area was set by the I/O drivers on the border. The US outer border would not change if somehow some aliens stole Nevada, for example. So no gain, and a PR disaster.

When you are around here again let me know so I can invite you to a nice brew or so.


----------



## itsthosestonesman (Jun 6, 2019)

I'll try to answer your original question; it can be tough getting to grips with all the concepts involved in this.  I can give you a real example of "binary programming" from something I have worked on.  Intel CPUs have some special streaming SIMD ("multimedia") instructions, which are high-performance instructions for processing data in memory.  At the time I did the work, these instructions were not supported by the C compiler I was using, which was gcc.  That being the case, the only way to get the CPU to run those instructions was to code them as direct binary data in the source of my program.

Let's have a bit of background to explain what I mean.  When you write some source code in any native compiled language, that is, code that executes directly on the machine's CPU, such as assembly language or C, the compiler will translate the source code of your program into binary object code that the CPU is able to process.  For example, let's say you have written in C "int b = 12;"; the equivalent in assembly might be something like "mov eax, 12".  In fact, when your C code is compiled, it is usually translated into assembly first, before being translated to binary.  When the assembly is itself translated, the output is no longer readable as text, but is a sequence of binary byte values.  These values encode the assembly instructions in a sequence that the CPU instruction decoder is able to understand in order to actually execute the instructions.

Let's have an example.  Here is a tiny program written in C source code:-

```
#include <stdio.h>
int main(int argc, char *argv[]) {
    int b = 5;
    b = b * 2;
    printf("value of b is [%d]\n", b);
}
```
You can compile this and run it as follows:-

```
$ cc -o t t.c
$ ./t
value of b is [10]
```
So far so good, we wrote a little program and ran it.  We can also ask the compiler to output the assembly language it generated during compilation, which is a step that you don't normally bother to look at.  To do this compile the source program again but this time using the -S option, as follows:-

```
$ cc -S t.c
```
That step creates a file called t.s, which contains the assembly language source code generated from compiling your C source code.  This is still not in a form that the CPU can run, but it is getting closer.  It is still human-readable and is still a type of source code, called assembly language.  Let's have a look at t.s:-

```
$ cat t.s
        .text
        .file   "t.c"
        .globl  main                    # -- Begin function main
        .p2align        4, 0x90
        .type   main,@function
main:                                   # @main
        .cfi_startproc
# %bb.0:
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset %rbp, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register %rbp
        subq    $32, %rsp
        movabsq $.L.str, %rax
        movl    %edi, -4(%rbp)
        movq    %rsi, -16(%rbp)
        movl    $5, -20(%rbp)
        movl    -20(%rbp), %edi
        shll    $1, %edi
        movl    %edi, -20(%rbp)
        movl    -20(%rbp), %esi
        movq    %rax, %rdi
        movb    $0, %al
        callq   printf
        xorl    %esi, %esi
        movl    %eax, -24(%rbp)         # 4-byte Spill
        movl    %esi, %eax
        addq    $32, %rsp
        popq    %rbp
        retq
.Lfunc_end0:
        .size   main, .Lfunc_end0-main
        .cfi_endproc
                                        # -- End function
        .type   .L.str,@object          # @.str
        .section        .rodata.str1.1,"aMS",@progbits,1
.L.str:
        .asciz  "value of b is [%d]\n"
        .size   .L.str, 20
        .ident  "FreeBSD clang version 6.0.1 (tags/RELEASE_601/final 335540) (based on LLVM 6.0.1)"
        .section        ".note.GNU-stack","",@progbits
```
Crikey!  Now you can see why we prefer to write the original program in C or another high-level language.  Just imagine if you had to type all that in just to write a simple program to double a constant.  I won't go into how this works in detail, but you can assume that somewhere in there are instructions to make a variable, assign the value 5 to it, then double it, then call the printf() function to write the value to standard output.  In fact, we can see the line that is equivalent to the assignment in "int b=5"; it is the assembly line "movl    $5, -20(%rbp)"; and if you dig your way through it you might get an idea of which parts of the assembly code roughly correspond to the rest of the C source code.
But we still haven't got to binary code yet.  To do that we must invoke the assembler stage of the compiler, which is done by using the -c option to the cc command.  This stage of compilation will translate the C to assembly, and then assemble the assembly language into something called object code, which is the binary equivalent.

```
$ cc -c t.c
```
After running this it will be found that a file called t.o has been created, which is known as object code.  And finally this file does contain the binary version of the instructions that were in the assembly file, which in turn were generated from the original C source code.  And we can look at the contents of the t.o file using a binary file viewer such as the hexdump program, as follows:-

```
$ hexdump t.o
0000000 457f 464c 0102 0901 0000 0000 0000 0000
0000010 0001 003e 0001 0000 0000 0000 0000 0000
0000020 0000 0000 0000 0000 0268 0000 0000 0000
0000030 0000 0000 0040 0000 0000 0040 000a 0001
0000040 4855 e589 8348 20ec b848 0000 0000 0000
0000050 0000 7d89 48fc 7589 c7f0 ec45 0005 0000
0000060 7d8b c1ec 01e7 7d89 8bec ec75 8948 b0c7
0000070 e800 0000 0000 f631 4589 89e8 48f0 c483
0000080 5d20 76c3 6c61 6575 6f20 2066 2062 7369
0000090 5b20 6425 0a5d 0000 7246 6565 5342 2044
00000a0 6c63 6e61 2067 6576 7372 6f69 206e 2e36
00000b0 2e30 2031 7428 6761 2f73 4552 454c 5341
00000c0 5f45 3036 2f31 6966 616e 206c 3333 3535
00000d0 3034 2029 6228 7361 6465 6f20 206e 4c4c
00000e0 4d56 3620 302e 312e 0029 0000 0000 0000
00000f0 0014 0000 0000 0000 7a01 0052 7801 0110
.... (truncated for brevity)
```
Now finally we can see the binary data that was created by the assembly pass.  This is no longer in a form that you can read or understand, but it does make perfect sense to the CPU when it comes to execute it.  Somewhere in this binary data is a series of bytes that represents our original assembly code, which was in turn generated from C code; and this series of binary data will be executed by the CPU.  Believe it or not, there is one further stage that must be performed before this file can actually be loaded and run by the operating system, which is called linking.  This step combines our small object file with system library files so that the printf() function and certain other essential functions can be accessed.  I have only skimmed the surface of the whole process here, but hopefully you get the rough idea.
So what exactly is the binary data that is in the object file?  It consists of a series of instructions called opcodes, and their parameters.  In the example given, the assembly word "mov" is the assembly code representation of one type of opcode, and it will always generate a specific binary value to represent the "mov" (move) instruction to the CPU.  The entire set of opcodes understood by a CPU is called the instruction set.  If you rummage around the Intel website you will find that you can download books describing the entire instruction set that a particular processor supports; for example, one book describes the Intel 80386 instruction set, another the Pentium, and so on.
Now, if you had a binary editor, you could actually write the object file directly by typing the values from the hexdump into a file, which would be another way to write the instructions for the CPU: actual binary programming.  Of course this would be far too laborious to ever do in practice; you would have to look up every opcode in the instruction set book, work out what its parameters are, and write the correct sequence into the file.  It would be almost impossible for a human to get right, and extremely time-consuming.
So when might we ever encounter a need to program in binary?  As processors become more complex over the years, additional instructions are added to the instruction set, while the earlier instructions are retained for backwards compatibility.  Sometimes instructions are added to a processor but not to the compiler or assembler by the programmers who develop the tools; the reason might be a lack of time, or that the instructions are very rarely used, or any number of other factors.  In those circumstances, if you want to use the instructions that the compiler does not support, your only option is to write them directly as binary data into your source code, and both the assembler and the C compiler provide a special syntax to allow you to do that.

Going back to my original example of the streaming multimedia instructions: the version of gcc I was using at the time did not support them, so there was no way to access those instructions from C or assembly source code.  However, it is possible to embed snippets of assembler within C code, using something called "inline assembly language", and within one of those snippets it is possible to hard-code binary values that are to be executed directly by the CPU.  In this way it was possible to write some code that used the streaming multimedia instructions, despite them not being available in the compiler.

There is one other time that you might need to do "binary programming", which is when you have an executable program that you don't have the source code for.  Imagine you have shipped this program to lots of customers, who have started reporting a bug to you; but the guy who wrote the program has left the company and you have lost the source code.  Believe it or not, this does happen in the real world.
So you get your team to debug the program using something called a disassembler, which converts the binary back to assembly language; you find the bug and work out how to fix it.  But how do you give this fix to all your customers?  You can use something called a binary patch, which edits the binary code of the executable file in situ on the customer's system.  If you ever have to do this you are getting in deep ;-)  But that's another example of real-world "binary programming".  Don't worry, your university will never mention that!
So back to your original question about whether a university course would cover this.  I would expect a good university course to explain the concepts of computer architecture, CPU opcodes and instruction sets, how the CPU works in terms of binary instruction decode, and assembly language programming, with practical programming sessions most likely using a simple example CPU like a Z80 or 6800 to develop your knowledge.  As with everything, working through real problems is far more valuable than just learning principles from a book.  If you are lucky they will have actual hardware kits, such as a Z80 development system, that will allow you to practice and do exercises.  Perhaps this is more likely to be encountered in an electronics or embedded systems course than in computer science, but even a computer science degree should cover some of this stuff in the first year, IMHO.  Of course in the real world you will almost never do "binary programming" except under the rare kinds of circumstances I have described, but it's worth having a strong grip on how computers really work when you come to work on higher-level projects.  Anyway, hopefully that's given you a few ideas; feel free to ask questions if you like.


----------



## itsthosestonesman (Jun 6, 2019)

Correction - the line "In this was it was possible to write some code that used the streaming multimedia instructions, despite them not being available on the CPU." should say "In this was it was possible to write some code that used the streaming multimedia instructions, despite them not being available in the compiler".


----------



## itsthosestonesman (Jun 6, 2019)

I've thought of one more fairly specialised case where you might encounter binary programming.  Certain processors can contain 'unofficial' undocumented opcodes, which programmers discover over the course of time; and these undocumented opcodes sometimes allow operations to be performed much more rapidly than the official documented opcodes.  In this case embedded hard-coded values would be used to access the undocumented opcodes.  This applies to things like highly optimised game engines running on consoles.  The Z80 processor I mentioned earlier had some well-known undocumented opcodes that were widely used in early games written in assembly language.  Similarly the Intel 80286 had an undocumented opcode that could be used to switch back from protected mode to real mode, which the book said was an illegal operation but was actually used in some software to access memory above the 1MB barrier.  Of course we're going back in time now, however you can bet your bottom dollar that the current generation of processors also contain undocumented opcodes.


----------



## ShelLuser (Jun 7, 2019)

Well, since the thread got necro'd I guess I might as well....



Deleted member 53988 said:


> In some other universities still there are even binary coding lessons?


When I went to college (last century) I actually learned to code in assembly, and it's actually not too bad, especially in comparison to all the tools you have today which can make coding a lot easier.

Of course I hardly used it (also because I'm not a full time programmer), but it's still useful to know about this aspect and I also believe it can definitely help to get a better understanding of the logic and theory behind coding.



itsthosestonesman said:


> I'll try to answer your original question, it can be tough getting to grips with all the concepts involved in this.


Just wondering... you _do_ realize that you're responding to a thread which is almost a year old, right?


----------



## ralphbsz (Jun 7, 2019)

Following up on itsthosestonesman's post ...

About 25 years ago, we were using lowly x86 instruction set machines (probably Pentium Pro) to do image processing.  The problem was that our image processing code was very complex; we were doing double differences between three images, while simultaneously running each image through a correction based on mean/variance segmentation per pixel.  This required keeping about 6 or 7 two-dimensional arrays (images or lookup tables) in memory at the same time.  The problem was that we were writing the code in C++, and the function would get the addresses of these arrays as arguments (pointers).  Those pointers had to be stored in registers, but the registers were also needed to do the actual arithmetic; remember the 32-bit x86 architecture has very few registers.  So what happened was that the compiler ended up having to spill/reload the registers to/from memory all the time, which made the code run REALLY slow.

So with a group of a few people, we did the following.  We coded a version of the C++ code that had the arrays at fixed memory locations.  At this point, the addresses of the arrays were no longer in registers but appeared as constants in the instruction stream, which was very efficient to read and process.  We suddenly had a lot of registers free, and were able to code it in such a fashion that some registers contained offsets into the arrays, and we used constant-relative addressing modes to access the images.  This code ran really fast.

Great, but not practical: in the real world, the images and arrays are in variable places, and we need to pass pointers to the routine.  But we can't keep the pointers in registers, and accessing them in memory is too slow; they need to be in the instruction stream.

So here is what we did.  We ran the fast version of the code through the compiler with assembly output, and wrote an awk script that collected the instruction bytes in an array, which was stored in the source code.  Then, when the function needed to be called, we would patch the array of bytes with the correct addresses for this call, and then call the modified array (yes, you can call an array of bytes as if it were a function, by using some nasty casts).  This let us have our cake (fast execution) and eat it too (for images anywhere in memory).  The piece of code that patched up the code (in the array) and then called it was named "izmel", which is the special sharp knife used for circumcisions.

This is the only major piece (more than a line or two) of assembly programming I've done in the last 25 years.  And it wasn't really assembly programming: we let the C++ compiler write the assembly source for us, and then, in a nutshell, implemented our own relocating linker.  Occasionally we need to use an instruction or two, typically for atomicity (lock-free data structures, often using a CAS instruction), or to call vector or multimedia instructions.


----------



## Crivens (Jun 7, 2019)

ShelLuser said:


> Just wondering... you _do_ realize that you're responding to a thread which is almost a year old, right?


And which was started by a notorious troll.
The only reason it is still around is that it contains good nuggets.

ralphbsz 
This is true evil and the lords of clean programming are frowning at you. But they are annoyed by the gods of speed and getting things done who clap and shout praise at you.


----------



## itsthosestonesman (Jun 7, 2019)

ShelLuser said:


> Well, since the thread got necro'd I guess I might as well....
> 
> 
> When I went to college (last century) I actually learned to code in assembly, and it's actually not too bad, especially in comparison to all the tools you have today which can make coding a lot easier.
> ...


oh maaan... I typed that while I was having a crap.... seemed a good idea at the time but now


----------



## itsthosestonesman (Jun 7, 2019)

ralphbsz said:


> Following up on itsthestones' post ...
> 
> About 25 years ago, we were using lowly x86 instruction set machines (probably Pentium Pro) to do image processing.  The problem was that our image processing code was very complex; we were doing double differences between three images, while simultaneously running each image through a correction based on mean/variance segmentation per pixel.  This required keeping about 6 or 7 two-dimensional arrays (images or lookup tables) in memory at the same time.  The problem was that we were writing the code in C++, and the function would get the addresses of these arrays as arguments (pointers).  Those pointers had to be stored in registers, but the registers were also needed to do the actual arithmetic; remember the 32-bit x86 architecture has very few registers.  So what happens was that the compiler ended up having to spill/reload the registered to/from memory all the time, which made the code run REALLY slow.
> 
> ...


Your solution sounds great!  You've reminded me, implementing things like memory barriers for lockless data structures is another place I've used very small amounts of inline asm in the last ten years... and writing things like a fast spinlock in asm to avoid the humongous overhead of using pthreads.  By coincidence I also worked on image processing (satellite images) many years ago; back then I was using masm to write some filter routines, convolution etc., trying to squeeze performance out of a 386... hahaha, those were the days


----------



## itsthosestonesman (Jun 7, 2019)

itsthosestonesman said:


> Your solution sounds great!  You've reminded me, implementing things like memory barriers for lockless data structures is another place I've used very small amounts of inline asm in the last ten years... and writing things like a fast spinlock in asm to avoid the humongous overhead of using pthreads.  By coincidence I also worked on image processing (satellite images) many years ago; back then I was using masm to write some filter routines, convolution etc., trying to squeeze performance out of a 386... hahaha, those were the days


For anyone else reading this who is interested in low-level CPU design and how computers work under the covers, this https://gigatron.io/ looks like a really nice project to play with.  I wish Tandy stores were still around with the Heathkit kits, but they appear to have vanished.  Making a CPU from TTL is a great idea.  Or you could go for a nice Z80 trainer like this http://www.kswichit.com/Z80/Z80.html.  I don't have any connection with either of these sites, and I'm sure there are others.  Of course they won't run your favourite OS, but it's nice to know that some people are still making kit like this.


----------



## Nicola Mingotti (Jun 7, 2019)

Deleted member 53988 said:


> However Herbert Schildt is true, that in 99.999999% real world stuff binary code /assembly is not use



My $0.000005 on this discussion (an impossible value, by choice).

*) It is quite apparent that that percentage is purely psychological, not a statistical estimate.  So take from it what it can give.

*) If you interpret it as "is it true that only a tiny fraction of all software lines written today is in assembly?", that is probably true.  Just think how many web developers are out there!

*) If you interpret it as "is it true that the very large majority of computer applications bought by us do not contain assembly?", then the answer is probably also yes.  Consider for example phone development: mostly you write in Java (in the Android case) and you don't have access to the metal.  Another example, again, is web development, the other huge player today: forget assembly altogether.

*) But now, see the thing from another angle: every device, every phone, very probably has something written in assembly in it!  Assembly in embedded is far from forgotten.  Usually you don't buy assembly software, because it comes by default when you buy the machine.  How many drivers contain assembly?  I don't know, but I suppose many do.


----------



## itsthosestonesman (Jun 7, 2019)

Nicola Mingotti said:


> My $0.000005 on this discussion (an impossible value, by choice).
> 
> *) It is quite apparent that that percentage is purely psychological, not a statistical estimate.  So take from it what it can give.
> 
> ...


Perhaps around ten years ago, the technical lead of the Sony games development lab in the UK (hopefully they are still in business here) came and gave us a talk on how they develop software for consoles.  He said the model is completely different from the PC world, where more efficient hardware comes out every year.  In console land they keep the hardware constant and unchanged for perhaps ten years, and instead evolve the programming techniques used to squeeze performance out of the same hardware (I think at the time it was the PlayStation 2).  So to produce more advanced games as time went on and keep customers buying, he said they used a lot of assembly language and learned how to program the hardware inside out, working out how to get the maximum performance from the chips in the box.  If console games are still developed that way, that might be one area that still makes extensive use of assembly language.  The business model was to maximise the ROI on the initial development of the custom chips in the console by shipping millions of them and amortizing the cost over a long period.  Anyway, I found this an interestingly different perspective on the computer business, compared to what we are used to: ever more powerful hardware each year and a focus on minimising the cost of writing new software.


----------



## itsthosestonesman (Jun 7, 2019)

Hmm.... EEVblog didn't think much of the Gigatron.



_View: https://www.youtube.com/watch?v=6vbI-r5aXJI_

I guess a real micro would be much better to learn on.


----------

