How the Compilation Process Works for C Programs

C is a compiled language. Source code is written as a plain text file in any editor the programmer chooses, and it must then be compiled into machine code before it can run.

By convention, C source files are named with a .c extension, and we use the command "gcc" to compile them. (GCC stands for GNU Compiler Collection, a compiler system produced by the GNU Project.)

Four Steps of Compilation: preprocessing, compiling, assembly, linking.

Preprocessing:

Preprocessing is the first step. The preprocessor handles the lines that begin with # (known as directives) and prepares the source code for the compiler by:

  • removing comments
  • expanding macros
  • expanding included files

If you include a header file with a directive such as #include <stdio.h>, the preprocessor looks for the stdio.h file and inserts its contents into the preprocessed output in place of the directive. The source file on disk is not changed.

The preprocessor also expands macros, which includes replacing symbolic constants defined using #define with their values.

Compiling:

Compiling is the second step. It takes the output of the preprocessor and generates assembly language, an intermediate human readable language, specific to the target processor.

Assembly:

Assembly is the third step of compilation. The assembler converts the assembly code into binary machine code (zeros and ones) and stores it in an object file. This code is also known as object code.

Linking:

Linking is the final step of compilation. The linker merges the object code from all modules into a single executable file. If we use a function from a library, the linker connects our code with that library function's code.

In static linking, the linker copies the code of every library function used into the executable file. In dynamic linking, the code is not copied; the linker only records the name of the shared library in the binary file, and the library is loaded when the program runs.

Let's now compile! For our example, we'll use "main.c" as our source file.

    #include <stdio.h>
    int main(void)
    {
        printf("Hello, World!\n");
        return (0);
    }

At the shell prompt, enter the command "gcc main.c" and hit Enter. If it successfully compiles, the shell prompt will be displayed again. If it does not compile, it will display error message(s).

    vagrant@vagrant-ubuntu-trusty-64:~$ gcc main.c
    vagrant@vagrant-ubuntu-trusty-64:~$ ls
    a.out main.c

After main.c is compiled, type the command "ls" to list your directory contents and you will see an executable file named a.out. To run the program, type "./a.out" at the shell prompt and hit Enter. Yay, we see the correct output "Hello, World!" followed by a newline.

    vagrant@vagrant-ubuntu-trusty-64:~$ ./a.out
    Hello, World!
    vagrant@vagrant-ubuntu-trusty-64:~$

If you don't want your output file to be named a.out, which is the default output filename, you can specify a different output filename with the -o option.

    gcc -o <desired_output_filename> <source_filename>

Let's see an example below where we want the output file to be named main.

    vagrant@vagrant-ubuntu-trusty-64:~$ gcc -o main main.c
    vagrant@vagrant-ubuntu-trusty-64:~$ ls
    main main.c
    vagrant@vagrant-ubuntu-trusty-64:~$

To run the main program, we type "./main" into the terminal.

    vagrant@vagrant-ubuntu-trusty-64:~$ ./main
    Hello, World!
    vagrant@vagrant-ubuntu-trusty-64:~$

If you make any changes to your source file, you will need to save it and recompile before the changes show up in the executable.

In some instances, the C source file will not successfully compile.

Read the error messages for clues on how to fix the problem.

Below, we left out the semicolon at the end of the printf statement.

    #include <stdio.h>
    int main(void)
    {
        printf("Hello, World!\n")
        return (0);
    }

An error message displays that a semicolon was expected before return.

    vagrant@vagrant-ubuntu-trusty-64:~$ gcc -o main main.c
    main.c: In function ‘main’:
    main.c:5:7: error: expected ‘;’ before ‘return’
         return (0);
         ^
    vagrant@vagrant-ubuntu-trusty-64:~$

If we add the semicolon back...

    #include <stdio.h>
    int main(void)
    {
        printf("Hello, World!\n");
        return (0);
    }

...and then recompile, the error message will be gone.

    vagrant@vagrant-ubuntu-trusty-64:~$ gcc -o main main.c
    vagrant@vagrant-ubuntu-trusty-64:~$ ./main
    Hello, World!
    vagrant@vagrant-ubuntu-trusty-64:~$

Even if your source file compiles, check if your output is correct.

Here is an example of a source file that compiles...

    #include <stdio.h>
    int main(void)
    {
        printf("The sum of 9+2 is: %i\n", 10);
        return (0);
    }

...but the output is not correct.

    vagrant@vagrant-ubuntu-trusty-64:~$ ./main2
    The sum of 9+2 is: 10
    vagrant@vagrant-ubuntu-trusty-64:~$

Oops, we have an error due to a typo in the source code.

    printf("The sum of 9+2 is: %i\n", 10);

We know that 9+2 = 11, but the program did exactly what the code in the line above instructed: printf substituted the number 10 for the format specifier %i.

The correct code would have been:

    #include <stdio.h>
    int main(void)
    {
        printf("The sum of 9+2 is: %i\n", 11);
        return (0);
    }

To sum up, the four steps of compilation are: preprocessing, compiling, assembly, linking.

Read the man page on gcc for more information. Happy coding!

[Image: top of man page for gcc]
