Sunday, 31 January 2016

Compling a C Program

In this lab, we investigated the source code of a C program against the output of the C compiler. We first started off with compiling hello.c as is.
gcc -o hello hello.c -g -O0 -fno-builtin

The options do the following:
  • -g:                  Enables debug information
  • -O0:               Disables GCC optimizations
  • -fno-builtin:   Disables C optimizations

The resulting binary files was just a few kilobytes in size.  In the lab we were to recompile the code with the following changes:
  1. Add the compiler option -static
  2. Remove the compiler option -fno-builtin
  3. Remove the compiler option -g
  4. Add additional arguments to the printf()
  5. Move the printf() call to a separate function named output(), and call that function from main()
  6. Remove the -O0 and add -03 to the gcc options

 

Add the Compiler Option -static

The static option causes all the dependencies to be added for the program in the binary. Everything is included and compiled along with the source code. This option makes the program access links much faster but increases the size of the binary. Without the option the program will dynamically look for the links.

 

Remove the Compiler Option -fno-builtin

This option tells the compiler to avoid using any built in optimization techniques. When used it changes the function printf() to become puts(). The difference between the two is that printf() validates each character to make sure its a formatted. While puts() simply inserts the function into the buffer and onto the screen without any validation. Other functions that are being used in the background are also changed but I am unable to tell why and for what purpose. I assume most of the operations are much faster and more direct just like printf() and puts().

 

Remove the Compiler Option -g

The -g option allows debugging to occur. Without it errors are not as human readable and warning are not told. The -g option can make the job of figuring out easier for the developer, but the public binary should not be released with it.

 

Add Additional Arguments to the printf()

When additional arguments are added to printf(), based on the platform, numbers are loaded into the registry first while others are pushed onto the stack.

int main() {
  400500:       55                      push   %rbp
  400501:       48 89 e5                mov    %rsp,%rbp
  400504:       48 83 ec 30             sub    $0x30,%rsp
    printf("Hello World!\n%d%d%d%d%d%d%d%d%d%d%d", 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100);
  400508:       c7 44 24 28 64 00 00    movl   $0x64,0x28(%rsp)
  40050f:       00
  400510:       c7 44 24 20 5a 00 00    movl   $0x5a,0x20(%rsp)
  400517:       00
  400518:       c7 44 24 18 50 00 00    movl   $0x50,0x18(%rsp)
  40051f:       00
  400520:       c7 44 24 10 46 00 00    movl   $0x46,0x10(%rsp)
  400527:       00
  400528:       c7 44 24 08 3c 00 00    movl   $0x3c,0x8(%rsp)
  40052f:       00
  400530:       c7 04 24 32 00 00 00    movl   $0x32,(%rsp)
  400537:       41 b9 28 00 00 00       mov    $0x28,%r9d
  40053d:       41 b8 1e 00 00 00       mov    $0x1e,%r8d
  400543:       b9 14 00 00 00          mov    $0x14,%ecx
  400548:       ba 0a 00 00 00          mov    $0xa,%edx
  40054d:       be 00 00 00 00          mov    $0x0,%esi
  400552:       bf 00 06 40 00          mov    $0x400600,%edi
  400557:       b8 00 00 00 00          mov    $0x0,%eax
  40055c:       e8 7f fe ff ff          callq  4003e0 <printf@plt>
}
  400561:       c9                      leaveq
  400562:       c3                      retq
  400563:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  40056a:       00 00 00
  40056d:       0f 1f 00                nopl   (%rax)

 

Move the printf() Call to a Seperate Function Named output(), and Call that Function From main()

Moving the function for this is straight forward. All that seems to happens is that the output() function has its own section that main calls upon.

 

Remove the -O0 and Add -O3 to the GCC options

Any -OX command allows for optimization to be applied to the program. X being the level of optimization, where 0 is the minimal level of optimization and 3 is the most optimized. -O3 has the potential to break your program as it goes above and beyond to make sure it is optimized. From -O0 to -O3 different options are turned on automatically for the developer but they can also enable these options themselves for better control and choice.

Here are the important differences in main(-O3 vs -O0):
int main() {
    printf("Hello World!\n");
  400410:       bf a0 05 40 00          mov    $0x4005a0,%edi
  400415:       31 c0                   xor    %eax,%eax
  400417:       e9 c4 ff ff ff          jmpq   4003e0 <printf@plt>
and
int main() {
  400500:       55                      push   %rbp
  400501:       48 89 e5                mov    %rsp,%rbp
    printf("Hello World!\n");
  400504:       bf b0 05 40 00          mov    $0x4005b0,%edi
  400509:       b8 00 00 00 00          mov    $0x0,%eax
  40050e:       e8 cd fe ff ff          callq  4003e0 <printf@plt>
}
  400513:       5d                      pop    %rbp
  400514:       c3                      retq
  400515:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  40051c:       00 00 00
  40051f:       90                      nop

As you can see, -O3 accomplishes the same task in less code than -O0.
 

No comments:

Post a Comment