Tuesday, September 8, 2009

Creating Tiny ELF Executables For Linux !

Recently got a nice article link from my friend.

If you're a programmer who's become fed up with software bloat, then may you find herein the perfect antidote.

This document explores methods for squeezing excess bytes out of simple programs. (Of course, the more practical purpose of this document is to describe a few of the inner workings of the ELF file format and the Linux operating system. But hopefully you can also learn something about how to make really teensy ELF executables in the process.)

Please note that the information and examples given here are, for the most part, specific to ELF executables on a Linux platform running under an Intel-386 architecture. I imagine that a good bit of the information is applicable to other ELF-based Unices, but my experiences with such are too limited for me to say with certainty.

Please also note that if you aren't a little bit familiar with assembly code, you may find parts of this document sort of hard to follow. (The assembly code that appears in this document is written using Nasm; see http://www.nasm.us/.)


In order to start, we need a program. Almost any program will do, but the simpler the program the better, since we're more interested in how small we can make the executable than what the program does.

Let's take an incredibly simple program, one that does nothing but return a number back to the operating system. Why not? After all, Unix already comes with no less than two such programs: true and false. Since 0 and 1 are already taken, we'll use the number 42.

So, here is our first version:

  /* tiny.c */
int main(void) { return 42; }

which we can compile and test like so:

  $ gcc -Wall tiny.c
$ ./a.out ; echo $?
42

So. How big is it? Well, on my machine, I get:

  $ wc -c a.out
3998 a.out

(Yours will probably differ some.) Admittedly, that's pretty small by today's standards, but it's almost certainly bigger than it needs to be.

The obvious first step is to strip the executable:

  $ gcc -Wall -s tiny.c
$ ./a.out ; echo $?
42
$ wc -c a.out
2632 a.out

That's certainly an improvement. For the next step, how about optimizing?

  $ gcc -Wall -s -O3 tiny.c
$ wc -c a.out
2616 a.out

That also helped, but only just. Which makes sense: there's hardly anything there to optimize.

It seems unlikely that there's much else we can do to shrink a one-statement C program. We're going to have to leave C behind, and use assembler instead. Hopefully, this will cut out all the extra overhead that C programs automatically incur.

So, on to our second version. All we need to do is return 42 from main(). In assembly language, this means that the function should set the accumulator, eax, to 42, and then return:

  ; tiny.asm
BITS 32
GLOBAL main
SECTION .text
main:
mov eax, 42
ret

We can then build and test like so:

  $ nasm -f elf tiny.asm
$ gcc -Wall -s tiny.o
$ ./a.out ; echo $?
42

(Hey, who says assembly code is difficult?) And now how big is it?

  $ wc -c a.out
2604 a.out

Looks like we shaved off a measly twelve bytes. So much for all the extra overhead that C automatically incurs, eh?

..................

Please continue reading from below :

Reference : http://www.muppetlabs.com/~breadbox/software/tiny/teensyps.html

No comments: