Linux C/C++: Compiling and Linking Basics

My parents bought me a TRS-80 Model III when I was 13 years of age, and boy did I love my TRS-80! I quickly learnt how to code using the baked-in BASIC language, and even familiarized myself with simple Z80 assembly language programming using the magic of BASIC’s “peeks” and “pokes”. I recall desperately wanting to code in C/C++ back then. However, in order to get my grubby little hands on a C/C++ compiler, I needed money (from memory, I think it was around two hundred Australian dollars at the time). Considering I “earned” around $5 a week for washing dishes and taking out the trash, it was going to take me an eternity to be able to purchase a C/C++ compiler (I had other expenses at the time – like buying the hand-held double-LCD screen “Donkey Kong” game, jaw busters, and Star Wars action figures/spaceships). The year was 1983…

I only started coding in C/C++ when I managed to purchase my first “IBM compatible” running MS-DOS. The compiler was Borland’s C++ compiler, and it set me back over a hundred Australian dollars. The year was 1987…

When I first started using Linux, I didn’t really comprehend the ideology/philosophy behind the FOSS (Free and Open Source Software) movement. I didn’t pay much attention to the software licenses, I just knew that Linux was “free of charge”, and was ecstatic that a C/C++ compiler was included (free of charge!). I cannot remember what the year was – some time in the early 2000s.

Fast-forward to today, and you can buy a Raspberry Pi for next to nothing, install Raspbian or (I believe) Ubuntu Core, learn how to use one of the editors that can run in a terminal, and start coding in C/C++ to your heart’s content. The barrier to entry is now so low that it has literally brought a tear to my eye (once or twice).

Anyway, enough about my personal journey and experiences with C/C++ compilers. Let’s get into some basics about C/C++ in modern day (at time of writing) Linux.

Compiling C/C++ with the GNU C/C++ Compilers

Here’s a simple C program that will print out the command line arguments you pass to the program when you execute it:

#include <stdio.h>

int main(int argc, char *argv[])
{
  for (int i=0; i<argc; i++) {
    printf("arg[%d]: %s\n", i, argv[i]);
  }

  return 0;
}

Let’s assume you’ve saved the above program in a file called program.c. To compile it into a binary called program, you would use:

gcc -o program program.c

Let’s execute the program we’ve created to see what the output is:

[syd@pi-server:cpp]$ ./program one two three four
arg[0]: ./program
arg[1]: one
arg[2]: two
arg[3]: three
arg[4]: four

Let’s re-write this example as a C++ program and save the file as program.cpp:

#include <iostream>

int main(int argc, char *argv[])
{
  for (int i=0; i<argc; i++) {
    std::cout << "arg[" << i << "]: " << argv[i] << std::endl;
  }

  return 0;
}

Let’s try to compile this C++ program with the GNU C compiler:

[syd@pi-server:cpp]$ gcc -o program program.cpp
/tmp/ccgsFTNy.o: In function `main':
program.cpp:(.text+0x34): undefined reference to `std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)'
program.cpp:(.text+0x44): undefined reference to `std::ostream::operator<<(int)'
program.cpp:(.text+0x54): undefined reference to `std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)'
program.cpp:(.text+0x70): undefined reference to `std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)'
program.cpp:(.text+0x80): undefined reference to `std::ostream::operator<<(std::ostream& (*)(std::ostream&))'
program.cpp:(.text+0xa8): undefined reference to `std::cout'
program.cpp:(.text+0xb0): undefined reference to `std::basic_ostream<char, std::char_traits<char> >& std::endl<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&)'
/tmp/ccgsFTNy.o: In function `__static_initialization_and_destruction_0(int, int)':
program.cpp:(.text+0xe8): undefined reference to `std::ios_base::Init::Init()'
program.cpp:(.text+0x114): undefined reference to `std::ios_base::Init::~Init()'
collect2: error: ld returned 1 exit status

The build failed. What happened? The errors actually come from the linker (ld), not the compiler: gcc compiled program.cpp as C++, but it doesn’t automatically link the libraries that C++ code needs – in particular, the C++ standard library (libstdc++). You can explicitly instruct gcc to link it:

gcc -o program program.cpp -lstdc++

Compiling C++ code with the gcc command is not recommended. Things can quickly become very complicated (e.g. trying to remember what libraries need to be linked, and how). Thankfully, there’s a GNU C++ compiler included with Linux too, and it makes compiling C++ code much easier. For the same example above, the following can be used:

g++ -o program program.cpp

Large and complicated C and C++ programs often have multiple source files – each specific to a particular feature/function. This separation assists in creating projects that are easier to maintain and debug. Let’s take a look at a very simple example.

addtwoints.c

int add(int x, int y)
{
  return x + y;
}

addtwoints.h

int add(int, int);

program.c

#include <stdio.h>
#include <stdlib.h>

#include "addtwoints.h"

int main(int argc, char *argv[])
{
  if (argc != 3) {
    printf("*** ERROR! Usage: %s int1 int2\n", argv[0]);
    return 1;
  }

  int a = atoi(argv[1]), b = atoi(argv[2]);
  printf("%d plus %d is %d.\n", a, b, add(a, b));

  return 0;
}

Let’s assume the directory you’re working in contains nothing more than the three files listed above:

[syd@pi-server:addtwonumbers]$ ls -lh
total 12K
-rw-r--r-- 1 syd syd  42 Feb 24 11:25 addtwoints.c
-rw-r--r-- 1 syd syd  19 Feb 24 11:29 addtwoints.h
-rw-r--r-- 1 syd syd 305 Feb 24 11:43 program.c

A very simple way to compile and link this code into a binary called program is:

gcc -o program *.c

You can then run the program as follows:

[syd@pi-server:addtwonumbers]$ ./program 
*** ERROR! Usage: ./program int1 int2
[syd@pi-server:addtwonumbers]$ ./program 1 2
1 plus 2 is 3.

Great! There are, however, advantages to compiling and then linking source files separately. When you compile a C/C++ source file, the result is an object file (usually with the extension .o). You can then link your object files to create an executable. Let’s take a quick look at how this would be done manually using the above “add two integers” example:

[syd@pi-server:addtwonumbers]$ gcc -c addtwoints.c program.c 
[syd@pi-server:addtwonumbers]$ ls -l
total 20
-rw-r--r-- 1 syd syd   42 Feb 24 11:25 addtwoints.c
-rw-r--r-- 1 syd syd   19 Feb 24 11:29 addtwoints.h
-rw-r--r-- 1 syd syd  868 Feb 24 12:01 addtwoints.o
-rw-r--r-- 1 syd syd  305 Feb 24 11:43 program.c
-rw-r--r-- 1 syd syd 1300 Feb 24 12:01 program.o

The -c compiler flag tells the compiler to “compile only” (i.e. do not link and do not create an executable), which is why addtwoints.o and program.o are the only files that were generated. This becomes very beneficial when you have a large program (perhaps with hundreds or even thousands of C/C++ source files) and you only change the code in a single file. Instead of having to recompile the entire project (which can take hours for very large projects), only the changed source file is recompiled and then linked with the existing object files (whose source files haven’t been altered) in a matter of minutes or seconds.

Let’s now link the two object files (addtwoints.o and program.o) into a binary that can be executed:

[syd@pi-server:addtwonumbers]$ gcc -o program addtwoints.o program.o
[syd@pi-server:addtwonumbers]$ ls -lh
total 32K
-rw-r--r-- 1 syd syd   42 Feb 24 11:25 addtwoints.c
-rw-r--r-- 1 syd syd   19 Feb 24 11:29 addtwoints.h
-rw-r--r-- 1 syd syd  868 Feb 24 12:01 addtwoints.o
-rwxr-xr-x 1 syd syd 8.1K Feb 24 12:24 program
-rw-r--r-- 1 syd syd  305 Feb 24 11:43 program.c
-rw-r--r-- 1 syd syd 1.3K Feb 24 12:01 program.o

Linking

There are two ways to link to compiled objects (libraries) so that your executable can use the functionality exposed within them:

Static linking
Dynamic linking

Static linking is done at build time: the linker copies the pre-compiled object code into your executable (resulting in a larger executable). Dynamic linking is completed at runtime, when the dynamic loader maps “shared objects” into the running program. If both a static and a shared version of a library are available, the linker uses the shared one by default. Shared objects can be used by any number of executables on the system.

On Linux systems, there is a convention for libraries:

Library file names start with lib.
The extension for static libraries is usually .a (for “archive”).
The extension for shared libraries is usually .so (for “shared object”).

A Linux executable can contain both statically linked and dynamically linked libraries.

In this section, we’re going to use four files:

addtwoints.c

int add(int x, int y)
{
  return x + y;
}

sayhello.c

#include <stdio.h>

void hello(const char **s)
{
  printf("Hello there, %s!\n", *s);  
}

myfunctions.h

int add(int, int);
void hello(const char **);

program.c

#include <stdio.h>
#include <stdlib.h>

#include "myfunctions.h"

int main(int argc, char *argv[])
{
  if (argc != 4) {
    printf("*** ERROR! Usage: %s int1 int2 string1\n", argv[0]);
    return 1;
  }

  int a = atoi(argv[1]), b = atoi(argv[2]);
  printf("%d plus %d is %d.\n", a, b, add(a, b));
  
  hello((const char **)&argv[3]);

  return 0;
}

Static Linking

From the source files listed above, we will create a static library (i.e. an “archive”) using addtwoints.c and sayhello.c. This is achieved by compiling and then using the ar tool on the resulting object files:

[syd@pi-server:static]$ gcc -c addtwoints.c sayhello.c
[syd@pi-server:static]$ ar crv libmyfunctions.a addtwoints.o sayhello.o
a - addtwoints.o
a - sayhello.o
[syd@pi-server:static]$ ls -lh
total 28K
-rw-r--r-- 1 syd syd   42 Feb 24 18:23 addtwoints.c
-rw-r--r-- 1 syd syd  868 Feb 24 20:41 addtwoints.o
-rw-r--r-- 1 syd syd 2.2K Feb 24 20:42 libmyfunctions.a
-rw-r--r-- 1 syd syd   46 Feb 24 18:32 myfunctions.h
-rw-r--r-- 1 syd syd  351 Feb 24 18:36 program.c
-rw-r--r-- 1 syd syd   89 Feb 24 18:31 sayhello.c
-rw-r--r-- 1 syd syd 1.1K Feb 24 20:41 sayhello.o

We can now compile and statically link to libmyfunctions.a:

[syd@pi-server:static]$ gcc -o program program.c -L. -lmyfunctions
[syd@pi-server:static]$ ./program 1 2 syd
1 plus 2 is 3.
Hello there, syd!

By default, the linker looks for libraries in standard system directories such as /usr/lib. The -L flag in the example above adds a non-standard directory to that search – in this case, the current directory. The -l flag tells the linker which library to link. You leave off the leading “lib” and the file extension, so -lmyfunctions finds libmyfunctions.a.

You can check shared object dependencies of a Linux executable using:

[syd@pi-server:static]$ ldd program
        linux-vdso.so.1 (0x7ea0a000)
        /usr/lib/arm-linux-gnueabihf/libarmmem.so (0x76ecf000)
        libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0x76d90000)
        /lib/ld-linux-armhf.so.3 (0x76ee5000)

The output confirms that program has no dependency on a libmyfunctions.so: the library’s code was copied into the executable itself.

Dynamic Linking

Again using the example code above, let’s now create a dynamically loaded library (i.e. a “shared object”):

[syd@pi-server:dynamic]$ gcc -o libmyfunctions.so -shared -fPIC addtwoints.c sayhello.c 
[syd@pi-server:dynamic]$ gcc -o program program.c -L. -lmyfunctions
[syd@pi-server:dynamic]$ ls -lh
total 36K
-rw-r--r-- 1 syd syd   42 Feb 24 18:45 addtwoints.c
-rwxr-xr-x 1 syd syd 7.5K Feb 24 22:59 libmyfunctions.so
-rw-r--r-- 1 syd syd   46 Feb 24 18:45 myfunctions.h
-rwxr-xr-x 1 syd syd 8.2K Feb 24 23:00 program
-rw-r--r-- 1 syd syd  351 Feb 24 18:45 program.c
-rw-r--r-- 1 syd syd   89 Feb 24 18:45 sayhello.c
[syd@pi-server:dynamic]$ 
[syd@pi-server:dynamic]$ ldd program
        linux-vdso.so.1 (0x7ef00000)
        /usr/lib/arm-linux-gnueabihf/libarmmem.so (0x76f0c000)
        libmyfunctions.so => not found
        libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0x76dcd000)
        /lib/ld-linux-armhf.so.3 (0x76f22000)
[syd@pi-server:dynamic]$

Notice the results contain: libmyfunctions.so => not found? Let’s try to execute the program to see what happens:

[syd@pi-server:dynamic]$ ./program 1 2 syd
./program: error while loading shared libraries: libmyfunctions.so: cannot open shared object file: No such file or directory

The dynamic library loader failed to find libmyfunctions.so. That’s because it only searches particular directories (e.g. /usr/lib or /lib), and the current directory isn’t one of them. There are a number of ways to resolve this:

Copy your shared object to /usr/lib or /usr/local/lib or /lib.
Add a new directory to the shared object path (see /etc/ld.so.conf.d/).
Set LD_LIBRARY_PATH to include the directory containing your .so file(s).

[syd@pi-server:dynamic]$ sudo cp libmyfunctions.so /usr/local/lib/
[syd@pi-server:dynamic]$ ./program 2 4 syd
./program: error while loading shared libraries: libmyfunctions.so: cannot open shared object file: No such file or directory

It still didn’t work!? That’s because you have to run a program called ldconfig to rebuild the loader’s cache of library locations:

[syd@pi-server:dynamic]$ sudo ldconfig
[syd@pi-server:dynamic]$ ./program 2 4 syd
2 plus 4 is 6.
Hello there, syd!

Another option, which avoids copying the shared object to another directory and running ldconfig, is to set LD_LIBRARY_PATH when you run the program:

[syd@pi-server:dynamic]$ LD_LIBRARY_PATH=`pwd` ./program 3 4 syd
3 plus 4 is 7.
Hello there, syd!

Other Considerations

Libraries are often released both in shared object form (for dynamic linking) and as archives (for static linking). When both versions of a library are available, the linker uses the dynamic version by default. For example, both versions of libc (which, as can be seen from the ldd output above, is just another library) are installed on this system:

[syd@pi-server:dynamic]$ sudo find /usr/lib -type f -name 'libc.*'
/usr/lib/arm-linux-gnueabihf/libc.a
/usr/lib/arm-linux-gnueabihf/libc.so

You can force the linker to use static libraries (whenever possible) by using the -static flag:

[syd@pi-server:static]$ gcc -static -o program program.c -lmyfunctions
[syd@pi-server:static]$ ls -lh
total 576K
-rw-r--r-- 1 syd syd   42 Feb 24 18:23 addtwoints.c
-rw-r--r-- 1 syd syd   46 Feb 24 18:32 myfunctions.h
-rwxr-xr-x 1 syd syd 557K Feb 25 00:02 program
-rw-r--r-- 1 syd syd  351 Feb 24 18:36 program.c
-rw-r--r-- 1 syd syd   89 Feb 24 18:31 sayhello.c
[syd@pi-server:static]$ ldd program
        not a dynamic executable

Notice the jump in the executable’s size: from 8.2K to 557K!

If you have both the shared and static version of the myfunctions library but want to link using just the static version (i.e. libmyfunctions.a), here’s one way you can do it:

[syd@pi-server:static]$ gcc -o program program.c -Wl,-Bstatic -lmyfunctions -Wl,-Bdynamic
[syd@pi-server:static]$ ldd program
        linux-vdso.so.1 (0x7ef5b000)
        /usr/lib/arm-linux-gnueabihf/libarmmem.so (0x76ef8000)
        libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0x76db9000)
        /lib/ld-linux-armhf.so.3 (0x76f0e000)

Conclusion

How times have changed… Everything in this tutorial was (as can be seen by the pi-server prompt) executed on a Raspberry Pi. If you want to jump in and start messing around with some C/C++ coding in a Linux environment, an RPi is an ideal place to start.

C/C++ doesn’t have to be scary – it’s actually fun to code with once you familiarize yourself with how to use the tool-chain. Some of the flags may seem cryptic, but a quick search on the Interwebs will reveal all.

In a future post, I’ll cover a more realistic project and will do a deep-dive on makefiles. We may even touch on writing a library using assembly language (and some of the reasons why you should/shouldn’t do this)…

Random Linux Thoughts