Mangling in C++

Standard ANSI C does not allow two functions to share the same name while differing in the number or type of their parameters (this is called overloading).

On the other hand, in C++ we can overload functions. Overloading a function means that we can have two or more functions sharing the same name but with different types or a different number of parameters:

int print_something(void){
...
}

int print_something(char a){
...
}

int print_something(int a){
...
}

int print_something(char a, int b){
...
}

Even though all these functions share the same name, they are different functions; so, how does your compiler keep track of which is which? By using a technique called name mangling, the compiler generates a unique identifier for each of these functions.

First, let's say that we have a file named main.c with some functions such as:

#include <stdio.h>

int foo(double d_number);

int main(int argc, char* argv[]){
  double d_value = 4.5;

  printf("For value %f, we get the number %d\n", d_value, foo(d_value));

  return 0;
}

/* Truncates the double toward zero and returns it as an int */
int foo(double d_number){
  return (int) d_number;
}

This would print the following:

For value 4.500000, we get the number 4

Suppose we compile the file with gcc, for example: gcc -c main.c. This will generate a file named main.o.
Then we can analyse this file and see how the symbol table is created in C for this particular function.

Type nm main.o and you will obtain something like this:

0000003f T foo
00000000 T main
 U printf

Notice that the table only shows the name foo and does not indicate any parameter types.

What would happen if we compiled the same program using g++ instead?
The C++ compiler (in this case we are talking about g++) would build a table that looks like the following:

0000003e T _Z3food
 U __gxx_personality_v0
00000000 T main
 U printf

Look at the symbol _Z3food: _Z marks a mangled name, 3 is the length of the name foo, and the last character ‘d’ indicates that the parameter is a double.

Let's say we rename main.c to main.cpp and create another function with the same name but different parameters:

#include <stdio.h>

int foo(double d_number);
int foo(char d_character, int i_number);

int main(int argc, char* argv[]){
  double d_value = 4.5;
  printf("For value %f, we get the number %d\n", d_value, foo(d_value));
  return 0;
}

int foo(double d_number){
  return (int) d_number;
}

int foo(char d_character, int i_number){
  return (int) d_character + i_number;
}

If we print the table using nm main.o we obtain:

0000006e T _Z3fooci
0000003e T _Z3food
 U __gxx_personality_v0
00000000 T main
 U printf

Notice that _Z3fooci ends with two characters, a ‘c’ for char and an ‘i’ for int, while _Z3food ends with a single ‘d’ for double.

If we try to build this code with gcc instead of g++ (gcc -g main.cpp), we obtain an error at link time, because gcc compiles the .cpp file as C++ but does not link the C++ runtime library:

/tmp/ccU0T0Co.o:(.eh_frame+0x12): undefined reference to `__gxx_personality_v0'
collect2: ld returned 1 exit status

Using extern "C", we can tell g++ which functions should get C linkage, that is, plain unmangled names like the ones a C compiler produces.
Let's say we have the following code:

#include <stdio.h>

int foo(double d_number);

/* Declared with C linkage: the symbol name will not be mangled */
extern "C" {
  int foo(char d_character, int i_number);
}

int main(int argc, char* argv[]){
  double d_value = 4.5;
  printf("For value %f, we get the number %d\n", d_value, foo(d_value));
  return 0;
}

int foo(double d_number){
  return (int) d_number;
}

extern "C" {
  int foo(char d_character, int i_number){
    return (int) d_character + i_number;
  }
}

This compiles without problems. If we run g++ -c main.cpp to create the object file main.o and then run nm main.o, we obtain:

0000003e T _Z3food
 U __gxx_personality_v0
0000006e T foo
00000000 T main
 U printf

Notice that _Z3food carries a mangled C++ name, while T foo indicates that that function was given a plain C name.


Dangling and Wild Pointers in C/C++

When we define a variable, memory is allocated for that variable.

/* Definition of a variable; one byte of memory is allocated for it */
char character;

This declares that the name character is of type char.

As you may know, we use pointers to manipulate data in memory. A pointer stores a value that is a memory address. However, if a pointer is defined but not initialized, it may hold an address pointing anywhere in memory. Here we call such a pointer a “dangling pointer”. For example,

/* Dangling pointer */
char* c_pointer;

char character;
char* c_char_pointer = &character;  /* Not a dangling pointer */

Let's assume we define a dangling pointer and attempt to print the content of the memory address stored in it:

int* dangling_pointer;
printf("%d", *dangling_pointer);

This is undefined behaviour; in practice it often ends in a segmentation fault.

Let's assume we define a pointer, integer_pointer, and we make sure that it is not dangling by initializing it with NULL. Later in our program, we define a variable inside braces, integer_variable, and we assign the address of this variable to our pointer. The moment we leave the braces, integer_variable goes out of scope (this means that the variable no longer exists in our program).

int main(int argc, char* argv[]){

  int* integer_pointer = NULL;
  ... /* Dots mean that there is more code not included in the example */

  { /* integer_variable exists only between these braces */
    int integer_variable = 100;
    integer_pointer = &integer_variable;
    ...
  } /* After this brace, integer_variable is out of scope */
  ...

  *integer_pointer = 150; /* Undefined behaviour: integer_variable no longer exists */
  ...

  return 0;
}

This could produce unpredictable behaviour because the address stored in integer_pointer could now refer to memory that is being used by another part of the program.

When using dynamic memory allocation such as malloc or new, the pointer becomes dangling the moment we release the memory with free or delete. Therefore, it is good practice to set the pointer to NULL afterwards. Why? Because the memory obtained from malloc or new no longer belongs to the program once it has been released.

#include <stdlib.h> /* malloc, free */
...
int main(int argc, char* argv[]){
  ...

  char* character_pointer = malloc(sizeof(char));
  ...

  /* The memory is freed and character_pointer becomes a dangling pointer */
  free(character_pointer); 

  /* Point the pointer to NULL so it is not dangling anymore */
  character_pointer = NULL;
  ...

  return 0;
}

It is sometimes said that a dangling pointer is a wild pointer; however, there is a distinction. While a dangling pointer is a wild pointer, a wild pointer may not be a dangling pointer… What?! (you may ask). Let me explain.

The difference is in the initialization of the pointer. Let's define two pointers:

char* character_pointer_1;
char* character_pointer_2;

Both pointers can be called dangling and/or wild pointers because they are not initialized. Now let's assume we initialize character_pointer_1:

char* character_pointer_1;
char* character_pointer_2;
...
character_pointer_1 = malloc(sizeof(char));
...
free(character_pointer_1);
...

In this case, things change because character_pointer_1 was initialized and later freed (with free). character_pointer_1 is called a dangling pointer because it once pointed to valid memory, while character_pointer_2 is called a wild pointer because it was never initialized.


Function Pointers in C/C++

In the previous post, we covered how to use pointers with arrays. In this post we will see how to use function pointers, and how useful they can be.

I am assuming that you have read the previous posts about pointers and pointers with arrays.

As we may recall, when we declare a pointer, the pointer stores the address of a position in memory.
Normally, we want the pointer type to match the type of the object we want to point at. For example, if the variable is an integer, then we want a pointer to an integer:

int i_variable = 22;
int* pi_variable = &i_variable;

If you create a different variable of the same type, you can change where the pointer points:

int i_variable = 22;
int i_variable_2 = 44;
int* pi_variable = &i_variable;
printf("Value pointed at: %d \n", *pi_variable);     /* This line prints 22 */
pi_variable = &i_variable_2;                         /* pi_variable points at i_variable_2 */
printf("New Value pointed at: %d \n", *pi_variable); /* This line prints 44 */

Now the question is: can we do this with functions? Yes, we can!

Here is an example of how this works:

#include <stdio.h>

/* Returns the sum of value_a and value_b */
int add(int value_a, int value_b){
  return value_a + value_b;
}

/* Returns the result of subtracting value_b from value_a */
int sub(int value_a, int value_b){
  return value_a - value_b;
}

int main(int argc, char* argv[]){
  int val_a = 4;
  int val_b = 5;

  /* The function pointer must have the same return type and parameter types */
  int (*p_function)(int, int);

  p_function = add;
  printf("ADD A: %d with B:%d to obtain %d \n",
         val_a,
         val_b,
         (*p_function)(val_a, val_b));

  p_function = sub;
  printf("SUBTRACT B: %d OF A:%d to obtain %d \n",
         val_b,
         val_a,
         (*p_function)(val_a, val_b));

  return 0;
}

This will print:

ADD A: 4 with B:5 to obtain 9
SUBTRACT B: 5 OF A:4 to obtain -1

As you can see, this can be a very powerful feature. The function pointer can point to any function we want, as long as the function has the same return type (int in this case), the same number of parameters (two in this case), and the same parameter types (both parameters are int).


Pointers with Arrays in C/C++

In the previous post, “Pointers in C/C++”, we talked about pointers in C/C++. We recommend reading that post before continuing.

Let's say we declare an array of 5 elements:

int i_array[5];

We could also declare the array and initialize each of its elements at the same time:

int i_array[5] = {10, 20, 30, 40, 50};

Now, let's assume we have created the array and want to change the value of each element. One way could be:

i_array[0] = 15;
i_array[1] = 25;
i_array[2] = 35;
i_array[3] = 45;
i_array[4] = 55;

However, there is another way to change the values of elements in an array, and that way is by using pointers.

To begin with, we need to understand how an array is built. Each element in the array has an address in memory, and the name that we give to the array is tied to the address of the first element. Each consecutive element has an address that is the previous address plus the size of one element. For example, in an array of characters each character has a size of 1 byte (8 bits), which means that the addresses of neighbouring elements differ by 1 byte.

For example, this code would print the address of each element in an array:

#include <stdio.h>

int main(int argc, char* argv[]){
  char c_array[5] = {'a', 'b', 'c', 'd', 'e'};
  int index;
  for (index = 0; index < 5; ++index){
    /* The cast assumes a pointer fits in unsigned long, as on Linux */
    printf("c_array[%d] with value %c has address 0x%lX \n",
           index, c_array[index], (unsigned long) &c_array[index]);
  }
  return 0;
}

This will show us:

c_array[0] with value a has address 0xBF95EAB7
c_array[1] with value b has address 0xBF95EAB8
c_array[2] with value c has address 0xBF95EAB9
c_array[3] with value d has address 0xBF95EABA
c_array[4] with value e has address 0xBF95EABB

As you can see, the addresses are one byte apart:

0xBF95EAB8 - 0xBF95EAB7 = 1

[Figure: graphic representation of this array in memory]

When working with pointers, we can access the information indirectly by using just the address of each element in the array. The following code makes a pointer point at the first element of the array:

char* p_c_array = c_array; /* No & needed: c_array yields an address, c_array[0] yields a value */

The reason we do not use & when assigning the address is that the name of an array is converted to a pointer to its first element. This means that if we write c_array the compiler gives us the address of the first element, while if we write c_array[0] it gives us the value stored in that position.

If we want to print the first element:

printf("First Element: %c", *p_c_array);

If we want to print the second element, we need to advance the pointer so that it points to the next element.
Here that means increasing the address by one byte (because we are talking about char variables) to reach the next element:

0xBF95EAB7 + 1 byte =  0xBF95EAB8 

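There are two ways to increase the value of the pointer by one byte.
One way is the following (it works here only because sizeof(char) is 1; pointer arithmetic counts in elements, not bytes):

p_c_array = p_c_array + sizeof(char); /* sizeof(char) is 1 */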

Or by letting the compiler increase the value:

p_c_array++;

Just keep in mind that you do not want to lose the original address of the first element of the array. Therefore, it is common practice to create a second pointer that holds the same address and to increase that pointer instead, for example:

char* p_c_array = c_array;   /* Point to first element of array */
char* p_c_array_2;
p_c_array_2 = p_c_array;     /* Copy address stored in p_c_array */
p_c_array_2++;               /* Increase address stored by one byte to point to next element */
printf("First element: %c. Second Element %c \n",
       *p_c_array,
       *p_c_array_2);

The following example would print all the elements of the array:

#include <stdio.h>

int main(int argc, char* argv[]){
  /* The extra '\0' element marks the end of the array for the loop below */
  char c_array[6] = {'a', 'b', 'c', 'd', 'e', '\0'};
  char* p_c_array = c_array;
  char* p_c_array_2;
  for (p_c_array_2 = p_c_array;
       *p_c_array_2 != '\0';
       ++p_c_array_2){
    printf("p_c_array_2 points to variable with value %c \n",
           *p_c_array_2);
  }
  return 0;
}

Notice that in this case we are using ‘\0’ to indicate the end of the array, so the array must actually contain a ‘\0’ as its last element. This does not apply if we are using an array of integers, for example.

This code will print to screen:

p_c_array_2 points to variable with value a
p_c_array_2 points to variable with value b
p_c_array_2 points to variable with value c
p_c_array_2 points to variable with value d
p_c_array_2 points to variable with value e

With this we have covered the basics of pointers and arrays.

In the next post, we are going to talk about function pointers.


How to Create a System Call

Note: Please check the previous post, How to Build a Custom Kernel.

NOTIFICATION: These examples are provided for educational purposes. You use this code at your own responsibility and risk. The code is given ‘as is’. I do not take responsibility for how it is used.

  1. Prepare your system call:
    1. Go to the kernel source code folder: cd /home/your-home-directory/linux
    2. Create a personal folder: mkdir yoursyscall
    3. Access your folder: cd yoursyscall
    4. Create your source file: yoursyscall.c
    5. Create a makefile: Makefile
  2. Configure some kernel files
    1. Add your system call at the end of the file syscall_table_32.S
      1. cd /home/your-home-directory/linux/arch/x86/kernel
      2. gedit syscall_table_32.S
      3. Add at the end of the file: .long sys_your_new_system_call
    2. Add your system call at the end of the file unistd_32.h
      1. cd /home/your-home-directory/linux/arch/x86/include/asm
      2. gedit unistd_32.h
      3. At the end of the file, add (where XXX is the previous number plus 1):
        #define __NR_your_new_system_call XXX
    3. Increase the total number of system calls by one: #define __NR_syscalls XXX (where XXX is the previous value plus one)
    4. Add the declaration of your system call at the end of syscalls.h
      1. cd /home/your-home-directory/linux/include/linux
      2. gedit syscalls.h
      3. asmlinkage long your_new_system_call (parameters you want to pass)
    5. Add the new folder to the core-y line in the kernel's top-level Makefile
      core-y += /kernel ... other folder... /yoursyscall
      
      
  3. Configure your system call
    1. Write your system call, inside yoursyscall.c
      asmlinkage long new_system_call (whatever params you want to pass){
      // whatever you want to do
      }
    2. Change the makefile by adding the following line: obj-y := yoursyscall.o
    3. Compile your kernel (follow the steps in Building a New Custom Kernel)
  4. Test your system call
    1. Create a user level program that calls your system call
    2. Create a header file that the user space program can use:
      /* header.h */
      #include <linux/unistd.h>
      #define __NR_new_system_call XXX
      
      /* if your system call returns int and takes no parameters,
      * use this macro
      */
      _syscall0(int,new_system_call)
      
      /* Otherwise, depending on the number of parameters
      * being passed, use the _syscallN macro, N being the number
      * of parameters, like
      _syscall1(int, new_system_call, int)
      */
  5. Starting with kernel 2.6.18, the _syscallN macros were removed from the header files exported to user space, so the syscall() function must be used instead:
    1. printf ("System call returned %ld \n", syscall (__NR_new_system_call, params_if_any));
    2. or change the header.h file:
      /* header.h */
      #include <unistd.h>        /* declares syscall() */
      #include <linux/unistd.h>
      #include <sys/syscall.h>
      #define __NR_new_system_call XXX
      
      long new_system_call (params_if_any){
         return syscall (__NR_new_system_call, params_if_any);
      }
  6. Test the code:
    /* test client */
    #include <stdio.h>
    #include "header.h"
    
    int main (void){
        printf ("System call returned %ld \n", (long) new_system_call());
        return 0;
    }

NEW!!!

Before you run update-grub, run the following command (where x.x.xx is the kernel version you compiled):

sudo update-initramfs -c -k x.x.xx

After Ubuntu 10.04 (or kernel 2.6.32) this seems to be a needed extra step to make it work.

If you encounter any problems or errors, please let me know by providing an example of the code, input, output, and an explanation. Thanks.
