Saturday, November 6, 2010

Pointers: Basics

A pointer in C is a variable that holds the address of a block of memory. The address stored in a pointer can be used to read or modify the contents of that block, or to step through the addresses adjacent to it. Pointers also support a limited form of arithmetic (addition and subtraction), though you cannot multiply or divide pointers. Take a look at the following diagram:




P is a pointer that has been declared and initialized with the address 1884. P points to the block of memory that sits between the blocks at addresses 1883 and 1885. C requires that every pointer point to one specific type of data. There are no restrictions on what that type may be; pointers can even point to other pointers.

int *p;
double *q;
char *r;

The above declares three pointers, each pointing to a different type of data. The only difference between declaring a pointer and declaring an ordinary variable is that a pointer's identifier must be preceded by an asterisk.


Address and Dereference Operators


There are two operators that are very commonly used with pointers. The first, which you've seen as the multiplication symbol and when declaring a pointer, is the asterisk *, called the dereference operator (also known as the indirection operator). You can read * literally as "the value pointed to by". So, if we take P from our example above and write *P, then *P means "the value pointed to by P", and *P would equal whatever value is stored in the block of memory at address 1884. In fact, *P is an alias for the value at address 1884, because modifying *P directly modifies the value stored there. Suppose P points to an int:


*p = 76;

This line of code would store the integer 76 in the block of memory at address 1884.


The second operator is the address operator &. You can read & literally as "the address of". This operator is particularly useful for assigning a value to a pointer, like so:

int val = 7;
int *p;
p = &val;

//You could also declare and initialize p in one line, in place of the last two lines above:

int *p = &val;

Usually you won't know the exact address of val before assigning it to a pointer, as it could be anywhere in memory while your program is running. What is important is that you can assign that address to a pointer.

Uses of Pointers

Imagine you need a function that modifies a variable. You cannot simply pass the variable to the function, since C passes arguments by value: the function receives a copy of your variable, and changes to that copy are not visible to the caller. You could, however, pass a pointer to the variable, and then use the dereference operator to directly modify the value the pointer points to. Consider the following:

#include <stdio.h>

void clear(int *x)
{
  *x = 0; //stores 0 in the variable that x points to
}

int main(void)
{
  int a = 5;

  clear(&a); //pass the address of a, not its value

  printf("%d", a); //prints 0

  return 0;
}

You can also have a function return a pointer, like the following:

#include <stdio.h>

int *max( int *a, int *b)
{
  if (*a > *b)
    return a;
  else
    return b;
}

This function, when given pointers to two integers, will return a pointer to whichever integer is larger.


Pointers and arrays are used together all the time. Say we declare a pointer P and make it point to the first element of a four-element array a:

int a[4], *p;

p = &a[0];

Here is what we have just done graphically:




P now points to the first element of array a[]. Suppose we do the following:

*p = 7;

a[0] now equals 7. Here is what we have just done graphically:




So far this whole process doesn't seem too useful, but things start getting really useful when you use pointer arithmetic to step through each element of an array with a pointer. C allows the following forms of pointer arithmetic, and only these forms:


Adding an integer to a pointer
Subtracting an integer from a pointer
Subtracting one pointer from another pointer


Adding integer i to pointer P yields a pointer that points i elements ahead of where P points. In other words, if P points to a[x], then P + i points to a[x + i] (assuming a[x + i] exists).

If P points to element a[x], then P - i points to a[x - i].

When one pointer is subtracted from another, the result is the distance between the two pointers, measured in array elements. Both pointers must point into the same array.

It is also valid to compare pointers with the relational operators <, <=, >, and >= and the equality operators == and !=. However, for the relational comparisons to actually have meaning, the two pointers being compared need to point into the same array.

Pointers are also good for processing arrays, since you can apply addition and subtraction upon pointers. Though one could just as easily use array subscripting for such a task, pointers can be faster and less resource intensive (depending on the compiler; some compilers have no efficiency discrepancy between array subscripting and array processing via pointers).

#define VAL 10

int a[VAL], sum, *p;

sum = 0;
for (p = &a[0]; p < &a[VAL]; p++)
  sum += *p;

The above code fragment shows how to sum all elements of an array with p. Note that the loop body does not run once p equals &a[VAL], because a for loop tests its condition before each iteration. The element a[VAL] doesn't exist, but the address &a[VAL] (one past the end of the array) is legal to compute and compare against as long as it is never dereferenced, which this loop never does.


A very important thing to note is that the name of an array can be used as a pointer to the first element of the array.

int a[5];

*a = 7; //stores 7 in a[0]

*(a + 1) = 12; //stores 12 in a[1]

In general, a + i is the same as &a[i]. Also, *(a + i) is equivalent to a[i]. The fact that an array name can serve as a pointer also makes it easier to process arrays with for loops.

for (p = &a[0]; p < &a[VAL]; p++)
  sum += *p;

//this is the same as the following:

for (p = a; p < a + VAL; p++)
  sum += *p;

When you pass an array to a function, what the function actually receives is a pointer to the first element of the array. This is important to know, because it means the function works on the caller's array rather than on a copy.


Using all that I've explained so far, you can write loops to process either the rows or the columns of 2D arrays using pointers, like so (processing a row):

//loop that clears a row of a 2D array

int a[rows][cols], *p, row;

row = x; //x is an integer that represents our selected row
for (p = a[row]; p < a[row] + cols; p++)
  *p = 0;

Processing a column isn't as simple, since a 2D array is stored as an array of rows: each element of the outer array is itself an array (one row). The elements of a column are therefore not adjacent in memory.

//loop that clears a column of a 2D array

int a[rows][cols], (*p)[cols], col;


col = x; //x is an integer that represents our selected column

for (p = a; p < a + rows; p++)
  (*p)[col] = 0;

I have declared p to be a pointer to an array of integers, which points to one row at a time in the loop. The parentheses around *p are necessary; without them, p would be an array of pointers rather than a pointer to an array. In the expression p = a, a is equal to the address of a[0]. We know this from recalling the earlier quote:
In general, a + i is the same as &a[i]. Also, *(a + i) is equivalent to a[i].
Sources for this post:
* C Programming: A Modern Approach 2nd Ed. (particularly chapter 12)

8 comments:

  1. I found this very helpful

  2. Woot! That's awesome to hear. I'm glad someone found something here useful.

  3. Why aren't you using C++ broski?

  4. C++ is an extension of C. You can't learn C and C++ at the same time unless you've had decent previous coding experience. Plus, I'm following what DigiPen teaches their freshmen, so C++ will come after a solid base in C.

  5. C++ is so similar to C that I wouldn't bother trying to learn C by itself, just go straight into C++. For example, in that game you made, you could have used C++ classes for the different unit and weapon classes, which would make your job a whole lot easier.

  6. It would be too hard to learn C at a level of detail that I desire, all the while learning C++. I think it's faster and more efficient to do one at a time.

  7. I suppose. For game design, I see C++ as being a lot better to use simply for classes, although of course it's slower. If you're coding a big project, C++ is the way to go, but for what you're doing, C is fine.
