Friday, August 10, 2012

Generic Programming in C

Generic programming is style of programming that is typeless, as in the type of data to be used is not determined at the time the code was written. In C++ this is achieved with templating. I know of two ways to perform generic programming in C. One way is to use pointer typecasting. It's important to understand that when you typecast a pointer no data is modified within the pointer; how the data the pointer points to is interpreted is changed. This can be utilized to create generic algorithms and functionality.

For example a linked list data structure can be created utilizing pointer typecasting where the data to be held by a node is typecasted to a single type no matter what the actual type of the data is. Usually a void * is used in this case to prevent a dereference to data of unknown type. However, this method can only be used when access to the data held by the node (the data pointed by the void *) does not need to be accessed.

A more well-rounded (in my opinion) approach to generic programming in C is to make use of the preprocessor directive ##.

## pastes two tokens together to create a new token. Usage of the ## operator is known as token pasting, or token concatenation. There's a whole lot of documentation on the ## operator. If you're not familiar with it, you need to do some familiarizing before reading on.

All it does is take two tokens within a macro and stick them together to create a new token. Example:

#define PASTE_TOKENS( a, b ) \
  a##b

PASTE_TOKENS( var, val ); // output: varval

In the above example varval will need to be defined in order to avoid compiler errors, perhaps by using it as the name of a variable to define.

Token pasting is utilized to solve an age old problem in C of creating various functions to achieve the same problem. For example, I've written many linked lists in C and one day, I finally decide "I will never write another linked list in C". I'm sick and tired of having to write a new set of functions for each type of data that I need a new linked list for.

By using the ## operator one can automate the process of duplicating code to create new functions and new definitions based off of data type. This is much like templating in C++, except not nearly as user-friendly, yet still fun and satisfying to try.

Since there's a great need in C to duplicate code for various data types, and the ## operator allows us to create new tokens with two other tokens, we can use the type of the data to create a new name for a new set of functionality for each type of data desired.

Examine the following:

// Old C style "function overloading"
int INT_MAX( int a, int b )
{
  return (a > b) ? a : b;
}

float FLOAT_MAX( float a, float b )
{
  return (a > b) ? a : b;
}

double DOUBLE_MAX( double a, double b )
{
  return (a > b) ? a : b;
}

// Generic definition
#define GENERIC_MAX( TYPE )          \
  TYPE TYPE##_MAX( TYPE a, TYPE b ) \
  {                                  \
    return (a > b) ? a : b;          \
  }

As you can see, it can be really annoying to write different code for different types of data, when the code itself is highly redundant or even identical. However by providing the GENERIC_MAX macro a data type, a new function can be automatically generated by the C preprocessor! The MAX token is concatenated with the TYPE parameter to form a new function definition. Providing the GENERIC_MAX macro with type int will result in a new function being defined named int_MAX. Creating new sets of functionality of anything can be as simple as declaring or defining with a couple calls to a macro.

It might seem a bit self-defeating however to have to figure out what the new name of your generated function is going to be in order to call it. A very simple utility macro should be used to wrap around your generated generic functions. More information on how to do this is shown later in the post.

This can be taken further and be applied to even an abstract data type, such as a linked list. I've constructed myself a generic linked list.

//
// DECLARE_NODE
// Purpose: Declares a node of a specific type.
//
#define DECLARE_NODE( TYPE )   \
  typedef struct _NODE_##TYPE  \
  {                            \
    TYPE data;                 \
    struct _NODE_##TYPE *next; \
  } NODE_##TYPE;

The above code is an example definition of a macro that defines a generic node type. The type of the node is used in creating new name definitions wherever the node's type is required in the structure definition.

//
// DEFINE_FUNC( TYPE )
// Purpose: Defines a function to create a node of specified type.
//
#define DEFINE_FUNC( TYPE )                          \
  NODE_##TYPE *NODE_CREATE_##TYPE( TYPE data )       \
  {                                                  \
    NODE_##TYPE *newNode =                           \
     (NODE_##TYPE *)malloc( sizeof( NODE_##TYPE ) ); \
    newNode->data = data;                            \
    newNode->next = NULL;                            \
    return newNode;                                  \
  }

Above is an example of how to define a macro to define a function to create a new node of specified type based on the previous node definition.

You might be thinking that it would be a bit annoying to call so many different define and declare macros for each structure and function. You can actually just declare all structures and function prototypes within a single macro, I called mine DECLARE_LIST, which declares all structure types and function prototypes, as well as some utility macros required to use the generic linked list. I also have a DEFINE_LIST macro that defines all functions used in the generic linked list.

For abstract data types such as lists or stacks or queues, you'll probably want some additional utility functions to actually call the various generic functions you've defined. Here is an example set of utility functions I've made for my generic linked list:

//
// LIST_CONSTRUCT
// Purpose: Constructs and returns a new list object of specified type.
//
#define LIST_CONSTRUCT( TYPE ) \
  LIST_CONSTRUCT_##TYPE( )

//
// NODE_CREATE
// Purpose: Constructs and returns a new node of a specified type.
//
#define NODE_CREATE( TYPE, data, DeleteData, order ) \
  NODE_CREATE_##TYPE( data, DeleteData, order )

//
// LIST_INSERT
// Purpose: Inserts a node of specified type in position determined by the order
//          data member.
//
#define LIST_INSERT( TYPE, listSelf, node ) \
  LIST_INSERT_##TYPE( listSelf, node )

//
// LIST_REMOVE
// Purpose: Removes (deallocates) a node of a specified type from a list.
//
#define LIST_REMOVE( TYPE, node ) \
  LIST_REMOVE_##TYPE( node )

//
// LIST_DESTROY
// Purpose: Removes (deallocates) a list and all its nodes of a specified type.
//
#define LIST_DESTROY( TYPE, listSelf ) \
  LIST_DESTROY_##TYPE( listSelf )

//
// LIST_CALLER
// Purpose: Uses a callback on all nodes while supplying a specified param.
//
#define LIST_CALLER( TYPE, self, callback, param ) \
  LIST_CALLER_##TYPE( self, callback, param )

//
// LIST_TYPE
// Purpose: Returns the type of LIST that pertains to TYPE.
//
#define LIST_TYPE( TYPE ) \
  LIST_##TYPE

//
// NODE_TYPE
// Purpose: Returns the type of NODE that pertains to TYPE.
//
#define NODE_TYPE( TYPE ) \
  NODE_##TYPE

As you can see wrappers for the various generic functions are needed to properly call them (unless you want to manually figure out what the name of the function generated for whatever type you're currently using is). There's also a couple macros for getting the generated type associated with a provided type.

Click Here for Source Code

////////////////////////////////////////////////////
// Copyright (c) 2012 ICRL
// See the file LICENSE.txt for copying permission.
// 
// Original Author: Randy Gaul
// Date:   8/9/2012
// Contact: r.gaul@digipen.edu
////////////////////////////////////////////////////

#pragma once

#include "StringHash.h"

//
// DECLARE_LIST
// Purpose: Declares a linked list of a specific type on the heap.
//
#define DECLARE_LIST( TYPE )     \
  typedef struct _NODE_##TYPE    \
  {                              \
    TYPE data;                   \
    void (*DeleteData)( TYPE );  \
    int order;                   \
    struct _NODE_##TYPE *next;   \
    struct _NODE_##TYPE *prev;   \
  } NODE_##TYPE;                 \
\
  struct _LIST_##TYPE; \
\
  typedef struct _LIST_VTABLE_##TYPE                                                \
  {                                                                                 \
    void (*Destroy)  ( struct _LIST_##TYPE *self );                                 \
    NODE_##TYPE *(*CreateNode)( TYPE data, void (*DeleteData)( TYPE ), int order ); \
    RETURN_TYPE (*Insert)   ( struct _LIST_##TYPE *self, NODE_##TYPE *node );       \
    void (*Remove)   ( NODE_##TYPE *node );                                         \
  } _LIST_VTABLE_##TYPE;                                                            \
\
  extern const _LIST_VTABLE_##TYPE LIST_VTABLE_##TYPE; \
\
  typedef struct _LIST_##TYPE           \
  {                                     \
    NODE_##TYPE *list;                  \
    const _LIST_VTABLE_##TYPE *vtable;  \
  } LIST_##TYPE;                        \
\
  LIST_##TYPE *LIST_CONSTRUCT_##TYPE( void ); \
  NODE_##TYPE *NODE_CREATE_##TYPE( TYPE data, void (*DeleteData)( TYPE data ), int order ); \
  RETURN_TYPE LIST_INSERT_##TYPE ( LIST_##TYPE *self, NODE_##TYPE *node ); \
  void LIST_REMOVE_##TYPE ( NODE_##TYPE *node ); \
  void LIST_DESTROY_##TYPE ( LIST_##TYPE *self ); \
  void LIST_CALLER_##TYPE ( LIST_##TYPE *self, RETURN_TYPE (*callback)( NODE_##TYPE *, void * ), void *param ); \
  TYPE LIST_PEEK_##TYPE ( LIST_##TYPE *self )

//
// DEFINE_LIST
// Purpose: Defines a linked list of a specific type on the heap.
//
#define DEFINE_LIST( TYPE )                       \
  const _LIST_VTABLE_##TYPE LIST_VTABLE_##TYPE =  \
  {                                               \
    LIST_DESTROY_##TYPE,                          \
    NODE_CREATE_##TYPE,                           \
    LIST_INSERT_##TYPE,                           \
    LIST_REMOVE_##TYPE                            \
  };                                              \
\
  LIST_##TYPE *LIST_CONSTRUCT_##TYPE( void )                                  \
  {                                                                           \
    LIST_##TYPE *linkedList = (LIST_##TYPE *)malloc( sizeof( LIST_##TYPE ) ); \
    linkedList->list = NULL;                                                  \
    linkedList->vtable = &LIST_VTABLE_##TYPE;                                 \
    return linkedList;                                                        \
  }                                                                           \
\
  NODE_##TYPE *NODE_CREATE_##TYPE( TYPE data, void (*DeleteData)( TYPE ), int order )  \
  {                                                                                    \
    NODE_##TYPE *newNode = (NODE_##TYPE *)malloc( sizeof( NODE_##TYPE ) );             \
    newNode->data = data;                                                              \
    newNode->DeleteData = DeleteData;                                                  \
    newNode->order = order;                                                            \
    newNode->next = NULL;                                                              \
    newNode->prev = NULL;                                                              \
    return newNode;                                                                    \
  }                                                                                    \
\
  RETURN_TYPE LIST_INSERT_##TYPE ( LIST_##TYPE *self, NODE_##TYPE *node )  \
  {                                                                        \
    NODE_##TYPE *scan = NULL;                                              \
    if(!self)                                                              \
      return RETURN_FAILURE;                                               \
                                                                           \
    scan = self->list;                                                     \
                                                                           \
    if(scan) /* if list is not empty */                                    \
    {                                                                      \
      while(scan->next)                                                    \
      {                                                                    \
        if(scan->order <= node->order)                                     \
          break;                                                           \
        scan = scan->next;                                                 \
      }                                                                    \
                                                                           \
      /* beginning of list */                                              \
      if(!scan->prev)                                                      \
      {                                                                    \
        node->next = scan;                                                 \
        scan->prev = node;                                                 \
        self->list = node;                                                 \
      }                                                                    \
      /* if not at end of list */                                          \
      else if(scan->next)                                                  \
      {                                                                    \
        node->next = scan;                                                 \
        node->prev = scan->prev;                                           \
        scan->prev->next = node;                                           \
        scan->prev = node;                                                 \
      }                                                                    \
      else                                                                 \
      {                                                                    \
        if(scan->prev) /* If more than one item in list */                 \
        {                                                                  \
          if(scan->order <= node->order)                                   \
          {                                                                \
            node->next = scan;                                             \
            node->prev = scan->prev;                                       \
            scan->prev->next = node;                                       \
            scan->prev = node;                                             \
          }                                                                \
          else                                                             \
          {                                                                \
            node->prev = scan;                                             \
            scan->next = node;                                             \
          }                                                                \
        }                                                                  \
        else /* One item in list */                                        \
        {                                                                  \
          /* Place before or after depending on zOrder */                  \
          if(scan->order <= node->order)                                   \
          {                                                                \
            scan->prev = node;                                             \
            node->next = scan;                                             \
            self->list = node;                                             \
          }                                                                \
          else                                                             \
          {                                                                \
            scan->next = node;                                             \
            node->prev = scan;                                             \
          }                                                                \
        }                                                                  \
      }                                                                    \
    }                                                                      \
    else /* if list is empty */                                            \
    {                                                                      \
      self->list = node;                                                   \
    }                                                                      \
                                                                           \
    return RETURN_SUCCESS;                                                 \
  }                                                                        \
\
  void LIST_REMOVE_##TYPE ( NODE_##TYPE *node )   \
  {                                               \
    if(node->prev)                                \
    {                                             \
      node->prev->next = node->next;              \
      if(node->next)                              \
      {                                           \
        node->next->prev = node->prev;            \
      }                                           \
    }                                             \
    else if(node->next)                           \
    {                                             \
      node->next->prev = NULL;                    \
    }                                             \
                                                  \
    if(node->DeleteData)                          \
    {                                             \
      node->DeleteData( node->data );             \
    }                                             \
    free( node );                                 \
  }                                               \
\
  void LIST_DESTROY_##TYPE ( LIST_##TYPE *self )  \
  {                                               \
    NODE_##TYPE *temp = NULL;                     \
    NODE_##TYPE *scan = self->list;               \
                                                  \
    while(scan)                                   \
    {                                             \
      temp = scan->next;                          \
      self->vtable->Remove( scan );               \
      scan = temp;                                \
    }                                             \
                                                  \
    free( self );                                 \
  }                                               \
\
  void LIST_CALLER_##TYPE ( LIST_##TYPE *self, RETURN_TYPE (*callback)( NODE_##TYPE *, void * ), void *param )  \
  {                                                                                                             \
    NODE_##TYPE *scan = self->list, *temp = NULL;                                                               \
                                                                                                                \
    while(scan)                                                                                                 \
    {                                                                                                           \
      temp = scan->next;                                                                                        \
      /* Stop the callback process if return type is not 0 */                                                   \
      if(callback( scan, param ))                                                                               \
      {                                                                                                         \
        scan = temp;                                                                                            \
        return;                                                                                                 \
      }                                                                                                         \
      scan = temp;                                                                                              \
    }                                                                                                           \
  }                                                                                                             \
\
  TYPE LIST_PEEK_##TYPE ( LIST_##TYPE *self ) \
  {                                           \
    return self->list->data;                  \
  }

//
// LIST_CONSTRUCT
// Purpose: Constructs and returns a new list object of specified type.
//
#define LIST_CONSTRUCT( TYPE ) \
  LIST_CONSTRUCT_##TYPE( )

//
// NODE_CREATE
// Purpose: Constructs and returns a new node of a specified type.
//
#define NODE_CREATE( TYPE, data, DeleteData, order ) \
  NODE_CREATE_##TYPE( data, DeleteData, order )

//
// LIST_INSERT
// Purpose: Inserts a node of specified type in position determined by the order
//          data member.
// Notes: Will return RETURN_FAILURE if list pointer is NULL.
//
#define LIST_INSERT( TYPE, listSelf, node ) \
  LIST_INSERT_##TYPE( listSelf, node )

//
// LIST_REMOVE
// Purpose: Removes (deallocates) a node of a specified type from a list.
//
#define LIST_REMOVE( TYPE, node ) \
  LIST_REMOVE_##TYPE( node )

//
// LIST_DESTROY
// Purpose: Removes (deallocates) a list and all its nodes of a specified type.
//
#define LIST_DESTROY( TYPE, listSelf ) \
  LIST_DESTROY_##TYPE( listSelf )

//
// LIST_CALLER
// Purpose: Uses a callback on all nodes while supplying a specified param.
//
#define LIST_CALLER( TYPE, self, callback, param ) \
  LIST_CALLER_##TYPE( self, callback, param )

//
// LIST_PEEK
// Purpose: Returns the data of the first in the list.
//
#define LIST_PEEK( TYPE, self ) \
  LIST_PEEK_##TYPE( self )

//
// General data type handling. These functions must be used to handle list and node
// objects/pointers directly.
//
#define LIST_TYPE( TYPE ) \
  LIST_##TYPE
#define NODE_TYPE( TYPE ) \
  NODE_##TYPE


Drawbacks:
There is one major drawback to the ## operator strategy; you cannot use * in the type name, or spaces in the type name. If you need to use a pointer as your type, you'll have to typedef your pointer into a single valid token and pass the typedef'd type to the generic macros.

There is another thing that can be considered a drawback: you cannot have a pointer to a macro. As such debugging macros can be very very tedious. I myself had to generate a processed file and compile with the expanded macro in order to pinpoint where I originally had some compiling errors when creating my generic linked list.

6 comments:

  1. I love you for what you do. Really respect this. Keep it up.

    ReplyDelete
    Replies
    1. Will you sponsor me with some muffins? My friends and I really like muffins :3

      Delete
  2. I'm aware of one other way to do generic programming in C. Take a look at the linked list implementation in the Linux kernel.
    http://isis.poly.edu/kulesh/stuff/src/klist/

    ReplyDelete
    Replies
    1. I talked about the void pointer method briefly :)

      Delete
  3. From the first snippet:

    TYPE TYPE##_MAX( TYPE a, TYPE, b ) \

    Is there a misplaced comma, or is just my imagination?

    ReplyDelete

Note: Only a member of this blog may post a comment.