I've moved the site to a new location! All legacy posts were transferred.
Link to new location: http://www.randygaul.net/
Friday, October 19, 2012
Tuesday, October 9, 2012
C++ Reflection: Type MetaData: Part 3 - Improvements
In our last article we learned how to store information about a classes's members, however there are a couple key improvements that need to be brought to the MetaData system before moving on.
As you can see, this differs a bit from our previous implementation. The way it works is by passing in a single type to the RemQual struct via typename T. Then, the templating matches the type provided with one of the overloads and feeds the type back into the RemQual struct with less qualifiers. This acts as some sort of compile-time "recursive" qualifier stripping mechanism; I'm afraid I don't know what to properly call this technique. This is useful for finding out what the "base type" of any given type.
It should be noted that the example code above does not strip pointer qualifiers off of a type. This is to allow the MetaData system to properly provide MetaData instances of pointer types; which is necessary to reflect pointer meta.
It should be noted that in order to support pointer meta, the RemQual struct will need to be modified so it does not strip off the * qualifier. This actually applies to any qualifier you do not wish to have stripped.
There's one last "improvement" one could make to the RemQual struct that I'm aware of. I don't actually consider this an improvement, but more of a feature or decision. There comes a time when the user of a MetaData system may want to write a tidbit of code like the following:
Say the user wants to send a message object from one place to another. Imagine this message object can take three parameters of any type, and the reflection system can help the constructor of the message figure out the types of the data at run-time (how to actually implement features like this will be covered once Variants and RefVariants are introduced). This means that the message can take three parameters of any type and then take them as payload to deliver elsewhere.
However, there's a subtle problem with the "Message ID" in particular. Param1 and Param2 are assumed to be POD types like float or int, however "Message ID" is a const char * string literal. My understanding of string literals in C++ is that they are of the type: const char[ x ], x being the number of characters in the literal. This poses a problem for our templated MetaCreator, in that every value of x will create a new MetaData instance, as the templating treats each individual value of x as an entire new type. Now how can RemQual handle this? It gets increasingly difficult to actually manage Variants and RefVariant constructors for string literals for reasons explained here, though this will be tackled in a later article.
There are two methods of handling string literals that I am aware of; the first is to make use of some form of a light-weight wrapper. A small wrapper object can contain a const char * data member, and perhaps an unsigned integer to represent the length, and any number of utility functions for common string operations (concat, copy, compare, etc). The use of such a wrapper would look like:
The S would be the class type of the wrapper itself, and the constructor would take a const char *. This would require every place in code that handles a string literal to make use of the S wrapper. This can be quite annoying, but has great performance benefits compared to std::string, especially when some reference counting is used to handle the heap allocated const char * data member holding the string data in order to avoid unnecessary copying. Here's an example skeleton class for such an S wrapper:
As I mentioned before, I found this to be rather annoying; I want my dev team and myself to be able to freely pass along a string literal anywhere and have MetaData handle the type properly. In order to do this, a very ugly and crazy solution was devised. There's a need to create a RemQual struct for every [ ] type for all values of x. This isn't possible. However, it is possible to overload RemQual for a few values of x, at least enough to cover any realistic use of a string literal within C++ code. Observe:
The macro ARRAY_OVERLOAD creates a RemQual overload with a value of x. The __COUNTER__ macro (though not standard) increments by one each time the macro is used. This allows for copy/pasting of the ARRAY_OVERLOAD macro, which will generate a lot of RemQual overloads. I created a file with enough overloads to cover any realistically sized string literal. As an alternative to the __COUNTER__ macro, __LINE__ can be used instead, however I imagine it might be difficult to ensure you have one definition per line without any gaps. As far as I know, __COUNTER__ is supported on both GNU and MSVC++.
Not only will the ARRAY_OVERLOAD cover types of string literals, but it will also cover types with array brackets [ ] of any type passed to RemQual.
The second issue is the ability to properly reflect private data members. There are three solutions to reflecting private data that I am aware of. The first is to try to grant access to the MetaData system by specifying that the MetaCreator of the type in question is a friend class. I never really liked the idea of this solution and haven't actually tried it for myself, and so I can't really comment on the idea any further than this.
The next possible solution is to make use of properties. A property is a set of three things: a gettor; a settor; a member. The gettor and settor provide access to the private member stored within the class. The user can then specify gettors and/or settors from the ADD_MEMBER macro. I haven't implemented this method myself, but would definitely like if I find the time to create such a system. This solution is by far the most elegant of the three choices that I'm presenting. Here's a link to some information on creating some gettor and settor support for a MetaData system like the one in this article series. This can potentially allow a MetaData system to reflect class definitions that the user does not have source code access to, so long as the external class has gettor and settor definitions that are compatible with the property reflection.
The last solution is arguably more messy, but it's easier to implement and works perfectly fine. I chose to implement this method in my own project because of how little time it took to set up a working system. Like I said earlier, if I have time I'd like to add property support, though right now I simply have more important things to finish.
The idea of the last solution is to paste a small macro inside of your class definitions. This small macro then pastes some code within the class itself, and this code grants access to any private data member by using the NullCast pointer trick. This means that in order to reflect private data, you must have source code access to the class in question in order to place your macro. Here's what the new macros might look like, but be warned it gets pretty hectic:
The META_DATA macro is to be placed within a class, it places a couple declarations for NullCast, AddMember and RegisterMetaData. The DEFINE_META macro is modified to provide definitions for the method declarations created by the META_DATA macro. This allows the NullCast to retrieve the type to cast to from the DEFINE_META's TYPE parameter. Since AddMember method is within the class itself, it can now have proper access to private data within the class. The AddMember definition within the class then forwards the information it gathers to the AddMember function within the MetaCreator.
In order for the DEFINE_META api to remain the same as before, the META_DATA macro creates a RegisterMetaData declaration within the class itself. This allows the ADD_MEMBER macro to not need to user to supply to type of class to operate upon. This might be a little confusing, but imagine trying to refactor the macros above. Is the RegisterMetaData macro even required to be placed into the class itself? Can't the RegisterMetaData function within the MetaCreator call AddMember on the class type itself? The problem with this is that the ADD_MEMBER macro would require the user to supply the type to the macro like this:
This would be yet another thing the user of the MetaData system would be required to perform, thus cluttering the API. I find that by keeping the system as simple as possible is more beneficial than factoring out the definition of RegisterMetaData from the META_DATA macro.
Here's an example usage of the new META_DATA and DEFINE_META macros:
The only additional step required here is for the user to remember to place the META_DATA macro within the class definition. The rest of the API remains as intuitive as before.
Here's a link to a compileable (in VS2010) example showing everything I've talked about in the MetaData series thus far. The next article in this series will likely be in creating the Variant type for PODs.
The first issue is with our RemQual struct. In the previous article we had support for stripping off qualifiers such as *, const or &. We even had support for stripping off an R-value reference. However, the RemQual struct had no support for a pointer to a pointer. It is weird that RemQual<int**> would behave differently than RemQual<int*>, and so on. To solve this issue we can cycle down, at compile time, the type through the RemQual struct recursively, until a type arrives at the base RemQual definition. Here's an example:
//
// RemQual
// Strips down qualified types/references/pointers to a single unqualified type, for passing into
// a templated type as a typename parameter.
//
template <typename T>
struct RemQual
{
typedef T type;
};
template <typename T>
struct RemQual<const T>
{
typedef typename RemQual<T>::type type;
};
template <typename T>
struct RemQual<T&>
{
typedef typename RemQual<T>::type type;
};
template <typename T>
struct RemQual<const T&>
{
typedef typename RemQual<T>::type type;
};
template <typename T>
struct RemQual<T&&>
{
typedef typename RemQual<T>::type type;
};
template <typename T>
struct RemQual<const T *>
{
typedef typename RemQual<T *>::type type;
};
As you can see, this differs a bit from our previous implementation. The way it works is by passing in a single type to the RemQual struct via typename T. Then, the templating matches the type provided with one of the overloads and feeds the type back into the RemQual struct with less qualifiers. This acts as some sort of compile-time "recursive" qualifier stripping mechanism; I'm afraid I don't know what to properly call this technique. This is useful for finding out what the "base type" of any given type.
It should be noted that the example code above does not strip pointer qualifiers off of a type. This is to allow the MetaData system to properly provide MetaData instances of pointer types; which is necessary to reflect pointer meta.
It should be noted that in order to support pointer meta, the RemQual struct will need to be modified so it does not strip off the * qualifier. This actually applies to any qualifier you do not wish to have stripped.
There's one last "improvement" one could make to the RemQual struct that I'm aware of. I don't actually consider this an improvement, but more of a feature or decision. There comes a time when the user of a MetaData system may want to write a tidbit of code like the following:
SendMessage( "Message ID", Param1, Param2 );
Say the user wants to send a message object from one place to another. Imagine this message object can take three parameters of any type, and the reflection system can help the constructor of the message figure out the types of the data at run-time (how to actually implement features like this will be covered once Variants and RefVariants are introduced). This means that the message can take three parameters of any type and then take them as payload to deliver elsewhere.
However, there's a subtle problem with the "Message ID" in particular. Param1 and Param2 are assumed to be POD types like float or int, however "Message ID" is a const char * string literal. My understanding of string literals in C++ is that they are of the type: const char[ x ], x being the number of characters in the literal. This poses a problem for our templated MetaCreator, in that every value of x will create a new MetaData instance, as the templating treats each individual value of x as an entire new type. Now how can RemQual handle this? It gets increasingly difficult to actually manage Variants and RefVariant constructors for string literals for reasons explained here, though this will be tackled in a later article.
There are two methods of handling string literals that I am aware of; the first is to make use of some form of a light-weight wrapper. A small wrapper object can contain a const char * data member, and perhaps an unsigned integer to represent the length, and any number of utility functions for common string operations (concat, copy, compare, etc). The use of such a wrapper would look like:
SendMessage( S( "Message ID" ), Param1, Param2 );
The S would be the class type of the wrapper itself, and the constructor would take a const char *. This would require every place in code that handles a string literal to make use of the S wrapper. This can be quite annoying, but has great performance benefits compared to std::string, especially when some reference counting is used to handle the heap allocated const char * data member holding the string data in order to avoid unnecessary copying. Here's an example skeleton class for such an S wrapper:
class S
{
public:
S( const char *src ) : data( src ) {}
~S( ) {}
bool IsEqual( const S& rhs ) const;
void Concat( const S& rhs )
{
char *temp = new char[rhs.len + len];
strcpy( temp, data );
strcpy( temp + len, rhs.data );
}
private:
const char *data;
unsigned len;
};
As I mentioned before, I found this to be rather annoying; I want my dev team and myself to be able to freely pass along a string literal anywhere and have MetaData handle the type properly. In order to do this, a very ugly and crazy solution was devised. There's a need to create a RemQual struct for every [ ] type for all values of x. This isn't possible. However, it is possible to overload RemQual for a few values of x, at least enough to cover any realistic use of a string literal within C++ code. Observe:
#define INTERNAL_ARRAY_OVERLOAD( x ) \
template <typename T> \
struct RemQual<T[ x ]> \
{ \
typedef typename RemQual<T *>::type type; \
}; \
\
template <typename T> \
struct RemQual<const T[ x ]> \
{ \
typedef typename RemQual<T *>::type type; \
};
#define ARRAY_OVERLOAD( ) \
INTERNAL_ARRAY_OVERLOAD( __COUNTER__ )
The macro ARRAY_OVERLOAD creates a RemQual overload with a value of x. The __COUNTER__ macro (though not standard) increments by one each time the macro is used. This allows for copy/pasting of the ARRAY_OVERLOAD macro, which will generate a lot of RemQual overloads. I created a file with enough overloads to cover any realistically sized string literal. As an alternative to the __COUNTER__ macro, __LINE__ can be used instead, however I imagine it might be difficult to ensure you have one definition per line without any gaps. As far as I know, __COUNTER__ is supported on both GNU and MSVC++.
Not only will the ARRAY_OVERLOAD cover types of string literals, but it will also cover types with array brackets [ ] of any type passed to RemQual.
The second issue is the ability to properly reflect private data members. There are three solutions to reflecting private data that I am aware of. The first is to try to grant access to the MetaData system by specifying that the MetaCreator of the type in question is a friend class. I never really liked the idea of this solution and haven't actually tried it for myself, and so I can't really comment on the idea any further than this.
The next possible solution is to make use of properties. A property is a set of three things: a gettor; a settor; a member. The gettor and settor provide access to the private member stored within the class. The user can then specify gettors and/or settors from the ADD_MEMBER macro. I haven't implemented this method myself, but would definitely like if I find the time to create such a system. This solution is by far the most elegant of the three choices that I'm presenting. Here's a link to some information on creating some gettor and settor support for a MetaData system like the one in this article series. This can potentially allow a MetaData system to reflect class definitions that the user does not have source code access to, so long as the external class has gettor and settor definitions that are compatible with the property reflection.
The last solution is arguably more messy, but it's easier to implement and works perfectly fine. I chose to implement this method in my own project because of how little time it took to set up a working system. Like I said earlier, if I have time I'd like to add property support, though right now I simply have more important things to finish.
The idea of the last solution is to paste a small macro inside of your class definitions. This small macro then pastes some code within the class itself, and this code grants access to any private data member by using the NullCast pointer trick. This means that in order to reflect private data, you must have source code access to the class in question in order to place your macro. Here's what the new macros might look like, but be warned it gets pretty hectic:
#define DEFINE_META( TYPE ) \
MetaCreator NAME_GENERATOR( )( #TYPE, sizeof( TYPE ), true ); \
RemQualPtr<TYPE>::type *TYPE::NullCast( void ) { return reinterpret_cast(NULL); } \
void TYPE::AddMember( std::string name, unsigned offset, MetaData *data ) { return MetaCreator<RemQualPtr<TYPE>::type>::AddMember( name, offset, data ); } \
void MetaCreator<RemQualPtr<TYPE>::type>::RegisterMetaData( void ) { TYPE::RegisterMetaData( ); } \
void TYPE::RegisterMetaData( void )
//
// META_DATA
// Purpose : This macro goes on the inside of a class within the public section. It declares
// a few member functions for use by the MetaData system to retrieve information about
// the class.
#define META_DATA( TYPE ) \
static void AddMember( std::string name, unsigned offset, MetaData *data ); \
static RemQual<TYPE>::type *NullCast( void ); \
static void RegisterMetaData( void )
The META_DATA macro is to be placed within a class, it places a couple declarations for NullCast, AddMember and RegisterMetaData. The DEFINE_META macro is modified to provide definitions for the method declarations created by the META_DATA macro. This allows the NullCast to retrieve the type to cast to from the DEFINE_META's TYPE parameter. Since AddMember method is within the class itself, it can now have proper access to private data within the class. The AddMember definition within the class then forwards the information it gathers to the AddMember function within the MetaCreator.
In order for the DEFINE_META api to remain the same as before, the META_DATA macro creates a RegisterMetaData declaration within the class itself. This allows the ADD_MEMBER macro to not need to user to supply to type of class to operate upon. This might be a little confusing, but imagine trying to refactor the macros above. Is the RegisterMetaData macro even required to be placed into the class itself? Can't the RegisterMetaData function within the MetaCreator call AddMember on the class type itself? The problem with this is that the ADD_MEMBER macro would require the user to supply the type to the macro like this:
#define ADD_MEMBER( TYPE, MEMBER ) \
TYPE::AddMember( #MEMBER, (unsigned)(&(NullCast( )->MEMBER)), META( NullCast( )->MEMBER ))
This would be yet another thing the user of the MetaData system would be required to perform, thus cluttering the API. I find that by keeping the system as simple as possible is more beneficial than factoring out the definition of RegisterMetaData from the META_DATA macro.
Here's an example usage of the new META_DATA and DEFINE_META macros:
class Sprite : public Component
{
public:
Sprite( std::string tex = "scary.png" );
virtual void SendMessage( Message *msg );
const std::string& GetTextureName( void );
void SetTextureName( std::string name );
META_DATA( Sprite );
private:
std::string texture;
};
DEFINE_META( Sprite )
{
ADD_MEMBER( name );
ADD_MEMBER( texture );
}
The only additional step required here is for the user to remember to place the META_DATA macro within the class definition. The rest of the API remains as intuitive as before.
Here's a link to a compileable (in VS2010) example showing everything I've talked about in the MetaData series thus far. The next article in this series will likely be in creating the Variant type for PODs.
Labels:
C++,
Class,
Data,
Meta,
Meta Data,
MetaData,
MetaProgramming,
Reflection,
Template,
Type
Tuesday, October 2, 2012
C++ Reflection: Type MetaData: Part 2 - Type Reduction and Members
In the last post we learned the very basics of setting up a reflection system. The whole idea is that the user manually adds types into the system using a single simple macro placed within a cpp file, DEFINE_META.
Whenever you pass in a const, reference, or pointer qualifier an entire new templated MetaCreator will be constructed by the compiler. This just won't do, as we don't want the MetaData of a const int to be different at all from an int, or any other registered type. There's a simple, yet very quirky, trick that can solve all of our problems. Take a look at this:
I'm actually not familiar with the exact terminology to describe what's going on here, but I'll try my best. There's many template overloads of the first RemQual struct, which acts as the "standard". The standard is just a single plain type T, without any qualifiers and without pointer or reference type. The rest of the templated overloaded version all contain a single typedef which lets the entire struct be used to reference a single un-qualified type by supplying any of the various overloaded types to the struct's typename param.
Overloads for the R-value reference must be added as well in order to strip down to the bare type T.
Now that we have our RemQual (remove qualifiers) struct, we can use it within our META macros to refer to MetaData types. Take a look at some example re-writes of the three META macros:
The idea is you feed in the typedef'd type from RemQual into the MetaCreator typename param. This is an example of using macros well; there's no way to screw up the usage of them, and they are still very clean and easy to debug as there isn't really any abuse going on. Feel free to ignore specific META macros you wouldn't actually use. I use all three META_TYPE, META and META_STR. It's a matter of personal preference on what you actually implement in this respect. It will likely be pretty smart to place whatever API is created into a namespace of it's own, however.
And that covers type deduction. There are some other ways of achieving the same effect, like partial template specialization as covered here, though I find this much simpler.
Next up is register members of structures or classes with the MetaData system. Before anything continues, lets take a look at an example Member struct. A Member struct is a container of the various bits of information we'd like to store about any member:
This member above is actually almost exactly what implementation I have in my own reflection as it stands while I write this; there's not a lot needed. You will want a MetaData instance to describe the type of data contained, a name identifier, and an unsigned offset representing the member's location within the containing object. The offset is exceptionally important for automated serialization, which I'll likely be covering after this article.
The idea is that a MetaData instance can contain various member objects. These member objects are contained within some sort of container (perhaps std::vector).
In order to add a member we'll want another another very simple macro. There are two big reasons a macro is efficient in this situation: we can use stringize; there's absolutely no way for the user to screw it up.
Before showing the macro I'd like to talk about how to retrieve the offset. It's very simple. Take the number zero, and turn this into a pointer to a type of object (class or struct). After the typecasting, use the -> operator to access one of the members. Lastly, use the & operator to retrieve the address of the member's location (which will be offset from zero by the -> operator) and typecast this to an unsigned integer. Here's what this looks like:
This is quite the obtrusive line of code we have here! This is also a good example of a macro used well; it takes a single parameter and applies it to multiple locations. There's hardly any way for the user of this macro to screw up.
NullCast is a function I'll show just after this paragraph. All it does is return a pointer to NULL (memory address zero) of some type. Having this type pointer to address zero, we then use the ADD_MEMBER macro to provide the name of a member to access. The member is then accessed, and the & operator provides an address to this member with an offset from zero. This value is then typecasted to an unsigned integer and and passed along to the AddMember function within the macro. The stringize operator is also used to pass a string representation of the member to the AddMember function, as well as a MetaData instance of whatever the type of data the member is.
Now where does this AddMember function actually go? Where is it from? It's actually placed into a function definition. The function AddMember itself resides within the MetaCreator. This allows the MetaCreator to call the AddMember function of the MetaData instance it holds, which then adds the Member object into the container of Members within the MetaData instance.
Now, the only place that this AddMember function can be called from, building from the previous article, is within the MetaCreator's constructor. The idea is to use the DEFINE_META macro to also create a definition of either the MetaCreator's constructor, or a MetaCreator method that is called from the MetaCreator's constructor. Here's an example:
As you can see this formation is actually very intuitive; it has C++-like syntax, and it's very clear what is going on here. A GameObject is being registered in the Meta system, and it has members of ID, active, and components being added to the Meta system. For clarity, here's what the GameObject's actual class definition might look like (assuming component based architecture):
Now lets check out what the new DEFINE_META macro could look like:
The RegisterMetaData declaration is quite peculiar, as the macro just ends there. What this is doing is setting up the definition of the RegisterMetaData function, so that the ADD_MEMBER macro calls are actually lines of code placed within the definition. The RegisterMetaData function should be called from the MetaCreator's constructor. This allows the user to specify what members to reflect within a MetaData instance of a particular type in a very simple and intuitive way.
Last but not least, lets talk about the NullCast function real quick. It resides within the MetaCreator, as NullCast requires the template's typename MetaType in order to return a pointer to a specific type of data.
And that's that! We can now store information about the members of a class and deduce types from objects in an easily customizeable way.
Here's a link to a demonstration program, compileable in Visual Studio 2010. I'm sure this could compile in GCC with a little bit of house-keeping as well, but I don't really feel like doing this as I need to get to bed! Here's the output of the program. The format is <type> <size>. For the object, members and their offsets are printed:
Now you might notice at some point in time, you cannot reflect private data members! This detail will be covered in a later article. The idea behind it is that you require source code access to the type you want to reflect, and place a small tiny bit of code inside to gain private data access. Either that or make the MetaCreator a friend class (which sounds like a messy solution to me).
And here we have all the basics necessary for automated serialization! We can reflect the names of members, their types, and offsets within an object. This lets the reflection register any type of C++ data within itself.
In this article I'll talk about type deduction and member reflection, both of which are critical building blocks for everything else.
First up is type deduction. When using the templated MetaCreator class:
template <typename Metatype>
class MetaCreator
{
public:
MetaCreator( std::string name, unsigned size )
{
Init( name, size );
}
static void Init( std::string name, unsigned size )
{
Get( )->Init( name, size );
}
// Ensure a single instance can exist for this class type
static MetaData *Get( void )
{
static MetaData instance;
return &instance;
}
};
Whenever you pass in a const, reference, or pointer qualifier an entire new templated MetaCreator will be constructed by the compiler. This just won't do, as we don't want the MetaData of a const int to be different at all from an int, or any other registered type. There's a simple, yet very quirky, trick that can solve all of our problems. Take a look at this:
//
// RemQual
// Strips down qualified types/references/pointers to a single unqualified type, for passing into
// a templated type as a typename parameter.
//
template <typename T>
struct RemQual
{
typedef T type;
};
template <typename T>
struct RemQual<const T>
{
typedef T type;
};
template <typename T>
struct RemQual<T&>
{
typedef T type;
};
template <typename T>
struct RemQual<const T&>
{
typedef T type;
};
template <typename T>
struct RemQual<T&&>
{
typedef T type;
};
template <typename T>
struct RemQual<T *>
{
typedef T type;
};
template <typename T>
struct RemQual<const T *>
{
typedef T type;
};
I'm actually not familiar with the exact terminology to describe what's going on here, but I'll try my best. There's many template overloads of the first RemQual struct, which acts as the "standard". The standard is just a single plain type T, without any qualifiers and without pointer or reference type. The rest of the templated overloaded version all contain a single typedef which lets the entire struct be used to reference a single un-qualified type by supplying any of the various overloaded types to the struct's typename param.
Overloads for the R-value reference must be added as well in order to strip down to the bare type T.
Now that we have our RemQual (remove qualifiers) struct, we can use it within our META macros to refer to MetaData types. Take a look at some example re-writes of the three META macros:
//
// META_TYPE
// Purpose: Retrieves the proper MetaData instance of an object by type.
//
#define META_TYPE( TYPE ) (MetaCreator<RemQual<TYPE>::type>::Get( ))
//
// META
// Purpose: Retrieves the proper MetaData instance of an object by an object's type.
//
#define META( OBJECT ) (MetaCreator<RemQual<decltype( OBJECT )>::type>::Get( ))
//
// META_STR
// Purpose : Finds a MetaData instance by string name
//
#define META_STR( STRING ) (MetaManager::Get( STRING ))
//
// DEFINE_META
// Purpose : Defines a MetaCreator for a specific type of data
//
#define DEFINE_META( TYPE ) \
MetaCreator<RemQual<TYPE>
::type> NAME_GENERATOR( )( #TYPE, sizeof( TYPE ) )
The idea is you feed in the typedef'd type from RemQual into the MetaCreator typename param. This is an example of using macros well; there's no way to screw up the usage of them, and they are still very clean and easy to debug as there isn't really any abuse going on. Feel free to ignore specific META macros you wouldn't actually use. I use all three META_TYPE, META and META_STR. It's a matter of personal preference on what you actually implement in this respect. It will likely be pretty smart to place whatever API is created into a namespace of it's own, however.
And that covers type deduction. There are some other ways of achieving the same effect, like partial template specialization as covered here, though I find this much simpler.
Next up is register members of structures or classes with the MetaData system. Before anything continues, lets take a look at an example Member struct. A Member struct is a container of the various bits of information we'd like to store about any member:
//
// Member
// Purpose: Stores information (name and offset of member) about a data member of a specific class. Multiple
// Member objects can be stored in MetaData objects within a std::vector.
//
class Member
{
public:
Member( std::string string, unsigned val, MetaData *meta );
~Member( );
const std::string &Name( void ) const; // Gettor for name
unsigned Offset( void ) const; // Gettor for offset
const MetaData *Meta( void ) const; // Gettor for data
private:
std::string name;
unsigned offset;
const MetaData *data;
};
This member above is actually almost exactly what implementation I have in my own reflection as it stands while I write this; there's not a lot needed. You will want a MetaData instance to describe the type of data contained, a name identifier, and an unsigned offset representing the member's location within the containing object. The offset is exceptionally important for automated serialization, which I'll likely be covering after this article.
The idea is that a MetaData instance can contain various member objects. These member objects are contained within some sort of container (perhaps std::vector).
In order to add a member we'll want another another very simple macro. There are two big reasons a macro is efficient in this situation: we can use stringize; there's absolutely no way for the user to screw it up.
Before showing the macro I'd like to talk about how to retrieve the offset. It's very simple. Take the number zero, and turn this into a pointer to a type of object (class or struct). After the typecasting, use the -> operator to access one of the members. Lastly, use the & operator to retrieve the address of the member's location (which will be offset from zero by the -> operator) and typecast this to an unsigned integer. Here's what this looks like:
#define ADD_MEMBER( MEMBER ) \
AddMember( #MEMBER, (unsigned)(&(NullCast( )->MEMBER)), META( NullCast( )->MEMBER ))
This is quite the obtrusive line of code we have here! This is also a good example of a macro used well; it takes a single parameter and applies it to multiple locations. There's hardly any way for the user of this macro to screw up.
NullCast is a function I'll show just after this paragraph. All it does is return a pointer to NULL (memory address zero) of some type. Having this type pointer to address zero, we then use the ADD_MEMBER macro to provide the name of a member to access. The member is then accessed, and the & operator provides an address to this member with an offset from zero. This value is then typecasted to an unsigned integer and and passed along to the AddMember function within the macro. The stringize operator is also used to pass a string representation of the member to the AddMember function, as well as a MetaData instance of whatever the type of data the member is.
Now where does this AddMember function actually go? Where is it from? It's actually placed into a function definition. The function AddMember itself resides within the MetaCreator. This allows the MetaCreator to call the AddMember function of the MetaData instance it holds, which then adds the Member object into the container of Members within the MetaData instance.
Now, the only place that this AddMember function can be called from, building from the previous article, is within the MetaCreator's constructor. The idea is to use the DEFINE_META macro to also create a definition of either the MetaCreator's constructor, or a MetaCreator method that is called from the MetaCreator's constructor. Here's an example:
DEFINE_META( GameObject )
{
ADD_MEMBER( ID );
ADD_MEMBER( active );
ADD_MEMBER( components );
}
As you can see this formation is actually very intuitive; it has C++-like syntax, and it's very clear what is going on here. A GameObject is being registered in the Meta system, and it has members of ID, active, and components being added to the Meta system. For clarity, here's what the GameObject's actual class definition might look like (assuming component based architecture):
class GameObject
{
public:
// Sends all components within this object the same message
void SendMessage( Message *msg );
bool HasComponent( std::string& name ) const;
void AddComponent( std::string componentType );
Component *GetComponent( std::string name ) const;
handle GetID( void ) const;
handle ID;
// This boolean should always be true when the object is alive. If this is
// set to false, then the ObjectFactory will clean it up and delete this object
// during its inactive sweep when the ObjectFactory's update is called.
bool active;
std::vector components;
private:
// Only to be used by the factory!
GameObject( ) : ID( -1 ), active( true ) {}
~GameObject( );
};
Now lets check out what the new DEFINE_META macro could look like:
#define DEFINE_META( TYPE ) \
MetaCreator<RemQual<TYPE>::type> NAME_GENERATOR( )( #TYPE, sizeof( TYPE ) ); \
void MetaCreator<RemQual<TYPE>::type>::RegisterMetaData( void )
The RegisterMetaData declaration is quite peculiar, as the macro just ends there. What this is doing is setting up the definition of the RegisterMetaData function, so that the ADD_MEMBER macro calls are actually lines of code placed within the definition. The RegisterMetaData function should be called from the MetaCreator's constructor. This allows the user to specify what members to reflect within a MetaData instance of a particular type in a very simple and intuitive way.
Last but not least, lets talk about the NullCast function real quick. It resides within the MetaCreator, as NullCast requires the template's typename MetaType in order to return a pointer to a specific type of data.
And that's that! We can now store information about the members of a class and deduce types from objects in an easily customizeable way.
Here's a link to a demonstration program, compileable in Visual Studio 2010. I'm sure this could compile in GCC with a little bit of house-keeping as well, but I don't really feel like doing this as I need to get to bed! Here's the output of the program. The format is <type> <size>. For the object, members and their offsets are printed:
int
4
float
4
std::string
28
Object
8
{
ID
0
active
4
}
Now you might notice at some point in time, you cannot reflect private data members! This detail will be covered in a later article. The idea behind it is that you require source code access to the type you want to reflect, and place a small tiny bit of code inside to gain private data access. Either that or make the MetaCreator a friend class (which sounds like a messy solution to me).
And here we have all the basics necessary for automated serialization! We can reflect the names of members, their types, and offsets within an object. This lets the reflection register any type of C++ data within itself.
Labels:
C++,
Class,
Data,
Meta,
Meta Data,
MetaData,
MetaProgramming,
Reflection,
Template,
Type
Sunday, September 30, 2012
C++ Reflection: Type MetaData: Introduction
Thinking about how to take things a step further in terms of what your tools can do for you is really important in increasing productivity. This is where a form of reflection can come in very handy. I call the reflection system I'm working on by shorthand of "meta" or "MetaData", though the proper terminology for it would be something more like Class MetaData or Type Reflection. So when I say MetaData I'm referring to data about data, and specifically data about the types of data within your C++ code.
Having a MetaData system allows for information about your data types to be saved during run-time for use during run-time. The C++ compiler compiles away a lot of information, and a lot of this information is very useful. So, a MetaData system saves this information from being compiled away.
So what is the use of such a system? Where here are the things I've constructed so far: simple and powerful debugging features; automated serialization for any type registered with Meta; automic function binding to Lua; Variant and RefVariant class types (objects that can hold any type of data registered within Meta). However these aren't the only things I'll be using the MetaData system for. One could also apply the MetaData to make simple object factories with hardly any code, or property viewing tables for editors with ease. I'm sure there's even more uses as well that I just haven't quite come to terms with. In a way, systems like this can generate tools and functionality for the developer whenever a new class or data type is introduced.
Before we start I want to let the reader know that such a system can be very efficient, efficient enough to run wonderfully within a real-time application like a game. I myself don't know a lot about efficiency or optimization at a low-level, though I do know other programmers who've made such Reflection systems that are very, very fast. So make sure not to have any pre-built misconceptions about a C++ Reflection system before moving on.
I started learning about how to construct a MetaData system from this post. You can take a look if you like, though it won't be necessary if you just want to learn what I'm trying to teach here. In that post, by a fellow student here at DigiPen, he goes over how to get started with MetaData, but leaves a lot to yet be learned. I'll be modelling my post after his quite a bit, as he does have good content structure for an introduction post.
I'll attempt to document what I've learned here as honestly, there's not any resources on constructing a MetaData system like this anywhere as far as I know. The closest thing I could find was from a Game Programming Gems book, the 5th one chapter 1.4, although it required all classes that wanted to participate to inherit from the MetaData class. This isn't really sufficient if you want to reflect class or structure types you don't have source code access to, and it doesn't support the built-in types at all.
Getting Started
To start lets talk about the overall structure of what a MetaData system looks like. I think it's going to best to draw a diagram of what will be achieved from this article:
In the above diagram the MetaData objects are the key object in which all important operations are performed. A MetaData object is a non-templated class, allowing them to be stored in a data structure. There is one MetaData object for each type of data registered to the system, and the MetaData object of a corresponding type is a representation of that type. It stores information about whether or not it is a Plain Old Data type (POD), the size of the type, it's members and methods, and name. Inheritance information can be stored along with multiple inheritance information, though I haven't even bothered adding that feature in yet as it's not actually very useful and quite difficult to do properly.
The MetaCreator is a templated class that manages the creation of a single unique MetaData instance. It then adds its instance into the MetaManager which contains it in some sort of map.
The MetaManager is a non-templated class that contains all of the created MetaData instances, and can perform search operations on them to find specific types. I use a map of strings to instances, so I can search by string identifier. I've also placed some other small utility functions into my MetaManager as well.
Client Code
Before we get started writing anything, I'd like to try to show some example client code to exemplify why I've taken all this time to make a MetaData system in the first place.
As you can see I have a nice DESERIALIZE macro that can deserialize any type of data registered within within the Reflection. My entire serialization file (includes in and out) is only about 400 lines of code, and I implemented my own custom file format. I also have a LuaReference data type, which contains a handle and a MetaData instance, and allows any class to be sent to Lua via handle. Because of my Meta system I can write very generic and powerful code pretty easily.
Getting Started
The first I recommend starting with is to reflect the size of a type, and the name of a type. Here's what a very simple MetaData class could like:
This simple class just stores the size and name of a type of data. The next thing required is the ability to create a unique instance of MetaData. This requires a templated class called the MetaCreator.
You should be able to see that by passing in a type, perhaps <int> to the MetaCreator, a single MetaData instance becomes available from the Get function, and is associated with the MetaCreator<int> class. Any type can be passed into the Metatype typename. The MetaCreator constructor initializes the MetaData instance. This is important. In the future you'll have MetaData for a class that contains MetaData for POD types. However because of out-of-order initialization, some of the types required to construct your class MetaData instance might not be initialized (as in the Init function will not have been called) yet. However if you use the MetaManager<>::Get( ) function, you can retrieve a pointer to the memory location that will be initialized once the MetaCreator of that specific type is constructed. It should be noted that the construction of a MetaCreator happens within a macro, so that there's absolutely no way of screwing up the type registration (the inside of the macro will become quite... ugly).
Lastly you'll need a place to store all of your MetaData instances: the MetaManager!
And there we have a nice way to store all of our MetaData instances. The Get function is most useful in retrieving a MetaData instance of a type by string name. Now that we have our three major facilities setup, we can talk about the macros involved in actually registering a type within the MetaData system.
So far so good. Using the DEFINE_META macro it's pretty easy to add a type to the MetaData system, simply do DEFINE_META( type );. The decltype might be confusing, as it's new. decltype simply returns the type of an object. This allows the user to retrieve a MetaData instance that corresponds to an object's type, without knowing what the object's type is; this lets very generic code be easily written.
NAME_GENERATOR is a bit tricky. Every single instance of a MetaCreator needs to be constructed at global scope- this is the only way to get the Init( ) function to be called, without having to place your DEFINE_META macro in some sort of code scope. Without the constructor of the MetaCreator calling Init, the only way to have any sort of code run by using the DEFINE_META macro is to place it within some scope that is run sometime after main executes. This makes the use of the DEFINE_META macro harder. If you create the MetaCreator at global scope, then you can have the constructor's code run before main executes at all, making the DEFINE_META macro very easy and simple to use.
So then comes the issue of "what do I call my MetaCreator?" arises. The first thing you might think of is, just call it MetaCreator and make it static. This hides the MetaCreator at file scope, allowing the DEFINE_META macro to be used once per file without any naming conflicts. However, what if you need more than one DEFINE_META in a file? The next solution I thought of was to use token pasting: ## operator. Here's an example usage of the token pasting technique:
The only problem with this strategy is that you cannot pass in type names with special characters or spaces, as that won't result in a proper token name. The last solution is to use some sort of unique name generation technique. There are two macros __LINE__ and __FILE__ that can be used to generate a unique name each time the macro is used, so long as it is not used twice on the same line of the same file. The __LINE__ and __FILE__ definitions are a bit tedious to use, but I think I have them working properly, like this:
You have to feed in the __COUNTER__ definition carefully in order to make sure they are translated by the preprocessor into their respective values. A resulting token could look like: GENERATED_NAME__29. This is perfect for creating all of the MetaCreators at global scope on the stack. Without the use of some sort of trick like this, it can be very annoying to have to use a function call to register your MetaData.
Alternatively there are __FILE__ and __LINE__ macros, but they aren't really necessary as the __COUNTER__ does everything we need. The __COUNTER__ however is, I believe, not actually standard.
So far everything is very straightforward, as far as I can tell. Please ask any questions you have in the comments!
It should be noted that const, &, and * types all create different MetaData instances. As such, there is a trick that can be used to strip these off of an object when using the META macro. This will be covered in a subsequent article.
Example Uses
Now lets check out some example client code of what our code can actually do!
And there we have it! An extremely easy to use DEFINE_META macro that stores name and size information of any type, including class and struct types.
Future Posts
For future posts I look forward to writing about automated Lua binding, Variants and RefVariants, automated Serialization, factory usage, messaging, and perhaps other topics as well! These topics are really rather huge, so please by patient in that it may take a while to cover everything.
Link to second article in series.
Having a MetaData system allows for information about your data types to be saved during run-time for use during run-time. The C++ compiler compiles away a lot of information, and a lot of this information is very useful. So, a MetaData system saves this information from being compiled away.
So what is the use of such a system? Where here are the things I've constructed so far: simple and powerful debugging features; automated serialization for any type registered with Meta; automic function binding to Lua; Variant and RefVariant class types (objects that can hold any type of data registered within Meta). However these aren't the only things I'll be using the MetaData system for. One could also apply the MetaData to make simple object factories with hardly any code, or property viewing tables for editors with ease. I'm sure there's even more uses as well that I just haven't quite come to terms with. In a way, systems like this can generate tools and functionality for the developer whenever a new class or data type is introduced.
Before we start I want to let the reader know that such a system can be very efficient, efficient enough to run wonderfully within a real-time application like a game. I myself don't know a lot about efficiency or optimization at a low-level, though I do know other programmers who've made such Reflection systems that are very, very fast. So make sure not to have any pre-built misconceptions about a C++ Reflection system before moving on.
I started learning about how to construct a MetaData system from this post. You can take a look if you like, though it won't be necessary if you just want to learn what I'm trying to teach here. In that post, by a fellow student here at DigiPen, he goes over how to get started with MetaData, but leaves a lot to yet be learned. I'll be modelling my post after his quite a bit, as he does have good content structure for an introduction post.
I'll attempt to document what I've learned here as honestly, there's not any resources on constructing a MetaData system like this anywhere as far as I know. The closest thing I could find was from a Game Programming Gems book, the 5th one chapter 1.4, although it required all classes that wanted to participate to inherit from the MetaData class. This isn't really sufficient if you want to reflect class or structure types you don't have source code access to, and it doesn't support the built-in types at all.
Getting Started
To start lets talk about the overall structure of what a MetaData system looks like. I think it's going to best to draw a diagram of what will be achieved from this article:
Diagram of entire MetaData system layout. |
In the above diagram the MetaData objects are the key object in which all important operations are performed. A MetaData object is a non-templated class, allowing them to be stored in a data structure. There is one MetaData object for each type of data registered to the system, and the MetaData object of a corresponding type is a representation of that type. It stores information about whether or not it is a Plain Old Data type (POD), the size of the type, it's members and methods, and name. Inheritance information can be stored along with multiple inheritance information, though I haven't even bothered adding that feature in yet as it's not actually very useful and quite difficult to do properly.
The MetaCreator is a templated class that manages the creation of a single unique MetaData instance. It then adds its instance into the MetaManager which contains it in some sort of map.
The MetaManager is a non-templated class that contains all of the created MetaData instances, and can perform search operations on them to find specific types. I use a map of strings to instances, so I can search by string identifier. I've also placed some other small utility functions into my MetaManager as well.
Client Code
Before we get started writing anything, I'd like to try to show some example client code to exemplify why I've taken all this time to make a MetaData system in the first place.
GameObject *obj = FACTORY->CreateObject( "SampleObject" );
// FACTORY->CreateObject code
GameObject *ObjectFactory::CreateObject( const std::string& fileName )
{
SERIALIZER->OpenFile( fileName );
GameObject *obj = DESERIALIZE( GameObject );
SERIALIZER->CloseFile( );
IDObject( obj );
LUA_SYSTEM->PushGameObject( obj );
return obj;
}
// LUA_SYSTEM->PushGameObject code
void LuaSystem::PushGameObject( GameObject *obj )
{
lua_getglobal( L , "GameObjects" );
lua_pushinteger( L, obj->GetID( ) );
LuaReference *luaObj = (LuaReference *)lua_newuserdata( L, sizeof( LuaReference ) );
luaL_setmetatable( L, META_TYPE( GameObject )->MetaTableName( ) );
luaObj->ID = obj->GetID( );
luaObj->meta = META_TYPE( GameObject ); // grab MetaData instance of this type
lua_settable( L, 1 ); // Stack index 1[3] = 2
// Clear the stack
lua_settop( L, 0 );
}
As you can see I have a nice DESERIALIZE macro that can deserialize any type of data registered within within the Reflection. My entire serialization file (includes in and out) is only about 400 lines of code, and I implemented my own custom file format. I also have a LuaReference data type, which contains a handle and a MetaData instance, and allows any class to be sent to Lua via handle. Because of my Meta system I can write very generic and powerful code pretty easily.
Getting Started
The first I recommend starting with is to reflect the size of a type, and the name of a type. Here's what a very simple MetaData class could like:
//
// MetaData
// Purpose: Object for holding various info about any C++ type for the MetaData reflection system.
//
class MetaData
{
public:
MetaData( std::string string, unsigned val ) : name( string ), size( val ) {}
~MetaData( )
const std::string& Name( void ) const { return name; }
unsigned Size( void ) const { return size; }
private:
std::string name;
unsigned size;
}
This simple class just stores the size and name of a type of data. The next thing required is the ability to create a unique instance of MetaData. This requires a templated class called the MetaCreator.
template <typename Metatype>
class MetaCreator
{
public:
MetaCreator( std::string name, unsigned size )
{
Init( name, size );
}
static void Init( std::string name, unsigned size )
{
Get( )->Init( name, size );
}
// Ensure a single instance can exist for this class type
static MetaData *Get( void )
{
static MetaData instance;
return &instance;
}
};
You should be able to see that by passing in a type, perhaps <int> to the MetaCreator, a single MetaData instance becomes available from the Get function, and is associated with the MetaCreator<int> class. Any type can be passed into the Metatype typename. The MetaCreator constructor initializes the MetaData instance. This is important. In the future you'll have MetaData for a class that contains MetaData for POD types. However because of out-of-order initialization, some of the types required to construct your class MetaData instance might not be initialized (as in the Init function will not have been called) yet. However if you use the MetaManager<>::Get( ) function, you can retrieve a pointer to the memory location that will be initialized once the MetaCreator of that specific type is constructed. It should be noted that the construction of a MetaCreator happens within a macro, so that there's absolutely no way of screwing up the type registration (the inside of the macro will become quite... ugly).
Lastly you'll need a place to store all of your MetaData instances: the MetaManager!
//
// MetaManager
// Purpose: Just a collection of some functions for management of all the
// various MetaData objects.
//
class MetaManager
{
public:
typedef std::map<std::string, const MetaData *> MetaMap;
// Insert a MetaData into the map of objects
static void RegisterMeta( const MetaData *instance );
// Retrieve a MetaData instance by string name from the map of MetaData objects
static const MetaData *Get( std::string name ); // NULL if not found
// Safe and easy singleton for map of MetaData objects
static MetaMap& GetMap( void )
{
// Define static map here, so no need for explicit definition
static MetaMap map;
return map;
}
};
And there we have a nice way to store all of our MetaData instances. The Get function is most useful in retrieving a MetaData instance of a type by string name. Now that we have our three major facilities setup, we can talk about the macros involved in actually registering a type within the MetaData system.
//
// META_TYPE
// Purpose: Retrieves the proper MetaData instance of an object by type.
//
#define META_TYPE( TYPE ) (MetaCreator<TYPE>::Get( ))
//
// META
// Purpose: Retrieves the proper MetaData instance of an object by an object's type.
//
#define META( OBJECT ) (MetaCreator<decltype( OBJECT )>::Get( ))
//
// META_STR
// Purpose : Finds a MetaData instance by string name
//
#define META_STR( STRING ) (MetaManager::Get( STRING ))
//
// DEFINE_META
// Purpose : Defines a MetaCreator for a specific type of data
//
#define DEFINE_META( TYPE ) \
MetaCreator<TYPE> NAME_GENERATOR( )( #TYPE, sizeof( TYPE ) )
So far so good. Using the DEFINE_META macro it's pretty easy to add a type to the MetaData system, simply do DEFINE_META( type );. The decltype might be confusing, as it's new. decltype simply returns the type of an object. This allows the user to retrieve a MetaData instance that corresponds to an object's type, without knowing what the object's type is; this lets very generic code be easily written.
NAME_GENERATOR is a bit tricky. Every single instance of a MetaCreator needs to be constructed at global scope- this is the only way to get the Init( ) function to be called, without having to place your DEFINE_META macro in some sort of code scope. Without the constructor of the MetaCreator calling Init, the only way to have any sort of code run by using the DEFINE_META macro is to place it within some scope that is run sometime after main executes. This makes the use of the DEFINE_META macro harder. If you create the MetaCreator at global scope, then you can have the constructor's code run before main executes at all, making the DEFINE_META macro very easy and simple to use.
So then comes the issue of "what do I call my MetaCreator?" arises. The first thing you might think of is, just call it MetaCreator and make it static. This hides the MetaCreator at file scope, allowing the DEFINE_META macro to be used once per file without any naming conflicts. However, what if you need more than one DEFINE_META in a file? The next solution I thought of was to use token pasting: ## operator. Here's an example usage of the token pasting technique:
DEFINE_META( TYPE ) \
MetaCreator<TYPE> Creator##TYPE( )( #TYPE, sizeof( TYPE ), false )
The only problem with this strategy is that you cannot pass in type names with special characters or spaces, as that won't result in a proper token name. The last solution is to use some sort of unique name generation technique. There are two macros __LINE__ and __FILE__ that can be used to generate a unique name each time the macro is used, so long as it is not used twice on the same line of the same file. The __LINE__ and __FILE__ definitions are a bit tedious to use, but I think I have them working properly, like this:
#define PASTE( _, __ ) _##__
#define NAME_GENERATOR_INTERNAL( _ ) PASTE( GENERATED_NAME, _ )
#define NAME_GENERATOR( ) NAME_GENERATOR_INTERNAL( __COUNTER__ )
You have to feed in the __COUNTER__ definition carefully in order to make sure they are translated by the preprocessor into their respective values. A resulting token could look like: GENERATED_NAME__29. This is perfect for creating all of the MetaCreators at global scope on the stack. Without the use of some sort of trick like this, it can be very annoying to have to use a function call to register your MetaData.
Alternatively there are __FILE__ and __LINE__ macros, but they aren't really necessary as the __COUNTER__ does everything we need. The __COUNTER__ however is, I believe, not actually standard.
So far everything is very straightforward, as far as I can tell. Please ask any questions you have in the comments!
It should be noted that const, &, and * types all create different MetaData instances. As such, there is a trick that can be used to strip these off of an object when using the META macro. This will be covered in a subsequent article.
Example Uses
Now lets check out some example client code of what our code can actually do!
DEFINE_META( int );
DEFINE_META( float );
DEFINE_META( double );
DEFINE_META( std::string );
void main( void )
{
std::cout << META_TYPE( int )->Name( ); // output: "int"
std::cout << META_TYPE( float )->Size( ); // output: "4"
std::string word = "This is a word";
std::cout << META( word )->Name( ); // output: "std::string"
std::cout << META_STR( "double" )->Size( ); // output: "8"
if(META( word ) != META_TYPE( int ) // address comparison
{
std::cout << "This object is not an int!";
}
}
And there we have it! An extremely easy to use DEFINE_META macro that stores name and size information of any type, including class and struct types.
Future Posts
For future posts I look forward to writing about automated Lua binding, Variants and RefVariants, automated Serialization, factory usage, messaging, and perhaps other topics as well! These topics are really rather huge, so please by patient in that it may take a while to cover everything.
Link to second article in series.
Tuesday, September 4, 2012
Simple Smart Pointers
This is going to be the first of many posts abandoning C and heading into C++. I stuck around in C for as long as could, but now heading into my Sophomore year it's time to make the switch.
There are many different implementations and goals of what a smart pointer can be or attempt to achieve. As far as I can see, the term "smart pointer" is a pretty general term that means some sort of extra functionality wrapped around normal pointers. However the smart pointer should be able to behave just like a normal pointer, so overloading of the * and -> operators will be done.
In previous game projects (the ones I written in C in my portfolio) had an issue where when a bit of data was deleted or modified, all other code required to be made aware of such act. This required extra work by the programmer to ensure that no pointers within the program could be accessed if the data they pointed to was modified in an undesirable way (perhaps deleted or converted to another type, or moved). The smart pointer implementation will solve this problem by making it seem like all smart pointers to a single location update each other automatically. Additionally, I wanted to add a little more functionality for ease of use and convenience, of which I'll talk about later in the article.
Here's an example of a problem situation that can arise from misusing ordinary pointers:
I decided to make my smart pointers handle based. A handle system is where some sort of identifier (ID) is mapped to a value by mapping functionality, i.e. std::map. This is extremely useful, as one can now place a layer of indirection between an identifier and its associated value. The reason this is useful goes into many different scenarios, but pertaining to smart pointers we can have a single point of access to the address a smart pointer holds.
Here are the barebones for our smart pointer. The SmartPtr holds an ID, where handle is a typedef'd integer. GetPtr( ) indexes the handleMap (handleMap should be a class to wrap around std::map, to provide inserting and erasing methods) with the SmartPtr's ID to retrieve the pointer value the ID maps to. The SmartPtr itself is templated for a specific type of pointer, this way each different type of data to be pointed to has a different statically defined handleMap. This is a benefit to the amount of unique handles your game can generate. The SmartPtr can be constructed with a handle value, and has a gettor for the handle ID itself.
Now whenever the pointer mapped by a specific ID is modified within the handleMap, all subsequent requests to that value will be updated throughout all other SmartPtr instances with the same ID. This is because the only point of access to the pointer value is through the map itself, meaning there's actually only one pointer mapped by an ID at any given time, although there can be many SmartPtrs with the ID mapping to that single value.
However you may be wondering about how to generate the handle ID itself, as each one needs to be unique. It's very simple: start your first handle at the integer value of 1, and then for every subsequent ID mapped to a new value, add one to the previously used ID. This will give you millions of unique IDs, which is well than more enough for basically any game ever, especially in this case as each type has its own handleMap altogether. The ID value of 0 can be reserved for a NULL pointer to allow to check at any given time if a SmartPtr's handle is valid.
As for the implementation of -> and *, you should use your utility GetPtr( ) function for them. -> will simply require a returning of GetPtr( ), and * will dereference the value returned by GetPtr( ), or return NULL if the value from GetPtr( ) is NULL. In this way your SmartPtr class will behave exactly like an ordinary pointer.
Additionally operator overloads for comparisons, or even pointer arithmetic, can be implemented as well. I implemented equality comparisons.
Lastly, I needed a way to handle inserting a pointer into the handleMap, and consequently erasing it from the map as well. I decided to make the SmartPtr handle this automatically with its constructor and destructor. Currently the example SmartPtr interface I've shown above supports constructing a SmartPtr from an already-generated handle, assuming the handle generation is external the SmartPtr itself. However overloading the constructor to receive a TYPE pointer allows for insertion of this TYPE * into the handleMap automatically. Here's an example:
The constructor inserts the ptr of TYPE into the handleMap by calling a method called InsertPtr( ). The return value is used to initialize the SmartPtr's ID data member. A bool called clearOnDestruct is required to be set, and this will be talked about later.
The InsertPtr( ) is a custom-defined method for your handleMap class. It should just insert a new pointer with a newly generated ID. On destruction of the SmartPtr, if the bool clearOnDestruct is set, the pointer can erase the slot in the handleMap for that ID. This way SmartPtr can contain a pointer for you, perhaps within a linked list or some other data structure.
Lastly, a method called DeletePtr( ) should be implemented to allow delete to be called on the data the SmartPtr contains. This SmartPtr implementation isn't supposed to act as any sort of garbage collection: no reference counting is done. Heap memory contained by a SmartPtr must be explicitly deleted with DeletePtr( ) by design. This way SmartPtr's can easily contain heap allocated memory as well. Here's a diagram showing the flow of a SmartPtr:
Like I said, there are many different ways to implement smart pointers, and many different things they can achieve. However the details here are just what I decided to use to solve specific problems for projects I work on.
Additional reading:
There are many different implementations and goals of what a smart pointer can be or attempt to achieve. As far as I can see, the term "smart pointer" is a pretty general term that means some sort of extra functionality wrapped around normal pointers. However the smart pointer should be able to behave just like a normal pointer, so overloading of the * and -> operators will be done.
In previous game projects (the ones I written in C in my portfolio) had an issue where when a bit of data was deleted or modified, all other code required to be made aware of such act. This required extra work by the programmer to ensure that no pointers within the program could be accessed if the data they pointed to was modified in an undesirable way (perhaps deleted or converted to another type, or moved). The smart pointer implementation will solve this problem by making it seem like all smart pointers to a single location update each other automatically. Additionally, I wanted to add a little more functionality for ease of use and convenience, of which I'll talk about later in the article.
Here's an example of a problem situation that can arise from misusing ordinary pointers:
SomeFunction
{
ObjectAddress = newly allocated object
}
Some Other Function
{
Delete data at ObjectAdress
}
Last Function
{
ObjectAddress.GetName( ) // Is object deleted yet?
}
I decided to make my smart pointers handle based. A handle system is where some sort of identifier (ID) is mapped to a value by mapping functionality, i.e. std::map. This is extremely useful, as one can now place a layer of indirection between an identifier and its associated value. The reason this is useful goes into many different scenarios, but pertaining to smart pointers we can have a single point of access to the address a smart pointer holds.
template <typename TYPE>
class SmartPtr
{
public:
SmartPtr( handle val = NULL ): ID( val ) {};
~SmartPtr( );
TYPE *GetPtr( void ) const;
handle GetID( void ) const { return ID; }
TYPE *operator->( void );
TYPE operator*( void );
private:
handle ID;
static HandleMap<TYPE> handleMap;
};
Here are the barebones for our smart pointer. The SmartPtr holds an ID, where handle is a typedef'd integer. GetPtr( ) indexes the handleMap (handleMap should be a class to wrap around std::map, to provide inserting and erasing methods) with the SmartPtr's ID to retrieve the pointer value the ID maps to. The SmartPtr itself is templated for a specific type of pointer, this way each different type of data to be pointed to has a different statically defined handleMap. This is a benefit to the amount of unique handles your game can generate. The SmartPtr can be constructed with a handle value, and has a gettor for the handle ID itself.
Now whenever the pointer mapped by a specific ID is modified within the handleMap, all subsequent requests to that value will be updated throughout all other SmartPtr instances with the same ID. This is because the only point of access to the pointer value is through the map itself, meaning there's actually only one pointer mapped by an ID at any given time, although there can be many SmartPtrs with the ID mapping to that single value.
However you may be wondering about how to generate the handle ID itself, as each one needs to be unique. It's very simple: start your first handle at the integer value of 1, and then for every subsequent ID mapped to a new value, add one to the previously used ID. This will give you millions of unique IDs, which is well than more enough for basically any game ever, especially in this case as each type has its own handleMap altogether. The ID value of 0 can be reserved for a NULL pointer to allow to check at any given time if a SmartPtr's handle is valid.
As for the implementation of -> and *, you should use your utility GetPtr( ) function for them. -> will simply require a returning of GetPtr( ), and * will dereference the value returned by GetPtr( ), or return NULL if the value from GetPtr( ) is NULL. In this way your SmartPtr class will behave exactly like an ordinary pointer.
Additionally operator overloads for comparisons, or even pointer arithmetic, can be implemented as well. I implemented equality comparisons.
Lastly, I needed a way to handle inserting a pointer into the handleMap, and consequently erasing it from the map as well. I decided to make the SmartPtr handle this automatically with its constructor and destructor. Currently the example SmartPtr interface I've shown above supports constructing a SmartPtr from an already-generated handle, assuming the handle generation is external the SmartPtr itself. However overloading the constructor to receive a TYPE pointer allows for insertion of this TYPE * into the handleMap automatically. Here's an example:
template <typename TYPE>
SmartPtr<TYPE>::SmartPtr( TYPE *ptr )
{
ID = handleMap.InsertPtr( ptr );
clearOnDestruct = true;
}
The constructor inserts the ptr of TYPE into the handleMap by calling a method called InsertPtr( ). The return value is used to initialize the SmartPtr's ID data member. A bool called clearOnDestruct is required to be set, and this will be talked about later.
The InsertPtr( ) is a custom-defined method for your handleMap class. It should just insert a new pointer with a newly generated ID. On destruction of the SmartPtr, if the bool clearOnDestruct is set, the pointer can erase the slot in the handleMap for that ID. This way SmartPtr can contain a pointer for you, perhaps within a linked list or some other data structure.
Lastly, a method called DeletePtr( ) should be implemented to allow delete to be called on the data the SmartPtr contains. This SmartPtr implementation isn't supposed to act as any sort of garbage collection: no reference counting is done. Heap memory contained by a SmartPtr must be explicitly deleted with DeletePtr( ) by design. This way SmartPtr's can easily contain heap allocated memory as well. Here's a diagram showing the flow of a SmartPtr:
Flow of a smart pointer. Can either reference an ID (left), or contain an inserted pointer (right). |
Like I said, there are many different ways to implement smart pointers, and many different things they can achieve. However the details here are just what I decided to use to solve specific problems for projects I work on.
Additional reading:
- C++ for Game Programmers - Noel Llopis: Chapter 12
- Modern C++ Design - Andrei Alexandrescu: Chapter 7
Sunday, August 26, 2012
Easy Buttons for your Game
In my past couple game projects, I never had a proper button system; every clickable button I wrote was ad-hoc and thusly very annoying to write.
I came up with a really simple but flexible button system for my C projects. This post is going to be pretty short, so here's a diagram to summarize everything:
As you can see, the button object takes a function pointer as a parameter during construction. This function is what I refer to as exec. The exec function allows for a button to perform any action on-click. The exec function can take a general purpose parameter, perhaps defaulted to type int or void pointer. Within the exec function any necessary typecasting can be hidden away within it.
The exec functions will often times have additional dependencies throughout your program, in order to provide their desired functionality. For example I might need my exec function to create a live game object on-click. I'd need to include any headers required for creating game objects into my button system in order for this to work. For this reason the exec functions should be abstracted into their own separate file.
Alternatively one can make use of a map of identifiers to exec pointers in order to cut down on file dependencies. By mapping, of perhaps a string identifier, to an exec pointer the button class can then take the identifier upon creation rather than the pointer itself. The various exec functions can then be placed into file systems that pertain to the type of functionality the exec function itself provides. A registration of each exec function into the table would be required.
As for button behavior, I decided to create a few different types of Update functions for each button. Once a button has detected an on-click and has called its exec function, the Update function is called each cycle until the button is ready to detect another click and call its exec again. Since there can be multiple types of buttons each requiring its own unique update, the button object also takes a pointer to an Update function on creation. Here's a couple examples of button updates I've written:
These two buttons act like a standard button press in most GUI systems. One has a slight delay for how fast you can actually click the button, and a debouncer that requires a release of the mouse key before update actually starts using DT to calculate the current time duration. The other acts more like a spray brush where you can just hold down the click button on have exec called repeatedly with a pause between calls.
I came up with a really simple but flexible button system for my C projects. This post is going to be pretty short, so here's a diagram to summarize everything:
As you can see, the button object takes a function pointer as a parameter during construction. This function is what I refer to as exec. The exec function allows for a button to perform any action on-click. The exec function can take a general purpose parameter, perhaps defaulted to type int or void pointer. Within the exec function any necessary typecasting can be hidden away within it.
The exec functions will often times have additional dependencies throughout your program, in order to provide their desired functionality. For example I might need my exec function to create a live game object on-click. I'd need to include any headers required for creating game objects into my button system in order for this to work. For this reason the exec functions should be abstracted into their own separate file.
Alternatively one can make use of a map of identifiers to exec pointers in order to cut down on file dependencies. By mapping, of perhaps a string identifier, to an exec pointer the button class can then take the identifier upon creation rather than the pointer itself. The various exec functions can then be placed into file systems that pertain to the type of functionality the exec function itself provides. A registration of each exec function into the table would be required.
As for button behavior, I decided to create a few different types of Update functions for each button. Once a button has detected an on-click and has called its exec function, the Update function is called each cycle until the button is ready to detect another click and call its exec again. Since there can be multiple types of buttons each requiring its own unique update, the button object also takes a pointer to an Update function on creation. Here's a couple examples of button updates I've written:
//
// ButtonUpdateStandard
// Purpose: Standard update function for a generic button.
//
RETURN_TYPE ButtonUpdateStandard( BUTTON *self, float dt )
{
// Update dt only if non-zero.
// For example when exec is called, set dt to 0.001 (seconds)
// to begin debounce timer.
if(self->dt != 0)
{
self->dt += dt;
}
// Debounce
if(self->dt > STANDARD_BUTTON_DEBOUNCE && !IsKeyPressed( VK_LBUTTON ))
{
self->dt = 0;
}
if(self->dt == 0)
{
VECTOR2D pos = { 0 };
pos.x_ = (float)GLOBAL_INPUTS.xPos;
pos.y_ = (float)GLOBAL_INPUTS.yPos;
// Test for button press
if(IsKeyPressed( VK_LBUTTON ))
{
if(StaticPointToStaticRect( &pos, &self->rect ))
{
{
self->exec( self );
// Initiate debounce
self->dt = 0.0001f;
// End list caller routine if blocking
if(self->isBlocking)
return RETURN_FAILURE;
}
}
}
}
return RETURN_SUCCESS;
}
//
// ButtonUpdateBrush
// Purpose: Standard update function for a generic button.
//
RETURN_TYPE ButtonUpdateBrush( BUTTON *self, float dt )
{
// Update dt only if non-zero.
// For example when exec is called, set dt to 0.001 (seconds)
// to begin debounce timer.
if(self->dt != 0)
{
self->dt += dt;
}
// Debounce
if(self->dt > BRUSH_BUTTON_DEBOUNCE)
{
self->dt = 0;
}
if(self->dt == 0)
{
VECTOR2D pos = { 0 };
pos.x_ = (float)GLOBAL_INPUTS.xPos;
pos.y_ = (float)GLOBAL_INPUTS.yPos;
// Test for button press
if(IsKeyPressed( VK_LBUTTON ))
{
if(StaticPointToStaticRect( &pos, &self->rect ))
{
{
self->exec( self );
// Initiate debounce
self->dt = 0.000001f;
// End list caller routine if blocking
if(self->isBlocking)
return RETURN_FAILURE;
}
}
}
}
return RETURN_SUCCESS;
}
These two buttons act like a standard button press in most GUI systems. One has a slight delay for how fast you can actually click the button, and a debouncer that requires a release of the mouse key before update actually starts using DT to calculate the current time duration. The other acts more like a spray brush where you can just hold down the click button on have exec called repeatedly with a pause between calls.
Labels:
Button,
C,
C++,
Game Button,
GUI,
User Interface
Thursday, August 23, 2012
Game Object Factory: Distributed Factory
When creating game objects within your game, you will likely want to start using new or malloc all over the place. This is bad! Every object should have a clear owner, and there should be a single interface for creating any object. Take a look at this code:
The above code does work, although it creates a giant mess of file dependencies. Suddenly one would require a header file for all the various object definitions in order to use the new operator everywhere. Furthermore, handling of attaching the object to some sort of owner will have to be done everywhere that the new keyword is used.
A factory should be used to create game objects in a unified way. Here's an example of the simplest form of an object factory:
However there are some problems with this type of factory. One is that the factory will still have annoying file dependencies requiring the user to make another include, and modify the switch statement each time a new game object is designed. Examine the following:
The above example shows the usage of the object factory when strings are mapped to types of objects. This setup is the ideal interface one would desire, as a single include to the ObjectFactory header is all that is required to create any type of object. The previous examples were mapping integers of an enumeration type to object types, which would require modification to the switch statement run in the enum, as well as manual updating to the enumeration itself.
So now that we've identified the ideal interface for creating game objects, we still need to make sure that the game objects created have a clear owner. Mapping of strings to game object types will need to be implemented, and cutting down on file dependencies of the the GameObjectFactory file itself is still something to be solved as well. By using a map (perhaps std::map) or hash table we can use a string identifier to fetch some value. Here's what we want our factory to look like:
By mapping a string to a creator function of the type of object desired, all file dependencies except that of those required to utilize the table (map in the above example) are eliminated. Furthermore, whenever a new game object type is designed, all that is needed to be done to ensure compatibility with our factory is a single registration call.
There are some details about the Create function that should be discussed. The Create function is specific to each type of game object, and the interface for each Create function needs to be identical. To ensure this, each type of game object should have a single class called the creator. The creator object for each class derives from a base Creator class to ensure a clear and consistent interface.
What we've now created is called a distributed factory. All file dependencies are hidden by registration in using a hash table data structure to provide the mapping of typeless strings to actual types of game objects. The creation of objects has a unified interface, and there's a single access point (the object factory) that allows the creation of all types of game objects that are registered within its data table of types. There's one last small problem to solve: who owns the created objects? This answer can of course vary, but luckily there can be a single point of object management, like so:
And there we have it! A properly structured object factory. To further the awesome might of the factory, one could easily add in deserialization to the factory, since the factory is already working with abstract strings as type identifiers. Serialization, however, is for a future post!
Factories like this also interface very well with scripting languages :)
Additional reading:
C++ for Game Programmers: Noel Llopis - Chapter 13
void ExampleFunc( void )
{
object *GO = new SomeGameObjectType( );
// Who owns this object?
// What headers do I need to include?
}
The above code does work, although it creates a giant mess of file dependencies. Suddenly one would require a header file for all the various object definitions in order to use the new operator everywhere. Furthermore, handling of attaching the object to some sort of owner will have to be done everywhere that the new keyword is used.
A factory should be used to create game objects in a unified way. Here's an example of the simplest form of an object factory:
#include "RedBlock.h"
#include "BlueEnemy.h"
#include "FireBall.h"
// ID is an enumeration
object *CreateObject( ID type )
{
switch( ID )
{
case REDBLOCK:
return new RedBlock( );
case BLUENEMY:
return new BlueEnemy( );
case FIREBALL:
return new FireBall( );
}
}
However there are some problems with this type of factory. One is that the factory will still have annoying file dependencies requiring the user to make another include, and modify the switch statement each time a new game object is designed. Examine the following:
// Map strings to object types
void ExampleFunc( void )
{
object *GO = CreateObject( "RedBlock" );
}
The above example shows the usage of the object factory when strings are mapped to types of objects. This setup is the ideal interface one would desire, as a single include to the ObjectFactory header is all that is required to create any type of object. The previous examples were mapping integers of an enumeration type to object types, which would require modification to the switch statement run in the enum, as well as manual updating to the enumeration itself.
So now that we've identified the ideal interface for creating game objects, we still need to make sure that the game objects created have a clear owner. Mapping of strings to game object types will need to be implemented, and cutting down on file dependencies of the the GameObjectFactory file itself is still something to be solved as well. By using a map (perhaps std::map) or hash table we can use a string identifier to fetch some value. Here's what we want our factory to look like:
object *CreateObject( char *type )
{
return map[type]->Create( );
}
By mapping a string to a creator function of the type of object desired, all file dependencies except that of those required to utilize the table (map in the above example) are eliminated. Furthermore, whenever a new game object type is designed, all that is needed to be done to ensure compatibility with our factory is a single registration call.
There are some details about the Create function that should be discussed. The Create function is specific to each type of game object, and the interface for each Create function needs to be identical. To ensure this, each type of game object should have a single class called the creator. The creator object for each class derives from a base Creator class to ensure a clear and consistent interface.
// Base class
class Creator
{
virtual object *Create( void );
virtual ~Creator( void );
}
// In another file...
// Specific derived creator for a single object type
class RedBlockCreator : public Creator
{
virtual object *Create( void )
{
return new RedBlock( );
}
}
// In another file...
// Register all game object creators in a single place in a unified way!
void RegisterCreators( void )
{
map->register( "RedBlock", new RedBlockCreator( ) );
// Or use a macro like I did:
REGISTER_CREATOR( RedBlock );
}
What we've now created is called a distributed factory. All file dependencies are hidden by registration in using a hash table data structure to provide the mapping of typeless strings to actual types of game objects. The creation of objects has a unified interface, and there's a single access point (the object factory) that allows the creation of all types of game objects that are registered within its data table of types. There's one last small problem to solve: who owns the created objects? This answer can of course vary, but luckily there can be a single point of object management, like so:
object *CreateObject( char *type )
{
object *newObject = NULL;
newObject = map[type]->Create( );
// Single point of management
... any other code necessary, like init, counting, etc.
ObjectList->Insert( newObject );
return newObject;
}
And there we have it! A properly structured object factory. To further the awesome might of the factory, one could easily add in deserialization to the factory, since the factory is already working with abstract strings as type identifiers. Serialization, however, is for a future post!
Factories like this also interface very well with scripting languages :)
Additional reading:
C++ for Game Programmers: Noel Llopis - Chapter 13
Friday, August 10, 2012
Generic Programming in C
Generic programming is style of programming that is typeless, as in the type of data to be used is not determined at the time the code was written. In C++ this is achieved with templating. I know of two ways to perform generic programming in C. One way is to use pointer typecasting. It's important to understand that when you typecast a pointer no data is modified within the pointer; how the data the pointer points to is interpreted is changed. This can be utilized to create generic algorithms and functionality.
For example a linked list data structure can be created utilizing pointer typecasting where the data to be held by a node is typecasted to a single type no matter what the actual type of the data is. Usually a void * is used in this case to prevent a dereference to data of unknown type. However, this method can only be used when access to the data held by the node (the data pointed by the void *) does not need to be accessed.
A more well-rounded (in my opinion) approach to generic programming in C is to make use of the preprocessor directive ##.
## pastes two tokens together to create a new token. Usage of the ## operator is known as token pasting, or token concatenation. There's a whole lot of documentation on the ## operator. If you're not familiar with it, you need to do some familiarizing before reading on.
All it does is take two tokens within a macro and stick them together to create a new token. Example:
In the above example varval will need to be defined in order to avoid compiler errors, perhaps by using it as the name of a variable to define.
Token pasting is utilized to solve an age old problem in C of creating various functions to achieve the same problem. For example, I've written many linked lists in C and one day, I finally decide "I will never write another linked list in C". I'm sick and tired of having to write a new set of functions for each type of data that I need a new linked list for.
By using the ## operator one can automate the process of duplicating code to create new functions and new definitions based off of data type. This is much like templating in C++, except not nearly as user-friendly, yet still fun and satisfying to try.
Since there's a great need in C to duplicate code for various data types, and the ## operator allows us to create new tokens with two other tokens, we can use the type of the data to create a new name for a new set of functionality for each type of data desired.
Examine the following:
As you can see, it can be really annoying to write different code for different types of data, when the code itself is highly redundant or even identical. However by providing the GENERIC_MAX macro a data type, a new function can be automatically generated by the C preprocessor! The MAX token is concatenated with the TYPE parameter to form a new function definition. Providing the GENERIC_MAX macro with type int will result in a new function being defined named int_MAX. Creating new sets of functionality of anything can be as simple as declaring or defining with a couple calls to a macro.
It might seem a bit self-defeating however to have to figure out what the new name of your generated function is going to be in order to call it. A very simple utility macro should be used to wrap around your generated generic functions. More information on how to do this is shown later in the post.
This can be taken further and be applied to even an abstract data type, such as a linked list. I've constructed myself a generic linked list.
The above code is an example definition of a macro that defines a generic node type. The type of the node is used in creating new name definitions wherever the node's type is required in the structure definition.
Above is an example of how to define a macro to define a function to create a new node of specified type based on the previous node definition.
You might be thinking that it would be a bit annoying to call so many different define and declare macros for each structure and function. You can actually just declare all structures and function prototypes within a single macro, I called mine DECLARE_LIST, which declares all structure types and function prototypes, as well as some utility macros required to use the generic linked list. I also have a DEFINE_LIST macro that defines all functions used in the generic linked list.
For abstract data types such as lists or stacks or queues, you'll probably want some additional utility functions to actually call the various generic functions you've defined. Here is an example set of utility functions I've made for my generic linked list:
As you can see wrappers for the various generic functions are needed to properly call them (unless you want to manually figure out what the name of the function generated for whatever type you're currently using is). There's also a couple macros for getting the generated type associated with a provided type.
Click Here for Source Code
Drawbacks:
There is one major drawback to the ## operator strategy; you cannot use * in the type name, or spaces in the type name. If you need to use a pointer as your type, you'll have to typedef your pointer into a single valid token and pass the typedef'd type to the generic macros.
There is another thing that can be considered a drawback: you cannot have a pointer to a macro. As such debugging macros can be very very tedious. I myself had to generate a processed file and compile with the expanded macro in order to pinpoint where I originally had some compiling errors when creating my generic linked list.
For example a linked list data structure can be created utilizing pointer typecasting where the data to be held by a node is typecasted to a single type no matter what the actual type of the data is. Usually a void * is used in this case to prevent a dereference to data of unknown type. However, this method can only be used when access to the data held by the node (the data pointed by the void *) does not need to be accessed.
A more well-rounded (in my opinion) approach to generic programming in C is to make use of the preprocessor directive ##.
## pastes two tokens together to create a new token. Usage of the ## operator is known as token pasting, or token concatenation. There's a whole lot of documentation on the ## operator. If you're not familiar with it, you need to do some familiarizing before reading on.
All it does is take two tokens within a macro and stick them together to create a new token. Example:
#define PASTE_TOKENS( a, b ) \
a##b
PASTE_TOKENS( var, val ); // output: varval
In the above example varval will need to be defined in order to avoid compiler errors, perhaps by using it as the name of a variable to define.
Token pasting is utilized to solve an age old problem in C of creating various functions to achieve the same problem. For example, I've written many linked lists in C and one day, I finally decide "I will never write another linked list in C". I'm sick and tired of having to write a new set of functions for each type of data that I need a new linked list for.
By using the ## operator one can automate the process of duplicating code to create new functions and new definitions based off of data type. This is much like templating in C++, except not nearly as user-friendly, yet still fun and satisfying to try.
Since there's a great need in C to duplicate code for various data types, and the ## operator allows us to create new tokens with two other tokens, we can use the type of the data to create a new name for a new set of functionality for each type of data desired.
Examine the following:
// Old C style "function overloading"
int INT_MAX( int a, int b )
{
return (a > b) ? a : b;
}
float FLOAT_MAX( float a, float b )
{
return (a > b) ? a : b;
}
double DOUBLE_MAX( double a, double b )
{
return (a > b) ? a : b;
}
// Generic definition
#define GENERIC_MAX( TYPE ) \
TYPE TYPE##_MAX( TYPE a, TYPE b ) \
{ \
return (a > b) ? a : b; \
}
As you can see, it can be really annoying to write different code for different types of data, when the code itself is highly redundant or even identical. However by providing the GENERIC_MAX macro a data type, a new function can be automatically generated by the C preprocessor! The MAX token is concatenated with the TYPE parameter to form a new function definition. Providing the GENERIC_MAX macro with type int will result in a new function being defined named int_MAX. Creating new sets of functionality of anything can be as simple as declaring or defining with a couple calls to a macro.
It might seem a bit self-defeating however to have to figure out what the new name of your generated function is going to be in order to call it. A very simple utility macro should be used to wrap around your generated generic functions. More information on how to do this is shown later in the post.
This can be taken further and be applied to even an abstract data type, such as a linked list. I've constructed myself a generic linked list.
//
// DECLARE_NODE
// Purpose: Declares a node of a specific type.
//
#define DECLARE_NODE( TYPE ) \
typedef struct _NODE_##TYPE \
{ \
TYPE data; \
struct _NODE_##TYPE *next; \
} NODE_##TYPE;
The above code is an example definition of a macro that defines a generic node type. The type of the node is used in creating new name definitions wherever the node's type is required in the structure definition.
//
// DEFINE_FUNC( TYPE )
// Purpose: Defines a function to create a node of specified type.
//
#define DEFINE_FUNC( TYPE ) \
NODE_##TYPE *NODE_CREATE_##TYPE( TYPE data ) \
{ \
NODE_##TYPE *newNode = \
(NODE_##TYPE *)malloc( sizeof( NODE_##TYPE ) ); \
newNode->data = data; \
newNode->next = NULL; \
return newNode; \
}
Above is an example of how to define a macro to define a function to create a new node of specified type based on the previous node definition.
You might be thinking that it would be a bit annoying to call so many different define and declare macros for each structure and function. You can actually just declare all structures and function prototypes within a single macro, I called mine DECLARE_LIST, which declares all structure types and function prototypes, as well as some utility macros required to use the generic linked list. I also have a DEFINE_LIST macro that defines all functions used in the generic linked list.
For abstract data types such as lists or stacks or queues, you'll probably want some additional utility functions to actually call the various generic functions you've defined. Here is an example set of utility functions I've made for my generic linked list:
//
// LIST_CONSTRUCT
// Purpose: Constructs and returns a new list object of specified type.
//
#define LIST_CONSTRUCT( TYPE ) \
LIST_CONSTRUCT_##TYPE( )
//
// NODE_CREATE
// Purpose: Constructs and returns a new node of a specified type.
//
#define NODE_CREATE( TYPE, data, DeleteData, order ) \
NODE_CREATE_##TYPE( data, DeleteData, order )
//
// LIST_INSERT
// Purpose: Inserts a node of specified type in position determined by the order
// data member.
//
#define LIST_INSERT( TYPE, listSelf, node ) \
LIST_INSERT_##TYPE( listSelf, node )
//
// LIST_REMOVE
// Purpose: Removes (deallocates) a node of a specified type from a list.
//
#define LIST_REMOVE( TYPE, node ) \
LIST_REMOVE_##TYPE( node )
//
// LIST_DESTROY
// Purpose: Removes (deallocates) a list and all its nodes of a specified type.
//
#define LIST_DESTROY( TYPE, listSelf ) \
LIST_DESTROY_##TYPE( listSelf )
//
// LIST_CALLER
// Purpose: Uses a callback on all nodes while supplying a specified param.
//
#define LIST_CALLER( TYPE, self, callback, param ) \
LIST_CALLER_##TYPE( self, callback, param )
//
// LIST_TYPE
// Purpose: Returns the type of LIST that pertains to TYPE.
//
#define LIST_TYPE( TYPE ) \
LIST_##TYPE
//
// NODE_TYPE
// Purpose: Returns the type of NODE that pertains to TYPE.
//
#define NODE_TYPE( TYPE ) \
NODE_##TYPE
As you can see wrappers for the various generic functions are needed to properly call them (unless you want to manually figure out what the name of the function generated for whatever type you're currently using is). There's also a couple macros for getting the generated type associated with a provided type.
Click Here for Source Code
//////////////////////////////////////////////////// // Copyright (c) 2012 ICRL // See the file LICENSE.txt for copying permission. // // Original Author: Randy Gaul // Date: 8/9/2012 // Contact: r.gaul@digipen.edu //////////////////////////////////////////////////// #pragma once #include "StringHash.h" // // DECLARE_LIST // Purpose: Declares a linked list of a specific type on the heap. // #define DECLARE_LIST( TYPE ) \ typedef struct _NODE_##TYPE \ { \ TYPE data; \ void (*DeleteData)( TYPE ); \ int order; \ struct _NODE_##TYPE *next; \ struct _NODE_##TYPE *prev; \ } NODE_##TYPE; \ \ struct _LIST_##TYPE; \ \ typedef struct _LIST_VTABLE_##TYPE \ { \ void (*Destroy) ( struct _LIST_##TYPE *self ); \ NODE_##TYPE *(*CreateNode)( TYPE data, void (*DeleteData)( TYPE ), int order ); \ RETURN_TYPE (*Insert) ( struct _LIST_##TYPE *self, NODE_##TYPE *node ); \ void (*Remove) ( NODE_##TYPE *node ); \ } _LIST_VTABLE_##TYPE; \ \ extern const _LIST_VTABLE_##TYPE LIST_VTABLE_##TYPE; \ \ typedef struct _LIST_##TYPE \ { \ NODE_##TYPE *list; \ const _LIST_VTABLE_##TYPE *vtable; \ } LIST_##TYPE; \ \ LIST_##TYPE *LIST_CONSTRUCT_##TYPE( void ); \ NODE_##TYPE *NODE_CREATE_##TYPE( TYPE data, void (*DeleteData)( TYPE data ), int order ); \ RETURN_TYPE LIST_INSERT_##TYPE ( LIST_##TYPE *self, NODE_##TYPE *node ); \ void LIST_REMOVE_##TYPE ( NODE_##TYPE *node ); \ void LIST_DESTROY_##TYPE ( LIST_##TYPE *self ); \ void LIST_CALLER_##TYPE ( LIST_##TYPE *self, RETURN_TYPE (*callback)( NODE_##TYPE *, void * ), void *param ); \ TYPE LIST_PEEK_##TYPE ( LIST_##TYPE *self ) // // DEFINE_LIST // Purpose: Defines a linked list of a specific type on the heap. // #define DEFINE_LIST( TYPE ) \ const _LIST_VTABLE_##TYPE LIST_VTABLE_##TYPE = \ { \ LIST_DESTROY_##TYPE, \ NODE_CREATE_##TYPE, \ LIST_INSERT_##TYPE, \ LIST_REMOVE_##TYPE \ }; \ \ LIST_##TYPE *LIST_CONSTRUCT_##TYPE( void ) \ { \ LIST_##TYPE *linkedList = (LIST_##TYPE *)malloc( sizeof( LIST_##TYPE ) ); \ linkedList->list = NULL; \ linkedList->vtable = &LIST_VTABLE_##TYPE; \ return linkedList; \ } \ \ NODE_##TYPE *NODE_CREATE_##TYPE( TYPE data, void (*DeleteData)( TYPE ), int order ) \ { \ NODE_##TYPE *newNode = (NODE_##TYPE *)malloc( sizeof( NODE_##TYPE ) ); \ newNode->data = data; \ newNode->DeleteData = DeleteData; \ newNode->order = order; \ newNode->next = NULL; \ newNode->prev = NULL; \ return newNode; \ } \ \ RETURN_TYPE LIST_INSERT_##TYPE ( LIST_##TYPE *self, NODE_##TYPE *node ) \ { \ NODE_##TYPE *scan = NULL; \ if(!self) \ return RETURN_FAILURE; \ \ scan = self->list; \ \ if(scan) /* if list is not empty */ \ { \ while(scan->next) \ { \ if(scan->order <= node->order) \ break; \ scan = scan->next; \ } \ \ /* beginning of list */ \ if(!scan->prev) \ { \ node->next = scan; \ scan->prev = node; \ self->list = node; \ } \ /* if not at end of list */ \ else if(scan->next) \ { \ node->next = scan; \ node->prev = scan->prev; \ scan->prev->next = node; \ scan->prev = node; \ } \ else \ { \ if(scan->prev) /* If more than one item in list */ \ { \ if(scan->order <= node->order) \ { \ node->next = scan; \ node->prev = scan->prev; \ scan->prev->next = node; \ scan->prev = node; \ } \ else \ { \ node->prev = scan; \ scan->next = node; \ } \ } \ else /* One item in list */ \ { \ /* Place before or after depending on zOrder */ \ if(scan->order <= node->order) \ { \ scan->prev = node; \ node->next = scan; \ self->list = node; \ } \ else \ { \ scan->next = node; \ node->prev = scan; \ } \ } \ } \ } \ else /* if list is empty */ \ { \ self->list = node; \ } \ \ return RETURN_SUCCESS; \ } \ \ void LIST_REMOVE_##TYPE ( NODE_##TYPE *node ) \ { \ if(node->prev) \ { \ node->prev->next = node->next; \ if(node->next) \ { \ node->next->prev = node->prev; \ } \ } \ else if(node->next) \ { \ node->next->prev = NULL; \ } \ \ if(node->DeleteData) \ { \ node->DeleteData( node->data ); \ } \ free( node ); \ } \ \ void LIST_DESTROY_##TYPE ( LIST_##TYPE *self ) \ { \ NODE_##TYPE *temp = NULL; \ NODE_##TYPE *scan = self->list; \ \ while(scan) \ { \ temp = scan->next; \ self->vtable->Remove( scan ); \ scan = temp; \ } \ \ free( self ); \ } \ \ void LIST_CALLER_##TYPE ( LIST_##TYPE *self, RETURN_TYPE (*callback)( NODE_##TYPE *, void * ), void *param ) \ { \ NODE_##TYPE *scan = self->list, *temp = NULL; \ \ while(scan) \ { \ temp = scan->next; \ /* Stop the callback process if return type is not 0 */ \ if(callback( scan, param )) \ { \ scan = temp; \ return; \ } \ scan = temp; \ } \ } \ \ TYPE LIST_PEEK_##TYPE ( LIST_##TYPE *self ) \ { \ return self->list->data; \ } // // LIST_CONSTRUCT // Purpose: Constructs and returns a new list object of specified type. // #define LIST_CONSTRUCT( TYPE ) \ LIST_CONSTRUCT_##TYPE( ) // // NODE_CREATE // Purpose: Constructs and returns a new node of a specified type. // #define NODE_CREATE( TYPE, data, DeleteData, order ) \ NODE_CREATE_##TYPE( data, DeleteData, order ) // // LIST_INSERT // Purpose: Inserts a node of specified type in position determined by the order // data member. // Notes: Will return RETURN_FAILURE if list pointer is NULL. // #define LIST_INSERT( TYPE, listSelf, node ) \ LIST_INSERT_##TYPE( listSelf, node ) // // LIST_REMOVE // Purpose: Removes (deallocates) a node of a specified type from a list. // #define LIST_REMOVE( TYPE, node ) \ LIST_REMOVE_##TYPE( node ) // // LIST_DESTROY // Purpose: Removes (deallocates) a list and all its nodes of a specified type. // #define LIST_DESTROY( TYPE, listSelf ) \ LIST_DESTROY_##TYPE( listSelf ) // // LIST_CALLER // Purpose: Uses a callback on all nodes while supplying a specified param. // #define LIST_CALLER( TYPE, self, callback, param ) \ LIST_CALLER_##TYPE( self, callback, param ) // // LIST_PEEK // Purpose: Returns the data of the first in the list. // #define LIST_PEEK( TYPE, self ) \ LIST_PEEK_##TYPE( self ) // // General data type handling. These functions must be used to handle list and node // objects/pointers directly. // #define LIST_TYPE( TYPE ) \ LIST_##TYPE #define NODE_TYPE( TYPE ) \ NODE_##TYPE
Drawbacks:
There is one major drawback to the ## operator strategy; you cannot use * in the type name, or spaces in the type name. If you need to use a pointer as your type, you'll have to typedef your pointer into a single valid token and pass the typedef'd type to the generic macros.
There is another thing that can be considered a drawback: you cannot have a pointer to a macro. As such debugging macros can be very very tedious. I myself had to generate a processed file and compile with the expanded macro in order to pinpoint where I originally had some compiling errors when creating my generic linked list.
Subscribe to:
Posts (Atom)