Thursday, February 16, 2012

Template identity transformation

So, here's a trick involving placement new and this. The motivating example is as follows:

Suppose you have two versions of the same structure: a 32-bit one and a 64-bit one. You have an algorithm that needs to operate on the members of these structures, and the algorithm is the same regardless of which version you are using. Sounds like a job for template classes, right? Perhaps something like this:


struct Version32
{
    short version;
    long length;
};

struct Version64
{
    short version;
    long long length;
};

template<class T>
class Reader
{
public:
    Reader(T* pMember):
        m_pMember(pMember)
    {}

    // Points at the raw structure being read
    T* m_pMember;

    long long GetLength(void) const
    {
        return m_pMember->length;
    }
};


So you can see here, if we know what version the structure is, then we can initialize the correct Reader and do what we need to do. Here's an example:


struct Version32
{
    short version;
    long length;
};

struct Version64
{
    short version;
    long long length;
};

template<class T>
class Reader
{
public:
    Reader(T* pMember):
        m_pMember(pMember)
    {}

    // Points at the raw structure being read
    T* m_pMember;

    long long GetLength(void) const
    {
        return m_pMember->length;
    }
};

long long LengthGetter(void* pData)
{
    // Both versions start with the same short, so it can be read up front
    switch(*(short*)pData)
    {
    case 32:
        {
            Reader<Version32> rdr((Version32*)pData);
            return rdr.GetLength();
        }
    case 64:
        {
            Reader<Version64> rdr((Version64*)pData);
            return rdr.GetLength();
        }
    }
    return -1;
}


This works, but it's unwieldy. You would need to create a wrapper function like LengthGetter for every member function of Reader. So, we provide an interface:


struct Version32
{
    short version;
    long length;
};

struct Version64
{
    short version;
    long long length;
};

// This one is the interface
class Reader
{
public:
    // Virtual destructor so instances can be deleted through a Reader*
    virtual ~Reader() {}
    virtual long long GetLength(void) const = 0;
};

template<class T>
class ReaderInst:
    public Reader
{
public:
    ReaderInst(T* pMember):
        m_pMember(pMember)
    {}

    T* m_pMember;

    long long GetLength(void) const
    {
        return m_pMember->length;
    }
};

Reader* ReaderFactory(void* pData)
{
    switch(*(short*)pData)
    {
    case 32:
        return new ReaderInst<Version32>((Version32*)pData);
    case 64:
        return new ReaderInst<Version64>((Version64*)pData);
    }
    return nullptr;
}


This is great, and works reasonably well. Except for one problem.

What if the type of the structure isn't easily known until part of it has been processed? For instance, PE/COFF headers have both 32-bit and 64-bit versions, and you have to parse most of the header in order to figure out which one you're dealing with. You would need to initialize the first version, get the information you need, and then initialize the second version. If pointers to the reader are being retained in a lot of places, tearing down and regenerating it isn't an easy option.
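To make that concrete, here is a rough sketch of what "parse the header first" means for a PE image. The function name DetectPeBitness and the constant names are mine, not from any library; the offsets (the PE header offset stored at 0x3C, a 4-byte "PE\0\0" signature, a 20-byte COFF file header, then the optional-header magic of 0x10B for PE32 or 0x20B for PE32+) are the ones given in the PE format documentation. Treat it as an illustration rather than a complete parser.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Read little-endian values byte by byte so the host's endianness doesn't matter.
static std::uint16_t ReadLE16(const unsigned char* p)
{
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

static std::uint32_t ReadLE32(const unsigned char* p)
{
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical helper: returns 32 for PE32, 64 for PE32+, or -1 if the
// buffer doesn't look like a PE image.
int DetectPeBitness(const unsigned char* data, std::size_t size)
{
    assert(data != nullptr);

    const std::size_t kLfanewOffset = 0x3C;  // DOS header field holding the PE header offset
    const std::size_t kSignatureSize = 4;    // "PE\0\0"
    const std::size_t kCoffHeaderSize = 20;  // COFF file header

    // DOS header: must start with "MZ" and be long enough to hold the offset field
    if (size < kLfanewOffset + 4 || data[0] != 'M' || data[1] != 'Z')
        return -1;

    const std::size_t peOffset = ReadLE32(data + kLfanewOffset);
    const std::size_t needed = kSignatureSize + kCoffHeaderSize + 2;
    if (peOffset > size || size - peOffset < needed)
        return -1;

    if (std::memcmp(data + peOffset, "PE\0\0", kSignatureSize) != 0)
        return -1;

    // Only now, past the signature and COFF header, does the magic say which layout follows
    switch (ReadLE16(data + peOffset + kSignatureSize + kCoffHeaderSize))
    {
    case 0x10B: return 32;
    case 0x20B: return 64;
    }
    return -1;
}
```

Notice that the answer only shows up after walking through two other headers, which is exactly the situation where a reader object already exists by the time you learn its real type.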

And this is where you might use placement new. It winds up looking like this:


#include <new>

// Version32 and Version64 are the same structures as in the earlier listings

class Reader
{
public:
    virtual ~Reader() {}
    virtual short GetVersion(void) const = 0;
    virtual long long GetLength(void) const = 0;
};

template<class T>
class ReaderInst:
    public Reader
{
public:
    ReaderInst(T* pMember):
        m_pMember(pMember)
    {}

    T* m_pMember;

    short GetVersion(void) const {return m_pMember->version;}
    long long GetLength(void) const {return m_pMember->length;}

    template<class U>
    ReaderInst<U>* SwitchType(void)
    {
        // The argument is evaluated before the new object is constructed,
        // so the old pointer is read while it is still valid
        return new (this) ReaderInst<U>((U*)m_pMember);
    }
};

Reader* ReaderFactory(void* pData)
{
    ReaderInst<Version32>* pReader;
    pReader = new ReaderInst<Version32>((Version32*)pData);
    if(pReader->GetVersion() == 64)
        return pReader->SwitchType<Version64>();
    return pReader;
}


But there's a problem even with this version. What do you do when you have non-primitive members? For instance, let's say your ReaderInst has a vector in it. Now, when you call placement new, the vector will wind up getting reinitialized with nothing in it. Well, fortunately the C++11 standard (formerly C++0x) includes the idea of a move constructor, which allows you to perform a "shallow move" from one instance to another, without copying.
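If move constructors are new to you, here's a small sketch of what "shallow move" means in practice. The Payload type and the MoveKeepsBuffer function are made up for this illustration; the only real API used is std::move and std::vector. The idea is that the destination vector takes over the source's heap buffer rather than allocating a new one and copying the elements.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical type with one non-primitive member, for illustration only.
struct Payload
{
    std::vector<int> m_myData;

    Payload() {}

    // Move constructor: steals rhs's buffer instead of copying its elements
    Payload(Payload&& rhs) :
        m_myData(std::move(rhs.m_myData))
    {}
};

// Returns true if the element buffer was handed over rather than reallocated.
bool MoveKeepsBuffer()
{
    Payload a;
    a.m_myData.push_back(1);
    a.m_myData.push_back(2);
    a.m_myData.push_back(3);

    // Remember where the elements live before the move
    const int* before = a.m_myData.data();

    Payload b(std::move(a));

    // All three elements arrived in b...
    assert(b.m_myData.size() == 3);
    // ...and they still live at the same address
    return b.m_myData.data() == before;
}
```

After the move, the source object is still valid but you shouldn't rely on what it contains, which is fine here because it is about to be overwritten anyway.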


#include <new>
#include <utility>
#include <vector>

// Version32 and Version64 are the same structures as in the earlier listings

class Reader
{
public:
    virtual ~Reader() {}
    virtual short GetVersion(void) const = 0;
    virtual long long GetLength(void) const = 0;
};

template<class T>
class ReaderInst:
    public Reader
{
private:
    // Each instantiation needs access to the others' private constructor
    template<class U>
    friend class ReaderInst;

    template<class U>
    ReaderInst(ReaderInst<U>&& rhs) :
        m_myData(std::move(rhs.m_myData)),
        m_pMember((T*)rhs.m_pMember)
    {
        static_assert(
            sizeof(rhs) == sizeof(*this),
            "Cannot move between types of unequal sizes"
        );
    }

public:
    ReaderInst(T* pMember):
        m_pMember(pMember)
    {}

    std::vector<int> m_myData;
    T* m_pMember;

    short GetVersion(void) const {return m_pMember->version;}
    long long GetLength(void) const {return m_pMember->length;}

    template<class U>
    ReaderInst<U>* SwitchType(void)
    {
        // Move our state into a temporary, then rebuild as ReaderInst<U> in place
        return new (this) ReaderInst<U>(ReaderInst<T>(std::move(*this)));
    }
};

Reader* ReaderFactory(void* pData)
{
    ReaderInst<Version32>* pReader;
    pReader = new ReaderInst<Version32>((Version32*)pData);
    if(pReader->GetVersion() == 64)
        return pReader->SwitchType<Version64>();
    return pReader;
}


What's happening in SwitchType, here? This is it:

  1. The call to std::move casts *this to an rvalue reference, which marks it as a value whose contents may be taken. It is only a cast, so it doesn't cause any code to be emitted.

  2. A new temporary instance of ReaderInst<Version32> is created, and it moves (does not copy) values out of this.

  3. The constructor of ReaderInst<Version64> gets called on the same storage, even though this currently points to a ReaderInst<Version32>.

  4. The ReaderInst<Version64> constructor moves those members out of the temporary instance and back into this.


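Complicated, but as long as you remember to update your move constructor any time you add new state variables, this lets you change an object's type at runtime without invalidating the pointers to it. Keeping the move constructor private ensures that only SwitchType can use it. One caution: the standard does not guarantee that primitive members keep their old values across placement new, so it is safest to transfer them in the move constructor as well rather than relying on them remaining untouched.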
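To see the four steps end to end without any of the header-parsing baggage, here's a stripped-down sketch. Everything in it (TagA, TagB, Base, Holder, DemoSwitch) is a made-up stand-in for the Version structures and ReaderInst. It differs from the listing above in two ways that I think are a little tidier, though that's a judgment call: the temporary gets a name, and the old object's destructor is called explicitly before its storage is reused.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tag types standing in for Version32 and Version64.
struct TagA { static const char* Name() { return "A"; } };
struct TagB { static const char* Name() { return "B"; } };

class Base
{
public:
    virtual ~Base() {}
    virtual const char* Name() const = 0;
    virtual std::size_t Count() const = 0;
};

template<class T>
class Holder : public Base
{
    // Lets Holder<T>::SwitchType reach Holder<U>'s private constructor
    template<class U>
    friend class Holder;

    // Private converting constructor: takes the state of a Holder<U>
    template<class U>
    Holder(Holder<U>&& rhs) :
        m_data(std::move(rhs.m_data))
    {
        static_assert(sizeof(Holder<U>) == sizeof(Holder<T>),
                      "Cannot move between types of unequal sizes");
    }

public:
    Holder() {}

    std::vector<int> m_data;

    const char* Name() const override { return T::Name(); }
    std::size_t Count() const override { return m_data.size(); }

    template<class U>
    Holder<U>* SwitchType()
    {
        // Steps 1 and 2: move our state into a temporary of our own type
        Holder<T> tmp(std::move(*this));
        // End the old object's lifetime; its vector has already been emptied
        this->~Holder();
        // Steps 3 and 4: build a Holder<U> in the same storage from the temporary
        return new (this) Holder<U>(std::move(tmp));
    }
};

// Returns the type name and element count seen through Base* after the switch.
std::pair<std::string, std::size_t> DemoSwitch()
{
    // Raw storage, so construction and destruction stay explicit
    void* storage = ::operator new(sizeof(Holder<TagA>));
    Holder<TagA>* a = new (storage) Holder<TagA>();
    a->m_data.push_back(10);
    a->m_data.push_back(20);
    a->m_data.push_back(30);

    // Same address, different dynamic type; use the returned pointer from here on
    Base* b = a->SwitchType<TagB>();
    assert(static_cast<void*>(b) == storage);

    std::pair<std::string, std::size_t> result(b->Name(), b->Count());

    b->~Base();
    ::operator delete(storage);
    return result;
}
```

Calling DemoSwitch() should report the name from TagB along with the three elements that were pushed while the object was still a Holder<TagA>, so the vector's contents make the trip even though the object's type changed underneath them.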