C++ - Part 5

Tony Houghton

This month's article is a direct continuation from last month and does not stand alone. Please read last month's article first.

Assignment operators

Assignment operators are those such as += and *=. It is natural to think of implementing, say, += in terms of + and =, but it can be more efficient to do the reverse, i.e. use += to implement +. Complex objects are small, so little would be gained, but consider a class with many data members, such as a large matrix. A suitable += operator function would be something like:

Matrix &Matrix::operator+=(const Matrix &m)
{
  // Add each element of m to the
  // corresponding element of
  // self (*this)
  return *this;
}

This saves creating a temporary object. For + we cannot avoid creating a temporary object, but to save repeating code, we can call += :

Matrix Matrix::operator+(const Matrix &m) const
// Must return object, not reference
// because temp will be destroyed
// after return
{
  Matrix temp = *this;
  temp += m;
  return temp;
}
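
To make the idea concrete, here is a minimal sketch of a complete class written this way. The storage (a std::vector of double), the constructor, the at() accessors and the helper corner_of_sum() are assumptions made for illustration only; the article does not say how Matrix is laid out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal Matrix; members are invented for this sketch.
class Matrix {
  std::size_t rows_, cols_;
  std::vector<double> data_;
public:
  Matrix(std::size_t rows, std::size_t cols, double fill = 0.0)
    : rows_(rows), cols_(cols), data_(rows * cols, fill) {}

  double &at(std::size_t r, std::size_t c)
  { return data_[r * cols_ + c]; }
  double at(std::size_t r, std::size_t c) const
  { return data_[r * cols_ + c]; }

  // Adds in place, so no temporary Matrix is created.
  Matrix &operator+=(const Matrix &m)
  {
    assert(rows_ == m.rows_ && cols_ == m.cols_);
    for (std::size_t i = 0; i < data_.size(); ++i)
      data_[i] += m.data_[i];
    return *this;
  }

  // Copies *this once, then reuses += for the arithmetic.
  Matrix operator+(const Matrix &m) const
  {
    Matrix temp = *this;
    temp += m;
    return temp;
  }
};

// Hypothetical helper: element (0,0) of a + b.
double corner_of_sum(const Matrix &a, const Matrix &b)
{
  return (a + b).at(0, 0);
}
```

Note that a + b leaves both operands untouched, while a += b changes a; the element-by-element loop is written only once.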

Subscripting

The subscript operator is operator[]. This can be overloaded with a subscripting operand (the expression inside the square brackets) of any type. A common use is for an associative array. Consider an example which contains a set of strings, each string having an associated number, being used to convert textual user input into something which the program can make more sense of:

class Command {/*...*/};
// representing various actions the
// program can perform
class CommandSet {
  // List of strings & Commands etc
public:
  Command operator[](const char *str);
  // Find a Command to match str
};
extern CommandSet command_set;
void process_input()
{
  char input[256];
  cin >> input;
  Command com = command_set[input];
  // Do something based on com
}
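
The fragment above leaves the lookup itself to the imagination. One possible sketch follows, assuming a fixed table of three names searched linearly; the table contents, the numeric codes and the code() accessor are all invented for the example, and a real CommandSet would probably be built at run time.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical Command that simply wraps a numeric code.
class Command {
  int code_;
public:
  explicit Command(int code = 0) : code_(code) {}
  int code() const { return code_; }
};

class CommandSet {
  struct Entry { const char *name; int code; };
  static const Entry table[3];
public:
  // Linear search by name; returns Command(0) when
  // nothing matches.
  Command operator[](const char *str) const
  {
    for (int i = 0; i < 3; ++i)
      if (std::strcmp(table[i].name, str) == 0)
        return Command(table[i].code);
    return Command(0);
  }
};

// Assumed contents, for illustration only.
const CommandSet::Entry CommandSet::table[3] = {
  { "load", 1 }, { "save", 2 }, { "quit", 3 }
};
```

With this in place, command_set["save"] reads like an array access but is really a function call that searches the table.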

Function call

To enable overloading of function call, a calling expression of the form function(expression list) is interpreted as function.operator()(expression list), where function is the name of an object.

In the above example, we would most likely want the Command returned by command_set to perform an action depending on its internal information, and on a parameter supplied by the caller. To simplify the call notation, we could supply Command with a () operator:

void Command::operator()(char *carg)
{
  // Take some action depending on
  // object's internal information
  // with carg as a parameter
}

and process_input() would become:

void process_input()
{
  char input[256];
  cin >> input;
  Command com = command_set[input];
  cin >> input;
  com(input);
}

This is a good example of when to overload function call: one particular method of the class dominates all others.
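
As a small self-contained sketch of the same idea, the Command below stores a verb and combines it with the caller's argument. The verb_ member and the std::string return value are assumptions, chosen so that the effect of the call can be seen; the article's version returns void.

```cpp
#include <cassert>
#include <string>

// Hypothetical Command holding one piece of internal
// information (a verb).
class Command {
  std::string verb_;
public:
  explicit Command(const std::string &verb) : verb_(verb) {}

  // The function call operator: combines the object's
  // internal information with the caller's parameter.
  std::string operator()(const char *carg) const
  {
    return verb_ + " " + carg;
  }
};
```

Given Command com("open"), the expression com("notes") is interpreted as com.operator()("notes").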

Dereferencing

Dereferencing here means member access through the -> operator. -> can be overloaded as a unary operator, which is useful for a container class: you can use -> to access members of the contained object.

The dereferencing operator is redefined as a function returning a pointer to some other class or structure. When the compiler encounters a -> call for the container class, it applies -> (as known to C programmers) to the object pointed to by the return value of operator->().

struct X {
  int a;
  float b;
};
class PtrToX {
  X *x;
public:
  // ...
  X *operator->()  { return x; }
};
extern PtrToX px;
int main()
{
  int i = px->a;
    // Equivalent to i = (px.x)->a;
    // except the latter would not
    // be allowed due to x being a
    // private member
}

A quick recall of the rules for overloading clarifies the point that there can only be one operator-> for each class, because only the return type could change.
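
A version of the PtrToX example that can be compiled on its own might look like the sketch below. The constructor and the helper read_a() are assumptions added so that the class can actually be pointed at something.

```cpp
#include <cassert>

struct X {
  int a;
  float b;
};

class PtrToX {
  X *x;
public:
  // Hypothetical constructor; the article elides it.
  explicit PtrToX(X *target) : x(target) {}

  // The built-in -> is applied to whatever this returns.
  X *operator->() const { return x; }
};

// Hypothetical helper: reads member a through the
// overloaded operator.
int read_a(const PtrToX &px)
{
  return px->a;   // i.e. (px.operator->())->a
}
```

Writing through the operator works in the same way, so px->a = 9 changes the X that px refers to.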

Increment and decrement

The only difference between the operators ++ and -- and other unary operators is that they can be postfix operators (e.g. a++) as well as the more usual prefix (e.g. ++a). Since there can be no assumption that ++ and -- will always have similar meanings as when applied to an int, it is necessary to define the prefix and postfix versions separately. The prefix version is no problem - this is the same as for any other unary operator. To define and declare a postfix version, you add a dummy int argument which is never actually used. For stand-alone functions, the int is placed as the second argument, while for member functions, it is placed as the only argument. For example:

X *PtrToX::operator++()    // Prefix
{ return ++x; }
X *PtrToX::operator++(int) // Postfix
{ return x++; }

A common application for the increment and decrement operators is in conjunction with dereferencing and subscripting to implement a 'smart pointer'. At its simplest, a smart pointer is essentially a pointer contained within a class so that checking can be applied to ensure it always points to a proper object etc. Overloading allows the creation of an interface to make the class appear as a pointer.
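
The following is one possible sketch of such a smart pointer, assuming it walks over a fixed array of X and throws std::out_of_range when it would otherwise refer to something outside the array. The class name CheckedPtr and its members are invented for the example. Unlike the fragment above, its ++ operators return the smart pointer itself rather than a raw X *, which is the more usual convention.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

struct X {
  int a;
  float b;
};

// Hypothetical checked pointer over an array of X.
class CheckedPtr {
  X *base_;
  std::size_t count_;
  std::size_t pos_;

  // Every access goes through this range check.
  X *checked(std::size_t i) const
  {
    if (i >= count_)
      throw std::out_of_range("CheckedPtr");
    return base_ + i;
  }
public:
  CheckedPtr(X *base, std::size_t count)
    : base_(base), count_(count), pos_(0) {}

  X *operator->() const { return checked(pos_); }
  X &operator[](std::size_t i) const { return *checked(i); }

  // Prefix: advance, then yield the updated pointer.
  CheckedPtr &operator++() { ++pos_; return *this; }

  // Postfix: the dummy int selects this version; it
  // yields a copy made before advancing.
  CheckedPtr operator++(int)
  {
    CheckedPtr old = *this;
    ++pos_;
    return old;
  }
};
```

Stepping past the end is allowed, as with an ordinary pointer; the check is made only when the pointer is used.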

Free store

It is possible to define new versions of new and delete as member operator functions, but they are both special cases.

operator new() and operator delete() must be static members (they need not be declared with the keyword static, as this is implicit). Concentrating on operator new() for now, it takes an argument of type size_t (an integral type guaranteed to be large enough to hold any size the machine can cope with, defined in <stddef.h>) and returns void *, a value pointing to the memory allocated. The size_t argument is automatically given a value equal to the size of the object to be created when called (multiplied by the number of elements if allocating an array). The object pointed to by the return value cannot be allowed to move because the new operator will not know where the resultant pointer is to be stored; therefore you could not use new directly as an interface to a shifting heap such as flex or Mem.

Reasons for taking over free store allocation include improving efficiency (although you are unlikely to improve significantly on the default) and placement, i.e. ensuring objects are created in a specific area of memory. If you wanted to ensure all objects of a certain class were placed in an area controlled by OS_Heap you could give the class a base class OnHeap :

// Allow misc OS_Heap operations
// on a heap
// Error handling etc assumed
class Heap {
  // ...
public:
  // ...
  // Constructor must initialise
  // ...
  void *alloc(size_t size);
  // Returns block allocated
  void free(void *block);
};
class OnHeap {
  static Heap heap;
public:
  void *operator new(size_t s)
  {
    return heap.alloc(s);
  }
  void operator delete(void *block)
    // See below
  {
    heap.free(block);
  }
};

Operator delete cannot be overloaded as such, but it can be redefined in one of two ways, either with or without a size_t argument. In either case, the first or only argument is a void * pointing to the block of memory to be freed; the size_t argument follows this, if present, and will be given the size of the block of memory to be freed.

The OnHeap example shows an implementation of delete without size_t. Remember that if you supply a delete operator using size_t, you must ensure any base classes using it have a virtual destructor to ensure the correct size is always passed for derived classes (see Archive 9.1).
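
To try the OnHeap idea without OS_Heap, the sketch below substitutes a stand-in Heap that forwards to malloc and free and counts the blocks currently allocated. The counter, live_blocks() and the derived class Sprite are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for the article's Heap class.
class Heap {
  int live_;
public:
  Heap() : live_(0) {}
  void *alloc(std::size_t size)
  {
    void *p = std::malloc(size);
    if (!p) throw std::bad_alloc();
    ++live_;
    return p;
  }
  void free(void *block)
  {
    if (block) { std::free(block); --live_; }
  }
  int live() const { return live_; }
};

class OnHeap {
  static Heap heap;
public:
  // Implicitly static; s is the size of the object
  // being created.
  void *operator new(std::size_t s)
  { return heap.alloc(s); }
  void operator delete(void *block)
  { heap.free(block); }
  static int live_blocks() { return heap.live(); }
};

Heap OnHeap::heap;

// Any class derived from OnHeap inherits the operators.
class Sprite : public OnHeap {
public:
  int width, height;
  Sprite(int w, int h) : width(w), height(h) {}
};
```

Code using Sprite needs no changes: new Sprite(4, 3) and delete are written as usual, but the memory comes from OnHeap's heap.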

Placement

The concept of placement can be taken a step further by overloading the global new operator. It can take a further argument after the size_t, called the placement argument, to allow the operator new function to decide where to place the object. For instance, to allow objects of any type to be placed on Heaps :

void *operator new(size_t s,
                   Heap *heap)
{
  // Reserve room in front of the
  // object for the Heap pointer
  Heap **result = (Heap **)
    heap->alloc(s + sizeof(Heap *));
  *result = heap;
  return result + 1;
}

This has the overhead that the block allocated must be created with enough room to also store the placement for use by delete, and the pointers manipulated to allow for this. The fact that every object created could theoretically be on a different heap makes it necessary to store the placement alongside each object in this way.

Placement parameters (as opposed to arguments, see the introduction to this article) are given as parameters to new, but the type being created follows these outside the parentheses; the type name then takes any arguments for a constructor in plain parentheses, or the number of elements in square brackets for an array. (Remember you cannot have arrays of objects whose constructors require parameters.) The type name is used to create the size_t argument. For example, if you wanted to create an Array on the Heap pointed to by heap :

  Array *a = new(heap) Array(10);

would result in a call of:

  a = (Array *)
   operator new(sizeof(Array), heap);

Objects that have been placed in this way cannot be destroyed by the standard operator delete. Instead, you would have to first call the object's destructor explicitly then call a function to free the memory.
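
The sequence of explicit destructor call followed by freeing might be wrapped up as in the sketch below. Everything here is an assumption for illustration: the Heap is a stand-in built on malloc and free, the Header union (which needs C++11 for std::max_align_t) is one way of keeping the object behind the stored pointer correctly aligned, and destroy_on_heap() and Tracked are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for the article's Heap class.
class Heap {
  int live_;
public:
  Heap() : live_(0) {}
  void *alloc(std::size_t size)
  {
    void *p = std::malloc(size);
    if (!p) throw std::bad_alloc();
    ++live_;
    return p;
  }
  void free(void *block) { std::free(block); --live_; }
  int live() const { return live_; }
};

// Stored in front of each object; the second member
// only pads the header to a safe alignment.
union Header {
  Heap *owner;
  std::max_align_t align;
};

void *operator new(std::size_t s, Heap *heap)
{
  Header *h =
    static_cast<Header *>(heap->alloc(s + sizeof(Header)));
  h->owner = heap;
  return h + 1;
}

// Matching form, used by the compiler only if a
// constructor throws during new(heap) T.
void operator delete(void *block, Heap *heap)
{
  if (block) heap->free(static_cast<Header *>(block) - 1);
}

// Hypothetical helper: destructor first, then memory.
template <class T>
void destroy_on_heap(T *obj)
{
  if (!obj) return;
  obj->~T();
  Header *h =
    static_cast<Header *>(static_cast<void *>(obj)) - 1;
  h->owner->free(h);
}

// Hypothetical class that counts its live objects.
struct Tracked {
  static int alive;
  Tracked() { ++alive; }
  ~Tracked() { --alive; }
};
int Tracked::alive = 0;
```

An object created with new(&heap) Tracked would then be disposed of with destroy_on_heap(t) rather than delete t.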

The global operator delete cannot be overloaded, but it does seem to be possible to replace it. If you were absolutely certain that you would only use your customised operator new you could provide:

void operator delete(void *block)
{
  if (!block) return;
  // Step back to the Heap pointer
  // stored in front of the object
  Heap **base = ((Heap **) block) - 1;
  (*base)->free(base);
}

You are strongly advised not to overload global operator new. Where it seems advantageous, you should try to find an alternative first.
