Tony Houghton
One of the newer features of C++ is an effective solution to the age-old problem of error handling. It provides a mechanism called exceptions to allow program flow to be diverted to error handlers when something goes awry. Unfortunately, at the time of writing, the only RISC OS C++ compiler to support exceptions is Easy C++, which I do not have, so I cannot test any examples. However, the subject is quite easy to understand, and is clearly explained by the documentation I have. If I show you what you're missing, you might feel like me and want to make a lot of noise to Acorn about it!
The traditional way of dealing with errors, such as lack of memory, in large applications which cannot afford to quit abruptly, is to skip the rest of the current function or loop and return:
    int f()
    {
        // Do something
        if (error) {
            // Perhaps report error here
            return 0;
        }
        // Carry on as normal
        return (useful value);
    }
A drawback of this is that f() probably has to indicate to its caller that an error has occurred, and the caller in turn may have to indicate to its caller, and so on. Also, the example shows using a return value of zero to indicate an error. This is usually fine for pointers, but zero may be a normal value for a function returning an integer. Also, it tells the caller nothing about why f() failed.
The technique can be expanded so that all functions that may return an error condition have a return type that can indicate an error, such as _kernel_oserror *. The usual return value could be changed to a pointer argument:
    _kernel_oserror *f(int *result)
    {
        // Do something
        if (error) {
            // Perhaps report error here
            return error;
        }
        // Carry on as normal
        *result = useful value;   // Store through the pointer
        return 0;
    }
This is inefficient in that it adds an extra, mostly redundant, return value or argument to the function call process. Swapping the error and result pointers would not significantly improve this.
In fact, the whole technique suffers from the overhead of every function in a chain having to explicitly check for errors.
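To make that overhead concrete, here is a minimal sketch of a three-level call chain using error returns. All of the names (Status, allocate_block, build_table, load_document) and the idea of a byte limit are invented for this illustration; they do not come from any real library.

```cpp
#include <cstddef>

// Hypothetical error codes, used only for this illustration.
enum Status { STATUS_OK = 0, STATUS_NO_MEMORY = 1 };

// Lowest level: reports failure through its return value and passes
// the real result back through a pointer argument.
Status allocate_block(std::size_t bytes, std::size_t limit,
                      std::size_t *granted)
{
    if (bytes > limit)
        return STATUS_NO_MEMORY;
    *granted = bytes;
    return STATUS_OK;
}

// Middle level: has no interest in the error itself, but must still
// test for it and pass it on to its own caller.
Status build_table(std::size_t entries, std::size_t limit,
                   std::size_t *table_bytes)
{
    Status s = allocate_block(entries * sizeof(int), limit, table_bytes);
    if (s != STATUS_OK)
        return s;
    // Carry on as normal
    return STATUS_OK;
}

// Top level: the only function that actually wants to react.
bool load_document(std::size_t entries, std::size_t limit)
{
    std::size_t table_bytes = 0;
    return build_table(entries, limit, &table_bytes) == STATUS_OK;
}
```

Only load_document() cares about the failure, yet build_table() needs its own test and its own error return type just to relay it.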
Exceptions
The alternative offered by exceptions can best be shown by an example:
    class Array { // of ints
        int size;
        int *array;
    public:
        // ...
        class Range {};
        int &operator[](int subscript)
        {
            if (subscript >= size)
                throw Range();
            return array[subscript];
        }
        // ...
    };
The member class, Range, is used only to distinguish exceptions associated with Arrays; classes need not contain any members.
The keyword throw is followed by an expression. In this case, Range() constructs an object of type Array::Range. When an exception is thrown, the rest of the function is abandoned and program flow is resumed at the innermost handler for a matching type provided by the chain of callers. This process is called stack unwinding, effectively automating the process described in the previous section. Any automatic (local) variables created by functions in the call chain are properly destructed and deallocated to avoid memory being lost ("memory leaks").
To handle (catch) an exception, a function must enclose the block that calls functions which might throw them in a try block followed by one or more catch blocks (handlers):
    void negate(Array &a)
    {
        for (int i=0; i<100; ++i)
            a[i] = -a[i];
        cout << "Array negated\n";
    }

    void f(Array &a)
    {
        try {
            negate(a);
        }
        catch (Array::Range) {
            cerr << "Array subscript out of range\n";
        }
        // Subsequent code
    }
negate() has a bug: it assumes that all Arrays have 100 members. If it tries to process one with fewer, Array::operator[] will throw a Range exception, skipping the rest of the operator's code. From here the computer 'looks at' the part of negate() that called the operator; as it is not in a try block, negate() is also terminated early, skipping the cout statement, but the variable i is still deallocated. The call to negate() in f() is inside a try block, so the handlers that follow it are examined in order. In this case there is only one, and the code within it is executed because its type matches that of the thrown object. The "Subsequent code" is executed if an exception is caught or if there is no exception, but not if an exception is thrown that none of the handlers catches.
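The walk-through above can be put together into something that compiles on its own. This is only a sketch under some assumptions of mine: the Array is fixed at four elements, the standard <iostream> header and std:: prefix are used, a check for negative subscripts is added, and f() returns a flag (not in the original) so that the path taken can be observed.

```cpp
#include <iostream>

class Array {   // of ints, fixed at 4 elements for this sketch
    int size;
    int array[4];
public:
    class Range {};
    Array() : size(4)
    {
        for (int i = 0; i < size; ++i)
            array[i] = i + 1;
    }
    int &operator[](int subscript)
    {
        if (subscript < 0 || subscript >= size)
            throw Range();
        return array[subscript];
    }
};

// Same bug as in the article: assumes 100 members.
void negate(Array &a)
{
    for (int i = 0; i < 100; ++i)
        a[i] = -a[i];
    std::cout << "Array negated\n";   // Skipped when Range is thrown
}

// Returns true if the Range exception reached this handler.
bool f(Array &a)
{
    bool caught = false;
    try {
        negate(a);
    }
    catch (Array::Range) {
        std::cerr << "Array subscript out of range\n";
        caught = true;
    }
    // Subsequent code: reached because the exception was handled
    return caught;
}
```

Note that the four valid members have already been negated by the time the exception is thrown; unwinding does not undo work that was completed before the throw.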
Using a caught value
The bit in brackets following keyword catch is a little like a function argument. As well as acting as a tag for matching the type, a variable can be put here to be used within the handler. For example, we may want to indicate the value of the subscript that caused an exception:
    class Array {
        // ...
        class Range {
        public:
            int bad_sub;
            Range(int s) : bad_sub(s) {}
        };
        // ...
    };
We would then have to change the throw expression to:
throw Range(subscript);
and the handler could become:
    catch (Array::Range r) {
        cerr << "Array subscript " << r.bad_sub << " out of range\n";
    }
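The same idea in a compact, self-contained form may help. To keep it short I have assumed a free function working on a plain C array instead of the Array class; checked() and fill() are hypothetical names invented for this sketch, and fill() returns the offending subscript purely so that the result can be inspected.

```cpp
#include <iostream>

class Range {
public:
    int bad_sub;
    Range(int s) : bad_sub(s) {}
};

// Hypothetical checked access to a plain array of `size` ints.
int &checked(int *array, int size, int subscript)
{
    if (subscript < 0 || subscript >= size)
        throw Range(subscript);   // Carry the bad value with the exception
    return array[subscript];
}

// Zeroes `count` members; returns the subscript that caused an
// exception, or -1 if there was none.
int fill(int *array, int size, int count)
{
    try {
        for (int i = 0; i < count; ++i)
            checked(array, size, i) = 0;
    }
    catch (Range r) {
        std::cerr << "Array subscript " << r.bad_sub
                  << " out of range\n";
        return r.bad_sub;
    }
    return -1;
}
```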
Catching derived classes
Errors can usually be categorised; exceptions allow categorisation to be modelled as base and derived classes. For example, suppose Range was just one of a group of errors that could be thrown by an operation on an Array:
    class Array {
        // ...
        class Error {
        public:
            virtual void print_reason()
            {
                cerr << "Array error\n";
            }
        };
        class Range : public Error {
            int bad_sub;
        public:
            Range(int s) : bad_sub(s) {}
            void print_reason()
            {
                cerr << "Array subscript " << bad_sub << " out of range\n";
            }
        };
        class MisMatch : public Error {
            // Similar stuff to Range
        };
        // ...
    };
Calling the base class Error might seem foolishly vague, but out of the scope of Array it has to be qualified with Array:: anyway. A MisMatch might be thrown by attempting to add the members of two arrays of different sizes.
Then we might write:
    void subtract(Array &a, Array &b)
    {
        try {
            negate(b);
            add(a, b);
            // Assuming we have defined
            // void add(Array &, Array &);
        }
        catch (Array::Error &e) {
            e.print_reason();
        }
        // Subsequent code
    }
Note the use of virtual functions to allow a specific message to be printed without the handler knowing which of the Array::Error classes it has caught.
To be able to do this, you must catch a reference to the exception, not simply the exception. Catching by value would copy only the base-class part (e.g. Error) of a derived object (e.g. Range) into a new base object, losing the information specific to the derived class.
Multiple inheritance is also often useful for exception types.
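The difference between catching a reference and catching a value can be seen in a small sketch. For brevity I have assumed free-standing Error and Range classes and a reason() function that returns a string where the article's version prints one; these are my simplifications.

```cpp
#include <string>

class Error {
public:
    virtual ~Error() {}
    virtual std::string reason() const { return "Array error"; }
};

class Range : public Error {
public:
    virtual std::string reason() const
    {
        return "Array subscript out of range";
    }
};

// Handler receives a reference, so the virtual call reaches Range.
std::string caught_by_reference()
{
    std::string result;
    try {
        throw Range();
    }
    catch (Error &e) {
        result = e.reason();
    }
    return result;
}

// Handler receives a copy of the Error part only, so the virtual
// call reaches Error.
std::string caught_by_value()
{
    std::string result;
    try {
        throw Range();
    }
    catch (Error e) {
        result = e.reason();
    }
    return result;
}
```

Some compilers warn about the second handler, which is a fair summary of how rarely catching a polymorphic class by value is what you want.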
Multiple handlers
Only being able to handle one type of exception, or those derived from it, would be a little restrictive, so you can have a chain of handlers:
    {
        try {
            // ...
        }
        catch (type1) {
            // ...
        }
        catch (type2) {
            // ...
        }
        // ...
        catch (...) {
            // ...
        }
    }
The final catch statement with an ellipsis (...) is not my abbreviation. This stands for catch any type.
The syntax is reminiscent of a switch...case statement, but the differences are:
Each handler has its own scope.
When program flow reaches the end of a handler, none of the others are executed (whereas in a case statement you'd have to add break).
The catch-all handler, catch (...), is analogous to switch...case's default.
Handlers must be in a sensible order to avoid writing ones which can never get executed. This means that handlers for derived classes must go before those for their base classes, and a (...) handler must be the last one for any try block.
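A short sketch of a correctly ordered chain follows. The What enumeration and the throw_one() and classify() functions are hypothetical, invented just to drive each handler in turn.

```cpp
#include <string>

class Error {};
class Range : public Error {};
class MisMatch : public Error {};

enum What { NOTHING, RANGE, MISMATCH, INTEGER };

// Hypothetical helper: throws something different for each request.
void throw_one(What what)
{
    if (what == RANGE)    throw Range();
    if (what == MISMATCH) throw MisMatch();
    if (what == INTEGER)  throw 42;
}

std::string classify(What what)
{
    try {
        throw_one(what);
    }
    catch (Range &) {        // Derived class first
        return "range";
    }
    catch (Error &) {        // Then its base class
        return "other array error";
    }
    catch (...) {            // Catch-all must come last
        return "unknown";
    }
    return "no exception";
}
```

If the Error handler were moved above the Range handler, it would match Range objects as well and the Range handler could never be reached.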
Rethrowing an exception
A throw statement with no operand rethrows the last exception:
throw; // Rethrow last exception
A good use for this would be a situation like:
    void g()
    {
        // Set up something special
        try {
            // ...
        }
        catch (...) {
            // Clear up the special something
            throw;  // Let caller deal with
                    // the exception
        }
        // Clear up the special something
    }

    void f()
    {
        try {
            g();
        }
        // Many useful handlers
    }
throw on its own can only be used in a handler or in a function that is directly or indirectly called by a handler. Otherwise terminate() is called (see below).
One thing to remember is that the whole exception (i.e. all of a derived class) is rethrown even if only a base class was caught.
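This can be checked with a small sketch, assuming simplified Error and Range classes with a name() function, and a counter argument (my addition) standing in for the clearing up:

```cpp
#include <string>

class Error {
public:
    virtual ~Error() {}
    virtual std::string name() const { return "Error"; }
};

class Range : public Error {
public:
    virtual std::string name() const { return "Range"; }
};

void inner(int &cleanups)
{
    try {
        throw Range();
    }
    catch (Error &) {   // Caught as the base class...
        ++cleanups;     // Clear up the special something
        throw;          // ...but the original Range is rethrown
    }
}

std::string outer(int &cleanups)
{
    std::string seen = "nothing";
    try {
        inner(cleanups);
    }
    catch (Range &r) {
        seen = r.name();
    }
    catch (Error &) {
        seen = "base only";
    }
    return seen;
}
```

The Range handler in outer() is the one that runs, even though inner() only ever named Error.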
Uncaught exceptions
If an exception is not caught, the terminate() function is called. By default terminate() (defined void terminate(void);) calls abort() which abruptly terminates the program.
As with operator new, it is possible to replace terminate. The following is provided:
    typedef void (*PFV)();
    PFV set_terminate(PFV);
The return value is the previously registered handler. Unfortunately, I cannot tell you in which header the above is defined because Acorn C++ does not provide it.
By the time terminate() is called, automatic variables may already have been destructed (whether the stack is unwound when no handler is found is left to the implementation), but you may also have some global objects which need to be destructed. Therefore, a practical strategy might be:
    // Putting it all in a class keeps
    // things tidy
    class MyTerminate {
        static PFV old_terminate;
        static void my_terminate();
    public:
        // Must be public so that the global
        // object below can be constructed
        MyTerminate()
        {
            if (!old_terminate)
                old_terminate = set_terminate(my_terminate);
        }
    } my_terminate;

    PFV MyTerminate::old_terminate = 0;

    void MyTerminate::my_terminate()
    {
        set_terminate(old_terminate);
        exit(0);
    }
The reason my_terminate() restores the original function is that, otherwise, if a destructor called by exit() caused terminate() to be called again, there might be an infinite loop.
Resource acquisition is initialisation
Exceptions have led to a programming style known as 'resource acquisition is initialisation'. Whenever a function or block acquires a resource at its beginning, such as claiming some memory or opening a file, and it is released at its end, care should be taken that an exception will also cause the resource to be released. One way would be to wrap the resource acquisition and release in the constructor and destructor of an automatic variable, but to avoid introducing many trivial classes, it can be preferable to write something like:
    void f()
    {
        char *workspace = new char[256];
        try {
            // Use workspace
        }
        catch (...) {
            delete[] workspace;
            throw;
        }
        delete[] workspace;
    }
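For comparison, the first alternative mentioned above, wrapping the resource in an automatic variable, might look something like this. The Workspace and Failure classes and the live_buffers counter are hypothetical; the counter exists only so that the absence of a leak can be checked.

```cpp
#include <cstddef>

int live_buffers = 0;   // Buffers currently allocated (for checking only)

class Workspace {
    char *data;
    Workspace(const Workspace &);             // Not copyable
    Workspace &operator=(const Workspace &);
public:
    explicit Workspace(std::size_t bytes) : data(new char[bytes])
    {
        ++live_buffers;
    }
    ~Workspace()
    {
        delete[] data;
        --live_buffers;
    }
    char *get() { return data; }
};

class Failure {};

void use_workspace(bool fail)
{
    Workspace workspace(256);   // Acquired by the constructor
    workspace.get()[0] = 'x';
    if (fail)
        throw Failure();
    // No delete[] needed on either path: the destructor releases it
}

bool leaked_after_failure()
{
    try {
        use_workspace(true);
    }
    catch (Failure &) {
    }
    return live_buffers != 0;
}
```

The function body needs neither a try block nor duplicated clearing up, at the cost of one small class.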
Exceptions and constructors
Exceptions make it much easier to handle errors from constructors. Constructors do not have a return value, so without them, a programmer has to resort to providing error flags; the stream libraries do this.
The interaction between constructors and exceptions is potentially fragile, but it has been carefully designed so that a little care and application of the 'resource acquisition is initialisation' technique leads to reliable operation. Remember (issues 8.12 and 9.1), a constructor does the following before executing the code provided as a function:
Constructs base classes.
Initialises special members.
As we have seen, this process can either be automatic, controlled by the programmer or a mixture.
As far as the exception mechanism is concerned, an object or sub-object is not considered initialised until its constructor has completely executed.
If an exception occurs during a constructor, all sub-objects and members that have been constructed are destructed in reverse order. The destructor corresponding to the constructor in which the exception occurred is not called; therefore constructors should be written with 'resource acquisition is initialisation' in mind.
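The order of events can be traced with a sketch like the following. Part, Whole, Failure and the global trace string are names I have made up for the purpose; each constructor appends "+" and a letter to the trace, and each destructor appends "-" and the same letter.

```cpp
#include <string>

std::string trace;   // Records the order of events

class Part {
    char id;
public:
    explicit Part(char c) : id(c) { trace += '+'; trace += id; }
    ~Part() { trace += '-'; trace += id; }
};

class Failure {};

class Whole {
    Part first;
    Part second;
public:
    Whole() : first('a'), second('b')
    {
        trace += "+W";
        throw Failure();   // The constructor body fails
    }
    // Never runs here: the Whole was not completely constructed
    ~Whole() { trace += "-W"; }
};

std::string build()
{
    trace = "";
    try {
        Whole w;
    }
    catch (Failure &) {
        trace += "!";      // Handler runs after unwinding
    }
    return trace;
}
```

build() returns "+a+b+W-b-a!": both members are destructed in reverse order before the handler is entered, and "-W" never appears.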
If a destructor that is executed during stack unwinding caused by an exception itself throws an exception, terminate() is called. There is no provision for nesting of exceptions in this way. Try to avoid throwing exceptions from destructors, although it is not always practical to keep track of every function that a destructor calls.
One form of nested exception that is allowed is for a handler to throw another exception, or for exceptions to be nested by having a try...catch block in a handler. The latter should be avoided if possible.
New handlers
Without exceptions, providing a new handler by passing a function address to set_new_handler() is an extremely limited solution. The handler can either free some memory and return, in which case new tries the allocation again, or not return at all. If it cannot make any memory available, all it can do is terminate the program, probably by calling exit(). With exceptions, the program can be given its 'second chance' at claiming memory and, if that fails too, the new handler can throw an exception so that the program can still continue in some way (essential for an application).
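A sketch of such a handler is given below. It assumes the standard <new> header, where set_new_handler() lives in namespace std and a new handler is expected to throw std::bad_alloc or something derived from it; the reserve buffer, its size, and the OutOfMemory, release_reserve_or_throw() and install_handler() names are all my own invention.

```cpp
#include <new>

static char *reserve = 0;   // Emergency memory, given up on demand

// Hypothetical exception type for this sketch.
class OutOfMemory : public std::bad_alloc {};

void release_reserve_or_throw()
{
    if (reserve) {
        // First call: free the reserve and return, so that
        // new gets its second chance
        delete[] reserve;
        reserve = 0;
        return;
    }
    // Nothing left to free: let the program handle it
    throw OutOfMemory();
}

void install_handler()
{
    reserve = new char[16 * 1024];
    std::set_new_handler(release_reserve_or_throw);
}
```

A caller that wraps its allocations in a try block can then catch the exception, abandon the current operation and carry on, perhaps after claiming a fresh reserve.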
Interface specifications
A function may specify, in its declaration, what types of exception it can throw. The syntax is:
void f(int a) throw (T1, T2, T3);
This means that the only types that f() can throw are T1, T2, T3, or classes derived from them. Without any specification (conventional declaration), a function is allowed to throw any type, and with an empty specification (throw()), it cannot throw any exceptions at all.
If a function throws an exception that is not specified, unexpected() is called. Its default action is to call terminate(), but it can be replaced with set_unexpected() in the same way as terminate.
If you provide the above specification, it is equivalent to defining f() as:
    void f(int a)
    {
        try {
            // f()'s code
        }
        catch (T1) {
            throw;
        }
        catch (T2) {
            throw;
        }
        catch (T3) {
            throw;
        }
        catch (...) {
            unexpected();
        }
    }
This means that unexpected() is conceptually called from a handler, so it may rethrow the exception. A typical way of exploiting this would be to give some error message to indicate that an unexpected exception has been thrown, before letting it be handled as normal.
Exceptions that are not errors
I will not go into details here, but consider a loop that calls a function and completes the loop when the function returns a value to say it has finished (e.g. reading a file). If the function is checking for this condition, it is really a waste of time for the loop to have to check it as well. If the loop is likely to have many iterations and is time-critical, it may be more efficient to have the function throw an exception when it reaches its terminal condition, even if this is not an error.
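As a rough illustration of the shape such code takes, here is a sketch in which a hypothetical Reader class hands out integers from an array and throws an equally hypothetical EndOfData object when there are none left. Whether this really is faster than testing in the loop depends on how cheaply your compiler implements exceptions, so treat it as a pattern to measure, not a guaranteed saving.

```cpp
#include <cstddef>

class EndOfData {};   // Signals normal completion, not an error

class Reader {
    const int *data;
    std::size_t count;
    std::size_t pos;
public:
    Reader(const int *d, std::size_t n) : data(d), count(n), pos(0) {}
    int next()
    {
        if (pos == count)     // The only end-of-data test
            throw EndOfData();
        return data[pos++];
    }
};

long sum_all(Reader &r)
{
    long total = 0;
    try {
        for (;;)
            total += r.next();   // The loop itself makes no test
    }
    catch (EndOfData) {
        // Reached the end: nothing more to do
    }
    return total;
}
```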
That's all folks
I have now covered virtually all of C++. There are a few other concepts, but these are either minor or very new and not standardised; they are not supported by any of the compilers so far available for RISC OS, to my knowledge.