Archive

C++ - Part 1



Tony Houghton

In this column, I intend to provide C programmers with a good working knowledge of C++; it will tend to concentrate on the actual language, rather than RISC OS specific issues. Although I am a relative newcomer to C++, I am a reasonably experienced C programmer, and I thought that writing a series of articles, using Bjarne Stroustrup as a reference, would be a good way to get a firm grasp of the concepts myself.

This first article will introduce the concepts of C++ and detail the minor, but useful, changes from ANSI C that can be applied to procedural (non-object-oriented) programming. Subsequent articles will deal in more detail with entities such as classes and, hopefully, how to use them effectively.

An overview of C++

The main facilities C++ adds to ANSI C can be summed up as:

A "better" C

C++ can be used to write C-style procedural programs using the enhancements below, but to do this would be grossly under-using the language. Therefore, these sections are intended for introduction and reference only. You should expand your ideas to embrace object-oriented programming as soon as possible.

Comments

C++ adds a new type of comment, beginning with // and ending at the end of the line. This is now the preferred style, with two exceptions: commenting out a block of code (where the comment spans several lines and it is inconvenient to prefix every line with //), and placing code after a comment on the same line (which is best avoided anyway). A // comment can appear inside a /* */ comment, and vice versa.

Type checking

C++ is stricter than ANSI C about type checking. Functions must be defined with their arguments, or declared, before they can be called (Acorn ANSI C gives a warning; C++ gives an error). An int cannot be assigned to an enum, nor a void * to any other pointer type, without an explicit cast, but the reverse conversions are permitted:

enum acorn {electron, bbc, master, archimedes, risc_pc};
enum acorn computer;
int machine;
void *vptr;
int *iptr;
int main()
{
  computer = archimedes; // OK, enum = enum
  machine = archimedes;  // OK, int = enum
  computer = 3;          // Error: enum = int
  vptr = &computer;      // OK, void * = &variable
  iptr = &computer;      // OK here, int * = &enum (but ISO C++ rejects this)
  vptr = iptr;           // OK, void * = int *
  iptr = vptr;           // Error: int * = void *
}
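
Where one of the conversions marked as an error above is really intended, an explicit cast states that intention. The following is only a minimal sketch, assuming a present-day standard compiler rather than Acorn's translator; the function names machine_from_number() and as_int_pointer() are made up for illustration.

```cpp
enum acorn { electron, bbc, master, archimedes, risc_pc };

// Convert a number to the enum; the cast is required in C++.
// (The caller is assumed to pass a value that names a real enumerator.)
acorn machine_from_number(int n)
{
  return (acorn) n;          // enum = (enum) int
}

// Recover a typed pointer from a void *; again the cast is required.
int *as_int_pointer(void *vptr)
{
  return (int *) vptr;       // int * = (int *) void *
}
```

The casts do not check anything at run-time; they simply tell the compiler that the conversion is deliberate.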

Streams

Streams can only be very briefly introduced at this stage, but it is useful to know how to use cout and cin. These are sometimes incorrectly referred to as functions, whereas they are actually objects (analogous to a variable with a struct type).

Streams are more sophisticated than the functions provided by <stdio.h>, so they are in most ways superior, but they can be harder to set up and use (as you're about to see). As you would expect, there are streams specialised for use with files, strings and the screen and keyboard.

cout is the default output stream (cf. printf()) and cin is the default input stream (cf. scanf()). They are used as follows:

#include <iostream.h> // Contains declarations of cin & cout
#include <stdio.h>
int main()
{
  int number;
  cout << "Hello, here is a number: " << 4 << ".\n";
  printf("Enter another number.\n");
  cin >> number;
  cout << "\nYou entered " << number << ".\n";
}

Incidentally, if you try compiling and running this, the results may be a little unexpected, to say the least. I'll explain this shortly. The lines beginning with cout and cin look a little strange if you do not know about classes and overloaded operators. To see how the syntax fits (ignoring functionality), it may help you to imagine cout and cin as int variables, the operators << and >> as having int results (imagining + in place of them may be easier), and the strings as integers. It then becomes clear that the above are (chained) expressions whose values are ignored.

Depending on your implementation, you may have found that the second message, output by printf(), appeared before the first cout message, and you had to enter loads of numbers (or give up and press escape), before the last message appeared. This is because streams are buffered; the buffers are allowed to fill before any characters are passed to the screen or the program. The easiest general purpose 'cure' is to add the lines:

  cout.sync_with_stdio();

  cin.sync_with_stdio();

before using the streams (e.g. at the start of main()).

Variable declaration

C++ allows automatic (local) variables to be declared anywhere within a function, not only at the start of a block. Variables can also be declared in the first sub-statement of a for; they then have scope to the end of the block enclosing the for (ISO C++ later restricted their scope to the for statement itself). A variable declaration cannot follow if except in braces {}. Variables can be declared, but not initialised, in the middle of switch statements.

#include <stdio.h>
int g(int z) { return z + 1; }
int h(int y) { return y - 1; }
int main(int argc, char *argv[])
{
  int a, b = argc;      // As in C
  int c;
  a = g(argc);          // Call another function
  int d = h(a);         // Declaration mid-block (OK)
  if (argc) int e;      // Error: conditional declaration
  for (int index = 0;   // Declaration with for (OK)
       index < argc;
       index++)
  {
    printf("%s", argv[index]);
  }
  switch (argc)
  {
    case 0:
      c = h(-1);
      break;
    case 1:
      int e;            // Declaration mid-block (OK)
      e = h(argc);
      break;
    case 2:
      e = h(argc);      // OK: e still in scope from above
      int x = e;        // Error: initialisation within switch
      break;
  }
  return index;         // OK here: index still in scope from for
                        // (ISO C++ ends its scope with the for)
}

Hidden global variables

If an automatic variable or argument in a function has the same name as a global variable, the global variable is hidden within the scope of the function, i.e. it cannot be accessed by name because all references to that name refer to the local variable instead. C++ allows access to hidden global variables (but not to other hidden local variables) by prefixing their name with ::.

int x;
int main()
{
  int x;
  x = 10;      // Refers to automatic variable
  ::x = x;     // Assigns the value of automatic x to global x
}

References

A reference to another variable or constant can be defined with int &ref = var;

Wherever ref is mentioned in its scope, it actually refers to var. This means that references must be initialised (with the variable they refer to), unless they are declared extern. Initialisation is quite different from assignment: the former makes the reference an alternative name for a variable, whereas a subsequent assignment assigns to the original variable. A reference can only refer to a constant if the reference is itself const. const references can be initialised with non-lvalues, and even with a value of a different arithmetic type (e.g. an int for a float reference). In such cases, the reference actually refers to a compiler-generated const temporary (?!) with its own distinct storage.
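
The difference between initialising and assigning, and the hidden temporary behind a const reference, can be seen in a short sketch. It assumes a present-day standard compiler, and the function names alias_demo() and temporary_demo() are invented for this illustration.

```cpp
// Returns the value of var after it has been changed through a reference.
int alias_demo()
{
  int var = 1;
  int &ref = var;   // Initialisation: ref is now another name for var
  ref = 5;          // Assignment: stores 5 in var itself
  return var;       // var now holds 5
}

// A const reference may be initialised from a value of another type.
// The compiler creates a temporary float and binds the reference to it.
float temporary_demo()
{
  int i = 3;
  const float &fref = i;  // Refers to a temporary float holding 3.0
  i = 4;                  // Changes i, but not the temporary
  return fref;            // The temporary still holds 3.0
}
```

In the second function, fref is not an alias for i at all, which is why changing i afterwards has no effect on the value read through the reference.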

References are only really useful when used with functions. Their main use is as arguments (call-by-reference arguments). If a function needs to be passed a struct and you prefer the . notation to ->, a reference avoids the inefficiency of temporarily duplicating the struct.

void rect_to_workarea(const WimpGetRectangleBlock &cvtstr,
                                               BBox &rect)
// cvtstr is const because function 
// doesn't alter it; no temporary is 
// needed, because original block 
// would have been an lvalue of the 
// same type. Lack of const in front 
// of BBox &rect implicitly declares
// intention of function to alter it
{
  rect.xmin -= cvtstr.visible_area.xmin - cvtstr.xscroll;
  // ... etc ...
}
void redraw_window(int wh)
{
  WimpGetRectangleBlock redraw_block;
  BBox wkarea_clip;
  // ...
  rect_to_workarea(redraw_block, wkarea_clip); // Alters wkarea_clip
  // ...
}

In fact, it is usually clearer to use a pointer in place of modifiable reference arguments (such as wkarea_clip/rect).

Functions can also return references, e.g. int &f(void); this means that you can assign to, modify or take the address of a function result. As above, this is really only a notationally convenient alternative to returning a pointer.
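
As a rough illustration of a function result used as an lvalue, the sketch below returns a reference to an element of a static array. The names counters, counter() and counter_demo() are hypothetical, and the index is deliberately not range checked.

```cpp
static int counters[3] = { 0, 0, 0 };

// Returns a reference to one element, so the call can appear on the
// left of an assignment. The index is not range checked in this sketch.
int &counter(int which)
{
  return counters[which];
}

int counter_demo()
{
  counter(1) = 10;         // Assign to the function result
  counter(1)++;            // Modify it in place
  int *p = &counter(1);    // Take its address
  return *p;               // counters[1] is now 11
}
```

The function must return a reference to something that outlives the call (here a static array); returning a reference to an automatic variable would leave it referring to storage that no longer exists.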

One thing for which I find references useful, but which is not usually documented, is to provide a way of efficiently converting the 'handle' of an event handler to a useful reference to an object (not a pointer to it, as you will see) without the overheads of creating a new object at run-time, e.g.

int load_handler(WimpMessage *message, void *handle)
{
  file_data *data_ptr1 = (file_data *) handle;
    // Wasteful, creates a new variable with a duplicated value
  file_data *&data_ptr2 = (file_data *) handle;
    // Gives 'anachronism' warning: cast is not an lvalue
  const file_data *&data_ptr3 = handle;
    // Wrong! Defines non-const reference to pointer to const
  file_data *const &data_ptr4 = (file_data *) handle;
    // OK, but likely to create a temporary,
    // defeating aim at efficiency
  file_data &data_ref = *((file_data *) handle); // (now an lvalue)
    // Does the job, if perhaps inelegantly; const optional

// Now use data_ref.<member>, not data_ref-><member>

  // ... rest of function ...
}

Memory management

C++ provides two extra operators (not functions) for memory management. They allocate and deallocate memory from the free store, like malloc() and free(). new allocates enough memory for an object of the type specified, creates the object (important for classes with constructors), and returns a pointer to it. delete frees the memory used by the object pointed to by its operand, calling any destructor first. Deleting a null pointer is guaranteed to do nothing. Arrays can be created by specifying the number of elements in square brackets after the type name; any constructor is called for each element. Arrays are deleted by putting empty square brackets immediately after delete (the size of a dynamic array is stored alongside it in an implementation-dependent way). Deleting a single object with array delete, or vice versa, has undefined results (i.e. probably a crash), and the compiler cannot always detect this.

int *iptr = new int;      // Create an int on the free store
// ... some code ...
delete iptr;              // Delete the int
iptr = new int[100];      // Reuse the pointer for an array of 100 ints
// ... some code ...
delete[] iptr;            // Delete the whole array

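new and delete should, if possible, be used in preference to malloc() and free(). There is no replacement for realloc(), but it is easy to allocate a larger block with new, copy the contents across, then delete the original. With this generation of compilers, a failed new returns a null pointer (ISO C++ instead throws std::bad_alloc, unless the new(std::nothrow) form is used). In addition, you can register a function (with no arguments or return value) to be called when allocation fails, using set_new_handler(), declared in <new.h>.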
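
One way of standing in for realloc() might look like the sketch below. The function name grow_array() and its parameters are invented for this example, it handles only arrays of int, and it makes no attempt to deal with allocation failure.

```cpp
// Returns a new array of new_size ints holding the first old_size
// values of old_array, and deletes the old array.
// Assumes new_size >= old_size.
int *grow_array(int *old_array, int old_size, int new_size)
{
  int *new_array = new int[new_size];
  for (int i = 0; i < old_size; i++)
    new_array[i] = old_array[i];   // Copy the existing contents
  for (int j = old_size; j < new_size; j++)
    new_array[j] = 0;              // Clear the extra elements
  delete[] old_array;              // Harmless if old_array is a null pointer
  return new_array;
}
```

A call such as a = grow_array(a, 2, 4); replaces a two-element array with a four-element one. Unlike realloc(), this always moves the data, so any other pointers into the old array become invalid.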

Functions

Function declarations can be preceded by the keyword inline, so that, where possible, function calls are replaced by the code in the body of the function. This is a replacement for macros, where it is desirable to avoid the time overheads of calling a short function. The advantage of inline functions over macros is that they can be strictly type checked. Acorn's C++ translator is limited in that there can be no statements following a return statement in an inline function.
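
A small sketch of the difference, using made-up names (DOUBLE_MACRO, double_inline, macro_result, inline_result). The macro is pasted in as text, so it is neither type checked nor treated as a single value, whereas the inline function behaves exactly like an ordinary call.

```cpp
#define DOUBLE_MACRO(x) x + x        // Text substitution, no type checking

inline int double_inline(int x) { return x + x; }

// Expands to 3 + 3 * 2, which is 9
int macro_result()  { return DOUBLE_MACRO(3) * 2; }

// Evaluates (3 + 3) * 2, which is 12
int inline_result() { return double_inline(3) * 2; }
```

Extra brackets in the macro would cure this particular problem, but the inline function needs no such care and rejects arguments of the wrong type.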

Functions can also be declared with default arguments by giving them initialisers (in the first declaration only). The arguments can then optionally be omitted when calling the function:

inline int f(int a = 0) { return a + 1; }
int main()
{
  int b = f(2);     // a = 2, b = 3
  int c = f();      // a = 0, c = 1
}

Any arguments following the first default argument in the declaration must also be default.
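
For instance, in the following sketch (the function area() is hypothetical) the last two arguments have defaults, so the function can be called with one, two or three arguments.

```cpp
// Both trailing arguments have defaults; width could not be given a
// default on its own, because the arguments after it would then need one.
int area(int width, int height = 1, int scale = 1)
{
  return width * height * scale;
}
```

Here area(5) gives 5, area(5, 2) gives 10 and area(5, 2, 3) gives 30. There is no way to supply scale while leaving height to default, because arguments can only be omitted from the right.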

If you do not intend to use an argument in a function, but it has to be included for type equivalency (often the case with RISC OS event handlers), it need not be named in the function's definition. This avoids having to use wasteful statements such as handle = handle; to suppress annoying compiler warnings. (Acorn's example C code uses handle = handle; style statements even though ANSI C does not warn of unused arguments; C++ does.)

int my_message_handler(WimpMessage *msg, void *)
// void * argument can be ignored
// without generating warning
{
  // ... rest of function ...
}

Non-simple types

By non-simple types, I mean struct, union, enum and class. In C, you would generally use:

struct quad_word { int a[4]; };
/* Must always be referred to as
   struct quad_word. Can be referred
   to before definition provided no
   access to members is attempted.  */

or

typedef struct quad_word { int a[4]; } quad_word;
/* Referred to as quad_word, but
   only after definition. Can also
   be referred to as struct
   quad_word, because the struct
   tag is given as well.      */

In the second case, the struct is given the same name as the attached typedef name, so this is the more flexible method in C. In C++, the class key (struct, union, enum, class) can be omitted in subsequent references to the type, even if a typedef is not used:

struct quad_word { int a[4]; };
quad_word q;            // OK in C++, error in C (needs struct quad_word)
q.a[0] = 1;

Anonymous unions

When nested within a structure, unions need not have a name in C++:

struct mc_result
{
  int tag;
  union {
    int words[4];
    char bytes[16];
  };             // No name necessary in C++
} res_holder;

The words member of res_holder can then be accessed by res_holder.words without an intervening union name, e.g. res_holder.words[0].

Linking C++ and C programs

You will often want to link C object files with C++ object files, e.g. to use <stdio.h> etc in C++. However, C++'s extra features mean that function and variable names have to be expanded into more complicated forms in object files. To allow C++ programs to access C, all C declarations must be preceded by extern "C". This tells the compiler to create a reference to the simpler C-style name. Whole blocks can be made extern "C" by enclosing them in braces {} preceded by extern "C". Libraries for use with both languages can use the directive conditionally, by testing for the predefined macro __cplusplus. (See one of the new clib header files for details of how this works.) Note the matching conditionally compiled closing brace } at the end of each file. Note, too, that __cplusplus is not automatically predefined by Acorn's c++ tool (use the Define entry from its menu), but it is predefined by Make.
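
The conditional directive typically takes the following shape. This is only a sketch of the usual pattern, not a copy of any clib header; the function c_add() is made up, and a stand-in definition is included so that the fragment compiles on its own.

```cpp
#ifdef __cplusplus
extern "C" {              // Only seen by a C++ compiler
#endif

int c_add(int a, int b);  // Declared with a plain C-style name

#ifdef __cplusplus
}                         // Matching conditionally compiled brace
#endif

// Stand-in definition so that the sketch links on its own; normally
// this would live in a separately compiled C source file.
int c_add(int a, int b)
{
  return a + b;
}
```

A C compiler skips both conditional sections and sees an ordinary declaration, while a C++ compiler sees the declaration wrapped in extern "C" { }.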

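Similarly, C programs can access the C-compatible parts (simple functions, non-class variables) of C++ programs. The portable way is to declare those items extern "C" in the C++ source, so that they are given C-style names. Declaring them extern "C++" on the C side only works if the C compiler recognises the directive and knows how to expand C++ names, which standard C compilers do not.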

Programming style

To program effectively in C++, you will have to make a considerable effort to think in terms of objects rather than procedures. However, there are some simple guidelines for improving reliability that are relevant to all types of programming:

