Archive

The ASCII Code



Gerald Fitton

An 8-bit binary number is called a byte. In binary, such an eight-bit number lies between 0 (usually written with leading zeros as 00000000) and 11111111. Converted to decimal, these binary numbers represent values between 0 and 255. Let me put this another way: if you want to describe the contents of one single byte of memory (you've probably got a few million bytes of RAM in your machine) then you can do so by saying that the memory location contains a number between 0 and 255 inclusive.
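
The byte range described above can be checked with a few lines of code. This is only an illustrative sketch in Python (not a language the article itself uses); the helper name to_binary is made up for this example.

```python
# Smallest and largest values that fit in one 8-bit byte,
# converted from their binary spellings to decimal.
lowest = int("00000000", 2)
highest = int("11111111", 2)

# Eight bits give 2**8 distinct values.
distinct_values = 2 ** 8


def to_binary(value):
    """Return a decimal value as an 8-digit binary string (hypothetical helper)."""
    return format(value, "08b")


print(lowest, highest, distinct_values)
print(to_binary(65))
```

The leading zeros produced by to_binary match the way the article writes zero as 00000000.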

Look at your keyboard and consider the following. Our alphabet has 26 lower case letters, 26 upper case letters, 10 digits (note that these are not numbers) 0 to 9 and a miscellany of punctuation symbols. In total, there are about 96 characters directly available from the keyboard (yes, I know there are a few more but let's keep it simple for now). Each of these printable characters can be coded as a number. For example, the capital letter A is coded as the number 65 and the lower case a as the number 97. This code is called the ASCII code and is used for storing text within the memory of your computer.

A <space> is stored in memory as ASCII code 32. ASCII codes below 32 are reserved for control codes such as printer instructions; for example, ASCII code 12 is an instruction to the printer to execute a form feed. In general, all the codes necessary to print a letter in English (or should I say American) correspond to ASCII code numbers between 0 and 127.
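
The code numbers quoted so far can be looked up directly. The sketch below uses Python's built-in ord() and chr() purely as an illustration; the variable names are chosen for this example.

```python
# ord() gives the code number of a character; chr() goes the other way.
code_upper_a = ord("A")
code_lower_a = ord("a")
code_space = ord(" ")

# Code 12 is the form feed control character, written "\f" in Python.
form_feed = chr(12)

# The printable ASCII characters run from code 32 (space) to code 126 (~).
printable = [chr(code) for code in range(32, 127)]

print(code_upper_a, code_lower_a, code_space)
print(len(printable))
```

Counting the printable characters this way gives 95, in line with the "about 96" mentioned above.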

For example, to send the word 'Gerald' to the printer, the codes stored in memory and transmitted to the printer are: 71  101  114  97  108  100. The printer will decode this six byte message and, using its internal fonts, will print the corresponding 'ASCII coded' characters. Only six bytes are required to store and to print 'Gerald'! These six bytes can be transferred from computer to printer and decoded at the printer about 300 times more quickly than the outline font (graphic) equivalent. Speed is the biggest single advantage of using character printing when compared with graphics printing.
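
The six byte values for 'Gerald' can be reproduced as follows. Again this is just a Python sketch of the encoding step; it does not model the printer itself.

```python
word = "Gerald"

# Encode the text as ASCII: one byte per character.
data = word.encode("ascii")

# View the bytes as a list of decimal code numbers.
codes = list(data)

# Decoding the same numbers recovers the original text,
# which is in effect what the printer does with its internal fonts.
decoded = bytes(codes).decode("ascii")

print(codes)
print(len(data), decoded)
```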

The extended ASCII code

Up to now, I haven't said what the codes between 128 and 255 inclusive are used for. These are called 'top bit set' characters because, in binary, the first (i.e. the most significant) of the eight bits is a 1, i.e. 'set' to a 1 (for codes between 0 and 127, the top bit is a 0). In the so-called Latin 1 character set, one of the uses of these top bit set characters is to represent characters such as ± and accented (foreign) characters such as é ï ö or ç. The Latin 1 character set used by the Archimedes has a few more characters than the internationally agreed Latin 1. These extras include the 'smart quotes' “ and ”; the standard Latin 1 includes only 'sexless quotes' such as ".
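
The 'top bit set' test can be sketched in code. Note the assumption here: Python's "latin-1" codec is the internationally agreed Latin 1 (ISO 8859-1), not the extended Archimedes variant, so the smart quotes are left out of the example. The function name top_bit_set is made up for illustration.

```python
def top_bit_set(code):
    """Return True if the most significant of the eight bits is 1."""
    return (code & 0b10000000) != 0


# Characters the article mentions, with their standard Latin 1 codes.
samples = "±éïöç"
latin1_codes = {ch: ch.encode("latin-1")[0] for ch in samples}

for ch, code in latin1_codes.items():
    print(ch, code, format(code, "08b"), top_bit_set(code))

# Plain ASCII characters such as 'A' (65) have the top bit clear.
print(top_bit_set(ord("A")))
```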

Perhaps the most important thing to understand is that, within your computer, each character you see displayed is stored in memory as a single byte, a number between 0 and 255.

When you send a text file to your printer, every character is sent as a single byte; each byte sent corresponds to a number between 0 and 255. Having specified a character set at your printer, the printer decodes that number into a printable character and prints the character from that particular character set. If you have the Latin 1 set loaded for your screen display but you have the IBM set selected at the printer, top bit set characters won't print out as they are on the screen. Each character will print out as the character with the same code number from the IBM set!
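
The mismatch described above can be imitated by decoding one byte with two different character sets. This sketch assumes that 'the IBM set' corresponds to IBM code page 437, which Python exposes as the "cp437" codec; a particular printer's IBM set may differ.

```python
# A single top bit set byte, as it would be sent to the printer.
byte_value = bytes([233])

# Decoded with standard Latin 1 (the screen's character set in this sketch).
on_screen = byte_value.decode("latin-1")

# Decoded with code page 437 (assumed here to stand in for the IBM set).
on_printer = byte_value.decode("cp437")

print(on_screen, on_printer, on_screen == on_printer)

# Codes below 128 agree in both sets, so plain ASCII text is unaffected.
plain = bytes([71, 101, 114, 97, 108, 100])
print(plain.decode("latin-1") == plain.decode("cp437"))
```

The byte sent is the same in both cases; only the character set used to decode it changes what appears on paper.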

