C Programming

 

 

Department of Physics and Astronomy

 

 

 

 

 

Computers and Programming Languages

C - A First Program **

VARIABLES **

CONSTANTS **

NUMBER SYSTEMS IN C **

Binary numbers **

Octal numbers

Hexadecimal numbers **

VARIABLE TYPES AND DECLARATIONS **

Fundamental Variable Types **

MEMORY ORGANISATION AND BINARY NUMBERS **

COMMENTS **

ASSIGNMENT STATEMENTS AND EXPRESSIONS **

POINTERS **

FUNCTIONS - A FIRST LOOK **

INPUT AND OUTPUT (KEYBOARD AND SCREEN) **

printf **

scanf **

putchar *

getchar *

getch *

COMPOUND STATEMENTS AND SCOPE **

RELATIONAL AND EQUALITY OPERATORS **

LOGICAL OPERATORS **

ARITHMETICAL OPERATORS **

Increment and decrement operators **

FLOW OF CONTROL **

The while statement **

The for statement **

The comma operator **

The do statement **

The if and if - else statements **

The break instruction *

The continue instruction

The switch statement *

The function exit *

ARRAYS AND STRINGS **

Arrays **

Multiple indices *

Arrays and loops **

STRINGS *

String constants *

BITWISE OPERATORS *

ASSIGNMENT OPERATORS **

PRECEDENCE AND ASSOCIATIVITY **

Precedence rules **

FUNCTIONS **

Function types **

return statement **

Functions with no return value **

Declaring functions **

Parameter passing to functions **

THE PREPROCESSOR **

#include **

#define **

Macros with arguments

ARRAYS AND POINTERS *

More about pointers *

More about arrays

Pointers and arrays

Pointer arithmetic *

Pointers to void

Passing arrays as function arguments

MIXED MODE ARITHMETIC AND CASTS **

MEMORY ALLOCATION *

The function calloc

The function malloc

The function free

The function realloc

STORAGE CLASSES

The storage class auto

The storage class extern

The storage class register

The storage class static

Static external variables

DECLARATIONS AND typedef *

STRUCTURES *

Structures of arrays *

Arrays of structures

Structure complex

Use of typedef for structures

Assignment statements for structures

Initialising structures

Pointers to structures

Structures of structures

Passing structures to functions

Passing elements

Passing whole structures and functions which return structures

Passing pointers to structures

FILE INPUT AND OUTPUT *

fopen

** Two asterisks indicate that the section is essential knowledge;

* one asterisk that it contains knowledge that is useful; and

no asterisks that the programmer should be aware that this feature of C exists.

C Programming

 

These notes are intended to give an introduction to the features of C which are important for scientific programming, so the emphasis is on the parts required for numerical work. Not every feature is described, but the omitted parts are unlikely to be needed in scientific programming or their effect can be achieved in another way.

It is not necessary to know all of C in order to be able to program using it, and a significant part of the language can be learned as required by the needs of the individual program being written. In order to help guide the reader through the complexities of C, each part is given a rating of its importance when learning C for the first time. Two asterisks indicate that the section is essential knowledge, one asterisk that it contains knowledge that is useful once programming skills in C have been developed, and no asterisks that the programmer should be aware that this particular feature of C exists but that it is unlikely to be needed until sophisticated programs have to be written.

 

Computers and Programming Languages

It is not necessary to understand much about what goes on inside a computer to be able to write many types of computer program, but a little knowledge can give some useful insight.

The most important parts of the computer are the processor and the memory. The processor, often called a microprocessor on small computers, carries out operations on numbers and is linked to the memory, which stores numbers at well defined locations. Even small computers can now store a million or more numbers in their memory. The operations performed by the processor can be arithmetical, such as addition or subtraction, or can take other forms such as comparing numbers or simply fetching numbers from memory or storing them there.

The processor has a well defined set of operations which it can perform. These are referred to as the instruction set and are usually different for different processors. Each instruction is encoded in the form of a number. The program, the sequence of operations to be carried out by the processor, is stored in the memory as a sequence of the corresponding numbers. The memory is also used to store data, the numbers to be manipulated by the program.

On their own, the processor and memory are quite useless since they cannot communicate with the outside world and it is necessary to have a number of other parts of the computer to achieve this. These give input and output and are often referred to as peripheral devices. Such devices can be the keyboard, the display screen, disk drives and printer output ports. They are used to load programs into the computer, to provide the programs with data and to display the results of the program.

In order to operate all these devices and to be able to load and execute a program, the computer needs a basic program called the operating system. This is always in the computer and, when another program is loaded and run, control is temporarily transferred to that program but returns to the operating system after the program is finished.

Obviously, it is quite impractical to enter the program in the form used by the processor unless the program is very short, since typically there can be several hundred possible sets of numbers for the full instruction set and the location of each piece of data has to be given explicitly. However, it is possible to program using suitable mnemonics for the instructions and the data and have them converted into the appropriate numbers. This is known as assembly language. It is very powerful since it allows the use of the whole of the processor's instruction set, and the computer can be made to do anything that is actually possible. A language which programs the processor in such a direct way is referred to as a low level language.

Clearly, if large programs are to be written, it is necessary to make the process simpler and to be able to write programs in a way which is much more closely related to their application. For mathematical work, a language often used is Fortran (FORmula TRANslation). This has a form which is close to mathematics and is an example of a high level language. Other examples of high level languages are Pascal and COBOL (used for commercial work). Programs written in a high level language have to be converted into a form that can be understood by the processor. This is carried out by a program called a compiler, which has to be run on the computer before the program written in the high level language can be used.

 

C - A First Program**

C is basically a high level language, but it has some features which allow a greater degree of control over the way in which the computer operates, and it is less specialised for a specific purpose than many high level languages, so it is often referred to as a medium level language. It is a general purpose language which is suitable for developing programs for computer operating systems, compilers, word processors and many other forms of general purpose programming as well as for mathematical programming. It is very suitable for scientific programming and for work which requires the interfacing of the computer with devices external to the computer.

It gets its name because it was developed from a language called B, although B was not developed from a language called A. B was developed from BCPL (Basic Combined Programming Language). There is now a standard form of C called ANSI C (American National Standards Institute) which is described in the current (second) edition of the book by Kernighan and Ritchie. There are earlier versions of C and corresponding C compilers which do not meet the standards of ANSI C. In order to allow the compilation of old programs, written before ANSI C was introduced, most compilers meeting the new standard will accept certain features which are not ANSI C. However, it is strongly recommended that any programs written now should not use these obsolescent features. These notes use ANSI C with occasional extensions from Borland Turbo C.

C is a very terse language but is highly structured so that it is just as suitable a language for teaching computer programming as Pascal. It does have the advantage compared to Pascal that, because it is a medium level language, the programmer is much less remote from what the computer is actually doing and it is also much more flexible in what can be done.

The standard introductory example to C, found in almost all books on C, is the program called "Hello world".

 

#include <stdio.h>

void main()

{

printf("Hello, World\n");

}

 

It gives

Hello, World

on the terminal or computer screen and the cursor moves to the start of the next line.

Note the

#include <stdio.h>

line. This is an instruction to add to the program an additional piece called a header which is contained in a standard disk file. It allows access to standard input and output functions and will almost invariably be required for any program. Note also that the main part of the program is preceded by

void main()

and that the subsequent part of the program is enclosed by curly brackets (braces). This is an essential feature of C, which uses functions as the basic building blocks of the program and curly brackets to mark its blocks. This is discussed later in some detail. (Turbo C accepts void main(), but the C standard specifies that main returns an int, so int main(void) is the portable form.)

The original program, written in C or any other high or medium level language is called the source code and usually has a form which is relatively easily understood by the programmer but is quite unlike what can be understood by the computer as a set of instructions. This is prepared for running on the computer in two stages. Firstly, the compiler converts it into an object file (usually given the suffix obj, e.g. hello.obj). The next stage is to pass it through the linker, a program which connects it to any other part which it might require and which is taken, perhaps, from a library of functions supplied with the compiler or which has previously been written by the programmer. This results in the executable file. This contains the program which can be loaded and run on the computer via the computer's operating system.

Diagrammatically, with Borland Turbo C, the process is:

source code -> compiler -> object file -> linker (with library functions) -> executable file

 

Let us now consider some of the fundamental elements and terminology of C. Some of it is common with other languages and some is peculiar to C.

 

 

VARIABLES **

These are quantities with values which are stored at specific locations in the computer's memory, called addresses, and whose value can be used or changed by the program. Usually their values are numerical but it is also possible in C to have variables whose values represent characters (letters, numbers etc) or logical variables (with values true or false). There are also variables which are arrays of numbers or strings of characters. In C a variable name can be any sequence of letters, digits and underscores, although the name must begin with a letter or an underscore. Upper and lower case letters are considered to be different. By convention, lower case is usually used for variables and upper case for constants (described immediately after this section), although there can be exceptions if the variable name is more obvious with the use of upper case letters.

a fred _D01

are valid names but

32z

is not. Also

Fred, FRED and fred

would all be considered to be different variables. The use of an underscore as the initial character in the name is not recommended since it can cause confusion with standard functions to which the linker has access. Also, there are certain keywords used in the language, such as float, int, double etc., which are not allowed as variable names. This is because they have special meanings for the compiler. Fortunately, C has fewer keywords than most other computer languages.

 

 

CONSTANTS **

These are quantities whose values are fixed and are not allowed to change during the execution of the program. The value might be say 32 or 3.14159. There will be a description of how to define constants later in the section on the preprocessor.

 

 

NUMBER SYSTEMS IN C **

As well as the use of decimal numbers (numbers to the base 10), other forms can be used in C.

 

Binary numbers **

Numbers are stored in memory in binary form but binary numbers are usually too cumbersome for common use. However, they are important when dealing with the contents of individual parts of the memory and when dealing with interface chips that link the computer to its peripherals and the outside world. The link between binary numbers and other number systems is described in Appendix A.

 

Octal numbers

These are numbers to the base 8 (2^3). In C they are denoted by a leading zero, e.g. 06745. They are not of much immediate use here and will not be described further.

 

 

Hexadecimal numbers **

These are numbers to the base 16 (2^4). In C they are denoted by a leading 0x or 0X. The first character is a zero. For example,

0xaf34 or 0X45DE

are representations of hexadecimal numbers in C. Hexadecimal numbers often give the most useful way of accessing and manipulating the computer's memory and are therefore very important for any type of medium level programming.

They are also described in Appendix A.

 

 

VARIABLE TYPES AND DECLARATIONS **

All variables have to be declared before they are used. That is, their names have to be given and the type of variable has to be specified. This tells the compiler the significance of the sequence of letters which represents the variable in the program so that it can allocate an appropriate amount of memory, can understand each occurrence of the variable and interpret the manner in which it is stored in the memory. Declaration is not required in some languages (such as BASIC or Fortran) and might seem to be an unnecessary chore but it does help to minimise programming errors such as the mis-typing of a variable name.

 

Fundamental Variable Types **

The fundamental types are

char character variables with values

'a' to 'z',

'A' to 'Z',

'0' to '9'

plus others such as punctuation marks, parentheses etc. The single quotation marks indicate that it is a character variable. It can also be considered as containing an integer numerical value. This relates to the ASCII code for characters and will be described later.

int integer variables whose values are integer numbers such as 12, -27, 1358, etc.

float floating point variables. These correspond to non-integer numbers, such as 10.3 and 123.76, with a decimal fraction.

The types float and int are arithmetical variable types. Their range of values is restricted by the amount of memory allocated to them and will be discussed below. The type char can also be thought of as having a numerical value so that it can also be used in arithmetic.

Note that a variable of type int with value 2 has the same effective numerical value as a variable of type float with value 2.0 but the numbers are stored differently in memory and their arithmetic is handled differently by the processor. There is no need to discuss this in detail at present but it is essential to remember that integer and floating point numbers are different. It is also worth remembering that, on many microcomputers, arithmetic involving integers is much faster than for floating point numbers so that integers should be used whenever possible.

The variable type can be qualified by terms which define, within certain limits, the amount of memory that is allocated for the storage of the variable's value. These apply to arithmetical variables. They are

short

long

so that we can have

short int

long int

long float

The amount of memory assigned can be implementation dependent (i.e. it depends on the individual compiler being used). The ranges for Turbo C will be discussed below. It is this that determines the range of values that can be given to these variables. The specification long float dates from before ANSI C and is not part of the standard; the standard name for the same type is

double

and this is the form that should be used.

The type int and char can contain integers which can be positive or negative but they can be qualified by

unsigned

to give

unsigned char

unsigned int

so that the allowed values are restricted to zero and positive integers. It is also possible to qualify integer and character variables with

signed

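but since this is the default for int, and in Turbo C also for char, it is rarely used. (The C standard leaves it to the implementation whether plain char is signed or unsigned.)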

 

Finally it is possible to have

unsigned short int

unsigned long int

long double (In Turbo C++)

The declaration of any variable must be made before it is used. Typical forms are

int a, b, c;

float x, y, z;

char l, m, n;

double p, q;

unsigned long int u, v;

Note the use of the semi colon to end each declaration. This is an example of a statement.

A statement is anything that ends with a semi colon.

Variables can be assigned initial values when declared as shown by the following.

int i=7, j=-3;

unsigned char p = 'A', q = 'A', r = 1, s = 2;

float x = 12.25, y = -18.33;

Note the use of both numerical and character values when giving the character variables initial values. Each variable needs its own initialiser: a chained form such as p = q = 'A' is not valid inside a declaration because q has not yet been declared at that point.

 

MEMORY ORGANISATION AND BINARY NUMBERS**

It is now appropriate to digress into a short description of how numbers are stored in the computer's memory. The memory consists of bits which can either be clear (contain zero) or be set (contain one). Therefore, the natural number system for computers, at the lowest level, is binary, in which the possible digits are zero (clear) or one (set).

The basic unit of memory is the byte. This can contain an eight digit binary number. If it is treated as an unsigned integer, it can contain values from

0, given by 00000000 (bits 7 to 0 all clear)

to

2^8 - 1 = 255, given by 11111111 (bits 7 to 0 all set)

In general a binary number with n digits can contain values from zero to 2^n - 1. If negative values are required, the form usually employed to represent them, two's complement, allows a range of values from -2^(n-1) to +2^(n-1) - 1. For an eight bit number, the range is -128 to +127.

A description of conversion between binary and decimal numbers and of the two's complement representation of negative numbers is given in Appendix A.

Having only byte length numbers is clearly too restrictive and often two or more bytes are used. In Turbo C++, the lengths of the variables which can be used to store integers are

char - one byte - signed have values -128 to +127

unsigned have values 0 to 255

short int - two bytes - same as int described below.

int - two bytes - signed have values -32768 to +32767 (-2^15 to +2^15 - 1)

unsigned have values 0 to 65535 (2^16 - 1)

long int - four bytes - signed have values -2147483648 to +2147483647 (-2^31 to +2^31 - 1)

unsigned have values 0 to 4294967295 (2^32 - 1)

The storage of floating point numbers is quite complicated and will not be covered here since it is usually not necessary to know about it. Only the important properties for the C programmer are quoted.

float uses four bytes and can take values which range approximately from 3.4*10^-38 to 3.4*10^+38 with a corresponding range of negative numbers and with accuracy to approximately seven significant decimal digits.

double uses eight bytes and takes values which range approximately from 1.7*10^-308 to 1.7*10^+308 with a corresponding range of negative numbers and with accuracy to approximately fifteen significant digits.

Turbo C also has the type long double which uses 10 bytes and has a range of values from approximately 3.4*10^-4932 to 1.1*10^+4932 with a corresponding range of negative numbers and with approximately nineteen significant digits.

The numbers are only given to a limited number of significant figures (about seven for float and fifteen for double) so that floating point arithmetic is not quite exact and there is the possibility of errors creeping in, particularly in very long arithmetical calculations. These are called rounding errors and begin to be significant if there is a long sequence of floating point arithmetic or if the difference is taken between two floating point numbers which have almost the same size.

 

Data type char *

This is a byte length number which can have a numerical value or can be associated with the set of characters given by the alphabetical letters and the numbers (alphanumerical characters), the other characters, such as parentheses, punctuation marks etc., which appear on the computer keyboard and also some control characters which can be used to operate say a teletype. Each of these has a number associated with it. This is known as the ASCII code (American Standard Code for Information Interchange).

The range of values for the ASCII code is 0 to 127 - i.e. it is a seven bit number. The ASCII code is given in Appendix E.

A character is denoted as say 'a', 'Z', '7', '}' etc. inside single quotation marks.

Important values for the ASCII code are

'A' to 'Z' are 65 to 90 (0X41 to 0X5A)

'a' to 'z' are 97 to 122 (0X61 to 0X7A)

'0' to '9' are 48 to 57 (0X30 to 0X39)

The significance of using only a seven bit number is that often numbers are transmitted as byte length (eight bit) numbers and the eighth bit (usually bit seven) can be used as a parity bit. This allows each transmitted number always to have an even number of bits set, even parity, or always an odd number of bits set, odd parity. If a character is transmitted and one of the bits is received wrongly, the received number will have the wrong parity and it will be known that an error has occurred. The use of parity therefore gives a partial check on the accuracy of transmission of characters but it does mean that the parity bit has to be cleared before the appropriate ASCII value is obtained. This can be done in C and will be described later.

The value of a character variable can either be a number in the appropriate numerical range or the character, inside single quotation marks, which has the corresponding ASCII value. For example,

char c = 'a';

and

char c = 97;

both initialise the variable c with the same value.

 

COMMENTS **

It is possible and often useful to be able to add comments to a program in order to make it clearer and to state what the various parts are intended to do. Comments are sandwiched between /* to indicate the start and */ for the end.

/* This is a comment and is ignored by the compiler */

Sometimes it is useful to eliminate part of a program during its development, not by removing it from the source code, but simply by converting it into a comment so that it is ignored by the compiler. Once the programmer decides to include it again in the program, it can be restored by removing the comment symbols. However, care should be taken with this since ANSI C compilers such as Turbo C++ do not allow the nesting of comments. If the part commented out itself contains a comment, the first */ ends the whole comment and the rest of the code is compiled, usually producing errors.

 

 

ASSIGNMENT STATEMENTS AND EXPRESSIONS **

Assignment statements allow variables to be given new values and have the basic form

variable = expression;

Note that they end with a semi colon.

The expression can be any combination of variables, operators and functions (discussed later) which give the expression a value. The term, expression, is used for anything that can be given a value and this value need not be a number. The value of the expression is given to the variable whose name is on the left hand side.

For example the piece of code

int a = 7, b = 3, c;

{

c = a + b;

b = b + c;

}

first gives c the value 10 (= 7 + 3) and then gives b the value 13 (= 3 + 10). Note that the second statement cannot be read as an algebraic equation, which would imply c = 0. An assignment statement means that the right hand side is evaluated first, using the current values of the variables, and the result is then stored in the variable on the left hand side, replacing its old value.

Arithmetical operators will be discussed in more detail later but the most commonly used ones are

+ add

- subtract

* multiply

/ divide

These are referred to as binary operators; not because computers use binary numbers but because they have two operands.

e.g. a + b

Also used is the unary minus sign. It is called unary because it only has one operand and so acts on one variable. In this case, it is the variable to its right and the unary minus sign changes its sign. For example,

a = -a;

changes the sign of the variable a.

 

POINTERS **

Although the C compiler automatically assigns memory for the declared variables, it is necessary at times to be able to find out the addresses of certain variables and to be able to access the contents of specific parts of the computer's memory. In BASIC, for example, this can be done by using PEEK and POKE, but in C pointers are used. As implemented, these are much more powerful and useful than having functions which correspond to PEEK and POKE in BASIC.

A pointer is a variable whose value is a memory address. Also, each pointer is linked to a specific type of variable. As in the case of the other variables, the same rules for their names apply and they have to be declared. For example, the declarations

int *p, q;

float *x, y;

declare q as an integer variable as before but the asterisk before p declares it as a pointer to an integer. That is, it can contain the address in memory where an integer variable is stored. Similarly, x is a pointer to a floating point variable. The actual value of x will be an integer since it is an address. The compiler will allocate memory for the integer q and for the floating point number y and for two addresses, one which will be the address of an integer pointed to by p and one of the floating point number pointed to by x. The compiler will also know the type of variable associated with each pointer, but it does not assign any memory for the integer and the floating point number that the pointers are to point to, and the pointers do not yet hold valid addresses. This has to be arranged by the programmer, either by giving the pointer an initial value or during the running of the program.

In the subsequent program, the values stored at p and x are given by the use of the unary operator *.

*p and *x

e.g. we might have in later parts of the program, after x has been given a value (an address) and a floating point number has been stored there

y = *x;

y is given the value of the floating point number stored at the address contained by x, or for the integers,

*p = q;

The integer stored at p is given the value of the variable q, again after p has been given a value.

Pointers might be thought to be an unnecessary complication but their use can be very powerful and they can be crucial if the values of a large number of variables have to be communicated between functions or if functions are required to alter the values of parameters and pass back the new values. They are also closely related to the use of arrays of variables. This will be clear later.

The use of the same symbol, *, as for multiplication might be thought of as a possible cause of confusion but this is not the case in practice.

The need to define the type of variable when declaring the pointer should now be obvious. If it was not done, the computer would not know how much memory to look at and how to interpret its contents when using say, *p, to give access to the numerical value pointed to by p.

The address of a variable is given by the unary operator & (ampersand). e.g.

p = &q; /* p points to the variable q,

it has the value of the address of q */

The use of & and * is illustrated by the following piece of program.

int i = 7, *p, j, *q; /* p and q are pointers to integers */

{

p = &i; /* p has the address of i */

q = &j; /* q has the address of j */

*q = *p; /* the integer stored at the address held in q is given the value of

the integer stored at the address held in p, i.e. 7 */

}

This is just a roundabout (and silly) way of having

j = i;

but it illustrates the use of pointers and of the unary & and * operators.

 

FUNCTIONS - A FIRST LOOK **

A function is a part of the program which can be called from any other part of a C program, including from within itself. Usually, but not always, the function is passed the values of certain variables, the parameters or arguments of the function. It carries out the instructions contained in its code and, again not always, it returns a value which can be assigned to a variable. The type of variable returned by a function has to be well defined.

For example, consider

#include <math.h>

float x = 0.5, y;

{

y = sin(x);

}

Here y is given the floating point value sin x (= sin(0.5)) by the function sin which is supplied in a library of functions included with the compiler and linker.

There are many such functions available with Turbo C and other implementations. Many of these are standard to ANSI C but others are not, particularly the graphics functions which can vary from compiler to compiler.

It should be remembered that it is usually necessary to use the #include statement to incorporate the appropriate header in the program. In the example above, the header required for the sine function is in the file "math.h". This will be discussed in more detail later. The properties, requirements and limitations of the supplied functions can be obtained from the manual for the C compiler being used.

It is also possible to write tailor made functions in the process of writing a program or to construct one's own library of functions. The description of the use and writing of functions is given more fully later in these notes. The significance of having the headers which declare the functions will then also be clear.

INPUT AND OUTPUT (KEYBOARD AND SCREEN) **

There are functions that allow output to the computer screen and input from the keyboard. Although the basic concept of a function has still to be described, these functions are simple enough that they can be introduced now.

 

printf **

The function printf is used to output text and data to the screen. It requires the header file called stdio.h which is added to the program by

#include <stdio.h>

at the start of the program. If we have a variable i of type int, its value can be shown on the screen with a suitable message. For example,

int i = 7;

printf("The variable has value %d\n",i);

will give the message

The variable has value 7

on the screen and the cursor will move down to the next line.

This function returns, as an int, the number of characters printed on the screen but it is not usually written in the form of an assignment statement but simply as in the example above.

The part in double quotation marks is called a string. It is just a sequence of characters. Strings will also be described more fully later.

The argument of the printf function, the part inside the brackets, has the form

control string and (optionally), other arguments

In the example above the control string is "The variable has value %d\n". The contents of the control string are printed, except for parts beginning with % or \, and the values of the variables listed after the string are inserted at the parts beginning with %. These define the format to be used for the output of the variables in the other arguments part. The parts after the backslash, \, give control instructions to the output system. The part \n is an instruction to start a new line.

The variables to be printed are included after the end of the control string and are separated from it and from one another by commas. There need not be any variables, e.g.

printf("This just prints a message");

or the control string can consist only of format and control characters, e.g.

printf("\n%d %d \n%f", i, j, x);

will print only the values of the variables i, j and x without any other strings of characters.

 

 

The most commonly used forms of the format instructions are

%d output a signed decimal integer

%u output an unsigned decimal integer

%x,%X output a hexadecimal number without the leading 0x or 0X, using a,b,c,d,e,f (with %x) or A,B,C,D,E,F (with %X) for decimal 10, 11, 12, 13, 14, 15.

%c output a single character

%s output a string of characters until a '\0' is encountered. (This will be clear later.)

%f output a floating point number - typical form is -45.128364

%e, %E output a floating point number in exponent form, say 5.463730e-08 (this means 5.463730 x 10^-8) or 8.452560E+11 (8.452560 x 10^11)

%ld output a long integer in decimal form

These format instructions can be further modified to specify the precision and format of the output. For example

%6d will print as a decimal integer, at least 6 characters wide.

%6f will print as a decimal floating point number, at least 6 characters wide.

%.2f will print as a decimal floating point number, with two digits after the decimal point.

%8.2f will print as a decimal floating point number, at least eight characters wide and with two characters after the decimal point.

%10s will print a string and will use at least 10 characters.

%.10s will print the first 10 characters of a string.

There are other format instructions which can be used but the information already given is probably sufficient for most forms of output. A fuller description is given in Appendix C.

 

scanf **

This is used to input the values of variables from the keyboard and requires the header stdio.h. It is used with a form similar to that for printf.

scanf(control string, argument list);

with for example,

#include <stdio.h>

int a, b;

long int c;

float x;

 

the statement later in the program,

scanf("%d%d%ld%f", &a, &b, &c, &x);

will read integer values for a and b, a long integer value for c and a floating point value for x which are entered from the keyboard.

Note that the argument list contains the addresses of the variables to be read using the & operator, and not the variable names themselves. The scanf function needs to know the addresses in order to know where to put the values into memory.

The conversion of the input when read is given by the characters in the control string. The number of these is large and usually it is necessary to know only the effect of %c, %d, %ld, %u, %f, %e, %E and %s. The remainder are rarely used in practice.

%d - decimal integer

%u - unsigned decimal integer

%x - hexadecimal integer (without leading 0x or 0X)

%f - floating point number

%c - single character

%s - string of non-space characters (See the note below)

%nc - n characters

%ld - long decimal integer

%lf - double (double precision floating point number)

Because scanf only has access to the address of the variable, there is a possible trap here. The format given by the control string must correspond to the declared variable type. For example, if %d is used to read in a long integer, the result can be spectacular but not what is desired. With the two byte int and four byte long assumed in these notes, scanf will place an integer in the two bytes starting at that address instead of the long integer which occupies the four bytes starting at the same address.

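The characters in the control string should usually be contiguous (i.e. with no spaces). A space in the control string matches any amount of white space in the input, including none. In front of a numerical format or %s this makes no difference, since leading white space is skipped in any case, but a space or \n at the end of the control string makes scanf continue reading until a non-blank character is entered. This can cause mysterious problems if spaces are included unintentionally.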

The %s instruction is different from the %s in printf. In scanf, it causes all leading blanks in the input to be ignored and the following string of non-blank characters to be read until a blank (or other white space, such as the end of the line) occurs. The end of string marker, '\0', is then added to the stored string. For example, with the input string "   ABCDEFGHI JKLMN OP" which has three leading blanks, %s will read in "ABCDEFGHI" and %5s will read in "ABCDE". This can cause some unexpected behaviour for the unwary.

If blanks are to be included, the form %nc, where n is an integer, should be used. With the string above, %6c will read in the first six characters including blanks, i.e. "   ABC".

 

Return value from scanf

The return value from scanf is the number of successful conversions, which may be zero. If there is an input failure before any conversion has been made, it returns end of file (EOF), value usually -1. If there is a matching failure, for example, input of a floating point number instead of an expected integer, scanf returns the number of successful conversions and leaves the remainder of the input in the input stream. Unless the return value is checked, this usually causes the program to fail.

putchar *

This function places characters, one at a time on the screen. It has the form

putchar(c);

where the argument, in this case, the variable c, is of type char. It requires the header stdio.h.

 

getchar *

This function gets a character from the input stream, usually the keyboard, and is used in the form

c = getchar();

For example, the program fragment

char ch;

ch = getchar();

putchar(ch);

will read a character from the input from the keyboard once RETURN has been pressed and place it on the screen. It requires the header stdio.h.

 

getch *

This function checks the keyboard and returns the ASCII value of a key when it has been pressed. There is no need to enter the character into the input stream by pressing RETURN.

char c;

c = getch();

will cause the computer to wait until a key is pressed and will give the character variable the value corresponding to that key. This function is not standard ANSI C but is found in Turbo C and Turbo C++. It requires the header conio.h.

It can be useful to insert

getch();

into a program to halt its execution temporarily. Pressing any key will allow it to restart. In this case, where the return value is not assigned to a variable, it is just discarded.

 

 

 

COMPOUND STATEMENTS AND SCOPE **

The use of curly brackets has been met in the very simple programs already given. They have been used to enclose the main part of the program after main(). However, their use is much wider and introduces the concept of a compound statement. This is a block of statements which can even contain further, nested, compound statements. The start and end are marked by a pair of curly brackets. Note that variables can be declared inside a compound statement. For example, consider the following short program.

#include <stdio.h>

void main(){
    int a, b, c=8;
    {
        scanf("%d",&a); /* Type in 3 for a */
        scanf("%d",&b); /* Type in 2 for b */
        {
            int c=4;
            c = a*b-c; /* This gives c = 3*2-4 = 2 */
            b = b*c; /* This gives b = 2*2 = 4 */
            printf("\nFrom the inner statement a =%2d, b =%2d, c =%2d",a,b,c);
        }
        printf("\nFrom the outer statement a =%2d, b =%2d, c =%2d",a,b,c);
    }
}

The output on the screen from this is

From the inner statement a = 3, b = 4, c = 2

From the outer statement a = 3, b = 4, c = 8

Here there are two nested compound statements or blocks whose start and end are marked by the curly brackets. The outer one starts below int a, b, c=8; and ends just before the end of the program. The inner one is from int c=4; to the first printf. The variables a and b are the same in both compound statements. However, c has the value 8 in the outer statement from being initialised in the declaration, while in the inner statement a new c is declared, initialised and recalculated. The second printf statement gives 8 for the value of c because it is in the outer statement. The fact that there seem to be two variables with the identifier c is explained by the concept of the scope of a variable and is described later in this section.

The curly brackets are therefore used to mark the start and end of compound statements and these can be nested within outer compound statements. Note the use of indentation to mark the level of the compound statements inside the program. It is essential that this be used in programming in C since otherwise it is very difficult to keep track of the level.

Incidentally, note also the use of printf, of scanf and of the formatting of the input and output.

The scope of a variable is an important concept in C. It is the part of the program in which the variable is considered to be accessible. This is within the compound statement in which it has been declared unless it is re-declared in an inner block.

If the same identifier is used again in a further declaration statement within an inner, nested block, inside the first compound statement, the identifier refers to a new variable which is totally independent of the previous one and which can have a different value without altering in any way the value of the variable which was defined in the outer block statement. This was illustrated in the program above but, to demonstrate it further, consider the following program fragment.

{

int a = 3, b = 4;

/* Here a has value 3, b has value 4 */

a = 1;

/* Now a has value 1 */

{

int a, c;

a = 0; /* This is a different a. */

c = 7; /* c appears for the first time */

/* Here a has value 0, b has value 4 from the outer block and c has value 7 */

}

b = 8;

/* Now a has value 1 because we are back in the outer block, b has value 8 and c is undeclared as a variable*/

}

Here the variable name a actually refers to two different variables, one outside the inner block and one within it, and c is only a valid name within the inner block. The variable b is not re-declared inside the inner block and is common to both outer and inner. Having two variables with the same name might be thought to be confusing. It can be! It is good programming practice, unless there is a particular reason for it, to avoid having two variables with the same name within the same section of the program.

 

RELATIONAL AND EQUALITY OPERATORS **

It is now possible to write simple programs which take input, carry out arithmetical operations on it and print the results on the screen. However, this can be done only once during each run of the program, and the way in which the program treats the information that it is given cannot be varied. Most programs need branches, which allow the program to vary the way in which it operates, or loops, which allow the program to carry out the same operation many times without being restarted. To do this, it is necessary to carry out tests on the variables. This involves the relational and equality operators, which allow the comparison of the values of two variables or of a variable and a constant.

The relational operators are binary.

< less than

<= less than or equal to

> greater than

>= greater than or equal to

The equality operators are also binary

== equal to

!= not equal to

They appear between two variables, expressions or a constant and a variable with forms such as

a >= b a is greater than or equal to b

c == 5 c equals 5

d != 2*(a + b) d is not equal to 2*(a + b)

If the expression involving these operators is true, it is given the value one (true) or, if it is not true, the value zero (false). Actually, any non-zero value is treated as true when it is tested.

Note that the "equal to" equality operator is two equals signs and not one. Forgetting this is a common mistake and often the compiler will not complain, because a single equals sign gives a valid assignment.

 

 

 

LOGICAL OPERATORS **

Often it is necessary to combine the results of more than one test in order to decide which part of the program is to be executed next. For example, two conditions might have to be true. The logical operators combine the results from the relational and equality operators into a single result that is true or false. They operate on operands with the values true (non-zero) and false (zero).

They are

&& - and

|| - or

both of which are binary, and

! - not

which is unary

&& returns the value one (true) if both operands are true (non-zero) and zero (false) otherwise.

|| returns the value one (true) if one or both of the operands are true (non-zero) and zero (false) if both are false.

! acts on the operand to its right and gives zero (false) if it is true (non-zero) or gives one (true) if it is false (zero).

The first two can be used to combine statements involving relational or equality operators. They give results as in the following table.

exp1             exp2             exp1 && exp2   exp1 || exp2
                                  (and)          (or)

0 (false)        0 (false)        0 (false)      0 (false)
non-zero (true)  0 (false)        0 (false)      1 (true)
0 (false)        non-zero (true)  0 (false)      1 (true)
non-zero (true)  non-zero (true)  1 (true)       1 (true)

Note that the and operator is two ampersands without a space between them. The single ampersand, &, is also a valid operator, as is the single vertical bar, |. Confusion between these can cause a program to compile but not run properly.

 

ARITHMETICAL OPERATORS **

These have already been met informally but it is now convenient to give a complete description of them. There are five which are binary.

+ Add the expressions on the left and right

- Subtract the expression on the right from the one on the left

* Multiply the expressions on the left and on the right

/ Divide the expression on the left by the one on the right. It should be remembered that floating point and integer division behave differently.

% Get the modulus of the integer expression on the left with respect to the one on the right. The modulus is the remainder after integer division of the expression on the left by the one on the right.

There is one unary operator

- change sign of the expression on the right

(A "-" without an operand to its left is treated as unary in this way just as in arithmetic.)

 

Increment and decrement operators **

These are very useful operators not generally found in other languages. They are

++ and --

They respectively increase or decrease the operand by one. They can be used either before or after the operand on which they act. For example, on their own,

n++; or ++n;

are both equivalent to

n = n + 1;

However, if one of these operators appears in an expression, its position relative to the variable on which it acts is important.

If it is to the left of the variable (e.g. ++n), the change to the variable occurs before it is used in the expression.

If it is to the right of the variable (e.g. n++), the change is made after the variable is used in the expression.

For example, in

int a = 1, b = 2, c = 3, d = 4, x, y;

x = a++ + --b;

y = ++c - d--;

b and c are changed before being used to give x or y but a and d are changed after being used to calculate x or y.

a is given value 2 after being used to get x

b is given value 1 before being used to get x

so that x is given value (1 + 1) = 2.

c is given value 4 before being used to get y

d is given value 3 after being used to get y

so that y is given value (4 - 4) = 0.

 

 

FLOW OF CONTROL **

As already described, it is often necessary to repeat many times what is essentially the same calculation during the running of a program, usually with changed values for some of the variables. This might be for a specified number of times or until some condition is eventually satisfied, sometimes without knowing the actual number of cycles required when starting. This is known as a loop. In other cases, it may be required to carry out only one of several possible parts of the program as the next stage of the calculation. This is called branching. In both cases it is necessary to be able to control the way in which different parts of the program are executed.

There are several ways of doing this in C.

 

The while statement **

This is used for loops and has the form

while (expression)

statement1

statement2

where statement1 can be a compound statement.

The execution of statement1 is carried out if expression is true (non-zero) and is repeated while it continues to be true. When expression is found to be false (zero), statement2 is executed followed by the rest of the program. If expression is false when the loop is first reached, statement1 is not executed at all and the program proceeds to statement2.

The flow of control is shown in the diagram below.

As an example, consider a program which reads characters, one at a time from the keyboard and outputs them on to the screen. It terminates when an exclamation mark is input and the number of characters is also printed.

/* This program copies an input string of characters to the screen and also prints the
   number of characters. The string is terminated by an exclamation mark */

#include <stdio.h>

void main()
{
    char c = 'a', cx;
    int n = 0;

    printf("\nInput 'y' to input a string of characters");
    printf("\n or any other character to stop");
    cx = getchar();
    while ((c != '!') && (cx == 'y')){
        /* while c is not equal to '!' and cx equals 'y' */
        c = getchar();
        putchar(c);
        n++;
    }
    printf("\n\nNumber of characters %d\n",n);
}

Note that if cx has the value 'y' the loop is executed at least once and that the character '!' is output at the end. However, because the test is at the start, the loop can be omitted completely. This is the key feature of the while statement. Note also the use of the && operator to combine the two equality statements and the initialising of c to a character value that allows the loop to execute at least once if required.

Note again the use of indentation to help identify the limits of the loop.

As an exercise, write a version of the program which asks for the number of characters to be input before starting the loop and does not execute at all if the number is zero or less.

A loop can be caused to loop indefinitely by having

while(1){

.............

}

because the expression inside the while is always true. (It is non-zero.) It can, however, be terminated at any point within the loop by using the break instruction described later.

The for statement **

The while statement gives a loop in which the test for executing the loop is at the start so that it is possible that the loop might not even be executed once. In the for statement, the test is also at the start but the form of the instruction is different.

The construction of the for loop is

for(initialisation; condition; changes)

statement1

statement2

where statement1 can be a compound statement and statement2 is the next statement after the loop.

Inside the brackets following the for, initialisation gives the starting condition(s) for the loop, condition gives the condition for the loop to be executed and changes gives the change(s) for the next time round the loop.

It is equivalent to the following while structure

initialisation;

while(condition){

statement1

changes;

}

statement2

As an example, consider a small program to print the squares and cubes of the integers from 1 to 25

 

#include <stdio.h>

void main()
{
    int a, b, c;

    printf("\nNumber square cube ");
    for (a = 1; a <= 25; a++){
        b = a*a;
        c = b*a;
        printf("\n%5d%8d%8d", a, b, c);
    }
}

The equivalent program using the while structure would be

#include <stdio.h>

void main()
{
    int a, b, c;

    a = 1; /* Initialisation */
    printf("\nNumber square cube ");
    while (a <= 25){ /* Condition */
        b = a*a;
        c = b*a;
        printf("\n%5d%8d%8d", a, b, c);
        a++; /* Changes */
    }
}

The comma operator **

The comma operator can be used in for loops to give initial values to more than one variable and to change more than one variable before the start of the next loop. For example we might have

for(i = 1, j = 3; (i < 12) && (j < 24); i++, j = j + 2)

Here i starts with the value 1 and j with the value 3. After the completion of each loop, i is increased by one and j by two.

In fact, it is possible to have the whole loop in a single line if it is small enough. Suppose we want to calculate the factorial of an integer variable n. The result is accumulated in the integer variable fac, using the integer variable i as the loop index.

for(i = 1, fac = 1; i <= n; fac = fac*i, i++);

Note the semi-colon at the end. It is an empty statement which forms the whole body of the loop and indicates that the following statement is not part of the for loop.

The do statement **

In many ways the while and for statements are equivalent since they give a loop structure in which the test for the execution of the loop is at the start. If the test fails for the first time that the loop is attempted, it will not be executed at all. Anything that can be done by one form can be done exactly equivalently by the other. The do structure is different in that the test is at the end of the loop and the loop is always executed at least once. It has the form

do{

statement1

} while (expression);

statement2

where statement1 can be compound. On reaching the loop, statement1 is executed and is repeated while expression is true. Note that the test is carried out after the execution of statement1 so that statement1 is always executed at least once. The flow diagram is given below.

The program to print the squares and cubes of the numbers from 1 to 25 using this form for the loop might be

#include <stdio.h>

void main()

{

int a,b,c;

printf("\nNumber square cube ");

a = 1;

do {

b = a*a;

c = b*a;

printf("\n%5d%8d%8d",a,b,c);

a++;

} while(a <=25);

}

The version of the program to input a string of characters using this structure is

 

/* This program copies an input string of characters to the screen and also prints the number of characters. The string is terminated by an exclamation mark */

#include <stdio.h>

void main()

{

char c;

int n = 0;

printf("\n");

do {

c = getchar();

putchar(c);

n++;

}while (c != '!');

printf("\n\nNumber of characters %d\n",n);

}

It will always ask for at least one character and prints the final '!'.

 

 

The if and if - else statements **

These allow branching in the program.

The if construction allows the execution of a program block only if certain conditions are met. Its form is

if(expression1)

statement1

statement2

where statement1 and statement2 can be compound statements. If expression1 is true (non-zero) then statement1 is executed followed by statement2, otherwise only statement2 is executed.

As an example, consider the following fragment from a program to process class examination marks.

printf("\nTotal marks %d\n",total);

if (total >= 130){

printf(" Exemption \n");

numexempt++;

}

It prints the message "Exemption" and increases the variable numexempt by one if total is greater than or equal to 130.

 

The if - else statement has the form

if (expression1)

statement1

else

statement2

statement3

where statement1, statement2 and statement3 can be compound statements.

If expression1 is true (non-zero) then statement1 is executed followed by statement3. Otherwise, if expression1 is false, statement2 is executed followed by statement3. This allows the program to follow one of two branches depending on a tested condition and then to continue with the rest of the program after completion of the appropriate branch.

Consider the example.

printf("\n Total marks %d",total);

if (total > 140){

printf(" exemption\n");

numexempt++;

}

else

others++;

It is possible to have combinations of if-else statements of the form

if (expression1)

statement1

else if (expression2)

statement2

else if (expression3)

statement3

. . . . . . .

else if(expressionN)

statementN

else

default statement

In this case, a statement is executed only if its corresponding expression is true and all the previous expressions are false, so that at most one of the statements is executed. The default statement is only executed if all the expressions are false. There need not be a default statement and, in this case, the default is to do nothing and continue with the rest of the program.

Consider the example given above, expanded to allow for various outcomes which depend on the mark obtained.

printf("\n Total marks %d",total);

if (total >= 160){

printf(" first class ticket\n");

n1++;

numexempt++;

}

else if (total >= 130){

printf(" second class ticket\n");

n2++;

numexempt++;

}

else if (total >= 70){

printf(" class ticket\n");

nc++;

}

else{

printf(" no class ticket\n");

nt++;

}

It might seem that there is an ambiguity if there is the following structure (for integer variables a, b and c, previously declared and given values)

if(a < 3)

if(b >= 4)

c = 5;

else

c = 0;

To what if does the else refer? The rule for this is that an else is associated with the if without an else that most immediately precedes it. In the above case it is

if(b >= 4)

This becomes clearer when the appropriate indentation is used. The program fragment above should be indented as below.

if(a < 3)
    if(b >= 4)
        c = 5;
    else
        c = 0;

Again, the importance of proper indentation is clearly demonstrated.

 

The break instruction *

Often it is necessary to end a loop part of the way through a cycle, or to end a loop which is otherwise set up to continue indefinitely. The statement

break;

does this. If this is encountered inside a loop, the loop is immediately terminated and control resumes at the next statement following the loop. For example

#include <stdio.h>

void main()

{

int t;

for (t = 0; t < 100; t++){

printf("\n%d",t);

if (t >= 10) break;

}

printf("\nEnded");

}

will print the numbers from zero to ten followed by the message

Ended

on the screen.

This is a rather artificial example but the program to read and count a string of characters can now be modified to eliminate the inclusion of the '!' used to terminate the string.

/* This program copies an input string of characters to the screen and also prints the number of characters. The string is terminated by an exclamation mark which is not printed or counted */

#include <stdio.h>

void main()

{

char c;

int n = 0;

printf("\n");

while (1) {

c = getchar();

if (c == '!')

break;

putchar(c);

n++;

}

printf("\n\nNumber of characters %d\n",n);

}

This statement is usually used when a special condition requires immediate termination of a loop before reaching the test at the end or the start of the loop. Note that, in the example above, the expression in the while statement is always true so that the loop would be infinite without the break instruction. The break instruction is also used in the switch statement described below.

The continue instruction

This is used like the break instruction inside a loop but its effect is not to terminate the loop at that point but to omit the rest of the current cycle and then continue with the execution of the next cycle round the loop. The same effect can be achieved by the use of an if structure which includes or omits the rest of the loop as required.

The switch statement *

This allows multiple branching which depends on the value of a single variable in a way which is less cumbersome than a sequence of if - else statements.

A typical program fragment using it is

char c;

int a=0, b=0, x=0;

.....

.....

c = getch();

switch(c){

case 'a':

case 'A':

printf("\nThe upper or lower case A key has been pressed");

a++;

break;

case 'b':

printf("\nThe lower case b key has been pressed");

b++;

break;

case 'B':

printf("\nThe upper case B key has been pressed");

b++;

break;

default:

printf("\nSome other key has been pressed");

x++;

}

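Depending on the value of the character c, control jumps to the appropriate case label and the subsequent parts of the program are executed until a break; is encountered, after which control goes to the statement immediately after the switch structure. The labels case 'a': and case 'A': share the same statements because there is no break; between them, so that execution simply runs on from the first label to the statements after the second. The flow of control should be obvious from the messages printed on the screen.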

The function exit *

This is a function which causes immediate termination of the program, closes all open files, flushes the output stream and returns control to the computer's operating system. It requires the inclusion of the header stdlib.h. It is normally used in the statement

exit(0);

with its argument set to zero to signal a normal termination of the program. Non-zero values can be used to signal an abnormal end of a program but the way in which the argument is returned to the operating system is implementation dependent.

If it is omitted from the end of the program, the program will terminate as if the statement

exit(0);

was at the end of the main function.

 

ARRAYS AND STRINGS **

These are two closely related topics since a string is basically a special kind of array, one which contains characters.

 

Arrays **

Suppose we have to write a computer program which has to input and store a large number of measurements of the same quantity. An example might be the readings of a voltage from some experiment, taken at one second intervals over a period of one hour, 3600 readings in all. It would be very cumbersome to treat each reading as a separate variable, say volt1 to volt3600. The program would then need a section for each reading which was taken.

It is much easier in such a case to have a single name and to label the particular variable with an index which can be calculated by the program. This is called an array.

An array has to be declared in the same way as the variables already encountered. The difference is that the number of elements of the array, its size, has to be specified. If the voltage is read as an integer, the appropriate declaration would be

int volt[3600];

This assigns contiguous space in memory for 3600 integer variables. The first value of the index is zero, not one, so that the variables are

volt[0], volt[1], ..., volt[3598], volt[3599].

Therefore using volt[3600] during the program would take it outside the memory assigned to the array volt.

The index is enclosed within square brackets and can itself be a variable or an expression which returns an integer value.

For example,

int i = 50, j, volt[3600];

int x = 234, y = -78;

.

.

j = 25;

volt[i] = x;

volt[i + 3*j] = y;

assigns the value 234 to volt[50] and the value -78 to volt[125].

Arrays can have more than one index and are closely related to pointers. These properties will be discussed later in the course.

An array can be declared and initialised as follows.

int a[6] = {2, 4 , 6, 3, 5, 7};

This is equivalent to

int a[6];

a[0] = 2;

a[1] = 4;

a[2] = 6;

a[3] = 3;

a[4] = 5;

a[5] = 7;

Memory is allocated for six integers and the values given above are placed in these addresses. If the compiler gives 0x3f00 as the address of the origin of the array and each integer occupies two bytes, the contents of the memory will be

Address   Variable   Value

0x3f00    a[0]       2
0x3f02    a[1]       4
0x3f04    a[2]       6
0x3f06    a[3]       3
0x3f08    a[4]       5
0x3f0a    a[5]       7

This should demonstrate the logic behind having the index values start at zero rather than one. The address of each element of the array is the address of the origin of the array plus the index times the size of the variable. This is related closely to pointer arithmetic which is described below.

It is also possible to declare the array without giving its size if it is initialised. The above declaration could have been written as

int a[ ] = { 2, 4, 6, 3, 5, 7};

and the compiler would know to assign enough memory to contain six integer variables. It should be obvious that there is enough information for the compiler to be able to know to do this.

Multiple indices *

It is possible to have more than one index. For example,

float g[4][4];

will assign memory for 16 (4*4) floating point variables, from g[0][0] to g[3][3].

A multiply indexed array can be initialised but it is necessary to declare the size of the array. The array is filled from the given values with the right hand index increasing fastest. For example,

float a[3][3] = {1., 2., 3., 4., 5., 6., 7., 8., 9.};

is equivalent to

float a[3][3];

a[0][0] = 1.0;

a[0][1] = 2.0;

a[0][2] = 3.0;

a[1][0] = 4.0;

a[1][1] = 5.0;

a[1][2] = 6.0;

a[2][0] = 7.0;

a[2][1] = 8.0;

a[2][2] = 9.0;

 

 

Arrays and loops **

It is clearly very useful to use arrays inside loop structures with indices, particularly using the for statement. As an example, suppose data has been received from some external device and placed in an array of floating point variables declared by

float dat[200];

starting at dat[0] and that the number of values entered is given by the variable declared by

int ndat;

The average value of the data can be calculated as follows.

float av;

int i;

for (i = 0, av = 0.0; i < ndat; i++)

av = av + dat [i];

av = av/ndat;

Note how, inside the for loop, the sum of the array elements is accumulated in the variable av which is given the initial value of zero on entering the loop.
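The same calculation can be packaged as a function, using the passing of arrays to functions that is described later in these notes. This is a sketch only: the name average is invented here, and returning 0.0 when there are no data is an assumed convention chosen to avoid dividing by zero.

```c
/* Hypothetical helper: average of dat[0] to dat[ndat - 1] */
float average(const float dat[], int ndat)
{
    float av;
    int i;

    if (ndat <= 0)          /* assumed convention: no data gives 0.0 */
        return 0.0f;
    for (i = 0, av = 0.0f; i < ndat; i++)
        av = av + dat[i];   /* accumulate the sum in av */
    return av / ndat;
}
```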

A more complicated task might be to calculate the numbers of values which fall within certain intervals. This might be done as the first stage of presenting the data as a histogram using the graphics abilities of some implementations of C. Suppose that we know that the array dat above contains values which lie between 0.0 and 10.0 and that we wish to know the number of values which lie between 0.0 and 1.0, between 1.0 and 2.0, and so on. These have to be placed in an integer array declared by

int hist[10];

The piece of program to calculate the values for hist could be as follows.

 

int i, j;

float low_value = 0.0, high_value = 1.0;

/* lower and upper values for interval limits */

for (i = 0; i < 10; i++){

/* loop over the intervals */

hist[i] = 0;

/* The array must be set to zero at start */

for (j = 0; j < ndat; j++){

/* Loop over the data */

if ((dat[j] >= low_value) && (dat[j] < high_value))

hist[i]++;

}

low_value = low_value + 1.0; /* New interval limits */

high_value = high_value + 1.0;

}

(It should be noted that any data value of 10.0 would not be included in the histogram obtained.)
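An alternative, offered here only as a sketch, is to pass over the data once and to use the integer part of each value as the index of the interval. This relies on the intervals having unit width and starting at zero. The function name make_hist is invented for the illustration.

```c
/* Hypothetical helper: counts the values of dat[0] to dat[ndat - 1] that
   fall in the intervals 0.0 to 1.0, 1.0 to 2.0, ... 9.0 to 10.0 */
void make_hist(const float dat[], int ndat, int hist[10])
{
    int i, j;

    for (i = 0; i < 10; i++)
        hist[i] = 0;                /* the array must be set to zero at start */
    for (j = 0; j < ndat; j++) {
        if (dat[j] >= 0.0f && dat[j] < 10.0f) {
            i = (int) dat[j];       /* truncation gives the interval number */
            hist[i]++;
        }
    }
}
```

As with the nested loops, a value of exactly 10.0 is not counted. The range test also ensures that the index used for hist never strays outside 0 to 9.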

 

Important note **

One feature of arrays in C which it is important to remember is that the values of array indices are not usually checked to ensure that they are inside the values used to define the array when it is declared. In particular, when placing values inside an array, it is possible to change the contents at memory locations well outside the part allocated for the array if the values of the indices are allowed to stray outside the range defined. This can sometimes cause very mysterious failure of a program. Care is therefore required! In particular, programmers who are converting to C from other languages often forget that declaring an array, int a[100];, does not allocate memory for the array element a[100].

 

STRINGS *

These are a special case of arrays in which the variables are of type char.

A string is written as, for example, "This is a string". It is enclosed by double quotation marks. Remember that, in comparison, a single character constant is enclosed by single quotation marks.

The end of a string is denoted by an extra character which is not shown. This is the null character, denoted '\0'. Remember that a single character preceded by a backslash is treated as a single character.

 

String constants *

These have already been encountered. "This is a string" is a string constant. It is a string with a well defined value.

A string can be declared and initialised as a character array as, for example,

char message[ ] = "This is a string";

The dimension of the character array is set so that it is just large enough to contain the string, including the '\0' at the end. The declaration above would be equivalent to

char message[ ] = {'T','h','i','s',' ','i','s',' ','a',

' ','s','t','r','i','n','g','\0'};
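The extra character can be seen by comparing the size of the array with the length of the string. The sketch below uses the standard library function strlen, which counts the characters up to but not including the '\0'. The two function names are invented for the illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bytes allocated for the array, including the '\0' */
size_t message_array_size(void)
{
    char message[] = "This is a string";
    return sizeof(message);
}

/* Hypothetical helper: number of characters before the '\0' */
size_t message_length(void)
{
    char message[] = "This is a string";
    return strlen(message);
}
```

The string has 16 visible characters, so the array occupies 17 bytes.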

 

 

BITWISE OPERATORS *

These are used to control the values contained in individual bits of variables of type char, short, int and long. They are

& - bitwise AND

| - bitwise OR

^ - bitwise exclusive OR (XOR)

<< - left shift

>> - right shift

~ - one's complement This operator is unary.

 

The operators &, | and ^ compare the corresponding bits of the variables on either side of them and set the appropriate bit of the result according to the following rules.

The AND operator sets each bit of the result if both of the corresponding bits of the operands are set and clears it otherwise.

The OR operator sets each bit of the result if either or both of the corresponding bits of the operands is set and clears it otherwise, i.e. if both bits of the operands are clear.

The XOR operator sets each bit of the result if the corresponding bits of the operands are different and clears it if they are the same.

The << operator moves the bit pattern of the left hand operand to the left by the number of bits given by the right hand operand.

The >> operator moves the bit pattern of the left hand operand to the right by the number of bits given by the right hand operand.

The ~ operator is unary. (It only has one operand, that being on its right.) It reverses the bit pattern of the operand.

These operators can be used to control individual bits in memory. Each hexadecimal digit is contained in four bits so that two hexadecimal digits are needed to access or alter the contents of a byte. It is useful to be able to relate the hexadecimal digits to the bit pattern in a half byte (a nibble!).

Hexadecimal Binary

0X0 0000

0X1 0001

0X2 0010

0X3 0011

0X4 0100

0X5 0101

0X6 0110

0X7 0111

0X8 1000

0X9 1001

0XA 1010

0XB 1011

0XC 1100

0XD 1101

0XE 1110

0XF 1111

Therefore, in order to set bits 2, 5 and 7 in a byte length number, the operator | should be used with the binary number which has these bits set.

76543210 <- The individual bits are numbered from zero starting from the right

10100100

i.e. with 0XA4. If a points to the byte (to type char)

*a = *a | 0XA4;

will set bits 2, 5 and 7.

If it is required to clear bits 2, 5 and 7, then the operator & can be used with the binary number which has these bits cleared.

76543210

01011011

i.e. with 0X5B. If a points to the byte,

*a = *a & 0X5B;

will clear bits 2, 5 and 7.

It is now possible to remove the parity bit from a character received from a peripheral device and to get the corresponding ASCII value. This was discussed in the section on the ASCII code. If c is the value of the variable received from a peripheral device, the parity bit can be removed by

c = c & 0X7F;

This clears the parity bit (bit 7).

The process of clearing specific bits in a binary number is known as masking. It can be done also with integer and long integer sized numbers.

If it is required to change bits 2, 5 and 7, the operator ^ can be used with the binary number which has these bits set, i.e. with 0XA4 as above. Again, if a points to the byte,

*a = *a ^ 0XA4;

will change bits 2, 5 and 7.

These techniques have an obvious extension to segments of memory that are two or four bytes long by using integer or long integer variables.
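The three operations on bits 2, 5 and 7 can be collected in a short sketch. The function names are invented for the illustration, and the functions work on a value rather than through a pointer. The last function shows that the mask used for clearing is the one's complement of the mask used for setting.

```c
/* Hypothetical helpers acting on bits 2, 5 and 7 of a byte */
unsigned char set_bits(unsigned char b)
{
    return (unsigned char)(b | 0XA4);   /* 10100100 sets the three bits */
}

unsigned char clear_bits(unsigned char b)
{
    return (unsigned char)(b & 0X5B);   /* 01011011 clears the three bits */
}

unsigned char change_bits(unsigned char b)
{
    return (unsigned char)(b ^ 0XA4);   /* 10100100 reverses the three bits */
}

unsigned char clearing_mask(void)
{
    return (unsigned char)~0XA4;        /* one's complement of 0XA4 is 0X5B */
}
```

Applying change_bits twice returns the original value, since each of the three bits is reversed and then reversed again.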

The & operator can be used to restrict the value of a variable of type char, int or long, to values between zero and 2^n - 1. Here, of course, the value of n must be such that the number can be contained in the appropriate type of variable. As an example, suppose that i is an unsigned integer which is used as the index of an array

int sn[512];

and that during a loop, i is increased by a small integer but, if it exceeds 511 and takes the index outside the defined range, it has to be reduced by 512, to stay inside the allowed range. It would be possible to do this with

if (i > 511)

i = i - 512;

However, it is also possible, and faster, to do this by using the AND operator.

i = i & 0X1FF;

The hexadecimal number 0X1FF in binary is

FEDCBA9876543210

0000000111111111

Using & with this clears all bits above bit 8 and effectively subtracts the appropriate multiple of 512 to ensure that the result is between zero and 511. This is quicker than using the if form above but it should be clear that it only works for integers whose values have to lie in the range between zero and 2^n - 1 for some integer value of n.
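The two ways of keeping the index in range can be compared in a short sketch. The function names are invented for the illustration.

```c
/* Hypothetical helper: subtracts 512 once if the index is too large */
unsigned int wrap_with_if(unsigned int i)
{
    if (i > 511)
        i = i - 512;
    return i;
}

/* Hypothetical helper: keeps only bits 0 to 8 of the index */
unsigned int wrap_with_and(unsigned int i)
{
    return i & 0X1FF;
}
```

The two agree for any index below 1024. For a larger index the if form subtracts 512 only once and can leave the index out of range, whereas the & form always gives a result between zero and 511.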

The shift operators << and >> can be used to move bit patterns to the left and to the right. When moving to the left, zeroes are moved into the bits vacated on the right hand side. Zeroes are also moved into the bits vacated on the left when shifting an unsigned variable to the right but, when shifting a signed variable which is negative, some implementations move in ones and others move in zeroes. It is safest to use unsigned variables.
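A brief sketch of the shift operators acting on unsigned values follows. The function names are invented for the illustration. The last function combines a shift with a mask to extract the upper nibble of a byte.

```c
/* Hypothetical helpers using unsigned operands, as recommended above */
unsigned int shift_left(unsigned int v, unsigned int n)
{
    return v << n;          /* zeroes enter on the right */
}

unsigned int shift_right(unsigned int v, unsigned int n)
{
    return v >> n;          /* zeroes enter on the left for unsigned values */
}

unsigned int high_nibble(unsigned char c)
{
    return (c >> 4) & 0XF;  /* move bits 4 to 7 down, then keep four bits */
}
```

For small values, shifting left by n multiplies by 2 to the power n, and shifting an unsigned value right by n divides by 2 to the power n, discarding the remainder.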

 

ASSIGNMENT OPERATORS **

Often it is necessary to use an expression like

k = k + 2;

or

f = f*i;

These can be shortened by the use of assignment operators which have the form

variable operator= expression;

and which have the meaning

variable = variable operator (expression);

so that the two examples above could be encoded as

k +=2;

f *= i;

Note that there is no gap between the operator and the equals sign. The possible assignment operators are

+= -= *= /= %= >>= <<= &= ^= |=

Note that the expression to the right of the operator is evaluated first followed by the addition or whatever is specified by the assignment operator. For example,

i *= j + 3;

is equivalent to

i = i*(j + 3);

and not to

i = i * j + 3;

which in turn would be equivalent to

i = (i*j) + 3;

The assignment operators are useful since they produce a very compact code but they should be used with care in complicated expressions and need not be used at all.
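The difference between the two readings of i *= j + 3 can be checked with a short sketch. The function names are invented for the illustration.

```c
/* Hypothetical helper: uses the assignment operator */
int with_assignment_operator(int i, int j)
{
    i *= j + 3;         /* the same as i = i*(j + 3) */
    return i;
}

/* Hypothetical helper: the form that is NOT equivalent */
int without_parentheses(int i, int j)
{
    i = i * j + 3;      /* the same as i = (i*j) + 3 */
    return i;
}
```

With i equal to 2 and j equal to 4, the first gives 2*7 = 14 and the second gives 8 + 3 = 11.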

PRECEDENCE AND ASSOCIATIVITY **

Suppose we have the program fragment -

int a = 3, b = 4, c = 5, d;

d = a*b + c;

The value given to d will depend on whether the multiplication is done before or after the addition. The first gives 12 + 5 = 17 and the second gives 3*9 = 27. Precedence is the priority given to different operations and determines the order in which they are carried out. For example, all multiplications are executed before any additions.

The order in which operators of the same precedence are carried out is still undetermined. For example, for floating point variables, a, b, c and x, is

x = a/b/c;

to be treated as x being given the value of a divided by b with the result being divided by c or is the required value obtained by dividing b by c and using the result to divide a? Associativity is the order in which operations of the same precedence are carried out; i.e. from left to right or from right to left.

The order of precedence for the arithmetical operators, starting with those with the highest precedence, along with the associativity of those with the same precedence, is as follows.

 

Operators Associativity

++ -- -(unary) R to L (highest)

* / % L to R

+ - L to R

= R to L (lowest)

Parentheses are very useful since the rule is that any part of an expression in parentheses is evaluated before the rest of the expression. This can be used to circumvent the order of precedence rules. If we wanted in the program fragment above to set d to a times all of b plus c, we would write

d = a*(b + c);

so that b + c is evaluated before the multiplication is carried out. The current style of writing programs in C is to make full use of the rules of precedence and to minimise the use of parentheses but this can make expressions hard to interpret and it is often clearer to make frequent use of parentheses instead of just using the precedence rules on their own. It certainly makes it easier to understand the program.

 

 

Precedence rules **

The rules of precedence are not restricted to the arithmetical operators but apply to operators of any kind. C has many operators, 46 in total. This is more than in most other languages so that the precedence and associativity rules play a crucial part when developing a program. The rules for the operators already met are as follows.

Operators Associativity

() [] ++ (postfix) -- (postfix) L to R (Highest)

++ (prefix) -- (prefix) - (unary)

~ sizeof & (address) * (dereferencing) R to L

* / % L to R

+ - L to R

<< >> L to R

< <= > >= L to R

== != L to R

& (bitwise) L to R

^ L to R

| L to R

&& L to R

|| L to R

?: R to L

= += -= *= /= %=

>>= <<= &= ^= |= R to L

, (Comma operator) L to R (Lowest)

 

Examples **

int a = 3, b = 3, c = 3, d = 3, e = 3, f = 3, g = 3, h = 3, i = 3, j = 3, k = 3, m = 3, n = 3;

- - -a;

/* equivalent to -(-(-a)) with value -3. The spaces are needed: written as ---a, the compiler reads -- -a, which is illegal (see below) */

- --a;

/* equivalent to -(--a) with value -2. a has value 2 */

-- -a;

/* ILLEGAL - precedence is -- before - */

b - --c;

/* Equivalent to b - (--c) giving value 3 - 2 = 1 */

d-- - e;

/* Equivalent to (d--) - e giving value 3 - 3 = 0 and d has final value 2 */

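d --- e;

/* Read by the compiler as (d--) - e, the same expression as in the previous example, because the longest possible operator, --, is taken first. Spaces should be used to make the meaning clear */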

- f-- - g;

/* Equivalent to (-(f--)) - g giving value -3 - 3 = -6 and f has final value 2 */

f++ = g;

/* Equivalent to (f++) = g; - ILLEGAL */

h ++/ ++ i * --j;

/* equivalent to ((h++)/(++i))*(--j) giving value (3/4)*2 = 0*2 = 0.

Associativity is L to R and h and i have final values of 4, j has final value 2

Note the 3/4 is evaluated using integer division so that it gives zero */

++k / m++ * -- n;

/* Equivalent to ((++k)/(m++))*(--n) giving (4/3)*2 = 1*2 = 2.

k and m have final values of 4, n of 2 */
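Some of the legal expressions above can be checked by placing each in a small function in which every variable starts with the value 3. This is only a sketch and the function names are invented for the illustration.

```c
/* Hypothetical helpers: each returns the value of one expression from
   the examples, with all variables starting at 3 */
int b_minus_predecrement_c(void)
{
    int b = 3, c = 3;
    return b - --c;             /* b - (--c) */
}

int postdecrement_d_minus_e(void)
{
    int d = 3, e = 3;
    return d-- - e;             /* (d--) - e */
}

int h_example(void)
{
    int h = 3, i = 3, j = 3;
    return h ++/ ++ i * --j;    /* ((h++)/(++i))*(--j) */
}

int k_example(void)
{
    int k = 3, m = 3, n = 3;
    return ++k / m++ * -- n;    /* ((++k)/(m++))*(--n) */
}
```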

 

FUNCTIONS **

These have already been met briefly but it is now necessary to give a much more complete description of what they are and how they are employed.

A function has a header, which identifies the function, gives the type of value it returns and names its arguments or parameters, the values that are passed to it when it is called. A function also has a body. This is the remainder of the function and is enclosed by curly brackets. The recommended form for functions in ANSI C is

return-type function-name(parameter declarations, if any)

{

declarations

statements

}

As an example, consider a function which returns the integer power of a floating point number. The arguments are of type float and int. The function returns a floating point value.

float power(float x, int n) /* Header */

{/* Body */

float v = 1.0;

int i;

for (i = 0; i < n; i++)

v *=x;

return v;

}

Inside the body, the return statement specifies the value that has to be returned after the function has been executed.

Note that, in the header, the name and return type of the function are given and that the arguments x and n are declared inside the round brackets. This is the ANSI form although, in order to be compatible with earlier versions of C, the following form from traditional (pre-ANSI) C is also acceptable to most ANSI compilers, with the parameters being declared after the header.

 

float power(x,n)

float x;

int n;

{

float v = 1.0;

int i;

for (i = 0; i < n; i++)

v *=x;

return v;

}

This form is acceptable only to allow the compilation of programs written before the introduction of ANSI C. It is strongly recommended that this style is no longer used.

A function like this can give the value of a floating point number raised to an integer power in any other part of the program. For example, inside a program,

float x, y=2.0;

int p =3;

x = 2.5*power(y,p);

will give x the value 2.5 times 8, i.e. 20.0.

Note that the variable names for the arguments used when the function is called need not be the same as the names used inside the function.

In general, a function can be called from any part of the program as part of an expression, if it returns a value, or as a statement by giving the function name. The arguments can, in both cases, be variables declared inside the calling part of the program or expressions involving such variables. These must be of the same type as the function's arguments which are specified in the declaration of the function. For example, if it is required to set variable y to the value 2 times x all cubed, the function power given above allows this with the statement

y = power(2*x,3);

Here, x must be a floating point variable so that the expression 2*x is also floating point to satisfy the declaration of the function power.

 

Function types **

Functions can be of the same type as variables.

int

short int

long int

unsigned int

unsigned short int

unsigned long int

char

unsigned char

float

double

However, it is also possible to have a function which does not return any value. This is type

void

Such a function corresponds to a subroutine in Fortran or BASIC or to a procedure in Pascal.

However, even a function which does return a value can be called without the returned value being assigned to any variable. For example,

getch();

has already been used to produce a pause in a program although getch returns a character value. This is discarded when the statement above is executed.

 

return statement **

This specifies the value which is to be returned to the calling part of the program.

The statement giving the value may be contained inside brackets but this is not necessary. Both

return(z*x/y);

and

return z*x/y;

are acceptable.

A function can have more than one return statement if it has to return different values depending on certain conditions. For example, a function to calculate factorials might have the form.

int factorial(int n)

{

int f = 1, i;

if (n < 0)

return 0;/* factorial is not defined for this */

else if ((n == 0) || (n == 1))

return 1;

for (i = 2; i <= n; i++)

f *=i;

return f;

}

 

Functions with no return value **

Functions can be written which do not return a value. For example, a small function which causes a program to print a message and stop the execution of the program until a key is pressed might be as follows.

void hold()

{

printf("\nUse any key to restart the program\n");

getch();

return;

}

This does not return a value and can be used to halt the program at any point by having

hold();

Note the use of return without a return value. In this kind of function, it is possible to omit the return statement and, when the closing curly bracket of the function is reached, the program "falls off the end" and returns to the calling part of the program.

 

Declaring functions **

Functions should, and for many compilers must, be declared before they are used. This declaration is known as the function prototype and it allows the compiler to know the type of the returned value and the number and types of its arguments. The types of the arguments must therefore be clearly specified.

For example, a function to print the sine of a floating point number and to return the value of the sine might have declaration

float prntsin(float);

The function prntsin could have the form

float prntsin(float x)

{

float y;

y = sin(x);

printf("The value of the sine of %f is %f",x,y);

return y;

}

If a return type is not specified in the function declaration, type int is assumed by default.

The header files which are added to programs by #include, contain the declarations of the standard functions included in the C function libraries along with other information needed to use the functions. They should not and, in many cases, cannot be omitted! The program using prntsin should therefore contain the prototype for the sine function which is used within prntsin. This is achieved by adding the appropriate header file.

#include <math.h>

In order to conform to earlier forms of C, before ANSI C was specified, it is possible to declare a function without giving the arguments as, for example,

float power();

This allows the compiler to know how to interpret the number returned by the function but it deprives the compiler of some of the power to spot errors since it is then unable to check the use of the function in subsequent parts of the program to ensure that the correct number and type of arguments are being used. This style is not recommended.

 

Parameter passing to functions **

If an argument of a function is a variable, what is passed into the function is the value of the variable. This is referred to as call by value. What happens is that copies of the values of the arguments are created and these copies are passed to the function. It means that, even if the value of an argument is altered inside a function, these changes are not passed back to the calling part of the program.

To illustrate this, consider the power function which could also be written as

 

float power(float x, int n)

{

float v;

for (v = 1.0; n > 0; n--)

v *=x;

return v;

}

A call to this function by

float y = 5.0, a;

int p = 3;

a = power(y,p);

would give a the value 125.0 (5.0 cubed) and leave y with the value 5.0 and p still with the value 3. This is because the function power has no knowledge of where p, in the calling part of the program, is stored and it therefore cannot change p. This would not be the case in some other languages. For example, in Fortran, a call to a corresponding function would leave p with the value zero because what is passed to the function in Fortran is not the value of the parameter but its address. The Fortran function uses the variable at this address in its calculation so that after the return has been executed, the parameter has the last value it had inside the function. The passing of parameters as addresses is referred to as call by reference.
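Call by value can be checked with a short sketch. The function p_after_call is invented for the illustration: it calls the version of power which counts n down to zero, hands the result back through a pointer (a technique described later in this section) and returns the value that p has after the call.

```c
/* The version of power which alters its own copy of n */
float power(float x, int n)
{
    float v;

    for (v = 1.0f; n > 0; n--)
        v *= x;
    return v;
}

/* Hypothetical helper: returns the value of p after power has been called */
int p_after_call(float *result)
{
    float y = 5.0f;
    int p = 3;

    *result = power(y, p);  /* power receives a copy of p */
    return p;               /* so p is unchanged here */
}
```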

It is sometimes necessary to have a function which has to alter the values of several variables. This would allow functions in C to take on the role of subroutines in Fortran and procedures in Pascal. As has been described, this can be done if the addresses of the variables are passed, rather than the values of the variables themselves. In C this is achieved by using pointers as the function arguments.

As an illustration, consider a function to exchange the values of two integers. It might be thought that this could be achieved by a function declared by the prototype

void swap(int a, int b);

and with the code

void swap(int i, int j)

{

int temp;

temp = j;

j = i;

i = temp;

}

The values of two variables, p and q, would be exchanged by the call

swap(p,q);

However, because swap does not have access to p and q (i.e. to their addresses), but only to copies of their value, this call would leave p and q unchanged.

What is required is the following function, declared by the prototype

void swap(int *, int *);

whose arguments are pointers to integers and with the code

 

 

 

void swap(int *i, int *j)

{

int temp;

temp = *j;

*j = *i;

*i = temp;

}

The values of two integer variables, p and q, would be exchanged by the call

swap(&p,&q);

Now the function knows where to find p and q and it is therefore able to alter them.

Consider as another simple example a function which has to cube the value of three numbers. It might have the form

void cuber(float *a, float *b, float *c)

{

*a = (*a)*(*a)*(*a);

*b = (*b)*(*b)*(*b);

*c = (*c)*(*c)*(*c);

return;

}

Note the use of (*a) etc. to avoid having to remember the rules of precedence. This function could be called to cube the values of three floating point variables, x, y and z, by

cuber(&x,&y,&z);

This is one way in which C can have a structure equivalent to a subroutine in Fortran which is used to alter the values of several variables but does not actually have a return value.
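The same technique allows a function to hand back more than one result. As a sketch, the function divide below, which is invented for this illustration, places the quotient and the remainder of an integer division in two variables of the calling part of the program. It assumes that b is not zero.

```c
/* Hypothetical helper: gives both results of dividing a by b.
   Assumption: b is not zero */
void divide(int a, int b, int *quot, int *rem)
{
    *quot = a / b;  /* integer division */
    *rem = a % b;   /* remainder */
}
```

For integer variables q and r, the call divide(17, 5, &q, &r); leaves q with the value 3 and r with the value 2.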

 

THE PREPROCESSOR **

The instructions starting with a hash, '#', have already been met but not explained. These instructions, which must have a '#' in the first column, are for the preprocessor, a stage which is invoked before the actual compilation and which can be used to alter the source code before it is compiled.

In all calls to the preprocessor, there should be no space between the '#' and the subsequent instruction and the '#' must be in the first column.

There are two important instructions which are particularly relevant to the needs of the programmer who is learning to use C.

 

#include **

This has already been met in the form of instructions to include the header files appropriate for library functions. For example,

#include<stdio.h>

#include<math.h>

These contain prototypes and other information for the functions in that library.

In general the include instruction tells the preprocessor to add to the program, before compiling, the contents of the disk file specified between < and > or between double quotation marks. It does so by replacing the #include instruction in the source code with the file contents.

If it has the form

#include <filename>

the preprocessor looks only in the standard places where the standard header and other files are kept.

If it has the form

#include "filename"

the preprocessor replaces the include instruction with the disk file specified by filename. An example might be

#include "A:/mylib/bubble.c"

to include the bubble sort functions which had been previously written and stored in the file bubble.c in library mylib on disk A (The floppy disk in a microcomputer). In this way, it is possible to have some personal uncompiled functions which can be included in a program without explicitly adding them to the source file.

 

#define **

This is another instruction to the preprocessor. It can be used to define constants in a program. For example, the constant PI can be defined by

#define PI 3.141593

What happens is that the preprocessor goes through the source code and replaces any occurrence of the sequence of letters PI by 3.141593 before the actual compilation takes place. The quantity PI in the program is not a variable because, if there is a statement

a = sin(x*PI/180);

what is actually compiled is

a = sin(x*3.141593/180);

By convention, constants defined in this way are given names in upper case letters to distinguish them from variables which are given names with lower case letters. This is just a convention and is not essential.

The #define instruction is useful in a program with a large number of arrays of the same size. For example, it might contain the arrays declared by

float x[128], y[128], z[128];

float g[128][128];

etc.

with loops throughout the program such as

for(i = 0; i < 128; i++)

x[i] = ...etc.

If the size of the arrays has to be changed, for example, to contain more data, it can be a long and tedious job to modify the program. However, if the preprocessor is used, it is possible to write

#define NSIZE 128

and to declare the arrays by

float x[NSIZE], y[NSIZE], z[NSIZE];

float g[NSIZE][NSIZE];

etc.

and to have loops throughout the program which are written as

for(i = 0; i < NSIZE; i++)

x[i] = ...etc.

Any change in the size of the arrays can be achieved simply by altering the #define statement and the whole program is then converted.

The #define statement is best used to define constants which are used during the program. For example, it is possible to give the memory addresses used for i/o ports suitable names using this instruction. The best position for it is at the start of the program.

It can be used for a variety of other purposes, sometimes to suit the individual tastes of the programmer. For example

#define FOREVER while(1)

will allow an infinite loop to begin with

FOREVER {

etc.

}

However, this can result in programs which are idiosyncratic and confusing to others so that the temptation to do too much of this should perhaps be resisted.

 

Macros with arguments

#define can also be used to define small functions using it in the form

#define identifier(identifier1,identifier2,....) token_string

where identifier is the name of the function, identifier1 etc. are the function arguments and token_string defines the function. As an example, consider the function to give the square of its argument

#define SQ(x) (x)*(x)

With this, each occurrence of SQ(expression) is replaced by

(expression)*(expression)

Note the parentheses round the argument. If these were omitted and, instead

#define SQ(x) x*x

was used, SQ(x + y) would become

x + y*x + y

Note also that there is no semicolon. This allows the function defined in this way to be incorporated in an expression.
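The effect of the parentheses can be checked with a short sketch. SQ_BAD is a name invented here for the form without parentheses. SQ_SAFE, with a further pair of parentheses round the whole token string, is not part of these notes but is a commonly used precaution, added as an assumption, for cases where the macro is used next to an operator such as /.

```c
#define SQ(x) (x)*(x)           /* the form given above */
#define SQ_BAD(x) x*x           /* hypothetical name: parentheses omitted */
#define SQ_SAFE(x) ((x)*(x))    /* assumption: outer parentheses added */

int sq_of_sum(int x, int y)
{
    return SQ(x + y);           /* (x + y)*(x + y) */
}

int sq_bad_of_sum(int x, int y)
{
    return SQ_BAD(x + y);       /* x + y*x + y */
}

int hundred_over_sq(int x)
{
    return 100 / SQ(x);         /* 100 / (x)*(x), i.e. (100/x)*x */
}

int hundred_over_sq_safe(int x)
{
    return 100 / SQ_SAFE(x);    /* 100 / ((x)*(x)) */
}
```

With x equal to 2 and y equal to 3, SQ gives 25 but SQ_BAD gives 2 + 6 + 3 = 11. With x equal to 5, 100 / SQ(x) gives 100 rather than the 4 that was probably intended.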

There are several more preprocessor commands but these are not frequently needed and the reader is referred to a suitable textbook if they are required.

 

 

ARRAYS AND POINTERS *

More about pointers *

Let us first revise what has already been given about pointers. Variables, except for register variables (described later) but including array elements, are stored in memory at specific locations and with an appropriate amount of memory allocated to them depending on their type. A pointer is a variable whose size is a long integer which gives the location in memory of a variable. The type of the variable is specified when declaring the pointer so that a pointer also implicitly contains information about how much memory has been assigned to the variable.

For the declaration

int *p, i, j = 5;

i is an integer variable and p is a pointer to an integer variable. The form for declaring a pointer can be remembered because *p is the value of the integer stored at p and this value in memory is appropriate for type int; i.e. *p is type int.

The unary operator * is the indirection or dereferencing operator. The inverse operator is &.

With the above declaration, the compiler sets the location for the integers i and j.

p = &j;

will give p the value of the address of the integer j. Then

i = *p;

will give i the value 5, the value stored for j.

*p = 7;

will then place the value 7 into the two byte part of memory allocated to j. That is, j will have the value of 7.

 

More about arrays

The declaration of an array allocates memory for a specified number of variables of a given type and allows access to each of them using indices. For example,

int a[12];

will allocate memory for twelve integers and, say, a[5] will refer to the sixth of them. (Remember that the indices have values that start at zero.)

Similarly, it is possible to have more than one index.

int a[12][5], b[4][4][3];

will allocate memory for 60 integers for the array a and 48 for the array b.

For integers i, j, k,

i = 3;

j = 3;

k = 2;

a[i][j] = 24;

b[i][j][k] = 26;

will set a[3][3] to 24 and b[3][3][2] to 26.

Obviously, loop structures with for, while and do ... while are very useful when dealing with arrays.

 

Pointers and arrays

There is a very close relationship between pointers and arrays. What has to be stored for an array with one index, in order that it can be used, is the location of the first element and the type of variable in the array.

In fact, for

int ar[12], *p;

p = ar;

will set the value of p, a pointer to an integer, to the address of the first element of the array ar. The use of ar on its own is just like the use of a pointer with one important exception. The value of ar has been set by the compiler or linker and this value cannot be changed.

For arrays with more than one index, for example, declared by

int br[6][9];

the use of br with only one index, say br[3], gives a pointer to br[3][0] so that br[3] contains the address of the array element br[3][0]. In other words, br[0] to br[5], are like an array of pointers. The situation gets even more complicated with three or more indices and will not be pursued here.

However, it should now be clear that there is a strong link between arrays and pointers.
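Both statements can be checked by comparing addresses, as in the following sketch. The function names are invented for the illustration. Only the addresses of the array elements are used, so the arrays need not be given values.

```c
/* Hypothetical helper: returns 1 if the array name gives the address
   of the first element */
int name_is_first_element(void)
{
    int ar[12], *p;

    p = ar;
    return p == &ar[0];
}

/* Hypothetical helper: returns 1 if br[3] gives the address of br[3][0] */
int row_is_first_in_row(void)
{
    int br[6][9];

    return br[3] == &br[3][0];
}
```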

 

Pointer arithmetic *

It is possible to add to and subtract from pointers but the results are not quite what might be first expected. When a pointer is declared, the type of variable is also specified so that if we have several integers in memory, with no gaps and starting at 0X1200 say,

long int *p, *q, i, j = 3;

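declares p and q as pointers to long integers. With p set to point to the first of these long integers,

p = (long int *) 0X1200; /* The cast (described later) is needed in ANSI C */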

i = *(p + j);

does not give i the value of the long integer which is stored in memory starting at location 0X1203 but from the location which would store the fourth element of an array of long integers, the one with index [3], with the array starting at address 0X1200. That is the integer would be stored at 0X1200 plus decimal 12 (3*4), at 0X120C. Therefore the statement, involving pointers,

q = p + j;

will give q the value 0X120C, increased from the value 0X1200 by decimal 12 to take account of the three integers, each of length four bytes, which must be passed over to get to the fourth element of the array.

Therefore, any change in the value of a pointer by adding or subtracting numbers changes its value by an amount which corresponds to that number multiplied by the length of the variable type to which the pointer refers.

This also applies to arrays when the array name is used without an index. For example for declarations

float a[20], y, *p;

int i = 6;

the statements

p = a;

y = *(p + i);

would set p to the address of a[0] and y to a[6]. The second result would also be obtained by

y = *(a + 6);

or

y = a[6];

The right hand parts of these statements giving y a value are equivalent.

It might be thought that this pointer arithmetic is an unnecessary complication but it can significantly reduce the amount of code needed in some situations. Consider reading values for a[2], a[3], a[5] and a[12] from the keyboard for the array above. It can be done using

scanf("%f%f%f%f", &a[2], &a[3], &a[5], &a[12]);

but this can be shortened to

scanf("%f%f%f%f", a+2, a+3, a+5, a+12);

Note from this example that the variable list in scanf need not always use &.

Pointers to void

In pre-ANSI C, pointers of different types are considered to be assignment compatible. That is, the value of one pointer can be given to any other pointer. In ANSI C, one pointer can be assigned to another only when they have the same type or when one of them is of type void *. The declaration void * can be thought of as a generic pointer type. With the declarations,

int *p;

float *q;

void *v;

the following would not be legal in ANSI C,

p = 0X2300;

q = 0X5000;

p = q;

but these would

p = 0;

p = (int *) 1; /* This is a cast (described later) to convert to a pointer to an integer. Note the parentheses to take account of precedence */

v = q;

p = v; /* To get the value in q into p */

p = v = q; /* Same as the two previous lines */

p = (int *) q; /* Again, this is a cast */

The standard functions calloc() and malloc(), which provide dynamic storage allocation for arrays and structures, return a pointer to void. These are described below.

All of this might seem an unnecessary complication but it does force a discipline on programming in a way which tends to pick up programming errors.
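
One use of the generic pointer is a function which can act on objects of any type. The sketch below exchanges two objects byte by byte; the name swap_any and its parameter list are assumptions made for this illustration and are not taken from the standard library.

```c
#include <assert.h>
#include <stddef.h>

/* Exchanges two objects of the same size, whatever their type.
   The void pointers are converted to unsigned char pointers so that
   the objects can be accessed one byte at a time.
   (Hypothetical helper, for illustration only.) */
void swap_any(void *first, void *second, size_t size)
{
    unsigned char *a = first;     /* void * converts without a cast */
    unsigned char *b = second;
    unsigned char tmp;
    size_t i;

    assert(first != NULL && second != NULL);
    for (i = 0; i < size; i++) {
        tmp = a[i];
        a[i] = b[i];
        b[i] = tmp;
    }
}
```

With integers i1 and i2, the call swap_any(&i1, &i2, sizeof(int)) exchanges their values, and the same function works unchanged for two doubles when sizeof(double) is passed. The price of this generality is that the compiler can no longer check that the two arguments really have the same type.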

 

Passing arrays as function arguments

The correspondence between arrays and pointers now allows arrays to be passed to functions and, if necessary, for the function to alter the contents of the array.

Consider the following function to sum the elements of a floating point array.

float sum(float a[], int n)

/* Sums the first n elements of the array pointed to by a */

{

int i;

float s = 0.0;

for (i = 0; i < n; i++)

s += a[i];

return s;

}

This might be called, with the previous declaration

float x, u[50], v[100];

as follows, once the array elements have been given values,

x = sum(v,100);

to give the sum with the value v[0]+v[1]+...+v[98]+v[99],

x = sum(u,10);

to give the sum, u[0]+u[1]+u[2]+...+u[9],

int n = 43;

x = sum(u,n-8);

to give u[0]+u[1]+u[2]+...+u[34] and

x = sum(&v[9],10);

to give v[9]+v[10]+v[11]+...+v[18].

The last of these illustrates the correspondence between pointers and arrays. The variable a[ ] declared as the argument of the function sum is actually treated as a pointer and what is passed to the function is the address of v[9]. This is used as the origin of the array a inside the function. The same result would be obtained with

x = sum(v+9,10);

since v+9 has the value of the address of v[9].
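
Because the array argument is really a pointer, the parameter can equally be declared as one. The following sketch is a variant of sum written in this way; the name sum_from is chosen here only to distinguish it from the function above.

```c
#include <assert.h>
#include <stddef.h>

/* Sums n consecutive floats starting at the address held in a.
   Declaring the parameter as const float *a is equivalent to
   const float a[] in a parameter list. */
float sum_from(const float *a, int n)
{
    float s = 0.0f;
    int i;

    assert(a != NULL);
    for (i = 0; i < n; i++)
        s += a[i];              /* a[i] is the same as *(a + i) */
    return s;
}
```

For an array float v[6], the calls sum_from(&v[2], 3) and sum_from(v + 2, 3) both give v[2]+v[3]+v[4]. The function has no means of knowing how long the original array is, so it is the caller's responsibility to make sure that n elements really exist beyond the address passed.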

Another example of a function which acts on arrays is the following, which orders the elements of an array according to their value. It uses a rather primitive and inefficient method called bubble sort and so is called bubble. See if you can understand how it works.

void bubble(int a[ ], int n)

{

int i,j;

for (i = 0; i < n; i++){

for (j = 0; j < n-1; j++){

if (a[j] > a[j+1])

swap(&a[j], &a[j+1]);

}

}

}

It uses the function swap already written and would have to be declared by the prototype

void bubble(int a[ ], int n);

The first n elements of any previously declared integer array, say a, can be ordered by

bubble(a,n);
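
For reference, a self-contained version of the sort is sketched below. The form of the swap function is assumed here (two pointers to int), and the inner loop has been shortened on each pass since the largest values have already reached the end of the array; the names swap_ints and bubble_sort are chosen for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Exchanges the two integers pointed to by x and y
   (assumed form of the swap function). */
void swap_ints(int *x, int *y)
{
    int tmp = *x;

    *x = *y;
    *y = tmp;
}

/* Puts the first n elements of a into ascending order. */
void bubble_sort(int a[], int n)
{
    int i, j;

    assert(a != NULL);
    for (i = 0; i < n - 1; i++) {
        /* after pass i the largest i + 1 values are already in place */
        for (j = 0; j < n - 1 - i; j++) {
            if (a[j] > a[j + 1])
                swap_ints(&a[j], &a[j + 1]);
        }
    }
}
```

On each pass, neighbouring elements which are in the wrong order are exchanged, so that the largest remaining value "bubbles" to the end of the unsorted part. Since the function receives the address of the array, the ordering is carried out on the caller's array and nothing needs to be returned.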

It can be seen that a function to which a singly indexed array is passed can receive an array of any size. For arrays with more than one index this is not the case, since the function needs to know the range of values of all but the first index. The passing of a two-dimensional array is illustrated by the following function to calculate the trace of a matrix. The declaration of the function is

float trace(float a[ ][SIZE]);

where SIZE is given by a #define and the function is

 

float trace(float a[ ][SIZE])

{

float x;

int i;

x = 0.0;

for (i=0; i<SIZE; i++)

x += a[i][i];

return(x);

}

For the array p declared by

float p[SIZE][SIZE];

the trace is given by

y = trace(p);

This can only be used for a matrix which is SIZE by SIZE so that the flexibility which exists for singly dimensioned arrays no longer exists. This example serves as a suitable prototype for other cases in which a doubly indexed array has to be passed to a function and the extension to more than two indices should be reasonably transparent.
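
One common way of recovering the flexibility, which is not the method used above, is to store the matrix row by row in a singly indexed array and to pass the dimension as an argument. The following is a minimal sketch under that assumption; the name trace_flat is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Trace of an n by n matrix stored row by row in a one-dimensional
   array, so that element (i, j) is m[i * n + j]. */
float trace_flat(const float *m, int n)
{
    float t = 0.0f;
    int i;

    assert(m != NULL);
    for (i = 0; i < n; i++)
        t += m[i * n + i];      /* diagonal element (i, i) */
    return t;
}
```

For a 3 by 3 matrix held in float m[9], the trace is given by trace_flat(m, 3), and the same function serves for any other size. The cost is that the index arithmetic i * n + j must be written out by the programmer instead of being done by the compiler.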

 

MIXED MODE ARITHMETIC AND CASTS **

C allows the use of variables of different types in an expression. This is referred to as a mixed expression. The general principle employed is that, in any arithmetic expression, values of different types are converted to a common type, chosen so as to preserve information.

First, any variable of type char or short is converted to int and any unsigned char or unsigned short is promoted to unsigned int (in ANSI C, to int if an int can hold all of its values).

Second, the values used to evaluate an expression of mixed type are promoted according to the following hierarchy

int < unsigned int < long int < unsigned long int < float < double.

If, in an expression or inside parentheses, a variable is encountered which is higher in the hierarchy, all the other numbers used are promoted to that level. It should be noted that nothing is done to the variables themselves but only to the values taken from them.

Once the expression is evaluated, the value obtained is converted to the type of the variable to which it is to be assigned.

It should also be remembered that placing the value of an expression into a variable which is lower in the hierarchy can result in a loss of information. Placing the result of an expression which is evaluated as a long integer into an integer variable can lose information, and placing the value of a floating point expression into an integer or long integer usually will, since the decimal fraction is discarded.

Conversions can be forced on the values taken from a variable by preceding the value with the name of the required type of variable in brackets. For example, for integer i and floating point x,

(long) i

gives the value of i but expressed as a long integer and

(double) x

gives the value of x but expressed as a double precision floating point variable.

These conversions are known as casts.
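
The most frequent practical use of a cast is to force a division of two integers to be carried out in floating point. The short functions below are a sketch of this and of the loss of the decimal fraction on conversion to an integer; their names are chosen only for this illustration.

```c
#include <assert.h>

/* Integer division: both operands are int, so the fraction is discarded. */
int ratio_int(int a, int b)
{
    assert(b != 0);
    return a / b;
}

/* The cast converts the value of a to double; the value of b is then
   promoted to match, so the division is done in floating point. */
double ratio_double(int a, int b)
{
    assert(b != 0);
    return (double) a / b;
}

/* Converting a floating point value to int drops the decimal fraction. */
int whole_part(double x)
{
    return (int) x;
}
```

Here ratio_int(7, 2) gives 3 while ratio_double(7, 2) gives 3.5, and whole_part(3.9) gives 3. Note that the cast in ratio_double applies to a alone and not to the result of a / b; writing (double) (a / b) would carry out the integer division first and give 3.0.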

MEMORY ALLOCATION *

C provides the functions calloc() and malloc() for dynamic memory allocation, i.e. for allocation of memory as required during the execution of the program. These functions are in the standard library and their prototypes are in stdlib.h. They make it possible to write programs in which the user inputs the required array sizes, or in which the array size is computed as the program is running, instead of setting it as a specific constant in the program. This can allow much more efficient use of memory.

The function calloc

The function call,

p = calloc(n,size);

where n and size are positive integers, allocates enough memory to store n objects, each of length size, and returns the origin of that space as a pointer which, in this case, is assigned to the pointer p. If it cannot assign the requested memory, the value NULL is returned. It has the prototype

void *calloc(size_t,size_t);

so that it returns a pointer to void. This means that the returned value can be assigned to any type of pointer. The type size_t is given by a typedef (see below) in stdlib.h and can vary from one system to another but is always an unsigned integer type.

The storage set aside by calloc is contiguous and is automatically set to zero (whereas the storage assigned by malloc, see below, is not initialised). The name calloc comes from "contiguous allocation".

The following short program will illustrate the use of calloc. It asks the user to input the size of an array of integers, then requests that it be filled with integer values and finally gives their sum.

#include <stdio.h>

#include <stdlib.h>

void main(){

int *a, i, n, sum=0;

printf("\n\nInput the size of the array ");

scanf("%d",&n);

printf("Memory for this will be allocated dynamically");

a = calloc(n, sizeof(int)); /* This allocates memory for n integers */

for (i=0; i<n; i++)

scanf("%d", a +i);

for (i=0; i<n; i++)

sum += a[i];

printf("\n\nNumber of elements %d",n);

printf("\nSum of the elements %d",sum);

free(a); /* This de-allocates the space */

}

This program could have been written without using an array but it serves to illustrate how calloc can be used for the dynamic allocation of memory. There are several parts of the program which should be noted and which can be used in this type of situation. Instead of using &a[i] in scanf, a+i is used which, from the rules of pointer arithmetic, is equivalent. Also, instead of *(a+i), the equivalent expression a[i] is used.

The sizeof operator returns the number of bytes allocated to a given object. This object can be a variable type, an expression, an array or a structure. It is useful for calloc and malloc since the memory allocated to some variable types is implementation dependent and it therefore gives improved portability of the source code.

At the end of the program, the function free, described below, is used to free the memory which was allocated to the array.

The function malloc

This has the prototype

void *malloc(size_t);

in stdlib.h. The function call

p = malloc(n);

where p is a pointer and n is an integer, will allocate a block of n bytes of memory and return the origin of the block. If it cannot assign the memory, NULL is returned. Unlike calloc, the memory is not initialised to any value and will contain the values originally there. The use of malloc in the program above would require

a = malloc(n*sizeof(int));

instead of the call to calloc.
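
Since both functions return NULL when the memory cannot be obtained, the returned pointer should be tested before it is used. The following sketch wraps malloc in a function which does this and which also initialises the block; the name new_filled_array is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocates an array of n ints with every element set to value.
   Returns NULL if the memory cannot be obtained. The caller must
   pass the returned pointer to free when it is no longer needed. */
int *new_filled_array(size_t n, int value)
{
    int *a;
    size_t i;

    assert(n > 0);
    a = malloc(n * sizeof(int));    /* contents are not initialised */
    if (a == NULL)
        return NULL;                /* allocation failed */
    for (i = 0; i < n; i++)
        a[i] = value;
    return a;
}
```

A typical call is a = new_filled_array(n, 0); followed by a test of whether a is NULL before any element is used. With a value of zero the result is the same as that given by calloc(n, sizeof(int)).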

The function free

This function frees memory allocated by calloc and malloc so that it is released back to the system. Its prototype is in stdlib.h and is

void free(void *ptr);

Its argument must be a pointer whose value has originally been assigned by calloc, malloc or realloc (see below).

It should be noted that space allocated by calloc, malloc or realloc remains assigned for the duration of the program unless it is released by the programmer. Space is not released on function exit.

The function realloc

This function has the prototype

void *realloc(void *ptr, size_t size);

in stdlib.h and is used to change the size of an already assigned memory block obtained using calloc, malloc or realloc. If the pointer p has already been set to point to a memory block obtained in this way, and the block has not been de-allocated by free or realloc, the function call

q = realloc(p, n);

will allocate n bytes of memory, give the pointer q the value of its origin and copy the relevant part of the block pointed to by p into the new block. If possible, the function keeps the same origin for the old and new blocks. If this is not possible, a new block is allocated and the old one is de-allocated. If the new block is larger than the original block, the data in the old block will fill up the first part of the new block and the rest will not be initialised. If it is smaller, only the first part is copied.

If p has the value NULL, the effect of calling realloc is the same as malloc.

A successful call returns the origin of the new block. Failure returns NULL and leaves the original block unchanged.
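
Because the original block survives a failed call, the result of realloc should be placed in a second pointer and tested before the first pointer is overwritten. Otherwise the only record of the old block's origin would be lost. The sketch below shows one way of arranging this; the function name resize_int_block is an assumption made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Changes the block pointed to by *block so that it can hold new_n ints.
   Returns 1 on success and 0 on failure. On failure *block is left
   unchanged, so the original data is still available and can still
   be released with free. */
int resize_int_block(int **block, size_t new_n)
{
    int *q;

    assert(block != NULL && new_n > 0);
    q = realloc(*block, new_n * sizeof(int));
    if (q == NULL)
        return 0;       /* the old block is still valid */
    *block = q;         /* the origin may or may not have moved */
    return 1;
}
```

For a pointer a obtained from malloc or calloc, the call resize_int_block(&a, 100) leaves a pointing to room for 100 integers, with the values already stored kept at the start of the block.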

STORAGE CLASSES

Any variable or function in C has two attributes: type and storage class. The first has already been described when discussing the declaration of variables. There are four storage classes

auto extern register static

The default class is auto and is by far the commonest one used. That is why it is possible to get so far into a description of C without needing to introduce storage classes.

 

The storage class auto

Variables declared within function bodies or within blocks enclosed by curly brackets are automatic by default and they act within the scope defined by the enclosing curly brackets. This is described in the section on scope above. When the block or function in which an automatic variable has been declared is left, the value of that variable is lost since the memory allocated to it is released. If the block is re-entered, new memory is allocated for the variable.

Since variables are automatic by default, the keyword auto is rarely used.

The storage class extern

This refers to external variables, which are declared outside the function or block in which they are used. These include global variables which exist for all parts of the program.

A variable which is declared as external is considered to be global to all functions declared after it and, upon exit from a block or a function, the external variable continues to remain in existence. The following programs, which all have the same effect, illustrate the use of global variables and external variables.

#include <stdio.h>

int a=1, b=2, c=3; /* Global variables */

int f(void); /* Function prototype */

void main(){

printf("\n%3d",f()); /* Prints 12 */

printf("\n%3d%3d%3d", a,b,c); /* Prints 4 2 3 */

}

int f(void) {

int b, c;

a = b = c = 4; /* b and c are local but a is global */

return(a+b+c);

}

In this program a is always global so that its value is altered by the function. Inside the function, b and c are re-declared so that they become internal to the function and any value assigned to them there has no effect outside. Outside it, b and c are still global and retain the original values given to them.

Instead of the first line, it would have been possible to use

extern int a=1, b=2, c=3; /* Global variables */

without altering the behaviour of the program.

The keyword extern is an instruction to the compiler which says "look elsewhere for the variables". It allows the linking of programs which are formed from several files which are compiled separately. For example the program above could have been written as two files file1.c and file2.c which were compiled on their own.

 

In file file1.c,

#include <stdio.h>

int a=1, b=2, c=3; /* Global variables for this file */

int f(void); /* Function prototype */

void main(){

printf("\n%3d",f()); /* Prints 12 */

printf("\n%3d%3d%3d", a,b,c); /* Prints 4 2 3 */

}

In file file2.c,

int f(void) {

extern int a; /* Look for it elsewhere */

int b, c;

a = b = c = 4; /* b and c are local but a is global */

return(a+b+c);

}

The use of extern in the second file tells the compiler that the variable a will be defined elsewhere, either in the same file or in another.

Global variables always exist during the execution of a program and can be used to transmit information between functions. However, they can be hidden by re-using the variable name inside some block. The variable name then refers to a local variable but only inside that block.

Information can be passed into a function in two ways: as parameters or by using external variables. The first is the preferred method since it increases the modularity of the code and reduces the possibility of undesired side effects. This can happen when a function inadvertently changes a global variable's value within its body because the programmer failed to remember that the variable is global and not local to the function. The safe practice is to change the values of global variables only through the parameter and return mechanisms. This rather cancels the value of global variables for passing information between functions.

All functions have storage class extern and this is the default value so, while it is possible to declare a function by, for example,

extern double sin(double);

the use of extern is not required.

 

The storage class register

The storage class register tells the compiler that the associated variables should, if possible, be stored in the high speed memory registers associated with or within the processor. Resource limitations may make this impossible and, in this case, the storage class defaults to auto. The main use of this class is to increase execution speed since access to the variable value can be very much faster than otherwise.

The storage class static

When the program exits from a program block, the values of all variables declared within the block are, by default, discarded so that, when the block is re-entered, they have been lost. The use of the storage class static preserves the variables. This is particularly useful if the program block is a function since it allows the retention of previously calculated values without the need to have global or external variables. The following function reacts differently depending on whether it is called for the first or subsequent times.

void f(void) {

static int counter = 0;

if (counter == 0) {

printf("\n First call to the function ");

counter++;

} else {

printf("\n This is not the first call to the function");

}

}

Here counter is initialised to zero once only, before the first call. The first call increases its value to one and, because the variable is static, this value is retained for future function calls.
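
A static variable can also be used to keep a running count of the number of calls. The following sketch increases the variable on every call; the name call_number is chosen for this illustration.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 on the first call, 2 on the second and so on.
   count is initialised once, before the first call, and keeps
   its value between calls because it is static. */
int call_number(void)
{
    static int count = 0;

    assert(count < INT_MAX);    /* guard against overflow of the counter */
    count++;
    return count;
}
```

If static were omitted, count would be set to zero on every entry to the function and the value returned would always be 1.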

Static external variables

The storage class static can also be used with external declarations so that there is a "privacy" mechanism which imposes restrictions on the scope of otherwise accessible variables or functions. This can increase program modularity. The difference between extern and static extern variables is that the scope of the latter is restricted to the remainder of the file in which they are declared. They are therefore not available to functions defined earlier in the file or defined in other files, even if these functions attempt to use the extern storage keyword.

With this, it is possible to have variables which are local only to a set of related functions which can be stored and compiled in their own file and without these variables being accessible to the rest of the program.
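
As a sketch of this arrangement, the following could form the whole of one source file. The variable running_total can be used only by the functions which follow it in the file, and the rest of the program has to go through those functions. All of the names are hypothetical, and the restriction to non-negative amounts is made only for this example.

```c
#include <assert.h>

/* Visible only to the functions which follow in this file. */
static int running_total = 0;

/* Adds amount to the private total. */
void add_to_total(int amount)
{
    assert(amount >= 0);        /* this sketch accepts only non-negative amounts */
    running_total += amount;
}

/* Returns the current value of the private total. */
int get_total(void)
{
    return running_total;
}
```

A declaration extern int running_total; in another file would not give access to this variable, so its value cannot be changed by accident elsewhere in the program.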

DECLARATIONS AND typedef *

C has its fundamental types such as char, int, float etc. but the typedef declaration effectively allows the introduction of new types which are specified by the programmer or to rename existing fundamental types. For example,

typedef float LENGTH, AREA;

typedef float VECTOR[3];

typedef float TENSOR [3][3];

followed by

LENGTH l1, l2, l3;

AREA a1, a2, a3;

VECTOR x1, x2;

TENSOR t1, t2;

declares l1, l2, l3, a1, a2 and a3 as floating point variables, x1 and x2 as singly indexed arrays with index values from 0 to 2 and t1 and t2 as doubly indexed arrays with the same range of index values. It corresponds to

float l1,l2,l3, a1,a2,a3, x1[3],x2[3], t1[3][3], t2[3][3];

but has the advantage that the arrays are all known to belong to a well-specified type.

The declaration typedef is particularly suitable for the declaration of structures, the topic of the next section.

STRUCTURES *

Structures are a way of associating a single name with a set of variables which have a well defined pattern. An example would be the full name, date of birth and address which are associated with a single person or the examination results obtained by a student during a course. All this information can be contained in one structure. In numerical work, the structure might contain all the values of the observables required to measure a single quantity. An example would be the object and image positions, with experimental errors. Such a structure, to contain the experimental results for a lens, could be defined by

struct lens{

float u;

float erru;

float v;

float errv;

};

This only defines the form of the structure and no memory is assigned since no structures have actually been declared. This could be done subsequently by

struct lens lens1, lens2;

This declares two structures of type lens with names lens1 and lens2. Each of these contains four floating point numbers.

Alternatively, the form of the structure and the structures can be declared simultaneously by

struct lens{

float u;

float erru;

float v;

float errv;

} lens1, lens2;

If only one structure is required the following form can also be used

struct {

float u;

float erru;

float v;

float errv;

} lens1;

 

The general form is

struct structure_type {

type variable_name;

type variable_name;

type variable_name;

.

.

} structure_variables;

where one, but not both, of structure_type and structure_variables can be omitted, as can be seen in the examples given above.

 

Individual variables inside a structure can be accessed by using the form

structure_name.variable_name

where the structure name and the variable name inside the structure are separated by a full stop. The resulting form can be treated as an ordinary variable. This is demonstrated by

f1 = 1.0/(1.0/lens1.u + 1.0/lens1.v);

with floating point variable f1, or by

scanf("%f",&lens1.u);

to input the value of u for lens1.
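
The calculation of the focal length can be collected into a function which takes the whole structure as its argument. This is a minimal sketch using the structure lens defined above; the function name focal_length is an assumption, and u and v are taken to be non-zero.

```c
#include <assert.h>

struct lens {
    float u;        /* object position */
    float erru;     /* error on u */
    float v;        /* image position */
    float errv;     /* error on v */
};

/* Focal length from 1/f = 1/u + 1/v; u and v must be non-zero. */
float focal_length(struct lens l)
{
    assert(l.u != 0.0f && l.v != 0.0f);
    return 1.0f / (1.0f / l.u + 1.0f / l.v);
}
```

With lens1.u and lens1.v both equal to 2.0, the statement f1 = focal_length(lens1); gives f1 the value 1.0.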

 

Structures of arrays *

The elements of the structure can be arrays. As an example, consider the following structure to store the exam results for one person.

struct marks{

float questions1[6];

float total1;

float questions2[6];

float total2;

} exam1, exam2;

which is intended to contain the results of two examinations, each of two papers. The marks for individual questions as well as the total can be stored. The mark for question 3 in the second paper of the first exam is accessed by

exam1.questions2[2]

(Remember that arrays in C start their indexing at zero)

Arrays of structures

The structure for exam marks above can be extended to include the name of the student, and an array of structures can be defined to hold the marks for a class.

struct marks{

char name[40];

float question1[6];

float total1;

float question2[6];

float total2;

} exam1[50], exam2[50];

This defines two 50 element arrays of structures with one for each examination and one element of the arrays of structures for each student. The name of the 18th student in the list of results for the first examination is contained in the string

exam1[17].name

and in fact this points to the origin of the string containing the name. The mark for the third question of the second paper of the first examination for this student is accessed by

exam1[17].question2[2]

The advantage of using structures is that blocks of information can be stored and that each element of the block within the structure can be given a label which is a suitable mnemonic to help identify it. Structures also permit a compact and efficient way of passing data to functions and of allowing functions to manipulate a large set of variables. The passing of structures to functions is discussed later.

 

Structure complex

Unlike Fortran there is no variable type which accommodates complex numbers and no built in complex arithmetic in ANSI C (although this can be built into C++ and, in particular Turbo C++ has variable type complex with implicit complex arithmetic). However, in C, it is possible to use a structure to contain complex numbers although it is necessary to introduce user-supplied functions to carry out many of the standard complex arithmetical operations. This is much more cumbersome than for Fortran and is arguably the biggest, and possibly only, disadvantage C has compared to Fortran although once the programmer moves on to C++ this criticism no longer applies.

The structure can be defined as

struct complex{

float re;

float im;

};

so that for the complex structure declared by

struct complex z;

the real part of z is

z.re

and the imaginary part is

z.im

An array of complex numbers would be declared by

struct complex za[50];

 

Use of typedef for structures

It is considered to be good programming practice to associate a name with a given structure type using typedef and possibly to place the definition of the structure in a header file. Examples could be

typedef struct{

char name[40];

float questions1[6];

float total1;

float questions2[6];

float total2;

} marks;

so that arrays of structures to contain the results of examinations could be declared later in the program by

marks exam1[50], exam2[50];

Usually, as described above, the declarations of structures are placed in header files so that they are global in scope.

An obvious example of the value of using typedef is in programs containing complex numbers. If the following is declared in a header,

typedef struct{

float re;

float im;

} complex;

typedef struct{

double re;

double im;

} dcomplex;

any subsequent declaration in the program such as

complex a, b, c, d[100];

dcomplex p, q, r[50];

will declare complex and double complex variables and arrays of such variables. Effectively, this introduces two new types of variable.

 

Assignment statements for structures

Since typedef allows the introduction of programmer-defined types, it is logical to expect that there should be an assignment statement that transfers the values in one structure into another, provided that they are of the same type. This is the case. If a and b are structures of the same type,

a = b;

will transfer all the values in b into a.

 

Initialising structures

Typical forms for initialising structures are (again with previously defined types marks and complex)

complex z = {1.0, 0.5};

complex sigma2[2][2] = {{{0.,0.},{0.,-1.}},{{0.,1.},{0.,0.}}};

marks student1 = {"F. Bloggs",5,7,3,6,5,8,34,6,4,7,8,7,5,37};

In the definition of sigma2, sigma2[0][0] and sigma2[1][1] have value zero, sigma2[0][1] has value (0,-1) = -i and sigma2[1][0] has value (0,+1) = i. The values are listed one row at a time, with one pair of curly brackets for each complex number and one for each row.

The assignment of values to the structure student1 follows the order of the definition of the elements inside marks and should be clear.

 

 

Pointers to structures

A pointer to a structure is declared using an asterisk as for example, with the previously defined types marks and complex

complex *a, b;

marks *ex, exam1;

Here a and ex are pointers to their respective type of structure and b and exam1 are structures. Later in the program, a can be made to point to b and ex to exam1 by

a = &b;

ex = &exam1;

Elements of the structures pointed to can be accessed as below.

(*a).re

for the real part and

(*ex).questions2[3]

for the mark for question 4 of the second paper.

Note that the * has to be enclosed with the pointer name inside parentheses because * has a lower precedence than the . operator.

An alternative form uses the operator -> which obtains the individual variable directly from the pointer. The alternative forms for the variables above are

a -> re

ex -> questions2[3]

This is now preferred to explicit use of the * operator.

 

Structures of structures

It is possible to have nested structures, i.e. structures which have structures as elements.
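
As an illustration, a date of birth can be held in a structure of its own which then becomes one element of a structure describing a person. The types and the function below are hypothetical and are given only as a sketch.

```c
#include <assert.h>
#include <stddef.h>

struct date {
    int day;
    int month;
    int year;
};

struct person {
    char name[40];
    struct date birth;      /* a structure used as an element */
};

/* Age in whole years reached during the given year. */
int age_in_year(const struct person *p, int year)
{
    assert(p != NULL);
    return year - p->birth.year;    /* -> and then . reach the nested element */
}
```

For a structure declared by struct person s, the nested elements are accessed by repeating the full stop, as in s.birth.month. A nested structure can be initialised by an inner set of curly brackets, for example struct person s = {"F. Bloggs", {14, 3, 1980}};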

 

Passing structures to functions

As already described, structures allow the efficient passing of large numbers of variables to a function. In this, they play the same role as common blocks in Fortran.

 

Passing elements

The value of an individual element of a structure can be passed to a function in the same way as a normal variable. For example,

sin(za[40].re);

calculates the sine of the real part of the complex number in za[40]. What is passed to the function is a copy of the value of the floating point variable za[40].re in the usual call by value form of parameter passing in C.

Passing whole structures and functions which return structures

An example of this is the function to return the complex sum of two complex numbers. With the previous declaration of type complex, this could have prototype

complex cadd(complex,complex);

and have the form

complex cadd(complex a, complex b)

{

complex c;

c.re = a.re + b.re;

c.im = a.im + b.im;

return c;

}

A call to it, with complex variables a, b and z, would be

z = cadd(a,b);

and would give the complex number in z the value of the sum of the complex numbers in a and b.

If complex.h is a header containing the definition of type complex, the following program illustrates the passing of structures to a function.

#include <stdio.h>

#include "complex.h"

complex cadd(complex,complex);

void main()

{

complex a = {1.2,-0.76}, b = {-0.54,.84},c;

c = cadd(a,b);

printf("The sum is %5.2f,%5.2fi",c.re,c.im);

}

complex cadd(complex p, complex q)

{

complex z;

z.re = p.re + q.re;

z.im = p.im + q.im;

return z;

}

 

 

Passing pointers to structures

The method above for passing structures to a function has the serious disadvantage that a replica of the values in each structure has to be created and then passed to the function. This takes time and might even cause an overflow of the stack where these copies are stored if the structure is large enough. Also, although this is not always a disadvantage, the function cannot manipulate the values in a structure in the calling part of the program. If however, a pointer to the structure is passed, these problems are avoided. The program above can be modified to illustrate how it is done.

#include <stdio.h>

#include "complex.h"

complex cadd(complex *,complex *);

void main()

{

complex a, b, c, *d, *e;

a.re = 1.2;

a.im = -0.75;

b.re = -0.86;

b.im = .93;

d = &a;

e = &b;

c = cadd(d,e);

printf("The sum is %5.2f,%5.2fi",c.re,c.im);

}

complex cadd(complex *p, complex *q)

{

complex z;

z.re = p -> re + q -> re;

z.im = p -> im + q -> im;

return z;

}

Alternatively, instead of having pointers in the calling part of the program, it is possible to have

#include <stdio.h>

#include "complex.h"

complex cadd(complex *,complex *);

void main()

{

complex a = {1.2,-0.76}, b = {-0.54,.84},c;

c = cadd(&a,&b);

printf("The sum is %5.2f,%5.2fi",c.re,c.im);

}

complex cadd(complex *p, complex *q)

{

complex c;

c.re = p -> re + q -> re;

c.im = p -> im + q -> im;

return c;

}
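
Passing a pointer also allows a function to alter the structure in the calling part of the program. The functions below are a sketch of this; the type complex is repeated here in the form assumed for complex.h, and the names conjugate and add_into are chosen for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Same form as the type complex assumed to be in complex.h. */
typedef struct {
    float re;
    float im;
} complex;

/* Replaces the number pointed to by z with its complex conjugate. */
void conjugate(complex *z)
{
    assert(z != NULL);
    z->im = -z->im;
}

/* Adds *q to *p, storing the result back in *p. */
void add_into(complex *p, const complex *q)
{
    assert(p != NULL && q != NULL);
    p->re += q->re;
    p->im += q->im;
}
```

After conjugate(&a); the imaginary part of a itself has changed sign, and after add_into(&a, &b); the sum is held in a without any structure having been copied. The keyword const in the second parameter of add_into states that the function does not alter the structure pointed to by q.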

 

 

FILE INPUT AND OUTPUT *

As well as output to the screen and input from the keyboard, it is necessary to be able to output and input to disks and other peripherals. This information is stored in files. These have several important properties. The first is that they have a name. They can be opened and closed. Once a file is opened, it can be written, read or appended to, i.e. have extra information added to the end. The system has to be told which activity is to be performed and, when it is no longer required, the file has to be closed. In the abstract, the file can be thought of as a stream of characters which is stored somewhere.

Information about the current state of a file is stored in a structure of type FILE which is declared in stdio.h. There is no need to know the details of this information but it is necessary to declare pointers to type FILE for each file to be used before opening it.

Also defined in stdio.h are three pointers to type FILE, stdin, stdout and stderr. They provide the connection to the standard input and output channels, normally the keyboard and screen respectively.

stdin standard input file connected to the keyboard

stdout standard output file connected to the screen

stderr standard error file connected to the screen

In practice, it is not often necessary to know all these details and only the general form need be remembered. An example of a program which reads and writes to files is as follows.

#include <stdio.h>

void main()

{

int i,j,k;

FILE *infile, *outfile;

infile = fopen("C:\\mydata\\datin.dat","r");

outfile = fopen("C:\\mydata\\datout.dat","w");

fscanf(infile,"%d%d%d",&i,&j,&k);

i = i*i;

j = j*j*j;

k = 2*k;

fprintf(outfile,"The square of the first term is %d\n",i);

fprintf(outfile,"The cube of the second term is %d\n",j);

fprintf(outfile,"Twice the third term is %d\n",k);

fclose(infile);

fclose(outfile);

}

This opens a disk file called c:\mydata\datin.dat, reads some data, manipulates it and writes the result to a disk file called c:\mydata\datout.dat. (The file names are typical of those used for disk files and are those used by the computer's operating system.)

 

fopen

This function opens the file and returns a pointer to the structure of type FILE where the relevant information about the file is stored. The function call has the form

fopen(file_name,mode)

where file_name is a string which identifies the file and mode is a string which gives the mode of access as follows.

Mode Meaning

"r" open text file for reading

"w" open text file for writing

"a" open text file for appending

"rb" open binary file for reading

"wb" open binary file for writing

"ab" open binary file for appending

A text file here should be thought of as a sequence of ASCII characters which corresponds to the input to the keyboard or the output to the screen and which can contain numerical information as well as character strings.

There are also forms which allow both input and output but these are more complicated to use and can be ignored until they are needed, if at all.

"r+" open text file for reading and then writing

"w+" open a text file for writing and then reading

"a+" open a text file for reading and writing

"rb+" open a binary file for reading and then writing

"wb+" open a binary file for writing and then reading

"ab+" open a binary file for reading and writing

For "r+" and "rb+", input cannot be immediately followed by output unless there is a call to fseek(), fsetpos() or rewind() or unless the end of file mark has been reached. Similarly, for "w+" and "wb+", input cannot immediately follow output unless there is a call to fflush(), fseek(), fsetpos() or rewind().

The return value from fopen is NULL, defined in stdio.h, if the function fails to open the file. It is not possible to open a file for reading if it does not exist but, if the mode is "w", "a", "wb", "ab" etc., the file is created if it does not exist. For the "w" modes any existing contents are discarded and writing begins at the start of the file; for the "a" modes writing begins at the end of the existing contents.
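
Since fopen can fail, the returned pointer should be tested before the file is used. The following sketch does this and also shows the call to rewind which is needed between output and input for a file opened with "w+". The function name round_trip and its arguments are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Writes value to the named text file, reads it back and stores the
   result in *result. Returns 1 on success and 0 on any failure. */
int round_trip(const char *file_name, int value, int *result)
{
    FILE *fp;
    int ok;

    assert(file_name != NULL && result != NULL);
    fp = fopen(file_name, "w+");    /* create or empty the file, then update */
    if (fp == NULL)
        return 0;                   /* the file could not be opened */
    fprintf(fp, "%d\n", value);
    rewind(fp);                     /* required between output and input */
    ok = (fscanf(fp, "%d", result) == 1);
    if (fclose(fp) != 0)
        ok = 0;
    return ok;
}
```

The value returned by fscanf is the number of items successfully read, so comparing it with the number expected is a simple check that the input was in the correct form. The same test of the pointer against NULL applies to files opened with any of the other modes.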