05 April, 2009

Role of Type System in Software Development (Technical)

If someone asks what C++ is, I would say it is a better C with a primary emphasis on type safety. So what is the significance of a "type system" in program development? I will try to explain.
During implementation, programmers often overlook compiler warnings about conversions that violate the usual conversion rules. Worse, programmers tend to forcibly suppress some of these warnings. Overriding the type system this way can cause critical errors. Let us analyze two such idioms.

Brief Description of Type System
A type is a set of values and the operations associated with them, whereas an Abstract Data Type (ADT) represents a generic mathematical model. An ADT will have its own definition in each programming language. For example, 'int' is a primitive type defined by the language that allows addition, subtraction, multiplication and division on its variables. When an 'int' is combined with a pointer, the set of valid operations is different: an integer can be added to or subtracted from a pointer, and two pointers can be subtracted, but multiplication and division are not valid. Languages such as C++ also let the programmer define operations on user-defined types with the help of operator overloading.
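
To make the idea of "values plus operations" concrete, here is a minimal C++ sketch. The names Vec2, intOps and pointerOps are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>

// A small user-defined type (hypothetical, for illustration only).
struct Vec2 {
    int x;
    int y;
};

// Operator overloading: gives '+' a meaning for Vec2 operands.
Vec2 operator+(const Vec2& a, const Vec2& b) {
    Vec2 r = { a.x + b.x, a.y + b.y };
    return r;
}

// 'int' supports all four arithmetic operations.
int intOps(int a, int b) {
    return (a + b) * (a - b) / b;
}

// Pointers support adding an integer and subtracting two pointers.
std::ptrdiff_t pointerOps() {
    int values[4] = { 10, 20, 30, 40 };
    int* first = values;
    int* third = first + 2;      // pointer + int: advances by two elements
    // int* bad = first * 2;     // error: multiplication is not defined for pointers
    return third - first;        // pointer - pointer: distance in elements
}
```

The commented-out line is the kind of operation the type system rejects at compile time, because the pointer type simply does not define it.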

Enumerations in C and C++
In C, enumeration constants (like INACTIVE given below, as opposed to a variable of the enum type) are just integers: they have type int. The compiler effectively replaces each enumeration constant with its integer value. Hence, assigning an integer to a variable of enumerated type is a valid programming construct in C.

In C++, even though INACTIVE is a constant, it carries its enumeration type through compilation. An implicit conversion from another type (like int) to a user-defined type (like an enum) is not allowed, although an enumerator can still be converted implicitly to int. This helps us build better software.

How does the type system help us?
As an example, observe the following code snippet:

typedef enum SystemStateTag
{
    INACTIVE = 0,
    OPERATIONAL,
    FAILURE
} SystemState;

Snippet1

The type SystemState represents the state of a system through the enumeration constants INACTIVE, OPERATIONAL and FAILURE. The compiler treats any SystemState variable as having the enumeration type. Let us analyze the following code under C and C++ compilation.

int calculateState();
SystemState sysState = calculateState();

In the above statement, the assignment

sysState = calculateState();

requires an implicit or explicit conversion of int to enum, depending on whether the code is compiled as C or as C++.

As per ANSI C, the conversion from int to enum is valid and implicit (no cast is needed). As per ANSI C++, the conversion is an error and requires an explicit cast. Some compilers (e.g. Borland C++ 5.02) treat the conversion as a warning by default and generate an error only when compiling in strict ANSI mode (try the -A option in Borland).
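
The difference can be sketched in C++ as follows. The body of calculateState() is a hypothetical stand-in, and readState and stateAsInt are names invented for this illustration.

```cpp
typedef enum SystemStateTag
{
    INACTIVE = 0,
    OPERATIONAL,
    FAILURE
} SystemState;

// Hypothetical stand-in for the article's calculateState().
int calculateState() { return 1; }

SystemState readState() {
    // SystemState s = calculateState();  // accepted by C, rejected by C++
    SystemState s = static_cast<SystemState>(calculateState()); // explicit cast
    return s;
}

int stateAsInt(SystemState s) {
    return s;  // the opposite direction, enum to int, is still implicit in C++
}
```

The explicit cast does not validate the value. It only makes the conversion visible in the source, so a reviewer can see where an unchecked int enters the enumerated type.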

Since we are assigning the return value of calculateState() to sysState, the function may return a value other than 0 (INACTIVE), 1 (OPERATIONAL) or 2 (FAILURE), in which case the system will malfunction.

Perhaps this error would be caught during system testing, but at that stage the correction effort is greater. By relying on the strict type checking of a C++ compiler, we can push such errors to compile time. In C, defensive programming techniques, such as a switch statement with a default case, can catch such errors at run time, but that may be too late.
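
Because C++ forces the conversion to be written out, there is a natural single place to validate the value. The following is one possible sketch, not a prescribed design: toSystemState is a hypothetical helper, and throwing an exception is only one way to report a bad value.

```cpp
#include <stdexcept>

typedef enum SystemStateTag
{
    INACTIVE = 0,
    OPERATIONAL,
    FAILURE
} SystemState;

// Hypothetical helper: converts an int to SystemState and rejects
// anything that is not one of the three known states.
SystemState toSystemState(int raw) {
    switch (raw) {
        case INACTIVE:
        case OPERATIONAL:
        case FAILURE:
            return static_cast<SystemState>(raw);
        default:
            throw std::out_of_range("invalid SystemState value");
    }
}
```

With such a helper, the initialization would read SystemState sysState = toSystemState(calculateState()); so every int entering the enumerated type passes through the same check.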

Errors due to implicit conversions
The following code compares sysState against OPERATIONAL; the result is a Boolean value (an int of 0 or 1 in C, a bool in C++).

if(OPERATIONAL == sysState)
{
// process command
}
Snippet2

If the programmer mistakenly writes the comparison as shown below, the code is still valid in both C and C++.

if(sysState = OPERATIONAL)
{
// process command
}
Snippet3

A C/C++ compiler implicitly converts the value of the assignment expression in the 'if' condition (that is, the value just stored in sysState) to a Boolean. This is a limitation C++ accepts in order to support legacy code written in C. The code can be made fail-safe by using the defensive programming idiom shown in Snippet2, because the statement OPERATIONAL = sysState is not valid in C/C++. An error of this kind is said to have caused a loss of $400 million during a spacecraft launch.
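
The effect of the mistaken assignment can be observed with a small C++ sketch. The function names isOperationalBuggy and isOperational are hypothetical; the first takes its argument by reference so that the overwritten state is visible to the caller.

```cpp
typedef enum SystemStateTag
{
    INACTIVE = 0,
    OPERATIONAL,
    FAILURE
} SystemState;

// Buggy check: '=' stores OPERATIONAL (1) in state, and the nonzero
// result converts to true, whatever the state was before.
bool isOperationalBuggy(SystemState& state) {
    if (state = OPERATIONAL) {   // assignment, not comparison
        return true;
    }
    return false;
}

// Intended check, with the constant on the left.
bool isOperational(SystemState state) {
    // if (OPERATIONAL = state) {}   // would not compile: cannot assign to a constant
    if (OPERATIONAL == state) {
        return true;
    }
    return false;
}
```

Calling isOperationalBuggy on a variable holding FAILURE returns true and leaves the variable set to OPERATIONAL, so the bug both takes the wrong branch and destroys the original state. Some compilers can flag the buggy line; GCC and Clang, for example, warn about it under -Wparentheses.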

Modern programming languages like C# and Java impose stricter type checking here and do not implicitly convert an enumeration or integer value to Boolean. Code like Snippet3 will not compile in C# or Java.
Engineers building safety-critical systems especially need an in-depth knowledge of the type system.
