Abhinav Vidwans: C & C++

C and C++
Dennis Ritchie developed C and it was quite popular. An interesting feature in C is the use of functions. The programmer could write a function for checking whether a number is odd or even and store it in a separate file. In the program, whenever it is needed to check for even numbers, the programmer could simply call that function instead of rewriting the whole code. Usually a set of commonly used functions would be stored in a separate file and that file can be included in the current project by simply using the #include syntax. Thus the current program will be able to access all functions available in ‘filename’. Programs written in C were more structured compared to high level languages and another feature was the ability to create your own data types like structures. For instance if you want to create an address book program, you will need to store information like name and telephone number. The name is a string of characters while the telephone number is an integer number. Using structures you can combine both into one unit. Similarly there are many more advantages of using C.
Though C seemed to be ideal, it was not effective when the programs became even more complex (or larger). One of the problems was the use of many functions (developed by various users) which led to a clash of variable names. Though C is much more efficient than BASIC, a new concept called Object Oriented Programming seemed better than C. OOP was the basis of C++ (which was initially called ‘C with classes’). C++ was developed by Bjarne Strastroup. In object oriented programming, the programmer can solve problems by breaking them down into real-life objects (it presented the programmer with an opportunity to mirror real life). What is an object? This topic is dealt with extensively in the chapter on ‘Objects and Classes’ but a brief introduction is provided here.
Consider the category of cars. All cars have some common features (for example all cars have four wheels, an engine, some body colour, seats etc.). Are all cars the same? Of course not. A Fiat and a Ford aren’t the same but they are called as cars in general. In this example cars will form a class and Ford (or Fiat) will be an object.
For those people who know C programming, it would be useful to know the differences between C and C++. Basically C++ includes everything present in C but the use of some C features is deprecated in C++.
• C does not have classes and objects (C does not support OOP)
• Structures in C cannot have functions.
• C does not have namespaces (namespaces are used to avoid name collisions).
• The I/O functions are entirely different in C and C++ (ex: printf( ), scanf( ) etc. are part of the C language).
• You cannot overload a function in C (i.e. you cannot have 2 functions with the same name in C).
• Better dynamic memory management operators are available in C++.
• C does not have reference variables (in C++ reference variables are used in functions).
• In C constants are defined as macros (in C++ we can make use of ‘const’ to declare a constant).
• Inline functions are not available in C.
Let’s recap the evolution of programming languages: initially programs were written in terms of 1s and 0s (machine language). The drawback was that the process was very tedious and highly error-prone. Assembly language was developed to write programs easily (short abbreviations were used instead of 1s and 0s). To make it even simpler for programmers, high level languages were developed (instructions were more similar to regular English). As the complexity of programs increased, these languages were found to be inadequate (because they were unstructured). C was developed but even that was not capable of dealing with complex or larger programs. This led to the development of C++.
Note: Sometimes languages are divided into low level and high level only. In such a classification, C/C++ will come under high level languages.
Why do you need to learn C++?
There are many people who ask the question, "why should I learn C++? What use is it in my field?" It is a well-known fact that computers are used in all areas today. Programming is useful wherever computers are used because it provides you the flexibility of creating a program that suits your requirements. Even research work can be simulated on your computer if you have programming knowledge. Construction and programming may appear to be miles apart but even a civil engineer could use C++ programming. Consider the case of constructing a physical structure (like a pillar) in which the civil engineer has to decide on the diameter of the rods and the number of rods to be used. There are 2 variables in this case:
1. The number of rods needed (let’s denote it as ‘n’) and
2. The diameter of each rod (let’s call it as ‘d’)
The civil engineer might have to make a decision like: "Is it cost-effective for me to have 10 rods of 5cm diameter or is it better to have 8 rods of 6cm diameter?" This is just one of the simple questions he may have in his mind. There are a few related questions: "What is the best combination of number of rods, their diameters and will that combination be able to handle the maximum stress?"
Usually equations are developed for each of the factors involved. It would be much easier if the civil engineer could simply run a program and find out what is the best combination instead of manually trying out random values and arriving at a solution. This is where programming knowledge would benefit the engineer. Any person can write a good program if he has adequate knowledge about the domain (domain refers to the area for which the software is developed. In this case it is construction). Since the civil engineer has the best domain knowledge he would be able to write a program to suit his requirements if he knew programming.
Programming is applicable to almost every field- banking (for maintaining all account details as well as the transactions), educational institutions (for maintaining a database of the students), supermarkets (used for billing), libraries (to locate and search for books), medicine, electrical engineering (programs have been developed for simulating circuits) etc.
Binary Numbering System
________________________________________
The following 2 sections on binary numbering system and memory are optional but recommended. If you’ve taken a course in electronics you probably already know about this and can use the material provided here as a refresher. Having knowledge of the binary system and computer memory will help in understanding some features in programming.
________________________________________
The heart of the computer is the microprocessor, which is also referred to as the processor. The microprocessor is the main part of the computer’s CPU (central processing unit). A processor consists of millions of electronic switches; a switch can be either in ON or OFF state. Thus there are only two distinct states possible in these devices. In our real-world calculations we make use of the decimal system (which has 10 distinct states/numbers: 0 to 9). Counting up to ten is easy for humans but would be quite difficult for computers. It is easier to create a device capable of sensing 2 states than one capable of sensing 10 states (this also helps reduce on errors). Computers make use of the binary system (i.e. they store data in binary format and also perform calculations in binary format). The binary system (binary meaning two) has just two numbers: 1 and 0 (which correspond to the ON and OFF state respectively – this is analogous to a switch which can either be in ON state or in OFF state).
When we have different systems (binary, decimal etc.), there ought to be a way of converting data from one system to the other. In the decimal system the number 247 stands for 7*100 + 4*101 + 2*102 (add it up and the result will be 247; i.e. in each place we can have one of the 10 digits and to find the actual place value we have to multiply the digit by the corresponding power of 10). For example:
247 = (2 x 102) + (4 x 101) + (7 x 100) = 200 + 40 + 7
1258 = (1 x 103) + (2 x 102) + (5 x 101) + (8 x 100) = 1000 + 200 + 50 + 8
Note: In C++ and most other computer languages, * is used as the multiplication operator.
The same concept holds good for a binary number but since only two states are possible, they should be multiplied by powers of 2.
Remember: A binary digit is called a bit.
So, what is the value of 1101? Multiply each position by its corresponding power of 2 (but remember, you have to start from 20 and not from 21). The value for 1101 is 13 as illustrated in the figure below:

An alternate method to obtain the value is illustrated below (but the underlying concept is the same as above):

It is easy to obtain the values which are written above each bit (27=128, 26=64 and so on). Write these values on top and then write the binary number within the squares. To find the equivalent decimal value, add up the values above the square (if the number in the square is 1). If a number is denoted as 1101, then this stands for the lower (or last) four bits of the binary number (the upper bits are set to 0). Hence 1101 will come under the values 8, 4, 2 and 1. Now, wherever there is a 1, just add the value above it (8+4+1=13). Thus 13 is the decimal equivalent of 1101 (in binary format). To distinguish between decimal and binary we usually represent the system used (decimal or binary) by subscripting the base of the system (10 is the base for the decimal system while 2 is the base for the binary system).
Hence (13)10 = (1101)2

Computers store information in the form of bits and 8 bits make a byte. But memory capacity is expressed as multiples of 210 bytes (which is equal to 1024 bytes). 1024 bytes is called a Kilobyte. You may wonder why it is 1024 and not 1000 bytes. The answer lies in the binary system. Keeping uniformity with the binary system, 210=1024 and not 1000 (the idea is to maintain conformity with the binary system).
Beware: The bit in position 7 in fig 1.1 is actually the 8th bit of the number (the numbering of bit starts from 0 and not 1). The bit in the highest position is called as the most significant bit. In an 8 bit number, the 7th bit position is called the most significant bit (MSB) and the 0th bit is known as the least significant bit (or LSB). This is because the MSB in this case has a value of 128 (28) while the LSB has a value of just 1.
________________________________________
Computer Memory
________________________________________
We know that computers can operate only on bits (0s and 1s). Thus any data that has to be processed by the computer should be converted into 0s and 1s.
Let us suppose that we want to create a text file containing a single word “Hello”. This file has to be stored physically in the computer’s memory so that we can read the file anytime in the future. For the time being forget about the file-storage part. Let’s just concentrate on how the word “hello” is stored in the computer’s memory. Computers can only store binary information; so how will the computer know which number corresponds to which alphabet? Obviously we cannot map a single bit to a character. So instead of bits we’ll consider a byte (8 bits). Now we can represent 256 characters. To perform map a character to a byte we’ll need to use some coding mechanism. For this purpose the ASCII (American Standard Code for Information Interchange) is used. In this coding system, every alphabet has an equivalent decimal value. When the computer uses ASCII, it cannot directly use the decimal value and it will convert this into an 8-bit binary number (in other words, into a byte) and store it in memory.
The following table shows part of the ASCII code.
Character Equivalent decimal value Binary value
A 65 0100 0001
B 66 0100 0010
a 97 0110 0001
b 98 0110 0010
In this way each character is mapped to a numeric value. If we type the word hello, then it is converted into bytes (5 bytes- one for each character) based on the ASCII chart and is stored in memory. So ‘hello’ occupies 5 bytes or 40 bits in memory.
Note: It is very important to know the binary system if you want to use the bitwise operators available in C++. The concept of memory is useful while learning pointers.
A question arises, “where are the individual bits stored in memory?” Each individual bit is stored in an electronic device (the electronic device is technically called a flip-flop; which is something like a switch). A single flip-flop can store one bit. Consider the fig. below:

As mentioned earlier we deal in terms of bytes rather than bits. The figure shows a 4-byte memory (which means it can hold 32 bits – each cell can store one bit). All information is stored in memory and the computer needs some method to access this data (i.e. there should be some way of distinguishing between the different memory locations). Memory addresses serve this purpose. Each bit can be individually accessed and has a unique memory address. To access the first byte, the computer will attempt to read the byte stored at memory address 1.
If memories didn’t have addresses then the computer would not know from where it has to read data (or where it has to store data). An analogy to memory address is the postal address system used in real-life. A city will have a number of houses and each house has a unique address. Just imagine the situation if we didn’t have any postal address (we wouldn’t be able to locate any house in the city!). One difference is that a memory address can house only bits and nothing else.
Memory address representations are not as simple as shown above. In the above case we’ve considered a memory that has capacity to store just 4 bytes. Memories usually contain kilobytes or gigabytes of space. As the amount of memory available increases, so does the size of the address (the address number might be in the range of millions). Thus instead of using the decimal system for addresses, the hexadecimal numbering system is used. Hexa means 16 and the distinct numbers in this system are 0 to 9 followed by A, B, C, D, E and F where A= 10 in decimal, B= 11 in decimal and F= 15 in decimal.
Counting in hexadecimal system will be 0…9, A, B…F, 10,11,12,13…19,1A, 1B, 1C…and so on.
It is quite clear that 10 in hexadecimal does not equal 10 in decimal. Instead, (0F)16 = (15)10, (10)16 = (16)10 and (11)16 = (17)10
Four bits form a ‘nibble’. Eight bits form a ‘byte’ and four bytes of memory are known as a ‘word’. There is some significance attached to a ‘word’ which we shall deal with later.
Another term related to memory is ‘register’. A register consists of a set of flip-flops (flip-flops are electronic devices that can store one bit) for storing information. An 8-bit register can store 8 bits. Every processor has a set of internal registers (i.e. these registers are present within the processor and not in an external device like a hard disk). These registers (which are limited in number depending on the processor) are used by the processor for performing its calculations and computations. They usually contain the data which the processor is currently using. The size of the register (i.e. whether the registers will be 16-bit, 32-bit or 64-bit registers also depends on the processor). The common computers today use 32-bit registers (or 4-byte registers). The size of a register determines the ‘word-size’ of a computer. A computer is more comfortable (and prefers) working with data that is of the word-size (if the word-size is 4 bytes then the computer will be efficient in computations involving blocks of 4 byte data). You’ll understand this when we get into data types in the subsequent chapters.
The different types of memory are:
1. Secondary storage (for example the hard disk, floppy disks, magnetic disks, CD-ROM) where one can store information for long periods of time (i.e. data is retained in memory irrespective of whether the system is running or not).
2. The RAM (random access memory) is used by the computer to store data needed by programs that are currently running. RAM is used for the main (primary) memory of the computer (all programs which are executed need to be present in the main memory). The RAM will lose whatever is stored in memory once the computer is switched off.
3. ROM (read only memory): This contains instructions for booting up the system and performing other start-up operations. We cannot write to this memory.
4. The internal registers within the processor- these are used by the computer for performing its internal operations. The compiler will decide what has to be stored in which register when it converts our high-level code into low-level language. As such we won’t be able to use these registers in our C++ code (unless we write assembly code).
Remember: Secondary storage (also called auxiliary memory) is not directly accessible by the CPU. But RAM is directly accessible (thus secondary memory is much slower than the primary memory). For a program to execute it needs to be present in the main memory.
Which memory has lowest access time (or which memory can the CPU access quickly)?
The internal registers can be accessed quickly and the secondary storage devices take much longer to access. Computers also make use of a cache-memory. Cache memory is a high-speed memory which stores recently accessed data (cache access is faster than main memory access).

Related concept: In general cache means storing frequently accessed data temporarily in a place where it can be accessed quickly. Web browsers tend to cache web pages you visit frequently on the hard disk. Generally when we type in a website address, the browser needs to query the website and request for the page; which is a time consuming process. When the browser displays this webpage, it internally caches (stores a copy) this webpage on our hard disk also. The next time we time the same website address, the browser will directly read out from the hard disk rather than query the website (reading from hard disk is faster than accessing a web server).
Remember: Computers store information using the binary system, addresses are represented in the hexadecimal system and in real-life we use the decimal system.
A 3-bit number can be used to form 8 different combinations (including the 000 combination).
Binary Decimal Equivalent
000 0
001 1
010 2
011 3
100 4
101 5
110 6
111 7
If 3 bits are used then the maximum possible decimal number that can be represented is 7 (not 8 because the first number is 0). Similarly if an 8-bit number can be used to represent up to (2^8) different values (0 to 255 in the decimal system).
A binary number can be either signed or unsigned. When a number is unsigned (we don’t bother about the sign), it means that the number is always positive. If a number is signed, then it could be positive or negative. +33 is a signed number (with a positive sign). In real life, we can use + or – to indicate whether a number is positive or negative but in computers only 1s and 0s can be used. Every decimal number has a binary equivalent. The binary equivalent for the decimal number 8 is 00001000. If this is a signed number then +8 would be written as 00001000 and –8 would be denoted by 10001000. Notice the difference between the two. The 7th bit (or the most significant bit) is set to 1 to indicate that the number is negative.
Assume the number 127.
For +127 you will write: 01111111
For –127 you will write: 11111111
For 255 you will write: ?
Well, the value for 255 cannot be written using a signed 8-bit number (because the 7th bit is reserved for the sign). If the unsigned representation was used then 255 can be represented as 11111111 (in this case the MSB signifies a value and is not concerned about the sign of the number).
What is the point to note here? By using a signed representation the maximum value that can be represented is reduced. An 8 bit unsigned binary number can be used to represent values from 0 to 255 (11111111 will mean 255). On the other hand, if the 8 bit binary number is a signed number then it can represent from –127 to +127 only (again a total of 255 values but the maximum value that can be represented is only 127).
Beware: Signed representation in binary format will be explained in detail later. Computers store negative numbers in 2s complement rather than storing them directly as shown above.
Remember: In signed numbers the MSB (in the binary representation) is used to indicate the sign of the number.

Your very first C++ Program
________________________________________
All the example programs in this book have been tested in Turbo C++ compiler (the programs were also tested in Visual C++ 6.0). They should run on other C++ compilers as well.
Let us start with a basic C++ program. This will give you an idea about the general structure of a C++ program. Let’s write a program in which the user can enter a character and the program will display the character that was typed:
// My first C++ program
# include
int main( )
{
char letter;
cout << "Enter any letter" ; cin >>letter;
cout << "The letter you entered is : " <, it will physically include the iostream header file in your source code (i.e. the entire file called iostream.h file will be pasted in your source code). This is one of the standard header files which you’ll use in almost all of your C++ programs.

• int main( ) : Every C++ program has to have one ‘main’ function. Functions will be explained later. Remember that the compiler will execute whatever comes within the ‘main’ function. Hence the instructions that have to be executed should be written within the ‘main’ function. Just as the name implies, ‘main’ is the main part of the C++ program and this is the entry point for a C++ program.
• { : Functions are defined within a block of code. Everything within the opening and closing braces is considered a block of code. ‘main’ is a function and hence we define this function within braces. This is as good as telling the compiler, “The ‘main’ function starts here”.
• char letter; : ‘letter’ is the name of a variable. Instead of the name ‘letter’ we could also use any other name. ‘char’ defines ‘letter’ as a character variable. Therefore, the variable ‘letter’ will accept the value of only one character.
• cout << "Enter any letter" ; : This is followed by the << operator (known as the insertion operator). Following this operator, anything typed within double quotes will be displayed on the screen. Remember that cout and << go together. ‘cout’ is known to the compiler already (since it is defined in the iostream header file which has been included). So when the compiler comes across cout<<, it knows what to do. • cin >>letter;: cin is the opposite of cout. cin>> is used to obtain the value for a variable from the user. (>> is known as the extraction operator). Input is usually obtained from the keyboard.
• return 0; : This statement tells the compiler that the ‘main’ function returns zero to the operating system (return values will be discussed in the chapter on functions).
• }: The closing brace is used to tell the compiler that ‘this is the end of the function’. In our case, it is the end of the ‘main’ function and this indicates the end of the program as well.
You might have noticed in the above program that every statement is terminated with a semi-colon (;). That is exactly the use of the semi-colon. It is used to tell the compiler that one instruction line is over. If you don’t put semi-colons in your program, you will get errors while compiling.
A variable is used for temporary storage of some data in a program. In the above example, ‘letter’ was a variable. Every variable belongs to a particular type (referred to as data type). The different data types in C++ are discussed later. In the above program:
char letter;
declares ‘letter’ as a character variable (a character is one of the basic data types in C++). When the compiler encounters this statement, it will allocate some memory space for this variable (i.e. any value which is assigned to the variable ‘letter’ will be stored in this allocated memory location). When the required memory space has been allocated we say that the variable has been defined (in this case a single statement will declare and define the variable ‘letter’).
________________________________________
Some points to remember:
When we want to display something on the screen we will code it as:
cout<>variable-name;
Beware of the direction of the >> and << operators. In the statement cout<>variable;
information (the value of the variable) flows from ‘cin’ (which would be the keyboard) into the variable (i.e. the value is stored in the variable). Information will flow in the direction of the arrows (<< or >>). ‘cout’ is linked to the standard display device (i.e. the monitor) while ‘cin’ is linked to the standard input device (i.e. the keyboard). Hence, you can’t use
cout>>variable;
This will cause an error while compiling. ‘cout’ and ‘cin’ are already pre-defined and so you can use them directly in your programs. 'iostream.h’ is a header file that is used to perform basic input and output operation (or general I/O like using ‘cin’ and ‘cout’).
Remember: C++ is a case-sensitive language, which means that the compiler will not consider ‘letter’, ‘LETTER’ and ‘Letter’ as the same.
How to run your first program in the compiler?
Saving and compiling the program:
There are many C++ compilers available in the market. Some of them are freeware (meaning that they are free to use) while others have to be paid for. Turbo C++ compiler might be the simplest to use (but it is not freeware). Simply choose "New File" and type out the program coding. Suppose you are using some other compiler, click on File and choose "New". Some compilers may have the option of creating a C++ source file while other compilers may require a new project to be created. Whatever the method you will ultimately come to the step of creating a C++ source file. After typing the code in the compiler, save the file by giving it some name. The "Save As" option will appear under the "File" menu. Give a name (for example: first). Now the file is saved as first.cpp. All C++ source files are saved in *.cpp format. Just like *.doc represents a Word document file, *.cpp denotes a C++ (C Plus Plus) source file.
In the compiler program there will be an option called ‘Compile’ in the menu bar. Select the compile option and the compiler will do its work. It will compile the program (or in other words, it will read whatever has been typed) and in case there are any errors, the compiler will point out the line where the error was detected. Check whether the program has been typed exactly as given earlier. Even if a semi-colon is missing, it will lead to errors. If the compiler says no errors (or if the message "compiled successfully" appears), then you can go to the next stage: building the *.exe file.
*.exe file extension stands for executable files. A *.cpp file cannot be run directly on the computer. This has to be converted into a *.exe file and to do so select the "Make or Build exe" option in the compiler. The file ‘first.exe’ will be created. Now the program can be executed from the DOS prompt by typing ‘first’. But instead of running the program every time from DOS, it will be convenient to run the program from the compiler itself and check the output. In the compiler there will be another option called "Run". Just click on this and the program will run from the compiler itself (you needn’t switch back and forth between DOS and the compiler screen).
The figure should give you a rough idea as to how the executable file is created from the C++ source code.

The source code is the program that you type. Source code is converted into object code by the compiler. The linker will link your object code with other object codes. This process of creating a C++ program is discussed in the last chapter.
Modification that might be needed in first.cpp (depending on your compiler):
A little problem might be encountered while running your program from the compiler (this problem will exist in Turbo C++ compiler). While running the program from the DOS prompt, this problem will not occur.
What’s the problem? When first.cpp is executed from the compiler the program will ask the user to enter a character. Once a character has been entered, the program will return to the compiler screen. You won't see your output! It might appear as if there is some problem with the program. What happens is that the program displays the character on the screen, immediately terminates the program and returns to the compiler screen. This happens so fast that you can’t see the output being displayed.
Modify your program as shown below (if you are using Turbo C++):
// Your first program modified: first.cpp
# include
# include
int main( )
{
char letter;
cout<< "Enter any letter" ; cin>>letter;
cout<< "The letter you entered is : " <>letter;
at the end of the program just before return 0;
The program flow will be the same as described earlier.
The latest compilers, like VC++ (Microsoft Visual C++ compiler) do not have any of the above problems even if the program is run from the compiler. VC++ will always ask the user to press a character to terminate the program.
Another alternative is to use a function called ‘system’, which is defined, in the header file: stdlib.h.
• system("PAUSE");
can be used to pause the program (i.e. execution of the program will continue only when the user presses a key).
• system("CLS");
can be used to clear the display screen. To use these two functions you have to type #include header file in your source code. The system( ) function actually executes a DOS command. (Try giving the commands ‘cls’ and ‘pause’ in your DOS prompt).

Abhinav Vidwans

Tuesday, July 20, 2010

C & C++

Labels

Blog Archive