A data type is defined as a set of values that a variable can store, along with the set of operations that can be performed on those values. It specifies the kind of data that is to be used in a program.
Fig. Data Type
There are 2 kinds of data types: primitive (simple) data types and data structures.
Primitive/Simple data type
Simple data types are classified into 4 types:
Fig: Types of Data Type
Integer: Integer is the basic data type which contains whole numbers (positive, negative and zero) like 2, -5, etc. The range of values for an integer differs from one programming language to another. An integer cannot have a fractional part or an exponent.
Real: The data type which contains any numeric representation is called the real data type. The value may be signed or unsigned, and may have a fractional part or an exponent. Example: 0, 0.5, 4.7234, 3.0e-4, etc. are real numbers.
Character: The data type which contains any printable alphanumeric character, plus other special characters like #, @, %, etc., is called the character data type. Example: Alphabets: a to z/ A to Z Numbers: 0 to 9
Boolean (logical): The data type which takes only one of two possible values at any one time is called the Boolean data type. The two values are true and false. It is useful for checking conditions inside a program. Boolean values are combined using the logical operators AND, OR and NOT.
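The four simple data types can be tried out in a few lines of Python. This is only a sketch: the variable names are made up for illustration, and Python has no separate character type, so a one-character string stands in for it.

```python
# Integer: a whole number with no fractional part.
count = -5

# Real: Python's float holds fractional and exponential values.
ratio = 3.0e-4

# Character: assumed here to be a string of length 1,
# because Python has no dedicated character type.
symbol = "#"

# Boolean: exactly one of True or False at any one time.
is_positive = count > 0

# The logical operators AND, OR and NOT combine Boolean values.
both = is_positive and (ratio < 1)
either = is_positive or (ratio < 1)
negated = not is_positive

print(type(count).__name__, type(ratio).__name__,
      type(symbol).__name__, type(is_positive).__name__)
print(both, either, negated)
```

Other languages, such as C, give each of these types a fixed size, which is why the range of an integer differs from one language to another.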
A data structure is an organized group of data items which is treated as a unit. It may be regarded as a composite data type built from simpler data types. The main data structures are listed below:
Array: An array is a sequential collection of storage locations holding data of the same type, such as integer, real or character. Arrays commonly have 1 to 3 dimensions.
String: A string is an array of characters. Strings are used in programming for storing and manipulating text such as words, names and sentences. A sequence of characters inside double quotation marks, like “welcome”, “056-2345”, etc., is called string data. In programming, there are various string processing operations such as string comparison, string sorting, etc.
Tree: It is a hierarchical data structure where each element, known as a node, contains a data part and pointers (addresses) to other nodes. The topmost node, which is not pointed to by any other node, is called the Root node. A node which doesn’t point to any other node is known as a Leaf node. If each node can point to at most two nodes, then such a tree is known as a Binary tree.
Linked list: A linked list is a data structure composed of elements, each containing data and a pointer to the next element of the same type. In other words, a linked list is a chain of elements where one element points to another. In the representation described here, the first (header) node does not contain information. It only contains the address of another node. The last node stores information but points to NULL. If, instead, the last pointer points to the first element of the list, the list becomes a circular linked list. The info field contains the actual element of the list. The next address field contains the address of the next node in the list.
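The linked list and the binary tree described above can be sketched in Python as follows. The class and function names (ListNode, TreeNode, build_list and so on) are hypothetical and chosen only for this illustration. None plays the role of NULL, and a plain variable pointing at the first node is assumed in place of a separate header node.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ListNode:
    info: int                           # info field: the actual element
    next: Optional["ListNode"] = None   # next address field; None acts as NULL


def build_list(values):
    """Chain the values into a singly linked list and return its first node."""
    head = None
    for value in reversed(values):
        head = ListNode(value, head)
    return head


def list_items(head):
    """Follow the next pointers until None (NULL) is reached."""
    items = []
    node = head
    while node is not None:
        items.append(node.info)
        node = node.next
    return items


@dataclass
class TreeNode:
    data: str
    left: Optional["TreeNode"] = None    # a binary tree node points to
    right: Optional["TreeNode"] = None   # at most two other nodes

    def is_leaf(self):
        """A leaf node points to no other node."""
        return self.left is None and self.right is None


def count_leaves(node):
    """Count the leaf nodes in the tree rooted at node."""
    if node is None:
        return 0
    if node.is_leaf():
        return 1
    return count_leaves(node.left) + count_leaves(node.right)


chain = build_list([10, 20, 30])
root = TreeNode("A", TreeNode("B"), TreeNode("C", TreeNode("D")))

print(list_items(chain))    # [10, 20, 30]
print(count_leaves(root))   # 2 (the leaves are B and D)
```

Setting the next field of the last node to the first node, instead of None, would turn this chain into a circular list. The loop in list_items would then need a different stopping condition.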
Differences between Array and Structure:
An array is a sequential collection of storage locations holding data of the same type, such as integer, real or character.
Structure is a collection of heterogeneous data (different types of data) under the same name.
Arrays can be one-dimensional, two-dimensional or three-dimensional.
Structure is only of one dimension.
In an array, we can store any number of data items of the same type using only one variable.
In a structure, we have to use a different member variable for each data item, and the members can be of different types.
Array cannot store varying data types.
Structure stores varying data types.
Example: If we construct an array A of integers, we can store only integer data in it. We cannot put integer and character data in a single array.
Example: In a structure, we can put different data types under the same name.
An array can hold structures only as elements that are all of one single structure type.
In a structure, we can easily include an array as a member.
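The contrast can be illustrated with a short Python sketch. It assumes that the standard array module stands in for a typed array and that a dataclass stands in for a structure. The Student record and its field values are invented for the example.

```python
from array import array
from dataclasses import dataclass, field

# An array of integers: the type code "i" admits signed integers only.
marks = array("i", [72, 85, 90])

try:
    marks.append("A")        # a character cannot go into an integer array
    mixed_allowed = True
except TypeError:
    mixed_allowed = False


# A structure groups members of different types under one name,
# and one of its members can itself be an array.
@dataclass
class Student:
    name: str
    roll_no: int
    marks: array = field(default_factory=lambda: array("i"))


student = Student("Sita", 12, marks)

print(mixed_allowed)                      # False
print(student.name, student.roll_no)      # Sita 12
print(list(student.marks))                # [72, 85, 90]
```

Python's built-in list is deliberately not used for the array here, because a list accepts mixed types and would hide the restriction being shown.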
A computer can only understand binary numbers, which take the form of two electronic states, i.e. high voltage and low voltage. Standard codes have been built on this binary notation so that data such as numbers and text can be represented in a form convenient for users. Some of the popular codes are:
Absolute Binary (pure binary): In the absolute binary method, 0 is placed before the binary number to represent a positive number and 1 is placed before the binary number to represent a negative number. The most significant bit of the binary number is the sign bit and the remaining bits represent the magnitude of the number. The binary number is expressed in an 8-, 16-, 32- or 64-bit format, etc.
Fig: pure binary clock with Arduino Source:www.instructables.com
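The sign-bit scheme described for absolute binary can be sketched in Python as follows. The function names are hypothetical, and an 8-bit width is assumed as the default.

```python
def to_sign_magnitude(value, bits=8):
    """Return value as a bit string: one sign bit followed by the magnitude."""
    magnitude = abs(value)
    if magnitude >= 2 ** (bits - 1):
        raise ValueError(f"{value} does not fit in {bits} bits")
    sign = "1" if value < 0 else "0"
    return sign + format(magnitude, f"0{bits - 1}b")


def from_sign_magnitude(text):
    """Decode a bit string whose first character is the sign bit."""
    magnitude = int(text[1:], 2)
    return -magnitude if text[0] == "1" else magnitude


print(to_sign_magnitude(5))               # 00000101
print(to_sign_magnitude(-5))              # 10000101
print(from_sign_magnitude("10000101"))    # -5
```

With 8 bits, one bit is spent on the sign, so the largest magnitude that fits is 127.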
BCD (Binary Coded Decimal): It is a simple system for converting decimal numbers into a binary form, where each decimal digit is converted separately into binary and the resulting groups are written with a space between them. In BCD, each decimal digit occupies 4 bits. For example, the decimal number 24 can be represented in BCD as (0010 0100)2.
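The digit-by-digit conversion used in BCD can be sketched in Python as follows. The function names to_bcd and from_bcd are made up for this illustration, and only non-negative whole numbers are handled.

```python
def to_bcd(number):
    """Encode each decimal digit of number as its own 4-bit group."""
    if number < 0:
        raise ValueError("only non-negative numbers are handled in this sketch")
    return " ".join(format(int(digit), "04b") for digit in str(number))


def from_bcd(text):
    """Decode space-separated 4-bit groups back into a decimal number."""
    return int("".join(str(int(group, 2)) for group in text.split()))


print(to_bcd(24))              # 0010 0100
print(format(24, "b"))         # 11000 (ordinary binary, for comparison)
print(from_bcd("0010 0100"))   # 24
```

The comparison line shows that BCD is not the same as ordinary binary conversion: the number is never converted as a whole, only one digit at a time.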
ASCII (American Standard Code for Information Interchange): ASCII is a standard coding system that assigns numeric values to letters, numbers, punctuation marks and control characters to achieve compatibility between different hardware and peripherals. ASCII was first published as a standard in 1963, with a widely used revision in 1968, and comes in 2 sets: Standard ASCII (7-bit code, 128 characters) and Extended ASCII (8-bit code, 256 characters). Many systems use 8-bit extended ASCII to represent foreign-language characters and other graphical symbols.
Fig: ANSI Extended ASCII Source:www.cplusplus.com
In extended ASCII, each character is represented by a unique integer value from 0 to 255. The values 0 to 31, and also 127, are used for non-printing control characters, and the range from 32 to 126 is used to represent the letters of the alphabet, digits and common punctuation symbols. For example, the ASCII code for the capital letter A is 65, for * it is 42, etc. Since extended ASCII uses 8 bits, each character represented in it occupies 1 byte of storage space in a computer.
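The character codes mentioned above can be checked in Python with the built-in ord and chr functions. This is a small sketch, and the variable names are chosen only for illustration.

```python
code_a = ord("A")        # character to numeric code
code_star = ord("*")
letter = chr(65)         # numeric code back to character

# The same code written as 8 binary digits, i.e. one byte.
bits_a = format(code_a, "08b")

# Encoding a single ASCII character produces exactly one byte.
size_in_bytes = len("A".encode("ascii"))

print(code_a, code_star, letter)   # 65 42 A
print(bits_a)                      # 01000001
print(size_in_bytes)               # 1
```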
EBCDIC (Extended Binary Coded Decimal Interchange Code): It is an 8-bit code system which is commonly used on large IBM mainframe computers, most IBM minicomputers and computers from many other manufacturers. It allows 256 characters to be represented in computers. In this code, placement of the letters of the alphabet is discontinuous and there is no direct character to character match when converting from EBCDIC to ASCII and vice versa.
Fig: EBCDIC Format Source:shieldsdesignllc.com
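The discontinuous placement of letters in EBCDIC can be observed with Python's standard cp037 codec. EBCDIC exists in many code pages, and cp037 is assumed here only as one common example. The helper name ebcdic_code is hypothetical.

```python
def ebcdic_code(ch):
    """Return the numeric code of one character in the cp037 EBCDIC code page."""
    return ch.encode("cp037")[0]


# In ASCII the letters I and J are neighbours, but in EBCDIC they are not.
gap_ascii = ord("J") - ord("I")
gap_ebcdic = ebcdic_code("J") - ebcdic_code("I")

print(hex(ord("A")), hex(ebcdic_code("A")))   # 0x41 0xc1
print(gap_ascii, gap_ebcdic)                  # 1 8

# Converting between the two codes needs a lookup table, not a fixed offset.
as_ebcdic = "HELLO".encode("cp037")
print(as_ebcdic.decode("cp037"))              # HELLO
```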
Unicode: It is a character code defined by the Unicode Consortium and the International Organization for Standardization (ISO). It was originally designed as a 16-bit code that supports up to 65,536 characters. It allows the characters and symbols of the world's languages to be represented by a single code. For example, the Chinese language has almost 10,000 characters, far more than an 8-bit code can represent. Wide adoption of Unicode makes multilingual software much easier to write and maintain.
Fig: Unicode in Python Source: ian-albert.com
In its original 16-bit form, each character represented in Unicode occupies 2 bytes of storage space in the computer. This coding system was developed to overcome the drawback of extended ASCII, which supports only 256 different characters. That is sufficient for the English language but not for languages like Chinese, Japanese, etc., which have more than 256 characters. The Unicode standard now goes beyond 16 bits, and its encodings use up to 4 bytes (32 bits) per character.
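The storage needed for a Unicode character depends on the encoding used, which a short Python sketch can show. The sample characters and the helper name sizes are chosen only for illustration. The encodings compared are UTF-8, UTF-16 and UTF-32.

```python
samples = {
    "latin A": "A",
    "e with acute": "\u00e9",
    "Chinese zhong": "\u4e2d",
    "emoji": "\U0001F600",
}


def sizes(ch):
    """Return (code point, bytes in UTF-8, bytes in UTF-16, bytes in UTF-32)."""
    return (
        ord(ch),
        len(ch.encode("utf-8")),
        len(ch.encode("utf-16-le")),   # the -le variants add no byte-order mark
        len(ch.encode("utf-32-le")),
    )


for name, ch in samples.items():
    code_point, utf8, utf16, utf32 = sizes(ch)
    print(f"{name}: U+{code_point:04X} utf-8={utf8} utf-16={utf16} utf-32={utf32}")
```

The Chinese character fits in 2 bytes in UTF-16, matching the 16-bit description, while the emoji lies beyond the 16-bit range and needs 4 bytes in every encoding shown.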
(Shrestha & Karn, 2015)
Shrestha, R. K., & Karn, M. K. (2015). Computer Science I. Anamnagar, Kathmandu: Buddha Publication.
Adhikari, D., et al. Computer Science-XI. Kathmandu: Asia Publication Pvt. Ltd.
Introduction to data types:
Simple data types: Integer, Real, Character and Boolean.
Data structures: Array, Linked list, Tree and String.