An Introductory Tutorial On Data Structures In C++.
“Data structure can be defined as an organized collection of data that helps a program to access data efficiently and rapidly so that the entire program can function in an efficient manner. “
We know that in the programming world, data is the center and everything revolves around data. We need to do all data operations including storing, searching, sorting, organizing, and accessing data efficiently and only then our program can succeed.
What You Will Learn:
We need to find the most efficient way of storing data that can help us to build dynamic solutions. Data structure helps us in building such solutions.
While organizing or arranging data into structures, we need to ensure that the arrangement represents nearly a real-world object. Secondly, this arrangement should be simple enough so that anyone can access it easily and process it whenever required.
In this series, we will learn in detail about basic as well as an advanced data structure. We will also learn in detail about various searching and sorting techniques that can be performed on data structures.
After learning this tutorial series, the reader should become well acquainted with each data structure and its programming.
Let us go through some of the terms that we use while dealing with data structures:
For Example, take a particular student. A student can have the following details as represented pictorially.
- Data: It is the elementary value. In the above figure, student Roll No. can be data.
- Group item: This is the data item that has more than one sub-items. In the above figure, Student_name has First Name and Last Name.
- Record: It is a collection of data items. In the above example, data items like student Roll No., Name, Class, Age, Grade, etc. form a record together.
- Entity: It is a class of records. In the above diagram, the student is an entity.
- Attribute or field: Properties of an entity are called attributes and each field represents an attribute.
- File: A file is a collection of records. In the above example, a student entity can have thousands of records. Thus a file will contain all these records.
The reader should be aware of all these terms as we use them every now and then when using various data structures.
Data structures are the main building block of the program and as programmers, we should be careful about which data structure to use. The exact data structure to be used is the toughest decision to make as far as programming is concerned.
Let us discuss the need for data structure in Programming.
Need For Data Structure In Programming
As the amount of data continues to grow, the applications become more and more complex, hence it becomes difficult for the programmer to manage this data as well as the software.
Typically, at any time, the application may face the following hurdles:
#1) Searching Large amounts of Data: With a large amount of data being processed and stored, at any given time our program may be required to search a particular data. If the data is too large and not organized properly, it will take a lot of time to get the required data.
When we use data structures to store and organize data, the retrieval of data becomes faster and easier.
#2) Speed of Processing: Disorganized data may result in slow processing speed as a lot of time will be wasted in retrieving and accessing data.
If we organize the data properly in a data structure while storing, then we will not waste time in activities like retrieving, organizing it every time. Instead, we can concentrate on the processing of data to produce the desired output.
#3) Multiple Simultaneous Requests: Many applications these days need to make a simultaneous request to data. These requests should be processed efficiently for applications to run smoothly.
If our data is stored just randomly, then we will not be able to process all the concurrent requests simultaneously. So it’s a wise decision to arrange data in a proper data structure so as to minimize the concurrent requests turnaround time.
Data Structure Classification
Data structures used in C++ can be classified as follows.
A data structure is a way of organizing the data. So we can classify data structures as shown into primitive or standard data structures and non-primitive or user-defined data structures.
We have seen all the data types supported in C++. As this is also a way of organizing data, we say it’s a standard data structure.
The other data structures are non-primitive and the user has to define them before using them in a program. These user-defined data structures are further classified into linear and non-linear data structures.
Linear Data Structure
Linear data structures have all their elements arranged in a linear or sequential fashion. Each element in a linear data structure has a predecessor (previous element) and a successor (next element)
Linear data structures are further divided into static and dynamic data structures. Static data structures usually have a fixed size and once their size is declared at compile time it cannot be changed. Dynamic data structures can change their size dynamically and accommodate themselves.
The most popular example of linear static data structure is an array.
An array is a sequential collection of elements of the same type. Each element of the array can be accessed using its position in the array called an index or subscript of the array. The name of the array points to the first element in the array.
The above shown is an array ‘a’ of n elements. The elements are numbered from 0 to n-1. The size of the array (n in this case) is also called the dimension of the array. As shown in the above figure, the name of the array points to the first element of the array.
The array is the simplest data structure and is efficient as elements can be accessed using subscripts directly. If we want to access the third element in the array then we just have to say a.
But adding or deleting the array elements is difficult. Hence we use arrays only in simple applications or in applications where addition/deletion of elements is not required.
Popular linear dynamic data structures are linked list, stack, and queue.
A linked list is a collection of nodes. Each node contains the data element and a pointer to the next node. Nodes can be added and deleted dynamically. A linked list can be a singly linked list in which each node has a pointer to the next element only. For the last element, the next pointer is set to null.
In the doubly linked list, each node has two pointers one to the previous node and the second one to the next node. For the first node, the previous pointer is null and for the last node, the next pointer is null.
As shown in the figure above, the beginning of the list is called the head while the end of the linked list is called the tail. As shown above, each node has a pointer to the next element. We can easily add or delete elements by changing the pointer to the next node.
Stack is a linear data structure in which the elements can be added or removed only from one end known as “Top” of the stack. In this way, stack exhibits LIFO (Last In, First Out) type of memory access.
As shown above, elements in the stack are always added at one end and also removed from the same end. This is called the “Top” of the stack. When an element is added, it is pushed down the stack and top of the stack is incremented by one position.
Similarly, when an element is removed, the top of the stack is decremented. When a stack is empty, the top of the stack is set to -1. There are two main operations “Push” and “Pop” that are performed on the stack.
The queue is yet another linear data structure in which the elements are added at one end called “rear” and deleted from another end called “front”. Queue demonstrates FIFO (First In, First Out) the type of memory access methodology.
The above diagram shows a queue with rear and front ends. When the queue is empty, the rear and front pointers coincide with each other.
Non-linear Data Structure
In non-linear data structures, data is not arranged sequentially, instead, it is arranged in a non-linear fashion. Elements are connected to each other in a non-linear arrangement.
Non-linear data structures are Trees and Graphs.
Trees are non-linear multilevel data structures having a hierarchical relationship between the elements. Elements of the tree are called Nodes.
The node at the top is called the root of the tree. The root can have one or more child nodes. The subsequent nodes can also have one or more child nodes. The nodes that do not have child nodes are called leaf nodes.
In the above diagram, we have shown a tree with 6 nodes. Out of these three nodes are the leaf nodes, one topmost node is the root and the others are child nodes. Depending on the number of nodes, child nodes, etc. or the relationship between the nodes, we have different types of trees.
The graph is a set of nodes called vertices connected to each other by means of the links called Edges. Graphs can have a cycle inside it i.e. the same vertex can be a starting point as well as the end point of a particular path. Trees can never have a cycle.
The above diagram is an undirected graph. We can also have directed graphs where we represent the edges using directed arrows.
Operations On Data Structure
All the data structures perform various operations on its elements.
These are common to all data structures and are listed as follows:
- Searching: This operation is performed to search for a particular element or a key. The most common searching algorithms are sequential/linear search and binary search.
- Sorting: Sorting operation involves arranging the elements in a data structure in a particular order either ascending or descending. There are various sorting algorithms that are available for data structures. Most popular among them are Quicksort, Selection sort, Merge sort, etc.
- Insertion: Insertion operation deals with adding an element to the data structure. This is the most important operation and as a result of the addition of an element, the arrangement changes and we need to take care that the data structure remains intact.
- Deletion: Deletion operation removes an element from the data structure. The same conditions that are to be considered for insertion are to be fulfilled in case of the deletion operation as well.
- Traversing: We say that we traverse a data structure when we visit each and every element in the structure. Traversing is required to carry out certain specific operations on the data structure.
In our subsequent topics, we will first learn the various searching and sorting techniques to be performed on data structures.
Advantages Of Data Structure
- Abstraction: Data structures are often implemented as abstract data types. The users only access its outer interface without worrying about the underlying implementation. Thus data structure provides a layer of abstraction.
- Efficiency: Proper organization of data results in efficient access of data thereby making programs more efficient. Secondly, we can select the proper data structure depending on our requirements.
- Reusability: We can reuse the data structures that we have designed. They can be compiled into a library as well and distributed to the client.
With this, we conclude this tutorial on introduction to data structures. We have introduced each of the data structures in brief in this tutorial.
In our subsequent tutorials, we will explore more about data structures along with the various searching and sorting techniques.