Computer Science: Data Structures

Welcome to Computer Science: Data Structures!

This course is intended for any intermediate level computer science students at UNT. It will provide an overview of the four most common data structures: arrayslinked listbinary search trees, and graphs. After learning the basic theory and design of these data structures you will be quizzed test your knowledge. Getting familiar and practiced working with these data structure will help students succeed in their early and mid level computer science courses, because they will be expected to implement these structures for various programming assignments.

Good luck with the course!

 

Data Structures Overview

What Are Data Structures?

Overview

Data structures are a "way in which data are stored for efficient search and retrieval. Different data structures are suited for different problems. Some data structures are useful for simple general problems, such as retrieving data that has been stored with a specific identifier. For example, an online dictionary can be structured so that it can retrieve the definition of a word. On the other hand, specialized data structures have been devised to solve complex specific search problems." (Britannica)

Up until know you might have only gotten familiar with using fundamental data types of your prefered programming language. These would include integers, floats, or characters. Although these fundamental data types serve an important role they are very limited in their ability to store or manipulate large sets of data. For example, if you wanted to write a program that could store and sort 100 different numbers you would need to create and name 100 different integer variables. This would be very time consuming, unwieldy and impractical. Instead we could create a structure that is used to store and manipulate an entire set of identical fundamental data types. How the data is organized in our structure and what types of operations we can perform on it will define what type of data structure we have created. 

Simple Data Structure Example

Without a data structure, each individual integer is stored separately in memory and it is left of to the programmer to keep track of their order. With a list every integer in the list is associated with another creating one combined entity, and we can store them in any order we choose. In this case we have stored the integers in numerical order, which will make finding any value in our list faster and simpler. 

Other Uses for Data Structures

For most programs that need to store, sort, and access large sets of homogeneous data, a data structure serves as an ideal tool. Some example scenarios are provided below.

  • A game needs to store and sort the user names and score values for it's players. (array or linked list)
  • The city wants a digital simulation of it's road network. (graph)
  • A calculator application needs to create and execute basic math operations for quickly and often. (binary search trees)

Data Structures: Arrays

Understanding Arrays

Definition 

An array is a linear homogeneous collection of data. The structure is stored in a continuous block of memory, which allows the computer to access any item in an array almost instantly. Items in an array are called elements. The size of an array defines how many elements an array has. How arrays are stored in memory also implies a natural ordering for each element, referred to as an index. Indexes are normally represented by only positive integers, including 0. The value of an index relates to its location in the array.

Fixed Array

A fixed array, as the name implies, has a fixed size that is defined when the array is first created. Once the size is declared it can not be changed. Attempting to access an index that exceeds the size of the fixed array will result in a memory access error.

Dynamic Array

A dynamic array does not have a fixed sized, it instead is able to change in size by growing or shrinking. This is accomplished by incorporating dynamic memory when declaring a standard fixed array. Depending on the nature of the dynamic array using indexes that are greater than its size might still result in a memory access error, so be sure to check how the dynamic array behaves.

Fixed Array Example (Above)

Arrays can be used to store various types of data, for example the above fixed array stores characters but this array cannot store any other type of data because arrays can only store homogenous data types.

Array Knowledge Check #1

Complete the sentence using the new keywords and facts you have learned about arrays. If you get stuck it's okay to go back and review again.

1.  An array is a  collection of data that is typically stored in a  block of memory.

Array Knowledge Check #2

Complete the sentence using the new keywords and facts you have learned about arrays. If you get stuck it's okay to go back and review again.

2.  Items in an array are called .

Array Knowledge Check #3

Complete the sentence using the new keywords and facts you have learned about arrays. If you get stuck it's okay to go back and review again.

3.  An index can be only a .

Array Knowledge Check #4

Complete the sentence using the new keywords and facts you have learned about arrays. If you get stuck it's okay to go back and review again.

4.  The index of an array starts counting from .

Array Knowledge Check #5

Complete the sentence using the new keywords and facts you have learned about arrays. If you get stuck it's okay to go back and review again.

5.  A  array cannot change size once it is made.

Array Knowledge Check #6

Complete the sentence using the new keywords and facts you have learned about arrays. If you get stuck it's okay to go back and review again.

6.  A  array can change size once it is made.

Array Keywords Review

Keywords and Definitions 

  • Array - A homogeneous collection of data that is stored in a continuous block of memory.
  • Element - Items (or data) stored in an array.
  • Index - The location of an element in an array, can only be a positive value.
  • Size - The total number of elements that can be stored in a particular array.
  • Fixed Array - Arrays that have a fixed size.
  • Dynamic Array - Arrays that can grow or shrink in size.

Data Structures: Linked Lists

Understanding Linked Lists

Definition

A linked list is a linear data structure (like arrays) but each element is a separate object in memory rather than continuous. In linked list, an element is instead called a node. Nodes comprise of two items, the data type being stored and a reference to the next node. A reference is the address in memory where the next node in the list is stored. Commonly these references are stored in a data type called pointers. Linked list are generally considered to be dynamic since they have no fixed size. We can add and remove nodes as much as we want. However, linked list tend to be slower to access from memory than arrays because they are not stored in a continuous block of memory.

Movement

Moving between nodes in a linked list requires there to a be a reference to follow. Without a reference we have no way to know where in memory a node is stored. Essentially, references serve as one way roads in our list. Building a linked list with more than one reference can provide unique advantages at the cost of more memory. 

Singly Linked List

In this type of linked list, every node stores a single reference (or address) of only the next node in the list. Although simpler to build that a doubly linked list, it is impossible to move backwards in the list since there are only references to deeper nodes.

Doubly Linked List

In this type of linked list, every node stores two references. One of the references points to the next node and the other to the previous node. The advantage of storing two references is that we can move in both directions in our list without much difficulty.

Head and Tail

Since linked list are a chain of separate points in memory, we need a way to keep track of the first node in our list. This is accomplished by maintaining a special reference called the head reference (or head pointer). The head reference always stores the address of the first node of our list, if the first node ever changes our head reference changes with it.

Another common special reference used with linked list is the tail reference (or tail pointer). The tail reference works exactly like the head reference, but instead of tracking the first node it tracks the last node of our list.

Insertion

Linked list do not use indexes like arrays, so inserting data is not a straightforward task. There are two main methods. One method uses just a head pointer, the other requires both a head and tail pointer. The inclusion of a tail pointer increase efficiency primarily with doubly linked list.

Insertion With Head Reference

Check the data at the head reference against the insertion data.
  1. If the node is null, add a new node with the data here.
  2. If the node has a smaller value than our insertion data, move to the next node in the list.
  3. If the node has a greater value than our insertion data, create a new node and insert it before the current node.
    • When inserting into the middle of a list, we must reassign the references of the surrounding nodes.
  4. Repeat steps 1-3 until the data is inserted. 

Insertion With Head & Tail Reference

Check the data at the tail reference against the insertion data.

  1. If the node is null, no list exist. Add a new node with the data here.
  2. If the node has a smaller value than our insertion data, insert a new node with the data after the tail node. The new node becomes the tail reference.

Check the data at the head reference against the insertion data.

  1. If the node is null, add a new node with the data here.
  2. If the node has a smaller value than our insertion data, move to the next node in the list.
  3. If the node has a greater value than our insertion data, create a new node and insert it before the current node.
    • When inserting into the middle of a list, we must reassign the references of the surrounding nodes.
  4. Repeat steps 1-3 until the data is inserted.

Linked List Knowledge Check #1

Complete the sentence using the new keywords and facts you have learned about linked list. If you get stuck it's okay to go back and review again.

1.  Items in a linked list are called  and they are stored  in memory.

Linked List Knowledge Check #2

Complete the sentence using the new keywords and facts you have learned about linked list. If you get stuck it's okay to go back and review again.

2.  Nodes store data and a  to the next node in the list.

Linked List Knowledge Check #3

Complete the sentence using the new keywords and facts you have learned about linked list. If you get stuck it's okay to go back and review again.

3.  References are  in memory of data is stored.

Linked List Knowledge Check #4

Complete the sentence using the new keywords and facts you have learned about linked list. If you get stuck it's okay to go back and review again.

4.  Linked list  have a fixed size.

Linked List Knowledge Check #5

Complete the sentence using the new keywords and facts you have learned about linked list. If you get stuck it's okay to go back and review again.

5.  Singly linked list have  reference, while doubly linked list have .

Linked List Knowledge Check #6

Complete the sentence using the new keywords and facts you have learned about linked list. If you get stuck it's okay to go back and review again.

6.  The  reference stores the address of the first node, and the  reference stores the address of the last node.

Linked List Keywords Review

Keywords and Definitions

  • Linked list- A homogeneous collection of data that is stored in a separate blocks of memory called nodes.
  • Nodes - The individual items (or data) that comprise a linked list.
  • Reference -An address in memory. Used to link any node to its next node of a linked list.
  • Singly Linked List - A linked list that only uses one reference per node. Allows for forward movement, but no backward movement.
  • Doubly Linked List - A linked list that uses two references per node. Allows for both forward and backward movement.
  • Head Reference - A special reference (or pointer) that always tracks the first item of the linked list.
  • Tail Reference - A optional special reference (or pointer) that always tracks the last item of a linked list.

Data Structures: Binary Search Trees

Understanding Binary Search Trees

Definition (Tree Data Structures)

A tree is a hierarchical data structure for storing data. Data is stored in nodes (similarly to a linked list) but each node contains 0 or more references to others nodes below it in the hierarchy. The top most node in the hierarchy is called the root node. Nodes within the tree can have both child nodes and parent nodes. Parent nodes have references to any child nodes associated to it. Child nodes hold references to their parent nodes, as well as any children they might have. Any nodes with no child nodes associated to them are called leaf nodes. There are many types of tree data structures in computer science, but we will focus on a very common and fundamental one called a binary search tree.

Binary Search Trees

A binary search tree is a special type of tree that follows a few important rules.

  • Nodes can only have a max of 2 child nodes, which are referred to as the left child and right child.
  • The left and right child will follow a greater than and less than logic defined by the programmer. For example, if a left child is always less than its parent node then the right child would always be greater than its parent node.
  • OPTIONAL: Child nodes can have references to their parent nodes, but this is not required. Adding parent references has a similar effect to making a doubly linked list, making it possible to move back up a tree rather than only down.

By following the two key rules above, binary search trees allow a user to store a large amount of data in a self sorting way. This allows the recalling of any data to be very simple to program and execute. Below is an example of a basic binary search trees using integers. 

Insertion

Since binary search tree must adhere to a specific logic based on values being greater or less than their parents, we should see how inserting data into a binary search tree is done.

  1. We start at our root node and designate it as our "current node".
    • If no root node existed yet, then we insert our data and make it the root node.
  2. Compare the "current node" data to the insertion data. 
    • If the data is less than the "current node" then we move to the left child and make it our new current node.
    • If the data is greater than the "current node" then we move to the right child and make it our new current node.
    • If the "current node" doesn't exist yet, then we insert our data at this location.
  3. Repeat step 2 until data is inserted.

Search

Search for data in a binary search tree works using the same logic as insertion. 

  1. We start at our root node and designate it as our "current node".
    • If no root node existed yet, then data does not exist in our tree.
  2. Compare the "current node" data to the search data. 
    • If the data matches the "current node", then we have found it inside our tree.
    • If the data is less than the "current node" then we move to the left child and make it our new current node.
    • If the data is greater than the "current node" then we move to the right child and make it our new current node.
    • If the "current node" doesn't exist yet, then the data does not exist in our tree.
  3. Repeat step 2 until either the data is found or not.

Binary Search Tree Knowledge Check #1

Complete the sentence using the new keywords and facts you have learned about binary search trees. If you get stuck it's okay to go back and review again.

1.  A tree is a  data structure for storing data.

Binary Search Tree Knowledge Check #2

Complete the sentence using the new keywords and facts you have learned about binary search trees. If you get stuck it's okay to go back and review again.

2.  The first node in a tree is called the  node.

Binary Search Tree Knowledge Check #3

Complete the sentence using the new keywords and facts you have learned about binary search trees. If you get stuck it's okay to go back and review again.

3.  Nodes that have no child nodes associated with them are called  nodes.

Binary Search Tree Knowledge Check #4

Complete the sentence using the new keywords and facts you have learned about binary search trees. If you get stuck it's okay to go back and review again.

4.  A binary search tree can no more than  child nodes branching off a parent node. 

Binary Search Tree Knowledge Check #5

Complete the sentence using the new keywords and facts you have learned about binary search trees. If you get stuck it's okay to go back and review again.

5.  Standard binary search trees place values less than a parent node at its  child, and values greater than a parent at its  child.

Binary Search Tree Knowledge Check #6

Instructions

6.  Label the parts of this binary search tree using the text tiles.

  • Root Node
  • 8's Left Child
  • 8's Right Child
  • 11's Parent Node
  • Leaf Node

Binary Search Tree Knowledge Check #7

Instructions

Using the insert logic you've learned, show where the values 5, 7, and 9 would be placed in this binary search tree.

  • 5
  • 7
  • 9

Binary Tree Keywords Review

Keywords and Definitions

  • Trees - A hierarchical data structure comprised of nodes and references.
  • Root Node - The root node is the top-most node of a tree, and serves and the starting point.
  • Parent Node - A parent node is any node that has 1 or more nodes branching off of it via references. These branched nodes become its child nodes.
  • Child Node - Nodes that branch off of parent nodes are consider its child nodes.
  • Leaf Node - Any node with no child nodes of its own.
  • Binary Search Tree - A special type of tree with nodes that can only have 0-2 references (child nodes) each. The two children are called the left child and right child.
  • Left Child - The left child in a binary search tree must have a value less than its parent node.
  • Right Child - The right child of a binary search tree must have a value greater than its parent node.

Data Structures: Graphs

Understanding Graphs

Definition

A graph is a data structure that is similar to a tree, with a few key differences. 

  • First, a graph is a network (or web) of vertices.
  • Second, vertices can have any number of connections which are called edges

Edges are represented in programs by ordered pairs. An ordered pair is comprised of a starting vertex and a ending vertex with the form (a, b), where 'a'  is the starting vertex and 'b'  is the ending vertex. There are two primary types of graphs, directed and undirected. Directed graphs only have edges that are can "travel" one direction between two vertices. Undirected graphs only have edges that are bidirectional, allowing for "travel" between any two connected vertices. 

Example Undirected Graph (Above)

Example Directed Graph (Above)

Adjacency List vs Adjacency Matrix

Although you might assume graphs would be stored in memory with nodes like a tree, this would be incorrect. Instead graphs are stored either as an adjacency list or an adjacency matrix. 

Adjacency List

Adjacency list are a special type of array and linked list hybrid. Each index of an array is used to represent a vertex in the graph. The data stored in each element of the array is a linked list of vertices connected to the that index. Remember that only undirected graphs have two way edges, in a directed graph each direction must be explicitly drawn.

Adjacency Matrix

Adjacency matrices are a special array called a 2-dimensional array. 2-D Arrays are arrays with with a second array stored inside each element. The result is a grid of rows and columns called a matrix. The slots inside the matrix are called cells. The index of each cell in the matrix is represented by a pair of indexes (row, column). Row indexes are starting vertices for an edge, and column indexes are the destination vertices of that edge. Cell store values of 0 or 1. The number 1 corresponds to connecting edge and the number 0 corresponds to no connection.

Visualizing the Adjacency List & Adjacency Matrix

This is what are adjacency list and adjacency matrix would look like in graph form. Both a undirected and directed version were drawn to help show the difference in how the edges between vertices work. Remember that in the adjacency list and adjacency matrix, the index [0], [1], [2] correlates to the vertex [A], [B], [C]. 

Graph Knowledge Check #1

Complete the sentence using the new keywords and facts you have learned about graphs. If you get stuck it's okay to go back and review again.

1.  Graphs are comprised of  and  creating a web or network of connections.

Graph Knowledge Check #2

Complete the sentence using the new keywords and facts you have learned about graphs. If you get stuck it's okay to go back and review again.

2.  Edges are represented by  of a starting and ending vertex.

Graph Knowledge Check #3

Complete the sentence using the new keywords and facts you have learned about graphs. If you get stuck it's okay to go back and review again.

3.   graphs have edges that connect vertex pairs in both directions, while  graphs have edges that only connect vertex pairs in one direction.

Graph Knowledge Check #4

Complete the sentence using the new keywords and facts you have learned about graphs. If you get stuck it's okay to go back and review again.

4.  All graphs can be represented using either an adjacency  or an adjacency  

Graph Knowledge Check #5

Complete the sentence using the new keywords and facts you have learned about graphs. If you get stuck it's okay to go back and review again.

5.  Adjacency matrices use the values  and  to represent if a connecting edge exist between two vertices.

Graph Knowledge Check #6

Complete the sentence using the new keywords and facts you have learned about graphs. If you get stuck it's okay to go back and review again.

6.  For adjacency matrices, the cell value of 0 means  edge exists between the two vertices and the value of 1 means  edge exist between the two vertices.

Graph Keywords Review

Keywords and Definitions

  • Graph - A network (or web) comprised of vertices and edges.
  • Vertex - The data of a graph is stored in it's vertices, which are connected to one another by edges.
  • Edge - An edge connects only two vertices together. One vertex is designated as a starting vertex and the other is designated as an ending vertex.
  • Ordered Pairs - Ordered pairs are used to represent the edges of a graph. They take the form (a,b) where 'a' is the starting vertex, and 'b' is the ending vertex.
  • Directed Graph - The edges of a directed graph only allow for travel in a single direction, (a,b).
  • Undirected Graph - The edges of an undirected graph allow for travel in both directions, (a,b) also implies (b,a).

Additional Resources

Other Learning Tools