LINFO

Data Structure Definition



A data structure is a way of storing information in a computer so that it can be used efficiently.

Efficiency in this context refers to the ability to find and manipulate data quickly and with the minimum consumption of computer and network resources, mainly CPU (central processing unit) time, memory space and bandwidth.

Numerous types of data structures have been developed; some are very general and widely used, while others are highly specialized for certain types of tasks. Careful selection of data structures can allow the use of the most efficient algorithms for particular tasks and thereby optimize the performance of programs. An algorithm is a precise, unambiguous sets of rules that specify how to solve some problem or perform some task.

Data structures can be classified in several ways, including whether they are linear or graphic and whether they are static or dynamic (i.e., whether the shape or size of the structure changes over time). Linear data structures include lists and associative arrays. List data structures can be divided into arrays, linked lists and V lists. Graph data structures include trees, adjacency lists, disjoint-sets, graph-structured stacks and scene graphs. Other data structures include frames, unions, tagged unions and tables.

Most data structures are formed from two or more of just four basic types of components: arrays, discriminated unions, records and references. For example, linked lists are constructed from records and nullable references. A nullable reference is a reference that can be null (i.e., the actual value of an item is missing, unknown, or not applicable), and it is a combination of references and discriminated unions.

Linked lists are the simplest data structures. Each consists of a sequence of nodes, each of which contains arbitrary data fields and one or two references (i.e., links) pointing to the next and/or previous nodes. Linked lists permit insertion and removal of the nodes at any point in the list, but they do not allow random access.

Trees are a widely-used data structure that resembles an inverted biological tree structure. Each node can have any number of child nodes, which are below it in the tree diagram. Most tree structures are ordered trees, which make a distinction between the various children of a node, whereas ordered trees make such a distinction.

One of the most commonly used data structures is the B-tree, which is a type of tree structure that is very efficient for insertions and deletions and for keeping data sorted. Thus it is employed as the basis for filesystems, databases and routing tables (which are used in routers to direct packets through the most efficient routes on the Internet and other networks). It is not certain what the B stands for, but it could be for Bayer (the inventor's last name), Boeing (the company for which the inventor worked) and/or balanced (because all the leaf nodes are at the same level in the tree).






Created May 6, 2006.
Copyright © 2006 The Linux Information Project. All Rights Reserved.