B+ Tree File Organization
B+ Tree is an advanced method of ISAM file organization. It uses the same concept of key-index, but in a tree like structure. B+ tree is similar to binary search tree, but it can have more than two leaf nodes. It stores all the records only at the leaf node. Intermediary nodes will have pointers to the leaf nodes. They do not contain any data/records.
Consider a student table below. The key value here is STUDENT_ID. And each record contains the details of each student along with its key value and the index/pointer to the next value. In a B+ tree it can be represented as below.
Please note that the leaf node 100 means, it has name and address of student with ID 100, as we saw in R1, R2, R3 etc above.
From the above B+ tree structure, it is evident that
- There is one main node called root of the tree – 105 is the root here.
- There is an intermediary layer with nodes. They do not have actual records stored. They are all pointers to the leaf node. Only the leaf node contains the data in sorted order.
- The nodes to the left of the root nodes have prior values of root and nodes to the right have next values of the root. i.e.; 102 and 108 respectively.
- There is one final node, called leaf node, which has only values. i.e.; 100, 101, 103, 104, 106 and 107
- All the leaf nodes are balanced – all the leaf nodes at same distance from the root node. Hence searching any record is easier.
- Searching any record is linear in this case. Any record can be traversed through single path and accessed easily.
- Since the intermediary nodes have only pointers to the leaf node, the tree structure is of shorter height. Shorter the height, faster is the traversal and hence the retrieval of records.
Advantages of B+ Trees
- Since all records are stored only in the leaf node and are sorted sequential linked list, searching is becomes very easy.
- Using B+, we can retrieve range retrieval or partial retrieval. Traversing through the tree structure makes this easier and quicker.
- As the number of record increases/decreases, B+ tree structure grows/shrinks. There is no restriction on B+ tree size, like we have in ISAM.
- Since it is a balance tree structure, any insert/ delete/ update does not affect the performance.
- Since we have all the data stored in the leaf nodes and more branching of internal nodes makes height of the tree shorter. This reduces disk I/O. Hence it works well in secondary storage devices.