Chapter 9. Using Class RWBTreeOnDisk

This chapter includes the following sections:

Class RWBTreeOnDisk has been designed to manage a B-tree in a disk file. The class represents an ordered collection of associations of keys and values, where the ordering is determined internally by comparing keys. Given a key, a value can be retrieved. Duplicate keys are not allowed.

Keys are arrays of char. The key length is set by the constructor. The ordering in the B-tree is determined by comparing keys with an external function, which you can change.

The type of the values is:

typedef long RWstoredValue;

The values typically represent an offset to a location in a file where an object is stored. Given a key, you can find where an object is stored and retrieve it. As far as class RWBTreeOnDisk is concerned, however, the value has no special meaning(it is up to you to interpret it.

The class RWBTreeOnDisk uses class RWFileManager to manage the allocation and deallocation of space for the nodes of the B-tree. You can use the same RWFileManager to manage the space for the objects themselves if the B-tree and data are to be in the same file. Alternatively, you could use a different RWFileManager, managing a different file, to store the B-tree and data in separate files.

The member functions associated with class RWBTreeOnDisk are similar to those of the in-memory class RWBTreeDictionary, except that keys are arrays of char rather than RWCollectables. There are member functions to add a key-value pair, remove a pair, replace a value associated with a key, query for information associated with a key, operate on all key-value pairs in order, return the number of entries in the tree, and determine if a key is contained in the tree.

Construction

An RWBTreeOnDisk is always constructed from an RWFileManager. If the RWFileManager is managing a new file, then the RWBTreeOnDisk will initialize it with an empty root node. For example, the following code fragment constructs an RWFileManager for a new file called filename.dat and then constructs an RWBTreeOnDisk from it:

#include <rw/disktree.h>
#include <rw/filemgr.h>
 
main(){
  RWFileManager fm("filename.dat");
 
  // Initializes filename.dat with an empty root:
  RWBTreeOnDisk bt(fm);
}

Example

In this example, key-value pairs of character strings and offsets to RWDates representing birthdays are stored. Given a name, you can retrieve a birthdate from disk.

#include <rw/disktree.h>
#include <rw/filemgr.h>
#include <rw/cstring.h>
#include <rw/rwdate.h>
#include <rw/rstream.h>
 
main(){
   RWCString name;
   RWDate birthday;
 
   RWFileManager fm("birthday.dat");
   RWBTreeOnDisk btree(fm);                                   // 1
 
   while (cin >> name)                                        // 2
   {
     cin >> birthday;                                         // 3
     RWoffset loc =  fm.allocate(birthday.binaryStoreSize());// 4
     fm.SeekTo(loc);                                         // 5
     fm << birthday;                                         // 6
     btree.insertKeyAndValue(name, loc);                     // 7
   }
   return 0;
}

Here's the line-by-line description:

//1 

Construct a B-tree. The default constructor is used, resulting in a key length of 16 characters.

//2 

Read the name from standard input. This loop will exit when EOF is reached.

//3 

Read the corresponding birthday.

//4 

Allocate enough space from the RWFileManager to store the birthday. Function binaryStoreSize() is a member function in most Rogue Wave classes. It returns the number of bytes necessary to store an object in an RWFile. If you are storing an entire RWCollection, or using one of the methods recursiveSaveOn() or operator<<(RWFile&, RWCollectable), be sure to use recursiveStoreSize() instead.

//5 

Seek to the location where the RWDate will be stored.

//6 

Store the date at that location. Most Rogue Wave classes have an overloaded version of the streaming operators << and >>.

//7 

Insert the key and offset to the object in the B-tree.

Having stored the names and birthdates on a file, here's how you might retrieve them:

#include <rw/disktree.h>
#include <rw/filemgr.h>
#include <rw/cstring.h>
#include <rw/rwdate.h>
#include <rw/rstream.h>
 
main(){
   RWCString name;
   RWDate birthday;
 
   RWFileManager fm("birthday.dat");
   RWBTreeOnDisk btree(fm);
 
   while(1)
   {
     cout << "Give name: ";
     if (!( cin >> name)) break;                             // 1
     RWoffset loc = btree.findValue(name);                   // 2
     if (loc==RWNIL)                                         // 3
       cerr << "Not found.\n";
     else
     {
       fm.SeekTo(loc);                                       // 4
       fm >> birthday;                                       // 5
       cout << "Birthday is " << birthday << endl;           // 6
     }
   }
   return 0;
}

Here is a description of the program:

//1 

The program accepts names until encountering an EOF.

//2 

The name is used as a key to RWBTreeOnDisk, which returns the associated value, an offset, into the file.

//3 

Check to whether the name was found.

//4 

If the name is valid, use the value to seek to the spot where the associated birthdate is stored.

//5 

Read the birthdate from the file.

//6 

Print it out.

With a little effort, you can easily have more than one B-tree active in the same file. This allows you to maintain indexes on more than one key. Here's how you would create three B-trees in the same file:

#include <rw/disktree.h>
#include <rw/filemgr.h>
 
main(){
   RWoffset rootArray[3];
 
   RWFileManager fm("index.dat");
   RWoffset rootArrayOffset =    fm.allocate(sizeof(rootArray));
 
   for (int itree=0; itree<3; itree++)
   {
     RWBTreeOnDisk btree(fm, 10, RWBTreeOnDisk::create);
     rootArray[itree] = btree.baseLocation();
   }
   fm.SeekTo(fm.start());
   fm.Write(rootArray, 3);
   return 0;
}

And here is how you could open the three B-trees:

#include <rw/disktree.h>
#include <rw/filemgr.h>
 
main(){
   RWoffset rootArray[3];           // Location of the tree roots
   RWBTreeOnDisk* treeArray[3]; // Pointers to the RWBTreeOnDisks
 
   RWFileManager fm("index.dat");
   fm.SeekTo(fm.start());      // Recover locations of root nodes
   fm.Read(rootArray, 3);
   
   for (int itree=0; itree<3; itree++)
   {
     // Initialize the three trees:
     treeArray[itree] = new RWBTreeOnDisk(fm,
                      10,          // Max. nodes cached
        RWBTreeOnDisk::autoCreate, // Will read old tree
                      16,          // Key length
                      FALSE,       // Do not ignore nulls
                      rootArray[itree] // Location of root
                                          );
   }
 
   .
   .
   .
   for (itree=0; itree<3; itree++)  // Free heap memory
      delete treeArray[itree];
 
   return 0;
}