The Karoo Project

Introduction

I think the most common trap for application designers is to assume that a server or an application must be a single OS task (that is, one executable process).

To build an application composed of more than one OS task, we need communication and some form of task management (to make sure all the bits and pieces are running and communicating).

Too much functionality in one OS task leaves it vulnerable. Running lots of threads in one OS task is similarly a bad idea: it is better to have lots of OS tasks and thus isolate the damage when one crashes. Any feature that may be updated at run-time must be a distinct OS task. Obviously we then need to manage all these OS tasks, and provide good communication mechanisms that are as simple as a function call.

This project will address these issues and make it very easy to make large clustered applications.

At the core of the project are efficient and bug-free queues and communications.

The idea is to make an application from multiple OS tasks, rather than just one OS task, and to make it easy for application designers to use this architecture.
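
As an illustration of the damage-isolation argument above, here is a minimal sketch using only POSIX fork() and waitpid(). It is not Karoo's own task management, and the names runIsolated, wellBehaved and crashes are hypothetical. A parent OS task starts a child OS task; when the child crashes, the parent survives and is told about it.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdlib>

// Hypothetical helper (not part of Karoo): run `work` in a separate OS task
// and report how it ended.
// Returns the child's exit code, -1 if the child was killed by a signal
// (i.e. it crashed), or -2 if fork() or waitpid() failed.
int runIsolated(void (*work)())
{
    pid_t pid = fork();
    if (pid < 0) return -2;

    if (pid == 0) {              // child OS task
        work();
        _exit(0);                // leave without running the parent's cleanup
    }

    int status = 0;              // parent OS task: wait for the child to end
    if (waitpid(pid, &status, 0) != pid) return -2;
    if (WIFEXITED(status)) return WEXITSTATUS(status);
    return -1;                   // terminated by a signal
}

// Two hypothetical pieces of work: one returns normally, one crashes.
void wellBehaved() {}
void crashes() { std::abort(); }
```

Under these assumptions runIsolated(wellBehaved) yields 0 and runIsolated(crashes) yields -1, and the calling task keeps running in both cases. A real supervisor would presumably restart the failed task rather than just report it.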

Queues

The queues are quite simple, but exhaustively robust so that race conditions and invalid memory access don't happen, and so that the application programmer does not need to worry about these issues.

The principle is that you will never need to use sleep(), usleep(), semaphores, mutexes, or any of the pthread... functions. Rather, if you want something to happen after a delay, make an object that inherits from the pebble class, and add it to an exeque or exepool object with a time-to-start set in the future, e.g.

 // declaration:
 using namespace karoo;
 exepool pool(3); // set it up with 3 threads for executing pebble tasks
 
 class testpebble : public pebble {
 public:
     testpebble() : pebble() {}
     virtual ~testpebble() {}
     void run();
 };
 
 void testpebble::run()
 {
     // ... your executable code goes here
 }
 
 
 // then later in your code:
   testpebble* task = new testpebble(); // this object inherits from pebble
   task->dereferenceAfterRun();         // set it up to auto garbage collect
   pool.add(20, task);                  // execute in 20 seconds time

Note that this means you will be using event-based, as opposed to batch, programming techniques.
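
To make the time-to-start idea concrete, the following is a minimal single-threaded sketch in standard C++11. It is not the Karoo exeque or exepool implementation: TimedQueue, add() and advance() are hypothetical names, and a simulated clock replaces real time so that the behaviour is easy to follow.

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical stand-in for an exeque: tasks are kept ordered by start time.
class TimedQueue {
public:
    // Schedule `task` to start `delaySeconds` after the current simulated time.
    void add(double delaySeconds, std::function<void()> task)
    {
        entries_.push(Entry{now_ + delaySeconds, next_++, std::move(task)});
    }

    // Move the simulated clock forward, then run every task that is now due,
    // earliest first. Returns the number of tasks that ran.
    int advance(double seconds)
    {
        now_ += seconds;
        int ran = 0;
        while (!entries_.empty() && entries_.top().startAt <= now_) {
            std::function<void()> task = entries_.top().task;
            entries_.pop();
            task();              // a task may call add() to schedule more work
            ++ran;
        }
        return ran;
    }

    bool empty() const { return entries_.empty(); }

private:
    struct Entry {
        double startAt;              // simulated time at which to start
        unsigned long seq;           // tie-breaker: FIFO among equal start times
        std::function<void()> task;
    };
    struct Later {                   // puts the earliest entry on top
        bool operator()(const Entry& a, const Entry& b) const
        {
            if (a.startAt != b.startAt) return a.startAt > b.startAt;
            return a.seq > b.seq;
        }
    };
    std::priority_queue<Entry, std::vector<Entry>, Later> entries_;
    double now_ = 0.0;
    unsigned long next_ = 0;
};
```

With this sketch, a task added with a delay of 20 does not run on advance(10), but does run on the following advance(10). Nothing ever sleeps: the caller simply carries on, and the queue decides when each task is due.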

Synchronisation

If you want to be sure that a number of things happen synchronously, so that race conditions cannot occur, without having to use semaphores or mutexes, then you need to use the event-based paradigm.

The problem is that we were all taught how to program using the batch paradigm, where the flow of execution steps methodically through our code.

For example, put a pebble into an exeque and then get on with something else. If the logic started in the pebble needs to continue after pebble::run() exits, then just before it exits it must put its results into another pebble and set that running. The new pebble continues the flow of the logic started in the first pebble.
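
The hand-over from one pebble to the next can be sketched as follows, in plain C++11 rather than with the Karoo classes. EventQueue and startSum are hypothetical names, and lambdas stand in for pebble objects.

```cpp
#include <deque>
#include <functional>
#include <utility>

// Hypothetical single-threaded event queue (stands in for an exeque).
class EventQueue {
public:
    void add(std::function<void()> task) { tasks_.push_back(std::move(task)); }

    // Run tasks in FIFO order until none remain, including any tasks
    // that were added while draining.
    void drain()
    {
        while (!tasks_.empty()) {
            std::function<void()> task = std::move(tasks_.front());
            tasks_.pop_front();
            task();
        }
    }

private:
    std::deque<std::function<void()> > tasks_;
};

// Queue a two-step computation. The first step produces a partial result
// and, just before it finishes, queues a second step that carries that
// result forward. `q` and `out` must outlive the call to drain().
void startSum(EventQueue& q, int a, int b, int& out)
{
    q.add([&q, a, b, &out] {
        int partial = a + b;         // work done by the first step
        q.add([partial, &out] {      // continuation carries the result
            out = partial * 2;       // work done by the second step
        });
    });
}
```

Calling startSum() returns immediately without computing anything; the result only appears in out once the queue has run both steps. That is the shift from batch to event-based thinking described above.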

If you absolutely have to access memory from more than one thread, then you must protect accesses and modifications: use the synchronised<T> class rather than plain variables, so that variable accesses do not cause race conditions. E.g.

 using namespace karoo;

 synchronised<double> x;
 synchronised<double> y;
 
 // e.g. to swap the values x and y
 double d = x;
 x = y;
 y = d;
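
For readers wondering what such a wrapper might look like, here is a minimal sketch built on the C++11 std::mutex. It is an assumption about the general technique, not the Karoo source: the class name Synchronised (capitalised to avoid confusion with the real class) and its members are hypothetical.

```cpp
#include <mutex>

// Hypothetical sketch of a value whose every read and write is serialised
// by its own mutex.
template <typename T>
class Synchronised {
public:
    Synchronised() : value_() {}
    explicit Synchronised(const T& v) : value_(v) {}

    // Read: take the lock, copy the value out.
    operator T() const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;
    }

    // Write from a plain value: take the lock, store.
    Synchronised& operator=(const T& v)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        value_ = v;
        return *this;
    }

    // Write from another Synchronised: read it first, then store, so the
    // two locks are never held at the same time.
    Synchronised& operator=(const Synchronised& other)
    {
        T copy = other;
        return *this = copy;
    }

private:
    mutable std::mutex mutex_;   // mutable so that reads can lock it
    T value_;
};
```

Note that in this sketch each read and each write is protected individually. A three-statement swap like the one above is therefore free of data races, but another thread could still observe x and y between the steps.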

The library is obviously about much more than just queues, though these are so central to the whole library that more time has been spent perfecting them than anything else.

There are more building blocks, such as rocks and datagrams.

Rocks

A rock is a class you can make your OS tasks inherit from. It includes a very convenient way to handle command-line parameters. Command-line parameters are used as the primary way to initialise an OS task, e.g.

 class test : public rock {
 public:
     test(vector<text>& opts, int argc, char *argv[]) : rock(opts, argc, argv) {}
     virtual ~test() {}
     void run();
     // ... etc
 };

 void test::run()
 {
     int size = getArgument("size").atoi();
     int count = getArgument("count").atoi();
     // ... etc
 }

 int main(int argc, char *argv[])
 {
     initQueues();
     vector<text> opts;
     opts.push_back("count");
     opts.push_back("size");
     test me(opts, argc, argv);
     me.run();
     return 0;
 }

The above example can be run from the command line as, e.g.

 test --count 300 --size 5
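
The way declared option names might be matched against "--name value" pairs on the command line can be sketched in standard C++ as follows. This is not how rock is implemented; parseOptions and intOption are hypothetical helpers written only to illustrate the idea.

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: collect "--name value" pairs, keeping only names
// that appear in `allowed`. Anything else on the command line is skipped.
std::map<std::string, std::string>
parseOptions(const std::vector<std::string>& allowed,
             int argc, const char* const argv[])
{
    std::map<std::string, std::string> values;
    for (int i = 1; i + 1 < argc; ++i) {     // argv[0] is the program name
        std::string arg = argv[i];
        if (arg.size() <= 2 || arg.compare(0, 2, "--") != 0) continue;
        std::string name = arg.substr(2);
        for (std::size_t k = 0; k < allowed.size(); ++k) {
            if (allowed[k] == name) {
                values[name] = argv[i + 1];  // next argument is the value
                ++i;                         // skip over the value
                break;
            }
        }
    }
    return values;
}

// Hypothetical helper: fetch an option as an int, or `fallback` if absent.
int intOption(const std::map<std::string, std::string>& values,
              const std::string& name, int fallback)
{
    std::map<std::string, std::string>::const_iterator it = values.find(name);
    return it == values.end() ? fallback : std::atoi(it->second.c_str());
}
```

Given the command line above and the allowed names "count" and "size", intOption() would return 300 for "count" and 5 for "size", and the fallback for any name that was not supplied.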

Generated on Tue Feb 16 15:04:28 2010 for Karoo by  doxygen 1.5.8