Tuesday, May 21, 2013

A C++ Thread Pool Implementation Using POSIX Threads

Threads are very useful but potentially very tricky constructs in computer programming. They are also generally hard to get right. Building a deadlock-free and correct multi-threaded software requires great care.

And then threads are expensive. The processor needs to do a context switch to jump between threads. Each context switch means saving the state of the current executing thread and then loading the thread selected for execution. Creating a thread for every I/O operation or lengthy computation can push the machine to a halt if the number of requests for I/O and/or computation is too high.

The midway between creating too many threads for all requests for service and doing everything in one thread is to create a pool of threads and reuse a thread as soon as it is done servicing a request. C++ does not have a built-in thread pool library (it even didn't have threading support prior to C++ 11 standard that came out last year). Java has the Executor interface for this. With C++, some people use the Boost threading library or the Boost Asio library for achieving performance gains in their applications.

In this article we will design a very simple thread pool library using C++ and POSIX threads (also known as pthreads). We will not use any other external libraries like Boost.

The core idea of a thread pool is to have a number of worker threads always ready for accepting work from the main process. The process receives requests for work and schedule those requests as doable tasks for the threads. This pattern resembles the well known Readers-writers problem in Computer Science. There is a queue that is populated with tasks as they arrive in the process. The request processing part is the writer and the threads are the readers. Request processor will insert the work items as they arrive and the threads will pick up one item at a time from the queue in a First-In-First-Out fashion.

So a thread pool basically consists of the three primary actors:

1. A number of threads either waiting for or executing tasks.
2. A number of service requests to the server from clients. Each request is considered a task to the threads.
3. A queue holding the incoming service requests.




*image source: http://www.javamex.com/tutorials/threads/thread_pools.shtml

My C++ implementation of a thread pool:

In an attempt to dig deep into the multi-threading and asynchronous world of computation I developed a very simple thread pool using C++ and pthreads.

The source code for the thread pool with example usage can be found on my github page: https://github.com/bilash/threadpool

Each request is represented by an object of the class Task. Class Task essentially wraps a function pointer which points to the actual function that constitutes the task. Since it's a class we also provide a functor based interface to invoke the function pointer stored in the Task class.

The actual thread pool is manged by the ThreadPool class. It stores a vector of threads, a queue to store tasks (instances of the class Task), a method to enqueue incoming tasks, and a method to execute the tasks in a thread. There are also helper methods to initialize and destroy (and clean up) the pool.

Thread synchronization:

When developing a multi-threaded service or application we almost always need to use locks to prevent data corruption and data races in our program. This essentially means data that will be accessed by more than one thread - also known as a critical section - need to be protected by some kind of locking mechanism. The most popular of these locking mechanisms is called a mutex (for mutual exclusion). A mutex achieves what its name suggests - it allows execution of the code in a mutually exclusive way!

In our thread pool program we use mutex to protect our shared resource - the queue holding the tasks. The task queue is populated by the request processing part of the program and the is read by the threads waiting to pick up tasks. The threads also remove the task they have picked from the task queue since the task will no longer be needed after it is executed. Since the queue is accessed (and modified) by multiple actors in the program it is protected by a mutex lock.

The other synchronization mechanism used in the program is called Condition Variables. Condition variables are used to signal threads waiting on a condition to be true. In our program we use it to signal the threads that the queue has been populated with new tasks. The threads wait (put to sleep by the OS) while the task queue is empty. We wake up the waiting threads by using a condition variable.

Feel free to browse through the code and let me know if you find a bug or some issues with it. Again, it's a very minimalist code just to demonstrate the concept of thread pooling, so don't expect it to be very robust and flexible!

Thanks for reading!