1. Introduction

This code provides object-oriented multi-threading support for computation-heavy tasks such as backtracking or exhaustive search. In order to take advantage of the potential speedup offered by multi-threading, your computation must be capable of being broken up into parts that can be performed independently in parallel.

The remainder of this page is divided into four sections. Section 2 describes how to compile the code, which is probably the first thing you should do. Section 3 drills down into the code, which is important to read before you go about incorporating it into your own project. Section 4 summarizes what this code provides and what you must provide, just to make sure that we are both on the same page. Section 5 describes what you need to do to use this code in your own project.

2. Compiling the Code

2.1 Windows and Visual Studio

A Visual Studio solution file threadplusplus.sln has been provided in the root folder. It has been tested with Visual Studio 2019 Community under Windows 10 in both Debug and Release configurations on the x64 and x86 platforms. It consists of two projects, threadplusplus and Test. Project threadplusplus compiles into a library file threadplusplus.lib. Project Test compiles into a test executable Test.exe described in more detail in this documentation.

2.2 *NIX and g++

There is a makefile for the threadplusplus library in directory Src. From the root folder (the one with threadplusplus.sln), type cd Src followed by make all, then make cleanup. You should now see the library file threadplusplus.a.

There is a makefile for the Test program in directory Test. From the root folder (the one with threadplusplus.sln), type cd Test followed by make all. You should now see the executable file Test. Run it by typing ./Test.

3. Drilling Down Into the Code

In order to take full advantage of the potential speedup offered by multi-threading, your computation must be broken up into a reasonable number of parts (ideally between 10 and 1000) that can be performed independently in parallel. We will call these tasks. Each task is described in a task descriptor that includes the data that specifies the task, the code for performing the task, and space for its result (see Section 3.1).

Newly created task descriptors are inserted into a thread-safe (see Section 3.2) request queue from which the threads will take them, perform the task described, and insert the completed task descriptor with its result into a thread-safe result queue. The request queue and the result queue are shared between objects using a monostate class CCommon (see Section 3.3). The computation as a whole is managed by a thread manager (see Section 3.4) which creates and inserts task descriptors into the request queue, initiates the threads (see Section 3.5), waits until all threads terminate, then processes the completed task requests from the result queue.

A timer class is also provided to enable you to easily measure and report elapsed time and total CPU time (see Section 3.6).

3.1 The Task Descriptor

The task descriptor describes a task to be performed by a single thread. This code provides a base task descriptor CBaseTask that includes a task identifier, a thread identifier, and a function to perform the task. You should derive your task descriptor from CBaseTask. Your task descriptor should implement a constructor for any task-related initialization and it should override function CBaseTask::Perform() with the code to perform your task. Be sure to provide space for the task results, if any, since the completed task descriptors will be used after thread termination to process the results.

Each task descriptor you instantiate will automatically get a unique task identifier that can be read using CBaseTask::GetTaskId(). This is maintained using a private static atomic member variable that is incremented and copied to a protected member variable by the CBaseTask constructor. It is recommended that you do not interfere with this process. You are responsible for setting the thread identifier by calling CBaseTask::SetThreadId() when this task descriptor is assigned to a thread. The thread identifier can be read later by calling CBaseTask::GetThreadId(). The task and thread identifiers are provided primarily for debugging purposes and do not impose a significant load on time or memory requirements.
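
The identifier scheme and the derivation pattern described above can be sketched as follows. This is a minimal stand-in under stated assumptions, not the library's source: BaseTaskSketch, SumTask, and the member names are hypothetical, and the real CBaseTask may differ in types and details.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for CBaseTask (not the library's actual class).
class BaseTaskSketch {
  public:
    // Increment the shared atomic counter and copy it to this task's id.
    BaseTaskSketch() : m_nTaskId(++s_nNumTasks) {}
    virtual ~BaseTaskSketch() = default;

    virtual void Perform() {} // overridden by derived task descriptors

    std::size_t GetTaskId() const { return m_nTaskId; }
    void SetThreadId(std::size_t n) { m_nThreadId = n; }
    std::size_t GetThreadId() const { return m_nThreadId; }

  protected:
    std::size_t m_nTaskId = 0;   // unique per task descriptor
    std::size_t m_nThreadId = 0; // set when the task is assigned to a thread

  private:
    static std::atomic<std::size_t> s_nNumTasks; // shared by all tasks
};

std::atomic<std::size_t> BaseTaskSketch::s_nNumTasks(0);

// Hypothetical derived task: sums the integers in [first, last].
class SumTask : public BaseTaskSketch {
  public:
    SumTask(unsigned long long first, unsigned long long last)
      : m_nFirst(first), m_nLast(last) {
      assert(first <= last);
    }

    void Perform() override {
      m_nResult = 0;
      for (unsigned long long i = m_nFirst; i <= m_nLast; ++i)
        m_nResult += i;
    }

    unsigned long long GetResult() const { return m_nResult; }

  private:
    unsigned long long m_nFirst, m_nLast; // data that specifies the task
    unsigned long long m_nResult = 0;     // space for the result
};
```

Because the counter is atomic, task descriptors could be created from several threads without two of them receiving the same identifier.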

3.2 The Thread-Safe Queue

Fig. 1: The thread-safe queue allows simultaneous requests to insert and delete from concurrent threads, but only one can proceed at a time.

The thread-safe queue CThreadSafeQueue maintains a private std::queue of task descriptors for communicating task descriptors between the threads and the thread manager. It uses an std::mutex for safety. The mutex is locked whenever a thread is in the process of inserting or deleting a task request from the queue. All other threads are forced to wait until the operation is complete and the thread unlocks the mutex. Since inserting to and deleting from an std::queue is fairly efficient, the time spent waiting on the mutex should not be a significant fraction of the total running time, assuming that this code is used for computation-heavy tasks such as backtracking or exhaustive search. As shown in Fig. 1, multiple insert/delete requests from concurrent threads can come into the thread-safe queue simultaneously (the colored arrows at the bottom), but the mutex allows only one to proceed at a time (in this case the blue arrow).

CThreadSafeQueue is a templated class which should be instantiated using your task descriptor class derived from CBaseTask (see Section 3.1).
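
A mutex-guarded queue of this kind might look like the sketch below. ThreadSafeQueueSketch is a hypothetical stand-in, and the signatures of Insert() and Delete() are assumptions; the only detail taken from this documentation is that Delete() returns false when the queue is empty.

```cpp
#include <cassert>
#include <mutex>
#include <queue>

// Hypothetical stand-in for CThreadSafeQueue (signatures are assumptions).
template <class T>
class ThreadSafeQueueSketch {
  public:
    void Insert(const T& element) {
      std::lock_guard<std::mutex> lock(m_mutex); // unlocked on scope exit
      m_queue.push(element);
    }

    // If the queue is empty, return false and leave element unchanged.
    // Otherwise remove the front element, store it in element, return true.
    bool Delete(T& element) {
      std::lock_guard<std::mutex> lock(m_mutex);
      if (m_queue.empty()) return false;
      element = m_queue.front();
      m_queue.pop();
      return true;
    }

  private:
    std::queue<T> m_queue;
    std::mutex m_mutex; // allows one insert or delete at a time
};

// Usage: elements come out in the order in which they went in.
inline void QueueSketchDemo() {
  ThreadSafeQueueSketch<int> q;
  q.Insert(1);
  q.Insert(2);
  int n = 0;
  assert(q.Delete(n) && n == 1);
  assert(q.Delete(n) && n == 2);
  assert(!q.Delete(n)); // empty queue
}
```

Testing for emptiness and removing the front element under a single lock matters here: a separate "is it empty?" call followed by a removal could be interleaved with another thread's removal.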

3.3 The Common Variables Class

The common variables class CCommon contains variables to be shared between the threads and the thread manager. It is a monostate, that is, a class that encapsulates a single instance of shared data without the need for global variables, extended parameter lists, or local copies of the data. The monostate is also called the Borg Idiom in the Python community.

CCommon consists of the request queue, a thread-safe queue of pointers to uncompleted task descriptors (see Section 3.2); the result queue, a thread-safe queue of pointers to completed task descriptors (see also Section 3.2); and a Boolean value CCommon::m_bForceExit to be set if and when the computation is to be aborted prematurely.

CCommon is a templated class which should be instantiated using your task descriptor class derived from CBaseTask (see Section 3.1).
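
The monostate idea can be illustrated with static data members, as in the sketch below. CommonSketch, Producer, and Consumer are hypothetical names, and for brevity the sketch uses a plain std::queue and a plain bool where the real CCommon holds thread-safe queues, so it is only meant for single-threaded illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>

// Hypothetical monostate: every data member is static, so all instances of
// all classes derived from CommonSketch<T> see the same state.
template <class T>
class CommonSketch {
  protected:
    static std::queue<T> m_qRequest; // plain queue, for illustration only
    static bool m_bForceExit;
};

template <class T> std::queue<T> CommonSketch<T>::m_qRequest;
template <class T> bool CommonSketch<T>::m_bForceExit = false;

class Producer : public CommonSketch<int> {
  public:
    void Add(int n) { m_qRequest.push(n); }
    void Abort() { m_bForceExit = true; }
};

class Consumer : public CommonSketch<int> {
  public:
    std::size_t Pending() const { return m_qRequest.size(); }
    bool Aborted() const { return m_bForceExit; }
};

// Usage: a change made through one object is visible through any other.
inline void MonostateDemo() {
  Producer p;
  Consumer c;
  const std::size_t before = c.Pending();
  p.Add(42);
  assert(c.Pending() == before + 1);
}
```

No object is passed between Producer and Consumer, which is the point of the idiom: the sharing happens through the class rather than through parameters or globals.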

3.4 The Thread Manager

The thread manager takes care of inserting task requests into the request queue, initiating the threads, waiting until all threads terminate, and processing the completed task requests from the result queue. This code provides a base thread manager CBaseThreadManager, a templated class derived from CCommon (see Section 3.3). You should derive your thread manager from CBaseThreadManager using your task descriptor class derived from CBaseTask (see Section 3.1) to instantiate the template. Your thread manager should override function CBaseThreadManager::ProcessTask() to process the results from a completed task descriptor.

Your code should call CBaseThreadManager::Insert() in a loop that creates new task descriptors (see Section 3.1). CBaseThreadManager will insert these task requests into the thread-safe request queue (see Section 3.2) that it inherits from CCommon (see Section 3.3). This process is illustrated in Fig. 2.

Fig. 2: The thread manager sequentially creates and inserts task descriptors into the request queue.

Your code should then call CBaseThreadManager::Spawn() to spawn the threads, each of which deletes task descriptors from the thread-safe request queue, completes the tasks, and inserts the completed task descriptors into the thread-safe result queue. This process is illustrated in Fig. 3 and described in more detail in Section 3.5.

Fig. 3: Threads perform the tasks from the request queue and transfer the completed task descriptors into the result queue.

Your code should then call CBaseThreadManager::Wait() to wait until all threads terminate. After this function returns, your code should call CBaseThreadManager::Process() to process the completed task descriptors. CBaseThreadManager::Process() will call your override of the virtual function CBaseThreadManager::ProcessTask() which, if you have done your job correctly, will report the results, ideally to the console and to a file (it's up to you). This is executed sequentially because it is assumed that the bulk of the computation has been performed in parallel by the concurrent threads and all that remains is to tally the results. This process is illustrated in Fig. 4.

Fig. 4: The thread manager sequentially processes the completed task descriptors from the result queue.
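
The four phases (Insert, Spawn, Wait, Process) can be sketched in one self-contained piece of code. ManagerSketch, SquareTask, SumOfSquares, and RunSumOfSquares are hypothetical, the thread count parameter of Spawn() is an assumption, and so is the choice to delete each task descriptor after processing it; the real CBaseThreadManager may handle these differently.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical task: squares a number.
struct SquareTask {
  explicit SquareTask(int n) : m_nInput(n) {}
  void Perform() { m_nResult = m_nInput * m_nInput; }
  int m_nInput;
  int m_nResult = 0;
};

// Hypothetical stand-in for CBaseThreadManager.
template <class T>
class ManagerSketch {
  public:
    virtual ~ManagerSketch() = default;

    void Insert(T* pTask) { // called before Spawn(), so no lock is needed
      assert(pTask != nullptr);
      m_qRequest.push(pTask);
    }

    void Spawn(std::size_t nThreads) {
      for (std::size_t i = 0; i < nThreads; ++i)
        m_vThreads.emplace_back([this] { Work(); });
    }

    void Wait() {
      for (std::thread& t : m_vThreads) t.join();
      m_vThreads.clear();
    }

    void Process() { // sequential: all threads have terminated
      while (!m_qResult.empty()) {
        T* p = m_qResult.front();
        m_qResult.pop();
        ProcessTask(p);
        delete p; // ownership policy is an assumption of this sketch
      }
    }

  protected:
    virtual void ProcessTask(T* pTask) = 0;

  private:
    void Work() { // executed by each thread
      for (;;) {
        T* p = nullptr;
        {
          std::lock_guard<std::mutex> lock(m_mutex);
          if (m_qRequest.empty()) return;
          p = m_qRequest.front();
          m_qRequest.pop();
        }
        p->Perform(); // the expensive part runs outside the lock
        std::lock_guard<std::mutex> lock(m_mutex);
        m_qResult.push(p);
      }
    }

    std::queue<T*> m_qRequest, m_qResult;
    std::mutex m_mutex;
    std::vector<std::thread> m_vThreads;
};

// Hypothetical derived manager: tallies the results.
class SumOfSquares : public ManagerSketch<SquareTask> {
  public:
    int GetTotal() const { return m_nTotal; }
  protected:
    void ProcessTask(SquareTask* pTask) override { m_nTotal += pTask->m_nResult; }
  private:
    int m_nTotal = 0;
};

int RunSumOfSquares(int nTasks, std::size_t nThreads) {
  SumOfSquares manager;
  for (int i = 1; i <= nTasks; ++i) manager.Insert(new SquareTask(i));
  manager.Spawn(nThreads);
  manager.Wait();
  manager.Process();
  return manager.GetTotal();
}
```

Because ProcessTask() runs only after Wait() has returned, the tally in SumOfSquares needs no synchronization of its own.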

3.5 The Threads

The thread class CThread provides a constructor that assigns a thread identifier and an implementation of operator() containing the code to be executed by the thread. This code consists of a loop whose body consists of the following:

1. A pointer to a task descriptor (see Section 3.1) is deleted from the thread-safe request queue (see Section 3.2) inherited from CCommon (see Section 3.3).
2. The task described in the task descriptor is performed by calling the task descriptor's function overriding CBaseTask::Perform().
3. A pointer to the completed task descriptor is inserted into the thread-safe result queue (see Section 3.2), also inherited from CCommon (see Section 3.3).

The loop exits when either the request queue is empty (CThreadSafeQueue::Delete() returns false) or the shared Boolean variable CCommon::m_bForceExit is set to true, at which time the thread terminates.
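
The loop and its two exit conditions can be sketched with a function object run by std::thread. All names here (TaskSketch, SharedSketch, ThreadSketch, RunThreads) are hypothetical. As a simplification, the shared state is passed by reference instead of being inherited from a monostate, and the force-exit flag is an std::atomic<bool>, which may not match the type of CCommon::m_bForceExit.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical task descriptor that records which thread performed it.
struct TaskSketch {
  void Perform() { m_bDone = true; }
  bool m_bDone = false;
  std::size_t m_nThreadId = 0;
};

// Hypothetical shared state standing in for what CThread gets from CCommon.
struct SharedSketch {
  std::queue<TaskSketch*> m_qRequest, m_qResult;
  std::mutex m_mutex;
  std::atomic<bool> m_bForceExit{false};

  bool Delete(TaskSketch*& p) { // false if the request queue is empty
    std::lock_guard<std::mutex> lock(m_mutex);
    if (m_qRequest.empty()) return false;
    p = m_qRequest.front();
    m_qRequest.pop();
    return true;
  }

  void InsertResult(TaskSketch* p) {
    std::lock_guard<std::mutex> lock(m_mutex);
    m_qResult.push(p);
  }
};

// Hypothetical stand-in for CThread.
class ThreadSketch {
  public:
    ThreadSketch(std::size_t nId, SharedSketch& shared)
      : m_nThreadId(nId), m_shared(shared) {}

    void operator()() {
      TaskSketch* p = nullptr;
      // Exit when an abort is requested or the request queue is empty.
      while (!m_shared.m_bForceExit && m_shared.Delete(p)) {
        p->m_nThreadId = m_nThreadId; // record the performing thread
        p->Perform();
        m_shared.InsertResult(p);
      }
    }

  private:
    std::size_t m_nThreadId;
    SharedSketch& m_shared;
};

// Runs nThreads threads over the tasks; returns the number completed.
std::size_t RunThreads(std::vector<TaskSketch>& tasks, std::size_t nThreads,
                       bool bForceExit) {
  assert(nThreads > 0);
  SharedSketch shared;
  shared.m_bForceExit = bForceExit;
  for (TaskSketch& t : tasks) shared.m_qRequest.push(&t);

  std::vector<std::thread> threads;
  for (std::size_t i = 0; i < nThreads; ++i)
    threads.emplace_back(ThreadSketch(i, shared));
  for (std::thread& t : threads) t.join();

  return shared.m_qResult.size();
}
```

In this sketch the force-exit flag is checked only between tasks, so a task that is already being performed runs to completion before its thread terminates.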

3.6 The Timer

Assuming that you will be using this code for backtracking or exhaustive search, you will also be interested in measuring the CPU time used over the whole computation, summed over all of the threads, and the elapsed time. The elapsed time should ideally be the CPU time divided by the number of threads, but this assumes that you have made good decisions about how your tasks should be parallelized.

The timer class CTimer has a start function CTimer::Start() that should be called from your code when you want to start measuring time, leaving the decision over whether to measure initialization time completely up to you. The main functions of interest are CTimer::GetCurrentDateAndTime(), which returns the current date and time as an std::string, and CTimer::GetElapsedTime() and CTimer::GetCPUTime(), which return, respectively, the elapsed and CPU time as pretty-printed std::strings. It's up to you to create an instance of CTimer in your code, make the appropriate function calls, and output the resulting strings to the console or to a file using, for example, std::iostream, std::fstream, or stdio.

CTimer uses std::chrono to measure elapsed time, but since there is (alas!) a dearth of platform-independent code to measure CPU time, we are forced to use #ifdefs, some code cribbed from a Microsoft demo to sum CPU time over multiple threads under Windows and Visual Studio, and clock() from time.h under *NIX and g++.
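
The portable part of this approach can be sketched as follows. TimerSketch is a hypothetical stand-in that returns seconds as doubles instead of pretty-printed strings, and it uses std::clock() on every platform, which is a simplification: on POSIX systems std::clock() reports processor time for the whole process, whereas Microsoft's C runtime reports wall-clock time from it, which is presumably why a separate code path is needed under Windows.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>

// Hypothetical stand-in for CTimer (not the library's actual class).
class TimerSketch {
  public:
    void Start() {
      m_tpStart = std::chrono::steady_clock::now();
      m_clkStart = std::clock();
      m_bStarted = true;
    }

    // Wall-clock seconds since Start().
    double GetElapsedSeconds() const {
      assert(m_bStarted);
      const std::chrono::duration<double> d =
        std::chrono::steady_clock::now() - m_tpStart;
      return d.count();
    }

    // Seconds reported by std::clock() since Start(). On POSIX systems this
    // is processor time used by the process, summed over its threads.
    double GetCPUSeconds() const {
      assert(m_bStarted);
      return double(std::clock() - m_clkStart) / CLOCKS_PER_SEC;
    }

  private:
    std::chrono::steady_clock::time_point m_tpStart;
    std::clock_t m_clkStart = 0;
    bool m_bStarted = false;
};
```

With several busy threads, the CPU figure from such a timer can exceed the elapsed figure, which is the effect the comparison of the two times is meant to expose.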

4. Summary

In summary, you need to know what this code provides and what you must provide before working with this code.

4.1 What This Code Provides

A base task descriptor CBaseTask.
A base thread manager CBaseThreadManager.
A thread class CThread.
A thread-safe queue CThreadSafeQueue.
A timer class CTimer.

4.2 What You Must Provide

A task descriptor CTask derived from CBaseTask with member variables that specify the task to be performed and a member function CTask::Perform() to override CBaseTask::Perform() containing code to perform your task as specified in its member variables.
A thread manager CThreadManager derived from CBaseThreadManager <CTask>. It must contain, in addition to a constructor that calls the CBaseThreadManager constructor, a function CThreadManager::ProcessTask(CTask*) that overrides CBaseThreadManager::ProcessTask(CBaseTask*) with code that processes the results of your task pointed to by its single parameter.
A main() that creates an instance of CThreadManager and an instance of CTimer (if required). It then creates the required instances of CTask and calls the thread manager's Insert() function to insert them into the request queue. Next, it starts the timer (if required) by calling CTimer::Start(). To include initialization in the time reported, call CTimer::Start() before creating the tasks instead. It then calls the thread manager's Spawn() function to spawn the threads and the thread manager's Wait() function to wait for the threads to terminate. When that function returns, the result queue should contain the completed task descriptors. Now main() can report CPU time and elapsed time (if required), and finally call the thread manager's Process() function to process and report the results of the computation.

A simple test executable is provided as part of this project (see Section 2).

5. Using This Code

You can use thread++ in your own multi-threaded project by following these instructions. For a non-trivial example, see https://github.com/Ian-Parberry/peacefulqueens_mt.

5.1 Windows and Visual Studio

Compile this code in whatever configurations and platforms you require (see Section 2.1).
It is recommended that you create a Windows environment variable THREADPLUSPLUS_DIR and set it to the name of the folder that contains threadplusplus.sln, terminating it with a backslash character \ (see Fig. 5). Make sure that you shut down and restart all instances of Visual Studio before you proceed, otherwise Visual Studio will not see your environment variable and you will get a "header file not found" and/or "library file not found" error message when you try to compile and link your code.

Fig. 5: Creating the environment variable THREADPLUSPLUS_DIR.
Having done the above, create your own Visual Studio solution and project, then in your project properties with your required configuration (ideally All Configurations) and platform (ideally All Platforms), do the following three things under Configuration Properties:
1. In VC++ Directories, add $(THREADPLUSPLUS_DIR)Src to Include Directories (see Fig. 6).
  
  Fig. 6: Adding the Include and Library Directories in Visual Studio.
2. On the same page, add $(THREADPLUSPLUS_DIR)$(Platform)\$(Configuration) to Library Directories (see Fig. 6).
3. In Linker\Input, add threadplusplus.lib to Additional Dependencies (see Fig. 7).
  
  Fig. 7: Adding threadplusplus.lib in Visual Studio.
Remember to #include BaseTask.h, BaseThreadManager.h, and Timer.h wherever appropriate in your code.

5.2 *NIX and g++

Compile this code (see Section 2.2).
Copy the makefile from the Test directory (see Section 2.2) to your source code directory.
Edit the copy of the makefile in your source code directory.
1. Change line 1 to set the SRC variable to your source and header files.
2. Change line 2 to set the EXE variable to your executable file name.
3. Change line 3 to set the INC variable to the relative path of the Src folder in this project (currently ../Src).
4. Change line 4 to set the LIB variable to the relative path of the Src folder in this project followed by threadplusplus.a (currently ../Src/threadplusplus.a).
5. Save the file.
Type make all to create your executable file.