4 Problems with Processor Thread Programming

Threads are a useful way to get more out of your CPU. Technically known as a "thread of execution," a thread is the smallest sequence of programmed instructions that a scheduler can manage independently. Multiple threads can run within a single process, sharing that process's memory.

The processor cores on your PC run threads; each core has registers, small and fast storage locations that hold, for example, the address of the currently executing instruction. When a core switches to another thread, it's called a context switch: the current thread's state (its registers, among other things) is saved while the next thread's state is restored.

Many applications are single-threaded, running on one core. My PC's CPU, an Intel i7-5930K, has 6 cores—a bit like a V6 engine in a car, metaphorically speaking. Each core can also run two threads at the same time, via a feature known as hyper-threading. If I build an application that runs on just one thread, I'm using 1/12th of my CPU's potential (i.e., about 8 percent).

Multi-threading means writing your application to run on many cores, and on one or more threads per core—when done well, it can result in a big difference in performance. That being said, it’s not the easiest thing to do.

For example, I once wrote a C++ program to evaluate poker hands, reading a million randomly generated five-card hands from a text file. Evaluating each hand took about 3 microseconds, so the million hands took just over three seconds running single-threaded. When I made it multi-threaded, it took 4 seconds, because the million context switches added a second! After I slowed hand evaluation down by adding a single millisecond delay per hand, processing 20,000 hands took 35 seconds running single-threaded—but the multi-threaded version was much quicker, taking 3.5 seconds in total. The 12 hardware threads together ran the application about 10 times faster. The lesson is that context-switch overhead matters less as the work done per task grows.

Multi-threading is useful for background computations and updating the Graphical User Interface (GUI) while fetching or sending data to slower devices such as network cards, disk drives and so on. With all that being said, thread programming presents some real challenges. Here are four, along with a way of avoiding many of them:

Shared Access to Data

If two threads access a shared variable without any kind of guard, writes to that variable can overlap. Say both threads add 1 to the same memory location; each does this by reading the value from the memory location into a register, incrementing it, then writing it back. Thread 1 (T1) reads the value (say 0 initially). Before T1 writes its result back, Thread 2 (T2) also reads 0. T1 writes back 1, then T2 writes back 1 as well. So despite two increments, the value is 1 when it should be 2; one update has been lost.

The access has to be atomic, i.e., performed as a single indivisible operation, in order to prevent overlapping writes. This can be done in a number of ways: the .NET Framework has the Interlocked class with atomic increment and decrement methods, and Java has java.util.concurrent.atomic.AtomicInteger with equivalent methods.
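The lost-update problem and its atomic fix can be seen side by side in a small Java sketch (the class and field names here are illustrative, not from any library): two threads increment a plain `int` and an `AtomicInteger` the same number of times, and only the atomic counter is guaranteed to reach the full total.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicDemo {
    static int plain = 0;                                   // unguarded shared counter
    static final AtomicInteger atomic = new AtomicInteger(0);

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                plain++;                        // read-modify-write: increments can be lost
                atomic.incrementAndGet();       // one atomic operation: no lost updates
            }
        };
        Thread t1 = new Thread(work), t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println("plain  = " + plain);            // often less than 200000
        System.out.println("atomic = " + atomic.get());     // always 200000
    }
}
```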

Locks Can Cause Performance Issues

On x86, the LOCK instruction prefix applies to certain CPU instructions that read, modify and write memory—for instance, INC and XCHG (XCHG is locked implicitly). While a locked instruction executes, the core takes exclusive ownership of the relevant cache line, which is typically 64 bytes long. If the core knows that the memory address is already in its cache, it reads from the cache rather than from main memory, which is much slower; data moves between memory and cache one 64-byte cache line at a time.

As well as providing atomic access, a locked instruction prevents the overlapping increments detailed earlier and stops the CPU from reordering instructions around it. But if threads on different cores repeatedly write to different variables that happen to sit in the same cache line, the line bounces between the cores; this situation is called false sharing, and it hurts performance. That's more of a problem for C/C++ code than for .NET or Java.

There are other, higher-level locking mechanisms; the following C# code (adapted from Wikipedia) applies a lock to each call of Account.Deposit() and Account.Withdraw():

class Account {    // this class acts as a monitor of an account
    long val = 0;
    readonly object thisLock = new object();
    public void Deposit(long x) {
        lock (thisLock) {    // only one thread at a time may execute this statement
            val += x;
        }
    }
    public void Withdraw(long x) {
        lock (thisLock) {    // only one thread at a time may execute this statement
            val -= x;
        }
    }
}


One way to avoid performance problems with locks is lock-free programming. For example, the University of Cambridge's Systems Research Group provides libraries of concurrency-safe, lock-free data structures: object-based software transactional memory, multi-word compare-and-swap, and a range of search structures (skip lists, binary search trees, red-black trees).
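To give a feel for the technique, here is a minimal sketch of a classic lock-free structure, a Treiber stack, in Java (the class is hypothetical, written for this article). Instead of taking a lock, push and pop read the current head, compute the new head, and use compare-and-set to publish it, retrying if another thread got there first:

```java
import java.util.concurrent.atomic.AtomicReference;

// A minimal lock-free (Treiber) stack: push/pop use compare-and-set
// on the head pointer instead of a lock, retrying until the swap succeeds.
public class LockFreeStack<T> {
    private static final class Node<T> {
        final T value;
        Node<T> next;
        Node(T value) { this.value = value; }
    }

    private final AtomicReference<Node<T>> head = new AtomicReference<>();

    public void push(T value) {
        Node<T> node = new Node<>(value);
        do {
            node.next = head.get();                          // snapshot the current head
        } while (!head.compareAndSet(node.next, node));      // retry if another thread won
    }

    public T pop() {
        Node<T> current;
        do {
            current = head.get();
            if (current == null) return null;                // stack is empty
        } while (!head.compareAndSet(current, current.next));
        return current.value;
    }
}
```

No thread ever blocks here: a failed compare-and-set just means another thread made progress, so the loser loops and tries again.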

Exceptions in Threads Can Cause Problems

In .NET and Java, an exception thrown in a thread must be handled by an exception handler within that thread's code, or the application will shut down. Both .NET and Java can catch unhandled ("uncaught," in Java's terminology) exceptions that escape a thread via a special handler, which lets you log the exception. Even then, your application is still likely to be in a bad state.
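In Java, that special handler is Thread.setUncaughtExceptionHandler (or setDefaultUncaughtExceptionHandler for all threads). A short sketch—the class and message are made up for illustration—shows the handler getting one last look at an exception before the thread dies:

```java
import java.util.concurrent.atomic.AtomicReference;

public class HandlerDemo {
    static final AtomicReference<String> lastError = new AtomicReference<>();

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            throw new IllegalStateException("worker failed");  // no try/catch in the thread
        });
        // The handler is the last chance to log the failure before the thread dies.
        worker.setUncaughtExceptionHandler((t, e) ->
            lastError.set(t.getName() + ": " + e.getMessage()));
        worker.setName("worker-1");
        worker.start();
        worker.join();
        System.out.println("Logged: " + lastError.get());  // prints "Logged: worker-1: worker failed"
    }
}
```

Note that the handler can only log and clean up; it cannot resume the failed thread, which is why the application may still be in a bad state afterwards.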

Background Threads Need Care When Updating a GUI

Performing any non-trivial processing in a GUI thread is likely to make it unresponsive; best practice is to do the work in another thread. Background threads (or .NET BackgroundWorkers) can run tasks in the background, but they need some care when updating the GUI. In the pre-Task era of .NET, you could do this in WinForms by calling Invoke on a control, such as a label on a form:

string Text = "Some value";
form.Label.Invoke((MethodInvoker)delegate {
    form.Label.Text = Text;
});

Xamarin (C# on iOS and Android) supports InvokeOnMainThread(), which runs a lambda expression on the main thread:

  InvokeOnMainThread(() => Label.Text = Text);
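Java's Swing toolkit has the same rule: components may only be touched on the Event Dispatch Thread, and SwingUtilities marshals work onto it. A minimal sketch (the class name and label are invented for this example) of the equivalent pattern:

```java
import javax.swing.JLabel;
import javax.swing.SwingUtilities;

public class EdtDemo {
    static final JLabel label = new JLabel();

    public static void main(String[] args) throws Exception {
        String text = "Some value";
        // invokeAndWait runs the lambda on the Event Dispatch Thread and blocks
        // until it has finished; invokeLater is the fire-and-forget version.
        SwingUtilities.invokeAndWait(() -> label.setText(text));
        System.out.println(label.getText());  // prints "Some value"
    }
}
```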

A Way to Avoid Many Thread Problems

Many threading problems center on accessing or sharing data between threads. One way to avoid this is to use a messaging system, which provides a robust way of storing and delivering messages between two endpoints; these could be two parts of the same application or two applications running on different networked machines. After sending, the message is stored by the messaging system until it can be delivered.
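The same idea works within a single process: Java's BlockingQueue, for example, lets a producer hand messages to a consumer without either thread touching the other's data directly. A small sketch (the class and message are illustrative):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class MessageDemo {
    // Hand a message from a producer thread to the consumer via a queue;
    // the queue stores the message until the consumer takes it.
    static String exchange() throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(16);
        Thread producer = new Thread(() -> {
            try {
                queue.put("Hello World!");   // blocks only if the queue is full
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        String msg = queue.take();           // blocks until a message arrives
        producer.join();
        return msg;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Received: " + exchange());  // prints "Received: Hello World!"
    }
}
```

The only shared state is the queue itself, and the queue's own synchronization handles all the locking.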

For instance, the open-source RabbitMQ messaging broker, written in Erlang, lets you send tens of thousands of messages per second. (If you want to see what’s available, try this list of ten open-source messaging libraries.)

Windows has a component, MSMQ, which likewise provides a messaging service. Creating a queue and sending a message is as simple as the code below:

var srm = new SendReceiveMessage();
if (srm.CreateQueue(@".\Private$\Test1", "Test")) {
    if (srm.SendMsg("Hello World!")) {
        Console.WriteLine("Message sent ok.");
    }
}


To get this to compile, you will have to add a reference to System.Messaging in the Solution references:

using System;
using System.Messaging;

namespace SendMessage
{
    [Serializable]
    public sealed class SimpleMessage
    {
        public TimeSpan LifeInterval { get; set; }
        public DateTime BornPoint { get; set; }
        public string Text { get; set; }
    }

    class SendReceiveMessage
    {
        MessageQueue messageQueue = null;

        public bool CreateQueue(string queuename, string description)
        {
            if (!MessageQueue.Exists(queuename))
            {
                try
                {
                    MessageQueue.Create(queuename);
                }
                catch (Exception ex)
                {
                    // log or otherwise handle the exception here
                    return false;
                }
            }
            messageQueue = new MessageQueue(queuename);
            messageQueue.Label = description;   // use the supplied description as the queue label
            return true;
        }

        public bool SendMsg(string messagetext)
        {
            var m1 = new SimpleMessage();
            m1.BornPoint = DateTime.Now;
            m1.LifeInterval = TimeSpan.FromDays(7); // A week to deliver
            m1.Text = messagetext;

            try
            {
                if (messageQueue == null) return false;  // queue was never created
                messageQueue.Send(m1);
                return true;
            }
            catch (Exception ex)
            {
                // do something with exception
                return false;
            }
        }
    }
}


Conclusion

Consider using Tasks instead of threads (where available). Tasks typically run on thread pools, which manage the threads for you. For example, the .NET Task Parallel Library (TPL) uses a thread pool, so you rarely have to think about threads at all. Given that each new thread reserves roughly 1 MB of RAM for its stack when created, a thread pool reuses existing threads as needed rather than demanding explicit thread creation (an ExecutorService in Java does a similar job of managing a pool of threads).
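On the Java side, the pattern looks like the sketch below (the class, pool size and task are made up for illustration): twenty small tasks are submitted to a pool of four reusable threads, and Futures collect the results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PoolDemo {
    // Submit 20 small tasks to a pool of 4 reusable threads and sum the results.
    static int sumOfSquares() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);  // 4 threads, created once
        List<Future<Integer>> results = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            final int n = i;
            results.add(pool.submit(() -> n * n));  // the pool assigns a free thread
        }
        int sum = 0;
        for (Future<Integer> f : results) {
            sum += f.get();                         // get() waits for that task to finish
        }
        pool.shutdown();
        return sum;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Sum of squares 0..19 = " + sumOfSquares());  // prints 2470
    }
}
```

No thread is created per task; the four pooled threads are recycled across all twenty submissions, which is exactly the saving a Threadpool buys you.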
