Work stealing

Does it possible that stealing has a positive meaning ? Maybe in real life no, but in programming work stealing brings more positive than negative points.

Data Engineering Design Patterns

Looking for a book that defines and solves most common data engineering problems? I wrote one on that topic! You can read it online on the O'Reilly platform, or get a print copy on Amazon.

I also help solve your data engineering problems 👉 contact@waitingforcode.com 📩

With the article about executor services in Java we learned a little about work stealing in concurrency programs. This time we can focus more on this aspect and, in the first part, describe its basics. The second part will compare work stealing to another work schedulers.

What is work stealing ?

Work stealing is one of scheduling strategies. In parallel computing, scheduling means the work dispatching to available resources. Concretely, work stealing consists on dispatching not executed tasks to threads being in idle state.

First of all, scheduler has some threads where it can push tasks to execute. However, each thread can execute only one task at given moment. To avoid loosing of not currently executed tasks, these tasks are stored in queues belonging to threads. Once a thread ends to execute given task, it takes a new one from the queue.

But sometimes several threads can have their queues empty, for example, after executing their tasks quicker than another threads. In this case, work stealing model allows not working threads to "steal" tasks in queues belonging to other threads. By doing so, thread pool is always working optimally, mobilizing all available threads.

Work stealing vs work sharing

Work sharing is an opposite concept for work stealing. It's based on one centralized tasks pool where are stored tasks to execute. When one thread has nothing to do, it takes the next task from this pool and works on it. This strategy is called work sharing because all threads share the same working queue.

When one thread becomes in idle state, for example because of working on suspended task, coordinating thread creates new thread to compensate the loss. Idle thread can wake up at every moment. This will inevitably lead to the situation where the number of running threads will be bigger than the number of available processors.

Work stealing isn't impacted by tasks suspension. In the case of suspending a task, thread previously executing this task, will simply start to work on another task. This approach also reduces the dependencies between threads because most of time, threads work only on their internal queues.

Scheduling in parallel programming can be accomplished with a strategy called work stealing. It consists on automatic-management of threads tasks. When a thread has nothing to do, it picks up the not executed tasks of another thread, which has more work to do. Opposite strategy is called work sharing and it consists on sharing the same task pool by all working threads.

Consulting

With nearly 16 years of experience, including 8 as data engineer, I offer expert consulting to design and optimize scalable data solutions. As an O’Reilly author, Data+AI Summit speaker, and blogger, I bring cutting-edge insights to modernize infrastructure, build robust pipelines, and drive data-driven decision-making. Let's transform your data challenges into opportunities—reach out to elevate your data engineering game today!

👉 contact@waitingforcode.com
🔗 past projects