Work stealing

Does it possible that stealing has a positive meaning ? Maybe in real life no, but in programming work stealing brings more positive than negative points.

Looking for a better data engineering position and skills?

You have been working as a data engineer but feel stuck? You don't have any new challenges and are still writing the same jobs all over again? You have now different options. You can try to look for a new job, now or later, or learn from the others! "Become a Better Data Engineer" initiative is one of these places where you can find online learning resources where the theory meets the practice. They will help you prepare maybe for the next job, or at least, improve your current skillset without looking for something else.

👉 I'm interested in improving my data engineering skillset

See you there, Bartosz

With the article about executor services in Java we learned a little about work stealing in concurrency programs. This time we can focus more on this aspect and, in the first part, describe its basics. The second part will compare work stealing to another work schedulers.

What is work stealing ?

Work stealing is one of scheduling strategies. In parallel computing, scheduling means the work dispatching to available resources. Concretely, work stealing consists on dispatching not executed tasks to threads being in idle state.

First of all, scheduler has some threads where it can push tasks to execute. However, each thread can execute only one task at given moment. To avoid loosing of not currently executed tasks, these tasks are stored in queues belonging to threads. Once a thread ends to execute given task, it takes a new one from the queue.

But sometimes several threads can have their queues empty, for example, after executing their tasks quicker than another threads. In this case, work stealing model allows not working threads to "steal" tasks in queues belonging to other threads. By doing so, thread pool is always working optimally, mobilizing all available threads.

Work stealing vs work sharing

Work sharing is an opposite concept for work stealing. It's based on one centralized tasks pool where are stored tasks to execute. When one thread has nothing to do, it takes the next task from this pool and works on it. This strategy is called work sharing because all threads share the same working queue.

When one thread becomes in idle state, for example because of working on suspended task, coordinating thread creates new thread to compensate the loss. Idle thread can wake up at every moment. This will inevitably lead to the situation where the number of running threads will be bigger than the number of available processors.

Work stealing isn't impacted by tasks suspension. In the case of suspending a task, thread previously executing this task, will simply start to work on another task. This approach also reduces the dependencies between threads because most of time, threads work only on their internal queues.

Scheduling in parallel programming can be accomplished with a strategy called work stealing. It consists on automatic-management of threads tasks. When a thread has nothing to do, it picks up the not executed tasks of another thread, which has more work to do. Opposite strategy is called work sharing and it consists on sharing the same task pool by all working threads.

If you liked it, you should read:

📚 Newsletter Get new posts, recommended reading and other exclusive information every week. SPAM free - no 3rd party ads, only the information about waitingforcode!