CPU-Bound Versus I/O-Bound Threads

Threads of execution tend to be either CPU-bound or I/O-bound (Input/Output bound). That is, some threads spend a lot of time using the CPU to do computations, and others spend a lot of time waiting for relatively slow I/O operations to complete. For example, a thread that is sequencing DNA will be CPU-bound. A thread taking input for a word processing program will be I/O-bound, as it spends most of its time waiting for a human to type. It is not always clear whether a thread should be considered CPU-bound or I/O-bound. The best a scheduler can do is guess, if it cares at all. Many schedulers do care about whether a thread should be considered CPU-bound or I/O-bound, and thus techniques for classifying threads as one or the other are important parts of schedulers.
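
As a rough illustration of the kind of guess described above, the following sketch classifies a thread by the share of its time spent waiting rather than running. The ThreadStats record, the classify function, and the 0.5 threshold are all hypothetical names and values chosen for this example; they are not taken from the Linux scheduler or any other real scheduler.

```python
from dataclasses import dataclass


@dataclass
class ThreadStats:
    """Hypothetical per-thread accounting, not taken from any real kernel."""
    run_ms: float = 0.0   # time spent executing on a CPU
    wait_ms: float = 0.0  # time spent blocked waiting for I/O


def classify(stats: ThreadStats, threshold: float = 0.5) -> str:
    """Guess whether a thread is CPU-bound or I/O-bound.

    The threshold is an assumed tuning value: a thread that spends more
    than this fraction of its time waiting is treated as I/O-bound.
    """
    total = stats.run_ms + stats.wait_ms
    if total == 0:
        return "unknown"  # no history yet, so there is nothing to guess from
    wait_fraction = stats.wait_ms / total
    return "io-bound" if wait_fraction > threshold else "cpu-bound"


# Made-up numbers for the two examples in the text.
dna_sequencer = ThreadStats(run_ms=950, wait_ms=50)
word_processor = ThreadStats(run_ms=5, wait_ms=995)

print(classify(dna_sequencer))   # cpu-bound
print(classify(word_processor))  # io-bound
```

A thread whose time is split evenly sits right at the threshold, which is one way to see why the classification is only ever a guess.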

Schedulers tend to give I/O-bound threads priority access to CPUs. Programs that accept human input tend to be I/O-bound: even the fastest typist has a considerable amount of time between each keystroke during which the program he or she is interacting with is simply waiting. It is important to give programs that interact with humans priority, since a lack of speed and responsiveness is more likely to be perceived when a human is expecting an immediate response than when a human is waiting for some large job to finish.
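
One simple way to picture this preference is a run queue in which I/O-bound threads receive a priority bonus. The sketch below is a toy model under stated assumptions: run_order, IO_BONUS, and the convention that a lower number runs first are invented for illustration and do not describe how any particular scheduler computes priorities.

```python
import heapq

IO_BONUS = 5  # assumed bonus; a real scheduler would derive this differently


def run_order(threads):
    """Return thread names in the order a toy scheduler would run them.

    threads is a list of (name, base_priority, is_io_bound) tuples.
    A lower effective priority number runs first, and I/O-bound threads
    have IO_BONUS subtracted from their base priority.
    """
    heap = []
    for position, (name, base_priority, is_io_bound) in enumerate(threads):
        effective = base_priority - (IO_BONUS if is_io_bound else 0)
        # position breaks ties so equal priorities keep arrival order
        heapq.heappush(heap, (effective, position, name))
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


# Both threads start with the same base priority; the I/O-bound one goes first.
print(run_order([("dna_sequencer", 10, False), ("word_processor", 10, True)]))
```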

It is also beneficial to the system as a whole to give priority to programs that are I/O-bound but not because of human input [2]. Because I/O operations usually take a long time, it is good to get them started as early as possible. For example, a program that needs a piece of data from a hard disk has a long wait ahead before it gets its data. Kicking off the data request as quickly as possible frees up the CPU to work on something else during the request and helps the program that submitted the data request to be able to move on as quickly as possible. Essentially, this comes down to parallelizing system resources as efficiently as possible. A hard drive can seek data while a CPU works on something else, so having both resources working as early and often as possible is beneficial. Many CPU operations can be performed in the time it takes to get data from a hard drive.
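
The benefit of starting I/O early can be sketched at the program level. In the example below, slow_read stands in for a disk request (the 0.2 second sleep is an assumed latency, not a measured one) and compute stands in for unrelated CPU work; both names are hypothetical. The overlapped version issues the request first and computes while it is pending, so its total time is roughly the longer of the two rather than their sum.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_read() -> bytes:
    """Stand-in for a disk read; the sleep simulates an assumed 0.2 s latency."""
    time.sleep(0.2)
    return b"data"


def compute() -> int:
    """Stand-in for CPU work that does not depend on the read."""
    return sum(i * i for i in range(200_000))


def sequential():
    """Do the CPU work first, then wait for the I/O."""
    result = compute()
    data = slow_read()
    return data, result


def overlapped():
    """Kick off the I/O first, then compute while it is in flight."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(slow_read)  # the request starts immediately
        result = compute()                # CPU stays busy during the wait
        data = pending.result()           # block only for the remaining latency
    return data, result


for fn in (sequential, overlapped):
    start = time.perf_counter()
    fn()
    print(f"{fn.__name__}: {time.perf_counter() - start:.3f} s")
```

The exact timings depend on the machine, so none are shown here. The sleep releases the interpreter lock, which is what lets the two activities overlap in this sketch.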


[2] It is fortunate that both human-interactive and non-human-interactive I/O activity should be awarded a higher priority since there is really no way to tell at the scheduler level what I/O was human-initiated and what was not. The scheduler does not know whether a program is blocked waiting for keyboard input or it is blocked waiting for data from a hard drive.

-- Josh Aas

from "Understanding the Linux CPU Scheduler"

Quoted on Sun Apr 7th, 2013