WTF is a Thread

What is a thread? Before we can talk about a thread we have to understand a process. A process is an instance of an executing program. At its core, a process is essentially a blob of memory: it has code that needs to be executed, it has data such as variables, it has registers, and it has a stack to keep track of the order of execution. The code gets compiled into instructions that can run on our CPU. So if we take an example program in C and compile it, it turns into machine instructions, and as the program runs, those instructions are fetched from memory and executed on the CPU.

There are two really important parts of a process. The first is the program counter, which keeps track of where the process is currently executing. Imagine the program counter as an arrow pointing at the current instruction. As the program runs, this location is kept by the hardware in a register on the CPU, and when the program pauses or is preempted by another program, we have to store that counter somewhere.

There's also a thing called the stack, which keeps track of the depth of our program. As we execute, we'll eventually call a function, say an increment function, and the current context is pushed onto the stack. Eventually the increment function has to return, and whenever we return, where do we go? The program looks at the stack, sees where we were executing before the call, and uses that to determine where to resume execution.

Also, a special thanks to Julia Evans: you can find this code and this example on her blog, as well as a really good blog post describing what exactly stacks are and how they're implemented. Thank you.
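That call-and-return dance is easiest to see in code. This isn't the exact program from the talk, just a hypothetical sketch in the same spirit, with an `increment` function and a caller:

```c
/* increment's frame goes on the stack when it's called; the saved
 * return address in that frame is how execution resumes afterwards. */
int increment(int x) {
    return x + 1; /* pop the frame, jump back to the saved address */
}

int run(void) {
    int i = 0;
    i = increment(i); /* push a frame for increment, jump into it */
    i = increment(i); /* execution resumed here after the first call */
    return i;         /* 2 */
}
```

Each call pushes a frame, each return pops one, and the program counter always knows where to go next because the stack remembered for it.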
At this point we've got roughly everything we need to start and stop a program, including the program counter and the stack pointer. We need a way to store all of this data, and so it goes into a struct in the operating system called the process control block, or PCB. This includes the process state, process number, program counter, registers, memory limits, list of open files, signal mask, and CPU scheduling information, in addition to some other things. Also, a fun fact: since a process is entirely described by a PCB, whenever we want to fork a process, essentially all we have to do is copy the parent's process control block into a child process control block. So I thought that was kind of interesting.
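To make that concrete, here's what a PCB might look like as a C struct. This is a simplified sketch, not any real kernel's layout (Linux, for instance, calls its version `task_struct` and it's far larger), but it maps one field to each item in the list above:

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* A simplified, hypothetical process control block -- one field per
 * item in the list above, not a real kernel structure. */
enum proc_state { NEW, READY, RUNNING, WAITING, TERMINATED };

struct pcb {
    enum proc_state state;     /* process state */
    pid_t pid;                 /* process number */
    uintptr_t program_counter; /* where to resume execution */
    uintptr_t registers[16];   /* saved general-purpose registers */
    size_t memory_limit;       /* memory limits */
    int open_files[16];        /* list of open file descriptors */
    uint64_t signal_mask;      /* which signals are blocked */
    int priority;              /* CPU scheduling information */
};
```

The fork fact falls out of this picture: conceptually, `child = *parent;` plus a fresh process number is most of what fork has to do.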
So a process can do pretty much two types of tasks. The first type is a CPU-bound task, where the program spends the majority of its time actually executing code; an example would be processing data or calculating prime numbers. The second type is an IO task; an example would be reading a file from disk, making a database call, or making a network call, for example to the Heroku API. Whenever we're doing an IO-heavy task the process doesn't do much: it makes the request and then just sits there and waits for data to come back, so that it can wake up and start doing work again.
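The two kinds of tasks might look like this in C. `count_primes` is CPU-bound, while the `read` call stands in for the disk, database, or network request (the function names here are mine, not from the talk):

```c
#include <unistd.h>

/* CPU-bound: the process spends its time executing instructions. */
long count_primes(long limit) {
    long count = 0;
    for (long n = 2; n < limit; n++) {
        int is_prime = 1;
        for (long d = 2; d * d <= n; d++)
            if (n % d == 0) { is_prime = 0; break; }
        count += is_prime;
    }
    return count;
}

/* IO-bound: the process mostly sits blocked in the kernel waiting.
 * This read() stands in for a disk, database, or network call. */
ssize_t read_some(int fd, char *buf, size_t len) {
    return read(fd, buf, len); /* the CPU is idle while we wait */
}
```

While `count_primes` runs, the CPU is busy the whole time; while `read_some` blocks, the CPU has nothing to do for this process.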
This matters because we pay for our CPU time, and we don't want our program to just sit there sleeping and waiting on data. We want to maximize our CPU utilization. So how exactly do we do that?
If we have one CPU-heavy process and one CPU, then we're set: the CPU is working at maximum speed. But what happens when we start to see IO? Now our program is just sitting there sleeping, our CPU is idle for a third of the time, and we're no longer using it efficiently.

Well, we can fix this by adding an additional process. Now, while one of our processes is sleeping, the other process is running, so aside from the context switches we are pretty much using our CPU 100% of the time.

All right, so what are the pros? Whenever we're using multiple processes to utilize our CPU, well, we get better CPU utilization. So that's good. What are the cons? Unfortunately, to do this we need a lot of processes, and processes take up memory, which is a finite resource. Loading and unloading a PCB is also expensive. And sharing data between processes is hard: you can do things like map memory into different processes, or share data via sockets, but in general it's difficult enough that most people don't really do it.
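For a taste of why cross-process sharing is so much plumbing, here's a sketch using a pipe, one of the simpler mechanisms: the child writes a message, and the parent blocks in `read` until it arrives (the helper name is mine):

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Send a message from a child process back to its parent via a pipe.
 * Even this simple exchange needs a kernel object, a fork, and
 * explicit reads and writes -- nothing is shared implicitly. */
int share_via_pipe(char *buf, size_t len) {
    int fds[2];
    if (pipe(fds) != 0) return -1;

    pid_t pid = fork();
    if (pid == 0) {                       /* child: write and exit */
        close(fds[0]);
        write(fds[1], "hello from child", 16);
        _exit(0);
    }

    close(fds[1]);                        /* parent: wait for data */
    ssize_t n = read(fds[0], buf, len - 1);
    buf[n < 0 ? 0 : n] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return 0;
}
```

Compare this to two threads, where both sides could simply read the same variable.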
Wouldn't it be cool if there was some kind of really lightweight process, something that uses less memory, something with a smaller context-switch time? Well, we have it: it's called a thread.

Previously we looked at a process, which is everything we need to run a program. With a thread we can share the code and we can also share the data; really, all a new thread needs is its own stack and its own registers. So a single-threaded process is a process with one thread inside it, and that thread has registers and a stack. If we want to have multiple threads in our process, we don't have to duplicate the code or the data, we just need new registers and a new stack for each of our threads.
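Here's a minimal sketch of that with POSIX threads (the function names are my own): `shared` lives in the process's data segment and is visible to every thread, while `local` lives on each thread's private stack.

```c
#include <pthread.h>

int shared = 0;   /* process data: visible to every thread */

void *worker(void *arg) {
    int local = *(int *)arg;  /* lives on this thread's own stack */
    shared += local;          /* safe here only because the caller
                                 joins each thread before starting
                                 the next one */
    return NULL;
}

int run_two_threads(void) {
    pthread_t t1, t2;
    int a = 1, b = 2;
    pthread_create(&t1, NULL, worker, &a);
    pthread_join(t1, NULL);   /* run the threads one after another */
    pthread_create(&t2, NULL, worker, &b);
    pthread_join(t2, NULL);
    return shared;            /* both increments landed: 3 */
}
```

Note the threads are deliberately run one at a time; what happens when they touch `shared` concurrently is exactly the downside coming up below.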
All right, so what are the pros of using multiple threads? Well, like before, we get much better CPU utilization. Unlike before, we can reuse the existing process memory, which means we don't have to duplicate the code or the data, and it means we're not going to run out of memory, or at least not nearly as fast. It also means we have a smaller context-switch time between threads than between processes, just because a thread is more lightweight. We can also reuse cached values, such as memory lookups.
All right, now there has to be a downside to this, and the downside actually looks a lot like the upside. The shared code and shared data inside our process do make things faster and more lightweight. Unfortunately, they also make things much harder to work with. Now we have to consider how other threads will be accessing our data, and we have to introduce new constructs like mutexes, condition variables, and other forms of synchronization.

So the core value-add of threads is that sharing. That's basically all a thread is: a process with shared code and data. But unfortunately, by doing that, we also add a lot of complexity.
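A mutex makes that complexity concrete. Without the lock in this sketch, the two threads' `counter++` operations (a read, an add, and a write) can interleave and lose updates; with it, each read-modify-write is atomic:

```c
#include <pthread.h>

long counter = 0;
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void *bump(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);
        counter++;                 /* protected critical section */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

long run_counters(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, bump, NULL);
    pthread_create(&t2, NULL, bump, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return counter;  /* always 200000 with the mutex held; without
                        it, often less, and different on every run */
}
```

This is the trade in miniature: the sharing is free, but every access to shared data now has to be thought about.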
So hopefully today you've come to better understand exactly what a thread is, and it's not as mythical and not as difficult to understand. Hopefully you've also understood a little bit about why exactly threads are difficult to use and why they're hard, but also why we can't necessarily make them any easier. Thank you very much for joining me.
