Everything Linux: The Process

Evan Wireman
5 min read · Jul 6, 2021
Image from Amirali Mirhashemian on Unsplash.com

Ahh, the process. The first of many abstractions we will go over in this series, and the most fundamental piece of any operating system. Informally, a process can be looked at as a running program. When you sit down and type code, you are writing a program. Once you run your code, the active, sequential execution of your program is called a process.

A process needs a few things in order to execute. Namely, it needs access to memory and access to the CPU. Memory is necessary because the program's code and data are loaded into memory and stay there for the lifespan of the process. Access to the CPU is necessary because this is the piece of the computer that can “understand” the program and execute each instruction.

Now, I want to take you through an interactive exercise. If you are on a Mac or a Linux machine, open up your terminal and run

ps aux

If you are on Windows, you can run tasklist in a command prompt or open Task Manager to view all active processes. Either way, the number of active processes on your computer is almost certainly going to be larger than 20.

Now, again if you are on Mac or Linux, in your terminal type

getconf _NPROCESSORS_ONLN

This will show how many processors (logical CPUs) are currently online in your system. On Windows, you can view this in the Performance tab of Task Manager.
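If you would rather ask the same question from code, here is a minimal sketch in C. The function name online_cpus is my own, and _SC_NPROCESSORS_ONLN is an extension that Linux (glibc) and macOS provide rather than something POSIX guarantees, so treat this as an assumption about your platform.

```c
#include <unistd.h>

/* Returns the number of processors currently online, or -1 on error.
   Assumption: the platform supports the _SC_NPROCESSORS_ONLN extension
   (true on Linux with glibc and on macOS). */
long online_cpus(void) {
    return sysconf(_SC_NPROCESSORS_ONLN);
}
```

Printing the result of online_cpus() from main with printf("%ld\n", ...) should give the same number as the getconf command above.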

Either way, it is very likely that you have far fewer processors than active processes, which opens up our first big question about Linux:

How Can We Provide the Illusion of Many CPUs?

Linux creates this illusion by virtualizing the CPU. Essentially, the OS has the power to stop the execution of one process and begin executing another. While a process is not executing, it is unaware that the CPU is being used elsewhere. When it is using the CPU, it behaves as though it were the only active process, with the CPU all to itself.

The core of this virtualization system is the OS’s ability to switch between processes. There are two tools that are used to achieve this:

  • Dispatcher: A low-level method that performs a context switch (discussed in depth later)
  • Scheduler: Higher-level logic that decides which process should run when

Both of these will be covered in more depth later. However, it is worth noting that both systems need to know the state of a process in order to do their jobs.

Process States

Generally, processes can be in one of three states:

  • Running: Where a process is loaded on the processor, and instructions are actively being executed.
  • Ready: In this state, a process is prepared to execute, but is not currently running on a processor.
  • Blocked: In this state, the process has done something that prevents it from being ready to run. A prime example of this is I/O. If a process reads information from a disk, it typically cannot continue execution until this I/O is complete, so it becomes blocked, which lets some other process use the CPU in the meantime.
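To see why the state matters, here is a toy sketch in C of a scheduling decision built on these three states. It is not how Linux actually schedules; the enum values and the pick_next function are hypothetical names I made up for illustration. The idea is simply to walk the process list in round-robin order and skip anything that is not Ready.

```c
/* The three states from the list above (hypothetical names). */
enum proc_state { READY, RUNNING, BLOCKED };

/* Toy scheduling decision: starting just after the process that ran
   last, return the index of the first READY process.
   Returns -1 if no process is ready to run. */
int pick_next(const enum proc_state states[], int n, int last) {
    for (int step = 1; step <= n; step++) {
        int i = (last + step) % n;
        if (states[i] == READY) {
            return i;
        }
    }
    return -1;
}
```

For example, with the states { BLOCKED, READY, RUNNING, READY } and last set to 1, the function skips index 2 (already running) and picks index 3. A blocked process is never chosen, which is exactly why the scheduler needs to know each process's state.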

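The state of a process can be found in its Process Control Block (PCB). Every process has a PCB. In the Linux kernel this role is played by struct task_struct, which is very large, so here is a simplified version that contains the following information: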

struct proc {
    char *mem;               // The start of process memory
    enum proc_state state;   // The state of the process
    int pid;                 // The Process ID
    struct context context;  // Switch here to run process
    // ... (other fields omitted)
};
// Side note: any code written in this series will be in C.
// If you would like me to post a C tutorial series, leave
// a comment.

There is more that exists within the PCB, but the four values I listed are most important for the purposes of this series. Specifically, we will look more into the context structure in the next article.

Aside from this proc structure, all processes have an address space. An address space is essentially a body of memory that is made available to a process. Address spaces are another example of an abstraction, one we will discuss later in this series.

Interacting With Linux: The Process API

The process API is essentially the interface that Linux provides for working with processes. This interface contains a few system calls:

fork()

This system call creates a new process. This new process is referred to as the child process, while the process that called fork() is referred to as the parent process. If you have ever heard the term fork bomb, it refers to a program that calls fork() over and over, with every child doing the same, until the system is flooded with active processes. Thus, as you likely assumed, we need a way to stop active processes.
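Here is a minimal sketch of fork() in C. The function name fork_demo is hypothetical. It relies on two facts about fork(): it returns 0 in the child and the child's PID in the parent, and the child gets its own copy of the parent's variables.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks once. The child changes its own copy of `value` and exits.
   The parent reaps the child and returns its own, unchanged copy.
   Returns -1 if fork() fails. */
int fork_demo(void) {
    int value = 1;
    pid_t pid = fork();
    if (pid < 0) {
        return -1;            /* fork failed, no child was created */
    }
    if (pid == 0) {
        value = 2;            /* child: modifies only the child's copy */
        _exit(0);
    }
    waitpid(pid, NULL, 0);    /* parent: reap the child so it does not linger */
    return value;             /* still 1 in the parent */
}
```

The parent still sees 1 after the child has set its copy to 2, because after fork() the two processes no longer share that variable.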

kill()

The kill() system call sends a signal to a process. You pass in the PID of the target process along with the signal you want to send. Sending SIGKILL, for example, terminates the process immediately, regardless of whether it had finished executing. Despite its name, kill() can be used for more than just killing processes: other signals, such as SIGSTOP to pause a process and SIGCONT to resume it, can be useful too. But what if we want a process to finish executing on its own, without killing it?
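As a rough sketch of what that looks like in C, the function below (kill_demo is a hypothetical name) starts a child that would otherwise sleep forever, sends it SIGKILL, and then asks the OS which signal ended it. I chose SIGKILL for the sketch because a process cannot ignore or handle it.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Starts a child that sleeps until a signal arrives, sends it SIGKILL,
   and returns the number of the signal that terminated the child.
   Returns -1 on any error. */
int kill_demo(void) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        for (;;) {
            pause();          /* child: do nothing until a signal arrives */
        }
    }
    if (kill(pid, SIGKILL) != 0) {
        return -1;
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    /* WIFSIGNALED is true when the child was ended by a signal */
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Here the return value equals SIGKILL, confirming that the child was stopped by the signal and not by finishing on its own.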

wait()

The wait() system call is used by the parent process to, as you may have guessed, wait for the child process to finish executing. This is useful if you need the exit status of a child process. However, so far the child process runs the same program as the parent. Think about it: we called fork() but never specified new code for the child process. This is why we need our final system call.
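A minimal sketch of wait() in C might look like the following. The function name wait_demo and the exit code 7 are arbitrary choices of mine for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that exits immediately with status 7, then waits for
   it in the parent and returns the child's exit status.
   Returns -1 on any error. */
int wait_demo(void) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        _exit(7);             /* child: finish with an arbitrary status */
    }
    int status;
    if (wait(&status) != pid) {
        return -1;            /* parent: blocks here until the child ends */
    }
    /* WIFEXITED is true when the child ended by exiting normally */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The parent blocks inside wait() until the child is done, and then WEXITSTATUS recovers the 7 that the child passed to _exit().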

exec()

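The final system call that is worth mentioning in this article is exec(). In practice, exec() is a family of functions (execl(), execvp(), and others) that differ mainly in how arguments are passed. You would call one of them in the child process and provide the name of some executable, and Linux would load the code and data from that executable in place of the program the process is currently running. The process itself, including its PID, stays the same.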
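Putting the pieces together, here is a sketch of the common fork-then-exec pattern in C. The function name exec_demo is hypothetical, and I am assuming a shell exists at /bin/sh, which is true on typical Linux and macOS systems. The child replaces itself with a shell command that simply exits with status 3.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child, replaces the child's program with `sh -c "exit 3"`,
   and returns the exit status the parent collects.
   Returns -1 on any error. Assumes /bin/sh exists. */
int exec_demo(void) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* child: on success execl() never returns, because the old
           program has been replaced by the shell */
        execl("/bin/sh", "sh", "-c", "exit 3", (char *)NULL);
        _exit(127);           /* reached only if execl() failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The status 3 that comes back was produced by the shell, not by our C code, which shows that the child really did switch to running a different program.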

Key Take-Aways

  • A program and a process are two different things
  • In order to have multiple processes running at once, we virtualize the CPU
  • In our simplified model, a process is always either Running, Ready, or Blocked
  • There is an API that allows for process creation and interaction

Thank you for reading. As always, if you notice something I got wrong, feel free to leave a comment. In the next article, I will cover the scheduler and introduce some queueing theory basics.

Happy Hacking :)
