Operating Systems Part 1
Part 1: Virtualization
Virtualization: The operating system takes a physical resource (e.g. the processor or memory) and transforms it into a more general, powerful, and easy-to-use virtual form of itself. This is why we sometimes refer to the operating system as a virtual machine.
So that users like us can make use of these features, the OS provides interfaces (APIs) called system calls, typically a few hundred of them, that applications can call. System calls are used to run programs, access memory and devices, and perform other related actions. This is why we sometimes say that the operating system provides a standard library to applications.
The operating system's other role is to manage resources (CPU, memory, etc.).
Let's look at some examples:
CPU (Central Processing Unit)
The brain of the Computer.
The simple CPU program below prints the string we passed in, then calls a function named Spin(), which repeatedly checks the time and returns once it has run for a second. It repeats this, printing the string once per second, until the program is stopped.
#include <stdio.h>
#include <stdlib.h>
#include "common.h"   // provides Spin()

int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: cpu <string>\n");
        exit(1);
    }
    char *str = argv[1];
    while (1) {
        printf("%s\n", str);
        Spin(1);   // busy-wait for one second
    }
    return 0;
}
Let's run the cpu program, passing in a string:
./cpu hello
Output:
hello
hello
hello
hello
..
The program keeps looping until we stop it. Press Control-C to stop it (this is on a MacBook).
We can also use the command ps aux, a tool for monitoring the processes running on your system. We can pipe its output to grep, a command-line utility that prints the lines matching a regular expression.
In another terminal we can run the following, where cpu is the name of our program:
ps aux | grep cpu
Output:
devinpowers 6893 100.0 0.0 408497216 1376 s002 R+ 12:28PM 0:30.87 ./cpu devin
devinpowers 6905 0.0 0.0 408832480 1680 s000 U+ 12:28PM 0:00.00 grep cpu
Above we can see the running processes whose command line matches cpu (including the grep command itself). We can then use kill to stop the process, where 6893 is the process ID (PID) of the running cpu process, shown in the second column of the output.
kill -9 6893
Afterwards, if we look at the original terminal that was printing hello, we see that the process was killed:
zsh: killed ./cpu hello
We can also run multiple copies of the cpu program at the same time. Let's imagine our computer has a single CPU and run:
./cpu "Devin" & ./cpu "Hello" &
Output:
[1] 7341
[2] 7342
Devin
(base) devinpowers@Devins-MacBook-Pro intro % Hello
Devin
Hello
Devin
Hello
...
...
How does this work (running multiple processes)? The operating system, with some help from the hardware, creates the illusion that the system has a very large number of virtual CPUs. Turning a single CPU into a seemingly infinite number of CPUs, and thus allowing many programs to seemingly run at once, is what we call virtualizing the CPU.
Instruction Cycle
Machine language instructions are stored in memory (they are copied from disk into memory, i.e. RAM, when the program is loaded).
While the system is receiving power, the control unit in the CPU constantly performs the instruction cycle to fetch and execute instructions, as shown below:
Repeat:
- Fetch the next instruction
- Execute that instruction
In a Loop!
Instruction Trace
Linux Utility Programs
The Linux environment includes a rich collection of utility programs that handle common tasks:
- cp - copy a file
- ls - list the contents of a directory
- mv - move or rename a file
- mkdir - create a new directory
These utility programs use system calls and library functions to do the work.
Example: cp
The cp utility copies a file from one location to another.
Each of these utility programs is simply a C program.
Linux Shells
Two main families of shells:
Bourne Family
- sh (Bourne Shell)
- Bash
- zsh (What I use on my MacBook)
C-shell Family
- csh
- tcsh
Executable programs can be machine language programs or shell scripts.
A shell script is a sequence of commands, stored in a file, that can be executed as a new command.
System Calls
What are system calls? They are the interface through which a program requests a service from the operating system. They are commonly grouped into five types:
- Process Control: these system calls deal with processes, such as process creation and process termination.
- File Management: these system calls are responsible for file manipulation, such as creating a file, reading a file, and writing to a file.
- Device Management: these system calls are responsible for device manipulation, such as reading from device buffers and writing into device buffers.
- Information Maintenance: these system calls handle information and its transfer between the operating system and the user program.
- Communication: these system calls are used for interprocess communication. They also deal with creating and deleting a communication connection.
Virtual Memory
Remember, memory is just an array of bytes. To read memory, one must specify an address to access the data stored there; to write (or update) memory, one must also specify the data to be written to the given address.
Memory is accessed all the time while a program is running. Each instruction of the program is in memory as well, so memory is also accessed every time an instruction is fetched.
Let’s run a program that allocates (creates) memory by calling malloc().
Memory.c
#include <assert.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include "common.h"   // provides Spin()

int main(int argc, char *argv[]) {
    if (argc != 2) {
        fprintf(stderr, "usage: mem <value>\n");
        exit(1);
    }
    int *p;
    p = malloc(sizeof(int));
    assert(p != NULL);   // assert() is declared in <assert.h>
    printf("(%d) addr pointed to by p: %p\n", (int) getpid(), (void *) p);
    *p = atoi(argv[1]); // assign value to addr stored in p
    while (1) {
        Spin(1); // wait 1 second between updates
        *p = *p + 1;
        printf("(%d) value of p: %d\n", (int) getpid(), *p);
    }
    return 0;
}
Run ./memory 1 (the program needs a starting value as its argument)
Output:
(2709) addr pointed to by p: 0x600000674020
(2709) value of p: 2
(2709) value of p: 3
(2709) value of p: 4
(2709) value of p: 5
(2709) value of p: 6
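The program does a few things. First, it allocates some memory with p = malloc(sizeof(int)); and prints the address of that memory. Each line is also prefixed with a number, (2709), which is the process identifier (PID) of the running program; the PID is unique to each running process. Our newly allocated memory is at address 0x600000674020. The program then stores the value passed on the command line at that address and, once per second, increments it and prints the result.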
We can also run multiple instances of the memory program at the same time, as shown below:
./mem 1 & ./mem 1 &
Output:
(3955) addr pointed to by p: 0x600000e70020
(3956) addr pointed to by p: 0x600000e70020
(3955) value of p: 2
(3956) value of p: 2
(3955) value of p: 3
(3956) value of p: 3
(3955) value of p: 4
(3956) value of p: 4
...
...
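We can see from the output above that each program has allocated memory at the same address, 0x600000e70020, and yet each one updates the value stored there independently. It is as if each running program has its own private memory, instead of sharing the same physical memory with other running programs. This is what virtualizing memory means: each process accesses its own private virtual address space, which the operating system maps onto the physical memory of the machine.
In reality, physical memory is a shared resource managed by the operating system.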
Va.c Program
- Shows the locations of the code, the heap, and the stack in the program's address space
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    printf("location of code : %p\n", (void *) main);
    printf("location of heap : %p\n", malloc(100e6));
    int x = 3;
    printf("location of stack: %p\n", (void *) &x);
    return 0;
}
Run executable:
./Va
location of code : 0x102d6bec8
location of heap : 0x153800000
location of stack: 0x16d0975dc
I ran this code on my M1 Mac. We can see that the code comes first (at the lowest address), then the heap, and finally the stack (at the highest address)!