Operating Systems Part 1

Part 1: Virtualization

Virtualization: The operating system takes a physical resource (e.g. the processor or memory) and transforms it into a more general, powerful, and easy-to-use virtual form of itself. This is why we sometimes refer to the operating system as a virtual machine.

So that users like us can make use of these features, the OS provides interfaces (APIs) called system calls; a typical OS makes a few hundred of them available to applications. System calls are used to run programs, access memory and devices, and perform other related actions. This is why we sometimes say that the operating system provides a standard library to applications.

The operating system's role is also to manage resources (CPU, memory, etc.).

Let's look at some examples:

CPU (Central Processing Unit)

The brain of the Computer.

The simple program below uses a function called Spin(), which repeatedly checks the time and returns once it has run for a second. The program prints the string we passed in, spins for a second, and repeats until it is stopped.

#include <stdio.h>
#include <stdlib.h>
#include "common.h"

int main(int argc, char *argv[])
{
    if (argc != 2) {
	fprintf(stderr, "usage: cpu <string>\n");
	exit(1);
    }
    char *str = argv[1];

    while (1) {
	printf("%s\n", str);
	Spin(1);
    }
    return 0;
}

Let's run this program and pass in a string: ./cpu hello

Output:

hello
hello
hello
hello
..

The program keeps running (it loops) until we stop it. Press Control-C to stop it on a MacBook.

We can also use the command ps aux, a tool for monitoring the processes running on your system. We can combine it with grep, a command-line utility that searches for lines matching a regular expression.

In another terminal we can run the following, where cpu is the name of our program:

ps aux | grep cpu

Output:

devinpowers       6893 100.0  0.0 408497216   1376 s002  R+   12:28PM   0:30.87 ./cpu devin
devinpowers       6905   0.0  0.0 408832480   1680 s000  U+   12:28PM   0:00.00 grep cpu

Above we can see the processes running on our computer. We can then use kill to stop the process, with 6893 being the process ID (PID) of the cpu process currently running.

kill -9 6893

Afterwards, if we look at the original terminal that was printing hello, we see that the process was killed:

zsh: killed     ./cpu hello

We can also run multiple CPU processes at once. For example, let's imagine our computer has a single CPU and run:

./cpu "Devin" & ./cpu "Hello" &

Output:

[1] 7341
[2] 7342
Devin                                                                           
(base) devinpowers@Devins-MacBook-Pro intro % Hello
Devin
Hello
Devin
Hello
...
...

How does this work (running multiple processes)? The operating system, with some help from the hardware, creates the illusion that the system has a very large number of virtual CPUs. Turning a single CPU into a seemingly infinite number of CPUs, and thus allowing many programs to seemingly run at once, is what we call virtualizing the CPU.

Instructions

Instruction Cycle

Machine language instructions are stored in memory (they are copied from disk into memory, i.e. RAM, when the program is loaded).

While the system is receiving power the control unit in the CPU is constantly performing the instruction cycle to fetch and execute instructions, as shown below:

Repeat:

  • Fetch the next instruction
  • Execute that instruction

In a Loop!

Instruction Trace

Linux Utility Programs

The Linux environment includes a rich collection of utility programs that handle common tasks:

  • cp - copy a file
  • ls - list the contents of a directory
  • mv - move or rename a file
  • mkdir - create a new directory

These utility programs use system calls and library functions to do the work.

Example: cp

The cp utility copies a file from one location to another.

Each of these utility programs is simply a C program.

Linux Shells

Two main families of shells:

Bourne Family

  • sh (Bourne Shell)
  • Bash
  • zsh (What I use on my MacBook)

C-shell Family

  • csh
  • tcsh

Executable programs can be machine language programs or shell scripts.

Shell scripts are sequences of commands that can be executed as a new command.

System Calls

What are system calls? They are the interface a program uses to request services from the operating system. There are five types of system calls:

Process Control: These system calls deal with processes, such as process creation and process termination.

File Management: These system calls are responsible for file manipulation, such as creating a file, reading a file, and writing to a file.

Device Management: These system calls are responsible for device manipulation, such as reading from and writing to device buffers.

Information Maintenance: These system calls handle information and its transfer between the operating system and the user program.

Communication: These system calls are used for interprocess communication. They also deal with creating and deleting a communication connection.

Virtual Memory

Remember, memory is just an array of bytes. To read memory, one must specify an address to access the data stored there; to write (or update) memory, one must also specify the data to be written to the given address.

Memory is accessed all the time while a program is running. Each instruction of the program is in memory as well, so memory is accessed on every instruction fetch.

Let’s run a program that allocates (creates) memory by calling malloc().

Memory.c

#include <assert.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include "common.h"

int main(int argc, char *argv[]) {
    if (argc != 2) { 
	fprintf(stderr, "usage: mem <value>\n"); 
	exit(1); 
    } 
    int *p; 
    p = malloc(sizeof(int));

    assert(p != NULL);

    printf("(%d) addr pointed to by p: %p\n", (int) getpid(), (void *) p);

    *p = atoi(argv[1]); // assign value to addr stored in p

    while (1) {
	Spin(1);   // Wait 1 second to print out to the screen.
	*p = *p + 1;
	printf("(%d) value of p: %d\n", getpid(), *p);
    }
    return 0;
}

Run ./memory 1 (the program needs an initial value as its argument):

Output:

(2709) addr pointed to by p: 0x600000674020
(2709) value of p: 2
(2709) value of p: 3
(2709) value of p: 4
(2709) value of p: 5
(2709) value of p: 6

The program does a few things. First, it allocates some memory with p = malloc(sizeof(int)). It then prints the address of that memory, stores the number from the command line there, and increments it once a second. Each line also starts with a number (2709), the process identifier (PID) of the running program; the PID is unique to each running process. Our newly allocated memory is at address 0x600000674020.

We can also run multiple instances of the memory program at the same time, as shown below:

./mem 1 & ./mem 1 &

Output:

(3955) addr pointed to by p: 0x600000e70020
(3956) addr pointed to by p: 0x600000e70020
(3955) value of p: 2
(3956) value of p: 2
(3955) value of p: 3
(3956) value of p: 3
(3955) value of p: 4
(3956) value of p: 4
...
...

We can see from the output above that each running program has allocated memory at the same address (0x600000e70020), yet each updates the value there independently. It is as if each running program has its own private memory, instead of sharing the same physical memory with other running programs.

The reality is that physical memory is a shared resource managed by the operating system. Each process accesses its own private virtual address space, which the OS maps onto physical memory.

Va.c Program

  • Shows the locations (addresses) of the code, heap, and stack within the program's address space

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    printf("location of code : %p\n", (void *) main);
    printf("location of heap : %p\n", malloc(100e6));
    int x = 3;
    printf("location of stack: %p\n", (void *) &x);
    return 0;
}

Run executable:

./Va

location of code : 0x102d6bec8
location of heap : 0x153800000
location of stack: 0x16d0975dc

I ran this code on my M1 Mac, and we can see that the code comes first in the address space, then the heap, and finally the stack!