3.2 Tasks of an Operating System

When you turn on the power to a computer, the first program that runs is usually a set of instructions kept in the computer’s read-only memory (ROM). This code examines the system hardware to make sure everything is functioning properly. This power-on self test (POST) checks the CPU, memory, and basic input-output systems (BIOS) for errors and stores the result in a special memory location. Once the POST has successfully completed, the software loaded in ROM (sometimes called the BIOS or firmware) will begin to activate the computer’s disk drives. In most modern computers, when the computer activates the hard disk drive, it finds the first piece of the operating system: the bootstrap loader.

The bootstrap loader is a small program that has a single function: It loads the operating system into memory and allows it to begin operation. In the most basic form, the bootstrap loader sets up the small driver programs that interface with and control the various hardware subsystems of the computer. It sets up the divisions of memory that hold the operating system, user information and applications. It establishes the data structures that will hold the myriad signals, flags and semaphores that are used to communicate within and between the subsystems and applications of the computer. Then it turns control of the computer over to the operating system.

The operating system’s tasks, in the most general sense, fall into six categories:

(1) Processor management;

(2) Memory management;

(3) Device management;

(4) Storage management;

(5) Application interface;

(6) User interface.

While there are some who argue that an operating system should do more than these six tasks, and some operating-system vendors do build many more utility programs and auxiliary functions into their operating systems, these six tasks define the core of nearly all operating systems. Next, let’s look at the tools the operating system uses to perform each of these functions.

3.2.1 Processor Management

The heart of managing the processor comes down to two related issues:

(1) Ensuring that each process and application receives enough of the processor’s time to function properly.

(2) Using as many processor cycles as possible for real work.

The basic unit of software that the operating system deals with in scheduling the work done by the processor is either a process or a thread, depending on the operating system.

It’s tempting to think of a process as an application, but that gives an incomplete picture of how processes relate to the operating system and hardware. The application you see (word processor, spreadsheet or game) is, indeed, a process, but that application may cause several other processes to begin, for tasks like communications with other devices or other computers. There are also numerous processes that run without giving you direct evidence that they ever exist. For example, Windows XP and UNIX can have dozens of background processes running to handle the network, memory management, disk management, virus checks and so on.

A process, then, is software that performs some action and can be controlled—by a user, by other applications or by the operating system.

It is processes, rather than applications, that the operating system controls and schedules for execution by the CPU. In a single-tasking system, the schedule is straightforward. The operating system allows the application to begin running, suspending the execution only long enough to deal with interrupts and user input.

(3-2) Interrupts are special signals sent by hardware or software to the CPU. It’s as if some part of the computer suddenly raised its hand to ask for the CPU’s attention in a lively meeting. Sometimes the operating system will schedule the priority of processes so that interrupts are masked—that is, the operating system will ignore the interrupts from some sources so that a particular job can be finished as quickly as possible. There are some interrupts (such as those from error conditions or problems with memory) that are so important that they can’t be ignored. These non-maskable interrupts (NMIs) must be dealt with immediately, regardless of the other tasks at hand.
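The difference between maskable and non-maskable interrupts can be illustrated with a small sketch. The class and method names below are hypothetical and chosen only for illustration; a real interrupt controller is a hardware device, and this toy model assumes that a masked interrupt is simply dropped.

```python
from dataclasses import dataclass, field


@dataclass
class InterruptController:
    """Toy model of interrupt masking; every name here is hypothetical."""
    masked_sources: set = field(default_factory=set)
    pending: list = field(default_factory=list)

    def mask(self, source):
        """Tell the controller to ignore maskable interrupts from source."""
        self.masked_sources.add(source)

    def unmask(self, source):
        self.masked_sources.discard(source)

    def raise_interrupt(self, source, maskable=True):
        """Return True if the interrupt is queued for the CPU."""
        if maskable and source in self.masked_sources:
            return False  # ignored while this source is masked
        # Non-maskable interrupts always reach the CPU.
        self.pending.append(source)
        return True
```

In this model, masking "keyboard" causes keyboard interrupts to be ignored, while an interrupt raised with maskable=False (for example, a memory error) is queued whether or not its source has been masked.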

While interrupts add some complication to the execution of processes in a single-tasking system, the job of the operating system becomes much more complicated in a multi-tasking system. Now, the operating system must arrange the execution of applications so that you believe that there are several things happening at once. This is complicated because the CPU can only do one thing at a time. Today’s multi-core processors and multi-processor machines can handle more work, but each processor core is still capable of managing only one task at a time.

In order to give the appearance of lots of things happening at the same time, the operating system has to switch between different processes thousands of times a second. Here’s how it happens:

(1) A process occupies a certain amount of RAM. It also makes use of registers, stacks and queues within the CPU and operating-system memory space.

(2) When two processes are multi-tasking, the operating system allots a certain number of CPU execution cycles to the first process.

(3) After that number of cycles, the operating system makes copies of all the registers, stacks and queues used by the first process, and notes the point at which the process paused in its execution.

(4) It then loads all the registers, stacks and queues used by the second process and allows it a certain number of CPU cycles.

(5) When those are complete, it makes copies of all the registers, stacks and queues used by the second process, and loads the first process again.
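The switching steps above can be sketched as a simple round-robin loop. This is a minimal simulation rather than how any particular operating system is written: the Process class, the single "pc" register and the abstract "cycles" are assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Process:
    name: str
    remaining: int  # work left, in abstract "cycles"
    saved_registers: dict = field(default_factory=dict)


def run_round_robin(processes, time_slice):
    """Return a list of (process name, cycles used) in execution order."""
    ready = deque(processes)
    schedule = []
    while ready:
        proc = ready.popleft()
        # Load the registers saved when this process was last paused.
        cpu_registers = dict(proc.saved_registers)
        pc = cpu_registers.get("pc", 0)
        # Let the process run for at most one time slice.
        used = min(time_slice, proc.remaining)
        proc.remaining -= used
        cpu_registers["pc"] = pc + used
        schedule.append((proc.name, used))
        if proc.remaining > 0:
            # Save the registers and note where the process paused.
            proc.saved_registers = dict(cpu_registers)
            ready.append(proc)
    return schedule
```

With two processes needing 5 and 3 cycles and a time slice of 2 cycles, the loop alternates between them until both have finished.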

3.2.2 Process Management

All of the information needed to keep track of a process when switching is kept in a data package called a process control block. The process control block typically contains:

(1) An ID number that identifies the process.

(2) Pointers to the locations in the program and its data where processing last occurred.

(3) Register contents.

(4) States of various flags and switches.

(5) Pointers to the upper and lower bounds of the memory required for the process.

(6) A list of files opened by the process.

(7) The priority of the process.

(8) The status of all I/O devices needed by the process.

Each process has a status associated with it. Many processes consume no CPU time until they get some sort of input. For example, a process might be waiting for a keystroke from the user. While it is waiting for the keystroke, it is “suspended” and uses no CPU time. When the keystroke arrives, the OS changes its status. When the status of the process changes, from pending to active, for example, or from suspended to running, the information in the process control block must be used like the data in any other program to direct execution of the task-switching portion of the operating system.
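A process control block and its status changes might be modeled as follows. The field names follow the list above, but the exact layout, the three state names and the helper functions are assumptions made for this sketch; real operating systems define their own structures.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    PENDING = "pending"      # ready, waiting for CPU time
    RUNNING = "running"
    SUSPENDED = "suspended"  # waiting for input, using no CPU time


@dataclass
class ProcessControlBlock:
    pid: int                                         # (1) ID number
    program_counter: int = 0                         # (2) where processing last occurred
    registers: dict = field(default_factory=dict)    # (3) register contents
    flags: dict = field(default_factory=dict)        # (4) flags and switches
    memory_bounds: tuple = (0, 0)                    # (5) lower and upper bounds
    open_files: list = field(default_factory=list)   # (6) files opened
    priority: int = 0                                # (7) priority
    io_status: dict = field(default_factory=dict)    # (8) I/O device status
    state: State = State.PENDING


def wait_for_input(pcb):
    """The process asks for a keystroke and stops consuming CPU time."""
    pcb.state = State.SUSPENDED


def deliver_input(pcb, key):
    """The keystroke arrives, so the OS marks the process ready again."""
    if pcb.state is State.SUSPENDED:
        pcb.io_status["last_key"] = key
        pcb.state = State.PENDING


def dispatch(pcb):
    """The scheduler gives the CPU to a pending process."""
    if pcb.state is State.PENDING:
        pcb.state = State.RUNNING
```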

This process swapping happens without direct user intervention, and each process gets enough CPU cycles to accomplish its task in a reasonable amount of time.

(3-3) Trouble can begin if the user tries to have too many processes functioning at the same time. The operating system itself requires some CPU cycles to perform the saving and swapping of all the registers, queues and stacks of the application processes. If enough processes are started, and if the operating system hasn’t been carefully designed, the system can begin to use the vast majority of its available CPU cycles to swap between processes rather than run processes. When this happens, it’s called thrashing, and it usually requires some sort of direct user intervention to stop processes and bring order back to the system.
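A rough calculation shows how switching overhead can crowd out real work. The model below is a deliberate simplification with made-up numbers: it assumes the operating system tries to give every process one turn within a fixed period and pays a fixed cost for each switch.

```python
def useful_cpu_fraction(num_processes, period_us, switch_cost_us):
    """Fraction of each period left for real work after switching.

    Assumes one switch per process per period (a simplification).
    """
    overhead_us = num_processes * switch_cost_us
    return max(0.0, (period_us - overhead_us) / period_us)
```

Under these assumptions, with a 10,000-microsecond period and a 100-microsecond switch cost, 10 processes leave about 90% of the CPU for real work, 90 processes leave about 10%, and 100 or more leave nothing at all.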

One way that operating-system designers reduce the chance of thrashing is by reducing the need for new processes to perform various tasks. Some operating systems allow for a “process-lite”, called a thread, which can deal with all the CPU-intensive work of a normal process, but generally does not deal with the various types of I/O and does not establish structures requiring the extensive process control block of a regular process. A process may start many threads or other processes, but a thread cannot start a process.

So far, all the scheduling we’ve discussed has concerned a single CPU. In a system with two or more CPUs, the operating system must divide the workload among the CPUs, trying to balance the demands of the required processes with the available cycles on the different CPUs. Asymmetric operating systems use one CPU for their own needs and divide application processes among the remaining CPUs. Symmetric operating systems divide themselves among the various CPUs, balancing demand versus CPU availability even when the operating system itself is all that’s running.
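The two approaches can be contrasted with a small load-balancing sketch. The greedy "least-loaded CPU" rule and the function name are assumptions chosen for illustration; the text does not prescribe a balancing policy. In the asymmetric case, CPU 0 is reserved for the operating system.

```python
def assign_processes(loads, num_cpus, asymmetric=False):
    """Map CPU index -> list of process names.

    loads maps process name -> estimated demand (arbitrary units).
    """
    if asymmetric and num_cpus < 2:
        raise ValueError("asymmetric scheduling needs at least two CPUs")
    first_app_cpu = 1 if asymmetric else 0
    assignment = {cpu: [] for cpu in range(num_cpus)}
    totals = {cpu: 0 for cpu in range(first_app_cpu, num_cpus)}
    # Place the heaviest processes first, each on the least-loaded CPU.
    for name, load in sorted(loads.items(), key=lambda item: -item[1]):
        target = min(totals, key=totals.get)
        assignment[target].append(name)
        totals[target] += load
    return assignment
```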

Even if the operating system is the only software with execution needs, the CPU is not the only resource to be scheduled. Memory management is the next crucial step in making sure that all processes run smoothly.

3.2.3 Memory and Storage Management

When an operating system manages the computer’s memory, there are two broad tasks to be accomplished:

(1) Each process must have enough memory in which to execute, and it can neither run into the memory space of another process nor be run into by another process.

(2) The different types of memory in the system must be used properly so that each process can run most effectively.

The first task requires the operating system to set up memory boundaries for types of software and for individual applications.

As an example, let’s look at an imaginary small system with 1 megabyte (treated here as 1,000 kilobytes) of RAM. During the boot process, the operating system of our imaginary computer is designed to go to the top of available memory and then “back up” far enough to meet the needs of the operating system itself. Let’s say that the operating system needs 300 kilobytes to run. Now, the operating system goes to the bottom of the pool of RAM and starts building up with the various driver software required to control the hardware subsystems of the computer. In our imaginary computer, the drivers take up 200 kilobytes. So after getting the operating system completely loaded, there are 500 kilobytes remaining for application processes.
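The arithmetic of this layout can be checked with a short sketch. The function name and the representation of a region as a (start, end) pair of kilobyte offsets are assumptions; the total is taken as 1,000 kilobytes so that the result matches the 500 kilobytes quoted above.

```python
def layout_memory(total_kb, os_kb, drivers_kb):
    """Return region name -> (start_kb, end_kb), with end exclusive."""
    if os_kb + drivers_kb > total_kb:
        raise ValueError("operating system and drivers do not fit in RAM")
    return {
        "drivers": (0, drivers_kb),                      # built up from the bottom
        "applications": (drivers_kb, total_kb - os_kb),  # whatever lies between
        "operating_system": (total_kb - os_kb, total_kb),  # backed up from the top
    }


regions = layout_memory(total_kb=1000, os_kb=300, drivers_kb=200)
app_start, app_end = regions["applications"]
application_space_kb = app_end - app_start  # 500 kilobytes
```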

When applications begin to be loaded into memory, they are loaded in block sizes determined by the operating system. If the block size is 2 kilobytes, then every process that’s loaded will be given a chunk of memory that’s a multiple of 2 kilobytes in size. Applications will be loaded in these fixed block sizes, with the blocks starting and ending on boundaries established by words of 4 or 8 bytes. These blocks and boundaries help to ensure that applications won’t be loaded on top of one another’s space by a poorly calculated bit or two. With that ensured, the larger question is what to do when the 500-kilobyte application space is filled.
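Rounding a request up to whole blocks is a one-line calculation. The sketch below assumes the 2-kilobyte block size from the example; the function names are hypothetical.

```python
BLOCK_SIZE = 2 * 1024  # 2 kilobytes, as in the example above


def blocks_needed(request_bytes, block_size=BLOCK_SIZE):
    """Number of whole blocks required to hold request_bytes."""
    return -(-request_bytes // block_size)  # ceiling division


def allocated_bytes(request_bytes, block_size=BLOCK_SIZE):
    """Memory actually handed out: always a multiple of block_size."""
    return blocks_needed(request_bytes, block_size) * block_size
```

A process that asks for 5,000 bytes therefore receives three blocks (6,144 bytes), and the unused remainder of the last block is the price paid for keeping every allocation on a block boundary.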

In most computers, it’s possible to add memory beyond the original capacity. For example, you might expand RAM from 1 to 2 gigabytes. This works fine, but can be relatively expensive. It also ignores a fundamental fact of computing: most of the information that an application stores in memory is not being used at any given moment. A processor can only access memory one location at a time, so the vast majority of RAM is unused at any moment. Since disk space is cheap compared to RAM, moving information that is not currently needed from RAM to hard disk can greatly expand the effective RAM space at little additional cost. This technique is called virtual memory management.
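The idea of moving idle information out to disk can be sketched with a toy pager. The text does not say which information is moved out first; the least-recently-used rule below is one common choice and is an assumption here, as are the class name and the placeholder page contents.

```python
from collections import OrderedDict


class VirtualMemory:
    """Toy pager: a few RAM frames backed by an unlimited 'disk'."""

    def __init__(self, ram_frames):
        self.ram_frames = ram_frames
        self.ram = OrderedDict()  # page -> data, least recently used first
        self.disk = {}            # pages that have been moved out of RAM
        self.page_faults = 0

    def access(self, page):
        """Return the data for page, bringing it into RAM if needed."""
        if page in self.ram:
            self.ram.move_to_end(page)  # mark as most recently used
            return self.ram[page]
        self.page_faults += 1
        if len(self.ram) >= self.ram_frames:
            # RAM is full: move the least recently used page to disk.
            victim, data = self.ram.popitem(last=False)
            self.disk[victim] = data
        # Fetch from disk, or create placeholder contents on first use.
        data = self.disk.pop(page, f"page-{page}")
        self.ram[page] = data
        return data
```

With only two RAM frames, accessing pages 1, 2, 1 and 3 in that order moves page 2 to disk, because page 1 was used more recently.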

Disk storage is only one of the memory types that must be managed by the operating system, and it’s also the slowest. Ranked in order of speed, the types of memory in a computer system are:

(1) High-speed cache—This is a fast, relatively small amount of memory that is available to the CPU through the fastest connections. Cache controllers predict which pieces of data the CPU will need next and pull them from main memory into high-speed cache to speed up system performance.

(2) Main memory—This is the RAM that you see measured in megabytes or gigabytes when you buy a computer.

(3) Secondary memory—This is most often some sort of rotating magnetic storage that keeps applications and data available to be used, and serves as virtual RAM under the control of the operating system.

The operating system must balance the needs of the various processes with the availability of the different types of memory, moving data in blocks (called pages) between available memory as the schedule of processes dictates (see Fig.3-2).

Fig.3-2 Relations of different types of memory in a computer system

3.2.4 Device Management

The path between the operating system and virtually all hardware not on the computer’s motherboard goes through a special program called a driver. Much of a driver’s function is to be the translator between the electrical signals of the hardware subsystems and the high-level programming languages of the operating system and application programs. Drivers take data that the operating system has defined as a file and translate them into streams of bits placed in specific locations on storage devices, or a series of laser pulses in a printer.

Because there are such wide differences in the hardware, there are differences in the way that the driver programs function. Most run when the device is required, and function much the same as any other process. The operating system will frequently assign high-priority blocks to drivers so that the hardware resource can be released and readied for further use as quickly as possible.

(3-4) One reason that drivers are separate from the operating system is so that new functions can be added to the driver—and thus to the hardware subsystems—without requiring the operating system itself to be modified, recompiled and redistributed. Through the development of new hardware device drivers, development often performed or paid for by the manufacturer of the subsystems rather than the publisher of the operating system, input/output capabilities of the overall system can be greatly enhanced.

Managing input and output is largely a matter of managing queues and buffers, special storage facilities that take a stream of bits from a device, perhaps a keyboard or a serial port, hold those bits, and release them to the CPU at a rate with which the CPU can cope. This function is especially important when a number of processes are running and taking up processor time. The operating system will instruct a buffer to continue taking input from the device, but to stop sending data to the CPU while the process using the input is suspended. Then, when the process requiring input is made active once again, the operating system will command the buffer to send data. This process allows a keyboard or a modem to deal with external users or computers at a high speed even though there are times when the CPU can’t use input from those sources.
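The buffering behavior described above can be sketched as a simple queue with a switch. The class and method names are hypothetical; the point is only that the device side keeps accepting data while delivery to the CPU is paused.

```python
from collections import deque


class InputBuffer:
    """Toy buffer between an input device and the CPU."""

    def __init__(self):
        self._queue = deque()
        self.sending = True

    def receive(self, data):
        """Called for the device: input is always accepted and held."""
        self._queue.append(data)

    def pause(self):
        """The process using the input has been suspended."""
        self.sending = False

    def resume(self):
        """The process using the input is active again."""
        self.sending = True

    def drain(self):
        """Release held data to the CPU, or nothing while paused."""
        if not self.sending:
            return []
        items = list(self._queue)
        self._queue.clear()
        return items
```

Keystrokes typed while the process is suspended are not lost: they stay in the queue and are delivered, in order, on the first drain after resume is called.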

Managing all the resources of the computer system is a large part of the operating system’s function and, in the case of real-time operating systems, may be virtually all the functionality required. For other operating systems, though, providing a relatively simple, consistent way for applications and humans to use the power of the hardware is a crucial part of their reason for existing.

3.2.5 Application Interface

Just as drivers provide a way for applications to make use of hardware subsystems without having to know every detail of the hardware’s operation, application program interfaces (APIs) let application programmers use functions of the computer and operating system without having to directly keep track of all the details in the CPU’s operation. Let’s look at the example of creating a hard disk file for holding data to see why this can be important.

A programmer writing an application to record data from a scientific instrument might want to allow the scientist to specify the name of the file created. The operating system might provide an API function named MakeFile for creating files. When writing the program, the programmer would insert a line that looks like this:

MakeFile [1, %Name, 2]

In this example, the instruction tells the operating system to create a file that will allow random access to its data (signified by the 1—the other option might be 0 for a serial file), will have a name typed in by the user (%Name) and will be a size that varies depending on how much data is stored in the file (signified by the 2—other options might be zero for a fixed size, and 1 for a file that grows as data is added but does not shrink when data is removed). Now, let’s look at what the operating system does to turn the instruction into action.

The operating system sends a query to the disk drive to get the location of the first available free storage location.

With that information, the operating system creates an entry in the file system showing the beginning and ending locations of the file, the name of the file, the file type, whether the file has been archived, which users have permission to look at or modify the file, and the date and time of the file’s creation.

The operating system writes information at the beginning of the file that identifies the file, sets up the type of access possible and includes other information that ties the file to the application. In all of this information, the queries to the disk drive and addresses of the beginning and ending point of the file are in formats heavily dependent on the manufacturer and model of the disk drive.

Because the programmer has written the program to use the API for disk storage, the programmer doesn’t have to keep up with the instruction codes, data types and response codes for every possible hard disk and tape drive. The operating system, connected to drivers for the various hardware subsystems, deals with the changing details of the hardware. The programmer must simply write code for the API and trust the operating system to do the rest.
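To make the sequence concrete, here is a toy model of what might sit behind the MakeFile call. MakeFile is itself an imagined API, so everything below is hypothetical: the class, the dictionary-based directory and the numbering of free blocks are illustrative assumptions, with the option codes taken from the example above.

```python
from datetime import datetime

ACCESS_MODES = {0: "serial", 1: "random"}
SIZING_MODES = {0: "fixed", 1: "grow-only", 2: "variable"}


class ToyFileSystem:
    """In-memory stand-in for the OS side of a hypothetical MakeFile."""

    def __init__(self):
        self.next_free_block = 0  # what the "disk drive" would report
        self.directory = {}       # file name -> directory entry

    def make_file(self, access, name, sizing):
        if access not in ACCESS_MODES or sizing not in SIZING_MODES:
            raise ValueError("unknown access or sizing option")
        if name in self.directory:
            raise FileExistsError(name)
        # Step 1: ask for the first available free storage location.
        start = self.next_free_block
        self.next_free_block += 1
        # Step 2: create the file-system entry for the new file.
        entry = {
            "start": start,
            "end": start,
            "created": datetime.now(),
            # Step 3: header information identifying the type of access.
            "header": {
                "access": ACCESS_MODES[access],
                "sizing": SIZING_MODES[sizing],
            },
        }
        self.directory[name] = entry
        return entry
```

The application programmer would only ever call something like make_file(1, name, 2); the details of locating free space and recording the entry stay hidden behind the interface.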

APIs have become one of the most hotly contested areas of the computer industry in recent years. Companies realize that programmers using their API will ultimately translate this into the ability to control and profit from a particular part of the industry. This is one of the reasons that so many companies have been willing to provide applications like readers or viewers to the public at no charge. They know consumers will request that programs take advantage of the free readers, and application companies will be ready to pay royalties to allow their software to provide the functions requested by the consumers.

3.2.6 User Interface

Every computer that is to be operated by an individual requires a user interface. The user interface is usually referred to as a shell and is essential if human interaction is to be supported. The user interface views the directory structure and requests services from the operating system that will acquire data from input hardware devices, such as a keyboard, mouse or credit card reader, and requests operating system services to display prompts, status messages and such on output hardware devices, such as a video monitor or printer. The two most common forms of a user interface have historically been the command-line interface, where computer commands are typed out line-by-line, and the graphical user interface, where a visual environment (most commonly a WIMP) is present.

Most modern computer systems support graphical user interfaces (GUIs), and often include them. In some computer systems, such as the original implementation of the classic Mac OS, the GUI is integrated into the kernel.

While technically a graphical user interface is not an operating system service, incorporating support for one into the operating system kernel can allow the GUI to be more responsive by reducing the number of context switches required for the GUI to perform its output functions. Other operating systems are modular, separating the graphics subsystem from the kernel and the rest of the operating system. In the 1980s, UNIX, VMS and many others had operating systems that were built this way. Linux and Mac OS are also built this way. Modern releases of Microsoft Windows such as Windows Vista implement a graphics subsystem that is mostly in user space; however, the graphics drawing routines of versions between Windows NT 4.0 and Windows Server 2003 exist mostly in kernel space. Windows 9x had very little distinction between the interface and the kernel.

Many computer operating systems allow the user to install or create any user interface they desire. The X Window System in conjunction with GNOME or KDE Plasma 5 is a commonly found setup on most UNIX and UNIX-like (BSD, Linux, Solaris) systems. A number of Windows shell replacements have been released for Microsoft Windows, which offer alternatives to the included Windows shell, but the shell itself cannot be separated from Windows.

Numerous UNIX-based GUIs have existed over time, most derived from X11. Competition among the various vendors of UNIX (HP, IBM, Sun) led to much fragmentation. Efforts in the 1990s to standardize on COSE and CDE failed for various reasons and were eventually eclipsed by the widespread adoption of GNOME and K Desktop Environment. Prior to free software-based toolkits and desktop environments, Motif was the prevalent toolkit/desktop combination (and was the basis upon which CDE was developed).

Graphical user interfaces evolve over time. For example, Windows has modified its user interface almost every time a new major version of Windows is released, and the Mac OS GUI changed dramatically with the introduction of Mac OS X in 1999.

In human–computer interaction, WIMP stands for “windows, icons, menus, pointer”, denoting a style of interaction using these elements of the user interface; it is the most widely used style. The term was coined by Merzouga Wilberts in 1980. Other expansions are sometimes used, such as substituting “mouse” and “mice” for menus, or “pull-down menu” and “pointing” for pointer.

Though the term has fallen into disuse, some use it as an approximate synonym for graphical user interface (GUI). Any interface that uses graphics can be called a GUI, and WIMP systems derive from such systems. However, while all WIMP systems use graphics as a key element (the icon and pointer elements), and therefore are GUIs, the reverse is not true. Some GUIs are not based in windows, icons, menus, and pointers. For example, most mobile phones represent actions as icons, and some may have menus, but very few include a pointer or run programs in a window.

WIMP interaction was developed at Xerox PARC (see Xerox Alto, developed in 1973) and popularized with Apple’s introduction of the Macintosh in 1984, which added the concepts of the “menu bar” and extended window management.[8]

In a WIMP system:

(1) A window runs a self-contained program, isolated from other programs that (if in a multi-program operating system) run at the same time in other windows.

(2) An icon acts as a shortcut to an action the computer performs (e.g., execute a program or task).

(3) A menu is a text or icon-based selection system that selects and executes programs or tasks.

(4) The pointer is an onscreen symbol that represents movement of a physical device that the user controls to select icons, data elements, etc.

This style of system improves human–computer interaction (HCI) by emulating real-world interactions and providing better ease of use for non-technical people. Users can carry skills learned at a standardized interface from one application to another.