Understanding and troubleshooting page faults and memory swapping: Site24x7

In modern computing, paging and memory swapping are two common memory management techniques that let an operating system run processes whose combined memory needs exceed the physical RAM. More broadly, memory management aims to limit memory fragmentation, keep frequently used data in fast memory, and isolate processes so that one program cannot read or overwrite another program's memory.

Memory management in operating systems involves a combination of methods for memory allocation and usage tracking by applications and processes.

The memory management unit (MMU) translates logical (virtual) addresses to physical addresses, while the operating system manages the movement of pages between the storage disk and main memory during execution.

In this article, we’ll examine page faults and memory swaps as commonly occurring exceptions, the typical scenarios that cause them, and when they become a problem.

Understanding page faults

Operating systems use paging to transfer data between the main memory and secondary storage for efficient memory management. A page (memory/virtual page) is a fixed-size block of a process's virtual address space. The block of physical memory that holds a single page is known as a frame. All pages and frames are the same fixed size, so a process's pages can be placed in any free frames, which allows non-contiguous allocation of physical memory.
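To make the page and frame terminology concrete, here is a minimal sketch of how a virtual address splits into a page number and an offset. The function names, the 4 KiB page size, and the one-entry page table are assumptions for illustration; real translation is done in hardware by the MMU.

```python
import mmap


def split_address(virtual_address: int, page_size: int = mmap.PAGESIZE) -> tuple[int, int]:
    """Return (page_number, offset) for a virtual address."""
    if virtual_address < 0 or page_size <= 0:
        raise ValueError("address must be non-negative and page size positive")
    return divmod(virtual_address, page_size)


def to_physical(frame_number: int, offset: int, page_size: int = mmap.PAGESIZE) -> int:
    """Combine a frame number and an offset into a physical address."""
    return frame_number * page_size + offset


if __name__ == "__main__":
    page_size = 4096                 # assumed page size for the example
    toy_page_table = {3: 42}         # hypothetical mapping: page 3 -> frame 42
    page, offset = split_address(0x3A7F, page_size)
    frame = toy_page_table[page]
    print(page, hex(offset), hex(to_physical(frame, offset, page_size)))
```

Because the offset is carried over unchanged, only the page number needs to be looked up, which is why pages can land in any free frame.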

What is a page fault?

A page fault is an exception raised by the memory management unit when a process accesses an address within its address space whose page is not currently mapped in physical memory. The exception hands control to the operating system, which locates the page (for example, in a file or in swap space on a storage device) and loads it into physical memory so that the access can be retried.

Page fault handling is automatic: the operating system’s kernel either maps the requested page into RAM or denies the access. Page faults are common and are a normal part of virtual memory. They let the system load pages only when they are needed, which increases the amount of memory effectively available to programs.

Page fault types

Page faults can be categorized as:

Minor page faults

Minor page faults, also called soft faults, occur when the requested page is already in main memory but is not yet mapped into the faulting process's page table. This happens, for example, when the page is shared by multiple programs and another program has already brought it into main memory.

Because the kernel only needs to update the page table, a minor page fault is resolved without reading from disk and is comparatively cheap. Other common causes are the first access to newly allocated memory and the first access to file data that is still held in the page cache.

Major page faults

A major page fault, also called a hard fault, is more expensive than a minor fault because the requested page is not in main memory at all.

To resolve a major page fault, the kernel has to read the page from secondary storage (for example, from the program's executable file, a memory-mapped file, or swap space) before the process can continue. The process is blocked while the disk read completes, so a high rate of major page faults slows the system down noticeably.
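On Unix-like systems, a process can read its own minor and major fault counters. The sketch below uses Python's standard `resource` module; the helper names and the 16 MiB buffer are illustrative choices, and the numbers you see depend on the allocator and the kernel.

```python
import resource


def fault_counts() -> tuple[int, int]:
    """Return (minor_faults, major_faults) for the current process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_minflt, usage.ru_majflt


def faults_during(action) -> tuple[int, int]:
    """Run action() and return how many minor/major faults it caused."""
    minor_before, major_before = fault_counts()
    action()
    minor_after, major_after = fault_counts()
    return minor_after - minor_before, major_after - major_before


def touch_memory(size: int = 16 * 1024 * 1024, step: int = 4096) -> None:
    """Allocate a buffer and write one byte per 4 KiB step."""
    buf = bytearray(size)
    for i in range(0, size, step):
        buf[i] = 1


if __name__ == "__main__":
    minor, major = faults_during(touch_memory)
    print(f"minor faults: {minor}, major faults: {major}")
```

Touching freshly allocated memory usually shows up as minor faults only. Major faults tend to appear when pages have to come back from disk.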

Invalid page faults

Invalid page faults occur when a process attempts to access an invalid memory address, such as an address outside its mapped address space or a page it does not have permission to access. The operating system detects this error and typically terminates the program. On Unix-like systems, it does so by sending the process a segmentation violation signal (SIGSEGV).

In most cases, an invalid page fault occurs when the kernel finds that the requested virtual address does not belong to any valid mapping of the process, or that the access violates the page's permissions.
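An invalid access can be observed safely by triggering it in a throwaway child process. This sketch assumes a Unix-like system, where a negative return code from `subprocess` means the child was killed by a signal. The snippet reads from address 0 through `ctypes`, and the helper names are made up for the example.

```python
import signal
import subprocess
import sys

# Reading a C string at address 0 is an access to an unmapped page.
CRASH_SNIPPET = "import ctypes; ctypes.string_at(0)"


def run_invalid_access() -> int:
    """Run the crashing snippet in a child interpreter and return its exit status."""
    result = subprocess.run(
        [sys.executable, "-c", CRASH_SNIPPET],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


def describe(returncode: int) -> str:
    """Translate a subprocess return code into a signal name where possible."""
    if returncode < 0:
        return signal.Signals(-returncode).name
    return f"exit code {returncode}"


if __name__ == "__main__":
    print(describe(run_invalid_access()))
```

On Linux the child is normally reported as killed by SIGSEGV, while the parent keeps running.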

Page fault example: Copy-on-Write

Copy-on-Write, or COW, is a memory management technique that allows the operating system to share physical memory between multiple processes. With this approach, each process has its own private view of the shared data, and no new memory is allocated for it until the data is modified. This makes COW an efficient mechanism for reducing physical memory usage and for improving application performance.

The Copy-on-Write technique relies on page faults: after a fork, the parent and child processes initially share the same physical pages, which are marked read-only. As soon as either process tries to modify a shared page, a page fault occurs and the kernel creates a private copy of that page, so that the modification affects only the writing process. Pages that neither process modifies continue to be shared.
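The visible effect of Copy-on-Write can be sketched with `os.fork` on a Unix-like system: the child writes to data it inherited, and the parent's copy stays unchanged. This shows only the private-view semantics. It does not measure page sharing, and in CPython reference-count updates also cause pages to be copied. The function name is an assumption for illustration.

```python
import os


def cow_demo() -> tuple[bytes, bytes]:
    """Fork, modify inherited data in the child, and return (parent_view, child_view)."""
    data = bytearray(b"parent")
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the write below triggers a private copy of the affected page.
        try:
            os.close(read_fd)
            data[:] = b"child!"
            os.write(write_fd, bytes(data))
        finally:
            os._exit(0)
    # Parent: collect what the child saw, then inspect its own copy.
    os.close(write_fd)
    child_view = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return bytes(data), child_view


if __name__ == "__main__":
    print(cow_demo())
```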

Invalid conditions and impacts of page faults

There are a number of conditions that can cause a processor to generate a page fault. One condition is when an application attempts to access memory at a location outside of its allocated address space, which results in an invalid page fault. A second condition occurs when running processes need more physical memory than is available in the computer's main memory. In this case, the operating system moves some pages out to disk storage, and later accesses to those pages cause major page faults.

Page faults occur on all modern operating systems, including Linux. A page fault occurs when a process accesses a virtual address that is not currently mapped to physical memory or that it is not permitted to access. When this happens, the kernel has to take the necessary action before allowing the process to continue execution.

Apart from hardware problems such as memory overclocking, software bugs are typical causes of invalid page faults, while memory pressure is the typical cause of excessive major page faults. Invalid faults usually result in a segmentation violation that terminates the process and may produce a core dump. An invalid fault inside the kernel itself can crash the OS.

Memory swap-in/out

Memory swapping is a collection of techniques that enables the operating system to run processes that together require more memory than is present in physical RAM. It does so by using storage space as an extension of main memory, at the cost of slower access to the swapped data.

Most operating systems automatically manage the swap space (a swap file or swap partition) and the memory swapping process. The swap space is typically put to use when physical RAM runs low and applications require more. Memory pages that have not been used recently are then moved to the swap space, which frees RAM and extends the system’s virtual memory capacity. This makes memory swapping a crucial memory management method that helps maintain system stability and availability.

Swap-in/out exceptions

Swap-in and swap-out are the mechanisms the kernel uses to move memory pages between physical RAM and swap space as needed. They are transparent to the process, whose virtual address space stays the same.

Swap-in is the process of reading memory pages from swap space on storage back into physical RAM, while swap-out is the process of moving memory pages from RAM onto swap space on storage.
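On Linux, the kernel exposes cumulative swap-in and swap-out page counts as `pswpin` and `pswpout` in `/proc/vmstat`. The sketch below parses them. The function names are assumptions, and the parser takes text as input so that it can also be tried on systems without `/proc`.

```python
from pathlib import Path


def parse_swap_counters(vmstat_text: str) -> dict[str, int]:
    """Extract the pswpin/pswpout counters from /proc/vmstat-style text."""
    counters: dict[str, int] = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters


def read_swap_counters(path: str = "/proc/vmstat") -> dict[str, int]:
    """Read the counters from the running Linux kernel."""
    return parse_swap_counters(Path(path).read_text())


if __name__ == "__main__":
    print(read_swap_counters())
```

Both values count pages since boot, so the difference between two readings shows how much swapping happened in between.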

Swap-in/out errors

Swap space usually has a fixed size, with a default chosen by the operating system, often based on the amount of physical memory. Swap-in/out errors occur when the available swap memory cannot sustain running processes and the swapping process is slowed down. This makes it difficult for the system to perform any task, even when the CPU is not in constant use. Swap-in/out exceptions typically occur due to memory overcommitment.

Causes of swap-in/out errors

There are two main causes of swap-in/out errors:

Low main memory: If the physical memory is lower than the memory needs of the running processes, pages are constantly swapped in and out, causing programs to respond slowly or crash.
Low swap device storage: If the swap space is low, pages cannot be moved out of main memory, which leaves less RAM for running processes and slows down their execution. Once both RAM and swap are full, the operating system may fail memory allocations or terminate processes.

How to address page faults and swaps

Page faults and memory swaps are normal exceptions that are typically handled under the hood without needing intervention from users. If left unchecked, however, frequent faults and swaps degrade the system and may cause OS and application crashes. Here are a few takeaways and questions to consider when dealing with page faults and swaps.

When to worry about page faults

Operating systems offer different approaches to handle page fault errors. For a fault that cannot be resolved, such as an invalid access that the program does not handle itself, the OS performs a default action, which typically involves terminating the offending process. It is recommended that system admins monitor the frequency of errors that result in process termination and core dumps, and take further action when the system sustains a consistently high rate of major page faults.
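One way to watch a single process on Linux is to read its fault counters from `/proc/<pid>/stat`, where field 10 is `minflt` and field 12 is `majflt`. The following is a minimal sketch with hypothetical function names. It splits after the last `)` because the command name in field 2 may itself contain spaces.

```python
from pathlib import Path


def parse_fault_counters(stat_text: str) -> tuple[int, int]:
    """Return (minflt, majflt) from the contents of /proc/<pid>/stat."""
    # Fields after the command name start at field 3 (process state).
    rest = stat_text[stat_text.rfind(")") + 1:].split()
    minflt = int(rest[7])   # field 10
    majflt = int(rest[9])   # field 12
    return minflt, majflt


def read_fault_counters(pid: str = "self") -> tuple[int, int]:
    """Read the counters for a process; 'self' means the current process."""
    return parse_fault_counters(Path(f"/proc/{pid}/stat").read_text())


if __name__ == "__main__":
    minor, major = read_fault_counters()
    print(f"minflt={minor} majflt={major}")
```

Sampling these counters at intervals and comparing the readings gives a per-process fault rate, and a steadily climbing `majflt` is the value worth investigating.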

How to reduce or eliminate page faults

While use cases may differ, two of the most common approaches to reduce the occurrence of page faults are:

Increasing physical RAM: Installing more physical memory means that more pages can stay in RAM, so fewer accesses require a read from disk.
Managing the application’s memory usage: Ensuring there are no memory leaks in the application reduces the frequency of page faults, since there’s always enough physical memory to handle the changing memory requirements of application processes.

When should memory swapping be a concern

Swapping in and out becomes a concern when the operating system is constantly under pressure to move pages between main memory and swap space. This condition, often called thrashing, slows down performance and can cause the system to fail to allocate memory for processes.

Memory swapping metrics to monitor

Some major indicators of memory swapping include:

Swap activity: This measure denotes the number of swap-ins and swap-outs that occur within a certain duration. The higher the swap activity, the more memory and advanced memory management strategies are required to keep the processes running optimally.
Amount of swap space in use: Administrators should monitor how much swap space is occupied and how much is left. This helps determine how much memory headroom remains before allocations start to fail.
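On Linux, swap usage can be derived from the `SwapTotal` and `SwapFree` entries in `/proc/meminfo`, both reported in kB. The sketch below is one possible way to compute it, with function names that are assumptions for illustration.

```python
from pathlib import Path


def parse_meminfo_kb(meminfo_text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into a {name: value} dictionary."""
    values: dict[str, int] = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            values[key.strip()] = int(fields[0])
    return values


def swap_usage(meminfo: dict[str, int]) -> tuple[int, float]:
    """Return (used_kb, used_percent) of swap space."""
    total = meminfo.get("SwapTotal", 0)
    free = meminfo.get("SwapFree", 0)
    used = total - free
    percent = 100.0 * used / total if total else 0.0
    return used, percent


if __name__ == "__main__":
    used_kb, percent = swap_usage(parse_meminfo_kb(Path("/proc/meminfo").read_text()))
    print(f"swap used: {used_kb} kB ({percent:.1f}%)")
```

A system with no swap configured reports a `SwapTotal` of 0, which the sketch treats as 0% used rather than dividing by zero.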

Summary

Paging is a useful feature in memory management, as it helps implement virtual memory by enabling the exchange of data between the secondary storage and the main memory. Memory swapping is the movement of processes or programs between swap (virtual memory in storage) and the main memory. Both of these processes aid in resource optimization by ensuring that RAM is available for processes that need it.

Excessive page faults and swapping occur when the available memory (both physical and virtual) cannot sustain running processes. When not addressed, they degrade application performance and can eventually lead to process dumps and system crashes.
