Memory Organization in Computer Architecture
A memory unit is a collection of storage units or devices. Main memory refers to physical memory that is internal to the computer. The word main distinguishes it from external mass storage devices such as disk drives. Other terms used to mean main memory include RAM and primary storage. The computer can manipulate only data that is in main memory.
The memory unit stores binary information in the form of bits. Generally, memory/storage is classified into two categories:
- Volatile Memory: This loses its data when power is switched off.
- Non-Volatile Memory: This is permanent storage and does not lose any data when power is switched off.
Memory Hierarchy
The total memory capacity of a computer can be visualized as a hierarchy of components. The memory hierarchy consists of all the storage devices in a computer system, from slow auxiliary memory, to faster main memory, to the smaller and still faster cache memory.
Auxiliary memory access time is generally about 1000 times that of main memory, so it sits at the bottom of the hierarchy.
Main memory occupies the central position because it can communicate directly with the CPU and with auxiliary memory devices through an input/output (I/O) processor.
When a program that does not reside in main memory is needed by the CPU, it is brought in from auxiliary memory. Programs not currently needed in main memory are transferred to auxiliary memory to free space for the programs that are in use.
Cache memory stores program data that is currently being executed by the CPU. The approximate access time ratio between cache memory and main memory is about 1 to 7-10.
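To make the effect of that ratio concrete, the average access time of a two-level cache/main-memory system can be estimated with the standard average-memory-access-time formula. This is an illustrative sketch, not part of the original text; the 1 ns / 10 ns latencies and the 95% hit rate are assumed values chosen to match the rough 1:10 ratio above.

```c
#include <stdio.h>

int main(void) {
    /* Assumed figures: 1 ns cache access, 10 ns main-memory access,
     * 95% of accesses hit in the cache. */
    double cache_time  = 1.0;   /* ns */
    double memory_time = 10.0;  /* ns */
    double hit_rate    = 0.95;

    /* Average access time = hit time + miss rate * miss penalty */
    double average = cache_time + (1.0 - hit_rate) * memory_time;

    printf("average access time: %.2f ns\n", average);  /* 1.50 ns */
    return 0;
}
```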
Memory Access Methods
Each memory type is a collection of numerous memory locations. To access data from any memory, it must first be located, and then the data is read from that memory location. The following are the methods used to access information from memory locations:
- Random Access: Main memories are random access memories, in which each memory location has a unique address. Using this unique address any memory location can be reached in the same amount of time in any order.
- Sequential Access: This method allows memory to be accessed in a sequence, i.e., in order.
- Direct Access: In this mode, information is stored in tracks, with each track having a separate read/write head.
Memory Table
Data type | Size in bytes | Range |
--- | --- | --- |
short int | 2 | -32,768 to 32,767 |
unsigned short int | 2 | 0 to 65,535 |
unsigned int | 4 | 0 to 4,294,967,295 |
int | 4 | -2,147,483,648 to 2,147,483,647 |
long int | 4 | -2,147,483,648 to 2,147,483,647 |
unsigned long int | 4 | 0 to 4,294,967,295 |
long long int | 8 | -(2^63) to (2^63)-1 |
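The sizes in the table match a common 32-bit C environment, but the C standard only guarantees minimum ranges, so the exact sizes are implementation-defined. A quick way to check them on a given machine (an illustrative snippet, not from the original article) is with sizeof:

```c
#include <stdio.h>

/* Print the storage size of each integer type on the current platform.
 * The results may differ from the table above (for example, long int is
 * commonly 8 bytes on 64-bit Linux). */
int main(void) {
    printf("short int:     %zu bytes\n", sizeof(short int));
    printf("int:           %zu bytes\n", sizeof(int));
    printf("long int:      %zu bytes\n", sizeof(long int));
    printf("long long int: %zu bytes\n", sizeof(long long int));
    return 0;
}
```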
Cache Memory
Cache memory, also
called CPU memory, is high-speed static random access memory (SRAM) that a computer microprocessor can
access more quickly than it can access regular random access memory (RAM). This memory is typically integrated
directly into the CPU chip or placed on a separate chip that has a
separate bus interconnect with the CPU. The
purpose of cache memory is to store program instructions and data that are used
repeatedly in the operation of programs or information that the CPU is likely
to need next. The computer processor can access this information quickly from
the cache rather than having to get it from the computer's main memory. Fast access
to these instructions increases the overall speed of the program.
As the microprocessor processes data, it looks first in the
cache memory. If it finds the instructions or data it is looking for there,
left by a previous access, it does not have to perform a more time-consuming
read from the larger main memory or from other data storage devices. Cache
memory is responsible for speeding up computer operations and processing.
Once they have been open and running for a while, most
programs use few of a computer's resources. That's because frequently
re-referenced instructions tend to be cached. This is why computers with
slower processors but larger caches can post better system performance
measurements than computers with faster processors but less cache space.
Multi-tier or multilevel caching has become popular in server and desktop architectures, with different levels providing greater efficiency through managed tiering. Simply put, the less frequently certain data or instructions are accessed, the lower down the cache level the data or instructions are written.
Implementation and history
Mainframes used an early version of cache memory, but the
technology as it is known today began to be developed with the advent of
microcomputers. With early PCs, processor performance increased much faster
than memory performance, and memory became a bottleneck, slowing systems.
In the 1980s, the idea took hold that a small amount of more
expensive, faster SRAM could be used to improve the performance of the less
expensive, slower main memory. Initially, the memory cache was separate from
the system processor and not always included in the chipset. Early PCs
typically had from 16 KB to 128 KB of cache memory.
With 486 processors, Intel added 8 KB of memory to the CPU as
Level 1 (L1) memory. As much as 256 KB of external
Level 2 (L2) cache memory was used in these systems. Pentium processors saw the
external cache memory double again to 512 KB on the high end. They also split
the internal cache memory into two caches: one for instructions and the other
for data.
Processors based on Intel's P6 microarchitecture, introduced in
1995, were the first to incorporate L2 cache memory into the CPU and enable all
of a system's cache memory to run at the same clock speed as the processor. Prior to
the P6, L2 memory external to the CPU was accessed at a much slower clock speed
than the rate at which the processor ran, and slowed system performance
considerably.
Early memory cache controllers used a write-through cache
architecture, where data written into cache was also immediately updated in
RAM. This approach minimized data loss, but also slowed operations. With
later 486-based PCs, the write-back cache architecture was developed, where RAM
isn't updated immediately. Instead, data is stored in cache and RAM is updated
only at specific intervals or under certain circumstances where data is missing
or old.
Cache memory mapping
Caching configurations continue to evolve, but cache memory
traditionally works under three different configurations:
- Direct mapped cache has each block mapped to exactly one cache memory location. Conceptually, direct mapped cache is like rows in a table with three columns: the data block or cache line that contains the actual data fetched and stored, a tag with all or part of the address of the data that was fetched, and a flag bit that shows whether the row entry holds valid data. (A sketch of how an address breaks down under this scheme follows this list.)
- Fully associative cache mapping is similar to direct mapping in structure, but it allows a block to be mapped to any cache location rather than to a prespecified cache memory location, as is the case with direct mapping.
- Set associative cache mapping can be viewed as a compromise between direct mapping and fully associative mapping in which each block is mapped to a subset of cache locations. It is sometimes called N-way set associative mapping, which provides for a location in main memory to be cached to any of "N" locations in the L1 cache.
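The following sketch shows how a direct-mapped cache splits a memory address into tag, index and offset fields. It is illustrative only; the geometry (64-byte lines, 256 lines) is an arbitrary assumption, not something specified in the article.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed geometry: 64-byte cache lines and 256 lines in a direct-mapped
 * cache, so offset = 6 bits, index = 8 bits, tag = the remaining bits. */
#define LINE_SIZE 64
#define NUM_LINES 256

int main(void) {
    uint32_t addr = 0x12345678;

    uint32_t offset = addr % LINE_SIZE;               /* byte within the line */
    uint32_t index  = (addr / LINE_SIZE) % NUM_LINES; /* which cache line     */
    uint32_t tag    = addr / (LINE_SIZE * NUM_LINES); /* identifies the block */

    printf("address 0x%08x -> tag 0x%x, index %u, offset %u\n",
           addr, tag, index, offset);
    return 0;
}
```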
Format of the cache hierarchy
Cache memory is fast and expensive. Traditionally, it is
categorized as "levels" that describe its closeness and accessibility
to the microprocessor.
L1 cache, or primary cache, is extremely fast but relatively small, and
is usually embedded in the processor chip as CPU cache.
L2 cache, or secondary cache, is often more capacious than L1. L2 cache
may be embedded on the CPU, or it can be on a separate chip or coprocessor and have a high-speed
alternative system bus connecting the cache and CPU. That way it doesn't get
slowed by traffic on the main system bus.
Level 3 (L3) cache is specialized memory developed to
improve the performance of L1 and L2. L1 or L2 can be significantly faster than
L3, though L3 is usually double the speed of RAM. With multicore processors, each core can have
dedicated L1 and L2 cache, but they can share an L3 cache. If an L3 cache
references an instruction, it is usually elevated to a higher level of cache.
In the past, L1, L2 and L3 caches have been created using
combined processor and motherboard components. Recently, the trend has been
toward consolidating all three levels of memory caching on the CPU itself.
That's why the primary means for increasing cache size has begun to shift from
the acquisition of a specific motherboard with different chipsets and bus architectures to buying a CPU with the
right amount of integrated L1, L2 and L3 cache.
Contrary to popular belief, implementing flash or more dynamic
RAM (DRAM) on a system won't increase cache memory.
This can be confusing since the terms memory caching (hard
disk buffering) and cache memory are often used
interchangeably. Memory caching, using DRAM or flash to buffer disk
reads, is meant to improve storage I/O by caching data that is frequently
referenced in a buffer ahead of slower magnetic disk or tape. Cache memory, on
the other hand, provides read buffering for the CPU.
Specialization and functionality
In addition to instruction and data caches, other caches are
designed to provide specialized system functions. According to some
definitions, the L3 cache's shared design makes it a specialized cache. Other
definitions keep instruction caching and data caching separate, and refer to
each as a specialized cache.
Translation lookaside buffers (TLBs) are also specialized memory caches whose
function is to record virtual address to physical address
translations.
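As a simplified illustration of what a TLB holds (a sketch with made-up entries and an assumed 4 KB page size, not the article's own example), it can be modeled as a small table of recent virtual-page-to-physical-frame translations:

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE   4096  /* assumed 4 KB pages */
#define TLB_ENTRIES 4     /* real TLBs hold more entries, searched in parallel */

struct tlb_entry {
    uint32_t vpn;    /* virtual page number   */
    uint32_t pfn;    /* physical frame number */
    int      valid;
};

static struct tlb_entry tlb[TLB_ENTRIES] = {
    { .vpn = 5, .pfn = 42, .valid = 1 },
    { .vpn = 9, .pfn = 17, .valid = 1 },
};

/* Return the physical address on a hit, or -1 on a miss
 * (a miss would require walking the page table). */
static int64_t tlb_lookup(uint32_t vaddr) {
    uint32_t vpn    = vaddr / PAGE_SIZE;
    uint32_t offset = vaddr % PAGE_SIZE;
    for (int i = 0; i < TLB_ENTRIES; i++)
        if (tlb[i].valid && tlb[i].vpn == vpn)
            return (int64_t)tlb[i].pfn * PAGE_SIZE + offset;
    return -1;
}

int main(void) {
    printf("0x%llx\n", (long long)tlb_lookup(5 * PAGE_SIZE + 0x10));
    return 0;
}
```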
Still other caches are not, technically speaking, memory caches
at all. Disk caches, for instance, can use RAM
or flash memory to provide data caching
similar to what memory caches do with CPU instructions. If data is frequently
accessed from disk, it is cached into DRAM or flash-based silicon storage
technology for faster access time and response.
Specialized caches are also available for applications such as
web browsers, databases, network address binding and client-side Network File System protocol support.
These types of caches might be distributed across multiple networked hosts to
provide greater scalability or performance to an application that uses them.
Locality
The ability of cache memory to improve a computer's performance
relies on the concept of locality of reference. Locality describes various
situations that make a system more predictable, such as where the same storage
location is repeatedly accessed, creating a pattern of memory access that the
cache memory relies upon.
There are several types of locality. Two key ones for cache are
temporal and spatial. Temporal locality is when the same resources are accessed
repeatedly in a short amount of time. Spatial locality refers to accessing
various data or resources that are in close proximity to each other.
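A common way to see both kinds of locality is array traversal. The sketch below (illustrative only, not from the original article) sums a matrix row by row: consecutive accesses touch neighboring addresses (spatial locality), and the accumulator is reused on every iteration (temporal locality).

```c
#include <stdio.h>

#define ROWS 1024
#define COLS 1024

static int matrix[ROWS][COLS];

int main(void) {
    long sum = 0;

    /* Row-major traversal: matrix[i][j] and matrix[i][j+1] are adjacent in
     * memory, so each cache line that is fetched gets fully used (spatial
     * locality), and sum is touched on every iteration (temporal locality).
     * Swapping the two loops would jump COLS * sizeof(int) bytes between
     * accesses and waste most of each fetched cache line. */
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            sum += matrix[i][j];

    printf("sum = %ld\n", sum);
    return 0;
}
```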
Cache vs. main memory
DRAM serves as a computer's main memory, holding the data and instructions
on which the CPU performs its calculations after they are retrieved from
storage. Both DRAM and cache memory are volatile memories that lose their
contents when the power is turned off. DRAM is installed on the motherboard,
and the CPU accesses it through a bus connection.
DRAM is much slower than L1, L2 or L3 cache memory,
and much less expensive. It provides faster data access than flash storage,
hard disk drives (HDDs) and tape storage. It came into use in the last few
decades to provide a place to store frequently accessed disk data to improve
I/O performance.
DRAM must be refreshed every few milliseconds. Cache memory,
which also is a type of random access memory, does not need to be refreshed. It
is built directly into the CPU to give the processor the fastest possible
access to memory locations, and provides nanosecond speed access time to
frequently referenced instructions and data. SRAM is faster than DRAM, but
because it's a more complex chip, it's also more expensive to make.
Cache vs. virtual memory
A computer has a limited amount of RAM and even less cache
memory. When a large program or multiple programs are running, it's possible
for memory to be fully used. To compensate for a shortage of physical memory,
the computer's operating system (OS) can create virtual memory.
To do this, the OS temporarily transfers inactive data from RAM
to disk storage. This approach increases virtual address space by using active
memory in RAM and inactive memory in HDDs to form contiguous addresses that
hold both an application and its data. Virtual memory lets a computer run
larger programs or multiple programs simultaneously, and each program operates
as though it has unlimited memory.
Figure: where virtual memory fits in the memory hierarchy.
To map virtual memory onto physical memory, the OS divides virtual memory
into pages, each containing a fixed number of addresses. Pages are kept in a
pagefile or swap file on disk; when they are needed, the OS copies them from
the disk into main memory and translates the virtual addresses into real
addresses.
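A minimal sketch of that translation step is shown below. It is illustrative only: the 4 KB page size and the tiny page table are assumptions, and in practice the lookup is done in hardware by the MMU against much larger structures.

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE 4096  /* assumed 4 KB pages */
#define NUM_PAGES 8     /* tiny virtual address space for illustration */

/* Page table: physical frame number for each virtual page; -1 means the page
 * is not resident, which would trigger a page fault and a load from the
 * swap file. */
static int page_table[NUM_PAGES] = { 3, -1, 7, -1, 0, -1, -1, 1 };

int main(void) {
    uint32_t vaddr  = 2 * PAGE_SIZE + 0x2A;  /* an address on virtual page 2 */
    uint32_t vpage  = vaddr / PAGE_SIZE;
    uint32_t offset = vaddr % PAGE_SIZE;

    if (vpage >= NUM_PAGES || page_table[vpage] < 0) {
        printf("page fault: page %u must be loaded from disk\n", vpage);
    } else {
        uint32_t paddr = (uint32_t)page_table[vpage] * PAGE_SIZE + offset;
        printf("virtual 0x%x -> physical 0x%x\n", vaddr, paddr);
    }
    return 0;
}
```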
Associative memory: a type of computer memory from which items may be retrieved by matching some part of their content — for example, a field that serves as an identifying tag — rather than by specifying their address (hence it is also called associative storage or content-addressable memory (CAM)). Associative memory is much slower than RAM and is rarely encountered in mainstream computer designs. It is used in multilevel memory systems, in which a small fast memory such as a cache may hold copies of some blocks of a larger memory for rapid access.
To retrieve a word from associative memory, a search key (or descriptor) must be presented that represents particular values of all or some of the bits of the word. This key is compared in parallel with the corresponding lock or tag bits of all stored words, and all words matching this key are signaled as available.
Associative memory is expensive to implement as integrated circuitry, so it is used only in certain very-high-speed searching applications, where data is accessed by its content rather than by its address.
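The behaviour can be imitated in software with a loop, as in the sketch below. This is illustrative only: a hardware CAM compares every stored tag against the key simultaneously rather than sequentially, and the entries shown are made up.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct cam_entry {
    uint16_t tag;   /* content used as the search key */
    uint32_t data;  /* value associated with that tag */
};

/* Made-up contents of a tiny associative memory. */
static struct cam_entry cam[] = {
    { 0x0A1, 100 }, { 0x0B2, 200 }, { 0x0C3, 300 },
};

/* Search by content: the key is compared against every stored tag.
 * Hardware performs all comparisons in parallel; this loop is sequential. */
static int cam_search(uint16_t key, uint32_t *out) {
    for (size_t i = 0; i < sizeof cam / sizeof cam[0]; i++) {
        if (cam[i].tag == key) {
            *out = cam[i].data;
            return 1;   /* match signaled */
        }
    }
    return 0;           /* no stored word matches the key */
}

int main(void) {
    uint32_t value;
    if (cam_search(0x0B2, &value))
        printf("tag 0x0B2 -> %u\n", value);
    return 0;
}
```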
Memory Management Unit
As a program runs, the memory addresses that it uses to reference its data are logical addresses. The run-time translation to physical addresses is performed in hardware by the CPU's Memory Management Unit (MMU). The MMU has two special registers that are accessed by the CPU's control unit: data to be sent to main memory or retrieved from memory is held in the Memory Data Register (MDR), and the desired logical memory address is held in the Memory Address Register (MAR). The address translation is also called address binding and uses a memory map that is programmed by the operating system.
Note
The job of the operating system is to load the appropriate data into the MMU when a process is started and to respond to the occasional page fault by loading the needed memory and updating the memory map.
Before memory addresses are loaded onto the system bus, they are translated to physical addresses by the MMU.
Virtual Memory
Virtual Memory is a storage allocation scheme in which secondary memory can be addressed as though it were part of main memory. The addresses a program may use to reference memory are distinguished from the addresses the memory system uses to identify physical storage sites, and program generated addresses are translated automatically to the corresponding machine addresses.
The size of virtual storage is limited by the addressing scheme of the computer system and by the amount of secondary memory available, not by the actual number of main storage locations.
It is a technique that is implemented using both hardware and software. It maps memory addresses used by a program, called virtual addresses, into physical addresses in computer memory.
- All memory references within a process are logical addresses that are dynamically translated into physical addresses at run time. This means that a process can be swapped in and out of main memory such that it occupies different places in main memory at different times during the course of execution.
- A process may be broken into a number of pieces, and these pieces need not be located contiguously in main memory during execution. The combination of dynamic run-time address translation and the use of a page or segment table permits this.