1. Introduction
The human race accumulates knowledge and develops intelligence from generation
to generation. The engineered systems around us carry an increasing amount
of intelligence embedded in the microelectronic components used in
industrial designs as well as in consumer electronics such as personal phones,
watches, televisions, entertainment systems, and cameras. The components
became more sophisticated as engineers and software designers acquired
more knowledge and experience of processor families after the first commercial
single-chip microprocessor was developed in 1971. As system designers strive
to combine human intelligence with nanosecond- and picosecond-order microelectronics,
new generations of microprocessor-based systems continue to evolve. As
the technology advances, microelectronic systems not only gain in speed
but also become smaller, lighter, and more affordable, finding their
way into every corner of industry and society. Microprocessors
and microcontrollers have become indispensable in system design. This section
provides an overview of the fundamentals and applications of microprocessors
and microcontrollers.
2. What Is a System?
The word system can be used to describe and understand a complex whole.
It’s a way of limiting our thoughts to those aspects important for us
to understand in any complex whole at the depth required. Therefore, a
system is "an adequate way of describing a complex whole." Any
system can be viewed as a single unit or a connected group of subsystems
or subunits.
2.1 A System as a Single Unit--The External View
When searching for the external view of a system, we ignore its internal
details and try to identify its inputs, process, and output (FIG. 1).
The process takes place within the system "box." The process
acts on the inputs to change the internal condition of the system or produce
an output.

2.2 A System as a Connected Group of Subsystems--The Internal View
When a system is viewed internally as a connected group of subunits, it
can be drawn as a block diagram. Each block in the system can be treated
as a separate system for further analysis, and each block's external and
internal views can be separately drawn. In this manner, the most complex
whole can be decomposed into manageable, understandable parts.
2.3 Combinational and Sequential Systems
In a combinational system, the results of the process are available only
when all the inputs are applied and the output does not depend on the sequence
in which the inputs are applied but on the combination of the inputs used.
In a sequential system, the output of the process depends completely on
the sequence in which the inputs are applied. The result is different when
the sequence of inputs is changed. Normally, not all inputs are required
to produce an output. The process is sequential and the subunits don’t
work simultaneously.
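The distinction can be illustrated with a short Python sketch. Both examples are hypothetical and not taken from the text: `majority` stands in for a combinational process, and `SequenceLock` stands in for a sequential one that keeps an internal state between inputs.

```python
def majority(a: int, b: int, c: int) -> int:
    """Combinational: the output depends only on the combination of inputs."""
    return 1 if (a + b + c) >= 2 else 0


class SequenceLock:
    """Sequential: the output depends on the order in which inputs arrive."""

    def __init__(self, code):
        self.code = list(code)
        self.position = 0  # internal state: how much of the code has matched

    def press(self, key) -> bool:
        """Apply one input; return True when the full code has been entered."""
        if key == self.code[self.position]:
            self.position += 1
        else:
            # A wrong key restarts matching (it may itself begin the code).
            self.position = 1 if key == self.code[0] else 0
        if self.position == len(self.code):
            self.position = 0
            return True
        return False


def opens(keys, code=("A", "B", "C")) -> bool:
    """Feed a sequence of keys to a fresh lock and report whether it opened."""
    lock = SequenceLock(code)
    return any([lock.press(k) for k in keys])
```

With the same three keys, `opens(["A", "B", "C"])` succeeds and `opens(["C", "B", "A"])` does not, whereas `majority` gives the same answer for any ordering of its arguments.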
3. Central Control
Consider a simple machine, such as a domestic washing machine
(FIG. 2). Its internal components can be identified easily.
The subsystems must communicate with each other and information in the
form of signals must pass between them to carry out the process. A sequential
system operates in a sequence of steps. Between steps there can be a pause
while a signal is passed. During the pause, all internal signals are maintained
by the system. Such a pattern of internal signals is known as an internal
condition (or internal state) of the system. During the process, the machine
proceeds from one state to another, in a sequential manner. These changes
are called state transitions.

FIG. 2 Internal view of a washing machine
The process acts on the inputs of the system to produce an output or transfer
the system from one state to another.
In a sequential system, the subunits are linked during the process. Subunits
must communicate with each other to carry out their jobs. At its maximum,
central control would completely ban direct communication between subsystems;
all communication would have to be channeled through the central controller.
Washing is a sequential process; therefore, it conveniently can be performed
through a central controller. FIG. 3 shows the links when the system is
put under the central control.
All subsystems attached to the central controller are called its peripherals.
In the next level of analysis we treat the central controller as a system
having its own inputs, output, and a process.

FIG. 3 The washing process under central control
3.1 The Central Controller as a System
FIG. 4 shows an external view of the controller and Table 1 summarizes
the input to and output of the controller.
Each input conveys some information to the controller, keeping it well
informed of what is happening. The controller has complete control over
its peripherals. It makes decisions based on the information conveyed by
the sensors and generates a set of commands to its peripherals. These are
the output of the central controller. Thus, the behavior of the overall
system is fully determined and controlled by the central controller. Depending
on the system being controlled the central controller could be any of the
following:
• Mechanical (with rods and levers and so forth).
• Electromechanical (with relays and breakers and the like).
• Pneumatic (with valves, compressors, and so on).
• Microelectronic (with microprocessors, memory, and I/O interface ICs).
• Biological (with brain cells).
Microelectronic central controllers have the advantages of high reliability,
low maintenance, low cost, compactness, and high flexibility.
Next the process of the system should be designed. The process is the
sequence of events that takes place during the operation of a system. A
flowchart is the tool for designing a process.

FIG. 4 External view of the central controller
===
TABLE 1 Inputs to and Output of the Controller

Inputs to the Controller
Signal  Description
A       The values selected by the user, such as the temperature or start button
D       Water level ("drum full")
E       Temperature
K       The door lock sensor

Output of the Controller
Signal  Description
B       Output to LEDs on the indicator panel
C       Opening and closing of the inlet valve
F-G     Agitator motor control
H       Heater control
I       Door control
J       Water pump control
===




FIG. 5 Continuation of flowchart for the washing process
3.2 The Washing Process
When planning the process, we first write down macro instructions, ignoring
small details. Then each macro is decomposed into smaller (micro) steps.
The macro steps of the washing process are
1. Receive a load of clothes.
2. Receive the detergent.
3. Receive the start signal.
4. Read the panel selection.
5. Fill the drum with water and heat up to set temperature.
6. Agitate for 30 minutes.
7. Empty water from the drum.
8. Refill the drum with fresh water.
9. Agitate for 5 minutes.
10. Drain the water while spinning the drum.
11. Indicate completion of washing process.
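The macro steps above can be read as a chain of states that the central controller walks through. The following Python sketch is only an illustration under stated assumptions: the state names, the grouping of macro steps 1 to 4 into one waiting state, and the choice of which Table 1 output signals are asserted in each state are all hypothetical.

```python
from enum import Enum, auto


class Step(Enum):
    WAIT_FOR_START = auto()  # macro steps 1-4: load, detergent, start, panel
    FILL_AND_HEAT = auto()   # macro step 5
    WASH = auto()            # macro step 6 (agitate for 30 minutes)
    DRAIN = auto()           # macro step 7
    REFILL = auto()          # macro step 8
    RINSE = auto()           # macro step 9 (agitate for 5 minutes)
    SPIN_DRAIN = auto()      # macro step 10
    DONE = auto()            # macro step 11


# Assumed mapping from state to the Table 1 output signals that are active.
COMMANDS = {
    Step.WAIT_FOR_START: (),
    Step.FILL_AND_HEAT: ("C", "H"),   # inlet valve, heater
    Step.WASH: ("F-G",),              # agitator motor
    Step.DRAIN: ("J",),               # water pump
    Step.REFILL: ("C",),
    Step.RINSE: ("F-G",),
    Step.SPIN_DRAIN: ("F-G", "J"),
    Step.DONE: ("B",),                # indicator LEDs
}


def run_cycle(start_pressed: bool):
    """Return the (state, commands) trace of one wash cycle."""
    state = Step.WAIT_FOR_START
    trace = [(state.name, COMMANDS[state])]
    if not start_pressed:
        return trace  # the controller stays in its initial state
    order = list(Step)  # Enum members iterate in definition order
    while state is not Step.DONE:
        state = order[order.index(state) + 1]  # one state transition
        trace.append((state.name, COMMANDS[state]))
    return trace
```

A real controller would wait on sensor inputs (water level D, temperature E) and timers before each transition; the sketch omits them to show only the sequencing.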
3.3 Flowchart of the Process
The microelectronic controller can do only one small job at a time and
so needs a more detailed sequence of instructions. A flowchart specifies
those details: the activities, events, and sequence (flow). FIG. 5 depicts
the flowchart of the washing process.
3.4 Writing the Program
The flowchart carries instructions and relationships, but the controller
cannot understand them in that form. Therefore, the steps in the flowchart must
be converted into instructions to the microcontroller, a step called program coding.
Coding is a translation. The flowchart does not carry the syntax of a computer
language. After coding, the program carries that syntax and can be stored
in the semiconductor memory of the controller.
The programming process has several basic steps to be planned and carried
out:
1. Study the block diagram.
2. Identify the sequence of macro steps.
3. Break down each macro step into smaller steps.
4. Draw a flowchart.
5. Code the program.
6. Store the coded program in the memory of the controller.
3.5 Centralized vs. Decentralized Control
Is central control applicable to all situations? Certainly not. It may
be the best approach for small, dedicated systems in which the system is
not very complex and its parts are close together. Think of yourself. You
are a biological system under central control. All your sensors (senses)
are wired to the human brain, the great central controller. The brain collects
information from the sensors, processes it, makes decisions based on
it, and issues commands to the various actuators (muscles) of the body.
It works.
But think of large social systems like organizations and governments,
which have to manage diversity in a wide geographical spread. They attempt
to decentralize control to achieve greater efficiency and effectiveness.
A computer network spreads its computing power to a set of scattered PCs
to decentralize control. For large systems having widespread entry/exit
points, decentralization may be a better technique. However, at the periphery
of such a system, the nodes may be under a central control. Therefore,
the social system may be treated as a collection of subsystems, each under
its own central control.
4. Stored Program Control
Inside the central controller is a program stored in the memory subsystem
that will determine the sequence of steps the controller has to carry out.
A program is a sequence of instructions arranged in a particular, meaningful
order to produce a useful result from the system. The program controls
the process.
Therefore, the control is centralized at two levels: at the central controller,
and at a program device (memory) within the central controller.
For example, a read-only memory (ROM) may hold the sequences of instructions
needed to perform the job. The system is under the control of a stored
program that carries the know-how and expertise of the team that developed
the program. The team's intelligence is at work even in its absence. This
is the advantage of the stored program. It creates portable human intelligence.
If you write a program, in your absence, your intelligence will be at work,
controlling all the systems that carry your program. You can package and
market your intelligence, and it will remain in the world.
5. Inside the Microelectronic Central Controller
FIG. 6 shows the internal view of a central controller that consists
of three subsystems: processing, memory, and input/output.
The three subsystems are linked to each other by buses. A bus is a group of
conductors running in parallel that is used to interconnect subsystems.
The subsystems communicate through the buses, sending digital data and
messages.
5.1 The Processing Subsystem
The processing subsystem is the programmable VLSI device called the microprocessor.
It should be provided with a power supply, a clock, and instructions (operational
codes). When these are provided, it takes overall control of the system.
FIG. 6 Internal view of a microelectronic central controller
RWM -- Read/write memory; NVM -- Nonvolatile memory

FIG. 7 Classification of semiconductor memory
5.2 The Memory Subsystem
The memory system contains programs and data. It consists of two main
types of memory, which have several subcategories, as shown in FIG. 7.
5.2.1 Read/Write Memory
Read/write memory is also known as random access memory (RAM). RAM is
fast compared to other forms of memory. RAM is temporary storage. When
electrical power is switched off, RAM loses all its contents (volatility).
Therefore, it’s used to store programs and data temporarily, while the
system is running. Memory speed is measured by the access time in nanoseconds.
The smaller the access time, the faster the memory. The parity technique
is used to check the integrity of the data bytes: a parity bit
is saved along with each byte in a memory location. But a single parity bit
reliably detects only single-bit errors (more generally, an odd number of
flipped bits) and cannot correct them. A normal RAM can be accessed from only one set
of bus lines, but dual-ported RAM supports two separate buses. This feature
is useful in applications like video RAM, where the microprocessor and
the cathode ray tube controller demand access to the same RAM. Another
version, nonvolatile RAM, keeps the memory contents even in the absence
of an external power supply.
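The parity check mentioned above can be sketched in a few lines of Python. Even parity is assumed here; the text does not say which convention is used, and the function names are illustrative only.

```python
def parity_bit(byte: int) -> int:
    """Even parity: the bit that makes the total number of ones even."""
    return bin(byte & 0xFF).count("1") % 2


def is_intact(byte: int, stored_parity: int) -> bool:
    """Recompute the parity on read and compare it with the stored bit."""
    return parity_bit(byte) == stored_parity


original = 0b10110010                 # four ones, so the parity bit is 0
stored = parity_bit(original)

one_bit_error = original ^ 0b00000001   # one bit flipped
two_bit_error = original ^ 0b00000011   # two bits flipped
```

Here `is_intact(one_bit_error, stored)` is false, so the error is caught, but `is_intact(two_bit_error, stored)` is true: the two flips cancel and the corruption passes unnoticed, which is the limitation described above.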
Depending on the operating principle, there are two main categories of
RAM: static RAM (SRAM) and dynamic RAM (DRAM). DRAM remembers data for
only a short period of time. Therefore, it has its own circuitry to refresh
itself (i.e., to remind itself of what it has remembered). It uses capacitors
to remember the logic ones and zeros. When the capacitor is charged it’s
in the logic 1 state. But capacitors remain charged for only a short period
of time, and therefore, the refreshing circuit needs to keep recharging
them. This circuitry has to look at each location and recharge the capacitors
to their previous levels.
DRAM has a very high packing density and therefore offers lower cost/bit
than SRAM. Static RAM, on the other hand, uses flip-flop circuits to store
bits. One flip-flop takes much more space than the capacitor used
in a DRAM cell, so the packing density of SRAM is low. But flip-flop cells are
faster than capacitor cells: access times often are as low as 8-14 ns. Also,
they need no refreshing.
The design variations in RAM ICs are identified as FPM RAM, EDO RAM, and
SDRAM. In FPM (fast page mode) RAM, the row address is sent only once
for a row (page) of its internal matrix. Then, for the other memory locations
in that row, only the column address is supplied. Within that memory page,
the access is faster; therefore, FPM RAM is faster at accessing bytes sequentially.
EDO (extended data out) RAM starts getting the next address before it finishes
reading the last address. It holds old data for a longer period of time
for the microprocessor to read, while it’s getting the next address from
the microprocessor. Therefore, the time gap between two successive readings
can be reduced, increasing performance. Synchronous dynamic RAM (SDRAM)
is a faster version of DRAM. Average access time can be as low as 8-10
ns. SDRAM employs an addressing modification. After one address operation,
for each clock cycle, it gives the next sequential address. Therefore,
it offers the other advantages of DRAM together with a low access time.
5.2.2 Nonvolatile Memory
The contents of nonvolatile memory won’t be erased even when the power
is switched off. In this category of ICs, the ratio of read operations
to write operations is very high. Nonvolatile memory includes ICs such
as ROM, PROM, UVEPROM, and EEPROM.
Read-only memory is a form of permanent storage. ROM is written during
the manufacturing process with user-supplied data. Once written, it can
never be changed. Because information on a ROM cannot be changed, it’s
useful to designers for storing parts that have no need to change (such
as the primary loader for starting). RAM often is faster than ROM. Therefore,
some designs "shadow" the ROM (copy it into RAM and then always
access the RAM version). Although this wastes RAM, a designer may trade
off space for performance.
Programmable read-only memory (PROM) is a form of permanent storage that
you can write to, but just once in its lifetime. A blank PROM contains all
ones. After the user writes data to it with a PROM programmer unit, it can only
be read. Pin-compatible PROMs can be used in place of ROMs during software
development.
Erasable programmable read-only memory (EPROM) can be programmed and reprogrammed
by the user with the assistance of a PROM programming unit and an ultraviolet
eraser. There are two types of EPROMs: ultraviolet EPROM (UVEPROM) and
electrically erasable PROM (EEPROM). UVEPROM can be erased only by exposing
it to ultraviolet light of specified intensity over a specified duration.
But no erasing of selected locations is possible, because the UV light
causes bulk erasing. One disadvantage is that, even to modify a few locations,
the whole IC needs to be erased and reprogrammed.
EEPROM is a form of EPROM that can be erased electrically. Erasing is
achieved by applying a special voltage. Erasing of selected locations
is possible. During reading, an EEPROM works just like any other EPROM. EEPROMs
can be rewritten a few million times.
One form of EEPROM, known as flash ROM, can be erased in a "flash." This
form of EEPROM is manufactured so that it usually is erased in blocks of
memory rather than character by character, and it’s cheaper to manufacture.
Other forms of nonvolatile memory include battery-backed RAMs and ferroelectric
memory.
5.3 The Input/Output Interface Subsystem
Any external device connected to a microprocessor-based central controller
is called a peripheral (e.g., the keyboard, VDU, mouse, printer, scanner,
light pen, joystick, hard and floppy disk drives, CD drive, plotter, magnetic
tape drive, and transducers, both sensors and actuators). The number of
peripherals can be tailored to fit the requirements of the user. Therefore,
computer systems have become very flexible and adaptable. Peripherals are
connected through ports ready-made for the export and import of information.
A port can be programmed to do either input operations, output operations,
or both. Ports are either parallel ports or serial ports.
The input/output (I/O) interface subsystem allows a digital message to
be passed between the central controller and the peripheral devices or
I/O devices.
The interface comes between a peripheral device and the microprocessor
buses, providing communication between peripherals and the microprocessor.
Peripherals usually are slower than the microprocessor, creating a timing
problem. Furthermore, connecting them to the buses of the central controller
presents additional problems, such as electrical buffering, code conversion,
and analog-to-digital conversion. The interface subsystem is there to solve
these problems. A few common examples of peripheral interfacing devices
follow:
PIO -- parallel input/output device.
PIA -- programmable interface adapter (the same as PIO).
USRT -- universal synchronous receiver-transmitter.
UART -- universal asynchronous receiver-transmitter.
CTC -- counter-timer circuit.
FDC -- floppy disk controller.
HDC -- hard disk controller.
CRTC -- cathode ray tube controller.
5.4 Buses
The three subsystems are joined by three conventional buses: the data bus,
the address bus, and the control bus.
The width of the data bus (number of lines) is an important figure for
the system (e.g., 8 bit, 16 bit, 32 bit, 64 bit). It carries both data
bytes and instruction operational codes. Each memory location and each
port has a unique address, given to it by the system designer. When
the microprocessor wants to exchange information with a memory IC or a
port, the address of that location or port is first deposited on the address
bus. The address bus broadcasts it to all subsystems. Therefore, it’s
unidirectional. With an n-bit address bus, the microprocessor can generate
2^n different addresses.
Control lines can be further categorized into "system status" lines,
which bring in information, and "system control" lines, which
carry the commands generated by the microprocessor to the other ICs in
the system.
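The relationship between address bus width and the number of distinct addresses is simple arithmetic, sketched below in Python. The function names are illustrative; the widths used in the usage note are common examples, not values taken from the text.

```python
def addressable_locations(address_lines: int) -> int:
    """Number of distinct addresses an n-bit address bus can generate (2^n)."""
    return 1 << address_lines


def highest_address(address_lines: int) -> str:
    """Highest address on the bus, formatted in hexadecimal."""
    return hex(addressable_locations(address_lines) - 1)
```

For instance, a 16-bit address bus yields `addressable_locations(16)` = 65536 locations (64 K), with addresses running from 0x0 to `highest_address(16)` = 0xffff. Each extra address line doubles the address space.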
===
TABLE 2 Signal Groups of a Microprocessor
Group | Description
Data lines | These lines carry data bytes and operational codes between the microprocessor and the external ICs; they are bidirectional.
Address lines | These carry memory addresses and addresses of input/output devices from the microprocessor to other subsystems; therefore they are unidirectional.
CPU control (system status) lines | These lines carry signals that provide information to the CPU relating to the present state of the system; the microprocessor reacts to these signals; therefore they also are called CPU control signals (e.g., the interrupt line, reset line, hold line).
System control/CPU status lines | These carry signals that provide information to the system relating to the current status of the microprocessor; the system reacts to these signals (e.g., the R/W line).
Service lines | These provide basic services needed for the microprocessor to operate (e.g., the power supply, external clock requirements).
===
6. Microprocessor Architecture
6.1 External View of a Microprocessor
To simplify the external view of a microprocessor, its pins can be categorized
into five groups as shown in Table 2. A typical microprocessor, Z80, and
its pin diagram and signal classification are shown in FIG. 8.
Even though these lines appear distinct in the Z80 block diagram, in some
microprocessors certain lines are made to perform different functions during
different parts of their computing cycles. To reduce the pin count of the
IC, certain pins are assigned two non-overlapping functions, a technique
called pin multiplexing (e.g., in the Z8000, address and data lines are multiplexed
on the same set of pins; the Z80 itself has separate address and data pins).
At one time, addresses appear on the lines and
in the next moment data appear on the same lines. Additional control signals
are sent to the system to identify them, for example, DS (data strobe,
asserted when data are on the lines) or AS (address strobe,
asserted when an address is on the lines). The system is responsible for separating
addresses and data using the information of the AS and DS pins. This is
called demultiplexing.
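The demultiplexing idea can be sketched in Python as a latch that captures the address while AS is asserted and pairs it with the data seen while DS is asserted. This is a simplified model under stated assumptions: the strobes are represented as plain booleans (electrical polarity is ignored), and the sample format and function name are hypothetical.

```python
def demultiplex(bus_samples):
    """Separate addresses and data that share the same lines.

    Each sample is (address_strobe, data_strobe, value), where `value` is
    whatever is on the multiplexed lines at that moment.
    Returns a list of (address, data) transfers.
    """
    latched_address = None   # plays the role of an external address latch
    transfers = []
    for address_strobe, data_strobe, value in bus_samples:
        if address_strobe:
            latched_address = value          # address phase: capture it
        elif data_strobe and latched_address is not None:
            transfers.append((latched_address, value))  # data phase
    return transfers


samples = [
    (True, False, 0x1234),   # address appears on the lines
    (False, True, 0x00AB),   # then data appear on the same lines
    (True, False, 0x1236),
    (False, True, 0x00CD),
]
```

Calling `demultiplex(samples)` pairs each latched address with the data that follow it.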
6.2 Internal View of a Microprocessor
A typical microprocessor consists of three main components: the arithmetic
and logic unit (ALU), the control unit, and a set of registers (FIG. 9).
The register set can be further categorized into general purpose and special
purpose registers.
6.2.1 Arithmetic and Logic Unit (ALU)
The ALU performs arithmetic operations such as addition and subtraction
and logic operations such as AND, OR, Exclusive OR, Complement, and Compare.
Multiplication and division are not available as direct operations in most
cases, but the control unit may use a program sequence (i.e., multiple
additions and shifts) to generate a multiplication. Hence, these operations
take much longer to execute than direct operations. For some operations,
such as shifting, both arithmetic and logical versions are available. Shift
operations are typically 1-bit shifts, with multiple-bit shifts performed
as a successive sequence of single-bit shifts. Shift operations can be:
• Logical shifts that don’t preserve the sign and fill empty bits with
zero.
• Arithmetic shifts that preserve the sign and fill empty bits with zeros
or an extended sign bit.
• Cyclical shifts that simply rotate the contents.
Bit operations are available to set or clear bits or test whether a specific
bit is set or reset.

FIG. 8 A typical microprocessor, Z80, pin diagram and classification.
(Source: Zilog Z80 databook.)
6.2.2 The Control Unit (Micro Code Interpreter)
The control unit or sequencer decodes the instruction given to it and
performs the appropriate sequence of actions. The timing and sequencing
of all the minute steps to execute the instruction are done by the control
unit.
6.2.3 The Registers
A microprocessor has several special purpose registers. Typical special
purpose registers are the accumulator(s), program counter, stack pointer,
flag register, and index register. The accumulator is used to feed one
input number to the ALU and, immediately after the operation, collect and
store the result of the operation. The program counter contains the address
of the next program instruction to be executed. Fetching the next instruction
starts when the program counter deposits this address on the address bus.
Index registers are used to offset addresses in memory when addressing
data tables and blocks.
The status register consists of a set of 1-bit indicators known as flags.
Each flag may be set or reset to indicate a condition. Therefore, the status
register also is known as the condition code register. The number and the
designations of these indicators vary from processor to processor. However,
carry (C), zero (Z), negative (N), overflow (V),
and interrupt (I) flags are common to almost all microprocessors. The results
of the arithmetic and logic operations typically affect the flags. For
example, if the result is 0, the Z flag will be set (Z = 1); if it’s negative,
the N flag gets set. Using these flags, the programmer can program the microprocessor
to make decisions during the actual operation.


FIG. 9 The basic architecture of a microprocessor: (a) Conceptual model,
(b) Practical implementation in Z80 (Source: Zilog Z80 databook.)
The flags are useful when programming the backward or forward conditional jumps
of a flowchart.
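How an ALU operation sets the flags, and how a conditional jump uses them, can be sketched in Python. The 8-bit word size, the two's-complement interpretation, and the names `add8` and `count_down` are assumptions made for illustration; real processors differ in which instructions affect which flags.

```python
def add8(a: int, b: int):
    """Add two 8-bit numbers; return the 8-bit result and the C, Z, N, V flags."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "C": total > 0xFF,               # carry out of bit 7
        "Z": result == 0,                # result is zero
        "N": bool(result & 0x80),        # sign bit of the result is set
        # Overflow: operands share a sign but the result's sign differs.
        "V": bool(~(a ^ b) & (a ^ result) & 0x80),
    }
    return result, flags


def count_down(counter: int) -> int:
    """Loop until the Z flag is set, like a backward conditional jump."""
    passes = 0
    while True:
        counter, flags = add8(counter, 0xFF)  # adding 0xFF decrements by one
        passes += 1
        if flags["Z"]:                        # "jump if not zero" falls through
            return passes
```

Adding 0x7F and 0x01 sets N and V (the signed result overflowed) but not C, while adding 0xFF and 0x01 sets C and Z. `count_down(3)` makes three passes before the Z flag ends the loop.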
In applications, it becomes necessary to temporarily save registers and
other data for future use, such as when a subprogram (subroutine) is executed
or the program is interrupted and the interrupt is to be serviced. A
stack is a sequential block of memory reserved for this purpose. A stack
pointer is used to manage this memory block as a last-in-first-out (LIFO)
data structure. The only byte that can be accessed on the stack is the
last byte pushed onto it, which is at the top of the stack. The stack pointer
is the register that keeps the address of the top of the stack. Furthermore,
the stack employs destructive reading (i.e., a byte is removed from the stack
when it is read).
Many ALU operations require
two input numbers. In accumulator-based microprocessors, the first number
usually comes from the accumulator and the other must be retrieved from
memory into an internal data buffer and then fed to the ALU. In register-based
microprocessors, a bank of general purpose registers may have direct access
to the ALU, and both operands can be stored directly in two of these registers
prior to the ALU operation (i.e., registers usually are preloaded with
the appropriate operands). In addition, these registers are very useful
for storing intermediate results to speed up operations.
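The LIFO stack discipline described in this section can be sketched in Python as a reserved memory block managed by a stack pointer. The block size, the downward growth direction, and the class name are assumptions for illustration; growth direction and pointer conventions vary between processors.

```python
class Stack:
    """A reserved memory block managed by a stack pointer (LIFO)."""

    def __init__(self, size: int = 16):
        self.memory = [0] * size
        self.sp = size  # empty stack: SP points just past the reserved block

    def push(self, byte: int) -> None:
        """Move SP down one location and store the byte there."""
        if self.sp == 0:
            raise OverflowError("stack overflow")
        self.sp -= 1
        self.memory[self.sp] = byte & 0xFF

    def pop(self) -> int:
        """Read the byte at the top of the stack and move SP back up."""
        if self.sp == len(self.memory):
            raise IndexError("stack underflow")
        byte = self.memory[self.sp]
        self.sp += 1  # the location is now free to be overwritten
        return byte


def save_and_restore(registers):
    """Push register values before a subroutine, pop them afterwards."""
    stack = Stack()
    for value in registers:
        stack.push(value)
    return [stack.pop() for _ in registers]
```

`save_and_restore([0x11, 0x22, 0x33])` returns the values in reverse order, which is why saved registers are restored in the opposite order to that in which they were pushed.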
6.3 Bottlenecks in the Basic Architecture
Two main bottlenecks can be identified in the basic architecture:
• The time taken to fetch the instructions from memory. The fetch cycle
does not deliver value to the customer; the execution cycle produces the
end result.
• The sequential nature of scheduling and executing instructions. One
instruction should be fully completed before the microprocessor can pay
attention to the next instruction.
Microprocessor designers have come up with various modifications to improve
these two areas. These attempts have brought about new architectural designs.
Most of them are centered around the concept of parallel processing. In
parallel processing, the three architectural designs are pipelines, array
processing, and multiprocessor systems.
6.3.1 Cache Memory
An increase in operating speed can be achieved through the use of a memory
cache. A cache is a relatively small, high-speed memory buffer. A cache
may be internal or external to the microprocessor. An internal memory cache
is a fast set of registers available within the microprocessor. An external
memory cache is a fast multiport RAM with a data and addressing bus separate
from the sequencer and the ALU. While the ALU is executing one operation,
the control unit loads the next instruction into an instruction cache register,
decodes the instruction, and loads all possible data operands into the
data cache registers. Once in the cache registers, these instructions and
data can be accessed instantly, reducing the normal memory read time. This
significantly increases the processing speed.
6.3.2 Pipeline Processing
The time taken by a processor to complete a program is determined by three
factors:
1. The number of instructions required to execute the program.
2. The average number of processor cycles required to execute an instruction.
3. The processor cycle time.
Processor performance is improved by reducing the completion time, which
involves reducing one or more of these factors. Pipeline processing is
a technique used by both reduced instruction set computers (RISC) and complex
instruction set computers (CISC) to break the execution of individual instructions
into stages and overlap the stages so several instructions can be processed
in parallel.
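The three factors listed above multiply together to give the completion time. The short Python sketch below makes that product explicit; the function name and the sample figures in the usage note are illustrative, not measurements.

```python
def completion_time_ns(instructions: int,
                       cycles_per_instruction: float,
                       cycle_time_ns: float) -> float:
    """Completion time = instruction count x cycles per instruction x cycle time."""
    return instructions * cycles_per_instruction * cycle_time_ns
```

For example, 1000 instructions at an average of 4 cycles each on a 10 ns clock take `completion_time_ns(1000, 4, 10)` = 40000 ns. Pipelining attacks the second factor: bringing the average down to 1 cycle per instruction would cut the same program to 10000 ns.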
To process instructions in a pipeline, the various steps of execution
need to be performed by pipeline stages, units that independently execute
the steps of different instructions. The result of each pipeline stage
is communicated to the next pipeline stage via a register between the stages.
Both 486 and Pentium CPUs use a five-stage pipeline:
1. Fetch an instruction from the processor cache or memory.
2. Decode the instruction.
3. Generate a memory address if the instruction includes a memory reference.
4. Execute the instruction.
5. Store, or "write back," the result.
Pipelines allow more than one microprocessor instruction to be serviced
at once, which allows the microprocessor to average one clock cycle for
each instruction. Under ideal conditions, each stage requires one clock
cycle. When the pipeline is fully loaded, an average of one instruction
per clock cycle can be produced by the pipeline. For example, the Pentium's
superscalar design, incorporating two independent pipelines, nominally
doubles the processor's instruction throughput.
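The claim that a full pipeline averages one instruction per clock cycle can be checked with a small Python sketch. It assumes ideal conditions as stated above (one cycle per stage, no stalls); the function names are illustrative.

```python
def unpipelined_cycles(instructions: int, stages: int = 5) -> int:
    """Each instruction passes through every stage before the next begins."""
    return instructions * stages


def pipelined_cycles(instructions: int, stages: int = 5) -> int:
    """The first instruction fills the pipeline; each later one adds a cycle."""
    if instructions == 0:
        return 0
    return stages + instructions - 1


def average_cycles_per_instruction(instructions: int, stages: int = 5) -> float:
    """Average cycles per instruction for the ideal pipeline."""
    return pipelined_cycles(instructions, stages) / instructions
```

With the five-stage pipeline, 100 instructions need 104 cycles instead of 500, an average of 1.04 cycles per instruction. The average approaches 1 as the instruction stream gets longer, because the fill time of the pipeline is paid only once.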
The term scalar processor denotes a processor that executes one instruction
at a time. A superscalar processor reduces the average number of cycles
per instruction beyond what is possible in a scalar processor by concurrent
execution of scalar instructions. Superscalar microprocessors are the next
step in the evolution of microprocessors.
(cont. in part 2)