Architecture Overview (Continued)
1.1 INTEGER UNIT
The integer unit consists of:
• Instruction Buffer
• Instruction Fetch
• Instruction Decoder and Execution
The pipelined integer unit fetches, decodes, and executes
x86 instructions through the use of a five-stage integer
The instruction fetch pipeline stage generates, from the on-
chip cache, a continuous high-speed instruction stream for
use by the processor. Up to 128 bits of code are read dur-
ing a single clock cycle.
Branch prediction logic within the prefetch unit generates a
predicted target address for unconditional or conditional
branch instructions. When a branch instruction is detected,
the instruction fetch stage starts loading instructions at the
predicted address within a single clock cycle. Up to 48
bytes of code are queued prior to the instruction decode
The instruction decode stage evaluates the code stream
provided by the instruction fetch stage and determines the
number of bytes in each instruction and the instruction
type. Instructions are processed and decoded at a maxi-
mum rate of one instruction per clock.
The address calculation function is pipelined and contains
two stages, AC1 and AC2. If the instruction refers to a
memory operand, AC1 calculates a linear memory address
for the instruction.
The AC2 stage performs any required memory manage-
ment functions, cache accesses, and register file
accesses. If a floating point instruction is detected by AC2,
the instruction is sent to the floating point unit for process-
The execution stage, under control of microcode, executes
instructions using the operands provided by the address
Write-back, the last stage of the integer unit, updates the
register file within the integer unit or writes to the load/store
unit within the memory management unit.
1.2 FLOATING POINT UNIT
The floating point unit (FPU) interfaces to the integer unit
and the cache unit through a 64-bit bus. The FPU is x87-
instruction-set compatible and adheres to the IEEE-754
standard. Because almost all applications that contain FPU
instructions also contain integer instructions, the GX1 pro-
cessor’s FPU achieves high performance by completing
integer and FPU operations in parallel.
FPU instructions are dispatched to the pipeline within the
integer unit. The address calculation stage of the pipeline
checks for memory management exceptions and accesses
memory operands for use by the FPU. Once the instruc-
tions and operands have been provided to the FPU, the
FPU completes instruction execution independently of the
1.3 WRITE-BACK CACHE UNIT
The 16 KB write-back unified (data/instruction) cache is
configured as four-way set associative. The cache stores
up to 16 KB of code and data in 1024 cache lines.
The GX1 processor provides the ability to allocate a portion
of the L1 cache as a scratchpad, which is used to acceler-
ate the Virtual Systems Architecture technology algorithms
as well as for some graphics operations.
1.4 MEMORY MANAGEMENT UNIT
The memory management unit (MMU) translates the linear
address supplied by the integer unit into a physical address
to be used by the cache unit and the internal bus interface
unit. Memory management procedures are x86-compati-
ble, adhering to standard paging mechanisms.
The MMU also contains a load/store unit that is responsible
for scheduling cache and external memory accesses. The
load/store unit incorporates two performance-enhancing
• Load-store reordering that gives memory reads
required by the integer unit a priority over writes to
• Memory-read bypassing that eliminates unnecessary
memory reads by using valid data from the execution
1.5 INTERNAL BUS INTERFACE UNIT
The internal bus interface unit provides a bridge from the
GX1 processor to the integrated system functions (i.e.,
memory subsystem, display controller, graphics pipeline)
and the PCI bus interface.
When external memory access is required, the physical
address is calculated by the memory management unit and
then passed to the internal bus interface unit, which trans-
lates the cycle to an X-Bus cycle (the X-Bus is a proprietary
internal bus which provides a common interface for all of
the integrated functions). The X-Bus memory cycle is arbi-
trated between other pending X-Bus memory requests to
the SDRAM controller before completing.
In addition, the internal bus interface unit provides configu-
ration control for up to 20 different regions within system
memory with separate controls for read access, write
access, cacheability, and PCI access.