Branch prediction pipelines

Pavel I. Kryukov edited this page May 7, 2019 · 3 revisions

Introduction

As mentioned in the previous manual, the BPU model tries to guess the direction of a branch. This lets the Fetch stage continue fetching instructions without stalling the pipeline: predicted instructions are fetched and enter the pipeline. The BPU cannot be right every time, so its predictions have to be verified. When a wrongly fetched instruction is detected, that instruction and all younger ones have to be flushed from the pipeline.

Three branch types

pikryukov: You are talking about stages here, but you describe them in the next paragraph. Can you describe stages first, and then map branch types?

| Type | When do we know the direction | When do we know the target |
|---|---|---|
| Unconditional direct (J, JAL) | Always taken | At Decode stage |
| Conditional direct (BEQ, BNE etc.) | At Execute stage | At Decode stage |
| Unconditional indirect (JR, JALR) | Always taken | At Execute stage |

Everything that is unknown at the Fetch stage has to be predicted. A wrong prediction of the direction is called a direction misprediction; a wrong prediction of the target is called a target misprediction.
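The table can be restated as a small lookup, which makes it easy to see at which stage each branch type is fully resolved. This is only a sketch: the names `Stage`, `BranchKind`, `Resolution`, `resolution_of` and `fully_resolved_at` are hypothetical and not taken from the simulator. It also assumes that "always taken" becomes certain at the Decode stage, once the opcode is known.

```cpp
// Hypothetical names for illustration only; not the simulator's types.
enum class Stage { Fetch, Decode, Execute };

enum class BranchKind {
    UnconditionalDirect,   // J, JAL
    ConditionalDirect,     // BEQ, BNE etc.
    UnconditionalIndirect  // JR, JALR
};

struct Resolution {
    Stage direction_known; // stage where the direction becomes certain
    Stage target_known;    // stage where the target becomes certain
};

// Assumption: an "always taken" direction is certain once the opcode
// has been decoded, i.e. at the Decode stage.
constexpr Resolution resolution_of(BranchKind kind) {
    switch (kind) {
    case BranchKind::UnconditionalDirect:   return {Stage::Decode,  Stage::Decode};
    case BranchKind::ConditionalDirect:     return {Stage::Execute, Stage::Decode};
    case BranchKind::UnconditionalIndirect: return {Stage::Decode,  Stage::Execute};
    }
    return {Stage::Execute, Stage::Execute}; // conservative fallback
}

// A branch is fully resolved at the later of the two stages.
constexpr Stage fully_resolved_at(BranchKind kind) {
    const Resolution r = resolution_of(kind);
    return r.direction_known < r.target_known ? r.target_known
                                              : r.direction_known;
}
```

Under these assumptions, only unconditional direct jumps can be fully verified at Decode; the other two types have to wait for Execute.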

General overview of stages

Fetch stage

pikryukov: I don't care what C++ function we obtain target. I want to know that happens here. What is "target"? What it defines? What is BPU doing here? You multiply a lot of confusion here and your story would not resolve 10% of it afterwards. Confuse people slowly, introduce complexity step by step and as much as it is needed.

In the fetch module, the target is obtained with the get_cached_target( cycle) function. This function checks several sources (core, BPU, decode etc.) with a certain priority and returns the target if it is valid and available. The received target is used to fetch the instruction and to predict the next target using the BPU.
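The idea of picking a target from several sources by priority can be sketched as follows. This is not the simulator's get_cached_target(): the `TargetSources` struct, the `select_target` function and, in particular, the priority order (execute redirect first, then decode redirect, then the BPU prediction) are assumptions made for illustration. The source text does not state the actual order, and the "core" source is left out here.

```cpp
#include <cstdint>
#include <optional>

using Addr = std::uint64_t;

// Hypothetical container for the targets visible to Fetch in one cycle.
// An empty optional means that the source has nothing valid this cycle.
struct TargetSources {
    std::optional<Addr> from_execute; // redirect after a misprediction found at Execute
    std::optional<Addr> from_decode;  // redirect after a misprediction found at Decode
    std::optional<Addr> from_bpu;     // prediction made in the previous cycle
};

// Assumed priority: a redirect from an older stage overrides anything
// coming from a younger one, because it corrects an older instruction.
inline std::optional<Addr> select_target(const TargetSources& s) {
    if (s.from_execute) return s.from_execute;
    if (s.from_decode)  return s.from_decode;
    return s.from_bpu; // may be empty: nothing to fetch this cycle
}
```

If no source is valid, the sketch returns an empty optional and Fetch would have nothing to do in that cycle.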

Decode stage

Some mispredictions are detected in the decode module with the is_misprediction() function. It takes the decoded instruction and the prediction for the next instruction and checks whether the prediction is correct. A mistake uncovered at this stage causes a short flush, i.e., the instruction at the Fetch stage has to be removed from the pipeline.
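Which mispredictions are visible at Decode can be derived from the branch type table: the target of a direct branch and the "always taken" direction of a jump are known here, while the outcome of a conditional branch and the target of an indirect jump are not. The sketch below encodes that reading. It is an assumption-based illustration, not the simulator's is_misprediction(); `Kind`, `Decoded`, `Prediction` and `decode_detects_misprediction` are hypothetical names.

```cpp
#include <cstdint>

using Addr = std::uint64_t;

enum class Kind { NotBranch, UnconditionalDirect, ConditionalDirect, UnconditionalIndirect };

struct Decoded {
    Kind kind;
    Addr target; // meaningful only for direct branches
};

struct Prediction {
    bool taken;
    Addr target; // meaningful only if taken
};

// Returns true if the prediction is already known to be wrong at Decode.
inline bool decode_detects_misprediction(const Decoded& instr, const Prediction& pred) {
    switch (instr.kind) {
    case Kind::NotBranch:
        return pred.taken;                                   // nothing to jump from
    case Kind::UnconditionalDirect:
        return !pred.taken || pred.target != instr.target;   // direction and target known
    case Kind::ConditionalDirect:
        return pred.taken && pred.target != instr.target;    // only the target is known
    case Kind::UnconditionalIndirect:
        return !pred.taken;                                  // only the direction is known
    }
    return false;
}
```

Anything this check cannot decide (the outcome of a conditional branch, the target of an indirect jump) is left for the Execute stage.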

pikryukov: What are the "some mispredictions"? Again, everyone is confused even more! Would you attend lectures if topic was described like that?

Execute stage

At this stage, in the execute module, the remaining mispredictions are caught. A dedicated branch module provides an is_misprediction() function, which is used in the same way as the is_misprediction() function in the decode module. These mispredictions cause a flush of the instructions at the Fetch and Decode stages.
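The difference between the short flush at Decode and the flush at Execute is only in how many younger stages are cleared. A minimal sketch, assuming a three-stage view of the pipeline and hypothetical names (`Pipeline`, `flush_younger_than`), with each stage holding at most one instruction id:

```cpp
#include <array>
#include <cstddef>
#include <optional>

// Stages ordered from youngest (Fetch) to oldest (Execute).
enum Stage : std::size_t { Fetch = 0, Decode = 1, Execute = 2, StageCount = 3 };

// Each slot holds the id of the instruction currently in that stage, if any.
using Pipeline = std::array<std::optional<int>, StageCount>;

// Removes every instruction younger than the stage that detected the
// misprediction and returns how many instructions were removed.
inline std::size_t flush_younger_than(Pipeline& pipe, Stage detected_at) {
    std::size_t flushed = 0;
    for (std::size_t s = 0; s < detected_at; ++s) {
        if (pipe[s]) {
            pipe[s].reset();
            ++flushed;
        }
    }
    return flushed;
}
```

With a full pipeline, `flush_younger_than(pipe, Decode)` clears only the Fetch slot (the short flush), while `flush_younger_than(pipe, Execute)` clears both the Fetch and Decode slots.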

Implementation

Unconditional direct jumps

Fetch stage

pikryukov: you are retelling the C++ code here. All the names of ports, methods, C++ types are irrelevant, we describe microarchitecture here.

The following ports are used:

  • write side of port "TARGET" – to provide next cycle with BPU prediction;
  • read side of port "TARGET" – to get BPU prediction from previous cycle;
  • read side of port "DECODE_2_FETCH" – to get the true information to update BPU;
  • read side of port "DECODE_2_FETCH_TARGET" – to get decoded target and fetch the right instruction.

Decode stage

At this stage the direction and the target of the branch are known for certain.
The following ports are used:

  • write side of port "DECODE_2_FETCH" – to send actual information to update BPU;
  • write side of port "DECODE_2_FETCH_TARGET" – to send valid decoded target to fetch module in case of misprediction;
  • write side of port "DECODE_2_FETCH_FLUSH" – to send, in case of misprediction, a flush signal to the instruction in the fetch module (which would be in the decode module in the next cycle);
  • read side of port "DECODE_2_FETCH_FLUSH" – to get flush signal from previous cycle and remove instruction from pipeline.
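All ports listed above follow the same pattern: a value written in one cycle becomes visible on the read side in the next cycle. The sketch below models that behaviour with a hypothetical `OneCyclePort` class; it is not the simulator's port implementation, and the one-cycle latency is inferred from the "next cycle" / "previous cycle" wording in the lists.

```cpp
#include <cstdint>
#include <optional>

using Cycle = std::uint64_t;

// Hypothetical single-entry port: data written at cycle N can be read
// only at cycle N + 1.
template <typename T>
class OneCyclePort {
public:
    void write(const T& value, Cycle cycle) {
        data_ = value;
        ready_at_ = cycle + 1;
    }

    // Returns the value if it was written in the previous cycle and has
    // not been read yet; otherwise returns an empty optional.
    std::optional<T> read(Cycle cycle) {
        if (data_ && cycle == ready_at_) {
            std::optional<T> out = data_;
            data_.reset();
            return out;
        }
        return std::nullopt;
    }

private:
    std::optional<T> data_;
    Cycle ready_at_ = 0;
};
```

For example, a flush signal written by Decode at cycle 5 through such a port would be seen at cycle 6, when the mispredicted instruction from Fetch has moved to Decode and can be dropped there.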

Conditional direct branches

Fetch stage

  • write side of port "TARGET" – to provide next cycle with BPU prediction;
  • read side of port "TARGET" – to get BPU prediction from previous cycle;
  • read side of port "DECODE_2_FETCH" – to get the true information to update BPU;
  • read side of port "DECODE_2_FETCH_TARGET" – to get decoded target and fetch the right instruction;
  • read side of port "BRANCH_2_FETCH" – to get the true information to update BPU;
  • read side of port "BRANCH_2_FETCH_TARGET" – to get calculated target and fetch the right instruction.

