update version 0.82
June 1, 2000: Branch Prediction
Thanks to Andreas Kaiser.
The branch prediction hardware was changed from
the simple 2-bit Smith predictor in the prototype to a more up-to-date
two-level GShare predictor, based on the global branch history, for
the final production version. The only reference found in the Athlon documentation
is in the Athlon optimization guide, appendix A, page 180, which mentions a GHBC
(Global branch History Bimodal Counter) table.
Branch History:
-- The outcome of the last N conditional branches
(taken or not-taken).
Global Branch History:
-- All branches are taken into account. (One
History register needed)
Local Branch History:
-- Only the branch-to-be-predicted is tracked.
(Many History registers needed for the individual branches)
Bimodal Counter:
-- A saturating two-bit up/down counter.
Counts up if the branch is taken. Counts down if the branch is not-taken.
Does not count up past 3 (the maximum) and does not count down past 0 (the
minimum). The counter values 2 and 3 predict the next branch to be taken,
0 and 1 predict not-taken.
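The counter described above can be sketched in a few lines of Python. This is only an illustration: the class name `TwoBitCounter` and the starting value of 1 are assumptions made for the example, not details taken from the Athlon documentation.

```python
class TwoBitCounter:
    """Saturating two-bit up/down counter with values 0..3."""

    def __init__(self, value=1):
        # Starting value is an assumption for this sketch (weakly not-taken).
        self.value = value

    def predict(self):
        # Values 2 and 3 predict taken, 0 and 1 predict not-taken.
        return self.value >= 2

    def update(self, taken):
        # Count up on taken, down on not-taken, saturating at 3 and 0.
        if taken:
            self.value = min(self.value + 1, 3)
        else:
            self.value = max(self.value - 1, 0)
```

Because the counter saturates, a single not-taken outcome after a long run of taken branches moves it from 3 to 2, so the next prediction is still "taken".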
Global Branch History Bimodal Counter Table:
-- The N bits of the global history are
used as an index into a table of two-bit up/down counters.
The value of the selected counter is used for the prediction.
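A minimal sketch of such a table follows, assuming a single global history register whose N bits select one of 2^N two-bit counters. The class name, the default of 4 history bits and the initial counter value are hypothetical choices for the example; the real GHBC table size is not given here.

```python
class GlobalHistoryPredictor:
    """Table of two-bit counters indexed by the global branch history."""

    def __init__(self, history_bits=4):
        self.mask = (1 << history_bits) - 1
        self.history = 0                           # last N outcomes, newest in bit 0
        self.counters = [1] * (1 << history_bits)  # assumed start: weakly not-taken

    def predict(self):
        # The counter selected by the current history gives the prediction.
        return self.counters[self.history] >= 2

    def update(self, taken):
        i = self.history
        # Train the selected counter with saturation at 0 and 3.
        if taken:
            self.counters[i] = min(self.counters[i] + 1, 3)
        else:
            self.counters[i] = max(self.counters[i] - 1, 0)
        # Shift the outcome into the history register, keeping N bits.
        self.history = ((self.history << 1) | int(taken)) & self.mask
```

Typical use is to call `predict()` before a branch resolves and `update(taken)` once its outcome is known.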
Why does it work?
-- A particular part of a program may exhibit
a single or a few typical branch patterns like 1-0-0-1-0-0..
Each point in such a pattern selects its own entry in the table, and
the particular branch-to-be-predicted will "train"
that entry to predict the branch correctly.
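The training effect can be tried out with a small simulation. The function below is a sketch under assumptions: a 3-bit global history, counters starting at 1, and a single branch repeating the 1-0-0 pattern with no other branches in between. The name `simulate` is made up for the example.

```python
def simulate(pattern, repeats, history_bits=3):
    """Return a list of booleans, True where the prediction was correct."""
    mask = (1 << history_bits) - 1
    counters = [1] * (1 << history_bits)  # assumed start: weakly not-taken
    history = 0
    hits = []
    for taken in pattern * repeats:
        prediction = counters[history] >= 2
        hits.append(prediction == bool(taken))
        # Train the entry selected by the current history.
        if taken:
            counters[history] = min(counters[history] + 1, 3)
        else:
            counters[history] = max(counters[history] - 1, 0)
        history = ((history << 1) | taken) & mask
    return hits


hits = simulate([1, 0, 0], repeats=10)
print(hits.count(False), "mispredictions out of", len(hits))
```

After a short warm-up the three history values that occur in the loop (100, 001 and 010) each own a trained counter, so the remaining iterations are predicted without misses. A plain two-bit counter per branch could not do this, since it would settle on "not-taken" and miss every third branch.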
Do all the branches use the same table?
-- Yes. In most cases provisions are made so that
two branches with identical patterns do not thrash each other's predictions.
The address of the branch may, for instance, be XOR-ed with the pattern.
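The XOR trick can be illustrated as follows. This is a sketch only: the function name `gshare_index`, the 4-bit index width, the example addresses and the use of the lowest address bits are all assumptions, and which address bits the Athlon actually uses is not stated here.

```python
def gshare_index(branch_address, history, index_bits):
    """Combine branch address and global history into a table index."""
    mask = (1 << index_bits) - 1
    # XOR the low address bits with the history pattern (assumed bit selection).
    return (branch_address ^ history) & mask


# Two hypothetical branches that see the same history pattern 1001
# are steered to different counters in the shared table.
history = 0b1001
print(gshare_index(0x4012, history, 4))  # index for the first branch
print(gshare_index(0x4027, history, 4))  # index for the second branch
```

Without the XOR both branches would share the entry selected by the pattern alone, and a taken branch and a not-taken branch with the same history would keep undoing each other's training.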
Also see Andreas Kaiser's http://www.s.netic.de/ak/k7doc.pdf
and Paul Hsieh's 7th generation x86 CPU comparisons.
The AMD patents issued so far do not offer a clear
indication that AMD plans to use a technique similar to the Alpha 21264
branch predictor selector (US patent 5758142), which selects the recently most
successful predictor from a whole range of predictors. This method is believed to
be the most successful. Intel has a patent on a similar technique (US
patent 5687360), filed later but issued earlier. (It is the filing date
which counts in the US.) It is clear, however, that AMD is very aware of
this technique. All its recent patents on branch prediction refer to the