update version 0.82

June 1, 2000:  Branch Prediction corrections.

Thanks to Andreas Kaiser.

The branch prediction hardware was changed from the simple 2-bit Smith predictor in the prototype to a more up-to-date two-level GShare predictor, based on the global branch history, for the final production version. The only reference found in the Athlon documentation is in the Athlon optimization guide, appendix A, page 180, where a GHBC (Global branch History Bimodal Counter) table is mentioned.

Branch History:
-- The outcome of the last N conditional branches (taken or not-taken).
Global Branch History:
-- All branches are taken into account. (One History register needed)
Local Branch History:
-- Only the branch-to-be-predicted is tracked. (Many History registers needed for the individual branches)
Bimodal Counter:
-- A saturating two-bit up/down counter. It counts up if the branch is taken and down if the branch is not-taken. It does not count up past 3 (the maximum) or down past 0 (the minimum). The counter values 2 and 3 predict the next branch to be taken; 0 and 1 predict not-taken.
Global Branch History Bimodal Counter Table:
-- The N bits of the global history are used as an index into a table of two-bit up/down counters. The value of the selected counter is used for the prediction.
Why does it work?
-- A particular part of a program may exhibit a single or a few typical branch patterns, like 1-0-0-1-0-0...
The particular branch-to-be-predicted will "train" the table entry that each such pattern selects, so that the entry predicts the branch correctly.
Do all the branches use the same table?
-- Yes. In most cases a provision is made to keep two branches with identical patterns from trashing each other's predictions. The address of the branch may, for instance, be XOR-ed with the pattern.
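
The definitions above can be tied together in a small software model. This is only a sketch under stated assumptions, not a description of the actual Athlon hardware: the names GlobalHistoryPredictor, saturating_update and count_mispredictions are hypothetical, and so are the 4-bit history length, the initial counter value of 1, and the choice to XOR the low address bits into the index.

```python
def saturating_update(counter, taken):
    """Two-bit up/down counter: up on taken, down on not-taken, clamped to 0..3."""
    if taken:
        return min(3, counter + 1)
    return max(0, counter - 1)


class GlobalHistoryPredictor:
    """Illustrative model: global history indexes a table of two-bit counters."""

    def __init__(self, history_bits=4, xor_address=True):
        self.mask = (1 << history_bits) - 1
        self.history = 0                            # last N outcomes, newest in bit 0
        self.counters = [1] * (1 << history_bits)   # assumed start: weakly not-taken
        self.xor_address = xor_address

    def _index(self, branch_address):
        index = self.history
        if self.xor_address:
            # Assumed provision against aliasing: mix in the branch address.
            index ^= branch_address
        return index & self.mask

    def predict(self, branch_address):
        # Counter values 2 and 3 predict taken, 0 and 1 predict not-taken.
        return self.counters[self._index(branch_address)] >= 2

    def update(self, branch_address, taken):
        i = self._index(branch_address)
        self.counters[i] = saturating_update(self.counters[i], taken)
        # Shift the actual outcome into the single global history register.
        self.history = ((self.history << 1) | int(taken)) & self.mask


def count_mispredictions(predictor, branch_address, outcomes):
    """Predict, then train, on each outcome; return the number of wrong predictions."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(branch_address) != taken:
            misses += 1
        predictor.update(branch_address, taken)
    return misses


if __name__ == "__main__":
    predictor = GlobalHistoryPredictor(history_bits=4)
    pattern = [True, False, False] * 10             # the 1-0-0-1-0-0... pattern
    print("warm-up misses:", count_mispredictions(predictor, 0x4A, pattern))
    print("trained misses:", count_mispredictions(predictor, 0x4A, pattern))
```

With a 4-bit history, the repeating 1-0-0 pattern produces three distinct history values, so each one selects its own counter. After a short warm-up, every counter has been trained toward the outcome that follows its history value, and the pattern is predicted without further misses.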

Also see Andreas Kaiser's http://www.s.netic.de/ak/k7doc.pdf
and Paul Hsieh's 7th generation x86 CPU comparisons


The AMD patents issued so far give no clear indication that AMD plans to use a technique similar to the Alpha 21264 branch predictor selector, which selects the recently most successful predictor from a whole range of predictors (US patent 5758142). This method is believed to be the most successful. Intel has a patent on a similar technique (US patent 5687360), filed later but issued earlier. (It is the filing date that counts in the US.) It is clear, however, that AMD is very aware of this technique: all its recent patents on branch prediction refer to the Alpha patent.
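
The idea of selecting the recently most successful predictor can be illustrated with a toy model. This is a minimal sketch, not the mechanism of the Alpha 21264 or of either patent: it is reduced to two component predictors, and the names StaticPredictor and Selector, as well as the use of a single two-bit choice counter, are assumptions made for illustration.

```python
class StaticPredictor:
    """Trivial stand-in component that always predicts the same direction."""

    def __init__(self, direction):
        self.direction = direction

    def predict(self):
        return self.direction

    def update(self, taken):
        pass  # a real component would train itself here


class Selector:
    """Chooses between two predictors using a saturating two-bit choice counter."""

    def __init__(self, first, second):
        self.first = first
        self.second = second
        self.choice = 1  # 0 and 1 favour `first`, 2 and 3 favour `second`

    def predict(self):
        chosen = self.second if self.choice >= 2 else self.first
        return chosen.predict()

    def update(self, taken):
        first_ok = self.first.predict() == taken
        second_ok = self.second.predict() == taken
        # Move the choice only when exactly one component was right.
        if second_ok and not first_ok:
            self.choice = min(3, self.choice + 1)
        elif first_ok and not second_ok:
            self.choice = max(0, self.choice - 1)
        self.first.update(taken)
        self.second.update(taken)


if __name__ == "__main__":
    selector = Selector(StaticPredictor(False), StaticPredictor(True))
    for _ in range(3):
        print("prediction:", selector.predict())
        selector.update(True)  # the branch is always taken in this toy run
```

In this toy run the selector starts out trusting the always-not-taken component, sees it fail while the other component succeeds, and switches to the always-taken component after the first update.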