twofish-speedup.txt
Date: Mon, 04 Jan 1999 09:35:37 -0600
From: Bruce Schneier <schneier@counterpane.com>
To: cypherpunks@toad.com
Subject: Twofish speedup -- 258 clocks per block!
We came up with a new register assignment scheme that allowed us to cut out
nearly two clocks per Twofish round. There is no change to any performance
numbers except for the Pentium Pro in ASM compiled mode, where a block can
now be encrypted or decrypted in 258 clocks, and the key setup is 10,900
clocks. The number seems to fluctuate between 257 and 258, depending on
the time of day, so we feel more comfortable with the larger number. It's
not worthwhile to go into more detail on how it works; look at the code
(namely, the macro RF_PPro) if you want to understand it.
Twofish was already the fastest AES submission on general 32-bit CPUs.
Twofish is now within about 3% of the fastest algorithm (RC6 at 250 clocks
per block) on the Pentium Pro/II.
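For readers who want to check the arithmetic, here is a minimal sketch in Python. The clock counts (257, 258, and RC6's 250) come from this message. The function names are hypothetical, and the figure of 16 rounds per block is taken from the Twofish specification rather than from this message.

```python
# Rough arithmetic check of the figures quoted in this message.
# Function names are illustrative only; they are not part of the Twofish code.

TWOFISH_ROUNDS = 16  # rounds per block, per the Twofish specification


def relative_gap(clocks: float, baseline: float) -> float:
    """Return how much slower `clocks` is than `baseline`, as a fraction."""
    return (clocks - baseline) / baseline


def max_clocks_saved_per_block(clocks_saved_per_round: float) -> float:
    """Upper bound on the per-block saving for a given per-round saving."""
    return clocks_saved_per_round * TWOFISH_ROUNDS


if __name__ == "__main__":
    rc6 = 250  # RC6 clocks per block on the Pentium Pro/II, as quoted above
    for twofish in (257, 258):
        print(f"{twofish} clocks vs RC6: {relative_gap(twofish, rc6):.1%} slower")
    # "Nearly two clocks per round" bounds the per-block saving from above.
    print(f"saving per block: under {max_clocks_saved_per_block(2)} clocks")
```

At 258 clocks the gap to RC6 works out to 3.2%, and at 257 clocks to 2.8%, so "about 3%" covers both measurements. A saving of nearly two clocks per round over 16 rounds is a little under 32 clocks per block.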
There is new ASM code on the Twofish website, at
<http://www.counterpane.com/twofish.html>. There's also new reference C
and optimized C code. The new code is cleaned up a little and includes
Watcom C support. There are also new explanatory comments and test code in the
include file AES.H.
Many thanks to Allan Stokes for the idea.
Bruce
**********************************************************************
Bruce Schneier, President, Counterpane Systems Phone: 612-823-1098
101 E Minnehaha Parkway, Minneapolis, MN 55419 Fax: 612-823-1590
Free crypto newsletter. See: http://www.counterpane.com