Date: Mon, 04 Jan 1999 09:35:37 -0600
From: Bruce Schneier
To: cypherpunks@toad.com
Subject: Twofish speedup -- 258 clocks per block!

We came up with a new register assignment scheme that allowed us to cut out nearly two clocks per Twofish round. The only performance numbers that change are those for the Pentium Pro in ASM compiled mode, where a block can now be encrypted or decrypted in 258 clocks, and the key setup is 10,900 clocks. The measured number seems to fluctuate between 257 and 258, depending on the time of day, so we feel more comfortable with the larger number.

It's not worthwhile to go into more detail on how it works; look at the code (namely, the macro RF_PPro) if you want to understand it.

Twofish was already the fastest AES submission on general 32-bit CPUs. Twofish is now within 3% of the fastest algorithm (RC6 at 250 clocks per block) on the Pentium Pro/II.

There is new ASM code on the Twofish website, at . There's also new reference C and optimized C code. The new code is cleaned up a little and includes Watcom C support. There are also new explanatory comments and test code in the include file AES.H.

Many thanks to Allan Stokes for the idea.

Bruce

**********************************************************************
Bruce Schneier, President, Counterpane Systems     Phone: 612-823-1098
101 E Minnehaha Parkway, Minneapolis, MN 55419       Fax: 612-823-1590
Free crypto newsletter. See: http://www.counterpane.com