random memes }

A measure of Skein

Looking for a fast hash function to generate large (~256-bit) hash values (for another experiment). Recalled from Schneier's write-up that Skein was relatively fast compared to other entrants in the SHA-3 competition. Needed a number, so pulled down the V1.3 NIST submission and wrote a small program to measure the rate of bulk hashing. The sources are up on GitHub as skein-nist.

$ make
mkdir o
mkdir bin
cc -O3 -c -o o/rate.o sources/rate.cpp
cc -O3 -c -o o/skein.o sources/skein.c
cc -O3 -c -o o/skein_block.o sources/skein_block.c
cc -O3 -lc++ -o bin/rate o/rate.o o/skein.o o/skein_block.o
bin/rate
Size of buffer : 4194304
Size of chunk  : 65536
Running for minimum of 30 seconds...
Passes: 2081 elapsed: 30 total: 8324MB rate: 277MB/s
Expected hash bits : 512 bytes: 64
       0                        1                        2                        3                        4                        5
       0  1 2  3 4  5 6  7 8  9 0  1 2  3 4  5 6  7 8  9 0  1 2  3 4  5 6  7 8  9 0  1 2  3 4  5 6  7 8  9 0  1 2  3 4  5 6  7 8  9 0  1 2  3 4  5 6  7 8  9
       048260482604826048260482604826048260482604826048260482604826048260482604826048260482604826048260482604826048260482604826048260482604826048260482604826
Hash : ea326072100f5bc7eb0c38a64bbb87cf597fa3c2270651979088b6fd95771f909ae1b4069ff6b4b20b5df0cdc92814d41bc93427f1dbe60bed13748b731ce507

~277MB/s on a late-2013 MacBook Pro (2.3 GHz i7 CPU).
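
For anyone wanting to reproduce the shape of the measurement without the Skein sources, here is a minimal sketch in Python. It is not the rate program from the repository. It assumes the buffer is fed to the hash in chunk-sized updates with a fresh hash state per pass, and it uses `hashlib.blake2b` purely as a stand-in, since Skein is not in the Python standard library. The names `measure` and `rate_mb_per_s` are made up for illustration. The buffer and chunk sizes are the ones printed in the run above, and "MB" is taken to mean 2^20 bytes, which is what makes 2081 passes of 4194304 bytes come out to 8324MB.

```python
import hashlib
import time

BUFFER_SIZE = 4 * 1024 * 1024   # 4194304 bytes, as printed in the run above
CHUNK_SIZE = 64 * 1024          # 65536 bytes, as printed in the run above
MB = 1024 * 1024                # assumption: "MB" in the output means 2**20 bytes


def rate_mb_per_s(passes, buffer_size, elapsed_seconds):
    """Whole MB hashed per second, truncated toward zero."""
    total_mb = passes * buffer_size // MB
    return int(total_mb / elapsed_seconds)


def measure(min_seconds, new_hash=hashlib.blake2b):
    """Hash the buffer repeatedly until at least min_seconds have elapsed.

    new_hash is a stand-in constructor (BLAKE2b here, not Skein).
    Returns (passes, elapsed_seconds).
    """
    buffer = bytes(BUFFER_SIZE)      # zero-filled placeholder data
    view = memoryview(buffer)        # slice without copying
    passes = 0
    start = time.perf_counter()
    while True:
        h = new_hash()               # assumption: fresh hash state per pass
        for offset in range(0, BUFFER_SIZE, CHUNK_SIZE):
            h.update(view[offset:offset + CHUNK_SIZE])
        h.digest()
        passes += 1
        elapsed = time.perf_counter() - start
        if elapsed >= min_seconds:
            return passes, elapsed


if __name__ == "__main__":
    passes, elapsed = measure(min_seconds=1.0)
    print("Passes:", passes, "rate:",
          rate_mb_per_s(passes, BUFFER_SIZE, elapsed), "MB/s")
```

Plugging in the numbers from the run, `rate_mb_per_s(2081, 4194304, 30)` gives 277, matching the reported rate. Numbers from the Python loop itself say something about BLAKE2b and interpreter overhead, not about Skein.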

I am looking to "fingerprint" bulk data for de-duplication of storage. The Skein result is a bit slow for my purpose, but good enough for initial experiments. There may be faster implementations of Skein. There may be faster hash functions with the needed properties. (Something to check, later.)
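
To make the "fingerprint" idea concrete, here is a rough sketch of fixed-size-chunk de-duplication in Python. Everything in it is an assumption for illustration: the chunk size, the function names (`fingerprint`, `deduplicate`, `rebuild`), and the use of `hashlib.blake2b` with a 32-byte digest as a stand-in for whichever 256-bit hash ends up being chosen. It also treats equal fingerprints as equal chunks, which relies on the hash being collision resistant.

```python
import hashlib

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size, chosen only for illustration


def fingerprint(chunk):
    """Return a 256-bit (32-byte) fingerprint of a chunk (BLAKE2b stand-in)."""
    return hashlib.blake2b(chunk, digest_size=32).digest()


def deduplicate(data, chunk_size=CHUNK_SIZE):
    """Split data into fixed-size chunks and keep each distinct chunk once.

    Returns (store, recipe): store maps fingerprint -> chunk bytes, and
    recipe lists the fingerprints needed to rebuild data in order.
    """
    store = {}
    recipe = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        key = fingerprint(chunk)
        store.setdefault(key, chunk)   # keep only the first copy of each chunk
        recipe.append(key)
    return store, recipe


def rebuild(store, recipe):
    """Reassemble the original bytes from the store and the recipe."""
    return b"".join(store[key] for key in recipe)


if __name__ == "__main__":
    data = b"a" * CHUNK_SIZE + b"b" * CHUNK_SIZE + b"a" * CHUNK_SIZE
    store, recipe = deduplicate(data)
    print("chunks:", len(recipe), "stored:", len(store))
    assert rebuild(store, recipe) == data
```

In a scheme like this every stored byte passes through the hash once, which is why the bulk hashing rate matters.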