There are a number of Haskell projects that implement cryptographic hashes (sha1, md5, ...). However, they are constructed very differently, with no consistent interface between them. Most of them also perform poorly and expose only the high-level operation: hash the data in one go and return a digest.

Most of them also implement only the really common cryptohashes such as sha1, sha2 and md5; it's hard to find anything for the less common ones (md2, md4 and ripemd come to mind).

To rectify this, here comes the hs-cryptohash package: a bundle of 9 different algorithms covering all the common cryptohashes (sha1, the sha2 family, md5) as well as some less common ones (md2, md4, ripemd160). The modules all expose the same interface, which makes it simple to swap one hash algorithm for another.

Under the hood, all these algorithms are implemented in C, and most of them are relatively well optimized. Some come close to openssl, one of the fastest libraries for cryptohashes, while most are in the same ballpark as the checksum programs available in GNU coreutils.

On the Haskell side, everything is pure, even though the underlying calls go through the C FFI and are IO-related. Two main interfaces are exposed: the one-go interface and the incremental interface.

The one-go interface exposes 2 calls, hash and hashlazy, which take a strict bytestring and a lazy bytestring respectively and give back the digest as a bytestring.
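As a rough illustration of the one-go interface, here is a minimal sketch using sha1. The module name Crypto.Hash.SHA1 and the exact type signatures are assumptions on my part, not something stated above; toHex, sha1Strict and sha1File are hypothetical helper names introduced only for this example.

```haskell
-- Assumption: the sha1 module of the package is exposed as Crypto.Hash.SHA1,
-- with hash taking a strict bytestring and hashlazy a lazy one.
import qualified Crypto.Hash.SHA1 as SHA1
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as L
import Text.Printf (printf)

-- Hypothetical helper: render a raw digest as lowercase hexadecimal.
toHex :: B.ByteString -> String
toHex = concatMap (printf "%02x") . B.unpack

-- One-go hashing of a strict bytestring that is already in memory.
sha1Strict :: B.ByteString -> String
sha1Strict = toHex . SHA1.hash

-- One-go hashing of a file read as a lazy bytestring.
sha1File :: FilePath -> IO String
sha1File path = toHex . SHA1.hashlazy <$> L.readFile path
```

Loaded in GHCi, sha1Strict can be applied to any strict bytestring, and sha1File to a path on disk. Swapping in another algorithm should only require changing the import, since the modules share the same interface.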

The incremental interface exposes 3 calls: init, update and finalize. This matches the interface that most cryptohash implementations expose. These calls are very useful when data arrives in multiple chunks and you don't want to store all the chunks before hashing. This interface is slightly slower than the one-go one, for the simple reason that, to expose a pure interface, the context cannot be updated in place, so each update has to produce a new context.
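The incremental calls can be sketched along the same lines. Again, the module name Crypto.Hash.SHA1 and the argument order of update (context first, then chunk) are assumptions; sha1Chunks and sha1Handle are hypothetical names, and the 64 KiB chunk size is an arbitrary choice for illustration.

```haskell
-- Assumption: init is the initial context, update takes a context and a
-- strict chunk and returns a new context, finalize returns the digest.
import qualified Crypto.Hash.SHA1 as SHA1
import qualified Data.ByteString as B
import Data.List (foldl')
import System.IO (Handle)

-- Fold a list of chunks through the context; no chunk is kept around
-- once it has been fed to update.
sha1Chunks :: [B.ByteString] -> B.ByteString
sha1Chunks = SHA1.finalize . foldl' SHA1.update SHA1.init

-- Hash everything readable from a handle, one chunk at a time.
sha1Handle :: Handle -> IO B.ByteString
sha1Handle h = go SHA1.init
  where
    go ctx = do
      chunk <- B.hGetSome h 65536          -- an empty chunk means end of input
      if B.null chunk
        then return (SHA1.finalize ctx)
        else go $! SHA1.update ctx chunk   -- force the new context each step
```

Because update returns a fresh context rather than mutating the old one, the fold stays pure; that is the same property that makes this path a little slower than the one-go calls.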

Here are some numbers comparing the Debian coreutils digest binaries (sha1sum, sha256sum, ...) with openssl and with the new Haskell cryptohash modules, in one-pass (op) and incremental (in) mode. Each run processes a 500 MB random file, and the times, in seconds, are averaged over 4 trials.

         coreutils   cryptohash(op)   cryptohash(in)   openssl
sha1     2.97500     2.65250          3.64250          1.90250
sha224   5.36000     5.78250          6.78250          4.30500
sha256   5.33500     5.76000          6.77750          4.26750
sha384   3.65500     3.65750          4.69000          2.78500
sha512   3.66000     3.66500          4.70750          2.79750
md5      1.65750     1.75750          2.70750          1.50250
ripemd   N/A         3.80000          4.77000          3.83

And now some numbers against other Hackage libraries (digesting a 500 MB file):

         SHA     PureMD5   cryptohash   speedup
sha1     23.1s   N/A       2.65s        x 8.7
sha256   44.8s   N/A       5.76s        x 7.7
sha512   25.4s   N/A       3.55s        x 7.1
md5      N/A     9.42s     1.76s        x 5.3
