Why not use split to divide the file into chunks (say, 1 GiB per chunk) and record the hash for each one? Then, after transmitting them, you could hash each chunk and make sure that each checksum matches. That way, you can hash all the chunks in parallel. And as an added bonus, if one chunk doesn’t match, you only have to resend that chunk instead of the entire file. Once they all check out, just use cat to re-assemble them in order.
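Here is a rough sketch of the per-chunk idea in Python, in case it helps. It hashes a file in fixed-size pieces without physically splitting it, and reports which chunk indices would need to be re-sent. The function names (`chunk_digests`, `mismatched_chunks`) and the 1 MiB read size are my own choices for illustration, not anything standard.

```python
import hashlib

def chunk_digests(path, chunk_size=1 << 30, algo="sha256"):
    """Return one hex digest per chunk_size-byte chunk of the file at path."""
    digests = []
    with open(path, "rb") as f:
        while True:
            h = hashlib.new(algo)
            remaining = chunk_size
            # Read the chunk in small blocks so a 1 GiB chunk never sits in RAM.
            while remaining > 0:
                block = f.read(min(1 << 20, remaining))
                if not block:
                    break
                h.update(block)
                remaining -= len(block)
            if remaining == chunk_size:  # nothing was read: end of file
                break
            digests.append(h.hexdigest())
    return digests

def mismatched_chunks(expected, actual):
    """Indices of chunks whose digests differ (or are missing on one side)."""
    n = max(len(expected), len(actual))
    return [
        i for i in range(n)
        if i >= len(expected) or i >= len(actual) or expected[i] != actual[i]
    ]
```

You would run `chunk_digests` on the sender, ship the list along with the data, run it again on the receiver, and pass both lists to `mismatched_chunks` to find out what to retry.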
You can’t add two hashes together to get the same result as hashing the combined file. That would have serious security implications. It would only work if your hashing algorithm is to add up all the bytes and keep the least significant n bits…so hopefully not. But I can’t think of a reason why you can’t check the sums per chunk, then concatenate the chunks back into one file.
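To make that concrete, here is a small Python comparison. The "add up all the bytes" checksum is a toy I wrote for this post (`additive_checksum` is a made-up name), and treating a SHA-256 digest as an integer so it can be "added" is also just for illustration.

```python
import hashlib

def additive_checksum(data: bytes, bits: int = 16) -> int:
    """Toy checksum: sum of all bytes, keeping the least significant `bits` bits."""
    return sum(data) % (1 << bits)

def sha256_int(data: bytes) -> int:
    """SHA-256 digest interpreted as a big integer, so it can be added."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

a, b = b"first chunk", b"second chunk"

# The toy checksum of the whole equals the (wrapped) sum of the parts.
additive_combines = (
    (additive_checksum(a) + additive_checksum(b)) % (1 << 16)
    == additive_checksum(a + b)
)

# A real hash does not behave this way.
sha256_combines = (
    (sha256_int(a) + sha256_int(b)) % (1 << 256) == sha256_int(a + b)
)

# The price of being "addable": the toy checksum cannot see reordered chunks.
additive_misses_reordering = additive_checksum(a + b) == additive_checksum(b + a)

print(additive_combines, sha256_combines, additive_misses_reordering)
```

The first and last values are True by construction, which is exactly why an addable checksum is a bad idea: swapping two chunks goes unnoticed.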
I would add that every hashing algorithm with a finite-size result must have collisions, by the pigeonhole principle. md5 collisions are a security/crypto concern, since its known weaknesses make it fairly easy for an attacker to craft two different files with the same sum. But the chance of a random modification causing a collision is 1/(2^128), or roughly 2.9 x 10^-37 percent. That’s beyond lottery odds, to say the least. Unless you’re concerned about deliberate tampering with the file in transit, I would go with whatever hashing algorithm executes fastest.
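For anyone who wants to check the arithmetic, here is a quick Python snippet. It assumes the hash output is uniformly distributed, so a random change collides with probability 1 in 2^bits; `random_collision_probability` is just a helper name I made up.

```python
from fractions import Fraction

def random_collision_probability(digest_bits: int) -> Fraction:
    """Chance that random corruption yields the same digest, assuming uniform output."""
    return Fraction(1, 2 ** digest_bits)

p_md5 = random_collision_probability(128)     # md5 has a 128-bit digest
p_sha256 = random_collision_probability(256)  # sha256 has a 256-bit digest

print(f"md5:    {float(p_md5):.3e} ({float(p_md5) * 100:.3e} %)")
print(f"sha256: {float(p_sha256):.3e}")
```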
I’m not sure exactly how you would parallelize the hashing operations, but maybe somebody here knows?
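One possible way to do it, sketched in Python: give each worker thread its own byte range of the file and hash the ranges concurrently. As far as I know, CPython's hashlib releases the GIL while hashing larger buffers, so plain threads can use several cores here; if that turns out not to hold on your setup, a process pool would be the fallback. The names `hash_range` and `parallel_chunk_digests` are hypothetical helpers, not library functions.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor

def hash_range(path, offset, length, algo="sha256"):
    """Hash `length` bytes of the file starting at `offset` (less at end of file)."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:  # each worker opens its own file handle
        f.seek(offset)
        remaining = length
        while remaining > 0:
            block = f.read(min(1 << 20, remaining))
            if not block:
                break
            h.update(block)
            remaining -= len(block)
    return h.hexdigest()

def parallel_chunk_digests(path, chunk_size=1 << 30, workers=None, algo="sha256"):
    """Return per-chunk hex digests in file order, computed by a thread pool."""
    size = os.path.getsize(path)
    offsets = range(0, size, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the offsets.
        return list(
            pool.map(lambda off: hash_range(path, off, chunk_size, algo), offsets)
        )
```

Whether this actually beats a single sequential pass depends on the disk: on a spinning drive the seeks may cost more than the extra cores save.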
Also, here are my sha256 speeds (i9-9900k at 4.9GHz):
$ openssl speed sha256
Doing sha256 for 3s on 16 size blocks: 21075457 sha256's in 3.00s
Doing sha256 for 3s on 64 size blocks: 11810309 sha256's in 3.00s
Doing sha256 for 3s on 256 size blocks: 5437878 sha256's in 3.00s
Doing sha256 for 3s on 1024 size blocks: 1701300 sha256's in 3.00s
Doing sha256 for 3s on 8192 size blocks: 230657 sha256's in 3.00s
Doing sha256 for 3s on 16384 size blocks: 115969 sha256's in 3.00s
OpenSSL 1.1.1f 31 Mar 2020
built on: Mon Apr 20 11:53:50 2020 UTC
options:bn(64,64) rc4(16x,int) des(int) aes(partial) blowfish(ptr)
compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall -Wa,--noexecstack -g -O2 -fdebug-prefix-map=/build/openssl-P_ODHM/openssl-1.1.1f=. -fstack-protector-strong -Wformat -Werror=format-security -DOPENSSL_TLS_SECURITY_LEVEL=2 -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DNDEBUG -Wdate-time -D_FORTIFY_SOURCE=2
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
sha256         112402.44k   251953.26k   464032.26k   580710.40k   629847.38k   633345.37k
And md5 for comparison (~1.5x faster):
$ openssl speed md5
Doing md5 for 3s on 16 size blocks: 33872370 md5's in 3.00s
Doing md5 for 3s on 64 size blocks: 19564317 md5's in 3.00s
Doing md5 for 3s on 256 size blocks: 8616480 md5's in 3.00s
Doing md5 for 3s on 1024 size blocks: 2660204 md5's in 3.00s
Doing md5 for 3s on 8192 size blocks: 356744 md5's in 3.00s
Doing md5 for 3s on 16384 size blocks: 179171 md5's in 3.00s
OpenSSL 1.1.1f 31 Mar 2020
built on: Mon Apr 20 11:53:50 2020 UTC
options:bn(64,64) rc4(16x,int) des(int) aes(partial) blowfish(ptr)
compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall -Wa,--noexecstack -g -O2 -fdebug-prefix-map=/build/openssl-P_ODHM/openssl-1.1.1f=. -fstack-protector-strong -Wformat -Werror=format-security -DOPENSSL_TLS_SECURITY_LEVEL=2 -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DNDEBUG -Wdate-time -D_FORTIFY_SOURCE=2
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
md5            180652.64k   417372.10k   735272.96k   908016.30k   974148.95k   978512.55k
Surprisingly, I’m way behind compared to AMD. For 16384-byte blocks, I got 633345.37k, while thro got 1996559.70k. That’s more than 3x faster! I guess those specialized hashing operations are no joke. So my tirade about Intel IPC and clocks goes out the window.
I wonder if Zen 2 (3000-series) improves on this at all, and whether the advantage applies only to SHA or to other algorithms as well.
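For reference, here is the arithmetic behind the ratios quoted above, using only the 16384-byte figures from this post (openssl reports them in 1000s of bytes per second). The `seconds_per_gib` helper is my own addition, to put the throughput in terms of the 1 GiB chunks discussed earlier.

```python
# 16384-byte block throughput, in 1000s of bytes per second, as quoted in this post.
SHA256_INTEL_K = 633345.37   # my i9-9900k
MD5_INTEL_K = 978512.55      # my i9-9900k
SHA256_AMD_K = 1996559.70    # thro's result

def seconds_per_gib(rate_k: float) -> float:
    """Time to hash one GiB at a rate given in 1000s of bytes per second."""
    return (1 << 30) / (rate_k * 1000)

md5_vs_sha256 = MD5_INTEL_K / SHA256_INTEL_K
amd_vs_intel_sha256 = SHA256_AMD_K / SHA256_INTEL_K

print(f"md5 vs sha256 (Intel):   {md5_vs_sha256:.2f}x")
print(f"AMD vs Intel (sha256):   {amd_vs_intel_sha256:.2f}x")
print(f"sha256 s/GiB, Intel:     {seconds_per_gib(SHA256_INTEL_K):.2f}")
print(f"sha256 s/GiB, AMD:       {seconds_per_gib(SHA256_AMD_K):.2f}")
```

These are single-threaded, in-memory numbers, so for a real file the disk or the network may well be the bottleneck instead.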