This is very weird. The system has been stable, so I copied over some large (8-25GB) data files and noticed some mismatched md5sum values. The odd part is that only the larger files fail to match.
No bad sectors on the drives, no SMART errors, no errors during the copy. The memory all tested fine.
Further, running md5sum repeatedly on the same file gives different results (again, only with files >10GB). Here's an example where md5sum is stable for 4GB and 6GB files but unstable for a 12GB file.
Code:
dd if=/dev/urandom iflag=fullblock of=output.dat bs=1G count=4
4+0 records in
4+0 records out
4294967296 bytes (4.3 GB) copied, 66.9364 s, 64.2 MB/s
[jake@localhost ~]$ md5sum output.dat
d407ce8d4306ec8c65a0c2ebe27fa789 output.dat
[jake@localhost ~]$ md5sum output.dat
d407ce8d4306ec8c65a0c2ebe27fa789 output.dat
[jake@localhost ~]$ md5sum output.dat
d407ce8d4306ec8c65a0c2ebe27fa789 output.dat
[jake@localhost ~]$ md5sum output.dat
d407ce8d4306ec8c65a0c2ebe27fa789 output.dat
[jake@localhost ~]$ rm output.dat
[jake@localhost ~]$ dd if=/dev/urandom iflag=fullblock of=output.dat bs=1G count=6
6+0 records in
6+0 records out
6442450944 bytes (6.4 GB) copied, 98.9325 s, 65.1 MB/s
[jake@localhost ~]$ md5sum output.dat
6bc0ca2954a8df9067ff856d3b2d09db output.dat
[jake@localhost ~]$ md5sum output.dat
6bc0ca2954a8df9067ff856d3b2d09db output.dat
[jake@localhost ~]$ md5sum output.dat
6bc0ca2954a8df9067ff856d3b2d09db output.dat
[jake@localhost ~]$ md5sum output.dat
6bc0ca2954a8df9067ff856d3b2d09db output.dat
[jake@localhost ~]$ md5sum output.dat
6bc0ca2954a8df9067ff856d3b2d09db output.dat
[jake@localhost ~]$ rm output.dat
[jake@localhost ~]$ dd if=/dev/urandom iflag=fullblock of=output.dat bs=1G count=12
12+0 records in
12+0 records out
12884901888 bytes (13 GB) copied, 198.55 s, 64.9 MB/s
[jake@localhost ~]$ md5sum output.dat
20a096b779ad26cd6c596f1657c6f5a1 output.dat
[jake@localhost ~]$ md5sum output.dat
05a08057cfdb3c418da34cd06531133a output.dat
[jake@localhost ~]$ md5sum output.dat
30333dd0e1cb4f068d81548bbf8dabd8 output.dat
[jake@localhost ~]$ md5sum output.dat
30333dd0e1cb4f068d81548bbf8dabd8 output.dat
[jake@localhost ~]$ md5sum output.dat
30333dd0e1cb4f068d81548bbf8dabd8 output.dat
[jake@localhost ~]$ md5sum output.dat
30333dd0e1cb4f068d81548bbf8dabd8 output.dat
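In case anyone wants to repeat the test above without retyping md5sum, here is a rough sketch of a helper that hashes one file several times and counts how many different digests come back. The function name count_distinct_md5 and the default of 5 runs are just my own placeholders, and it only assumes the usual md5sum, cut, sort and wc tools.

```shell
#!/bin/sh
# count_distinct_md5 FILE [RUNS]   (hypothetical helper name)
# Hash FILE with md5sum RUNS times (default 5) and print how many
# distinct digests were seen. A result of 1 means every run agreed.
count_distinct_md5() {
    file=$1
    runs=${2:-5}
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Keep only the digest column, drop the file name.
        md5sum "$file" | cut -d' ' -f1
        i=$((i + 1))
    done | sort -u | wc -l
}

# Usage, with the file name from the transcript above:
#   count_distinct_md5 output.dat 6
```

Anything greater than 1 means the same file produced different checksums across runs, which is the behavior shown for the 12GB file.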
Also, if you hash only part of a larger file (head -c 4G | md5sum), you always get consistent values until you pass more than 8-10G to md5sum. These "partial" md5sums also always match the same partial md5sums taken from the original files. If the files were corrupted on disk, that wouldn't always be the case.
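The partial-file check described above could be scripted roughly like this, to narrow down the size at which the checksums stop being repeatable. This is only a sketch: prefix_md5 and prefix_stable are names I made up, the default of 3 runs is arbitrary, and size suffixes such as 4G assume GNU head.

```shell
#!/bin/sh
# prefix_md5 FILE SIZE   (hypothetical helper name)
# Print the MD5 of the first SIZE bytes of FILE. SIZE is passed
# straight to head -c, so GNU suffixes like 4G work.
prefix_md5() {
    head -c "$2" "$1" | md5sum | cut -d' ' -f1
}

# prefix_stable FILE SIZE [RUNS]   (hypothetical helper name)
# Exit status 0 if RUNS hashes (default 3) of the same prefix all
# agree, 1 as soon as one of them differs from the first.
prefix_stable() {
    first=$(prefix_md5 "$1" "$2")
    n=1
    while [ "$n" -lt "${3:-3}" ]; do
        [ "$(prefix_md5 "$1" "$2")" = "$first" ] || return 1
        n=$((n + 1))
    done
    return 0
}

# Usage sketch: walk up in 2G steps and report the first unstable size.
#   for size in 2G 4G 6G 8G 10G 12G; do
#       prefix_stable output.dat "$size" || { echo "unstable at $size"; break; }
#   done
```

If the reported size moves around between runs, that would point away from a fixed bad region in the file and toward something that depends on how much data is read in one go.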
I found this post:
https://askubuntu.com/questions/968123/ ... vme-drives
I removed some RAM as suggested there, but that didn't change the behavior.
Any ideas? Maybe I should move this to a new post.