“Size on disk” issue

andreiv3103
Posts: 12
Joined: 2009/10/26 10:21:14

“Size on disk” issue

Post by andreiv3103 » 2011/09/23 14:07:45

I have a problem understanding why a bunch of files that occupied a certain size on one drive occupy much more on another. Here are the details: we have a file server that until today stored its files on a RAID 0 device made up of 2 drives, one of 750 GB and the other of 250 GB. The total space available on the md0 device formed from the 2 drives, after formatting (ext3), was somewhere around 900 GB.
The RAID device was almost full, so we purchased a 2 TB drive to replace it. After formatting (ext4 this time) the available space was 1.8 TB.
I copied the files from the old RAID to the new drive with rsync.
Surprise! The space occupied by the files copied from the RAID device is 1.3 TB! How come the same files take up so much more space on the new 2 TB drive than they did on the old RAID device?
What can I check?
I ran rsync several times with the --delete option enabled to be sure that the files from the RAID device are mirrored exactly, without any duplicates.
But still, the occupied size reported by df -H is the same: 1.3 TB.

gerald_clark
Posts: 10642
Joined: 2005/08/05 15:19:54
Location: Northern Illinois, USA

“Size on disk” issue

Post by gerald_clark » 2011/09/23 14:12:54

A RAID 1 formed from a 250G drive and a 750G drive will be less than 250G, not 900G.
Your 1.8T filesystem likely uses a bigger block size than your older filesystem, so files will take up more disk space.
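
A rough sketch of the arithmetic behind the block-size explanation: a file's data is allocated in whole filesystem blocks, so each file can waste up to one block minus a byte. The helper name `allocated_bytes` and the sample sizes are made up for illustration, and the sketch ignores metadata and indirect blocks.

```shell
#!/bin/sh
# allocated_bytes SIZE BLKSIZE (hypothetical helper):
# round a file size up to a whole number of filesystem blocks.
allocated_bytes() {
    size=$1
    blksize=$2
    echo $(( (size + blksize - 1) / blksize * blksize ))
}

# A 5000-byte file on 1 KiB blocks versus 4 KiB blocks.
allocated_bytes 5000 1024   # prints 5120
allocated_bytes 5000 4096   # prints 8192
```

Because the waste is under one block per file, this effect only adds up to hundreds of gigabytes when there are a great many small files, and only if the two filesystems really do use different block sizes.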

TrevorH
Site Admin
Posts: 33215
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: “Size on disk” issue

Post by TrevorH » 2011/09/23 14:18:40

I'd suspect that you had some sparse files on the original file system and you didn't tell rsync to use --sparse to save space on the destination.
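
To see what a sparse file looks like, here is a minimal sketch that compares a file's apparent size with the space it really occupies. It assumes GNU coreutils (`stat -c`, `truncate`); the helper names `apparent_bytes` and `disk_bytes` and the paths in the rsync comment are invented for illustration.

```shell
#!/bin/sh
# apparent_bytes FILE: size in bytes as shown by ls -l.
apparent_bytes() {
    stat -c %s "$1"
}

# disk_bytes FILE: allocated blocks (%b) times the block unit stat reports (%B, usually 512).
disk_bytes() {
    set -- $(stat -c '%b %B' "$1")
    echo $(( $1 * $2 ))
}

# Create a 100 MiB file that consists of one big hole and measure it.
demo=$(mktemp -d)
truncate -s 100M "$demo/sparse.img"
apparent_bytes "$demo/sparse.img"   # prints 104857600
disk_bytes "$demo/sparse.img"       # far smaller on filesystems that support holes
rm -r "$demo"

# A copy that keeps holes would look roughly like this (hypothetical paths, not run here):
#   rsync -a --sparse --delete /mnt/old/ /mnt/new/
```

If the source tree contains files where `disk_bytes` is well below `apparent_bytes`, a plain rsync without `--sparse` writes the holes out as real zeros on the destination, which would match the symptom described.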

Re: “Size on disk” issue

Post by andreiv3103 » 2011/09/23 14:20:47

Sorry, I corrected that. Was RAID 0 not 1.

Re: “Size on disk” issue

Post by andreiv3103 » 2011/09/23 14:24:10

[quote]
TrevorH wrote:
I'd suspect that you had some sparse files on the original file system and you didn't tell rsync to use --sparse to save space on the destination.[/quote]

I see ... maybe, I didn't think about that. Any way to solve the issue?

Re: “Size on disk” issue

Post by TrevorH » 2011/09/23 21:51:32

If you still have the old disks then you could recopy the data, but I suspect you either haven't got them or the data on the new disk has changed since the original copy. I did find a Perl script online that purports to find files that occupy fewer blocks than their size indicates, but I am not entirely sure that it's bug free :-) It looks like this:

[code]
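#!/usr/bin/perl -w

use strict;
use warnings;
use File::Find;

# Print every regular file whose allocated space is smaller than its apparent size.
sub process_file {
    my $f = $File::Find::name;
    # File::Find has already chdir'ed into the directory, so stat $_ rather than the full path.
    my ($size, $blksize, $blocks) = (lstat($_))[7, 11, 12];
    # Skip anything that could not be stat'ed or is not a regular file.
    return unless defined $blocks && -f _;
    # st_blocks is counted in 512-byte units on Linux; $blksize is only the preferred I/O size.
    my $allocated = $blocks * 512;
    if ($allocated < $size) {
        printf "\t%s => SZ: %u BLKSZ: %u BLKS: %u = %u\n", $f, $size, $blksize, $blocks, $allocated;
    }
}
find(\&process_file, "/mnt/whereever");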
[/code]

You'd need to change the last line to start the search where your old disks are mounted. That would find any sparse files under /mnt/whereever and show you each file's size and how many bytes it actually uses on disk. Once identified, you would then need to work out whether the file has changed since the copy, or whether you can recopy it using something that respects sparse files (cp --sparse=auto -p oldfile newfile?)
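
The same search can be sketched in shell, assuming GNU find (for `-printf`) and awk are available. The function name `list_sparse` is invented here. It counts allocated space in 512-byte units (`%b`) and compares that with the byte size (`%s`). Very small or freshly written files can show up as false positives on some filesystems, and file names containing tabs or newlines will confuse the output.

```shell
#!/bin/sh
# list_sparse DIR (hypothetical helper): print regular files under DIR whose
# allocated space is smaller than their apparent size.
# find -printf fields: %b = 512-byte blocks allocated, %s = size in bytes, %p = path.
list_sparse() {
    find "$1" -xdev -type f -printf '%b\t%s\t%p\n' |
        awk -F '\t' '$1 * 512 < $2 { print $3 "\tsize=" $2 "\tblocks512=" $1 }'
}

# Usage (path as in the post above, not run here):
#   list_sparse /mnt/whereever
```

The `-xdev` option keeps the search on one filesystem, so other mounts below the starting directory are not scanned.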

Re: “Size on disk” issue

Post by andreiv3103 » 2011/09/25 05:38:30

Thank you very much for your reply. Fortunately, I have a spare 1.5 TB drive, so I think I am going to copy all the files there and then copy them back, with the correct parameters this time.
Regarding block size, as someone suggested earlier, what do you think? I formatted the drive with default settings and didn't set the block size. Should I change it?

Thanks again.

Re: “Size on disk” issue

Post by TrevorH » 2011/09/25 09:44:45

I just tested copying a file containing nothing but binary zeroes, and using --sparse=always with cp does make the new output file a sparse one.
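
A test along those lines can be reproduced with the sketch below. It assumes GNU cp, dd and du; the wrapper name `copy_sparse` and the file names are made up for illustration.

```shell
#!/bin/sh
# copy_sparse SRC DST (hypothetical wrapper): copy a file, turning runs of
# zero bytes into holes and preserving mode, ownership and timestamps.
copy_sparse() {
    cp --sparse=always -p -- "$1" "$2"
}

# Write 8 MiB of real zeros, then make a sparse copy of that file.
work=$(mktemp -d)
dd if=/dev/zero of="$work/zeros.bin" bs=1M count=8 2>/dev/null
copy_sparse "$work/zeros.bin" "$work/zeros.sparse"

# First the apparent sizes (identical), then the space actually allocated.
du --block-size=1 --apparent-size "$work/zeros.bin" "$work/zeros.sparse"
du --block-size=1 "$work/zeros.bin" "$work/zeros.sparse"
rm -r "$work"
```

On a filesystem that supports holes, the second `du` line for the copy should be much smaller than for the original, while the contents stay byte-for-byte identical.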

You can find out the block size of an ext2/ext3/ext4 filesystem by running e.g.

[code]
tune2fs -l /dev/mapper/vg500-LogVolTmp | grep Block
[/code]
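
As an alternative sketch that needs neither root nor the device name, GNU stat can report the block size of whatever filesystem holds a given path. The function name `fs_block_size` and the mount points in the comment are assumptions for illustration.

```shell
#!/bin/sh
# fs_block_size PATH (hypothetical helper): fundamental block size, in bytes,
# of the filesystem that contains PATH. Uses GNU stat in filesystem mode (-f).
fs_block_size() {
    stat -f -c %S "$1"
}

# Block size of the root filesystem.
fs_block_size /

# To compare the old and new disks (hypothetical mount points, not run here):
#   fs_block_size /mnt/old
#   fs_block_size /mnt/new
```

If both mount points report the same value, block size can be ruled out as the reason for the extra space used.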
