Copy on file with multithread

ChristoheB
Posts: 7
Joined: 2019/11/09 18:04:26

Post by ChristoheB » 2019/11/09 18:19:09

Hi Everybody,

I would like to copy a single file using multiple threads simultaneously.
To explain: if I use "scp root@192.168.1.1:/home/OneFileOnly.img /home/OneFileOnly.img", then I use only one thread. I would like to use several threads.

So I'm looking for a command-line tool that does this.

Regards,
Christophe B

TrevorH
Site Admin
Posts: 33202
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: Copy on file with multithread

Post by TrevorH » 2019/11/09 18:26:06

The only thing I can think of off the top of my head that does this sort of thing is bittorrent...
The future appears to be RHEL or Debian. I think I'm going Debian.
Info for USB installs on http://wiki.centos.org/HowTos/InstallFromUSBkey
CentOS 5 and 6 are deadest, do not use them.
Use the FAQ Luke

ChristoheB
Posts: 7
Joined: 2019/11/09 18:04:26

Re: Copy on file with multithread

Post by ChristoheB » 2019/11/09 18:45:53

The concept of BitTorrent is good. The software needs to copy multiple blocks of one file simultaneously.

But I'm looking for a command-line tool, a "good admin's tool" like scp, rsync ...

jlehtone
Posts: 4523
Joined: 2007/12/11 08:17:33
Location: Finland

Re: Copy on file with multithread

Post by jlehtone » 2019/11/09 19:00:18

Do you mean transfer protocols that can use multiple (tcp) connections, like SMB 3.0's MultiChannel or iRODS?

ChristoheB
Posts: 7
Joined: 2019/11/09 18:04:26

Re: Copy on file with multithread

Post by ChristoheB » 2019/11/09 19:06:05

Yes, transfer protocols that can use multiple TCP sessions, with one TCP session per thread.
For example, the first thread (so the first TCP session) copies blocks 1-25, the second thread copies blocks 26-50 ...
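
To make the block-range idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not an existing admin tool: it copies a local file, whereas a network version would need one TCP connection per range. The names split_ranges, copy_range and parallel_copy are hypothetical, as is the 1 MiB chunk size.

```python
import os
from concurrent.futures import ThreadPoolExecutor

CHUNK = 1024 * 1024  # read/write unit inside each range (arbitrary choice)


def split_ranges(size, parts):
    """Split [0, size) into at most `parts` contiguous (start, end) byte ranges."""
    if size == 0:
        return []
    step = -(-size // parts)  # ceiling division
    return [(start, min(start + step, size)) for start in range(0, size, step)]


def copy_range(src_fd, dst_fd, start, end):
    """Copy bytes [start, end) with positional I/O, so threads share no file offset."""
    offset = start
    while offset < end:
        data = os.pread(src_fd, min(CHUNK, end - offset), offset)
        if not data:
            raise IOError("unexpected end of file")
        written = 0
        while written < len(data):
            written += os.pwrite(dst_fd, data[written:], offset + written)
        offset += len(data)


def parallel_copy(src, dst, parts=4):
    """Copy src to dst with one worker thread per byte range (Unix only)."""
    size = os.path.getsize(src)
    src_fd = os.open(src, os.O_RDONLY)
    dst_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.ftruncate(dst_fd, size)  # size the destination before the workers write into it
        with ThreadPoolExecutor(max_workers=parts) as pool:
            futures = [pool.submit(copy_range, src_fd, dst_fd, s, e)
                       for s, e in split_ranges(size, parts)]
            for future in futures:
                future.result()  # re-raise any error from a worker
    finally:
        os.close(src_fd)
        os.close(dst_fd)
```

With a size of 100 and 4 parts, split_ranges gives four ranges of 25 each, which mirrors the "blocks 1-25, 26-50 ..." split described above. Positional reads and writes are used so that no thread depends on another thread's file position.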

tunk
Posts: 1205
Joined: 2017/02/22 15:08:17

Re: Copy on file with multithread

Post by tunk » 2019/11/09 20:55:11

Out of curiosity, why would you want to do this? Wouldn't the network or the disks be the bottleneck?

ChristoheB
Posts: 7
Joined: 2019/11/09 18:04:26

Re: Copy on file with multithread

Post by ChristoheB » 2019/11/09 22:08:23

For example, my network is 10 Gbps and my disk controller card is 12 Gbps.
When I transfer a big file, it copies at 3.x Gbps. Why?
Because rsync and scp are single-threaded and I end up with one process at 100% busy.
If I launch two scp processes (so 2 files), I copy at 7 Gbps.

So I'm looking for a way to copy one file using multiple vCPUs.

tunk
Posts: 1205
Joined: 2017/02/22 15:08:17

Re: Copy on file with multithread

Post by tunk » 2019/11/09 22:54:36

If it's an alternative, maybe NFS plus cp would have less overhead?

ChristoheB
Posts: 7
Joined: 2019/11/09 18:04:26

Re: Copy on file with multithread

Post by ChristoheB » 2019/11/09 23:42:14

No, I can gain 15%, perhaps 20 ...
I can disable crypto and gain 20% too. But I would like to use all vCPUs.

TrevorH
Site Admin
Posts: 33202
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: Copy on file with multithread

Post by TrevorH » 2019/11/10 01:08:51

The problem is not disk or bandwidth, it is that it bottlenecks on encryption which runs only as fast as one single core can go. If you can do without encryption then there are other methods - such as NFS or Samba. If some form of encryption is required then perhaps you can select different ciphers to see if one is quicker than another. It used to be true that arcfour was quicker than most but I suspect that's now been removed from recent CentOS versions as it's not seen as secure.

I tested that to see if it could be used: between 2 CentOS 7 systems, scp -c arcfour fails because arcfour, arcfour128 and arcfour256 are not enabled in sshd by default and have to be specified in sshd_config. I enabled them for the test, then ran an scp forcing each cipher in turn, copying a ~7GB file to /dev/null on the destination machine, e.g. scp -c arcfour my7gb.file othermachine:/dev/null

Enabled only for testing (disabled in the default sshd_config):
arcfour 220.4MB/s
arcfour128 223.0MB/s
arcfour256 240.5MB/s

Ciphers enabled by default on the CentOS 7.7 destination server in the order they were listed in the rejection message when I tried -c arcfour and it was not enabled (server replied : no matching cipher found. Their offer: <list of ciphers>):
chacha20-poly1305@openssh.com 157.2MB/s
aes128-ctr 322.1MB/s
aes192-ctr 317.2MB/s
aes256-ctr 307.8MB/s
aes128-gcm@openssh.com 315.9MB/s
aes256-gcm@openssh.com 308.6MB/s
aes128-cbc 222.9MB/s
aes192-cbc 210.6MB/s
aes256-cbc 191.5MB/s
blowfish-cbc 88.4MB/s
cast128-cbc 80.5MB/s
3des-cbc 24.3MB/s
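
The per-cipher test above could be scripted roughly as follows. This is a hedged sketch, not the script used to produce the figures: scp_command, throughput_mb_s and benchmark are hypothetical names, 1 MB is assumed to mean 10^6 bytes, and the measured time includes SSH connection setup and authentication, so it slightly understates the raw transfer rate.

```python
import os
import subprocess
import time


def scp_command(cipher, src, dest):
    """Build the scp invocation that forces a single cipher with -c."""
    return ["scp", "-c", cipher, src, dest]


def throughput_mb_s(size_bytes, seconds):
    """Throughput in MB/s, assuming 1 MB = 10**6 bytes."""
    return size_bytes / 1e6 / seconds


def benchmark(ciphers, src, dest):
    """Run one scp per cipher; return {cipher: MB/s, or None if scp failed}."""
    size = os.path.getsize(src)
    results = {}
    for cipher in ciphers:
        start = time.monotonic()
        proc = subprocess.run(scp_command(cipher, src, dest),
                              stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL)
        elapsed = time.monotonic() - start
        # A non-zero exit status covers the "no matching cipher found" case.
        results[cipher] = throughput_mb_s(size, elapsed) if proc.returncode == 0 else None
    return results
```

It would be called as benchmark(["aes128-ctr", "aes256-ctr"], "my7gb.file", "othermachine:/dev/null"), and it assumes key-based login so that scp does not stop to prompt for a password.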

Oh, and finally I tested using two copies of the utility 'nc', on the receiving machine I ran nc -l rec.eiv.ing.ip 9999 > /dev/null and then on the sending machine I ran time nc rec.eiv.ing.ip 9999 < my7gb.file and that completed in 0m8.779s which is 776.6MB/s.

You can find a list of the ciphers under the Ciphers keyword in man 5 ssh_config, or by running ssh -Q cipher

I re-ran the first scp to make sure that it ran from a cached copy of the 7GB file on the sending side and watched iostat for a while and could see no disk traffic so there was no disk bottleneck. My network is 10Gbps and has been tested with iperf and runs at least 9.5Gbps. From the tests that I ran you can see that the non-default arcfour* ciphers are indeed faster than some of the ones that are still enabled but they are considerably slower than the much more secure aes*-ctr ciphers which perform best of all here. The sending machine has 16GB RAM, an i5-3570S CPU @ 3.10GHz and that has aes-ni instructions. The receiving machine has 32GB RAM, a Xeon e3-1245v3 3.4GHz and also has aes-ni. I suspect the results of the tests on older machines that do not have aes-ni would be considerably slower. On both sending and receiving machines, if you watch a cpu monitor that shows the activity of all cores, you can see that it runs on one core at a time and runs that core at pretty much 100%.
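
To relate the MB/s figures above to the 10Gbps link, a small conversion helps. This is a sketch that assumes 1 MB = 10^6 bytes. If the tools actually report 2^20-byte units, the Gbps values come out about 5% higher. The function name mb_s_to_gbps is hypothetical.

```python
def mb_s_to_gbps(mb_per_s, bytes_per_mb=10**6):
    """Convert a rate in MB/s to Gbit/s (1 Gbit = 10**9 bits)."""
    return mb_per_s * bytes_per_mb * 8 / 1e9


# Figures taken from the tests above.
for label, rate in [("aes128-ctr", 322.1), ("nc", 776.6)]:
    print(f"{label}: {mb_s_to_gbps(rate):.2f} Gbps")
```

Under that assumption the fastest scp cipher works out to roughly 2.6 Gbps and the unencrypted nc copy to roughly 6.2 Gbps, both well below the 9.5Gbps measured with iperf.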
