/usr/bin/cp very slow

Hello,
recently we noticed that running cp on a directory with many files takes much longer on AlmaLinux 9.3 than it did on Oracle Linux 7.9. The machines are the same hardware; only the OS was reinstalled.

After trying many things, I copied the cp binary from Oracle Linux 7 onto the AlmaLinux 9.3 machine, and the copy time dropped from 830 minutes to 14 minutes.

What could possibly have changed in cp that would have such an impact?

oracle linux cp:
cp (GNU coreutils) 8.22
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Torbjörn Granlund, David MacKenzie, and Jim Meyering.

alma linux cp:
cp (GNU coreutils) 8.32
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later https://gnu.org/licenses/gpl.html.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Torbjorn Granlund, David MacKenzie, and Jim Meyering.

Thanks for any hints on how to get that to work as fast as it used to.

Pierre,

It looks like nobody is interested in this problem but, as a user, I feel concerned. If this is a general problem with cp, everyone should feel worried!

But I’m trying to understand, and I don’t have enough data.

  • How many files is “many files”? And how small are they?
  • What filesystem? And on what kind of devices?
  • Does rsync suffer from the same problem, or is it much faster?
  • Have you tried timing a copy of fewer files, to see whether the two cp versions still give noticeably different results?
  • Any relevant warnings or errors, I don’t know, in dmesg or elsewhere?
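
For what it’s worth, a comparison on a smaller subset could be timed with something like the sketch below. It is only a rough helper, not a benchmark tool: time_copy is a made-up function name, and the paths in the commented usage lines are placeholders, not anything taken from this thread.

```shell
# time_copy CMD [ARGS...]
# Runs the given command, then prints the elapsed wall-clock time in whole
# seconds together with the command's exit status.
# (time_copy is a hypothetical helper name, not an existing tool.)
time_copy() {
    start=$(date +%s)
    status=0
    "$@" >/dev/null 2>&1 || status=$?
    end=$(date +%s)
    echo "$((end - start))s (exit $status): $*"
    return "$status"
}

# Example usage with placeholder paths (adjust before running):
#   time_copy cp -R /path/to/small-subset /path/on/nfs/with-cp
#   time_copy rsync -a /path/to/small-subset/ /path/on/nfs/with-rsync/
```

Running the same two lines with each cp binary, on a few thousand files instead of the whole tree, should show fairly quickly whether the difference scales with the number of files.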

This is stunning.

Hello,
For the tests, I’m using the gcc-13.2 tar, which contains around 120000 files. I don’t know their sizes, but it is source code, so most of them are probably not very big.

The problem shows up on a filesystem exported over NFS by a TrueNAS server. I tried a few NFS clients in case it was specific to one machine, and now I’m doing all the tests from the same machine so that I have numbers I can compare.

rsync is working as expected. scp is as slow as cp (I used scp without a hostname so maybe it calls cp underneath).

I haven’t tried on fewer files.

I haven’t seen anything that seems relevant in logs but I could’ve missed it.

I downloaded the source code for coreutils 8.32 (the same version as Alma’s), compiled it, and that build seems to be fine. I also compiled coreutils 9.0, and that version is slow (15m25.343s for 8.32 vs 954m49.535s for 9.0). I don’t know what that means, but it might be related. I also tried the cp binary from CentOS Stream, and it seems to behave the same as the AlmaLinux one.

I also asked someone in a different department that’s also using alma linux 9 with NFS and it seems to be working at normal speed for them. Maybe we can get information from comparing our systems.

Thanks.

  • So the problem only occurs on non-local filesystems?

  • I’m pretty sure that scp does not call cp.

  • It is very strange that a recompiled 8.32 works fine but 9.0 does not. CentOS Stream 9 still has 8.32, so it makes sense that it behaves like Alma’s. But why does compiling it by hand give different results?

The mystery doesn’t seem to give up.

Ah, but no. You should build it from the SRPM, i.e. https://repo.almalinux.org/vault/9.3/BaseOS/Source/Packages/coreutils-8.32-34.el9.src.rpm, because coreutils.spec will then apply 36 patch files! One of these patches is likely to be the reason for your performance issues!

I suspect this one:

# basic support for checking NFSv4 ACLs (#2137866)
Patch19:  coreutils-nfsv4-acls.patch

Do you happen to have an alias on your system, so that cp actually runs cp --preserve=xattr? I would rather use alias cp="rsync -ah --progress" ;)

Otherwise, maybe you have too small rsize and wsize values in fstab for that mount.
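
Both points can be checked from the client. The sketch below is only an illustration: nfs_rw_sizes is a hypothetical helper name, and it assumes a Linux client where /proc/mounts lists the options in effect for each NFS mount (what the server accepted, which may differ from what fstab asks for).

```shell
# Run this in the interactive shell where cp is slow (scripts usually do
# not load aliases). It reports whether cp is an alias, a function or a file:
#   type cp

# nfs_rw_sizes [FILE]
# Prints mount point, rsize and wsize for every NFS mount listed in FILE,
# which must use the /proc/mounts layout (default: /proc/mounts itself).
# (nfs_rw_sizes is a hypothetical helper name.)
nfs_rw_sizes() {
    awk '$3 == "nfs" || $3 == "nfs4" {
        n = split($4, opts, ",")
        r = "?"; w = "?"
        for (i = 1; i <= n; i++) {
            if (opts[i] ~ /^rsize=/) r = substr(opts[i], 7)
            if (opts[i] ~ /^wsize=/) w = substr(opts[i], 7)
        }
        print $2, "rsize=" r, "wsize=" w
    }' "${1:-/proc/mounts}"
}

# Only meaningful on Linux, where /proc/mounts exists.
if [ -r /proc/mounts ]; then
    nfs_rw_sizes
fi
```

If the values printed are much smaller than the ones requested at mount time, the server is probably capping them.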

You might be right about the ACL patch. I was also thinking of something ACL-related, or a backport of some code from a more recent version.

With a manual mount using bigger rsize and wsize options, it looks like the server rejects the higher values and falls back to lower ones.

We’ll have to see whether that can be changed, but I don’t want to do it on production servers, so we first need to find a way to test it.

I think the ACL patch is worth looking into in more detail as well.

Thanks for all the inputs.

I did many more tests and eventually rebuilt coreutils with the nfsv4-acls patch disabled, but that didn’t have any effect on the time.

Looking through the coreutils.spec, I noticed other cp patches and this comment:
# cp: default to --reflink=auto (#1861108)

So I ran a test with --reflink=never and got the expected time. It seems that the --reflink=auto default is very costly, at least for us.
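
In case it helps anyone else, the workaround can be wrapped in a small shell function. This is a minimal sketch, not a tested fix: cp_plain is a hypothetical name, the path in the usage comment is a placeholder, and the fallback branch is only there so the function still works with a cp that does not know --reflink=never.

```shell
# cp_plain ARGS...
# Calls cp with --reflink=never when the installed cp accepts that option,
# and falls back to an unmodified cp otherwise.
# (cp_plain is a hypothetical wrapper name.)
cp_plain() {
    if command cp --reflink=never --version >/dev/null 2>&1; then
        command cp --reflink=never "$@"
    else
        command cp "$@"
    fi
}

# Example with placeholder paths:
#   time cp_plain -r /path/to/source-tree /path/on/nfs/cp-test
```

Using a separate name instead of aliasing cp itself keeps the default behaviour available for comparison.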
