Skip to content

file_util: improve rmtree performance#1666

Closed
nikitych wants to merge 2 commits intorpm-software-management:mainfrom
nikitych:file_util.rmtree
Closed

file_util: improve rmtree performance#1666
nikitych wants to merge 2 commits intorpm-software-management:mainfrom
nikitych:file_util.rmtree

Conversation

@nikitych
Copy link
Contributor

@nikitych nikitych commented Dec 8, 2025

Optimize rmtree to significantly reduce cleanup time, especially for large
buildroots. The previous Python-based implementation caused substantial delays
(e.g., ~13 minutes for a ~2M-file buildroot). Introduce a faster backend, using
either shutil.rmtree or native rm -r, which reduces cleanup time to under
one minute.

Also add support for handling paths longer than PATH_MAX, but only for
directories not listed in exclude. Extending this to excluded paths would
overcomplicate the logic, and excluded-path handling was already nonfunctional
and not in high demand.

Some benchmark:
$ sudo find /var/lib/mock/some-big-buildroot/ | wc -l
2056654
$ time mock -r some-big-buildroot --clean
INFO: mock.py version 6.3 starting (python version = 3.9.21, NVR = mock-6.3-1.el9)

previous rmtree implementaiont
real 13m21.176s
user 9m51.712s
sys 2m29.270s

shutil.rmtree as _fastRm
real 1m8.450s
user 0m14.331s
sys 0m31.525s

rm -r as _fastRm
real 0m52.990s
user 0m3.089s
sys 0m29.184s

partially fixes: #31

Optimize `rmtree` to significantly reduce cleanup time, especially for large
buildroots. The previous Python-based implementation caused substantial delays
(e.g., ~13 minutes for a ~2M-file buildroot). Introduce a faster backend, using
either `shutil.rmtree` or native `rm -r`, which reduces cleanup time to under
one minute.

Also add support for handling paths longer than PATH_MAX, but only for
directories not listed in `exclude`. Extending this to excluded paths would
overcomplicate the logic, and excluded-path handling was already nonfunctional
and not in high demand.

Some benchmark:
$ sudo find /var/lib/mock/some-big-buildroot/ | wc -l
2056654
$ time mock -r some-big-buildroot --clean
INFO: mock.py version 6.3 starting (python version = 3.9.21, NVR = mock-6.3-1.el9)

previous rmtree implementaiont
real    13m21.176s
user    9m51.712s
sys     2m29.270s

shutil.rmtree as _fastRm
real    1m8.450s
user    0m14.331s
sys     0m31.525s

rm -r as _fastRm
real    0m52.990s
user    0m3.089s
sys     0m29.184s

Smaller buildroots also see noticeable improvements.
$ mock -r fedora-43-x86_64 --init
$ time mock -r fedora-43-x86_64 --clean
real    0m2.219s
user    0m1.726s
sys     0m0.363s

shutil.rmtree as _fastRm
real    0m0.952s
user    0m0.689s
sys     0m0.178s

rm -r as _fastRm
real    0m0.879s
user    0m0.627s
sys     0m0.194s

partially resolves: rpm-software-management#31
Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vcs-diff-lint found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

@praiskup
Copy link
Member

praiskup commented Dec 8, 2025

This is just too big to give it a prompt review, give me some time, please.

The previous Python-based implementation caused substantial delays (e.g., ~13 minutes for a ~2M-file buildroot).

Can you document why it was slow? Where are we saving the syscalls, etc.

@Conan-Kudo
Copy link
Member

Is this an AI generated change? This seems nonsensical, and the tests fail.

@xsuchy
Copy link
Member

xsuchy commented Dec 8, 2025

The only change that makes sense to me is the replacement of os.listdir by os.scandir (the second one is much faster). But the second rm only complicates maintenance.
I will close this one. If you want to push this, please open separate PR for each change:

  • replace of os.listdir
  • handling paths longer than PATH_MAX
  • _fastRm()

@xsuchy xsuchy closed this Dec 8, 2025
@nikitych
Copy link
Contributor Author

nikitych commented Dec 8, 2025

The only change that makes sense to me is the replacement of os.listdir by os.scandir (the second one is much faster). But the second rm only complicates maintenance. I will close this one. If you want to push this, please open separate PR for each change:

* replace of os.listdir

* handling paths longer than PATH_MAX

* _fastRm()

Well, the idea was that the interpreter was spending too much time constructing fullname for exclude processing. So I added exclude analysis, and for paths that are clearly not in the exclude list we call the fast library implementation. However, profiling showed that the main bottleneck is actually the traceLog() decorator.

Mon Dec  8 13:31:32 2025    /tmp/out.prof

         3207566512 function calls (3205556916 primitive calls) in 1490.879 seconds

   Ordered by: cumulative time
   List reduced from 463 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
1004587/1   20.540    0.000 1499.566 1499.566 /usr/lib/python3.9/site-packages/mockbuild/trace_decorator.py:57(trace)
1004587/1    9.368    0.000 1499.554 1499.554 /usr/lib/python3.9/site-packages/mockbuild/file_util.py:34(rmtree)
  1004587   30.361    0.000 1341.630    0.001 /usr/lib64/python3.9/inspect.py:1524(getouterframes)
 25119143   78.405    0.000 1294.998    0.000 /usr/lib64/python3.9/inspect.py:1485(getframeinfo)
 25119143  122.171    0.000  850.117    0.000 /usr/lib64/python3.9/inspect.py:809(findsource)
 51242960  103.391    0.000  634.200    0.000 /usr/lib64/python3.9/inspect.py:693(getsourcefile)
 25119143   87.511    0.000  213.004    0.000 /usr/lib64/python3.9/inspect.py:727(getmodule)
 51242970   87.414    0.000  209.656    0.000 /usr/lib64/python3.9/inspect.py:655(getfile)
102485927   66.293    0.000  195.296    0.000 {built-in method builtins.any}
 76362217  143.114    0.000  143.114    0.000 {built-in method posix.stat}

If you comment it out, the performance of file_util.rmtree gets close to shutil.rmtree. So for high-performance recursive deletion, we basically need a function without traceLog().

os.listdir can save a bit of memory, but even the current implementation uses only a few dozen megabytes, which is acceptable.

Support for paths longer than PATH_MAX came as a side effect of using reference implementations of the deletion logic. Both shutil.rmtree and rm -r handle such paths.

I agree, even though rm -r would bypass call-stack limits and is generally faster than the Python implementation, the added complexity might not be worth it.

To summarize:

  • For a significant performance improvement, it’s enough to remove file_util.rmtree from @traceLog().
  • If we want a bit more performance, we can keep the proposed exclude analysis but call shutil.rmtree instead of _fastRm(), which would give us PATH_MAX handling in most scenarios.

What do you think, at which point should we stop?

@praiskup
Copy link
Member

praiskup commented Dec 8, 2025

So I added exclude analysis, and for paths that are clearly not in the exclude list we call the fast library implementation. However, profiling showed that the main bottleneck is actually the traceLog() decorator.

Thank you for the additional steps, very useful conclusion. If we start with a new issue first, and let people talk - I think we could get rid of the decorator once and forever (but +1 at least for removing it from the method that is being slowed down). Also, note there's an option to do MOCK_TRACE_LOG=false (for debugging, the decorator actually contrary to its purpose complicates debugging).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

util.rmtree fails with RecursionError

4 participants