
Slow-ish #2

Open
r3m0t opened this issue Sep 9, 2014 · 6 comments

Comments


r3m0t commented Sep 9, 2014

It would be faster to use realloc instead of malloc for our strings (avail_out), thus putting that loop into C-space.
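
A minimal sketch of the idea (not lzmaffi's actual code): grow one C buffer with `realloc()` instead of allocating a fresh `avail_out` chunk per pass and joining the pieces in Python. Here `decompress_chunk` is a hypothetical stand-in for the liblzma call, and `ffi.dlopen(None)` assumes a Unix-style libc.

```python
import cffi

ffi = cffi.FFI()
ffi.cdef("""
    void *malloc(size_t size);
    void *realloc(void *ptr, size_t size);
    void free(void *ptr);
""")
libc = ffi.dlopen(None)  # load the C library; assumes a Unix-style platform

def read_all(decompress_chunk, initial_size=8192):
    # decompress_chunk(ptr, room) -> number of bytes written, or -1 at end
    # of stream (hypothetical callback standing in for the liblzma call).
    size = initial_size
    buf = libc.malloc(size)
    used = 0
    try:
        while True:
            written = decompress_chunk(ffi.cast("char *", buf) + used, size - used)
            if written < 0:           # stream finished
                break
            used += written
            if used == size:          # out of room: grow in place; the copy stays in C
                size *= 2
                buf = libc.realloc(buf, size)   # error handling omitted
        return ffi.buffer(ffi.cast("char *", buf), used)[:]
    finally:
        libc.free(buf)
```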

@dstromberg

Using this with pypy3 I'm finding that compression is faster than CPython's lzma module, but decompression is a little more than 30% slower than CPython's lzma module. I'd love to see a speedup.


r3m0t commented Sep 30, 2017

Huh, I have a user! What compression ratio do your files have? Try setting `file._decompressor._bufsiz = io.DEFAULT_BUFFER_SIZE * 5` if your compression ratio is 0.2, for example. Do that before your first call to `read()`.
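
For example (a hedged sketch: `lzmaffi.open` is assumed to mirror `lzma.open`, the filename is made up, and `_decompressor`/`_bufsiz` are the private attributes named above, so they may change between versions):

```python
import io
import lzmaffi

# If the file expands roughly 5x (compression ratio ~0.2), pre-sizing the
# decompressor's output buffer avoids repeated buffer growth during read().
with lzmaffi.open("backup.chunk.xz", "rb") as f:        # hypothetical file
    f._decompressor._bufsiz = io.DEFAULT_BUFFER_SIZE * 5
    data = f.read()
```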

You're the first user I know of outside of my then-employee, so I'm pretty chuffed. Are you using xz blocks, or did you just want a pypy-friendly xz library?

@dstromberg

I'm using lzma (xz) compression in a filesystem backup program: http://stromberg.dnsalias.org/~strombrg/backshift/
I have one 3+ terabyte backup repo for a half dozen machines, comprising my home backups. Some files are compressed with CPython's native lzma module, some are compressed with a ctypes-based xz module, and some are compressed with the lzma module that comes with pypy3, which I believe is your lzmaffi module.

Because I'm backing up many different file types, I believe my compression ratios are all over the place, but I'm not displeased with them. Here's a 3-year-old analysis of compression in my personal backshift use: http://stromberg.dnsalias.org/~strombrg/backshift/documentation/for-all/chunk-sizes.html. I believe I'm getting just-OK compression because much of what I'm backing up is DVD rips, which are of course already lossily compressed.

I'm not looking for xz blocks; I just want something that'll work on pypy3 faster than the xz+ctypes code I wrote and faster than CPython's lzma module, so I can switch to pypy3 for backups. Right now, your first backup of a given filesystem is faster with pypy3 and subsequent backups are faster with cpython3. I'd like to find a way to get to both being faster with pypy3.

As I said, your module appears to be faster for compression, but slower for decompression. Initial backups are compression-heavy with very little decompression, but subsequent backups are doing both compression and decompression.

Here's some code I've been using to performance-test two of the different lzma modules:
http://stromberg.dnsalias.org/svn/utime-performance-comparison/trunk

There was a memory leak in the lzma module that comes with pypy3; they fixed that. Did they let you know about it? Or even get it from you?

I don't see anything about bufsiz in /usr/local/pypy3-5.8.0-with-lzma-fixes/lib-python/3/lzma.py . I'm now starting to wonder if pypy3's lzma code has diverged significantly from yours.

Thanks!


r3m0t commented Oct 1, 2017

No, I didn't realise they had forked my project! I assumed pypy3 might want a more compatible copy of the CPython lzma module (without the extra features I added), but I'm very happy to see it there.

It hasn't received huge changes; the LZMADecompressor class is just implemented in the _lzma module rather than the lzma module. If I take your microbenchmark and change _bufsiz on the LZMADecompressor instance to the expected decompressed size (1000000), then pypy3 and CPython become practically the same speed (0.37s), while setting it to 1000000-1 makes it slow again (0.48s).
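
Roughly along these lines (a sketch, not the original microbenchmark: the payload is made up, and `_bufsiz` is the lzmaffi/pypy3 decompressor's private attribute, so the tuned runs only work there):

```python
import lzma          # on pypy3, this module is derived from lzmaffi
import time

payload = bytes(range(256)) * 3907          # ~1,000,000 bytes of stand-in data
compressed = lzma.compress(payload)

def decompress_once(bufsiz=None):
    d = lzma.LZMADecompressor()
    if bufsiz is not None:
        d._bufsiz = bufsiz                  # private attribute; pypy3/lzmaffi only
    return d.decompress(compressed)

# Time the default buffer growth, an exact-size buffer, and one byte short.
for bufsiz in (None, len(payload), len(payload) - 1):
    start = time.perf_counter()
    for _ in range(100):
        decompress_once(bufsiz)
    print(bufsiz, time.perf_counter() - start)
```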

The algorithm for growing the buffer that liblzma writes decompressed data into is quite bizarre: it calls realloc 2479 times in this case. If it simply grew the output buffer by 8KB each time, it would only call it 122 times. The algorithm is taken from the CPython module's source code, though.
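
For scale, a quick check of that count (assuming roughly 1,000,000 bytes of decompressed output, as in the benchmark above):

```python
# Starting from an 8 KB buffer and growing by a fixed 8 KB per realloc,
# covering ~1,000,000 bytes of output takes about 122 grow calls.
out_size = 1000000
step = 8 * 1024
size = step
grows = 0
while size < out_size:
    size += step
    grows += 1
print(grows)   # 122
```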

Anyway, I'm not sure this is a good benchmark when your actual decompressed data is only 10% bigger than the compressed data. Also, the usual caveats about benchmarking pypy apply: it's slow when it starts, but after a few thousand loops the JIT has kicked in and you'll see its real speed. I added another 10 or 100 calls to alt_lzma_decompression_test and the last call took 0.43 or 0.41 seconds instead of the usual 0.48.

Well, good luck finding the reason for the speed difference. :)


dstromberg commented Oct 2, 2017 via email

@dstromberg

In Pypy3 5.9, xz decompression is more than twice as slow as that found in CPython 3.6. So it's actually gotten worse.
