Slow-ish #2
Comments
Using this with pypy3, I'm finding that compression is faster than CPython's lzma module, but decompression is a little more than 30% slower. I'd love to see a speedup. |
Huh, I have a user! What compression ratio do your files have? Try setting _bufsiz.
You're the first user I know of outside of my then-employer, so I'm pretty chuffed. Are you using xz blocks, or did you just want a pypy-friendly xz library? |
I'm using lzma (xz) compression in a filesystem backup program: http://stromberg.dnsalias.org/~strombrg/backshift/
Because I'm backing up many different file types, I believe my compression ratios are all over the place, but I'm not displeased with them. Here's a 3-year-old analysis of compression in my personal backshift use: http://stromberg.dnsalias.org/~strombrg/backshift/documentation/for-all/chunk-sizes.html . I believe I'm getting just-OK compression because much of what I'm backing up is DVD rips, which are of course already lossily compressed.
I'm not looking for xz blocks; I just want something that'll work on pypy3 faster than the xz+ctypes code I wrote and faster than CPython's lzma module, so I can switch to pypy3 for backups. Right now, your first backup of a given filesystem is faster with pypy3 and subsequent backups are faster with cpython3; I'd like to find a way to get both to be faster with pypy3. As I said, your module appears to be faster for compression, but slower for decompression. Initial backups are compression-heavy with very little decompression, but subsequent backups do both compression and decompression.
Here's some code I've been using to performance-test two of the different lzma modules (see the sketch after this comment):
There was a memory leak in the lzma module that comes with pypy3; they fixed that. Did they let you know about it? Or even get it from you? I don't see anything about bufsiz in /usr/local/pypy3-5.8.0-with-lzma-fixes/lib-python/3/lzma.py . I'm now starting to wonder if pypy3's lzma code has diverged significantly from yours.
Thanks! |
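The benchmark script referenced above wasn't captured in this copy of the thread (the full script is linked later in the thread). Below is a minimal sketch of the same kind of comparison, not the actual code: the function name, the synthetic data, and the loop count are placeholders. Run it once under CPython and once under pypy3 and compare the times; the loop count is large enough for the PyPy JIT to warm up.

import lzma
import os
import time

expected_size = 1000000                      # decompressed size discussed in the thread
data = os.urandom(expected_size // 10) * 10  # mildly compressible stand-in for backup chunks
compressed_data = lzma.compress(data, format=lzma.FORMAT_XZ)

def lzma_decompression_test(loops=100):
    # One LZMADecompressor per iteration, mirroring how a backup tool
    # decompresses many independent chunks.
    for _ in range(loops):
        decompressor = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
        result = decompressor.decompress(compressed_data)
    return result

start = time.time()
lzma_decompression_test()
print('decompression x100 took {:.2f}s'.format(time.time() - start))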
I didn't realise they had forked my project, no! I assumed pypy3 might want a more compatible copy of the CPython lzma module (without the extra features I added). But I'm very happy to see it there.
It hasn't received huge changes; the class LZMADecompressor is just implemented in the _lzma module rather than the lzma module. If I take your microbenchmark and change _bufsiz on the LZMADecompressor instance to the expected decompressed size, 1000000, then pypy3 and CPython become practically the same speed (0.37s), while setting it to 1000000-1 makes it slow again (0.48s).
The algorithm for growing the buffer that liblzma outputs decompressed data into is quite bizarre, as it calls realloc 2479 times in this case. If it simply grew the output buffer by 8KB each time, it would only call it 122 times. The algorithm is from the CPython module's source code, though.
Anyway, I'm not sure this is a good benchmark when your actual decompressed data is only 10% bigger than the compressed data. Also, the usual caveats about benchmarking pypy apply - it's slow when it starts, but after a few thousand loops the JIT has kicked in and you'll see its real speed. I added another 10 or 100 calls to alt_lzma_decompression_test and the last call took 0.43 or 0.41 seconds instead of the usual 0.48.
Well, good luck finding the reason for the speed difference. :) |
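The realloc counts above can be sanity-checked with a little arithmetic. This is only a back-of-the-envelope sketch, not the module's buffer-growth code: starting from an 8KB buffer and growing by a fixed 8KB per call, a 1000000-byte output needs about 122 grow steps, versus the roughly 2479 realloc calls observed.

CHUNK = 8 * 1024      # fixed growth increment
TARGET = 1000000      # expected decompressed size from the microbenchmark

size = CHUNK          # initial output buffer
grows = 0
while size < TARGET:
    size += CHUNK     # each iteration stands in for one realloc() call
    grows += 1
print(grows)          # -> 122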
Where did you change _bufsiz?
I tried:
import lzma  # on pypy3 this is the forked module under discussion

for counter in range(100):
    _unused = counter
    decompressor = lzma.LZMADecompressor(format=lzma.FORMAT_XZ,
                                         memlimit=max_size)
    # max_size and compressed_data are defined earlier in the full script (linked below)
    if hasattr(decompressor, '_bufsiz'):
        print('Setting _bufsiz to {}'.format(max_size))
        decompressor._bufsiz = max_size
    result = decompressor.decompress(compressed_data)
    _unused = result
...but I didn't get a speedup. (You can see this in context at
http://stromberg.dnsalias.org/svn/utime-performance-comparison/trunk/upc)
Would you consider sending a small diff?
Thanks for thinking about it!
|
In Pypy3 5.9, xz decompression is more than twice as slow as that found in CPython 3.6. So it's actually gotten worse. |
It would be faster to use realloc instead of malloc for our output strings (avail_out), thus putting that growth loop into C-space.
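A rough illustration of that idea, not the library's actual code: keep a single C-allocated output buffer and grow it with realloc(), so buffer growth happens inside libc instead of by building new Python byte strings and copying. The sketch below uses ctypes on POSIX, and a fake decode_chunks() generator stands in for the liblzma decode loop; the real module would do the equivalent through its own FFI layer.

import ctypes

# Load libc symbols from the current process (works on Linux and macOS).
libc = ctypes.CDLL(None)
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.realloc.restype = ctypes.c_void_p
libc.realloc.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
libc.free.argtypes = [ctypes.c_void_p]

CHUNK = 8 * 1024

def decode_chunks():
    # Hypothetical stand-in for the liblzma decode loop:
    # yields successive pieces of decompressed output.
    for _ in range(150):
        yield b'x' * 7000

size = CHUNK
buf = libc.malloc(size)                 # one initial C allocation
used = 0

for piece in decode_chunks():
    while used + len(piece) > size:
        size += CHUNK
        buf = libc.realloc(buf, size)   # libc can often extend the block in place, no copy
    ctypes.memmove(buf + used, piece, len(piece))
    used += len(piece)

result = ctypes.string_at(buf, used)    # a single copy out to a Python bytes at the end
libc.free(buf)
print(len(result))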