Add xerial snappy read/writer#838
Conversation
Forked from [github.com/eapache/go-xerial-snappy](https://github.com/eapache/go-xerial-snappy). Changes: * Uses [S2](https://github.com/klauspost/compress/tree/master/s2#snappy-compatibility) for better/faster compression and decompression. * Fixes 0-length roundtrips. * Adds `DecodeCapped`, which allows decompression with capped output size. * `DecodeInto` will decode directly into destination if there is space enough. * `Encode` will now encode directly into 'dst' if it has space enough. * Fixes short snappy buffers returning `ErrMalformed`. * Renames `EncodeStream` to `Encode`. * Adds `EncodeBetter` for better than default compression at ~half the speed. Comparison (before/after): ``` BenchmarkSnappyStreamEncode-32 959010 1170 ns/op 875.15 MB/s 1280 B/op 1 allocs/op BenchmarkSnappyStreamEncode-32 1000000 1107 ns/op 925.04 MB/s 0 B/op 0 allocs/op --> Output size: 913 -> 856 bytes BenchmarkSnappyStreamEncodeBetter-32 477739 2506 ns/op 408.62 MB/s 0 B/op 0 allocs/op --> Output size: 835 bytes BenchmarkSnappyStreamEncodeMassive-32 100 10596963 ns/op 966.31 MB/s 40977 B/op 1 allocs/op BenchmarkSnappyStreamEncodeMassive-32 100 10220236 ns/op 1001.93 MB/s 0 B/op 0 allocs/op --> Output size: 2365547 -> 2256991 bytes BenchmarkSnappyStreamEncodeBetterMassive-32 69 16983314 ns/op 602.94 MB/s 0 B/op 0 allocs/op --> Output size: 2011997 bytes BenchmarkSnappyStreamDecodeInto-32 1887378 639.5 ns/op 1673.19 MB/s 1088 B/op 3 allocs/op BenchmarkSnappyStreamDecodeInto-32 2707915 436.2 ns/op 2452.99 MB/s 0 B/op 0 allocs/op BenchmarkSnappyStreamDecodeIntoMassive-32 267 4559594 ns/op 2245.81 MB/s 71120 B/op 1 allocs/op BenchmarkSnappyStreamDecodeIntoMassive-32 282 4285844 ns/op 2389.26 MB/s 0 B/op 0 allocs/op ```
Hmm, that's a surprisingly good question :) It sounds like there are some bug fixes and some feature improvements mixed together here? Though it's not clear to me at a glance whether the bug fixes are just side-effects of the switch to S2, or whether the framing code in xerial.go actually changed meaningfully as well? The last time I touched the xerial-snappy package was January, in eapache/go-xerial-snappy#9 - @dnwe and I had a chat in that PR (which you are welcome to read for context) about the future of the package and its primary user (https://github.com/IBM/sarama/), given I'm not really involved in any of the related projects anymore.
To answer the original question though - I'm certainly happy to accept small bug fixes and improvements upstream if you'd like to contribute them there instead / as well / depending on the discussion with Dom. Swapping the underlying library for S2 is not something that I would feel comfortable accepting just because it's a major change and I don't have the setup or time to verify for myself that it actually maintains compatibility, wouldn't break Sarama, etc. |
|
@eapache I can send the fixes for the decoding edge cases. That'll give me an excuse to add the test cases outside of the fuzz tests. I understand that you aren't looking to test replacing the encoder - I had a hunch about that, which is why I forked it. It is of course safe, but I wouldn't take such a change without spending time investigating myself. Yes, it would be drop-in, except Sidenote: It took me a while to understand how the "append" mechanic of EncodeStream works. I think I will add some more documentation on that. I will send a PR with the fixes ASAP. You can then decide on what the fate of your package will be :) |
This PR contains the following updates: | Package | Type | Update | Change | |---|---|---|---| | [github.com/klauspost/compress](https://togithub.com/klauspost/compress) | indirect | minor | `v1.16.7` -> `v1.17.0` | --- ### Release Notes <details> <summary>klauspost/compress (github.com/klauspost/compress)</summary> ### [`v1.17.0`](https://togithub.com/klauspost/compress/releases/tag/v1.17.0) [Compare Source](https://togithub.com/klauspost/compress/compare/v1.16.7...v1.17.0) #### What's Changed - Add dictionary builder by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/853](https://togithub.com/klauspost/compress/pull/853) - Add xerial snappy read/writer by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/838](https://togithub.com/klauspost/compress/pull/838) - flate: Add limited window compression by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/843](https://togithub.com/klauspost/compress/pull/843) - s2: Do 2 overlapping match checks by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/839](https://togithub.com/klauspost/compress/pull/839) - flate: Add amd64 assembly matchlen by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/837](https://togithub.com/klauspost/compress/pull/837) - gzip: Copy bufio.Reader on Reset by [@​thatguystone](https://togithub.com/thatguystone) in [https://github.com/klauspost/compress/pull/860](https://togithub.com/klauspost/compress/pull/860) - zstd: Remove offset from bitReader by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/854](https://togithub.com/klauspost/compress/pull/854) - fse, huff0, zstd: Remove always-nil error returns by [@​greatroar](https://togithub.com/greatroar) in [https://github.com/klauspost/compress/pull/857](https://togithub.com/klauspost/compress/pull/857) - tests: unnecessary use of fmt.Sprintf by [@​testwill](https://togithub.com/testwill) in [https://github.com/klauspost/compress/pull/836](https://togithub.com/klauspost/compress/pull/836) - tests: Fix OSS fuzzer t.Run by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/852](https://togithub.com/klauspost/compress/pull/852) - tests: Use Go 1.21.x by [@​klauspost](https://togithub.com/klauspost) in [https://github.com/klauspost/compress/pull/851](https://togithub.com/klauspost/compress/pull/851) #### New Contributors - [@​testwill](https://togithub.com/testwill) made their first contribution in [https://github.com/klauspost/compress/pull/836](https://togithub.com/klauspost/compress/pull/836) - [@​thatguystone](https://togithub.com/thatguystone) made their first contribution in [https://github.com/klauspost/compress/pull/860](https://togithub.com/klauspost/compress/pull/860) **Full Changelog**: klauspost/compress@v1.16.7...v1.17.0 </details> --- ### Configuration 📅 **Schedule**: Branch creation - "before 4am on the first day of the month" (UTC), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about this update again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR has been generated by [Renovate Bot](https://togithub.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi4xMDkuNCIsInVwZGF0ZWRJblZlciI6IjM2LjEwOS40IiwidGFyZ2V0QnJhbmNoIjoibWFpbiJ9-->
Forked from github.com/eapache/go-xerial-snappy.
Changes:
DecodeCapped, which allows decompression with capped output size.DecodeIntowill decode directly into destination if there is space enough.Encodewill now encode directly into 'dst' if it has space enough.ErrMalformed.EncodeStreamtoEncode.EncodeBetterfor better than default compression at ~half the speed.Comparison (before/after):
@eapache - I don't know if you are interested in having these changes upstream?