
Allocation heavy #32
Open
emberian opened this issue Aug 2, 2015 · 10 comments

Comments

@emberian commented Aug 2, 2015

Every occurrence of String or Vec is an allocation. Especially for sending messages, this seems a bit wasteful! The Cow (clone-on-write) type (and Into<Cow<str>>) can ease this burden by letting the user provide either a slice or a String, with the library cloning only when it actually needs unique ownership (if ever). This isn't a very practical concern, admittedly, and is more about good style and creating the best IRC library in existence.
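As an illustration of the Into<Cow<str>> idea, here's a minimal sketch (the `privmsg` function and `Message` struct are made-up names for illustration, not this library's API):

```rust
use std::borrow::Cow;

// Hypothetical message type, for illustration only.
struct Message<'a> {
    target: Cow<'a, str>,
    text: Cow<'a, str>,
}

// Accepting Into<Cow<'a, str>> lets callers pass either a &str (no allocation)
// or a String (ownership is moved in, still no extra allocation).
fn privmsg<'a, T, U>(target: T, text: U) -> Message<'a>
where
    T: Into<Cow<'a, str>>,
    U: Into<Cow<'a, str>>,
{
    Message {
        target: target.into(),
        text: text.into(),
    }
}

fn main() {
    let borrowed = privmsg("#channel", "hello");             // borrows both slices
    let owned = privmsg(String::from("#channel"), "hello");  // reuses the String's allocation
    let _ = (borrowed, owned);
}
```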

@emberian (Author) commented Aug 2, 2015

It may also be wise to use some sort of interning for things like tags or commands. Servo uses the excellent string_cache library.
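For anyone unfamiliar with interning, here's a hand-rolled sketch of the idea (string_cache's Atom types are the real, optimized version of this; the code below is only a sketch, not its API):

```rust
use std::collections::HashSet;
use std::rc::Rc;

// Minimal interner sketch: identical command/tag strings share one allocation.
#[derive(Default)]
struct Interner {
    strings: HashSet<Rc<str>>,
}

impl Interner {
    fn intern(&mut self, s: &str) -> Rc<str> {
        if let Some(existing) = self.strings.get(s) {
            return existing.clone(); // cheap refcount bump, no new allocation
        }
        let rc: Rc<str> = Rc::from(s);
        self.strings.insert(rc.clone());
        rc
    }
}

fn main() {
    let mut interner = Interner::default();
    let a = interner.intern("PRIVMSG");
    let b = interner.intern("PRIVMSG");
    assert!(Rc::ptr_eq(&a, &b)); // both point at the same allocation
}
```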

@aatxe (Owner) commented Aug 2, 2015

Yeah, I've been meaning to reduce the allocations, but have been putting it off as likely premature optimization. It should probably be done though. I'll look into interning.

@angelsl (Contributor) commented Feb 24, 2016

I've attempted to do this, and after about three or four days of work I honestly think it's not worth it.

To avoid allocations as far as possible, the Message and Command structs/enums must be able to take

  • separate strings for each argument (e.g. when we specify PRIVMSG("#target", "message")); or
  • one string for the whole message (e.g. when we receive one message from the server ":Somebody PRIVMSG #channel :Message")

and for those two cases, either

  • String; or
  • &str.

The case of separate strings for each argument is trivial: we put Cow<'a, str> wherever String is used now.
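A stripped-down sketch of that first case, with made-up variant names:

```rust
use std::borrow::Cow;

// Illustrative subset of a Command enum with Cow fields instead of String.
enum Command<'a> {
    Privmsg { target: Cow<'a, str>, text: Cow<'a, str> },
    Nick(Cow<'a, str>),
}

fn main() {
    // Built from string literals: no allocation at all.
    let borrowed = Command::Privmsg {
        target: Cow::Borrowed("#channel"),
        text: Cow::Borrowed("hello"),
    };
    // Built from an owned String: the allocation is moved in, not copied.
    let owned = Command::Nick(Cow::Owned(String::from("somebody")));
    let _ = (borrowed, owned);
}
```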

But if we have one string for the whole message, and we don't want to allocate for each token, we have to

  • store the one string somewhere and use Range<usize>s to refer to the appropriate string slices; or
  • store references to the string and the range together, using some sort of reference-counted pointer; or
  • use some sort of string that allows you to take arbitrary slices that are then reference counted (I'm not sure if a crate exists for such a string).

This makes the library a lot more complex, at least to me; many functions need to be duplicated and slightly tweaked to parse into ranges or parse into slices, and there is really no way to use a macro or something to reduce this semi-repetition.
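For concreteness, here's a rough sketch of the first option, storing the raw line once and keeping Range<usize> indices into it (the field names and hand-written indices are made up for illustration):

```rust
use std::ops::Range;

// One owned buffer plus index ranges into it; accessors hand out slices.
struct Message {
    raw: String,
    prefix: Option<Range<usize>>,
    command: Range<usize>,
    // Parameters would likewise be stored as ranges, e.g. Vec<Range<usize>>.
}

impl Message {
    fn prefix(&self) -> Option<&str> {
        self.prefix.clone().map(|r| &self.raw[r])
    }
    fn command(&self) -> &str {
        &self.raw[self.command.clone()]
    }
}

fn main() {
    // Indices are written by hand here just to show the shape of the type;
    // a real parser would compute them.
    let raw = String::from(":Somebody PRIVMSG #channel :Message");
    let msg = Message { raw, prefix: Some(1..9), command: 10..17 };
    assert_eq!(msg.prefix(), Some("Somebody"));
    assert_eq!(msg.command(), "PRIVMSG");
}
```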

Or, perhaps, my Rust ability is not good enough to do this. I don't know.

You can take a look at my abandoned attempt here. It does not compile.

@8573 commented Dec 9, 2018

To complement the suggestion of string_cache, I'd like to mention the small-string-optimized inlinable_string — of course with each having its own pros and cons.

@agausmann (Contributor)

The best example of such an implementation that I have encountered would have to be url::Url. It is backed by a single string with u32 offsets to each component. You could make this zero-copy friendly by backing it with a Cow<'a, str>, or even by splitting it into separate owned and borrowed message types, like the dichotomy of Path and PathBuf, str and String, etc.
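A rough sketch of that layout, combining a Cow backing buffer with u32 offsets (the names are illustrative, not an actual API); `into_owned` plays the role of the Path → PathBuf conversion:

```rust
use std::borrow::Cow;

// One backing buffer (borrowed or owned) plus u32 offsets to each component,
// similar in spirit to how url::Url stores its parts.
struct Message<'a> {
    raw: Cow<'a, str>,
    command_start: u32,
    command_end: u32,
}

impl<'a> Message<'a> {
    fn command(&self) -> &str {
        &self.raw[self.command_start as usize..self.command_end as usize]
    }

    // Converting to the fully owned form only allocates if the buffer was borrowed.
    fn into_owned(self) -> Message<'static> {
        Message {
            raw: Cow::Owned(self.raw.into_owned()),
            command_start: self.command_start,
            command_end: self.command_end,
        }
    }
}

fn main() {
    let msg = Message {
        raw: Cow::Borrowed(":Somebody PRIVMSG #channel :Message"),
        command_start: 10,
        command_end: 17,
    };
    assert_eq!(msg.command(), "PRIVMSG");
    let _owned: Message<'static> = msg.into_owned();
}
```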

@agausmann (Contributor) commented Jun 6, 2019

I'm going to start working on this in my own fork.

If you all would like to review and contribute to it, maybe we can move it into a branch here to eventually be merged further.

@aatxe (Owner) commented Jun 6, 2019

I would prefer it on 0.14 over develop, since that's the release I'd like to include the change in. I had attempted some work in this direction on the fewer_allocations branch as well. I think there were some issues when I merged 53fb890a7e69d4fa65bba4c07dd46e32e71d3022, though, which cause 0.14 to regress on some features. This was pointed out in at least one other issue by @8573. I don't think they're relevant to this task, though.

@agausmann (Contributor) commented Jun 6, 2019

I'll take a look at it, thanks!

@angelsl While I understand the appeal of zero allocation by building the message up from separate string slices, I think producing a single owned, allocated buffer from the parts would be a decent performance compromise that reduces the complexity. Your idea could, in theory, be retained as part of a builder pattern that eventually gets converted into an owned message or formatted with Display.
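A small sketch of what that compromise could look like (hypothetical names): the builder only borrows its parts, and the single allocation happens when the finished line is produced.

```rust
use std::fmt::{self, Display, Formatter};

// Illustrative builder: holds borrowed parts, allocates once when rendered.
struct Privmsg<'a> {
    target: &'a str,
    text: &'a str,
}

impl<'a> Display for Privmsg<'a> {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        write!(f, "PRIVMSG {} :{}\r\n", self.target, self.text)
    }
}

fn main() {
    let msg = Privmsg { target: "#channel", text: "hello" };
    // The only allocation is the final owned buffer produced here;
    // writing straight into an existing buffer with write! would avoid even that.
    let line: String = msg.to_string();
    assert_eq!(line, "PRIVMSG #channel :hello\r\n");
}
```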

@agausmann (Contributor) commented Jun 7, 2019

Update: I put together an optimized version of the Message type that owns a single String with shallow references to the individual parts, like the example from Url that I gave before. It should compile standalone, but it lacks some features and isn't fully integrated with the rest of the system, so the crate fails to build. I haven't tested the parser yet either, but I will write tests for it tomorrow.

I toyed around with the idea of a separate DST type based around str, but I saw little benefit to it: it would require quite a bit of unsafe, and it's impossible to reuse existing raw string slices without copying them. str is essentially a [u8] behind the same kind of pointer, which works because it carries no extra data, just the guarantee that it is valid UTF-8. Unfortunately that doesn't work here, as we have to add more data (the message part indices) that doesn't fit in the original pointer.

EDIT: Rebased on 0.14

@agausmann (Contributor) commented Jun 19, 2019

Going to put some of my thoughts here to give some focus and direction to what I'm doing. Feedback is welcome!

When designing the new protocol implementation, we need to think about the ways that the end-user is going to use it. The two main ways I can think of are:

  • Parsing (reading / decoding) received messages.
  • Building (encoding / writing) messages to be sent.

In both of these scenarios, we want to avoid as much copying and allocation as possible, as that is the whole point of this issue. Other variables include the complexity of the API, which we should try to minimize. I'm not sure whether that should be prioritized over the first point, and if so, how much we should compromise.

For the parsing scenario, we are given a stream of data or a string slice that we need to read, parse, and return string slices from, with the following goals:

  • If we need to read data into a buffer, that buffer should ideally be reusable to prevent re-allocation.

  • To truly minimize allocations, we would have to defer parsing of things with dynamic lengths like parameters and tags to Iterators. I haven't experimented with how well this performs over allocating a Vec for each of those parts, but iterators should outperform if that data only needs to be parsed once. I believe we can assume that to be true for most use cases, as the top-level message is usually only used once to build a Command that is then handed off to the appropriate handler.

  • The Command enum should exist mostly of borrows and fixed-size collections wherever possible. To facilitate the passing of specific message types, we could maybe make them their own named structs; for example, a struct Privmsg, with the variant then becoming Command::Privmsg(Privmsg).
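As a sketch of the iterator idea from the list above, here's a simplified parameter parser that yields slices of the input without allocating (it ignores tags and prefixes, and the type name is made up):

```rust
// Simplified iterator over IRC parameters: yields slices of the input,
// treating a leading ':' as the start of a single trailing parameter.
struct Params<'a> {
    rest: &'a str,
}

impl<'a> Iterator for Params<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        if self.rest.is_empty() {
            return None;
        }
        if self.rest.starts_with(':') {
            let trailing = &self.rest[1..];
            self.rest = "";
            return Some(trailing);
        }
        match self.rest.find(' ') {
            Some(idx) => {
                let param = &self.rest[..idx];
                self.rest = &self.rest[idx + 1..];
                Some(param)
            }
            None => {
                let param = self.rest;
                self.rest = "";
                Some(param)
            }
        }
    }
}

fn main() {
    let params: Vec<&str> = Params { rest: "#channel :Hello there" }.collect();
    assert_eq!(params, vec!["#channel", "Hello there"]);
}
```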

For building messages, it makes sense to employ some kind of builder pattern that simply holds on to references for the message parts and then constructs the message in some buffer or writer.

I'm a bit wary about using borrowing and lifetimes right now, because it gets hairy when you try to give borrowed data to a future that is supposed to be 'static. Things like that will become easier once borrowing across yield points is possible, but only after async/await is stabilized and the ecosystem shifts toward it.

In the meantime, a good alternative might be the bytes crate, which stores its data in atomically reference-counted buffers. There will be some additional overhead for converting them to &str, though we may be able to (un)safely optimize this out and only enable encoding checks in debug if it is a big performance issue. I doubt it will be.
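Roughly what that looks like with bytes (exact method signatures vary between crate versions; the point is that slicing shares one refcounted buffer, and the &str conversion is where the UTF-8 check lands):

```rust
// Assumes the `bytes` crate (1.x) as a dependency.
use bytes::Bytes;

fn main() {
    // One refcounted buffer holding the raw line.
    let line = Bytes::from(":Somebody PRIVMSG #channel :Message");

    // Slicing shares the same allocation; only the refcount changes.
    let command: Bytes = line.slice(10..17);

    // The cost mentioned above: going back to &str requires a UTF-8 check
    // (or an unsafe from_utf8_unchecked if the data was validated on receipt).
    let command_str = std::str::from_utf8(&command).expect("valid UTF-8");
    assert_eq!(command_str, "PRIVMSG");
}
```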
