Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve usage as a library #110

Merged
merged 10 commits into from
Feb 24, 2025
Merged

Improve usage as a library #110

merged 10 commits into from
Feb 24, 2025

Conversation

devongovett
Copy link
Contributor

@devongovett devongovett commented Feb 16, 2025

First of all, thanks for building this project! Very excited about it. I'm working on a Rust-based HTML transformer/minifier for Parcel, and was able to take advantage of oxvg for embedded SVGs in HTML. This PR includes several things I needed in order to get that working. See parcel-bundler/parcel#10090.

Trait lifetimes

For my HTML transformer, I'm using html5ever with a custom DOM implementation that uses an arena allocator rather than rcdom. This improves allocation performance, and also makes some things easier by passing around references rather than clones. However, oxvg's Node trait currently requires 'static lifetimes, and does some dynamic dispatch stuff that mean it's not possible to implement when a node has references. I was able to avoid this by defining a Parent associated type rather than using impl.

I also needed to return Atom directly from methods like get_attribute. I couldn't figure out a way to return a reference with lifetimes and an impl Deref. Maybe there's a way but cloning these seems ok since it is just a reference count. Do you have benchmarks showing significant overhead here?

Config options

Many of the jobs have #[serde(default)] attributes defined, but these don't apply when constructing the options programmatically. I've added manual implementations of the Default trait matching these serde defaults. An alternative would be a proc macro to automatically generate these, but it didn't seem too hard to just do it manually.

I've also marked all of the config properties as pub so they can be set programmatically.

Other fixes

While testing, I noticed that the convert_colors job was converting more attributes than just colors. For example, <rect width="100"> was becoming <rect width="100px">. This is because it was parsing all attributes as CSS, and then re-serializing them. I've addressed this by parsing only presentation attributes, and only converting color properties.

# Conflicts:
#	crates/oxvg_optimiser/src/jobs/cleanup_list_of_values.rs
#	crates/oxvg_optimiser/src/jobs/cleanup_numeric_values.rs
#	crates/oxvg_optimiser/src/jobs/inline_styles.rs
#	crates/oxvg_optimiser/src/jobs/merge_paths.rs
#	crates/oxvg_optimiser/src/jobs/prefix_ids.rs
#	crates/oxvg_optimiser/src/jobs/remove_attrs.rs
#	crates/oxvg_optimiser/src/jobs/remove_deprecated_attrs.rs
#	crates/oxvg_optimiser/src/jobs/remove_elements_by_attr.rs
#	crates/oxvg_optimiser/src/jobs/remove_off_canvas_paths.rs
#	crates/oxvg_optimiser/src/jobs/remove_scripts.rs
#	crates/oxvg_optimiser/src/jobs/remove_style_element.rs
#	crates/oxvg_optimiser/src/jobs/remove_title.rs
#	crates/oxvg_optimiser/src/jobs/remove_unknowns_and_defaults.rs
#	crates/oxvg_optimiser/src/jobs/remove_useless_stroke_and_fill.rs
#	crates/oxvg_optimiser/src/jobs/sort_defs_children.rs
@noahbald
Copy link
Owner

Thanks Devon, I really appreciate you going to all this work. I’m still working through getting this ready for a proper library release.

I’ll have a look through your PR and try get it merged first since it seems to cover a good range of things.

I’m also still fairly new to Rust, so please do let me know if you find any other quirks!

@noahbald
Copy link
Owner

noahbald commented Feb 17, 2025

Do you have benchmarks showing significant overhead here?

there are some benches set up in oxvg_optimiser you can take a look at. I haven’t looked too closely into performance differences here.

I was trying not to expose anything too closely to the underlying structs (i.e. rcdom) since I’m still looking into other parsers/libs that may cover some issues from html5ever (namely bad handling of namespaces and doctype entities).

If Atom is needed directly, would refining traits or a qualified ‘Element5Ever::get_attribute’ work?


Edit: I do have a vague note in the description of this PR, if that helps
#85

@noahbald
Copy link
Owner

I'm using html5ever with a custom DOM implementation that uses an arena allocator rather than rcdom

Is this available as a library? Sounds like it could be useful in place of rcdom, which has other limitations (such as read-only element names, read-only comments, etc)

@devongovett
Copy link
Contributor Author

I was trying not to expose anything too closely to the underlying structs (i.e. rcdom) since I’m still looking into other parsers/libs that may cover some issues from html5ever (namely bad handling of namespaces and doctype entities).

Yeah. I like the way you've used traits to enable different parsers. Made it easy to use html5ever to parse SVG embedded in HTML, which has different parsing rules than pure XML. I wonder if we can fix some of the issues in xml5ever too? I noticed the xmlns issue you reported, and I'm currently using a fork that has the proposed fix. Looks like the tests got updated yesterday, so hopefully it'll be fixed soon.

If Atom is needed directly, would refining traits or a qualified ‘Element5Ever::get_attribute’ work?

It's not necessarily needed, but in my case the impl Deref needs to somehow capture the lifetime of the owning element and I haven't found a way to do that.

Is this available as a library? Sounds like it could be useful in place of rcdom, which has other limitations (such as read-only element names, read-only comments, etc)

It's wip still but I guess it could be. I forked one of the examples in the html5ever repo: https://github.com/parcel-bundler/parcel/blob/html-rs/crates/html/src/arena.rs and here are the oxvg trait implementations https://github.com/parcel-bundler/parcel/blob/html-rs/crates/html/src/oxvg.rs

@@ -266,6 +292,32 @@ impl<'a> Color<'a> {
_ => Color::None,
}
}

fn get_colors_for_attr(attr: &'a mut PresentationAttr) -> Self {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're using the visitors feature, so maybe it'd be simpler to have a struct the implements visit_color

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do this in a separate PR maybe

@noahbald
Copy link
Owner

the impl Deref needs to somehow capture the lifetime

I'm not sure we strictly need to use impl Deref since I don't really see myself using any but Ref/RefMut -- so I'd be fine for it to be updated if the lifetimes are needed

@noahbald
Copy link
Owner

noahbald commented Feb 18, 2025

the impl Deref needs to somehow capture the lifetime

I'm not sure we strictly need to use impl Deref since I don't really see myself using any but Ref/RefMut -- so I'd be fine for it to be updated if the lifetimes are needed

Hey Devon, just saw your tweet, I was wondering where all that traffic was coming from 😅 -- anyway, thanks for the shoutout!

Happy to approve this as it is now, just wondering whether you'd prefer to do this as part of this pr (updating impl Deref -> Ref<'arena, _>), otherwise I can work on it some time next week separately

@noahbald noahbald merged commit 06b8806 into noahbald:main Feb 24, 2025
@devongovett
Copy link
Contributor Author

Sorry I ran out of time last week! I'll try to get back to it soon if you don't get to it first.

@noahbald
Copy link
Owner

Nw I’ve been working on it while waiting, I’ll have a PR once I’m done with work

@noahbald
Copy link
Owner

How does #114 look @devongovett?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants