
Conversation

@pixelherodev
Contributor

pixelherodev commented Oct 30, 2025

Regenerates the flatbuf package using the optimized flatbuffer compiler here. It's a pretty straightforward patch; it basically just passes internal values along the stack instead of through pointers (which necessitate heap allocations). The ipc package is updated accordingly to use the new APIs.
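To illustrate the shape of the API change, here's a minimal caller-side sketch. Footer.Schema is just a representative accessor, and loadSchema is a made-up helper; the exact generated signatures may differ from this paraphrase:

```go
// loadSchema is purely illustrative, contrasting the patched value-returning
// accessor shape with the stock pointer-returning one.
func loadSchema(footer *flatbuf.Footer) (flatbuf.Schema, bool) {
	// Stock flatc output: schema := footer.Schema(nil) returns *flatbuf.Schema
	// and allocates a fresh Schema on the heap whenever nil is passed in.
	// Patched output: the value comes back on the stack along with an ok flag,
	// so the hot decode path allocates nothing.
	return footer.Schema()
}
```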

I know editing generated code is usually not looked on super favorably, but IMO the performance win is sufficient to justify it.

Also addresses the // FIXME(sbinet) refactor msg-header handling comment; with the Table exposed directly (which it should be, since it can be modified via .Init anyway!), we can just use e.g. msg.Header(&foo.Table) instead of needing the initFB helper.
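For example, a rough sketch of the reader-side pattern this enables, written as if inside the ipc package; the initFB call shown in the comment is my paraphrase of the existing helper, not its exact signature:

```go
// readSchemaHeader is illustrative only: it decodes a Message whose header
// union is expected to be a Schema.
func readSchemaHeader(msg *flatbuf.Message) (flatbuf.Schema, bool) {
	var schema flatbuf.Schema
	// Previously, with _tab unexported, this went through the helper,
	// roughly: initFB(&schema, msg.Header).
	// With the embedded flatbuffers.Table exported, the union decodes
	// straight into the value:
	ok := msg.Header(&schema.Table)
	return schema, ok
}
```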

Before:

BenchmarkIPC/Writer/codec=plain-24         	 223521	     5303 ns/op	   8160 B/op	     91 allocs/op
BenchmarkIPC/Reader/codec=plain-24         	 223260	     4752 ns/op	   6130 B/op	     83 allocs/op
BenchmarkIPC/Writer/codec=zstd-24          	   5019	   300874 ns/op	2338732 B/op	    150 allocs/op
BenchmarkIPC/Reader/codec=zstd-24          	  37974	    28207 ns/op	  25877 B/op	    194 allocs/op
BenchmarkIPC/Writer/codec=lz4-24           	  96709	    11894 ns/op	  13021 B/op	    134 allocs/op
BenchmarkIPC/Reader/codec=lz4-24           	 120859	     9945 ns/op	  10155 B/op	    124 allocs/op

After:

cpu: AMD Ryzen 9 7900X3D 12-Core Processor          
BenchmarkIPC/Writer/codec=plain-24         	 213406	     5856 ns/op	   8160 B/op	     91 allocs/op
BenchmarkIPC/Reader/codec=plain-24         	 270422	     4526 ns/op	   5780 B/op	     69 allocs/op
BenchmarkIPC/Writer/codec=zstd-24          	   5756	   335170 ns/op	2338710 B/op	    150 allocs/op
BenchmarkIPC/Reader/codec=zstd-24          	  42300	    25984 ns/op	  25470 B/op	    178 allocs/op
BenchmarkIPC/Writer/codec=lz4-24           	 112836	    10816 ns/op	  12701 B/op	    134 allocs/op
BenchmarkIPC/Reader/codec=lz4-24           	 140852	     9199 ns/op	   9677 B/op	    108 allocs/op

@zeroshade
Member

Is there any possibility of getting the optimizations pushed upstream in the flatbuffer compiler?

Otherwise we should at least update the README or something to ensure that future regeneration of the flatbuffer files uses the same compiler patches.

@pixelherodev
Contributor Author

Is there any possibility of getting the optimizations pushed upstream in the flatbuffer compiler?

I'm trying, but realistically it's a Google project. I honestly doubt they're going to even look at it; they get bonuses for ruining public-facing projects, not for making things better.

Otherwise we should at least update the README or something to ensure that future regeneration of the flatbuffer files uses the same compiler patches.

That's definitely fair - would you want to host the compiler patches somewhere yourself? It's one reason I wanted to split this out into its own PR; I'm not sure of the best way to handle it. If it were my project, I'd just link to my own patches, but - it's not :)

@zeroshade
Member

Hmm, I'd be fine with creating a repo for the patches myself. @lidavidm @kou @amoeba what do you think?

@lidavidm
Member

lidavidm commented Nov 4, 2025

Do we have a sense of how hard it would be to keep these patches maintained?

@lidavidm
Member

lidavidm commented Nov 4, 2025

(Also, is there a PR upstream that we can look at? I agree that Google seems unlikely to care, but we should at least try.)

@kou
Member

kou commented Nov 4, 2025

(Also, is there a PR upstream that we can look at? I agree that Google seems unlikely to care, but we should at least try.)

+1

@pixelherodev
Contributor Author

google/flatbuffers#8744 <- yes, I did open one upstream @lidavidm @kou. If they engage with me I intend to push for it.

I need to figure out the CLA status and such though.

@lidavidm
Member

lidavidm commented Nov 4, 2025

Thanks! I'm going to try poking around my network to see if I can find any chain that leads to a Flatbuffer owner but, well, I have low hopes...

@kou
Member

kou commented Nov 4, 2025

Thanks. It seems that you need to sign a CLA: google/flatbuffers#8744 (comment)

Could you check it?

@kou
Member

kou commented Nov 4, 2025

Ah, sorry. You already mentioned the CLA status.

@pixelherodev
Contributor Author

Yeah, I need to double-check. Contributing to Arrow is definitely fine, but Google's asking for more than just a contribution under an Apache license; they effectively want total ownership in all but name, and I don't think I want to try and get permission to give them that, frankly.

@pixelherodev
Contributor Author

Do we have a sense of how hard it would be to keep these patches maintained?

Can't say for sure, given that I don't know what upstream looks like, but FWIW: I developed the patches on internal/flatbuf first, not knowing it was generated code, and then it took me all of... ten minutes? to figure out how to patch the Go code generator, and that's without having touched C++ in ~5 years.

So, I'd guess "not very hard," especially since the logic of the patch is quite straightforward:

  • Inline the flatbuffers Table. This means altering the _tab flatbuffers.Table field declaration, which is identical in every generated type, so that the Table is embedded directly (i.e. just drop the _tab).
    • Find all references to _tab and replace them with direct references; s/_tab.//g worked right away, since the embedded Table's fields and methods are promoted.
  • Find the generator function that emits the pointer-returning accessors (the ones that take a pointer in to use as storage), rewrite the return signature from *T to (T, bool), drop the logic that checks the passed-in pointer and allocates, and just return the structure on success. (See the before/after sketch below.)
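Roughly, and purely as an abridged sketch (field offsets and names are illustrative, loosely based on the Footer/Schema types), the transformation on a generated accessor looks like this:

```go
// Before (stock flatc output, abridged):
type Footer struct {
	_tab flatbuffers.Table
}

func (rcv *Footer) Schema(obj *Schema) *Schema {
	o := flatbuffers.UOffsetT(rcv._tab.Offset(6))
	if o != 0 {
		x := rcv._tab.Indirect(o + rcv._tab.Pos)
		if obj == nil {
			obj = new(Schema) // heap allocation whenever the caller passes nil
		}
		obj.Init(rcv._tab.Bytes, x)
		return obj
	}
	return nil
}
```

And after the patch, with the Table embedded and the accessor returning by value:

```go
// After (patched generator, abridged):
type Footer struct {
	flatbuffers.Table // embedded, so its fields and methods are promoted
}

func (rcv *Footer) Schema() (Schema, bool) {
	o := flatbuffers.UOffsetT(rcv.Offset(6))
	if o != 0 {
		x := rcv.Indirect(o + rcv.Pos)
		var obj Schema
		obj.Init(rcv.Bytes, x) // obj stays on the stack, no per-call allocation
		return obj, true
	}
	return Schema{}, false
}
```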

@pixelherodev
Contributor Author

not very hard

FWIW: simple enough that I'd be totally fine committing to maintaining it myself, if we knew each other enough for that to mean anything 😅

@pixelherodev
Contributor Author

Rebased and lint fixed
