Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CometBuffer can potentially lead to concurrent modification of a held buffer (aka is "Unsound" in Rust terms) #1035

Open
tustvold opened this issue Oct 23, 2024 · 29 comments · May be fixed by #1050
Assignees

Comments

@tustvold
Copy link

tustvold commented Oct 23, 2024

Describe the bug

It was brought to my attention in apache/arrow-rs#6616 that comet is currently violating the aliasing rules of the Rust compiler. In particular it is mutating memory without exclusive ownership.

The docs on CometBuffer actually call out that the type is unsound - https://github.com/apache/datafusion-comet/blob/main/native/core/src/common/buffer.rs#L166.

This is the underlying cause of #1030, which is a relatively harmless manifestation of what is ultimately undefined behaviour.

Even putting aside that UB effectively means the program could do literally anything, the exact scenario in #1030 could easily lead to out of bounds memory access, e.g. by unmasking a dictionary key that was previously null and now points to some arbitrary memory location.

I debated filing this ticket, as I wasn't sure how it would be received, but I think it is a sufficiently critical vulnerability that should at the very least be tracked / documented. The way it was being dismissed made me honestly a little uncomfortable. Ultimately CometBuffer is unsound, and there is a concrete example of this unsoundness leading to undefined behaviour in #1030.

Steps to reproduce

Expected behavior

No response

Additional context

FYI @viirya

@tustvold tustvold added the bug Something isn't working label Oct 23, 2024
@viirya
Copy link
Member

viirya commented Oct 23, 2024

I don't know why this is tagged as bug...

@viirya viirya removed the bug Something isn't working label Oct 23, 2024
@tustvold
Copy link
Author

I don't know why this is tagged as bug...

Apologies, I presumed that a security vulnerability was a bug...

@alamb alamb changed the title CometBuffer is Unsound CometBuffer can potentially lead to concurrent modification of a held buffer (aka is "Unsound" in Rust terms) Oct 23, 2024
@viirya
Copy link
Member

viirya commented Oct 23, 2024

I debated filing this ticket, as I wasn't sure how it would be received, but I think it is a sufficiently critical vulnerability that should at the very least be tracked / documented. The way it was being dismissed made me honestly a little uncomfortable.

As you mentioned the doc of CometBuffer already documents the unsafe behavior. Not sure why you also said it is not documented.

I also don't know why it was being dismissed. We have dealt with that with deep copying the arrays in the necessary cases. The fact is that the take kernel has the inconsistent behavior not well documented which can easily confuse users. Before digging into its detail, how does an user know it clones a buffer sometimes and not sometimes? So for an user, isn't the first impression that it works well when we don't deep copy the array? How does an user know it has different behavior that we must deep copy for it in corner cases? Especially it is not documented.

@alamb
Copy link
Contributor

alamb commented Oct 23, 2024

I tried to update the title of this PR to better reflect what it is reporting.

@tustvold
Copy link
Author

tustvold commented Oct 23, 2024

I am trying to report that your codebase has a critical bug in its handling of memory, without causing unnecessary angst. Modifying buffers behind the back of the Rust compiler is undefined behaviour, and can trivially lead to out of bounds memory access... This is typically considered a serious security defect...

As you mentioned the doc of CometBuffer already documents the unsafe behavior

CometBuffer does document that it has unsafe invariants, this isn't in and of itself an issue. Yes, it is unsound which Rust purists will complain about, but provided nothing uses it in a way that triggers undefined behaviour it isn't the end of the world. The problem is #1030 highlights that it is not being used in a manner that avoids undefined behaviour. This is what I think should be documented, highlighted, and arguably fixed.

Edit: to expand on this, if, for example, comet were instead using into_mutable().unwrap() or similar, #1030 would instead be a runtime panic, which would avoid UB and make tracking down where assumptions are violated much easier.

@viirya
Copy link
Member

viirya commented Oct 23, 2024

The problem is #1030 highlights that it is not being used in a manner that avoids undefined behaviour

As I mentioned earlier, we did effort to avoid it for the cases we know...
But the problem is, nobody knows the kernel behaves inconsistently except for ones already know the details. If we know that in advance, we definitely avoid it like we did for other operations. As I mentioned many times, the kernel behavior is inconsistent and undocumented. It definitely causes user confusion and spend time digging into implementation details just for figuring out why it happens.

@viirya
Copy link
Member

viirya commented Oct 23, 2024

Edit: to expand on this, if, for example, comet were instead using into_mutable().unwrap() or similar, #1030 would instead be a runtime panic, which would avoid UB and make tracking down where assumptions are violated much easier.

I believe that Comet scan is developed long before the mutable API comes out...
Not to mention that if it is suitable for our cases.

@alamb
Copy link
Contributor

alamb commented Oct 23, 2024

I believe that Comet scan is developed long before the mutable API comes out.

I wonder if there might be be some way to use Arc / ref counting to verify at runtime that there are no remaining references to the underlying CometBuffer before it is reused 🤔 Kind of like the try_unary_mut style kernel -- that way it would at least be easier to track down places where the assumptions are invalidated (that there is no more sharing)

@viirya
Copy link
Member

viirya commented Oct 24, 2024

I wonder if there might be be some way to use Arc / ref counting to verify at runtime that there are no remaining references to the underlying CometBuffer before it is reused

The scan code is not developed by me. I guess that may not work as the CometBuffer internally doesn't use pointer like Arc. It is very low-level raw pointer manipulation. That's why I said the into_mutable may not be suitable.

@tustvold
Copy link
Author

tustvold commented Oct 24, 2024

As I mentioned earlier, we did effort to avoid it for the cases we know...
But the problem is, nobody knows the kernel behaves inconsistently except for ones already know the details.

So basically it is safe predicated on undocumented assumptions about third-party code. This is at best optimistic 😅

I dunno, I'm going to disengage at this point but I have to say I have found the response here deeply disappointing and tbh rather troubling. I had hoped there'd at least be an acknowledgement that this API is hard to use safely, but instead the response is to suggest the problem lies in the third-party code for not conforming to your assumptions, and that you'll just add some extra copies to "workaround" this... There are ways to do this safely and I would have hoped that would be seen as a north star, even if not a priority at the moment...

@alamb
Copy link
Contributor

alamb commented Oct 24, 2024

Well, I don't presume to know what is / isn't the right design for other systems.

I think this ticket will serve as a good discussion for how to potentially improve the situation and I don't find the discussions or responses troubling.

@alamb
Copy link
Contributor

alamb commented Oct 24, 2024

The scan code is not developed by me. I guess that may not work as the CometBuffer internally doesn't use pointer like Arc. It is very low-level raw pointer manipulation. That's why I said the into_mutable may not be suitable.

Makes sense -- thus it sounds like removing this assumption is likely a non trivial effort

What I personally hope / plan to do is to is work with people make the ParquetExec reader in DataFusion so utterly compelling in terms of performance that Comet can consider using that instead of CometBuffer.

Current obsession is making pushdown predicates even faster with @XiangpengHao and @tustvold: apache/arrow-rs#5523

@andygrove
Copy link
Member

I do think that some valid concerns have been raised here. The questions at the top of my mind right now are:

  1. What additional tests could we add that would have caught this issue? Even if take was meeting the original assumptions today, it could have changed in the future, so how do we ensure that our use of the unsafe CometBuffer is safe and that we don't have regressions if the behavior in arrow-rs changes in the future?
  2. What would an integration with the arrow-rs / datafusion Parquet reader/exec look like? We need to integrate with Spark to read the raw bytes from Spark data sources (could be HDFS, S3, and many others, including custom readers) and would then need to hand those bytes off to native code for decoding. I would certainly like to be able to take advantage of the new Utf8View support.

@viirya
Copy link
Member

viirya commented Oct 24, 2024

2. What would an integration with the arrow-rs / datafusion Parquet reader/exec look like? We need to integrate with Spark to read the raw bytes from Spark data sources (could be HDFS, S3, and many others, including custom readers) and would then need to hand those bytes off to native code for decoding. I would certainly like to be able to take advantage of the new Utf8View support.

Integration with DataFusion Parquet reader needs to break some designs around current native reader like I mentioned in #1031. For now, it is not a easy transition as simple as to replace A with B. There will be a huge effort on this, not to mention that a JVM reader frontend is important for some cases.

@viirya
Copy link
Member

viirya commented Oct 25, 2024

I wonder if there might be be some way to use Arc / ref counting to verify at runtime that there are no remaining references to the underlying CometBuffer before it is reused

I just took another look at the related code. Arc / ref counting check won't be effective here, is also because these arrays are passed across JVM/native through C Data interface. Even I try to add an Arc to guard exclusive modification on the buffer data. Once we export the arrays through JVM and import into native, imported buffers won't get the Arc ref counts back but start with new Arc pointers. We cannot get Arc working through import/export arrays.

@tustvold
Copy link
Author

tustvold commented Oct 25, 2024

start with new Arc pointers

This isn't an issue provided the JVM doesn't free or modify the buffers whilst any native code has them. arrow-rs will always treat externally shared memory as immutable, it requires the other participants to do likewise. Once the arrow reference count reaches 0 it will call release, and this can decrement the Java side reference count, or whatever data structure it is using to track this. This is how FFI with pyarrow and arrow-cpp works

@viirya
Copy link
Member

viirya commented Oct 25, 2024

@alamb suggested if we can use Arc to check if there are no remaining references to the buffer before it is reused. What I said above is that this cannot work as we can not keep Arc reference count through JVM/native. I don't know how it is related to what you are talking about. Of course that's how FFI works, but it doesn't work for the proposed check.

@tustvold
Copy link
Author

tustvold commented Oct 25, 2024

I don't know how it is related to what you are talking about

I interpreted "I wonder if there might be be some way to use Arc / ref counting " as referring to the ref-counting machinery present in Buffer, and was expanding on how it might be used to achieve the stated goal. The machinery to do this safely already exists, it is just a question of hooking it up.

Edit: To expand further as there seems to be a lot of confusion here, the method linked in the original issue description, just needs to pass something instead of Arc::new(0) that prevents the JVM from reusing or freeing the buffer until it is dropped.

@viirya
Copy link
Member

viirya commented Oct 25, 2024

Edit: To expand further as there seems to be a lot of confusion here, the method linked in the original issue description, just needs to pass something real instead of Arc::new(0) that prevents the JVM from reusing or freeing the buffer until it is dropped.

Huh? When exporting the array from native to JVM, I believe JVM doesn't care about this Allocation parameter. Am I missing something?

@tustvold
Copy link
Author

tustvold commented Oct 25, 2024

I believe JVM doesn't care about this Allocation parameter

Indeed, currently in Comet the JVM doesn't use this parameter, and that is the problem 😅

Contrast this with, for example, how arrow-rs handles receiving data over the C Data interface, e.g. for interop with arrow-cpp or pyarrow. It first constructs an FFI_ArrowArray to manage the allocation, and then passes this as the Allocation parameter - https://github.com/apache/arrow-rs/blob/master/arrow-array/src/ffi.rs#L239. This ensures that the memory is not freed or mutated by the external process until all arrow-rs buffers are done with it.

@viirya
Copy link
Member

viirya commented Oct 25, 2024

I believe that the reason why Allocation parameter is here is because we claim an imported buffer uses custom allocation which arrow-rs won't try to deallocate it when dropping the buffer. I doubt that it is used to make sure the memory is not freed or mutated by external process (actually I doubt how it is possible to guarantee it in any ways). On the contrary, it prevents the memory allocated by external won't be freed by arrow-rs when dropping the imported buffer.

@tustvold
Copy link
Author

tustvold commented Oct 25, 2024

I doubt that it is used to make sure the memory is not freed or mutated by external process (actually I doubt how it is possible to guarantee it in any ways)

Nope that is precisely what it is for - https://arrow.apache.org/docs/format/CDataInterface.html#memory-management

Edit: I realize you could have been saying why Allocation is used in Comet, and yes you are right, however, by using it in this way the JVM has no way to know when it can safely free or reuse the buffer, which is my whole point. If you instead did something similar to what the C Data interface does and gave it an actual allocation object you wouldn't have this issue

@viirya
Copy link
Member

viirya commented Oct 25, 2024

Nope that is precisely what it is for - https://arrow.apache.org/docs/format/CDataInterface.html#memory-management

I feel that we still are not on the same page...

However, any data pointed to by the struct MUST be allocated and maintained by the producer.
Therefore, the consumer MUST not try to interfere with the producer’s handling of these members’ lifetime. The only way the consumer influences data lifetime is by calling the base structure’s release callback.

It clearly states that the data memory is allocated and maintained by the producer. The consumer must not interfere with the lifetime of the memory allocated by the producer. That is exactly what I said above.

@viirya
Copy link
Member

viirya commented Oct 25, 2024

Okay. I posted the above reply before seeing your addition to previous comment.

@viirya
Copy link
Member

viirya commented Oct 25, 2024

It first constructs an FFI_ArrowArray to manage the allocation, and then passes this as the Allocation parameter - https://github.com/apache/arrow-rs/blob/master/arrow-array/src/ffi.rs#L239. This ensures that the memory is not freed or mutated by the external process until all arrow-rs buffers are done with it.
If you instead did something similar to what the C Data interface does and gave it an actual allocation object you wouldn't have this issue

I'm still not sure what you referred to. It is clear to me that when Deallocation::Custom is given when importing array through FFI, it means it won't interfere with memory deallocation. This is completely following the C Data interface statement about the memory lifetime I quoted above #1035 (comment). It has no means to prevent external process to free/mutate the memory.

I also feel it somehow distracts from what was discussed at the beginning.

I wonder if there might be be some way to use Arc / ref counting to verify at runtime that there are no remaining references to the underlying CometBuffer before it is reused

This is what @alamb suggested at the beginning. However, as I posted above, Arc ref counting won't work across JVM/native. We cannot rely on it to do the check for remaining references.

This isn't an issue provided the JVM doesn't free or modify the buffers whilst any native code has them.

But you claimed that is not an issue. I don't know why. And, Comet doesn't free or modify the buffers in JVM.

@viirya
Copy link
Member

viirya commented Oct 25, 2024

I believe JVM doesn't care about this Allocation parameter

Indeed, currently in Comet the JVM doesn't use this parameter, and that is the problem

Btw, to clarify this, although this is not focus but I think there is confusion. Comet doesn't implement JVM Arrow FFI. It is from Java Arrow.

I remember Java Arrow has similar thing like the arrow-rs Deallocation::Custom to prevent Java Arrow directly releases imported buffer.

@tustvold
Copy link
Author

tustvold commented Oct 25, 2024

Ok I will try one final time to clarify from first principles:

  1. Something modifies the memory of a null buffer whilst an arrow-rs Buffer points to it, violating the Rust memory model (UB)
  2. It has been indicated that this is being done by the JVM parquet reader reusing the buffer
  3. Presuming we want to avoid UB we therefore need a way to stop 2. from happening
  4. This means we need some mechanism for arrow-rs to indicate to the JVM when it no longer holds any references to the data
  5. The ref counting machinery in Buffer provides the ability to register a custom Allocation
  6. When the reference count drops to zero it calls Allocation::drop
  7. This can be used to inform the JVM that the memory is no longer in use by arrow-rs and can be reclaimed / unpinned / reused
  8. Therefore by hooking up Allocation we can resolve the unsoundness of the current API and avoid future UB
  9. The way that arrow-rs hooks up the C Data Interface is an example of how this can be done

@viirya
Copy link
Member

viirya commented Oct 25, 2024

  1. It has been indicated that this is being done by the JVM parquet reader reusing the buffer

No, It is not JVM reusing the buffer, but the native reader reuses it. But I get your point. Let's replace it with native parquet reader.

  1. The ref counting machinery in Buffer provides the ability to register a custom Allocation
  2. When the reference count drops to zero it calls Allocation::drop
  3. This can be used to inform the JVM that the memory is no longer in use by arrow-rs and can be reclaimed / unpinned / reused
  4. Therefore by hooking up Allocation we can resolve the unsoundness of the current API and avoid future UB
  5. The way that arrow-rs hooks up the C Data Interface is an example of how this can be done

Okay. I got what you said.

I believe that the confusion began from #1035 (comment), #1035 (comment) and #1035 (comment).

I thought you were suggested to provide custom Allocation to JVM and it is confused as JVM doesn't take it when importing arrays through FFI.

What you are suggesting can be simplified to one sentence. Providing a custom Allocation when CometBuffer exports arrow buffer, and using it to detect if exported buffer is dropped from the importer. It sounds making sense. I will give it a try. Thanks.

@andygrove
Copy link
Member

What I personally hope / plan to do is to is work with people make the ParquetExec reader in DataFusion so utterly compelling in terms of performance that Comet can consider using that instead of CometBuffer.

@alamb @viirya I created a separate issue to continue this discussion:

#1040

@viirya viirya self-assigned this Oct 28, 2024
@viirya viirya linked a pull request Nov 5, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants