Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Support encoding into a bytes tensor #635
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support encoding into a bytes tensor #635
Changes from 28 commits
7921558
2a19014
73bdc85
54f5543
24842b6
c3ac80a
5b39c8f
1f9f904
f525848
9150137
872b569
a0dcafd
f49d507
ee3a199
485ee2e
27fdbac
8467b92
ee7a217
7b3847f
d85baa2
42c6373
254529f
3f0417c
290c96e
5f42d15
0f415c1
2954c9b
29866fc
9cb31c9
ff6c1e0
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may relax the mutual-exclusivity check above eventually, if we implement both
write
andread
within the same class. For now, mutual-exclusivity is assumed and enforced, because we use the existence ofwrite
to set thewrite_flag
below.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Encoder.[cpp, h]
rely onAVIOToTensorContext
, and specifically on theAVIOToTensorContext::getOutputTensor()
method. They don't rely on theAVIOContextHolder
base class, like the decoder. For this reason we have to addAVIOBytesContext.cpp
to the source dependency here.Alternatively, I think we could make
getOutputTensor
a virtual method of the base class? But this method wouldn't make much sense for the existing child classes (likeAVIOBytesContext
), so this doesn't sounds like a great OOP design.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed on not making
getOutputTensor()
virtual on the base class, since it's only applicable to one of the derived classes. We could potentially pullAVIOToTensorContext
out ofAVIOBytesContext.[h|cpp]
to limit what gets put intolibtorchcodec_decoderN.so
.I think the issue here is that the
Encoder
class needs to callgetOutputTensor()
, which means that it must actually hold a reference to an actualAVIOToTensorContext
rather than just the baseAVIOContextHolder
. (Which is whatSingleStreamDecoder
does.) A potential way around this is forAVIOToTensorContext
to accept a tensor rather than creating its own. The caller, however, would still need to keep track of how many bytes it asked it to encode in order to do the final narrow on the tensor, so that's not necessarily any cleaner. Your call on what you think makes the most sense.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We enforce that
fileName
andformatName
parameters are mutually exclusive. Happy to consider alternative designs. Creating separate constructors seemed more complex and doesn't provide much value IMHO.Note that this only exists at the C++ constructor level. The custom ops expose two distinct entry-points:
encode_audio_to_file()
andencode_audio_to_tensor()
. I suspect that the public Python API will look similar to this, but that's still up for discussion.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should follow the same pattern set out in
SingleStreamDecoder
: one constructor that takes the file name, and another constructor that takes theAVIOToTensorContext
as an argument. That means it's the callers responsibility to create theAVIOToTensorContext
object; that's how it currently works inSingleStreamDecoder
. That would also require pulling all common initialization into someinitializeEncoder()
function that both constructors would call.The value, to me, is that it's more clear when creating the object that you're doing it correctly, and then when reading the initialization code, it's more clear what's required for, and correct for, each path.