@hanouticelina
Contributor

Discussed internally in this thread.
This PR adds support for the image-text-to-image and image-text-to-video tasks for the wavespeed provider. These tasks cover models that accept text input together with an optional image.
For these tasks, we use the same endpoints as for image-to-image and image-to-video. When no image is provided, we send a 1x1 fully transparent image instead. Note that wavespeed doesn't follow a consistent endpoint naming pattern, so we can't reuse the logic used for fal in #1879.

I've tested it (i.e. with text-only input) with FLUX-2 and the results look reasonable to me:

"A robot playing chess in a garden"
wavespeed_placeholder_5

"A butterfly landing on a flower"

wavespeed_video_placeholder_4.mp4

Contributor

@Wauplin Wauplin left a comment

Hacky but nice it works! 😄