-
Notifications
You must be signed in to change notification settings - Fork 182
feat: Add asJavaStream to Sink #2156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This definitely makes it easier to find.
This does beg the question whether we should move and deprecate all methods in https://pekko.apache.org/docs/pekko/current/stream/operators/index.html#additional-sink-and-source-converters .
The warning there (which is also in your javadoc/scaladoc) is that this will use blocking operations. That could be a reason to discourage this API, and putting them in the separate StreamConverters
might be a good way to do that. So from that perspective, perhaps having them in StreamConverters
(and thus harder to find) might be a good thing?
* Be aware that Java ``Stream`` blocks current thread while waiting on next element from downstream. | ||
* As it is interacting wit blocking API the implementation runs on a separate dispatcher | ||
* configured through the ``pekko.stream.blocking-io-dispatcher``. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add @since 2.0.0
I think the Does this method live outside before Java 8? |
@raboof As a Java developer, I think that's bad to put them on StreamConverters, users need to remember too many, today, LLM can only handle less mcp tools and mcp servers, so does humans. I think Pekko should make code smooth brain just as ZIO and Spring :), which actually will cover 80% the daily usage. |
Gotcha, so I see the original I see you have added "As it is interacting wit blocking API the implementation runs on a separate dispatcher configured through the
I agree, we should be making things "as simple as possible, but not simpler" 😄 |
I would be against removing these blocking methods, as I have in the past used the frequently with other API's that only work with There is nothing you can do about these methods being blocking as the underlying |
The Java Stream (talking about I experienced this myself, not sure what the easiest way to deal with this is from an API don't shoot yourself in the foot perspective. This is why the current API is designed so you only get the reference from an I am not sure about the new Java 1.8 Stream API is blocking or not, we have to be careful here. |
I'm definitely not suggesting removing the blocking methods, just 'not moving them to Sink'. I guess I should have said "keeping them in the separate |
@mdedetrich You can give it a try, this code will work ok if the event loop is a virtual thread; otherwise, dead deadlock. At work, Ajdk( a JDK backed on Dragonwell) 21 backported and fixed many issues around Loom, and now we can safely use Loom. I do encounter some deadlocks when using Loom, but now it's all fixed, I think that's why OX is starting direct style. |
I know that it won't block if you use a There is a reason why the current StreamCoverters are designed the way they are. |
When you are running inside a Virtual thread, and then you are blocked by, let's say, a Lock, and then the virtual thread is parked and unmounted, so the carrier thread can execute other things, has the virtual thread been blocked? yes, it cann't make any progress until it is unparked, and no, the carrier thread is still running, just some memory copying. You can try this with Pekko 1.2, set the max thread count to |
I am aware that Virtual Threads solve the issues, the point is we don't know and we cannot force people to always use VirtualThreads so the API cannot be designed assuming that people will always use a |
I think these APIs historically existed because Lightbend advocated that blocking should be managed, so reactive, and you can see, other reactive libraries which have many stars, just make these api easy to find. |
Yes the PR exposes Java 1.8 Stream in a
Irrelevant, not everyone is running on JVM 25 and people run Pekko in other environments.
I don't know how mutiny works. All I am saying is that a core design of Pekko Streams specifically is that it's built on the assumption that everything is purely asynchronous which is why as I explained before that materialized values/stream elements can never block and since we can't enforce that users will always 100% of the time use Unfortunately this was the issue with Java not having a core async/concurrency primitive when it was released (in contrast to Kotkin which has coroutines and Go which has channels) and it took 30 years to release an officially supported one (although you can argue that That actually opens up another set of questions, we may be able to simplify the current stream converters API by wrapping the results in a I am not against have another API which is designed for |
5df6af3
to
e0bc6de
Compare
Kotlin's coroutines are fake, even worse than Java 21's virtual threads. I added this API quite cautiously. You see, I didn't add toOutputStream to Sink. If other libraries had these features, a new user might see that Pekko Stream doesn't have this feature and choose another library. It seems you're planning to build a Kotlin-specific DSL for Pekko. While that's great, it also presents significant challenges, as it will have to compete with Kotlin's Flow API. It's a real challenge to have something that others don't have, and to be superior to others. |
I don't know what you mean by fake, but coroutines are by far the best implementation of an async concurrent primitive out of any of the JVM languages. Its performance is far higher than Loom as coroutines just compile do jump statements in JVM bytecode, it supports automatic cancellation of resources along with cancellation of coroutines and to boot it's also supported on all other kotlin platforms (native/js) with zero performance overhead. Loom on the hand only works on newer JVMs and is slower, the only good thing about it is that it uses the same programming model as threads but that is also what makes it slow (i.e. it needs to maintain stack traces). Akka/Pekko actors are ironically faster than Loom as a concurrency primitive as you only pass around pure values.
I am planning to build a kotlin but what I am saying is not specifically because of this, it's an issue with Java as a language. |
If we have an |
We are using Kotlin at work, some part of our server and our ios and android sharing the same codebase, but on server side, mostly using Java 21 nowadays. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds a new asJavaStream
method to the Sink API, providing a convenience method for converting Pekko streams into Java 8 Streams. This method delegates to the existing StreamConverters.asJavaStream()
functionality.
- Adds
asJavaStream()
method to both Scala and Java Sink APIs - Updates documentation with new method and examples
- Adds test coverage for the new functionality
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Show a summary per file
File | Description |
---|---|
stream/src/main/scala/org/apache/pekko/stream/scaladsl/Sink.scala | Adds asJavaStream method to Scala Sink API |
stream/src/main/scala/org/apache/pekko/stream/javadsl/Sink.scala | Adds asJavaStream method to Java Sink API |
stream-tests/src/test/scala/org/apache/pekko/stream/scaladsl/SinkAsJavaStreamSpec.scala | Adds test case for new Scala API method |
stream-tests/src/test/java/org/apache/pekko/stream/io/SinkAsJavaSourceTest.java | Adds test case for new Java API method |
docs/src/test/scala/docs/stream/operators/converters/StreamConvertersToJava.scala | Adds documentation example for Scala API |
docs/src/test/java/jdocs/stream/operators/converters/StreamConvertersToJava.java | Adds documentation example for Java API |
docs/src/main/paradox/stream/operators/index.md | Updates operator index with new method |
docs/src/main/paradox/stream/operators/Sink/asJavaStream.md | Creates dedicated documentation page for the new method |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
stream/src/main/scala/org/apache/pekko/stream/scaladsl/Sink.scala
Outdated
Show resolved
Hide resolved
stream/src/main/scala/org/apache/pekko/stream/javadsl/Sink.scala
Outdated
Show resolved
Hide resolved
I don't know, I would have to look into it. I am saying that the co-routine compiler transformations are well known and understood and they just translate to raw JVM bytecode statements Even though its not really used, scala had its own coroutine plugin as part of the compiler and it also did the same and it beat every other Scala IO runtime hands down. As you said, coroutines are slower if they have to interact with blocking Java code and kotlin-grpc might be doing that. On the other hand, if you have a purely async non blocking implementation like akka/pekko then it should be as fast, if not faster. Those same benchmarks show that akka/pekko grpc is just as fast as java vertx and the fact that java vertx is not meaningfully faster than akka/pekko grpc actually shows how Loom is not that fast as Loom has full runtime support in the JVM where as pekko/akka does not, as it implements actors as a library. If JVM implemented actors natively in JVM, then pekko/actor would be much faster, moerso than loom. There are lots of tricks that you can use to make actors very fast which Erlang does but cannot be done with akka/pekko as we don't have full JVM support (we however might be able to make actors even faster with newer versions of JVM as they are giving us more tools).
Sure if you need to interact with old blocking Java code it will be slower, the solution here is to not write/use blocking code. Thats why akka/pekko itself was written completely with async in mind and also why its as fast as java vertx 😉
True, I am just saying there are tradeoffs. If you don't want to deal with the color function problem then you can use loom, I am just saying not everyone is in that boat.
We are going on a bit of a tangent here though, my point is that not everyone who uses akka/pekko is using loom and akka/pekko is not a loom only API design. As I said, I don't have issues with making an API specific for loom users if it makes their life simpler from an API perspective, its just that we need to check for that because if you don't happen to be using loom then the code will just deadlock and thats a terrible user experience. |
Oh and on yet another note, those benchmarks at LesnyRumcajs/grpc_bench#441 show akka being slightly faster than pekko, which means there are more performance improvements on the table for pekko. It might even be possible to make akka/pekko beat vert.x with new JVM capabilities (although we would have to use multi jar for that) |
@mdedetrich maybe a new run will show some different. |
Btw, there are already a |
The current implementation uses |
I would go further and say that |
But reactor-core has that on Flux and no problem. |
Actually now that I think about it, if its not possible to implement this function without blocking (i.e. |
@mdedetrich The Materializer itself is using the |
I did not see much deadlock, because of the blocking method on Reactor-core/Rxjava which have a larger java users base. |
Motivation:
Make code much easy to find for daily job.
refs: https://smallrye.io/smallrye-mutiny/latest/guides/reactive-to-imperative/#iterating-over-multis-items
which has a
asStream
on Mutiny.