Transition away from Java serialization for storing state on disk or in Zookeeper#625
Open
mrflip wants to merge 13 commits into
Open
Transition away from Java serialization for storing state on disk or in Zookeeper#625mrflip wants to merge 13 commits into
mrflip wants to merge 13 commits into
Conversation
config and nimbus currently use the core Java serialization to store StormTopology instances. This commit will change this to use Thrift serialization instead. StormTopology is a Thrift struct so this basically involves finding all the places we call `Utils/serialize` and `Utils/deserialize` and replace them with a call to a method that serializes with Thrift instead.
config.clj and nimbus serialize Storm configuration using the default Java implementation. This commit will phase out Java default serialization in favor of Clojure default serialization. There are obviously many reasons to phase out Java serialization, but the main rationale is that Java serialization will complain about version mismatch even if the change is semantically backward compatible.
TSerializer is not threadsafe. In `Utils` we instantiate a static final Tserializer, but this can (and will) cause odd bugs if we start calling `serialize()` in different threads. Thus, every time we call `Utils/serializeTopology`, we create a new TSerializer. Another way to do this would be to lock it, which performance may or may not merit.
Serialization of configuration is handled in config, but it is not different from a generic method for Clojure form serialization. This commit will move this method to utils so that we can use it for other things, like serialization in cluster.
cluster.clj uses the stock Java serialization implementation. There are obviously many reason to not use standard Java serialization, but our main motivation is that Java will complain about serialized state when the versions don't match even if they're semantically backwards compatible.
Supervisor currently just uses stock jvm serialization to communicate LocalState. This is undesirable for many reasons, so this commit will introduce a serialization interface which makes code cleaner, letting us specify which type of serializer to use without populating Supervisor with unnecessary boxing/unboxing behavior, or LocalState with too much knowledge about what's happening in the Supervisor. Also this commit will introduce a basic implementation sketch for a serializer for LocalState (though it will just use the jvm serialization at this point).
To build the serialization interface in Java, we need to put these constants in a Java file. This commit will put them in Constants.java
Collaborator
|
@mrflip @hausdorff If so I'm +1. |
Collaborator
There was a problem hiding this comment.
Why are we renaming these functions? If we stick the original func name, we will have less code to worry about. No?
|
Hey, cool -- good to see this finally getting resolved. I'll be happy to take a look at this in a couple hours, after I get off work. |
Collaborator
|
@hausdorff @mrflip |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This is all @hausdorff's work -- it addresses #419 and #525, and supersedes #497
I've rebased his commits into the current master, and validated that the tests pass.
Hopefully this helps grease the path for the PR to go in...