Description
Background
This is a follow-up to the optimization efforts started in #3870.
In #3870 and #4380, we already made good progress in reducing the performance overhead that the security plugin adds to an OpenSearch cluster. Further performance tests carried out along the way indicate that a cluster with the security plugin still shows a throughput which is about 10% lower than a cluster without it: #4380 (comment)
Analysis
It is already well known that the serialization and de-serialization of the user object adds a significant performance penalty: #2780, #4760. There have already been a couple of efforts in this area. One approach was to change the serialization protocol: #2802. However, unexpected side effects were encountered, which caused the change to be rolled back: #3771, #3776, #4670.
In this issue, we want to approach the problem again. This time, however, not by optimizing the serialization protocol, but simply by reducing the number of serialization operations.
Life cycle of the org.opensearch.security.user.User object
The class whose objects are most often serialized and de-serialized in an OpenSearch cluster is org.opensearch.security.user.User. So, let's have a look at its life cycle:
- Instances of org.opensearch.security.user.User are usually created when a REST request is authenticated by the security plugin. This happens in the class BackendRegistry, which either instantiates User objects directly or delegates this to classes implementing the AuthenticationBackend interface. Additionally, BackendRegistry maintains a cache of user objects for the case that the same user sends several requests in sequence.
- In a second step, BackendRegistry uses classes implementing AuthorizationBackend to add roles to the User object.
- The class PrivilegesEvaluator performs role mapping and adds a number of effective roles to the User object to make these available to dynamic attributes used in the role configuration.
- If the current node needs to delegate work to other nodes, the User object is serialized, put into a transport header and then de-serialized again on the receiving node. If the current node needs to send messages to n other nodes, the user object will be serialized and de-serialized n times.
- On the receiving node, the de-serialized User object will be used. Actual authentication via BackendRegistry and AuthenticationBackend does not happen. That means the User object stays constant.
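The transport step in this life cycle can be illustrated with a minimal sketch. It assumes plain Java serialization with Base64 as the header encoding, and it uses a hypothetical DemoUser class and HeaderCodec helper; none of these names are taken from the security plugin. The counter only exists to make the number of serialization operations visible.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a mutable user object; not the plugin's User class.
class DemoUser implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final Set<String> roles = new HashSet<>();

    DemoUser(String name) {
        this.name = name;
    }
}

// Hypothetical helper that encodes a user into a header string and back.
class HeaderCodec {
    static int serializations = 0; // counts encode calls, for illustration only

    static String toHeader(DemoUser user) throws IOException {
        serializations++;
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(user);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    static DemoUser fromHeader(String header) throws IOException, ClassNotFoundException {
        byte[] raw = Base64.getDecoder().decode(header);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return (DemoUser) in.readObject();
        }
    }
}
```

Because the user object is mutable, the sender in this sketch cannot safely re-use an earlier header value: calling toHeader() once per target node leads to n serializations for n nodes.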
An immutable User object
The life cycle shows that the User object mostly stays constant once the authentication phase has finished. The one exception is the modification performed in PrivilegesEvaluator. Its purpose, attribute interpolation, can also be achieved with other context variables. We already made changes in #4380 with the goal of making this modification unnecessary; specifically, we introduced the PrivilegeEvaluationContext class, which also holds the effective roles.
If we can achieve a User object which is guaranteed to stay constant after initialization, i.e., which is immutable, we gain capabilities which are very helpful for optimizing serialization performance:
- An immutable object means that the binary data created by serialization does not change either. That means the serialized data can be computed once per user object and then re-used again and again. The existing authentication cache even allows us to re-use the serialized data across different requests by the same user.
- Additionally, a cache similar to the cache maintained by BackendRegistry can be utilized to also cut down the number of de-serialization operations to a single one per user per node. Such a cache would map the serialized binary data to the actual user object.
- Immutable objects are inherently thread-safe. That makes any synchronization or locking mechanisms on the user object unnecessary.
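The first two points can be sketched as follows. This is only an illustration under assumptions: ImmutableUser and UserHeaderCache are hypothetical names, Java serialization with Base64 stands in for whatever wire format is actually used, and the counters exist purely to show how often real work happens.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Base64;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical immutable user; field and method names are illustrative only.
final class ImmutableUser implements Serializable {
    private static final long serialVersionUID = 1L;
    static final AtomicInteger SERIALIZATIONS = new AtomicInteger();

    private final String name;
    private final Set<String> roles;
    // Cached header value; transient, so it is not part of the serialized form.
    private transient volatile String serialized;

    ImmutableUser(String name, Set<String> roles) {
        this.name = name;
        this.roles = Collections.unmodifiableSet(new HashSet<>(roles)); // defensive copy
    }

    String name() { return name; }

    Set<String> roles() { return roles; }

    // Serializes on first use and re-uses the result afterwards.
    // Under contention, two threads may both compute the value once.
    String toHeader() {
        String cached = serialized;
        if (cached == null) {
            SERIALIZATIONS.incrementAndGet();
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(this);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            cached = Base64.getEncoder().encodeToString(bytes.toByteArray());
            serialized = cached;
        }
        return cached;
    }
}

// Hypothetical receiving-side cache: maps header strings to de-serialized users.
final class UserHeaderCache {
    final AtomicInteger deserializations = new AtomicInteger();
    private final ConcurrentHashMap<String, ImmutableUser> cache = new ConcurrentHashMap<>();

    ImmutableUser fromHeader(String header) {
        return cache.computeIfAbsent(header, h -> {
            deserializations.incrementAndGet();
            byte[] raw = Base64.getDecoder().decode(h);
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
                return (ImmutableUser) in.readObject();
            } catch (IOException | ClassNotFoundException e) {
                throw new IllegalStateException("Cannot de-serialize user header", e);
            }
        });
    }
}
```

The map in this sketch is unbounded to keep it short; a real cache would need a size limit or expiry, similar to the existing authentication cache.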
Necessary changes
Obviously, in order to make the User object immutable, all code which modifies the User object needs to be changed.
Besides the creation performed in BackendRegistry and the AuthenticationBackend implementations, this pertains especially to two further places.
AuthorizationBackend
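The relevant method of the AuthorizationBackend interface currently looks like this:
void fillRoles(User user, AuthCredentials credentials) throws OpenSearchSecurityException;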
Implementors of the AuthorizationBackend interface are supposed to call addRoles() on the supplied User object in order to provide the user roles they have found.
This will no longer be possible with immutable user objects. Thus, the AuthorizationBackend interface must be changed. There are two options to handle this:
- Pass a builder class for user objects to the AuthorizationBackend interface:
  void fillRoles(User.Builder userBuilder, AuthCredentials credentials) throws OpenSearchSecurityException;
- Keep the User parameter, but let fillRoles() create a new, modified User object and return it from the method:
  User fillRoles(User user, AuthCredentials credentials) throws OpenSearchSecurityException;
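To compare the two options side by side, here is a minimal sketch. All types are simplified, hypothetical stand-ins (SketchUser, BuilderStyleBackend, ReturnStyleBackend); the AuthCredentials parameter and the OpenSearchSecurityException are left out for brevity, and withRoles() and addRoles() on the builder are assumed helper methods, not existing API.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical immutable user; only the shape of the two options matters here.
final class SketchUser {
    private final String name;
    private final Set<String> roles;

    private SketchUser(String name, Set<String> roles) {
        this.name = name;
        this.roles = Collections.unmodifiableSet(new HashSet<>(roles));
    }

    static SketchUser of(String name) {
        return new SketchUser(name, Collections.emptySet());
    }

    String name() { return name; }

    Set<String> roles() { return roles; }

    // Helper for option 2: returns a new instance and leaves this one untouched.
    SketchUser withRoles(Set<String> additional) {
        Set<String> merged = new HashSet<>(roles);
        merged.addAll(additional);
        return new SketchUser(name, merged);
    }

    // Helper for option 1: a mutable builder that produces an immutable user.
    static final class Builder {
        private final String name;
        private final Set<String> roles = new HashSet<>();

        Builder(String name) { this.name = name; }

        Builder addRoles(Set<String> additional) {
            roles.addAll(additional);
            return this;
        }

        SketchUser build() { return new SketchUser(name, roles); }
    }
}

// Option 1: the backend writes the roles it has found into a builder.
interface BuilderStyleBackend {
    void fillRoles(SketchUser.Builder userBuilder);
}

// Option 2: the backend returns a new user object carrying the additional roles.
interface ReturnStyleBackend {
    SketchUser fillRoles(SketchUser user);
}
```

With option 1, several backends can contribute to one builder and only a single User object is created at the end. With option 2, each backend call allocates a new User object, but the builder type does not have to become part of the interface.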
PrivilegesEvaluator
As described above, the adding of effective roles needs to be removed from PrivilegesEvaluator.
The same information is already made available via PrivilegeEvaluationContext; it is just necessary to change all usages of this information to the new class.
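The idea of reading the effective roles from a per-request context instead of the user object can be sketched like this. EvaluationContextSketch is a hypothetical class, not the plugin's PrivilegeEvaluationContext, and the ${user.roles} placeholder with its comma-separated expansion is only an illustrative format.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical per-request context that carries the effective roles.
final class EvaluationContextSketch {
    private final String userName;
    private final Set<String> effectiveRoles;

    EvaluationContextSketch(String userName, Set<String> effectiveRoles) {
        this.userName = userName;
        this.effectiveRoles = Collections.unmodifiableSet(new HashSet<>(effectiveRoles));
    }

    String userName() { return userName; }

    Set<String> effectiveRoles() { return effectiveRoles; }

    // Attribute interpolation based on the context; the user object is never modified.
    // Roles are sorted so that the result does not depend on set iteration order.
    String interpolate(String template) {
        return template.replace("${user.roles}", String.join(",", new TreeSet<>(effectiveRoles)));
    }
}
```

Since the context is created per request, the effective roles can differ between requests while the cached User object stays untouched.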
Additional Benefits
It is generally agreed that the use of immutable objects contributes to high-quality, robust software, especially in multi-threaded environments. The security plugin has struggled with concurrent modification issues on the User object in the past: #4642, #2263. Thus, the change will contribute to the general robustness of the security plugin.