Open
Description
This is a tracking task to list all the work needed to solve one outstanding issue with snakebite. When RPC encryption is enabled for HDFS, the following happens:
- snakebite contacts the HDFS namenode via Hadoop RPC, negotiating the encryption settings using GSS-API via SASL. It needs to retrieve the list of blocks to read/write and the related datanodes to talk to. This part works fine.
- snakebite then has to contact every HDFS datanode, using a specific RPC protocol that is not Hadoop RPC. Authentication is done with DIGEST-MD5 via SASL, which also allows setting the encryption level if needed (to then allow the negotiation of AES encryption). This part currently doesn't work because the code it requires relies on SASL functionality that is not implemented in `pure-sasl` (namely DIGEST-MD5).
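
To give an idea of the missing piece, here is a minimal sketch of the client `response` computation that DIGEST-MD5 defines in RFC 2831, using only the standard library. It is not snakebite or pure-sasl code: the function name `digest_md5_response`, the `PROTECTION_TO_QOP` table and all argument values are hypothetical, and a real implementation would also need challenge parsing plus the integrity/confidentiality layers for `auth-int` and `auth-conf`.

```python
import hashlib

# Hadoop protection levels and the SASL QOP value each one corresponds to.
PROTECTION_TO_QOP = {
    "authentication": "auth",
    "integrity": "auth-int",
    "privacy": "auth-conf",
}


def digest_md5_response(username, realm, password, nonce, cnonce,
                        digest_uri, qop="auth", nc=1):
    """Return the hex `response` value of RFC 2831 (no authzid).

    Hypothetical helper for illustration only.
    """
    nc_value = "%08x" % nc

    # A1 = H(username:realm:password) ":" nonce ":" cnonce
    # The inner hash is used as raw bytes, not as a hex string.
    inner = hashlib.md5(f"{username}:{realm}:{password}".encode("utf-8")).digest()
    a1 = inner + f":{nonce}:{cnonce}".encode("utf-8")

    # A2 gets a fixed 32-zero suffix when a security layer is requested.
    a2 = f"AUTHENTICATE:{digest_uri}"
    if qop in ("auth-int", "auth-conf"):
        a2 += ":" + "0" * 32

    ha1 = hashlib.md5(a1).hexdigest()
    ha2 = hashlib.md5(a2.encode("utf-8")).hexdigest()

    # response = HEX(H(HEX(H(A1)) : nonce : nc : cnonce : qop : HEX(H(A2))))
    kd = f"{ha1}:{nonce}:{nc_value}:{cnonce}:{qop}:{ha2}"
    return hashlib.md5(kd.encode("utf-8")).hexdigest()
```

With placeholder values, `digest_md5_response("user", "realm", "secret", "server-nonce", "client-nonce", "hdfs/0", qop=PROTECTION_TO_QOP["privacy"])` returns a 32-character hex string. The credentials and digest URI that a datanode actually expects are not covered here.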
I opened an issue on `pure-sasl` (thobbs/pure-sasl#32), but some work would be needed to add the missing features.
The alternative would be to use `sasl` (https://github.com/cloudera/python-sasl), but unfortunately that library has not been maintained since 2016. There is a fork we could consider, which should support DIGEST-MD5 + GSS-API: cloudera/python-sasl#15 (comment)