Commit 31b8b4a

HADOOP-19336: S3A: Test failures after CSE support added (#7164)
Fix/disable tests which do not work or fail as expected when CSE-KMS is used.
Move documentation on encryption problems from troubleshooting.md to encryption.md, and extend that with some new stack traces + explanations.
Followup to HADOOP-18708: S3A: Support S3 Client Side Encryption(CSE).
Contributed by Syed Shameerur Rahman
1 parent cd2cffe commit 31b8b4a

8 files changed

+247
-177
lines changed

hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/EncryptionS3ClientFactory.java

+2
Original file line number | Diff line number | Diff line change
@@ -219,6 +219,8 @@ private Keyring createKmsKeyring(S3ClientCreationParameters parameters,
219219
return KmsKeyring.builder()
220220
.kmsClient(kmsClientBuilder.build())
221221
.wrappingKeyId(cseMaterials.getKmsKeyId())
222+
// this is required for backward compatibility with older encryption clients
223+
.enableLegacyWrappingAlgorithms(true)
222224
.build();
223225
}
224226

hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/encryption.md

+214-2
@@ -808,5 +808,217 @@ public class CustomKeyring implements Keyring {
808808

809809
## <a name="troubleshooting"></a> Troubleshooting Encryption
810810

811-
The [troubleshooting](./troubleshooting_s3a.html) document covers
812-
stack traces which may surface when working with encrypted data.
811+
This section covers stack traces which may surface when working with encrypted data.
812+
813+
### <a name="encryption"></a> S3 Server Side Encryption
814+
815+
#### `AWSS3IOException` `KMS.NotFoundException` "Invalid arn" when using SSE-KMS
816+
817+
When performing file operations, the user may run into an issue where the KMS
818+
key arn is invalid.
819+
820+
```
821+
org.apache.hadoop.fs.s3a.AWSS3IOException: innerMkdirs on /test:
822+
S3Exception:
823+
Invalid arn (Service: Amazon S3; Status Code: 400; Error Code: KMS.NotFoundException;
824+
Request ID: CA89F276B3394565),
825+
S3 Extended Request ID: ncz0LWn8zor1cUO2fQ7gc5eyqOk3YfyQLDn2OQNoe5Zj/GqDLggUYz9QY7JhdZHdBaDTh+TL5ZQ=:
826+
Invalid arn (Service: Amazon S3; Status Code: 400; Error Code: KMS.NotFoundException; Request ID: CA89F276B3394565)
827+
```
828+
829+
Possible causes:
830+
831+
* the KMS key ARN is entered incorrectly, or
832+
* the KMS key referenced by the ARN is in a different region than the S3 bucket
833+
being used.
834+
835+
#### Using SSE-C "Bad Request"
836+
837+
When performing file operations the user may run into an unexpected 400/403
838+
error such as
839+
```
840+
org.apache.hadoop.fs.s3a.AWSS3IOException: getFileStatus on fork-4/:
841+
S3Exception:
842+
Bad Request (Service: Amazon S3; Status Code: 400;
843+
Error Code: 400 Bad Request; Request ID: 42F9A1987CB49A99),
844+
S3 Extended Request ID: jU2kcwaXnWj5APB14Cgb1IKkc449gu2+dhIsW/+7x9J4D+VUkKvu78mBo03oh9jnOT2eoTLdECU=:
845+
Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: 42F9A1987CB49A99)
846+
```
847+
848+
This can happen when the correct SSE-C encryption key is not specified.
849+
Such cases include:
850+
1. An object is encrypted using SSE-C on S3 and either the wrong encryption type
851+
is used, no encryption is specified, or the SSE-C key specified is incorrect.
852+
2. A directory is encrypted with an SSE-C key (keyA) and the user is trying to move a
853+
file into that structure using a different configured SSE-C key (keyB).
854+
855+
### <a name="client-side-encryption"></a> S3 Client Side Encryption
856+
857+
#### java.lang.NoClassDefFoundError: software/amazon/encryption/s3/S3EncryptionClient
858+
859+
With the move to the V2 AWS SDK, CSE is implemented via
860+
[amazon-s3-encryption-client-java](https://github.com/aws/amazon-s3-encryption-client-java/tree/v3.1.1)
861+
which is not packaged in AWS SDK V2 bundle jar and needs to be added separately.
862+
863+
Fix: add amazon-s3-encryption-client-java jar version 3.1.1 to the class path.
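
Before enabling CSE, it can help to confirm that the encryption client is actually visible to the JVM. The following is a minimal sketch using only the JDK; the class `CseClasspathCheck` and its method `isClassAvailable` are hypothetical names for illustration and are not part of hadoop-aws. The class name being probed is taken from the error above.

```java
/** Hypothetical diagnostic helper; not part of hadoop-aws. */
final class CseClasspathCheck {

    /** Class named in the NoClassDefFoundError above, in dotted form. */
    static final String ENCRYPTION_CLIENT_CLASS =
        "software.amazon.encryption.s3.S3EncryptionClient";

    /** Returns true if the named class can be located by this class's loader. */
    static boolean isClassAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class.forName(className, false, CseClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(ENCRYPTION_CLIENT_CLASS + " available: "
            + isClassAvailable(ENCRYPTION_CLIENT_CLASS));
    }
}
```

Run it with the same class path as the failing application; a `false` result suggests the encryption client jar still needs to be added.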
864+
865+
#### Instruction file not found for S3 object
866+
867+
By default, reading an unencrypted file through a CSE-enabled client fails.
868+
```
869+
software.amazon.encryption.s3.S3EncryptionClientException: Instruction file not found!
870+
Please ensure the object you are attempting to decrypt has been encrypted
871+
using the S3 Encryption Client.
872+
```
873+
A CSE-enabled client is expected to read encrypted data only.
874+
875+
Fix: set `fs.s3a.encryption.cse.v1.compatibility.enabled=true`
876+
#### CSE-KMS method requires KMS key ID
877+
878+
A KMS key ID is required for CSE-KMS to encrypt data; not providing one leads
879+
to failure.
880+
881+
```
882+
2021-07-07 11:33:04,550 WARN fs.FileSystem: Failed to initialize filesystem
883+
s3a://ap-south-cse/: java.lang.IllegalArgumentException: CSE-KMS
884+
method requires KMS key ID. Use fs.s3a.encryption.key property to set it.
885+
-ls: CSE-KMS method requires KMS key ID. Use fs.s3a.encryption.key property to
886+
set it.
887+
```
888+
889+
Fix: set `fs.s3a.encryption.key=<KMS_KEY_ID>` to a key ID generated through the AWS console.
890+
891+
#### `software.amazon.awssdk.services.kms.model.IncorrectKeyException` The key ID in the request does not identify a CMK that can perform this operation.
892+
893+
The KMS key ID used to PUT (encrypt) the data must be the one used to GET the
894+
data.
895+
```
896+
cat: open s3a://ap-south-cse/encryptedData.txt at 0 on
897+
s3a://ap-south-cse/encryptedData.txt:
898+
software.amazon.awssdk.services.kms.model.IncorrectKeyException: The key ID in the
899+
request does not identify a CMK that can perform this operation. (Service: AWSKMS;
900+
Status Code: 400; ErrorCode: IncorrectKeyException;
901+
Request ID: da21aa8a-f00d-467c-94a0-32b627d32bc0; Proxy: null):IncorrectKeyException:
902+
The key ID in the request does not identify a CMK that can perform this
903+
operation. (Service: AWSKMS ; Status Code: 400; Error Code: IncorrectKeyException;
904+
Request ID: da21aa8a-f00d-467c-94a0-32b627d32bc0; Proxy: null)
905+
```
906+
Fix: download and read the data with the same KMS key ID that was used to upload it.
907+
908+
#### `software.amazon.awssdk.services.kms.model.NotFoundException` key/<KMS_KEY_ID> does not exist
909+
910+
Using a KMS key ID from a different region than the bucket used to store data
911+
would lead to failure while uploading.
912+
913+
```
914+
mkdir: PUT 0-byte object on testmkdir:
915+
software.amazon.awssdk.services.kms.model.NotFoundException: Key
916+
'arn:aws:kms:ap-south-1:152813717728:key/<KMS_KEY_ID>'
917+
does not exist (Service: AWSKMS; Status Code: 400; Error Code: NotFoundException;
918+
Request ID: 279db85d-864d-4a38-9acd-d892adb504c0; Proxy: null):NotFoundException:
919+
Key 'arn:aws:kms:ap-south-1:152813717728:key/<KMS_KEY_ID>'
920+
does not exist(Service: AWSKMS; Status Code: 400; Error Code: NotFoundException;
921+
Request ID: 279db85d-864d-4a38-9acd-d892adb504c0; Proxy: null)
922+
```
923+
If the S3 bucket region is different from the KMS key region,
924+
set `fs.s3a.encryption.cse.kms.region=<KMS_REGION>`
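
When the key is configured as a full ARN, the region mismatch can be spotted before any upload is attempted. The sketch below relies only on the standard ARN layout (`arn:partition:service:region:account-id:resource`); the class `KmsArnRegion` and its methods are hypothetical names for illustration, and the bucket region is assumed to be known by the caller.

```java
/** Hypothetical helper for comparing a KMS key ARN's region with a bucket region. */
final class KmsArnRegion {

    /** Returns the region field of a KMS ARN, or null if the value is not a KMS ARN. */
    static String regionOf(String keyId) {
        // ARN layout: arn:partition:service:region:account-id:resource
        String[] parts = keyId.split(":", 6);
        if (parts.length < 6 || !"arn".equals(parts[0]) || !"kms".equals(parts[2])) {
            return null;
        }
        return parts[3];
    }

    /** True when the key ARN names a region that differs from the bucket's region. */
    static boolean needsKmsRegionOption(String keyId, String bucketRegion) {
        String keyRegion = regionOf(keyId);
        return keyRegion != null && !keyRegion.equals(bucketRegion);
    }

    public static void main(String[] args) {
        // ARN shape taken from the stack trace above; the bucket region is an assumed example
        String arn = "arn:aws:kms:ap-south-1:152813717728:key/<KMS_KEY_ID>";
        System.out.println("key region: " + regionOf(arn));
        System.out.println("set fs.s3a.encryption.cse.kms.region: "
            + needsKmsRegionOption(arn, "us-east-1"));
    }
}
```

A bare key ID or alias carries no region, so `regionOf` returns `null` for those and the check cannot decide either way.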
925+
926+
#### `software.amazon.encryption.s3.S3EncryptionClientException: Service returned HTTP status code 400` (Service: Kms, Status Code: 400)
927+
928+
This exception may be raised if the Key Management Service (KMS) region is not specified, or does not match the region of the KMS key in use.
929+
930+
Fix: set `fs.s3a.encryption.cse.kms.region=<KMS_REGION>`
931+
932+
#### `software.amazon.encryption.s3.S3EncryptionClientException: Unable to execute HTTP request: Encountered fatal error in publisher: Unable to execute HTTP request: Encountered fatal error in publisher`
933+
934+
When a part upload fails (5xx status error) during a multi-part upload (MPU) with client-side encryption (CSE)
935+
enabled, the partial upload may be retried. Since retrying the multi-part upload is not supported in this encryption scenario,
936+
the entire job must be restarted.
937+
938+
#### `java.lang.ClassNotFoundException: software.amazon.encryption.*`
939+
940+
The S3 encryption jars are not bundled into the hadoop-aws jar by default. They need to be added
941+
separately to the class path. Currently, [amazon-s3-encryption-client-java v3.1.1](https://github.com/aws/amazon-s3-encryption-client-java/tree/v3.1.1) is used.
942+
943+
```
944+
software.amazon.encryption.s3.S3EncryptionClientException:
945+
Service returned HTTP status code 400 (Service: Kms, Status Code: 400,
946+
Request ID: XG6CGC5ZH1JQS34S, Extended Request ID: KIyVA/pmbUUGmiqcy/ueyx0iw5ifgpuJMcrs0b4lYYZsXxikuUM2nRCl2lFnya+1TqGCt6YxLnM=):null:
947+
Service returned HTTP status code 400 (Service: Kms, Status Code: 400, Request ID: XG6CGC5ZH1JQS34S, Extended Request ID: KIyVA/pmbUUGmiqcy/ueyx0iw5ifgpuJMcrs0b4lYYZsXxikuUM2nRCl2lFnya+1TqGCt6YxLnM=)
948+
```
949+
950+
Fix: set `fs.s3a.encryption.cse.kms.region=<KMS_REGION>`
951+
952+
953+
#### `software.amazon.awssdk.services.kms.model.InvalidKeyUsageException: You cannot generate a data key with an asymmetric CMK`
954+
955+
If you generated an asymmetric CMK from the AWS console, CSE-KMS won't be
955+
able to generate a unique data key for encryption.
957+
958+
```
959+
Caused by: software.amazon.awssdk.services.kms.model.InvalidKeyUsageException:
960+
You cannot generate a data key with an asymmetric CMK
961+
(Service: AWSKMS; Status Code: 400; Error Code: InvalidKeyUsageException; Request ID: 93609c15-e490-4035-8390-f4396f0d90bf; Proxy: null)
962+
```
963+
964+
Generate a Symmetric Key in the same region as your S3 storage for CSE-KMS to
965+
work.
966+
967+
#### software.amazon.awssdk.services.kms.model.NotFoundException: Invalid keyId
968+
969+
If the value of the `fs.s3a.encryption.key` property does not exist
969+
or is not valid in AWS KMS CMK (customer managed keys), this error is seen.
971+
972+
```
973+
Caused by: software.amazon.awssdk.services.kms.model.NotFoundException: Invalid keyId abc
974+
(Service: AWSKMS; Status Code: 400; Error Code: NotFoundException; Request ID:
975+
9d53552a-3d1b-47c8-984c-9a599d5c2391; Proxy: null)
976+
```
977+
978+
Check that `fs.s3a.encryption.key` is set correctly and matches the
978+
key ID shown in the AWS console.
980+
981+
#### software.amazon.awssdk.services.kms.model.KmsException: User: <User_ARN> is not authorized to perform: kms:GenerateDataKey on resource: <KEY_ID>
982+
983+
The user is not authorized to use the specified AWS KMS key ID.
984+
```
985+
Caused by: software.amazon.awssdk.services.kms.model.KmsException:
986+
User: arn:aws:iam::152813717728:user/<user> is not authorized to perform:
987+
kms:GenerateDataKey on resource: <key_ID>
988+
(Service: AWSKMS; Status Code: 400; Error Code: AccessDeniedException;
989+
Request ID: 4ded9f1f-b245-4213-87fc-16cba7a1c4b9; Proxy: null)
990+
```
991+
992+
The user trying to use the KMS key ID needs permission to encrypt and decrypt
993+
with the AWS KMS key configured in `fs.s3a.encryption.key`.
994+
If the permission is missing, add the user (or IAM role) in the "Key users" section after selecting the
995+
AWS KMS CMK in the AWS console.
996+
997+
#### `S3EncryptionClientException` "Encountered fatal error in publisher"
998+
999+
1000+
```
1001+
software.amazon.encryption.s3.S3EncryptionClientException:
1002+
Unable to execute HTTP request: Encountered fatal error in publisher:
1003+
Unable to execute HTTP request: Encountered fatal error in publisher
1004+
1005+
...
1006+
1007+
Caused by: java.lang.IllegalStateException: Must use either different key or iv for GCM encryption
1008+
at com.sun.crypto.provider.CipherCore.checkReinit(CipherCore.java:1088)
1009+
at com.sun.crypto.provider.CipherCore.update(CipherCore.java:662)
1010+
at com.sun.crypto.provider.AESCipher.engineUpdate(AESCipher.java:380)
1011+
at javax.crypto.Cipher.update(Cipher.java:1835)
1012+
at software.amazon.encryption.s3.internal.CipherSubscriber.onNext(CipherSubscriber.java:52)
1013+
at software.amazon.encryption.s3.internal.CipherSubscriber.onNext(CipherSubscriber.java:16)
1014+
at software.amazon.awssdk.utils.async.SimplePublisher.doProcessQueue(SimplePublisher.java:267)
1015+
at software.amazon.awssdk.utils.async.SimplePublisher.processEventQueue(SimplePublisher.java:224)
1016+
```
1017+
An upload of a single block of a large file/stream failed due to a transient failure of an S3 front end server.
1018+
1019+
For unencrypted uploads, this block is simply posted again; recovery is transparent.
1020+
However, the cipher used in CSE-KMS is unable to recover.
1021+
1022+
There is no fix for this other than the application itself completely regenerating the entire file/upload.
1023+
1024+
Please note that this is a very rare problem for applications running within AWS infrastructure.
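
The `IllegalStateException` in the trace above comes from the JDK cipher rather than from S3A. The following sketch reproduces that JDK behaviour in isolation: once an AES/GCM encryption has completed, the same `Cipher` instance refuses further data until it is re-initialized with a different key or IV. The class `GcmReuseDemo` is a hypothetical name, and the code illustrates the cipher restriction only; it is not the encryption client's retry path.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/** Hypothetical demo: replaying data through a finished AES/GCM cipher is rejected. */
final class GcmReuseDemo {

    /** Returns true if the cipher rejects a second pass with the same key and IV. */
    static boolean secondPassRejected() {
        try {
            KeyGenerator keyGen = KeyGenerator.getInstance("AES");
            keyGen.init(128);
            SecretKey key = keyGen.generateKey();
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);

            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            byte[] block = "one block of upload data".getBytes(StandardCharsets.UTF_8);

            // First pass: encrypt the block and finish the operation
            cipher.update(block);
            cipher.doFinal();

            try {
                // Second pass without re-initializing: same key and IV would be reused
                cipher.update(block);
                return false;
            } catch (IllegalStateException expected) {
                return true;
            }
        } catch (GeneralSecurityException e) {
            throw new AssertionError("unexpected JCE setup failure", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second pass rejected: " + secondPassRejected());
    }
}
```

This is why a block cannot simply be posted again under CSE: replaying it would require restarting the encryption with fresh cipher state.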

hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/troubleshooting_s3a.md

-158
@@ -1007,164 +1007,6 @@ of the destination of a rename is a directory -only that it is _not_ a file.
10071007
You can rename a directory or file deep under a file if you try -after which
10081008
there is no guarantee of the files being found in listings. Try not to do that.
10091009

1010-
## <a name="encryption"></a> S3 Server Side Encryption
1011-
1012-
### `AWSS3IOException` `KMS.NotFoundException` "Invalid arn" when using SSE-KMS
1013-
1014-
When performing file operations, the user may run into an issue where the KMS
1015-
key arn is invalid.
1016-
1017-
```
1018-
org.apache.hadoop.fs.s3a.AWSS3IOException: innerMkdirs on /test:
1019-
S3Exception:
1020-
Invalid arn (Service: Amazon S3; Status Code: 400; Error Code: KMS.NotFoundException;
1021-
Request ID: CA89F276B3394565),
1022-
S3 Extended Request ID: ncz0LWn8zor1cUO2fQ7gc5eyqOk3YfyQLDn2OQNoe5Zj/GqDLggUYz9QY7JhdZHdBaDTh+TL5ZQ=:
1023-
Invalid arn (Service: Amazon S3; Status Code: 400; Error Code: KMS.NotFoundException; Request ID: CA89F276B3394565)
1024-
```
1025-
1026-
Possible causes:
1027-
1028-
* the KMS key ARN is entered incorrectly, or
1029-
* the KMS key referenced by the ARN is in a different region than the S3 bucket
1030-
being used.
1031-
1032-
### Using SSE-C "Bad Request"
1033-
1034-
When performing file operations the user may run into an unexpected 400/403
1035-
error such as
1036-
```
1037-
org.apache.hadoop.fs.s3a.AWSS3IOException: getFileStatus on fork-4/:
1038-
S3Exception:
1039-
Bad Request (Service: Amazon S3; Status Code: 400;
1040-
Error Code: 400 Bad Request; Request ID: 42F9A1987CB49A99),
1041-
S3 Extended Request ID: jU2kcwaXnWj5APB14Cgb1IKkc449gu2+dhIsW/+7x9J4D+VUkKvu78mBo03oh9jnOT2eoTLdECU=:
1042-
Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: 42F9A1987CB49A99)
1043-
```
1044-
1045-
This can happen in the cases of not specifying the correct SSE-C encryption key.
1046-
Such cases can be as follows:
1047-
1. An object is encrypted using SSE-C on S3 and either the wrong encryption type
1048-
is used, no encryption is specified, or the SSE-C specified is incorrect.
1049-
2. A directory is encrypted with a SSE-C keyA and the user is trying to move a
1050-
file using configured SSE-C keyB into that structure.
1051-
1052-
## <a name="client-side-encryption"></a> S3 Client Side Encryption
1053-
1054-
### java.lang.NoClassDefFoundError: software/amazon/encryption/s3/S3EncryptionClient
1055-
1056-
With the move to the V2 AWS SDK, CSE is implemented via
1057-
[amazon-s3-encryption-client-java](https://github.com/aws/amazon-s3-encryption-client-java/tree/v3.1.1)
1058-
which is not packaged in AWS SDK V2 bundle jar and needs to be added separately.
1059-
1060-
Fix: add amazon-s3-encryption-client-java jar version 3.1.1 to the class path.
1061-
1062-
### Instruction file not found for S3 object
1063-
1064-
Reading an unencrypted file would fail when read through CSE enabled client by default.
1065-
```
1066-
software.amazon.encryption.s3.S3EncryptionClientException: Instruction file not found!
1067-
Please ensure the object you are attempting to decrypt has been encrypted
1068-
using the S3 Encryption Client.
1069-
```
1070-
CSE enabled client should read encrypted data only.
1071-
1072-
Fix: set `fs.s3a.encryption.cse.v1.compatibility.enabled=true`
1073-
### CSE-KMS method requires KMS key ID
1074-
1075-
KMS key ID is required for CSE-KMS to encrypt data, not providing one leads
1076-
to failure.
1077-
1078-
```
1079-
2021-07-07 11:33:04,550 WARN fs.FileSystem: Failed to initialize filesystem
1080-
s3a://ap-south-cse/: java.lang.IllegalArgumentException: CSE-KMS
1081-
method requires KMS key ID. Use fs.s3a.encryption.key property to set it.
1082-
-ls: CSE-KMS method requires KMS key ID. Use fs.s3a.encryption.key property to
1083-
set it.
1084-
```
1085-
1086-
set `fs.s3a.encryption.key=<KMS_KEY_ID>` generated through AWS console.
1087-
1088-
### `software.amazon.awssdk.services.kms.model.IncorrectKeyException` The key ID in the request does not identify a CMK that can perform this operation.
1089-
1090-
KMS key ID used to PUT(encrypt) the data, must be the one used to GET the
1091-
data.
1092-
```
1093-
cat: open s3a://ap-south-cse/encryptedData.txt at 0 on
1094-
s3a://ap-south-cse/encryptedData.txt:
1095-
software.amazon.awssdk.services.kms.model.IncorrectKeyException: The key ID in the
1096-
request does not identify a CMK that can perform this operation. (Service: AWSKMS;
1097-
Status Code: 400; ErrorCode: IncorrectKeyException;
1098-
Request ID: da21aa8a-f00d-467c-94a0-32b627d32bc0; Proxy: null):IncorrectKeyException:
1099-
The key ID in the request does not identify a CMK that can perform this
1100-
operation. (Service: AWSKMS ; Status Code: 400; Error Code: IncorrectKeyException;
1101-
Request ID: da21aa8a-f00d-467c-94a0-32b627d32bc0; Proxy: null)
1102-
```
1103-
Use the same KMS key ID used to upload data to download and read it as well.
1104-
1105-
### `software.amazon.awssdk.services.kms.model.NotFoundException` key/<KMS_KEY_ID> does not exist
1106-
1107-
Using a KMS key ID from a different region than the bucket used to store data
1108-
would lead to failure while uploading.
1109-
1110-
```
1111-
mkdir: PUT 0-byte object on testmkdir:
1112-
software.amazon.awssdk.services.kms.model.NotFoundException: Key
1113-
'arn:aws:kms:ap-south-1:152813717728:key/<KMS_KEY_ID>'
1114-
does not exist (Service: AWSKMS; Status Code: 400; Error Code: NotFoundException;
1115-
Request ID: 279db85d-864d-4a38-9acd-d892adb504c0; Proxy: null):NotFoundException:
1116-
Key 'arn:aws:kms:ap-south-1:152813717728:key/<KMS_KEY_ID>'
1117-
does not exist(Service: AWSKMS; Status Code: 400; Error Code: NotFoundException;
1118-
Request ID: 279db85d-864d-4a38-9acd-d892adb504c0; Proxy: null)
1119-
```
1120-
If S3 bucket region is different from the KMS key region,
1121-
set`fs.s3a.encryption.cse.kms.region=<KMS_REGION>`
1122-
1123-
### `software.amazon.awssdk.services.kms.mode.InvalidKeyUsageException: You cannot generate a data key with an asymmetric CMK`
1124-
1125-
If you generated an Asymmetric CMK from AWS console then CSE-KMS won't be
1126-
able to generate unique data key for encryption.
1127-
1128-
```
1129-
Caused by: software.amazon.awssdk.services.kms.mode.InvalidKeyUsageException:
1130-
You cannot generate a data key with an asymmetric CMK
1131-
(Service: AWSKMS; Status Code: 400; Error Code: InvalidKeyUsageException; Request ID: 93609c15-e490-4035-8390-f4396f0d90bf; Proxy: null)
1132-
```
1133-
1134-
Generate a Symmetric Key in the same region as your S3 storage for CSE-KMS to
1135-
work.
1136-
1137-
### software.amazon.awssdk.services.kms.mode.NotFoundException: Invalid keyId
1138-
1139-
If the value in `fs.s3a.encryption.key` property, does not exist
1140-
/valid in AWS KMS CMK(Customer managed keys), then this error would be seen.
1141-
1142-
```
1143-
Caused by: software.amazon.awssdk.services.kms.model.NotFoundException: Invalid keyId abc
1144-
(Service: AWSKMS; Status Code: 400; Error Code: NotFoundException; Request ID:
1145-
9d53552a-3d1b-47c8-984c-9a599d5c2391; Proxy: null)
1146-
```
1147-
1148-
Check if `fs.s3a.encryption.key` is set correctly and matches the
1149-
same on AWS console.
1150-
1151-
### software.amazon.awssdk.services.kms.model.KmsException: User: <User_ARN> is not authorized to perform : kms :GenerateDataKey on resource: <KEY_ID>
1152-
1153-
User doesn't have authorization to the specific AWS KMS Key ID.
1154-
```
1155-
Caused by: software.amazon.awssdk.services.kms.model.KmsException:
1156-
User: arn:aws:iam::152813717728:user/<user> is not authorized to perform:
1157-
kms:GenerateDataKey on resource: <key_ID>
1158-
(Service: AWSKMS; Status Code: 400; Error Code: AccessDeniedException;
1159-
Request ID: 4ded9f1f-b245-4213-87fc-16cba7a1c4b9; Proxy: null)
1160-
```
1161-
1162-
The user trying to use the KMS Key ID should have the right permissions to access
1163-
(encrypt/decrypt) using the AWS KMS Key used via `fs.s3a.encryption.key`.
1164-
If not, then add permission(or IAM role) in "Key users" section by selecting the
1165-
AWS-KMS CMK Key on AWS console.
1166-
1167-
11681010
### <a name="not_all_bytes_were_read"></a> Message appears in logs "Not all bytes were read from the S3ObjectInputStream"
11691011

11701012
