Member

@serhiy-storchaka serhiy-storchaka commented Nov 6, 2025

The "+" and "/" characters are no longer recognized as part of the Base64 alphabet in base64.urlsafe_b64decode(), or in base64.b64decode() when the altchars argument does not contain them.
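For context, here is a minimal sketch of the alphabets involved. It only exercises encoding and matching-alphabet decoding, which should behave the same before and after this change; the byte string `raw` is an illustrative value picked so that both special characters appear. Whether "+" and "/" are accepted when decoding with the alternative alphabet is exactly what this PR changes, so the sketch deliberately does not assert on that.

```python
import base64

raw = b"\xfb\xff\xbf"  # illustrative bytes whose encoding uses both special characters

standard = base64.b64encode(raw)                # standard alphabet: "+" and "/"
urlsafe = base64.urlsafe_b64encode(raw)         # URL-safe alphabet: "-" and "_"
custom = base64.b64encode(raw, altchars=b"-_")  # the same alphabet spelled via altchars

# Decoding with the matching alphabet round-trips the original bytes.
decoded_standard = base64.b64decode(standard)
decoded_urlsafe = base64.urlsafe_b64decode(urlsafe)
decoded_custom = base64.b64decode(custom, altchars=b"-_")

print(standard, urlsafe, custom)
```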

@sethmlarson
Contributor

@serhiy-storchaka Thanks for this. Can you link this PR to either the original issue or a new issue for tracking purposes?

Contributor

@sethmlarson sethmlarson left a comment

I'm really concerned about the subtle breakages that this change could cause, especially because the default behavior is to throw away characters that aren't in the current alphabet. If the default behavior was to raise an error I would feel better about this change.

Makes me wonder if we should target validate=True with this behavior change (because IMO, the silent dropping of invalid characters is in itself a concerning behavior) and then long-term move to having validate be enabled by default?
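The silent dropping mentioned above can be seen with a small sketch. The input `data` is an illustrative value: the valid text "YWJj" with a stray "!" that belongs to no Base64 alphabet. With the default validate=False the stray character is discarded; with validate=True it is rejected.

```python
import base64
import binascii

data = b"YW!Jj"  # "YWJj" encodes b"abc"; "!" is outside every Base64 alphabet

# Default (validate=False): the non-alphabet character is discarded silently.
lenient = base64.b64decode(data)

# validate=True: the same input is rejected with binascii.Error.
try:
    base64.b64decode(data, validate=True)
    strict_error = None
except binascii.Error as exc:
    strict_error = exc

print(lenient, strict_error)
```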

@serhiy-storchaka serhiy-storchaka changed the title from "gh-141061: Fix decoding with non-standard Base64 alphabet" to "gh-125346: Fix decoding with non-standard Base64 alphabet" Nov 6, 2025
Member Author

@serhiy-storchaka serhiy-storchaka left a comment

This worries me too. We can keep the old behavior but emit a warning if characters + or / occur in Base64 data with the alternative alphabet.

But urlsafe_b64decode() does not have the validate parameter.
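A rough sketch of the warning idea, written as a user-level wrapper rather than as a change to the standard library. The names `has_standard_only_chars` and `urlsafe_b64decode_warning` are hypothetical, and the choice of FutureWarning is an assumption; the thread does not say which warning category would be used.

```python
import base64
import warnings


def has_standard_only_chars(data: bytes) -> bool:
    """Return True if data contains "+" or "/", which the URL-safe alphabet lacks."""
    return b"+" in data or b"/" in data


def urlsafe_b64decode_warning(data: bytes) -> bytes:
    """Hypothetical wrapper: warn about "+" or "/", then decode as usual."""
    if has_standard_only_chars(data):
        warnings.warn(
            'found "+" or "/" in data for the URL-safe Base64 alphabet',
            FutureWarning,  # assumed category, not taken from the PR
            stacklevel=2,
        )
    # The decoding itself is left to the standard library, unchanged.
    return base64.urlsafe_b64decode(data)
```

Because the check is separate from the decoding, callers keep whatever result their Python version produces for such input and only gain a signal that the data mixes alphabets.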

@serhiy-storchaka serhiy-storchaka marked this pull request as draft November 7, 2025 08:20
@sethmlarson
Contributor

> But urlsafe_b64decode() does not have the validate parameter.

I wonder if for urlsafe_b64decode() it is okay to error out on bad characters as the function name is more clear that this is for a specific base64 alphabet?
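One way to picture "error out on bad characters" is a strict wrapper along these lines. It is only a sketch: `strict_urlsafe_b64decode` and the regular expression are hypothetical, not part of the base64 module, and the pattern assumes padding appears only as up to two trailing "=" characters.

```python
import base64
import re

# Hypothetical check: URL-safe alphabet characters followed by optional "=" padding.
_URLSAFE_RE = re.compile(rb"[A-Za-z0-9_-]*={0,2}")


def strict_urlsafe_b64decode(data: bytes) -> bytes:
    """Reject anything outside the URL-safe alphabet instead of dropping it."""
    if _URLSAFE_RE.fullmatch(data) is None:
        raise ValueError("data contains characters outside the URL-safe Base64 alphabet")
    return base64.urlsafe_b64decode(data)
```

With this wrapper, input such as b"+/+/" raises ValueError, while b"-_-_" decodes normally.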

@serhiy-storchaka
Member Author

But doesn't passing altchars to b64decode() also make it clear that it uses a different alphabet?

Labels: needs backport to 3.13 (bugs and security fixes), needs backport to 3.14 (bugs and security fixes)