Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow to set non-UTF8 exception messages #126742

Open
picnixz opened this issue Nov 12, 2024 · 10 comments
Open

Allow to set non-UTF8 exception messages #126742

picnixz opened this issue Nov 12, 2024 · 10 comments
Assignees
Labels
3.12 bugs and security fixes 3.13 bugs and security fixes 3.14 new features, bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-C-API type-feature A feature request or enhancement

Comments

@picnixz
Copy link
Contributor

picnixz commented Nov 12, 2024

Feature or enhancement

Proposal:

This is a follow-up to the discussion in #126555 (comment).

dlerror() may return non-UTF-8 messages, possibly translated. We should be able to set exception messages according to the current locale. To that end, we'll expose some internal helper:

extern void 
_PyErr_SetLocaleStringTstate(PyThreadState *tstate, PyObject *exception, const char *string);

extern void
_PyErr_SetLocaleString(PyObject *exception, const char *string);

cc @ZeroIntensity @encukou

For now, both functions would be only declared as extern and not part of the public C API.

Linked PRs

@picnixz picnixz added type-feature A feature request or enhancement interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Nov 12, 2024
@picnixz picnixz self-assigned this Nov 12, 2024
@picnixz
Copy link
Contributor Author

picnixz commented Nov 12, 2024

Let's actually mark it as a bugfix since we'll need to backport it in order to use it later.

@picnixz picnixz added type-bug An unexpected behavior, bug, or error 3.12 bugs and security fixes 3.13 bugs and security fixes 3.14 new features, bugs and security fixes and removed type-feature A feature request or enhancement labels Nov 12, 2024
@ZeroIntensity
Copy link
Member

ZeroIntensity commented Nov 12, 2024

Hmm, so is PyErr_SetLocaleString supposed to be implicitly private? I'm not the biggest fan of that--if something is supposed to be private, then we need to prefix it with _, not only so users don't touch it, but so people reading the source don't mistake it as public. If we want a variation of it that takes a thread state, we could have something like _PyErr_SetLocaleStringTstate.

@picnixz
Copy link
Contributor Author

picnixz commented Nov 12, 2024

Hmm, so is PyErr_SetLocaleString supposed to be implicitly private

For now, but maybe we can directly expose to the C API? I want to be consistent with the naming of each function.

@picnixz
Copy link
Contributor Author

picnixz commented Nov 12, 2024

Nevermind, there is a precedent: _PyErr_FormatFromCauseTstate. Let's have
_PyErr_SetLocaleStringTstate and _PyErr_SetLocaleString and both private.

@encukou
Copy link
Member

encukou commented Nov 12, 2024

How many places in the code would use this helper? I don't see the advantage over calling PyUnicode_DecodeLocale and PyErr_SetObject.

@picnixz
Copy link
Contributor Author

picnixz commented Nov 12, 2024

There are a few occurrences of dlerror() in _ctypes.c and callproc.c and dynload_shlib as well.

NVM: the dynload_shlib uses its own logic.

@encukou
Copy link
Member

encukou commented Nov 12, 2024

errors.c uses PyUnicode_DecodeLocale (with surrogateescape) on strerror()'s result. That looks like a good precedent.

I don't think it's worth it to add a project-wide helper here. Calling the 2 functions should be fine.

@ZeroIntensity
Copy link
Member

My main arguments for adding a new internal function would be that a) it helps remove the risk of reference leaks (which in my experience, are all too common when dealing with string mutations) and b) makes it easier to deal with locale in the future (we have incorrect uses of dlerror() internally that skip the locale check).

@encukou
Copy link
Member

encukou commented Nov 13, 2024

As far as I know, this is not a bug on any supported platforms, and so can't be tested in our CI. It's also not been reported/patched on any other platform I know of.

I don't know if it should be called a “bug” (that's more a question about the definition of the term), but I don't think a fix for this issue should be backported. Let's just improve main.

As for the helper: IMO, it's better to first fix the code, and then, if it makes sense, refactor to add a helper. In clear-cut cases you'd combine the steps in one commit, but let's not do that here.

@ZeroIntensity ZeroIntensity added type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Nov 13, 2024
@ZeroIntensity
Copy link
Member

As far as I know, this is not a bug on any supported platforms, and so can't be tested in our CI. It's also not been reported/patched on any other platform I know of.

FWIW, I did find BPO-41894 when looking through the places where we use PyUnicode_DecodeLocale.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes 3.13 bugs and security fixes 3.14 new features, bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-C-API type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

3 participants