frontend: 404 when build_chroot is not found in database from task_id #3649

Open
wants to merge 2 commits into base: main

nikromen
Member

@nikromen nikromen commented Mar 1, 2025

Addresses concern raised in
#3457 (comment)

Providing a nonexistent chroot will result in NoResultFound too.
This change returns 404 instead of 500.

Fix #3457
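
A minimal sketch of the idea, assuming a view that calls a lookup function and turns a missing row into a 404. The `NoResultFound` class below is a local stand-in for the SQLAlchemy exception named above, and `get_build_task` and `lookup` are hypothetical names used only for illustration, not the frontend's actual code:

```python
class NoResultFound(Exception):
    """Local stand-in for SQLAlchemy's NoResultFound (assumption)."""


def get_build_task(lookup, build_id, chroot_name):
    """Return a (status_code, payload) pair for a build task lookup.

    `lookup` is a hypothetical callable that returns the build chroot
    or raises NoResultFound when no matching row exists.
    """
    try:
        build_chroot = lookup(build_id, chroot_name)
    except NoResultFound:
        # A missing row is reported as "not found" instead of bubbling
        # up as an unhandled exception (which would become a 500).
        return 404, {"msg": "Specified task ID not found"}
    return 200, build_chroot
```

The point of the sketch is only the exception-to-status mapping; how the real view builds its response is not shown here.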

github-actions bot commented Mar 1, 2025

Pull Request validation

Failed

🔴 Review - Missing review from a member (2 required)

Success

🟢 CI - All checks have passed

Addresses concern raised in
fedora-copr#3457 (comment)

Providing a nonexistent chroot will result in NoResultFound too.
This change returns 404 instead of 500.

Relates to fedora-copr#3457
@nikromen
Member Author

nikromen commented Mar 3, 2025

Tested on stg, and it now returns 404, as it should, instead of 500:

$ curl -w "Code: %{http_code}\n" -X GET https://copr.stg.fedoraproject.org/backend/get-build-task/8711675-fedora-43-x86_64
{"msg":"Specified task ID not found"}
Code: 404

Also, instead of a messy log, we now get this in the logs:

2025-03-03 17:55:51,010 [ERROR][/usr/share/copr/coprs_frontend/coprs/error_handlers.py:42|error_handlers:handle_error][ANON] Response error: 404 The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.

@FrostyX
Member

FrostyX commented Mar 6, 2025

Thank you for testing, @nikromen,
but I am still worried. Sorry, I couldn't point you to the exact issue at the meeting.

There are a lot of layers of abstraction on the backend, but IMHO we will hit this error handling in case something is wrong:

    if response.status_code >= 500:
        # Server error. Hopefully this is only temporary problem, we want
        # to re-try, and wait till the server works again.
        raise RequestRetryError(
            "Request server error on {}: {} {}".format(
                url, response.status_code, response.reason))
    if response.status_code >= 400:
        # Client error. The mistake is on our side, it doesn't make sense
        # to continue with retries.
        raise RequestError(
            "Request client error on {}: {} {}".format(
                url, response.status_code, response.reason))

And it behaves differently for 4xx and 5xx status codes. So I really wanted to make sure that the backend won't start doing anything crazy once it wants to build something and we now return a different status code.
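
To make the difference concrete, here is a rough sketch of how a caller built on that check might behave. The exception names come from the snippet above, but they are defined locally here as stand-ins; `check_status`, `fetch_with_retries`, and the `send` callable are hypothetical and do not reflect the backend's real retry code:

```python
import time


class RequestError(Exception):
    """Stand-in: client error (4xx), retrying makes no sense."""


class RequestRetryError(Exception):
    """Stand-in: server error (5xx), worth retrying."""


def check_status(url, status_code, reason):
    """Mirror the status check quoted above."""
    if status_code >= 500:
        raise RequestRetryError(
            "Request server error on {}: {} {}".format(url, status_code, reason))
    if status_code >= 400:
        raise RequestError(
            "Request client error on {}: {} {}".format(url, status_code, reason))


def fetch_with_retries(send, url, attempts=3, delay=0.0):
    """Call `send(url)` until it succeeds or fails for good.

    `send` is a hypothetical callable returning (status_code, reason).
    Only RequestRetryError (5xx) triggers another attempt; a
    RequestError (4xx) propagates immediately.
    """
    for attempt in range(1, attempts + 1):
        status_code, reason = send(url)
        try:
            check_status(url, status_code, reason)
        except RequestRetryError:
            if attempt == attempts:
                raise
            time.sleep(delay)
            continue
        return status_code
```

Under these assumptions, a task that answers 500 once and then 200 succeeds on the second attempt, while the same task answering 404 fails on the first attempt with no retry.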

@nikromen
Member Author

nikromen commented Mar 10, 2025

Ok, I can see your point now.

tl;dr: let's close this PR, since the current behaviour works; the only inconvenience is the rare occasion of a misleading HTTP code in the error logs

Honestly, the original error is a bit of a mystery to me then :D I originally thought the build https://copr.fedorainfracloud.org/coprs/petersen/haskell-language-server/build/6742605/ was happening during F39 branching and thus we got the database error (which should be addressed by #3225, I think)... but looking at Copr's branching history, F39 was branched on Aug 11, 2023, so it's some weird glitch :/

Anyway, yes, with the change in this PR, the task above would fail on the backend side, since it only succeeded after the second retry (see the logs of the f39-aarch build). And because BuildsLogic.get_by_build_id_and_name is used only for backend-related code right now, it really should return 5xx in this case...

My issue with this is that we use the BuildsLogic and CoprsLogic methods not only for backend-related code but, in general, everywhere else, even inside apiv3_ns when parsing users' input... and sometimes we return or parse a 4xx error, sometimes a 5xx. This makes it hard for a maintainer to know when the methods yield 4xx and when 5xx, and so whether they fit user-facing or backend-related tasks (e.g. adding or editing some functionality that uses these methods can result in different status codes than expected).

But to add more headache to this: due to the many layers of abstraction here, it would be hard to change the behaviour so that the backend receives 5xx when it should and the rest gets 4xx, without breaking anything else :D And since what we have now works fine and nobody complains (except for the confusing HTTP code in our logs), I don't want to touch it anymore :D

WDYT about not accepting this change and closing the issue, noting that we are fine with the current behaviour?

    try:
        build_chroot = BuildChrootsLogic.get_by_build_id_and_name(build_id, chroot_name)
Member

This is a valid code movement.

Member Author

true, that's maybe the only thing I would want to accept right now from this PR

    except DataError as ex:
        raise MalformedArgumentException(
            f"Invalid build_id: {build_id} or name: {name}"
        ) from ex
Member

I'm not sure where the DataError comes from. Do we really need this part at all?

Member Author

@nikromen nikromen Mar 11, 2025

reproducer: https://copr.fedorainfracloud.org/backend/get-build-task/6742s605-fedora-39-aarch64

but looking at the usage of this method... it's only used for the backend-related part, so it really isn't a big deal tbh and it can be dropped
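
For illustration only, one way to reject IDs like the one in the reproducer before they ever reach a query is to validate the task ID up front. This sketch assumes the task ID has the form `<build_id>-<chroot_name>`, as in the URLs above; `parse_task_id` is a hypothetical helper, and `MalformedArgumentException` is defined locally as a stand-in for the frontend exception of the same name:

```python
class MalformedArgumentException(Exception):
    """Local stand-in for the frontend exception of the same name."""


def parse_task_id(task_id):
    """Split '<build_id>-<chroot_name>' and validate the numeric part.

    Raises MalformedArgumentException when the build ID is not a plain
    decimal number or the chroot name is missing.
    """
    build_id, sep, chroot_name = task_id.partition("-")
    if not sep or not chroot_name or not build_id.isdecimal():
        raise MalformedArgumentException(
            f"Invalid task ID: {task_id}"
        )
    return int(build_id), chroot_name
```

With this check, `6742s605-fedora-39-aarch64` is rejected because `6742s605` is not a decimal number, while `8711675-fedora-43-x86_64` parses into an integer build ID and a chroot name.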

Development

Successfully merging this pull request may close these issues.

Error 500 instead of 404?
3 participants