
[Code scan] Validate downloaded files and reject unsuccessful HTTP responses #2448

Description

@njzjz

This issue is a Codex global repository scan finding for deepmodeling/reacnetgenerator at commit 4fa8e2b.

Fresh downloads are accepted without checking the HTTP status code or validating the downloaded bytes against the supplied SHA256. The SHA256 check only runs when a file already exists before the download.

Relevant code:

    if os.path.isfile(pathfilename) and (
        sha256 is None or checksha256(pathfilename, sha256)
    ):
        return pathfilename
    # from https://stackoverflow.com/questions/16694907
    for url in must_be_list(urls):
        logger.info(f"Try to download {pathfilename} from {url}")
        with s.get(url, stream=True) as r, open(pathfilename, "wb") as f:
            try:
                shutil.copyfileobj(r.raw, f)
                break

Line 493 opens the output file immediately, copies r.raw, and then breaks after the first response that does not raise during copyfileobj(). There is no r.raise_for_status() and no post-download checksha256() call before returning.

Impact:

  • a 404/500 HTML response can be saved as the trajectory and accepted
  • a corrupted or truncated file can pass through when sha256 was provided
  • fallback URLs are skipped after the first copied response, even if that response is an HTTP error page
  • later analysis may fail with confusing parser errors or, worse, process the wrong data
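
The first and third points can be reproduced without network access. The sketch below is illustrative only: `StubResponse` and `copy_without_checks` are hypothetical names, not part of reacnetgenerator, and the stub only imitates the parts of a streamed response that the reported loop touches (`raw` and the context-manager protocol).

```python
import io
import os
import shutil
import tempfile


class StubResponse:
    """Hypothetical stand-in for a streamed HTTP response."""

    def __init__(self, status_code, body):
        self.status_code = status_code
        self.raw = io.BytesIO(body)

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        return False


def copy_without_checks(responses, pathfilename):
    """Mimic the reported loop: copy the first response body, then stop."""
    for r in responses:
        with r, open(pathfilename, "wb") as f:
            shutil.copyfileobj(r.raw, f)
            break
    return pathfilename


path = os.path.join(tempfile.mkdtemp(), "trajectory.dump")
responses = [
    StubResponse(404, b"<html>Not Found</html>"),  # error page
    StubResponse(200, b"real trajectory bytes"),   # fallback that is never tried
]
copy_without_checks(responses, path)
with open(path, "rb") as f:
    saved = f.read()
print(saved)  # the 404 body, not the trajectory
```

Because nothing inspects `status_code`, the error page is written to disk and the fallback response is never consulted.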

Suggested fix:

Call raise_for_status() before writing or before accepting the response, catch request errors around the full request/write path, and validate sha256 after every fresh download when a hash is provided. If validation fails, delete the bad file and try the next URL before raising.
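
A minimal sketch of that flow, under stated assumptions: `download_validated` and `sha256_matches` are hypothetical names (the latter stands in for the project's `checksha256`, whose exact behavior is not shown in this issue), and `session` is assumed to behave like a `requests.Session`. The broad `except Exception` is a deliberate simplification; a real patch would probably narrow it to the request, transport, and file-system errors it expects.

```python
import hashlib
import logging
import os
import shutil

logger = logging.getLogger(__name__)


def sha256_matches(path, expected):
    """Return True if the file at `path` hashes to `expected` (hex digest)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected.lower()


def download_validated(session, urls, pathfilename, sha256=None):
    """Try each URL in turn; accept only a successful, hash-verified download."""
    if os.path.isfile(pathfilename) and (
        sha256 is None or sha256_matches(pathfilename, sha256)
    ):
        return pathfilename
    errors = []
    for url in urls:
        logger.info("Try to download %s from %s", pathfilename, url)
        try:
            with session.get(url, stream=True) as r:
                r.raise_for_status()  # reject 4xx/5xx before opening the file
                with open(pathfilename, "wb") as f:
                    shutil.copyfileobj(r.raw, f)
            # Validate every fresh download, not only pre-existing files.
            if sha256 is not None and not sha256_matches(pathfilename, sha256):
                raise ValueError(f"SHA256 mismatch for {url}")
            return pathfilename
        except Exception as e:  # broad on purpose in this sketch
            errors.append(f"{url}: {e}")
            # Remove the partial or invalid file, then fall through to the next URL.
            if os.path.isfile(pathfilename):
                os.remove(pathfilename)
    raise RuntimeError("All downloads failed: " + "; ".join(errors))
```

Opening the output file only after `raise_for_status()` succeeds means an HTTP error never creates or truncates the target, and the function raises only once every URL has been tried.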

Labels: bug, triage
