This issue is a Codex global repository scan finding for deepmodeling/reacnetgenerator at commit 4fa8e2b.
Fresh downloads are accepted without checking the HTTP status code or validating the downloaded bytes against the supplied SHA256. The SHA256 check only runs when a file already exists before the download.
Relevant code:

    if os.path.isfile(pathfilename) and (
        sha256 is None or checksha256(pathfilename, sha256)
    ):
        return pathfilename

    # from https://stackoverflow.com/questions/16694907
    for url in must_be_list(urls):
        logger.info(f"Try to download {pathfilename} from {url}")
        with s.get(url, stream=True) as r, open(pathfilename, "wb") as f:
            try:
                shutil.copyfileobj(r.raw, f)
                break

Line 493 opens the output file immediately, copies r.raw, and then breaks after the first response that does not raise during copyfileobj(). There is no r.raise_for_status() and no post-download checksha256() call before returning.
Impact:
- a 404/500 HTML response can be saved as the trajectory and accepted
- a corrupted or truncated file can pass through even when sha256 was provided
- fallback URLs are skipped after the first copied response, even if that response is an HTTP error page
- later analysis may fail with confusing parser errors or, worse, process the wrong data
Suggested fix:
Call raise_for_status() before writing or before accepting the response, catch request errors around the full request/write path, and validate sha256 after every fresh download when a hash is provided. If validation fails, delete the bad file and try the next URL before raising.
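
The suggested fix could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's actual patch: `download_verified` and `sha256_of` are hypothetical names, `session` is assumed to behave like a `requests.Session`, and writing to a temporary `.part` file is an extra design choice so that a rejected download never overwrites the target path.

```python
import hashlib
import os
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download_verified(session, urls, pathfilename, sha256=None):
    """Try each URL in turn; accept only a non-error response whose bytes match `sha256`.

    `session` is assumed to expose get(url, stream=True) like requests.Session.
    """
    errors = []
    tmp = pathfilename + ".part"  # assumption: a sibling temp file is acceptable
    for url in urls:
        try:
            with session.get(url, stream=True) as r:
                # Reject 4xx/5xx responses before any bytes are written.
                r.raise_for_status()
                with open(tmp, "wb") as f:
                    shutil.copyfileobj(r.raw, f)
            # Validate every fresh download when a hash was supplied.
            if sha256 is not None and sha256_of(tmp) != sha256.lower():
                raise ValueError(f"SHA256 mismatch for {url}")
            os.replace(tmp, pathfilename)
            return pathfilename
        except (OSError, ValueError) as e:
            # requests' RequestException subclasses OSError, so HTTP and
            # connection errors land here and the next URL is tried.
            errors.append(f"{url}: {e}")
            if os.path.exists(tmp):
                os.remove(tmp)
    raise RuntimeError("All downloads failed: " + "; ".join(errors))
```

With this shape, an error page or a hash mismatch from the first URL falls through to the next fallback URL, and the caller only sees an exception once every URL has been rejected.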
The code excerpt above is from reacnetgenerator/reacnetgenerator/utils.py, lines 485 to 496 in 4fa8e2b.