Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid UTF-8 output in build scripts silently prevents displaying any output on error #14787

Open
ChrisDenton opened this issue Nov 5, 2024 · 3 comments
Labels
A-build-scripts Area: build.rs scripts C-bug Category: bug S-needs-team-input Status: Needs input from team on whether/how to proceed.

Comments

@ChrisDenton
Copy link
Member

ChrisDenton commented Nov 5, 2024

Problem

Usually when a build script panics, cargo will display whatever was output to stdout and stderr. However, if a build script outputs invalid UTF-8 then the output is suppressed.

Steps

Create a new project with the following build.rs:

use std::io::{stdout, Write};

fn main() {
    // Print some nice UTF-8 output
    println!("Hello world!");

    // Print some bad non UTF-8 output
    let bad = b"\xff\n";
    stdout().write_all(bad).unwrap();

    // panic to indicate an error
    panic!("oh no!");
}

Use cargo build. Note the output is missing an --- stdout section:

$ cargo build
   Compiling output v0.1.0 (/home/chris/output)
error: failed to run custom build command for `output v0.1.0 (/home/chris/output)`

Caused by:
  process didn't exit successfully: `/home/chris/output/target/debug/build/output-4fb2dce3ec4b9a04/build-script-build` (exit status: 101)
  --- stderr
  thread 'main' panicked at build.rs:12:5:
  oh no!
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

If you amend the above build.rs to print bad to stderr then it'll also be missing the ---stderr section.

Possible Solution(s)

At the very least warn in this situation. A better fix would be to warn and display the output with the replacement character used in place of the bad bytes (or perhaps some more complex escaping).

Notes

The real world source of this report is rust-lang/cc-rs#1260. The best fix there would be for cc-rs to correctly re-encode the output of cl.exe as UTF-8 (if it isn't already). However, I think Cargo should do something here too.

Version

cargo 1.82.0 (8f40fc5 2024-08-21)
release: 1.82.0
commit-hash: 8f40fc5
commit-date: 2024-08-21
host: x86_64-unknown-linux-gnu
libgit2: 1.8.1 (sys:0.19.0 vendored)
libcurl: 8.9.0-DEV (sys:0.4.74+curl-8.9.0 vendored ssl:OpenSSL/1.1.1w)
ssl: OpenSSL 1.1.1w 11 Sep 2023
os: Ubuntu 22.4.0 (jammy) [64-bit]

@ChrisDenton ChrisDenton added C-bug Category: bug S-triage Status: This issue is waiting on initial triage. labels Nov 5, 2024
@epage epage added the A-build-scripts Area: build.rs scripts label Nov 6, 2024
@epage
Copy link
Contributor

epage commented Nov 6, 2024

I believe this is being done at

if let Some(out) = stdout {
match str::from_utf8(out) {
Ok(s) if !s.trim().is_empty() => {
desc.push_str("\n--- stdout\n");
desc.push_str(s);
}
Ok(..) | Err(..) => {}
}
}
if let Some(out) = stderr {
match str::from_utf8(out) {
Ok(s) if !s.trim().is_empty() => {
desc.push_str("\n--- stderr\n");
desc.push_str(s);
}
Ok(..) | Err(..) => {}
}
}

That is in contrast to how we get stdout/stderr in non-error cases:

for line in String::from_utf8_lossy(new_lines).lines() {

As this code is generic, a part of me worries about making a blanket assumption that the output is text and rendering it in a lossy manner. For example, we've been discussing binary formats for Cargo (https://rust-lang.zulipchat.com/#narrow/channel/246057-t-cargo/topic/.60cargo.20metadata.60.20performance/near/476523460). If we extend that to rustc, then we could have the rustc binary output rendered as lossy UTF-* which would be messy.

Maybe in cases like exec_with_streaming where they already assume the output is text, we can make the output lossy ahead of time so the error will only ever see it as UTF-8

@epage epage added S-needs-team-input Status: Needs input from team on whether/how to proceed. and removed S-triage Status: This issue is waiting on initial triage. labels Nov 6, 2024
@ChrisDenton
Copy link
Member Author

I would only note that cargo build -vv does print the offending line in the diagnostics.

@epage
Copy link
Contributor

epage commented Nov 7, 2024

Yes, this ties into my comment

That is in contrast to how we get stdout/stderr in non-error cases:

All of the build script non-error processing,. including --verbose handling, is done after that from_utf8_lossy call.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-build-scripts Area: build.rs scripts C-bug Category: bug S-needs-team-input Status: Needs input from team on whether/how to proceed.
Projects
None yet
Development

No branches or pull requests

2 participants