tag_no_case panics on &str input when Unicode case-fold changes byte length (DoS)

### Description

`bytes::complete::tag_no_case` (and `bytes::streaming::tag_no_case`) panic on `&str` input when an input character matches the tag character case-insensitively but has a different UTF-8 byte length. An attacker who controls the parser input can crash a nom-based parser that uses `tag_no_case` with `&str` input and a tag containing such a letter (for example `k`).

### Reproduction (nom 7.1.3)

```rust
use nom::{bytes::complete::tag_no_case, error::Error, IResult};

fn main() {
    // KELVIN SIGN (U+212A, 3 bytes) case-folds to ASCII 'k' (1 byte)
    let input: &str = "\u{212A}xyz";   // 4 chars, 6 bytes
    let _: IResult<&str, &str, Error<&str>> = tag_no_case("k")(input);
    // PANIC: byte index 1 is not a char boundary; it is inside 'K' (bytes 0..3)

    // Not reached while the call above panics; run this case on its own.
    // OHM SIGN (U+2126, 3 bytes) lowercases to 'ω' (U+03C9, 2 bytes)
    let input2: &str = "\u{2126}xyz";
    let _: IResult<&str, &str, Error<&str>> = tag_no_case("ω")(input2);
    // PANIC: byte index 2 is not a char boundary; it is inside 'Ω' (bytes 0..3)
}
```

### Root cause

The `Compare<&str>` implementation for `&str` (`compare_no_case()`, `src/traits.rs:845`) compares the input and the tag char by char using `to_lowercase()`. Once it reports a match, `tag_no_case` (`src/bytes/complete.rs:85`) slices the input by the **byte length of the *tag***, not by the byte length of the matched prefix of the input:

```rust
let tag_len = tag.input_len();     // byte length of the LITERAL tag
…
CompareResult::Ok => Ok(i.take_split(tag_len)),
```

When the matched character in the input has more bytes than the tag character it case-folded to, `tag_len` lands inside a multi-byte UTF-8 character and `split_at` panics.
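
The boundary mismatch can be checked with the standard library alone. The sketch below does not call nom; `split_is_safe` and `prefix_byte_len` are hypothetical helper names introduced here for illustration. The first reports whether slicing at the tag's byte length would land on a char boundary (the condition `str::split_at` panics on), the second computes the byte length of the input prefix that was actually matched.

```rust
/// Hypothetical helper, not part of nom: reports whether slicing `input`
/// at the byte length of `tag` lands on a UTF-8 char boundary.
/// `str::split_at` panics exactly when this is false.
fn split_is_safe(input: &str, tag: &str) -> bool {
    input.is_char_boundary(tag.len())
}

/// Hypothetical helper: byte length of the first `n` chars of `input`,
/// i.e. the length that is safe to slice by after matching `n` chars.
fn prefix_byte_len(input: &str, n: usize) -> usize {
    input.chars().take(n).map(char::len_utf8).sum()
}
```

For the Kelvin sign input, `split_is_safe("\u{212A}xyz", "k")` is `false`, while `prefix_byte_len("\u{212A}xyz", 1)` gives the 3 bytes that the matched prefix really occupies.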

### Property that fails

```rust
use proptest::prelude::*;
use nom::{bytes::complete::tag_no_case, error::Error, IResult};

proptest! {
    #[test]
    fn tag_no_case_should_never_panic(tag in "[a-zA-Z]{1,5}", input in ".*") {
        // tag_no_case must either return Ok or Err, never panic
        let _result: IResult<&str, &str, Error<&str>> =
            tag_no_case::<_, _, Error<&str>>(tag.as_str())(input.as_str());
    }
}
// Shrinks to tag="k", input="\u{212A}"
```

### Threat model

Any nom parser using `tag_no_case` on `&str` and exposed to untrusted input is vulnerable to denial-of-service. Examples: HTTP/SMTP header parsers, config-file parsers, URL/email validators, query parsers.

The attacker only needs to place `U+212A` (Kelvin sign), `U+2126` (Ohm sign), or another character whose lowercase form has a different UTF-8 byte length at a position where the parser applies `tag_no_case` with the matching tag. The crash is a plain Rust panic: unless the caller or its framework catches it with `std::panic::catch_unwind`, it terminates the current thread, and it aborts the whole process in builds that use `panic = "abort"`.
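
Until a fix is available, a caller could contain the panic at the parsing boundary. This is a minimal sketch of that idea, not a recommended long-term design: `parse_guarded` and `ParserPanicked` are hypothetical names, the wrapper works for any closure rather than for nom specifically, and it has no effect in builds that use `panic = "abort"`.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical marker error returned when the wrapped parser panicked.
#[derive(Debug, PartialEq)]
struct ParserPanicked;

/// Hypothetical wrapper: runs `parse` on `input` and converts a panic into
/// `Err(ParserPanicked)` instead of unwinding into the caller.
/// `AssertUnwindSafe` is used because the closure only reads `input`.
fn parse_guarded<'a, T>(
    input: &'a str,
    parse: impl Fn(&'a str) -> T,
) -> Result<T, ParserPanicked> {
    panic::catch_unwind(AssertUnwindSafe(|| parse(input))).map_err(|_| ParserPanicked)
}
```

A nom parser would be passed as the closure, for example `parse_guarded(input, |i| my_parser(i))`, where `my_parser` stands for the caller's own parser function. The default panic hook still prints the panic message to stderr.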

### Suggested fix

After confirming a case-insensitive match, slice the input by the **input** byte length of the matched prefix, not the tag's byte length. Concretely, in `bytes/complete.rs::tag_no_case`, derive the slice length from iterating the input's chars and summing `c.len_utf8()` for as many chars as the tag has, not from `tag.input_len()`.

Equivalent fix at the `Compare` trait level: have `compare_no_case` return the matched *input prefix length* alongside `CompareResult::Ok`.
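
A minimal, std-only sketch of the length computation described above. It is not a patch against nom: `matched_prefix_len` and `split_no_case` are hypothetical names, and the comparison mirrors the char-by-char `to_lowercase()` check described under Root cause.

```rust
/// Hypothetical helper: if `input` starts with `tag` under a char-by-char
/// `to_lowercase()` comparison, returns the byte length of the matched
/// prefix *of the input*; otherwise returns `None`.
fn matched_prefix_len(input: &str, tag: &str) -> Option<usize> {
    let mut input_chars = input.chars();
    let mut len = 0;
    for t in tag.chars() {
        // `?` returns None when the input ends before the tag does.
        let c = input_chars.next()?;
        if !c.to_lowercase().eq(t.to_lowercase()) {
            return None;
        }
        // Count the bytes of the input char, not of the tag char.
        len += c.len_utf8();
    }
    Some(len)
}

/// Hypothetical helper: splits `input` into `(rest, matched)`, the same
/// order nom uses, slicing by the input prefix length so the split always
/// lands on a char boundary.
fn split_no_case<'a>(input: &'a str, tag: &str) -> Option<(&'a str, &'a str)> {
    let len = matched_prefix_len(input, tag)?;
    let (matched, rest) = input.split_at(len);
    Some((rest, matched))
}
```

Because `len` is a sum of `len_utf8()` values for whole input chars, `split_at(len)` cannot land inside a multi-byte character. With the inputs from the reproduction, `split_no_case("\u{212A}xyz", "k")` yields the Kelvin sign as the matched part and `"xyz"` as the rest.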

### Environment

- nom: 7.1.3
- Rust: 1.80+

### Other affected codepoints

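Because the comparison goes through `to_lowercase()` rather than full Unicode case folding, the trigger is an input character whose single-character lowercase form equals the lowercase form of the tag character and has a different UTF-8 length, for example `U+212B` (Å → å, 3 → 2) and `U+1E9E` (ẞ → ß, 3 → 2). Characters that only change under case folding, or that lowercase to several characters, should not match a shorter tag this way: `U+017F` (ſ) and `U+1FBE` (ι) are already lowercase, and `U+0130` (İ) lowercases to the two-character sequence `i̇`.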
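
Candidate codepoints can be enumerated with a std-only scan, assuming the match is decided by `char::to_lowercase` as described under Root cause. The helper name `length_changing_lowercase` is hypothetical, and the exact list depends on the Unicode tables shipped with the Rust toolchain. The scan only reports characters that lowercase to exactly one different character of a different UTF-8 length.

```rust
/// Hypothetical helper: returns `(c, lower)` pairs where `c` lowercases to
/// exactly one char `lower`, `lower != c`, and the two differ in UTF-8 length.
fn length_changing_lowercase() -> Vec<(char, char)> {
    ('\u{0}'..=char::MAX)
        .filter_map(|c| {
            let mut lower = c.to_lowercase();
            // Keep only single-char lowercase mappings.
            match (lower.next(), lower.next()) {
                (Some(l), None) if l != c && l.len_utf8() != c.len_utf8() => Some((c, l)),
                _ => None,
            }
        })
        .collect()
}
```

Each reported pair is a potential trigger when `lower` (or another character with the same lowercase form) appears in the tag and `c` appears in the input, or the other way round.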

