[Bug-Candidate]: Source mapping indexes exceed source length when special characters are present in Solidity files #2692

Open
nivcertora opened this issue Mar 26, 2025 · 6 comments
Labels
bug-candidate Bugs reports that are not yet confirmed

Comments

@nivcertora

Describe the issue:

We are encountering an issue with Slither’s source mapping when analyzing Solidity files that include certain special Unicode characters. In our use case, the source mapping index returned for some functions exceeds the length of the source file. For example, we observed that for a file with a total length of 54,863 characters, Slither reported an internal function with a start index of 55,053.

After investigation, we suspect that the presence of characters such as:

+.*•´.*:˚.°*.˚  
≈  
½

may be causing encoding or processing issues within Slither (or its underlying CryticCompile component), leading to miscalculation of character positions.

Steps to Reproduce:

  1. Create a Solidity file (e.g., Test.sol) that includes a library or contract containing these special characters in comments or string literals.
  2. Run Slither (via CryticCompile) on the file.
  3. Observe that the source mapping for at least one function returns a start index greater than the total file length.

Code example to reproduce the issue:

(https://github.com/Vectorized/solady/blob/main/src/utils/FixedPointMathLib.sol)

Version:

0.11.0

Relevant log output:

nivcertora added the bug-candidate label Mar 26, 2025
@elopez
Member

elopez commented Mar 26, 2025

Hi @nivcertora! Thanks for the report. If you have some time, could you check whether the changes in #2662 improve the situation?

From my understanding, the issue stems from the fact that the offsets are in bytes, not "characters" in the string sense. The two diverge as soon as the source code contains multi-byte characters.
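To make the byte-versus-character difference concrete, here is a minimal sketch in plain Python. It does not use Slither at all; the sample source string is made up for illustration and only borrows a few of the characters quoted in the report, and it assumes the file is encoded as UTF-8.

```python
# Hypothetical two-line source: a comment with multi-byte characters,
# followed by an ASCII-only function declaration.
source = "// ≈ ½ •\nfunction f() internal {}\n"
encoded = source.encode("utf-8")

char_len = len(source)    # length in Unicode code points
byte_len = len(encoded)   # length in UTF-8 bytes

# Position of `function`, measured both ways.
char_offset = source.index("function")
byte_offset = encoded.index(b"function")

print(char_offset)  # 9
print(byte_offset)  # 14
print(byte_len - char_len)  # 5 extra bytes from '≈' (3), '½' (2) and '•' (3)
```

Every multi-byte character before a given position pushes the byte offset further ahead of the character offset, which is consistent with a reported start index that is larger than the character length of the file.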

@nivcertora
Author

Thanks for the quick response. Is there an easy way to install the version with the changes?

@elopez
Member

elopez commented Mar 26, 2025

You should be able to install it from the PR branch with:

pip install https://github.com/crytic/slither/archive/refs/heads/fix-unicode-src-mappings.zip

@bohendo
Contributor

bohendo commented Mar 26, 2025

Did you notice if there's a specific detector that's reporting misaligned source maps? Or are you writing a Python script that checks the Source objects directly?

@nivcertora
Author

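I wrote a Python script, and it looks like there is still an offset issue.
Here, I compare the function offsets found by a regex-based search with the offsets from the source mapping.

[Image: screenshot comparing the regex-based offsets with the source mapping offsets]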

@bohendo
Contributor

bohendo commented Mar 26, 2025

Could you use function.source_mapping.content? This handles the encoding and translates the mapping values to char positions in the source code. If you're trying to map manually, be aware that index values such as source_mapping.start are byte offsets, not char offsets, so you'd need to do something like this
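The snippet that accompanied this comment is not preserved in the text above. As a stand-in, and not as the commenter's original code, a manual conversion could look roughly like the sketch below. The helper names `byte_to_char_offset` and `slice_by_byte_range` are hypothetical, the sample source and offsets are made up, and the sketch assumes a UTF-8 encoded file and a byte offset that falls on a character boundary (otherwise `decode` raises `UnicodeDecodeError`).

```python
def byte_to_char_offset(source_bytes: bytes, byte_offset: int) -> int:
    """Translate a UTF-8 byte offset into a character (code point) offset."""
    # Decode only the prefix and count how many characters it contains.
    return len(source_bytes[:byte_offset].decode("utf-8"))


def slice_by_byte_range(source_bytes: bytes, start: int, length: int) -> str:
    """Return the text covered by a (start, length) pair given in bytes."""
    return source_bytes[start:start + length].decode("utf-8")


# Hypothetical source; in practice the bytes would come from open(path, "rb").read().
source_text = "// ≈ ½ •\nfunction f() internal {}\n"
source_bytes = source_text.encode("utf-8")

# Stand-ins for source_mapping.start and source_mapping.length (byte-based values).
start, length = 14, 8

char_start = byte_to_char_offset(source_bytes, start)
print(char_start)                                        # 9
print(slice_by_byte_range(source_bytes, start, length))  # function
```

Slicing the raw bytes and decoding afterwards avoids the conversion entirely when only the covered text is needed; converting to a character offset is useful when the result has to be compared against positions found in a decoded string, such as regex match positions.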
