Save/load local file from code editor corrupts non-ASCII characters

I discovered today that non-ASCII UTF-8 characters do not survive a round trip through the IDE's save-to-file and load-from-file functions; they are silently and irreversibly corrupted. Although Espruino itself may not have full Unicode support, I think Unicode in code comments at least ought to be considered legitimate, and the IDE will mangle those comments irreparably if they include characters from other scripts or languages.

Specifically, in Firefox, both save and load corrupt the characters. In Chrome/Chromium, UTF-8 files appear to load correctly, but saving corrupts them.

For Chrome/Chromium, I believe https://github.com/ticalc-travis/EspruinoWebIDE/commit/af961746e0ca6aca603c85637665c5d20efeaac9 would fix the save issue (disclaimer: only tested briefly). The problem is that the save code converts the text to a Blob by instantiating a Uint8Array and writing one element per character in a loop using `String.charCodeAt`, which returns values above 255 for many non-ASCII characters. Storing those in a Uint8Array keeps only the low 8 bits and so mangles them. Passing the string itself directly to the Blob constructor should preserve the text, since Blob encodes strings as UTF-8.
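To make the failure concrete, here is a minimal sketch of the two approaches. It is not the IDE's actual code: `bytesViaCharCodeAt` and the sample string are hypothetical names I made up to mirror the loop described above, and the snippet assumes an environment where `Blob` and `TextEncoder` are available as globals (current browsers, recent Node.js).

```javascript
// Hypothetical helper mirroring the per-character loop described above.
function bytesViaCharCodeAt(text) {
  const bytes = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) {
    // charCodeAt returns a UTF-16 code unit (0..65535); a Uint8Array
    // element keeps only the low 8 bits of whatever is stored in it.
    bytes[i] = text.charCodeAt(i);
  }
  return bytes;
}

const sample = "// π ≈ 3.14";

// Lossy path: one byte per character, high bits discarded.
const lossy = bytesViaCharCodeAt(sample);

// Intended result: the UTF-8 encoding of the same text.
const utf8 = new TextEncoder().encode(sample);

// Passing the string straight to Blob also encodes it as UTF-8.
const blob = new Blob([sample], { type: "text/plain;charset=utf-8" });

console.log(lossy.length, utf8.length, blob.size);
```

In the lossy array, "π" (U+03C0) is stored as the single byte 0xC0 and "≈" (U+2248) as 0x48, which is the ASCII letter "H", so the original characters cannot be recovered from the saved file.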

For Firefox/non-Chrome, there is a separate code path for save/load which, as far as I can tell, calls into code in the EspruinoTools repo, specifically [fileOpenDialog](https://github.com/espruino/EspruinoTools/blob/5f316eb953b3647bc112d04ea18891324a0f6867/core/utils.js#L604) and [fileSaveDialog](https://github.com/espruino/EspruinoTools/blob/5f316eb953b3647bc112d04ea18891324a0f6867/core/utils.js#L666). This code has the same issue described above, but a comment in fileOpenDialog suggests it is deliberately avoiding UTF-8 interpretation. I'm not sure why, as I can't imagine how arbitrarily clipping Unicode code points to a 0…255 range would be useful. Am I missing some context? At any rate, since that code lives in a separate repository, I'm not sure how to live-test changes to it with the IDE.
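For the load direction, this is roughly what I would expect to happen when file bytes are turned into a string without UTF-8 interpretation. It is only a sketch under that assumption and not the actual EspruinoTools code; `decodeBytewise` is a hypothetical helper, and the byte array stands in for the contents of a UTF-8 file on disk.

```javascript
// Stand-in for the raw bytes of a UTF-8 encoded source file.
const fileBytes = new TextEncoder().encode("// café");

// Hypothetical byte-per-character decode: each byte becomes one
// character, so multi-byte UTF-8 sequences are split apart.
function decodeBytewise(bytes) {
  let out = "";
  for (const b of bytes) {
    out += String.fromCharCode(b);
  }
  return out;
}

const bytewise = decodeBytewise(fileBytes);
const decoded = new TextDecoder("utf-8").decode(fileBytes);

console.log(bytewise); // "// cafÃ©"
console.log(decoded);  // "// café"
```

The two bytes that encode "é" come back as two separate characters, and saving that string again through a per-character loop compounds the damage rather than undoing it.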

Save/load local file from code editor corrupts non-ASCII characters #283
