**Perf: Optimize text selection and navigation performance for large documents** #7917
Open

rustbasic wants to merge 2 commits into emilk:main from
#### **Summary**
This PR significantly improves the performance of text selection (double-clicking) and cursor navigation within `TextEdit` and `Label` widgets, particularly when handling large documents (e.g., 1MB+ files or long logs). It eliminates several $O(N^2)$ bottlenecks and unnecessary memory allocations in `text_cursor_state.rs`.
#### **Problems Identified**
1. **$O(N^2)$ Word Boundary Scanning:** In `next_word_boundary_char_index`, `char_index_from_byte_index` was called repeatedly inside a loop. This caused the entire document to be scanned from the beginning for every word found, leading to quadratic time complexity.
2. **Heavy String Allocations:** `ccursor_previous_word` used `collect::<String>()` and `rev()` to search backwards, causing a full copy and memory allocation of the text (or line) every time the user moved the cursor or double-clicked.
3. **Inefficient Line Start Finding:** `find_line_start` performed global character counts (`text.chars().count()`) and global skips, which is very slow for large files.
4. **Global Search Scope:** `select_word_at` was performing word boundary searches across the entire document even for simple double-click actions.
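To make the first bottleneck concrete, here is a minimal, self-contained sketch of the quadratic pattern (illustrative only, not the actual egui code; the function names mirror the ones above but the bodies are simplified, and "word" here just means a whitespace-separated run):

```rust
/// Converts a byte offset into a char index by scanning the text
/// from the beginning. Each call is O(N) in the length of the prefix.
fn char_index_from_byte_index(text: &str, byte_index: usize) -> usize {
    text.char_indices()
        .take_while(|&(b, _)| b < byte_index)
        .count()
}

/// Collects the char index of every word start. Because the O(N)
/// conversion above runs once per word found, the whole pass is O(N^2).
fn word_starts_quadratic(text: &str) -> Vec<usize> {
    let mut starts = Vec::new();
    let mut prev_is_ws = true;
    for (byte, ch) in text.char_indices() {
        if prev_is_ws && !ch.is_whitespace() {
            starts.push(char_index_from_byte_index(text, byte)); // full rescan
        }
        prev_is_ws = ch.is_whitespace();
    }
    starts
}
```

With a 1MB document containing tens of thousands of words, that inner rescan is what turns a double-click into a multi-second freeze.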
#### **Key Changes & Optimizations**
1. **Line-Scoped Selection:** Updated `select_word_at` to first identify the current line and then perform word boundary searches within that local scope. This reduces the search space from millions of characters to hundreds.
2. **Linear Time ($O(N)$) Boundary Search:** Refactored `next_word_boundary_char_index` to use a running cumulative character counter. This ensures the text is scanned only once.
3. **Zero-Allocation Backwards Search:** Optimized `ccursor_previous_word` to use `next_back()` on the `DoubleEndedIterator` provided by `unicode-segmentation`. This removes all temporary `String` allocations.
4. **Byte-Based Line Search:** Optimized `find_line_start` to use byte-based reverse scanning (`rfind('\n')`), which is significantly faster than counting characters from the start of the document.
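The optimizations above can be sketched as follows (again illustrative, not the exact PR code: the PR operates on `unicode-segmentation` word boundaries, while this dependency-free sketch uses whitespace-separated words; byte cursors are assumed to lie on char boundaries):

```rust
/// Linear-time word starts: a running cumulative char counter replaces
/// the repeated from-the-start rescans, so the text is traversed once.
fn word_starts_linear(text: &str) -> Vec<usize> {
    let mut starts = Vec::new();
    let mut char_count = 0; // running cumulative counter
    let mut prev_is_ws = true;
    for ch in text.chars() {
        if prev_is_ws && !ch.is_whitespace() {
            starts.push(char_count); // O(1), no rescan
        }
        prev_is_ws = ch.is_whitespace();
        char_count += 1;
    }
    starts
}

/// Backwards word search via next_back() on a DoubleEndedIterator:
/// no `rev()` plus `collect::<String>()`, hence no temporary allocation.
/// Returns the byte index where the word before `byte_cursor` starts.
fn previous_word_start(text: &str, byte_cursor: usize) -> usize {
    let mut iter = text[..byte_cursor].char_indices();
    let mut in_word = false;
    while let Some((i, ch)) = iter.next_back() {
        if ch.is_whitespace() {
            if in_word {
                return i + ch.len_utf8(); // word starts just past this space
            }
        } else {
            in_word = true;
        }
    }
    0 // reached the start of the text
}

/// Byte-based line-start search: rfind('\n') scans bytes backwards from
/// the cursor instead of counting chars from the top of the document.
fn find_line_start(text: &str, byte_cursor: usize) -> usize {
    text[..byte_cursor].rfind('\n').map_or(0, |nl| nl + 1)
}
```

Combined with line-scoped selection, each double-click now touches only the characters of the current line rather than the whole document.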
#### **Performance Impact**
In my tests with large text files (over 10,000 lines / 1MB+):
- **Before:** Double-clicking a word caused a UI freeze for 2–5 seconds.
- **After:** Word selection and navigation are near-instantaneous (0–1ms), providing a smooth "native-like" experience even in WASM environments.
Preview available at https://egui-pr-preview.github.io/pr/7917-patch170