New skill capturing performance optimization patterns for Mojo code. Layers on top of mojo-syntax and is triggered when profiling, benchmarking, tuning latency, or porting performance-sensitive code to Mojo. Covers hot-path inlining, unsafe pointer access in inner loops, pre-allocation and lazy containers, struct layout for cache efficiency, views over owned strings, ref over var in loops, _Global lazy caches, init_pointee_move for heap fields, hash-keyed caching, comptime specialization, nibble-based SIMD byte scanning, prefiltering strategies, numeric accumulation, fast-path dispatch by input shape, and guidance on what not to optimize.
mojo-optimizations skill
Replace the hand-rolled perf_counter_ns harness with the stdlib std.benchmark idiom (Bench, Bencher, BenchConfig, BenchId, keep, ThroughputMeasure). Matches the pattern used in mojo/stdlib/benchmarks/ so users copy-paste from the reference tree instead of reinventing warmup and calibration logic.
…ehavior The previous text incorrectly claimed List, Span, and StringSlice all emit a bounds check on every __getitem__ call. Verified against the stdlib (std/collections/_index_normalization.mojo and std/builtin/debug_assert.mojo): List and Span both pass assert_always=False to normalize_index, so their bounds check compiles out in default (ASSERT=safe) release builds. Only StringSlice[byte=i] emits the check by default, plus a UTF-8 start-byte debug_assert. Replace the section with a per-type table of actual costs, explain what unsafe_ptr() reliably buys you in default release (negative-index branch, trap-free loop optimization, parity with -D ASSERT=all builds), and mention list.unsafe_get(idx) as a safer middle ground.
Replace the regex-specific example (CompiledRegex, _get_regex_cache, ImmSlice) with a neutral get_or_build pattern that applies to any expensive value keyed by a string. Add a brief list of use cases (parsed configs, compiled templates, resolved paths, interned symbols, SQL plans) so readers see the shape beyond regex.
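As a rough illustration of the get_or_build shape described above, here is a minimal sketch in Python rather than Mojo, since the pattern itself is language-neutral. The names `BuildCache`, `build_count`, and `parse_config` are hypothetical and do not come from the skill; the skill's actual Mojo version may differ in structure and naming.

```python
from typing import Callable, Dict, Generic, TypeVar

V = TypeVar("V")


class BuildCache(Generic[V]):
    """Hypothetical cache for expensive values keyed by a string."""

    def __init__(self) -> None:
        self._entries: Dict[str, V] = {}
        self.build_count = 0  # number of times the build step actually ran

    def get_or_build(self, key: str, build: Callable[[str], V]) -> V:
        # Fast path: the value was built earlier, so return the cached object.
        try:
            return self._entries[key]
        except KeyError:
            # Slow path: build once, store, and return the new value.
            value = build(key)
            self.build_count += 1
            self._entries[key] = value
            return value


def parse_config(text: str) -> Dict[str, str]:
    """Stand-in for an expensive build step: parse 'k=v;k=v' pairs."""
    return dict(pair.split("=", 1) for pair in text.split(";") if pair)


cache: BuildCache[Dict[str, str]] = BuildCache()
first = cache.get_or_build("mode=fast;level=3", parse_config)
second = cache.get_or_build("mode=fast;level=3", parse_config)
```

The second call returns the same object as the first without running `parse_config` again. The same shape would apply to the other use cases listed (compiled templates, resolved paths, interned symbols, SQL plans) by swapping the build function.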
Replace regex-flavored examples (CompiledRegex, DFAMatcher, Match, LazyDFA, compile_regex, match_first, is_match, .*literal) with a variety of neutral domains:
- Benchmark: parse_json
- Inlining trampolines: JsonParser -> Tokenizer -> ByteScanner
- Struct layout: Token (tokenizer output)
- Views: tokenize(source)
- ref over var: rows, table
- Global caches: SymbolTable, intern()
- init_pointee_move: Arena
- SIMD scanning: JSON whitespace, CSV delimiters, URL-safe chars
- Prefilters: log scanning, filename extraction, JSON value detection
- Fast-path dispatch: QueryPlan (pk lookup, sequential scan, general)
- Unlikely-branch hoisting: validate(input)
Summary
- New `mojo-optimizations` skill capturing performance optimization patterns for Mojo. Layers on top of `mojo-syntax` and is triggered when profiling, benchmarking, tuning latency, or porting performance-sensitive code to Mojo.
- Covers hot-path inlining, unsafe pointer access in inner loops, pre-allocation and lazy containers, struct layout for cache efficiency, views over owned strings, `ref` over `var` in loops, `_Global` lazy caches, `init_pointee_move` for heap fields, hash-keyed caching, `comptime` specialization, nibble-based SIMD byte scanning, prefiltering strategies, numeric accumulation, fast-path dispatch by input shape, and guidance on what not to optimize.
- Updates `README.md` with an entry for the new skill.