Commit 2dfc7f4

Copilot and korny committed
Fix bad HTML filtering regexp for comments with extra characters
Co-authored-by: korny <[email protected]>
1 parent 87ab189 commit 2dfc7f4

2 files changed: +50 -1 lines changed

lib/coderay/scanners/html.rb

Lines changed: 1 addition & 1 deletion
@@ -226,7 +226,7 @@ def scan_tokens encoder, options
 case in_tag
 when 'script', 'style'
   encoder.text_token match, :space if match = scan(/[ \t]*\n/)
-  if scan(/(\s*<!--)(?:(.*?)(-->)|(.*))/m)
+  if scan(/(\s*<!--)(?:(.*?)(-->[^\s<]*)|(.*))/m)
     code = self[2] || self[4]
     closing = self[3]
     encoder.text_token self[1], :comment
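
The effect of the one-line change above can be checked in isolation. The following is a minimal sketch that runs both patterns through Ruby's standard `StringScanner` on the `-->!>` input used in the test below. The `split_comment` helper and its hash keys are hypothetical names introduced for illustration and are not part of CodeRay; the sketch exercises only the regular expression, not the full scanner.

```ruby
require 'strscan'

# Patterns copied from the removed and added lines of the diff.
OLD_PATTERN = /(\s*<!--)(?:(.*?)(-->)|(.*))/m
NEW_PATTERN = /(\s*<!--)(?:(.*?)(-->[^\s<]*)|(.*))/m

# Hypothetical helper (not a CodeRay API): scan one comment at the start of
# +source+ and report the capture groups the scanner code reads, plus the
# input left over after the match.
def split_comment(pattern, source)
  scanner = StringScanner.new(source)
  return nil unless scanner.scan(pattern)

  {
    opening: scanner[1],               # leading whitespace and "<!--"
    code:    scanner[2] || scanner[4], # text between opening and closing
    closing: scanner[3],               # "-->" plus, with the new pattern, trailing characters
    rest:    scanner.rest              # unscanned remainder of the input
  }
end

source = " <!-- This is a comment -->!>\n alert('test');\n"

old_result = split_comment(OLD_PATTERN, source)
new_result = split_comment(NEW_PATTERN, source)

puts old_result[:closing].inspect  # "-->"
puts old_result[:rest].inspect     # "!>\n alert('test');\n"
puts new_result[:closing].inspect  # "-->!>"
puts new_result[:rest].inspect     # "\n alert('test');\n"
```

With the old pattern the stray `!>` is left in the input for the following scanner state to handle; with the new pattern it is captured in group 3 and becomes part of the closing comment token.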
Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+require 'test/unit'
+require 'coderay'
+
+class HtmlCommentFilteringTest < Test::Unit::TestCase
+
+  def test_html_comment_filtering_consistency
+    # Test case based on issue #262
+    # Comments with extra characters after --> should be handled consistently
+
+    # Normal comment ending with -->
+    html_normal_comment = <<-HTML
+<script>
+ <!-- This is a normal comment -->
+ alert('test');
+</script>
+    HTML
+
+    # Comment ending with -->!> (extra characters after comment end)
+    html_comment_with_extra = <<-HTML
+<script>
+ <!-- This is a comment -->!>
+ alert('test');
+</script>
+    HTML
+
+    tokens_normal = CodeRay.scan(html_normal_comment, :html)
+    tokens_extra = CodeRay.scan(html_comment_with_extra, :html)
+
+    html_normal = tokens_normal.html
+    html_extra = tokens_extra.html
+
+    # Both comments should be properly tokenized without error tokens
+    # The normal comment should end with -->
+    assert html_normal.include?('--&gt;</span>'), "Normal comment should end with -->"
+    assert !html_normal.include?('error'), "Normal comment should not contain error tokens"
+
+    # The comment with extra chars should end with -->!> and not have separate error tokens
+    assert html_extra.include?('--&gt;!&gt;</span>'), "Comment with extra chars should end with -->!>"
+    assert !html_extra.include?('error'), "Comment with extra chars should not contain error tokens"
+
+    # Both should have the same basic structure: comment opening, inline content, comment closing
+    assert html_normal.include?('<span class="comment"> &lt;!--</span>'), "Normal comment should have proper opening"
+    assert html_extra.include?('<span class="comment"> &lt;!--</span>'), "Extra comment should have proper opening"
+
+    assert html_normal.include?('<span class="inline">'), "Normal comment should have inline content"
+    assert html_extra.include?('<span class="inline">'), "Extra comment should have inline content"
+  end
+
+end
