Commit 0b60c63

Add new FAQ entry on --license-text (#4476)
* Add new FAQ entry on --license-text

Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
Co-authored-by: Ayan Sinha Mahapatra <[email protected]>
1 parent 967a784 commit 0b60c63

2 files changed

+58
-3
lines changed

docs/source/cli-reference/basic-options.rst

Lines changed: 8 additions & 3 deletions
@@ -623,8 +623,8 @@
623623
The option ``--license-text-diagnostics`` is a sub-option of and requires the options
624624
``--license`` and ``--license-text``.
625625

626-
In the matched license text, include diagnostic highlights surrounding with square brackets []
627-
words that are not matched.
626+
This adds a new attribute, similar to the matched license text, in which words that are not
627+
matched are highlighted by surrounding them with square brackets ``[]``.
628628

629629
In a normal scan, whole lines of text are included in the matched license text, including parts
630630
that are possibly unmatched.
@@ -645,9 +645,14 @@
645645
obtaining a copy of this software and associated documentation files (the \"Software\"),
646646
to deal in the Software without restriction
647647

648-
With Diagnostics on::
648+
With Diagnostics on (new attribute with the matched text diagnostics)::
649649

650650
"matched_text":
651+
"License Copyright (c) 2000 - 2006 The Legion Of The Bouncy Castle
652+
(http://www.bouncycastle.org) Permission is hereby granted, free of charge, to any person
653+
obtaining a copy of this software and associated documentation files (the \"Software\"),
654+
to deal in the Software without restriction
655+
"matched_text_diagnostics":
651656
"License [Copyright] ([c]) [2000] - [2006] [The] [Legion] [Of] [The] [Bouncy] [Castle]
652657
([http]://[www].[bouncycastle].[org]) Permission is hereby granted, free of charge, to any person
653658
obtaining a copy of this software and associated documentation files (the \"Software\"),
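
The bracket highlighting shown in ``matched_text_diagnostics`` can be post-processed with a small script. The following is a minimal sketch, not part of ScanCode: the helper names ``unmatched_words`` and ``strip_highlights`` are hypothetical, and it assumes that every ``[...]`` group in the diagnostics text marks an unmatched word. If the scanned text itself contains literal square brackets, this simple approach cannot tell them apart from highlights.

```python
import re

# Assumption: each "[word]" group in the diagnostics text is an unmatched word.
UNMATCHED = re.compile(r"\[([^\[\]]+)\]")


def unmatched_words(diagnostics_text):
    """Return the words highlighted with square brackets, in order."""
    return UNMATCHED.findall(diagnostics_text)


def strip_highlights(diagnostics_text):
    """Remove the square-bracket highlights, keeping the words themselves."""
    return UNMATCHED.sub(r"\1", diagnostics_text)


# Shortened sample modeled on the diagnostics output shown above.
sample = (
    "License [Copyright] ([c]) [2000] - [2006] [The] [Legion] "
    "Permission is hereby granted"
)
words = unmatched_words(sample)
plain = strip_highlights(sample)
```

Here ``words`` holds the highlighted (unmatched) words and ``plain`` is the text with the highlights removed.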

docs/source/misc/faq.rst

Lines changed: 50 additions & 0 deletions
@@ -82,3 +82,53 @@ When scanning binaries, the line numbers are just a relative indication of where
8282
a detection was found: there is no such thing as lines in a binary. The numbers
8383
reported are based on the strings extracted from the binaries, typically broken
8484
as new lines with each NULL character.
85+
86+
87+
How does ``--license-text`` work exactly in ScanCode?
88+
-------------------------------------------------------------
89+
90+
Is the matched text that gets included into the result exactly the lines of text
91+
from the input file that are covered by the ``start_line`` and ``end_line``
92+
fields of the result? I.e., if I would post-process the input file and extract
93+
``start_line`` to ``end_line`` from it, would I get exactly the ``matched_text``
94+
contents? Or is there some more "magic" involved when populating the
95+
``matched_text`` field?
96+
97+
ScanCode is a bit smarter than just using the start and end lines: matching is based on
98+
words, not on lines of the scanned text, so a whole line may not always be matched.
99+
100+
For instance with this command::
101+
102+
$ echo "Foo is a wonder piece of code. Licensed under the GPL. " \
103+
"For support contact [email protected] " > tst
104+
$ scancode --license --license-text --license-text-diagnostics --yaml - tst
105+
...
106+
license_detections:
107+
- license_expression: gpl-1.0-plus
108+
license_expression_spdx: GPL-1.0-or-later
109+
matches:
110+
- license_expression: gpl-1.0-plus
111+
license_expression_spdx: GPL-1.0-or-later
112+
from_file: tst
113+
start_line: 1
114+
end_line: 1
115+
matcher: 2-aho
116+
score: '100.0'
117+
matched_length: 4
118+
match_coverage: '100.0'
119+
rule_relevance: 100
120+
rule_identifier: gpl_85.RULE
121+
rule_url: https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/gpl_85.RULE
122+
matched_text: Foo is a wonder piece of code. Licensed under the GPL.
123+
For support contact [email protected]
124+
matched_text_diagnostics: Licensed under the GPL.
125+
...
126+
127+
then:
128+
129+
- ``matched_text`` is based on ``start_line`` and ``end_line``
130+
- ``matched_text_diagnostics`` is based on the exact matched words
131+
132+
Note that ``matched_text_diagnostics`` also includes "tagged" gaps or extra
133+
unmatched words highlighted between the matched words.
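
The difference between the two fields can be checked with a short post-processing script. This is a rough sketch under stated assumptions, not ScanCode code: ``extract_lines`` and ``normalize`` are hypothetical helpers, the ``match`` dictionary is hand-written to mirror the fields in the output above, and whitespace is collapsed before comparing because exact whitespace preservation in ``matched_text`` is not something the FAQ states.

```python
def extract_lines(text, start_line, end_line):
    """Return lines start_line..end_line (1-based, inclusive) of text."""
    lines = text.splitlines()
    return "\n".join(lines[start_line - 1:end_line])


def normalize(s):
    """Collapse whitespace runs so line wrapping does not affect comparison."""
    return " ".join(s.split())


# Hypothetical input file content and a hand-written match record.
sample = (
    "Header line\n"
    "Foo is a wonder piece of code. Licensed under the GPL.\n"
    "Footer line\n"
)
match = {
    "start_line": 2,
    "end_line": 2,
    "matched_text": "Foo is a wonder piece of code. Licensed under the GPL.",
    "matched_text_diagnostics": "Licensed under the GPL.",
}

whole_lines = extract_lines(sample, match["start_line"], match["end_line"])

# matched_text covers the whole lines from start_line to end_line ...
same_as_lines = normalize(whole_lines) == normalize(match["matched_text"])
# ... while matched_text_diagnostics covers only the matched words.
diagnostics_is_subset = (
    normalize(match["matched_text_diagnostics"]) in normalize(whole_lines)
)
```

In this sketch, ``same_as_lines`` and ``diagnostics_is_subset`` are both true: the line range reproduces ``matched_text``, while ``matched_text_diagnostics`` is only a part of those lines.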
134+
