Commit 2659562

Merge branch 'main' into license-detection-models
Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
2 parents 2f7943f + b86ec74 commit 2659562

235 files changed, +34079 -8041 lines changed

.github/workflows/ci.yml

+4 -1
@@ -31,7 +31,7 @@ jobs:
     strategy:
       max-parallel: 4
       matrix:
-        python-version: ["3.10", "3.11"]
+        python-version: ["3.10", "3.11", "3.12"]
 
     steps:
       - name: Checkout code
@@ -44,6 +44,9 @@ jobs:
 
       - name: Install universal ctags
         run: sudo apt-get install -y universal-ctags
+
+      - name: Install xgettext
+        run: sudo apt-get install -y gettext
 
       - name: Install dependencies
         run: make dev envfile

.github/workflows/pypi-release.yml

+1 -1
@@ -17,7 +17,7 @@ jobs:
       - name: Set up Python
         uses: actions/setup-python@v5
         with:
-          python-version: 3.11
+          python-version: 3.12
 
       - name: Install pypa/build
         run: python -m pip install build --user

.gitignore

+1 -1
@@ -3,6 +3,7 @@ __pycache__/
 *.py[cod]
 
 *.db
+*.sqlite3
 .installed.cfg
 parts
 develop-eggs
@@ -46,7 +47,6 @@ local
 /.python-version
 /.pytest_cache/
 /scancodeio.egg-info/
-policies.yml
 *.rdb
 *.aof
 .vscode

CHANGELOG.rst

+263 -1
@@ -1,7 +1,253 @@
 Changelog
 =========
 
-v34.1.0 (unreleased)
+v34.7.1 (2024-07-15)
+--------------------
+
+- Add pipeline step selection for a run execution.
+  This allows running a pipeline in an advanced mode, skipping some steps
+  or restarting from a given step, such as the last failed step.
+  The steps can be edited from the Run "status" modal using the "Select steps" button.
+  This is an advanced feature and should be used with caution.
+  https://github.com/nexB/scancode.io/issues/1303
+
+- Display the resolved_to_package as a link in the dependencies tab.
+  https://github.com/nexB/scancode.io/pull/1314
+
+- Add support for multiple instances of a PackageURL in the CycloneDX outputs.
+  The `package_uid` is now included in each BOM Component as a property.
+  https://github.com/nexB/scancode.io/issues/1316
+
+- Add an administration interface. Can be enabled with the SCANCODEIO_ENABLE_ADMIN_SITE
+  setting.
+  Add ``--admin`` and ``--super`` options to the ``create-user`` management command.
+  https://github.com/nexB/scancode.io/pull/1323
+
+- Add ``results_url`` and ``summary_url`` on the API ProjectSerializer.
+  https://github.com/nexB/scancode.io/issues/1325
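
Review note on the step selection entry in v34.7.1: the behaviour described (skipping steps, or restarting from a given step) can be pictured as filtering an ordered list of step names. The sketch below is only an illustration under that assumption. The `select_steps` helper and the step names are hypothetical and are not taken from the ScanCode.io code base.

```python
def select_steps(all_steps, restart_from=None, skip=()):
    """Return the ordered subset of step names that would be executed.

    Hypothetical helper: `all_steps` is the pipeline's ordered step names,
    `restart_from` drops every step before the named one, and `skip`
    removes individual steps.
    """
    steps = list(all_steps)
    if restart_from is not None:
        if restart_from not in steps:
            raise ValueError(f"Unknown step: {restart_from}")
        # Keep the named step and everything after it.
        steps = steps[steps.index(restart_from):]
    skipped = set(skip)
    return [name for name in steps if name not in skipped]


# Illustrative step names only.
STEPS = ["copy_inputs", "extract_archives", "scan_files", "summarize"]
print(select_steps(STEPS, restart_from="scan_files"))
print(select_steps(STEPS, skip=["extract_archives"]))
```

Keeping the original order matters here: a restart from the last failed step is only meaningful if the earlier steps already completed, which is presumably why the changelog asks for caution.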
+
+v34.7.0 (2024-07-02)
+--------------------
+
+- Add all "classify" plugin fields from scancode-toolkit on the CodebaseResource model.
+  https://github.com/nexB/scancode.io/issues/1275
+
+- Refine the extraction errors reporting to include the resource path for rendering
+  links to the related resources in the UI.
+  https://github.com/nexB/scancode.io/issues/1273
+
+- Add a ``flush-projects`` management command, to delete all project data and their
+  related work directories created more than a specified number of days ago.
+  https://github.com/nexB/scancode.io/issues/1289
+
+- Update the ``inspect_packages`` pipeline to have an optional ``StaticResolver``
+  group to create resolved packages and dependency relationships from lockfiles
+  and manifests having pre-resolved dependencies. Also update this pipeline to
+  perform package assembly from multiple manifests and files to create
+  discovered packages. Also update the ``resolve_dependencies`` pipeline to have
+  the same ``StaticResolver`` group and move the dynamic resolution part to a new
+  optional ``DynamicResolver`` group.
+  See https://github.com/nexB/scancode.io/pull/1244
+
+- Add a new attribute ``is_direct`` to the DiscoveredDependency model and two new
+  attributes ``is_private`` and ``is_virtual`` to the DiscoveredPackage model.
+  Also update the UIs to show these attributes and show the ``package_data`` field
+  contents for CodebaseResources in the ``extra_data`` tab.
+  See https://github.com/nexB/scancode.io/pull/1244
+
+- Update scancode-toolkit to version ``32.2.1``. For the complete list of updates
+  and improvements see https://github.com/nexB/scancode-toolkit/releases/tag/v32.2.0
+  and https://github.com/nexB/scancode-toolkit/releases/tag/v32.2.1
+
+- Add support for providing pipeline "selected_groups" in the ``run`` entry point.
+  https://github.com/nexB/scancode.io/issues/1306
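
Review note on the ``flush-projects`` entry in v34.7.0: "created more than a specified number of days ago" amounts to comparing each project's creation date with a cutoff. The sketch below shows that comparison on plain tuples. The function names and the `retain_days` parameter are assumptions made for illustration; the actual command operates on the Project model and its option names are not shown in this diff.

```python
from datetime import datetime, timedelta, timezone


def retention_cutoff(retain_days, now=None):
    """Return the datetime before which projects count as expired."""
    now = now or datetime.now(timezone.utc)
    return now - timedelta(days=retain_days)


def projects_to_flush(projects, retain_days, now=None):
    """Return names of projects created strictly before the cutoff.

    `projects` is an iterable of (name, created_date) pairs, a stand-in
    for real database rows.
    """
    cutoff = retention_cutoff(retain_days, now)
    return [name for name, created in projects if created < cutoff]


if __name__ == "__main__":
    reference = datetime(2024, 7, 2, tzinfo=timezone.utc)
    rows = [
        ("old-project", datetime(2024, 5, 1, tzinfo=timezone.utc)),
        ("recent-project", datetime(2024, 6, 30, tzinfo=timezone.utc)),
    ]
    print(projects_to_flush(rows, retain_days=30, now=reference))
```

Passing `now` explicitly keeps the cutoff deterministic, which makes this kind of destructive selection easier to test before anything is deleted.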
+
+v34.6.3 (2024-06-21)
+--------------------
+
+- Use the ``--option=value`` syntax for args entries in place of ``--option value``
+  for fetching Docker images using skopeo through ``run_command_safely`` calls.
+  https://github.com/nexB/scancode.io/issues/1257
+
+- Fix an issue in the d2d JavaScript mapper.
+  https://github.com/nexB/scancode.io/pull/1274
+
+- Add support for an ``ignored_vulnerabilities`` field on the Project configuration.
+  https://github.com/nexB/scancode.io/issues/1271
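
Review note on the ``--option=value`` entry in v34.6.3: when a command is passed as a list of arguments, ``--option=value`` keeps each option and its value in a single list entry, so a value can never be read as a separate positional argument. The sketch below builds such a list from a mapping. `build_args` is a hypothetical helper, not the signature of ``run_command_safely``, and the option names are only examples.

```python
def build_args(options):
    """Build a list of ``--option=value`` entries from a mapping.

    True produces a bare flag, None and False are omitted, and any other
    value is rendered as ``--name=value`` in one argument entry.
    """
    args = []
    for name, value in options.items():
        if value is True:
            args.append(f"--{name}")
        elif value is not None and value is not False:
            args.append(f"--{name}={value}")
    return args


if __name__ == "__main__":
    # Example option names, chosen for illustration.
    print(build_args({"insecure-policy": True, "override-os": "linux"}))
```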
+
+v34.6.2 (2024-06-18)
+--------------------
+
+- Store SBOM headers in the `Project.extra_data` field during the load_sboms
+  pipeline.
+  https://github.com/nexB/scancode.io/issues/1253
+
+- Add support for fetching a Git repository as Project input.
+  https://github.com/nexB/scancode.io/issues/921
+
+- Enhance the logging and reporting of input fetch exceptions.
+  https://github.com/nexB/scancode.io/issues/1257
+
+v34.6.1 (2024-06-07)
+--------------------
+
+- Remove print statements from migration files.
+- Display full traceback on error in the ``execute`` management command.
+- Log the Project message creation.
+- Refactor the ``get_env_from_config_file`` to support empty config file.
+
+v34.6.0 (2024-06-07)
+--------------------
+
+- Add a new ``scan_for_virus`` add-on pipeline based on ClamAV scan.
+  Found viruses are stored as "error" Project messages and on their related codebase
+  resource instance using the ``extra_data`` field.
+  https://github.com/nexB/scancode.io/issues/1182
+
+- Add ability to filter by tag on the resource list view.
+  https://github.com/nexB/scancode.io/issues/1217
+
+- Use "unknown" as the Package URL default type when no values are provided for that
+  field. This allows creating a discovered package instance instead of raising a
+  Project error message.
+  https://github.com/nexB/scancode.io/issues/1249
+
+- Rename DiscoveredDependency ``resolved_to`` to ``resolved_to_package``, and
+  ``resolved_dependencies`` to ``resolved_from_dependencies`` for clarity and
+  consistency.
+  Add ``children_packages`` and ``parent_packages`` ManyToMany field on the
+  DiscoveredPackage model.
+  Add full dependency tree in the CycloneDX output.
+  https://github.com/nexB/scancode.io/issues/1066
+
+- Add a new ``run`` entry point for executing pipeline as a single command.
+  https://github.com/nexB/scancode.io/pull/1256
+
+- Generate a DiscoveredPackage.package_uid in create_from_data when not provided.
+  https://github.com/nexB/scancode.io/issues/1256
+
+v34.5.0 (2024-05-22)
+--------------------
+
+- Display the current path location in the "Codebase" panel as navigation breadcrumbs.
+  https://github.com/nexB/scancode.io/issues/1158
+
+- Fix a rendering issue in the dependency details view when for_package or
+  datafile_resource fields do not have a value.
+  https://github.com/nexB/scancode.io/issues/1177
+
+- Add a new `CollectPygmentsSymbolsAndStrings` pipeline (addon) for collecting source
+  symbols, strings and comments using Pygments.
+  https://github.com/nexB/scancode.io/pull/1179
+
+- Work around an issue with cyclonedx-python-lib, which does not allow loading
+  SBOMs that contain properties with no values.
+  Also, a few pre-validation fixes are applied before deserializing the SBOM for
+  maximum compatibility.
+  https://github.com/nexB/scancode.io/issues/1185
+  https://github.com/nexB/scancode.io/issues/1230
+
+- Add a new `CollectTreeSitterSymbolsAndStrings` pipeline (addon) for collecting source
+  symbols and strings using tree-sitter.
+  https://github.com/nexB/scancode.io/pull/1181
+
+- Fix `inspect_packages` pipeline to properly link discovered packages and dependencies to
+  codebase resources of package manifests where they were found. Also correctly assign
+  the datasource_ids attribute for packages and dependencies.
+  https://github.com/nexB/scancode.io/pull/1180
+
+- Add "Product name" and "Product version" as new project settings.
+  https://github.com/nexB/scancode.io/issues/1197
+
+- Add "Product name" and "Product version" as new project settings.
+  https://github.com/nexB/scancode.io/issues/1197
+
+- Raise the minimum RAM required per CPU core in the docs.
+  A good rule of thumb is to allow **2 GB of memory per CPU**.
+  For example, if Docker is configured for 8 CPUs, a minimum of 16 GB of memory is
+  required.
+  https://github.com/nexB/scancode.io/issues/1191
+
+- Add value validation for the search complex query syntax.
+  https://github.com/nexB/scancode.io/issues/1183
+
+- Bump matchcode-toolkit version to v5.0.0.
+
+- Fix the content of the ``package_url`` field in CycloneDX outputs.
+  https://github.com/nexB/scancode.io/issues/1224
+
+- Enhance support for encoded ``package_url`` during the conversion to model fields.
+  https://github.com/nexB/scancode.io/issues/1171
+
+- Remove the ``scancode_license_score`` option from the Project configuration.
+  https://github.com/nexB/scancode.io/issues/1231
+
+- Remove the ``extract_recursively`` option from the Project configuration.
+  https://github.com/nexB/scancode.io/issues/1236
+
+- Add support for an ``ignored_dependency_scopes`` field on the Project configuration.
+  https://github.com/nexB/scancode.io/issues/1197
+
+- Add support for storing the scancode-config.yml file in codebase.
+  The scancode-config.yml file can be provided as a project input, or can be located
+  in the codebase/ immediate subdirectories. This allows providing the configuration
+  file as part of an input archive or a git clone, for example.
+  https://github.com/nexB/scancode.io/issues/1236
+
+- Provide a downloadable YAML scancode-config.yml template in the documentation.
+  https://github.com/nexB/scancode.io/issues/1197
+
+- Add support for CycloneDX SBOM component properties as generated by external tools.
+  For example, the ``ResolvedUrl`` generated by cdxgen is now imported as the package
+  ``download_url``.
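
Review note on the memory entry in v34.5.0: the rule of thumb of 2 GB of memory per CPU is a simple multiplication, shown below with the changelog's own example of 8 CPUs. The helper name `minimum_memory_gb` is made up for this note.

```python
GB_PER_CPU = 2  # rule of thumb quoted in the changelog entry


def minimum_memory_gb(cpu_count, gb_per_cpu=GB_PER_CPU):
    """Return the minimum memory, in GB, suggested for `cpu_count` CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return cpu_count * gb_per_cpu


if __name__ == "__main__":
    # Docker configured for 8 CPUs, as in the changelog example.
    print(minimum_memory_gb(8))  # → 16
```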
+
+v34.4.0 (2024-04-22)
+--------------------
+
+- Upgrade Gunicorn to v22.0.0 security release.
+
+- Display the list of fields available for the advanced search syntax in the modal UI.
+  https://github.com/nexB/scancode.io/issues/1164
+
+- Add support for CycloneDX 1.6 outputs and inputs.
+  Also, the CycloneDX outputs can be downloaded as 1.6, 1.5, and 1.4 spec versions.
+  https://github.com/nexB/scancode.io/pull/1165
+
+- Update matchcode-toolkit to v4.1.0
+
+- Add a new function
+  `scanpipe.pipes.matchcode.fingerprint_codebase_resources()`, which computes
+  approximate file matching fingerprints for text files using the new
+  `get_file_fingerprint_hashes` function from matchcode-toolkit.
+
+- Rename the `purldb-scan-queue-worker` management command to `purldb-scan-worker`.
+
+- Add `docker-compose.purldb-scan-worker.yml` to run ScanCode.io as a PurlDB
+  scan worker service.
+
+v34.3.0 (2024-04-10)
+--------------------
+
+- Associate resolved packages with their source codebase resource.
+  https://github.com/nexB/scancode.io/issues/1140
+
+- Add a new `CollectSourceStrings` pipeline (addon) for collecting source strings using
+  xgettext.
+  https://github.com/nexB/scancode.io/pull/1160
+
+v34.2.0 (2024-03-28)
+--------------------
+
+- Add support for Python 3.12 and upgrade to Python 3.12 in the Dockerfile.
+  https://github.com/nexB/scancode.io/pull/1138
+
+- Add support for CycloneDX XML inputs.
+  https://github.com/nexB/scancode.io/issues/1136
+
+- Upgrade the SPDX schema to v2.3.1
+  https://github.com/nexB/scancode.io/issues/1130
+
+v34.1.0 (2024-03-27)
 --------------------
 
 - Add support for importing CycloneDX SBOM 1.2, 1.3, 1.4 and 1.5 spec formats.
@@ -39,6 +285,22 @@ v34.1.0 (unreleased)
   A data migration is included to facilitate the migration of existing data.
   https://github.com/nexB/scancode.io/issues/1099
 
+- Add PurlDB tab, displayed when the PURLDB_URL setting is configured.
+  When loading the package details view, a request is made to PurlDB to fetch and
+  display any available data.
+  https://github.com/nexB/scancode.io/issues/1125
+
+- Create a new management command `purldb-scan-queue-worker`, that runs
+  scancode.io as a Package scan queue worker for PurlDB.
+  `purldb-scan-queue-worker` gets the next available Package to be scanned and
+  the list of pipeline names to be run on the Package from PurlDB, creates a
+  Project, fetches the Package, runs the specified pipelines, and returns the
+  results to PurlDB.
+  https://github.com/nexB/scancode.io/pull/1078
+  https://github.com/nexB/purldb/issues/236
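
Review note on the scan queue worker entry: the described cycle (get the next Package and its pipeline names, scan it, return the results) can be sketched as a polling loop. Everything below is an assumption for illustration. The three callables stand in for the PurlDB requests and the project and pipeline handling, and none of the names come from the actual management command.

```python
import time


def run_worker(get_next_job, scan_package, send_results, sleep=3, max_loops=None):
    """Poll for scan jobs and process them one at a time.

    get_next_job() returns (package_url, pipeline_names) or None when the
    queue is empty. scan_package() stands in for creating a project,
    fetching the package and running the pipelines. send_results() stands
    in for returning the results to the queue owner. `max_loops` bounds
    the loop so the sketch can terminate; None means run forever.
    """
    loops = 0
    while max_loops is None or loops < max_loops:
        loops += 1
        job = get_next_job()
        if job is None:
            # Nothing to scan yet: wait before asking again.
            time.sleep(sleep)
            continue
        package_url, pipeline_names = job
        results = scan_package(package_url, pipeline_names)
        send_results(package_url, results)


if __name__ == "__main__":
    queue = [("pkg:pypi/example@1.0", ["scan_single_package"])]
    run_worker(
        get_next_job=lambda: queue.pop(0) if queue else None,
        scan_package=lambda purl, names: {"purl": purl, "pipelines": names},
        send_results=lambda purl, results: print(purl, results),
        sleep=0,
        max_loops=2,
    )
```

The `sleep` parameter plays the same role as the ``--sleep 3`` option visible in the compose file added by this commit: it spaces out the polling when the queue is empty.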
+
+- Update matchcode-toolkit to v4.0.0
+
 v34.0.0 (2024-03-04)
 --------------------

Dockerfile

+3 -2
@@ -20,7 +20,7 @@
 # ScanCode.io is a free software code scanning tool from nexB Inc. and others.
 # Visit https://github.com/nexB/scancode.io for support and download.
 
-FROM --platform=linux/amd64 python:3.11-slim
+FROM --platform=linux/amd64 python:3.12-slim
 
 LABEL org.opencontainers.image.source="https://github.com/nexB/scancode.io"
 LABEL org.opencontainers.image.description="ScanCode.io"
@@ -40,7 +40,7 @@ ENV PYTHONPATH $PYTHONPATH:$APP_DIR
 
 # OS requirements as per
 # https://scancode-toolkit.readthedocs.io/en/latest/getting-started/install.html
-# Also install universal-ctags for symbol collection.
+# Also install universal-ctags and xgettext for symbol and string collection.
 RUN apt-get update \
  && apt-get install -y --no-install-recommends \
     bzip2 \
@@ -60,6 +60,7 @@ RUN apt-get update \
     git \
     wait-for-it \
     universal-ctags \
+    gettext \
  && apt-get clean \
  && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

Makefile

+6 -1
@@ -104,6 +104,11 @@ migrate:
 	@echo "-> Apply database migrations"
 	${MANAGE} migrate
 
+upgrade:
+	@echo "-> Upgrade local git checkout"
+	@git pull
+	@$(MAKE) migrate
+
 postgresdb:
 	@echo "-> Configure PostgreSQL database"
 	@echo "-> Create database user ${SCANCODEIO_DB_NAME}"
@@ -158,4 +163,4 @@ offline-package: docker-images
 	@mkdir -p dist/
 	@tar -cf dist/scancodeio-offline-package-`git describe --tags`.tar build/
 
-.PHONY: virtualenv conf dev envfile install check bandit valid isort check-deploy clean migrate postgresdb sqlitedb backupdb run test docs bump docker-images offline-package
+.PHONY: virtualenv conf dev envfile install check bandit valid isort check-deploy clean migrate upgrade postgresdb sqlitedb backupdb run test docs bump docker-images offline-package

docker-compose-offline.yml

-2
@@ -1,5 +1,3 @@
-version: "3"
-
 services:
   db:
     image: postgres:13

docker-compose.dev.yml

-2
@@ -1,5 +1,3 @@
-version: "3"
-
 # Mount the local scanpipe/ directory in the containers
 
 # This can be used to refresh fixtures from the docker container:

docker-compose.purldb-scan-worker.yml

+17
@@ -0,0 +1,17 @@
+include:
+  - docker-compose.yml
+
+services:
+  purldb_scan_worker:
+    build: .
+    command: wait-for-it --strict --timeout=120 web:8000 -- sh -c "
+        ./manage.py purldb-scan-worker --async --sleep 3"
+    env_file:
+      - docker.env
+    volumes:
+      - .env:/opt/scancodeio/.env
+      - /etc/scancodeio/:/etc/scancodeio/
+      - workspace:/var/scancodeio/workspace/
+    depends_on:
+      - db
+      - web
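
Review note on the worker `command`: `wait-for-it --strict --timeout=120 web:8000 -- ...` delays the worker until the `web` service accepts TCP connections on port 8000, and with `--strict` the wrapped command only runs if that check succeeds. The Python sketch below illustrates the same wait-then-run idea. It is not how `wait-for-it` is implemented, and `wait_for_port` is a name invented for this note.

```python
import socket
import time


def wait_for_port(host, port, timeout=120.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds.

    Retries every `interval` seconds and returns False if no connection
    could be made before `timeout` seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    # Strict behaviour: only start the work when the service is reachable.
    if wait_for_port("127.0.0.1", 8000, timeout=5):
        print("service is up, starting worker")
    else:
        print("service not reachable, giving up")
```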
