This repository was archived by the owner on Dec 3, 2019. It is now read-only.

Modified populate database method to directly fetch upstream data. #53

Open
shivanshuraj1333 wants to merge 4 commits into eellak:master from shivanshuraj1333:fix-52

Conversation

@shivanshuraj1333
Contributor

@shivanshuraj1333 shivanshuraj1333 commented Mar 31, 2019

Fixes #52
populate_db.py now fetches up-to-date data directly from https://github.com/spdx/license-list-data.

The license_text field now contains the complete, updated license text.

Populating the database now takes much less time.
Screenshot from 2019-04-01 04-35-19
Note 1: The CSV file is no longer needed to populate the database. I am not removing it in this pull request, since PR #48 has not been merged or closed yet.
Note 2: I have intentionally commented out the previous data-population method, since the CSV file is not removed in this PR.

@zvr
Collaborator

zvr commented Apr 1, 2019

Using PyGithub requires a GitHub account (I think, and your code confirms it), which should not be a prerequisite for running clio.

The data can be accessed without any account.

@shivanshuraj1333
Contributor Author

shivanshuraj1333 commented Apr 1, 2019

Removed user authentication from the PyGithub calls. Now anyone can populate and use Clio locally.
PyGithub will also be used in some future work.

@zvr
Collaborator

zvr commented Apr 2, 2019

comments:

  • the code shouldn't read the directory entries, but the licenses.json that describes them
  • the code should also read exceptions.json in the same manner
  • more importantly, you shouldn't read the repository itself, which might have transient commits, but use only one of the official releases: https://github.com/spdx/license-list-data/releases
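
A minimal sketch of what these three points could look like together, using only the standard library so that no GitHub account or PyGithub is involved. Everything specific here is an assumption rather than something taken from this PR: the pinned tag `v3.4`, the raw URL layout with a `json/` directory, the key names `licenses`, `licenseId`, `exceptions`, and `licenseExceptionId`, and the helper names themselves are all illustrative and should be checked against the chosen release.

```python
import json
from urllib.request import urlopen

# Assumptions (not from this PR): the tag, the raw URL layout, and the JSON
# key names used below. Verify them against the release you actually pin.
RELEASE_TAG = "v3.4"  # hypothetical pinned official release
RAW_BASE = "https://raw.githubusercontent.com/spdx/license-list-data"


def index_url(tag, filename):
    """Build the raw URL of an index file at a fixed release tag."""
    return f"{RAW_BASE}/{tag}/json/{filename}"


def fetch_json(url, timeout=30):
    """Download and decode a JSON document without any authentication."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def extract_entries(document, list_key, id_key):
    """Return (identifier, name) pairs from a licenses/exceptions index."""
    return [
        (item[id_key], item.get("name", ""))
        for item in document.get(list_key, [])
    ]


def load_release(tag=RELEASE_TAG, fetch=fetch_json):
    """Read both index files for one release; `fetch` is injectable for tests."""
    licenses = extract_entries(
        fetch(index_url(tag, "licenses.json")), "licenses", "licenseId"
    )
    exceptions = extract_entries(
        fetch(index_url(tag, "exceptions.json")), "exceptions", "licenseExceptionId"
    )
    return licenses, exceptions
```

Pinning the tag in one constant keeps the database reproducible: moving to a newer license list becomes a one-line change, and transient commits on the default branch are never read.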

@zvr zvr mentioned this pull request Apr 2, 2019
@shivanshuraj1333
Contributor Author

Okay, thank you. I will update my script accordingly.

@shivanshuraj1333
Contributor Author

shivanshuraj1333 commented Apr 3, 2019

Flow of updated script:
Untitled Diagram

Advantages of above approach:


Development

Successfully merging this pull request may close these issues.

Populate data base from upstream data