github-fetch-data-repos

Fetch information about GitHub repositories for later analysis: star count, language, and number of contributors.

To merge the data files, run:

cat data/data-* > raw_dataset.csv

To remove duplicates, run:

  • -u keep only unique lines
  • -o write the result to the given output file
sort -u -o uniq.csv raw_dataset.csv
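
To check that the merge and deduplication behaved as expected, comparing line counts before and after can help. The following is a minimal sketch on made-up sample data: the `dedupe_demo` directory, the file names, and the column layout (name, language, stars) are assumptions for illustration, not the project's real output.

```shell
# Hypothetical sample data; the layout name,language,stars is assumed.
mkdir -p dedupe_demo/data
printf 'repo-a,Go,120\nrepo-b,Python,4500\n' > dedupe_demo/data/data-1
printf 'repo-b,Python,4500\nrepo-c,Rust,980\n' > dedupe_demo/data/data-2

# Merge the files, then drop exact duplicate lines.
cat dedupe_demo/data/data-* > dedupe_demo/raw_dataset.csv
sort -u -o dedupe_demo/uniq.csv dedupe_demo/raw_dataset.csv

# Compare line counts to see how many duplicates were removed.
raw_count=$(wc -l < dedupe_demo/raw_dataset.csv | tr -d ' ')
uniq_count=$(wc -l < dedupe_demo/uniq.csv | tr -d ' ')
echo "removed $((raw_count - uniq_count)) duplicate line(s)"
```

Note that `sort -u` compares whole lines, so a repo whose star count changed between two fetches would still appear twice.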

To order them by stars, run:

  • -g general numeric sort
  • -r reverse order (descending)
  • -t field separator, here ","
  • -k3 sort key starts at the 3rd column (columns are numbered from 1)
  • -o output file
sort -gr -t , -k3 -o dataset.csv uniq.csv
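
As a quick illustration of the sort step, here is a small sketch on made-up data. The `sort_demo` directory and the sample rows are hypothetical; the only thing taken from the command above is that stars sit in the 3rd column.

```shell
# Hypothetical deduplicated input; the layout name,language,stars is assumed.
mkdir -p sort_demo
printf 'repo-a,Go,120\nrepo-b,Python,4500\nrepo-c,Rust,980\n' > sort_demo/uniq.csv

# Same options as above: general numeric, descending, comma-separated, key from column 3.
sort -gr -t , -k3 -o sort_demo/dataset.csv sort_demo/uniq.csv

# Show the result, most-starred repo first.
cat sort_demo/dataset.csv
```

With this sample, the rows come out ordered 4500, 980, 120. A plain `-k3` key runs from the 3rd column to the end of the line; if more columns follow the star count, `-k3,3` limits the key to that column alone.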