Description
Describe the bug
cloc is finding far more files and lines of code than scc when running both on the command line.
The discrepancy was large enough, with cloc finding nearly twice as many Swift files and lines of code, that I even tried --exclude-dir=.git in case it was something silly like the .git directory being counted.
For the first repo, the one that caught my eye, cloc reports 2.7M lines of code vs 820k from scc.
To Reproduce
I noticed this on a work repo which I'll show first and then reproduce on open source public repos further below:
$ cloc --exclude-dir=.git .
16615 text files.
10866 unique files.
9006 files ignored.
github.com/AlDanial/cloc v 2.02 T=6.48 s (1676.7 files/s, 471886.4 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
YAML                            84             11             55         924650
JSON                          1377             10              0         800474
Swift                         4906          66126         109653         398884
XML                           1376           1274            384         197462
C/C++ Header                  1780          38122          85669         145296
Objective-C                    688          19001          16382          98319
Python                         107           4018           5766          56489
Markdown                       114           7956              2          23243
Objective-C++                   65           3264           2005          18098
C                               34           2243           1646          10394
C++                             38           1237           1012           8122
Bourne Shell                    49           1041            478           5751
SVG                            236              1              0           2522
Ruby                             5             88             39            405
CSS                              3             91             25            393
Text                             2             19              0             42
JavaScript                       2              2             22              3
-------------------------------------------------------------------------------
SUM:                         10866         144504         223138        2690547
-------------------------------------------------------------------------------
$ scc .
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Swift                     2628    243515    23026     27060   193429      16431
JSON                      1220     38046        9         0    38037          0
SVG                          7        72        0         0       72          0
YAML                         6    589641       11        55   589575          0
Shell                        3        52       11        14       27          0
C Header                     2       122       31        54       37          0
Markdown                     2        43       17         0       26          0
Gemfile                      1         3        1         0        2          0
Objective C                  1       678      147        17      514         63
───────────────────────────────────────────────────────────────────────────────
Total                     3870    872172    23253     27200   821719      16494
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop (organic) $31,049,515
Estimated Schedule Effort (organic) 50.75 months
Estimated People Required (organic) 54.36
───────────────────────────────────────────────────────────────────────────────
Processed 65746657 bytes, 65.747 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────
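To see where the gap in the work repo concentrates, the per-language file counts from the two tables above can be lined up side by side. This is only a sketch: the figures are copied from the output above, while the pairing of cloc's language names with scc's (for example "C/C++ Header" with "C Header") and the helper name file_count_gaps are assumptions made for illustration.

```python
# File counts copied from the cloc and scc output above (work repo).
CLOC_FILES = {
    "Swift": 4906, "JSON": 1377, "XML": 1376, "C/C++ Header": 1780,
    "Objective-C": 688, "YAML": 84, "SVG": 236, "Markdown": 114,
    "Bourne Shell": 49,
}
SCC_FILES = {
    "Swift": 2628, "JSON": 1220, "C Header": 2, "Objective C": 1,
    "YAML": 6, "SVG": 7, "Markdown": 2, "Shell": 3,
}
# cloc name -> scc name where the two differ. This pairing is an assumption.
NAME_MAP = {
    "C/C++ Header": "C Header",
    "Objective-C": "Objective C",
    "Bourne Shell": "Shell",
}


def file_count_gaps(cloc, scc, name_map):
    """Return (language, cloc_files, scc_files, gap) rows, largest gap first."""
    rows = []
    for lang, cloc_n in cloc.items():
        # A language missing from the scc table counts as zero files.
        scc_n = scc.get(name_map.get(lang, lang), 0)
        rows.append((lang, cloc_n, scc_n, cloc_n - scc_n))
    return sorted(rows, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    for lang, c, s, gap in file_count_gaps(CLOC_FILES, SCC_FILES, NAME_MAP):
        print(f"{lang:<14} cloc={c:>5} scc={s:>5} gap={gap:>5}")
```

With these inputs the largest file-count gaps are Swift, C/C++ headers and XML, and scc lists no XML files at all. That might point at whole groups of files being visited by one tool and skipped by the other rather than lines being classified differently, though the output alone does not show which.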
I don't have any public Swift repos, so I'll reproduce it on my other public GitHub repos instead. The discrepancy there is smaller, a few thousand lines of code (77k from cloc vs 81k from scc), compared to the much larger discrepancy in the work repo above:
git clone https://github.com/HariSekhon/DevOps-Bash-tools bash-tools
cd bash-tools
$ cloc --exclude-dir=.git .
1712 text files.
1613 unique files.
101 files ignored.
github.com/AlDanial/cloc v 2.02 T=0.26 s (6117.1 files/s, 524832.2 lines/s)
--------------------------------------------------------------------------------
Language                      files          blank        comment           code
--------------------------------------------------------------------------------
Bourne Shell                   1426          22989          33653          59446
JSON                              8              0              0           7281
Text                             42            254              0           4021
YAML                             81            442           1746           2961
Markdown                         12            377             33           1920
XML                               8              0              0            827
Bourne Again Shell               19            263            767            503
make                              2             92             51            323
INI                               1             13              0             72
Groovy                            7             22            156             20
Expect                            1              2              1             17
Properties                        3              9             21             15
Python                            1              8             23             12
Ruby                              1              1             23              7
SQL                               1              2             15              4
--------------------------------------------------------------------------------
SUM:                           1613          24474          36489          77429
--------------------------------------------------------------------------------
$ scc .
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Shell                     1393    113681    19154     32796    61731       9161
YAML                        80      5233      454      1779     3000          0
Plain Text                  42      4275      254         0     4021          0
BASH                        16      1299      218       625      456         98
Markdown                    12      2330      377         0     1953          0
JSON                         8      7281        0         0     7281          0
Groovy                       7       198       22       156       20          0
XML                          5       351        0         0      351          0
Zsh                          5       294       58       200       36          4
Properties File              3        45        9        21       15          0
Makefile                     2       466       92        51      323         22
Autoconf                     1       865      143        95      627         78
Bitbucket Pipeline           1        38        5        24        9          0
Docker ignore                1      4007     1022      1528     1457          0
Gemfile                      1        33        6        23        4          0
License                      1         7        3         0        4          0
Python                       1        43        1        20       22          0
Ruby                         1        31        1        24        6          0
SQL                          1        21        2        15        4          0
Vim Script                   1       797       98       252      447         28
───────────────────────────────────────────────────────────────────────────────
Total                     1582    141295    21919     37609    81767       9391
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop (organic) $2,752,975
Estimated Schedule Effort (organic) 20.21 months
Estimated People Required (organic) 12.10
───────────────────────────────────────────────────────────────────────────────
Processed 4365112 bytes, 4.365 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────
The code line count also differs by a couple of thousand lines in my DevOps-Python-tools repo (27k from cloc vs 29k from scc):
git clone https://github.com/HariSekhon/DevOps-Python-tools pytools
cd pytools
$ cloc .
402 text files.
347 unique files.
56 files ignored.
github.com/AlDanial/cloc v 2.02 T=0.12 s (2982.2 files/s, 352022.7 lines/s)
--------------------------------------------------------------------------------
Language                      files          blank        comment           code
--------------------------------------------------------------------------------
Python                          124           2743           5264          12484
JSON                             26              0              0           5471
Bourne Shell                     83           1322           1819           4467
YAML                             70            371           1419           2802
Markdown                          7            183             23            614
XML                               4              0              3            566
Text                             15             79              0            367
make                              2             40             73            165
Bourne Again Shell                2             41            129             78
INI                               2             14              3             77
Pig Latin                         2             36             88             46
Expect                            1              2             14             31
Jinja Template                    1              1              0             17
TOML                              1              8              0             17
Properties                        2              9             20             14
CSV                               4              0              0             10
Ruby                              1              1             23              6
--------------------------------------------------------------------------------
SUM:                            347           4850           8878          27232
--------------------------------------------------------------------------------
$ scc .
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Python                     124     20381     1125      4655    14601       1828
Shell                       83      7608     1299      1896     4413        579
YAML                        69      4554      366      1395     2793          0
JSON                        27      5506        0         0     5506          0
Plain Text                  16       448       79         0      369          0
Markdown                     7       820      183         0      637          0
CSV                          4        10        0         0       10          0
Makefile                     4       322       46        97      179         28
XML                          4       569        0         3      566          0
BASH                         2       248       41       131       76         17
Properties File              2        43        9        20       14          0
Bitbucket Pipeline           1        38        5        24        9          0
Expect                       1        47        2        14       31          1
INI                          1         9        1         3        5          0
Jinja                        1        18        1         0       17          0
License                      1         7        3         0        4          0
Ruby                         1        30        1        24        5          0
TOML                         1        25        8         0       17          0
───────────────────────────────────────────────────────────────────────────────
Total                      349     40683     3169      8262    29252       2453
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop (organic) $935,532
Estimated Schedule Effort (organic) 13.41 months
Estimated People Required (organic) 6.20
───────────────────────────────────────────────────────────────────────────────
Processed 1408944 bytes, 1.409 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────
I've tried it on some of my other smaller, simpler public GitHub repos like Jenkins (Groovy) and GitHub-Actions (YAML), and the results are very close in those cases.
I decided to try a random public Swift repo in case the discrepancy was more pronounced there. There is some discrepancy, 52k from cloc vs 58k from scc:
git clone https://github.com/tensorflow/swift
cd swift
$ cloc .
88 text files.
82 unique files.
20 files ignored.
github.com/AlDanial/cloc v 2.02 T=0.09 s (878.8 files/s, 657474.7 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Jupyter Notebook                15              0           6111          43429
Markdown                        35           1721              0           5235
Swift                           27            373            625           3610
YAML                             2              5              1             90
Bourne Shell                     2             14             26             64
Dockerfile                       1              5              5             33
-------------------------------------------------------------------------------
SUM:                            82           2118           6768          52461
-------------------------------------------------------------------------------
$ scc .
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Markdown                    35      6956     1721         0     5235          0
Swift                       27      4608      370       625     3613        424
Jupyter                     15     49540        0         0    49540          0
Shell                        2       104       14        28       62          2
YAML                         2        96        5         1       90          0
Dockerfile                   1        43        5         5       33          4
License                      1       201       32         0      169          0
───────────────────────────────────────────────────────────────────────────────
Total                       83     61548     2147       659    58742        430
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop (organic) $1,945,321
Estimated Schedule Effort (organic) 17.71 months
Estimated People Required (organic) 9.76
───────────────────────────────────────────────────────────────────────────────
Processed 2637520 bytes, 2.638 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────
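For this repo the gap can be broken down by language using only the code columns above. The sketch below is illustrative: the numbers are copied from the two tables, while treating "Jupyter Notebook" and "Jupyter", and "Bourne Shell" and "Shell", as the same language is an assumption, as is the helper name code_gap_by_language.

```python
# Code-line counts copied from the cloc and scc output above (tensorflow/swift).
# Language names are normalised to scc's spelling, which is an assumption.
CLOC_CODE = {
    "Jupyter": 43429, "Markdown": 5235, "Swift": 3610,
    "YAML": 90, "Shell": 64, "Dockerfile": 33,
}
SCC_CODE = {
    "Jupyter": 49540, "Markdown": 5235, "Swift": 3613,
    "YAML": 90, "Shell": 62, "Dockerfile": 33, "License": 169,
}


def code_gap_by_language(cloc, scc):
    """Return {language: scc_code - cloc_code} for languages with a nonzero gap."""
    gaps = {}
    for lang in sorted(set(cloc) | set(scc)):
        gap = scc.get(lang, 0) - cloc.get(lang, 0)
        if gap:
            gaps[lang] = gap
    return gaps


gaps = code_gap_by_language(CLOC_CODE, SCC_CODE)
print(gaps)
print("total gap:", sum(gaps.values()))

# cloc's Jupyter comment lines plus code lines, versus scc's Jupyter code lines.
print("cloc Jupyter comment + code:", 6111 + 43429, "scc Jupyter code:", 49540)
```

Under these assumptions, 6,111 of the 6,281-line gap sits in the Jupyter notebooks, which is exactly the number of lines cloc reports as Jupyter comments, and another 169 lines come from the License file that only scc lists.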
Expected behavior
Expected the results from both tools to be closer than they were, especially for the work repo.
I appreciate that there may be small differences in how each tool counts, and the numbers don't have to be perfectly accurate since they are only meant to give a ballpark idea, but I'm trying to understand why the gap can be thousands of lines or, in the top example, nearly 1.9M lines.
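The size of each gap can be computed directly from the totals in this report. A minimal sketch, with the cloc SUM and scc Total code figures copied from the output above; the helper name total_gaps and the repo labels are only illustrative.

```python
# (cloc code lines, scc code lines) copied from the SUM / Total rows in this report.
TOTALS = {
    "work repo": (2690547, 821719),
    "DevOps-Bash-tools": (77429, 81767),
    "DevOps-Python-tools": (27232, 29252),
    "tensorflow/swift": (52461, 58742),
}


def total_gaps(totals):
    """Return {repo: (cloc - scc, cloc / scc rounded to 2 places)}."""
    return {
        repo: (cloc - scc, round(cloc / scc, 2))
        for repo, (cloc, scc) in totals.items()
    }


for repo, (diff, ratio) in total_gaps(TOTALS).items():
    print(f"{repo:<20} cloc - scc = {diff:>9}  cloc / scc = {ratio}")
```

On these figures the sign of the gap differs between repos: cloc reports more code than scc only for the work repo, while scc reports more for the three public repos.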
Desktop (please complete the following information):
- OS: macOS
- Version 14.1