.gitattributes support? #386

Open
o2sh opened this issue Oct 16, 2019 · 4 comments

@o2sh

o2sh commented Oct 16, 2019

It would be nice if tokei could parse .gitattributes and respect linguist-documentation rules.

For example, tokei reports that the project ANESE is written primarily in assembly, when it's actually written entirely in C++:

[Screenshot: DeepinScreenshot_select-area_20191016184732]
Onefetch is a tool that relies on tokei for language detection.

@XAMPPRocky
Owner

Thank you for the issue! I don't really know what is required for .gitattributes support. AFAIK linguist-documentation is GitHub-specific. If you could expand on what you would need, I'd be happy to help. If possible, I would like to use a third-party crate to handle parsing .gitattributes, as I don't want to maintain that code in tokei and would prefer that everyone be able to include .gitattributes functionality in their tools.

@DCNick3

DCNick3 commented Jun 2, 2022

Even though linguist-documentation is GitHub-specific, I think it would be nice to support this attribute (along with linguist-generated and linguist-vendored, documented here).

As for the .gitattributes parsing crate, one could use the one from gitoxide: git-attributes. At first glance it appears to be able to parse .gitattributes, but it's very much WIP and seems to be missing the matching logic (I think?).
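To make the parsing step concrete, here is a minimal std-only sketch of picking the three Linguist attributes out of .gitattributes text. It does not use the git-attributes crate, and `LinguistAttr`, `Rule`, and `parse_gitattributes` are hypothetical names invented for illustration. It assumes whitespace-separated fields, so quoted patterns, `[attr]` macros, and later rules overriding earlier ones are not handled, and no pattern matching is attempted.

```rust
/// The three Linguist overrides discussed in this thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LinguistAttr {
    Documentation,
    Generated,
    Vendored,
}

/// One parsed rule: a pattern plus the Linguist attributes set on it.
/// (Hypothetical type, not part of tokei or git-attributes.)
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Rule {
    pub pattern: String,
    pub attrs: Vec<LinguistAttr>,
}

/// Parse `.gitattributes` text, keeping only rules that set at least one
/// of the three Linguist attributes.
pub fn parse_gitattributes(text: &str) -> Vec<Rule> {
    let mut rules = Vec::new();
    for line in text.lines() {
        let line = line.trim();
        // Skip blank lines and comments.
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        let mut fields = line.split_whitespace();
        let pattern = match fields.next() {
            Some(p) => p,
            None => continue,
        };
        let mut attrs = Vec::new();
        for field in fields {
            // `name` and `name=true` count as set; `-name`, `!name`,
            // and `name=<anything else>` do not.
            let name = match field.split_once('=') {
                Some((name, "true")) => name,
                Some(_) => continue,
                None => field,
            };
            let attr = match name {
                "linguist-documentation" => LinguistAttr::Documentation,
                "linguist-generated" => LinguistAttr::Generated,
                "linguist-vendored" => LinguistAttr::Vendored,
                _ => continue,
            };
            attrs.push(attr);
        }
        if !attrs.is_empty() {
            rules.push(Rule { pattern: pattern.to_string(), attrs });
        }
    }
    rules
}
```

A real implementation would still need the matching logic mentioned above to decide which paths each pattern applies to.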

@XAMPPRocky
Owner

XAMPPRocky commented Jun 3, 2022

@DCNick3 To be clear, I do want to support it, but my bandwidth is very limited so it needs to be supported in a way that's not intrusive or likely to rot in a way that leads to more work down the road.

If someone wants to add support using git-attributes, I'd be happy to accept a PR. You'll want to add the check here, because we want to be able to skip entire directories if possible. I would add support for linguist-vendored, linguist-generated, and linguist-documentation. Also, like the VCS-based logic, there needs to be a --no-gitattributes flag to disable it.

tokei/src/utils/fs.rs

Lines 58 to 86 in f5686a0

walker.build_parallel().run(move || {
    let tx = tx.clone();
    Box::new(move |entry| {
        let entry = match entry {
            Ok(entry) => entry,
            Err(error) => {
                use ignore::Error;
                if let Error::WithDepth { err: ref error, .. } = error {
                    if let Error::WithPath {
                        ref path,
                        err: ref error,
                    } = **error
                    {
                        error!("{} reading {}", error, path.display());
                        return Continue;
                    }
                }
                error!("{}", error);
                return Continue;
            }
        };
        if entry.file_type().map_or(false, |ft| ft.is_file()) {
            tx.send(entry).unwrap();
        }
        Continue
    })
});
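As a rough illustration of the kind of check described above, the sketch below decides per entry whether to skip it, so that a matching directory is pruned before its contents are visited. It is std-only and not wired into the walker. `is_excluded`, `Decision`, and `decide` are hypothetical names; the plain directory-prefix comparison stands in for real .gitattributes pattern matching; and `no_gitattributes` mirrors the proposed --no-gitattributes flag, which does not exist yet.

```rust
use std::path::Path;

/// Hypothetical helper: true when `path` (relative to the repository root)
/// is one of the excluded directories or lies inside one.
/// Comparison is by whole path components, so "vendored" does not match "vendor".
pub fn is_excluded(path: &Path, excluded_dirs: &[&str]) -> bool {
    excluded_dirs
        .iter()
        .any(|dir| path.starts_with(dir.trim_end_matches('/')))
}

/// Hypothetical stand-in for the per-entry decision made inside the walker.
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    /// Excluded: do not count the file, do not descend into the directory.
    Skip,
    /// Regular file that should be sent on for counting.
    Count,
    /// Directory that should still be walked.
    Descend,
}

/// Decide what to do with one entry. When `no_gitattributes` is true the
/// exclusion list is ignored entirely.
pub fn decide(
    path: &Path,
    is_dir: bool,
    excluded_dirs: &[&str],
    no_gitattributes: bool,
) -> Decision {
    if !no_gitattributes && is_excluded(path, excluded_dirs) {
        return Decision::Skip;
    }
    if is_dir {
        Decision::Descend
    } else {
        Decision::Count
    }
}
```

Making the decision at the directory level is what allows a whole vendored tree to be skipped rather than filtering its files one by one.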

@mloughran

In my case .gitattributes is only used to mark generated code, and combined with .gitignore covers all required exclusions, so I simply autogenerate a .tokeignore file on demand:

sed 's/ linguist-generated$//' .gitattributes > .tokeignore
tokei .
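For repositories where .gitattributes also carries other attributes, the same workaround could be sketched in Rust along these lines. `tokeignore_from_gitattributes` is a hypothetical helper, not part of tokei. Unlike the sed one-liner, it drops lines that do not set linguist-generated and also accepts the `linguist-generated=true` spelling. It assumes the patterns are usable as ignore patterns unchanged, which may not hold for every .gitattributes pattern.

```rust
/// Build `.tokeignore` contents from `.gitattributes` text: each pattern
/// marked `linguist-generated` (or `linguist-generated=true`) becomes one
/// ignore line. All other lines, including comments, are dropped.
pub fn tokeignore_from_gitattributes(text: &str) -> String {
    let mut out = String::new();
    for line in text.lines() {
        let mut fields = line.split_whitespace();
        // First field is the pattern; skip blank lines and comments.
        let pattern = match fields.next() {
            Some(p) if !p.starts_with('#') => p,
            _ => continue,
        };
        // Remaining fields are attributes; `-linguist-generated` and
        // `linguist-generated=false` deliberately do not match.
        let generated = fields
            .any(|f| f == "linguist-generated" || f == "linguist-generated=true");
        if generated {
            out.push_str(pattern);
            out.push('\n');
        }
    }
    out
}
```

The returned string could then be written out with `std::fs::write(".tokeignore", ...)` before running tokei.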
