Gajus Kuizinas

Posted on Sep 21, 2020

.gitignore mistake that everyone makes

#github #git

.gitignore should be a whitelist, not a blacklist: list the files you want to include rather than chasing every file you want to exclude.

If we look at a bunch of random open-source projects, they all instead try to exclude every known undesirable file, e.g.
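
A hypothetical file of this kind might read as follows. It is assembled from entries named elsewhere in this post for illustration and is not copied from any real project:

```gitignore
# IDE specific
.idea
.vscode
# Tool specific
.nyc_output
.next
# OS specific
.DS_Store
# Dependencies and build output
node_modules
dist
```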

This setup means that whenever a new developer joins the team or a new tool is adopted by someone in a team, you need to update .gitignore configuration. Examples of this include .idea, .vscode (IDE specific), .nyc_output, .next (tool specific), .DS_Store (OS specific).

A better solution is "ignore everything with exclusions". In practice, this means that you ignore all files (or at least all hidden files) by default and add exceptions for the files that the team has agreed to include in the project, e.g.

coverage
dist
node_modules
package-lock.json
*.log
.*
!*/*.babelrc.js
!.dockerignore
!.editorconfig
!.eslintignore
!.eslintrc
!.gitignore
!.gitlab-ci.yml
!.npmignore
!.storybook
!.npmrc
!.prettierignore

In this project we are ignoring all files that start with a dot, but we've added exceptions for the configuration files that belong to the project.

This configuration also ensures that you do not accidentally commit private files, such as keys, which are also conventionally prefixed with a dot.

Adopting this convention will save the back-and-forth over which exclusions should be added to .gitignore.

Top comments (15)

Dave • Sep 22 '20

Colour me confused, but isn't the clue in the title? The file has the word ignore in it, so one would presume that the file contains a list of things to ignore.

My second point of confusion is why anyone would need to add anything to .gitignore when a new developer joins the team. The only example I can think of is that they use a different IDE to everyone else, and they don't want to be pushing metadata for their IDE to VCS.

Since I'm a little confused, I went to check the documentation on git-scm.com. The very first sentence in the description says "A gitignore file specifies intentionally untracked files that Git should ignore." To me, that is the way I've always done it, but you're saying it's a mistake? Did whoever wrote that document also make a mistake?

It seems to me, that the intent of the .gitignore file is indeed to be a blacklist.

Then, of course, comes the discussion about what files we should be staging (and even, what file parts) - source code, lives in VCS, period. If it isn't source code (or documentation tightly coupled with that source code), it has no business being in VCS and shouldn't be staged, let alone committed - or worse still, pushed.

Please do let me know why a new developer or new tool should change the way we use .gitignore files. Examples would be awesome, because I've never seen it (save for, as I said, IDE differences).

Michael de Gans • Sep 22 '20

His approach is probably safer. I accidentally commit files all the time that I would rather have excluded, and that's permanent.

Say I'm working on a video app, and I get it pumping out test video, I say "yay!", and commit my changes, only to discover in horror that I've also committed a bunch of test video. No, it shouldn't happen, but developers are people, we run on caffeine, and at the end of the day sometimes we're not 100%.

This would remedy that. I already do it with Docker to avoid accidental giant images. There are security considerations as well. Search GitHub for passwords sometime if you're curious.

Dave • Sep 22 '20

There's nothing wrong with mistakes (such as committing the wrong thing), git has powerful history rewriting capabilities.

But really, you can add pre-commit hooks prompting you with checklists etc if you keep making mistakes.

My issue is that the author suggests that using a tool in the manner its makers say you should is somehow a mistake.

Michael de Gans • Sep 22 '20

Rewriting history breaks stuff, so I try to avoid that, and checklists are nice but it's still possible to miss stuff. As to what's intended, you are probably right, but I tend to prefer allowlists in general, so this is an approach I might consider in the future. I already do it with Python packaging, Docker and more, so why not?

The syntax is flexible enough to allow either, so I'm not sure there is any wrong way. I suppose it's mostly down to personal taste and philosophy.

kamesh akella • Sep 22 '20

.gitignore should be a whitelist, not a blacklist

I wish you had chosen better words to start this article. We need to bring about change by beginning with the words we use to communicate.

Great article nevertheless.

Michael de Gans • Sep 22 '20

Like someone said on the kernel mailing list, arguments against the language change don't scale. Allowlist and blocklist are perfectly suitable alternatives. The etymology doesn't matter.

Dan Jones • Sep 28 '20

Yikes. No.

This setup means that whenever a new developer joins the team or a new tool is adopted by someone in a team, you need to update .gitignore configuration.

Your suggestion means that whenever you add a new file to the repo, you have to update .gitignore.

Ayan Banerjee • Sep 22 '20

This setup means that whenever a new developer joins the team or a new tool is adopted by someone in a team, you need to update .gitignore configuration. Examples of this include .idea, .vscode (IDE specific), .nyc_output, .next (tool specific), .DS_Store (OS specific).

This is generally better handled with a global gitignore.
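
As a rough sketch, such a per-user excludes file could hold the IDE and OS entries from the quoted list. This assumes the file lives where Git looks for it by default, `~/.config/git/ignore` when `$XDG_CONFIG_HOME` is unset, or at whatever path the `core.excludesFile` setting names:

```gitignore
# Per-user excludes: applies to every repository on this machine.
# Entries are the IDE and OS examples quoted above; adjust to your own tools.
.idea
.vscode
.DS_Store
```

Tool output such as .nyc_output or .next is tied to the project rather than to one developer's machine, so it arguably still belongs in the repository's own .gitignore.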

Michael de Gans • Sep 22 '20

I do this to make sure empty folders are included. You can create a .gitignore that excludes all except itself and stick it in the empty folder you want to keep.

It's not my idea. I found it on Stack Exchange somewhere. It's handy for test data download folders that you want to exist, but for which you don't want the binaries committed.
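
A minimal sketch of that per-folder file, assuming it is saved as .gitignore inside the otherwise empty directory:

```gitignore
# Ignore everything in this directory...
*
# ...except this file, so the directory itself stays in the repository.
!.gitignore
```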

Cliff • Sep 26 '20 • Edited

Your example is still a blacklist with exceptions to broad rules, not a whitelist. A whitelist would block everything that's not explicitly allowed. And a true whitelist would authorize files individually.

That idea feels a bit too much like ClearCase to me. I do agree with your thinking on ignoring all hidden files by default, though. They should only be added explicitly. As for special files for editors and other tools used by individual devs (instead of team tools), developers should really get better about using their global git config and global excludes (gitignore). I also pay attention to things like swap/backup files and configure my editor (Vim, mostly) to not create those files alongside the originals, but in a private directory instead (e.g. ~/.vim/backup, ~/.vim/swap)

Owen Melbourne • Sep 22 '20

Very controversial...

I feel that for many PHP projects, especially those built on Laravel or Symfony, you'd be forever adding hundreds of files to the whitelist, making this file huge and a pain to maintain.

Is there a "type" of project that is best suited to this technique?

Dave • Sep 22 '20 • Edited

If you want to treat it as a whitelist - which in my view you shouldn't, but hey, who am I to judge... you should probably see the pattern format in the documentation.

A trailing "/**" matches everything inside. For example, "abc/**" matches all files inside directory "abc", relative to the location of the .gitignore file, with infinite depth.

So you could, if you wished: !somedir/**
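
For illustration only, a stricter allowlist could be sketched like this, assuming `somedir` is a top-level directory in the repository. Git cannot re-include a file whose parent directory is excluded, so this sketch re-includes the directory itself:

```gitignore
# Ignore every top-level entry...
/*
# ...then re-include only what was agreed on.
!/.gitignore
!/somedir/
```

Because `/*` only matches entries at the top level, nothing inside `somedir` is ignored once the directory is re-included.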

Pacharapol Withayasakpunt • Sep 22 '20

I also like the "inclusive" approach, especially for Docker, where fewer files are usually more desirable. I also use this approach for ESLint.

Include files, rather than ignore

Pacharapol Withayasakpunt ・ Jun 4 ・ 1 min read

#docker #git #eslint

Matti Uusitalo • Sep 22 '20

I have used version control systems that operate this way by default, and that approach has major problems too. One of the best git features in my opinion is that all files are included in the project by default.