DEV Community

naugtur

A Phish on a Fork, no Chips

So you were told that this is the safest way to install a package from GitHub with npm:

"test262": "git+https://github.com/tc39/test262.git#f1ffb1e95fad3468a771a6753595db4e6f55f6bb",

and this is the safest way to install an action:

uses: actions/checkout@47176dbabcf093ccbef4a6689f7c80eb4c7693d6 # v4

That's a very good recommendation.

Because commit hashes are practically globally unique, the maintainer can't change what you'll get (as they could if you used #branch or #tag).
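To make "pinned to a commit hash" concrete, here's a minimal sketch of a check for a git dependency spec. The `isPinnedToCommit` helper is hypothetical (it's not part of npm), and it assumes the ref after `#` is a full 40-character hex hash like in the examples above.

```javascript
// Hypothetical helper, not part of npm: checks whether a git dependency
// spec ends in a full 40-character hexadecimal commit hash.
function isPinnedToCommit(spec) {
  const hashIndex = spec.indexOf("#");
  if (hashIndex === -1) return false; // no ref at all, so you get the default branch
  const ref = spec.slice(hashIndex + 1);
  // A branch or tag name (e.g. "main", "v4") does not match this pattern.
  return /^[0-9a-f]{40}$/i.test(ref);
}

const base = "git+https://github.com/tc39/test262.git";
console.log(isPinnedToCommit(base + "#f1ffb1e95fad3468a771a6753595db4e6f55f6bb")); // true
console.log(isPinnedToCommit(base + "#main")); // false
console.log(isPinnedToCommit(base)); // false
```

Note that this only tells you the reference is a hash. It says nothing about which repository the commit actually lives in.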

There's one thing you need to look out for as you update the commit hashes though.

𐂐 The fork

Both examples above will happily install and work even though the commit hashes you see actually exist in forks I made, not the original repositories.

🐟 The phish

  1. find a popular repository that gets installed as a package by package managers or as a GitHub Action.
  2. fork it and introduce malware, ahem, opinionated improvements
  3. add another commit that looks like a proper version update
  4. find the security-minded users who have been told that it's best to pin the version to the specific commit hash
  5. offer an update to latest, but put your own commit hash in the PR
  6. 🍿

It's a phish on a fork, see?


The rest of this piece explains the problem and offers tools to avoid the risks. Scroll to the bottom to learn what to use.


The full story

Remember how git started moving to sha256 because sha1 had collisions you could generate if you could get your hands on a bunch of compute? (The 40-character hashes in the examples above are still sha1.)

In case you're into tech history rabbit holes:

Anyway..

sha256 has 2^256 possible hashes. That is about 10^77. There are about 10^80 atoms in the observable universe. Git commit hashes are practically globally unique.
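If you'd like to check the arithmetic yourself, here's a quick sketch using BigInt. The 160-bit line is my addition for comparison: it's the size of the 40-character hex hashes shown in the examples earlier.

```javascript
// Size of the hash space for a 256-bit and a 160-bit hash.
const space256 = 2n ** 256n;
const space160 = 2n ** 160n; // 40 hex characters = 160 bits

// Counting decimal digits gives the order of magnitude.
const digits = (n) => n.toString().length;

console.log(digits(space256)); // 78 digits, so roughly 1.16 × 10^77
console.log(digits(space160)); // 49 digits, so roughly 1.46 × 10^48
```

Either way, you won't stumble upon the same hash twice by accident.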

And that's great for a lot of reasons.

I'd like to explore one of these.

GitHub is storing an unbelievable number of git repositories. The more popular ones get forked tens of thousands of times. Imagine copying the entire history and structure of the repository for every fork!

React repo forks and stars

Thanks to the uniqueness of commit IDs, you only ever need to store each one once. Regardless of whether it's in the original or a forked repo, the content of the commit with that specific ID will always be the same.

What a great optimization! Without it, the fork and PR workflow would not have been possible!

That also means if you try to load a commit hash from a repo, GitHub will not differentiate between your repo and a fork when fetching it from the database.

This has caused issues before, like when youtube-dl folks confused everyone into thinking GitHub source code was published to the DMCA repo.

🍟 The chip

As a remediation, GitHub introduced a warning chip in the UI, so that if you go to a repository and put a commit ID from a fork in the URL, you get a hint that something's not right.

For example, this is what the UI shows for the commit from my fork of the checkout action:

https://github.com/actions/checkout/commit/47176dbabcf093ccbef4a6689f7c80eb4c7693d6

git-safe-dependencies on npm

I know calling it a chip is a bit of a stretch but it makes for a nice pun in the article title and hardcore designers are not my target audience.

I don't have to tell you that package installation doesn't have much UI space to work with, which results in no warning there, so you don't get the chip.

Would you like to see a warning? Keep reading.

Avoiding the phish on a fork

You could take every commit ID and put it in the URL for the repository you expected to install from and check whether the warning chip shows. But I've been involved in software security long enough to know you won't.
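For the record, putting that URL together can be sketched like this. `commitCheckUrl` is a hypothetical helper of mine, and it assumes a GitHub URL pinned to a full 40-character commit hash. It only builds the URL; you'd still have to open it and look for the chip.

```javascript
// Hypothetical helper: turns a GitHub git dependency spec into the commit
// URL you would open in a browser to look for the warning chip.
function commitCheckUrl(spec) {
  const pattern = /github\.com[/:]([^/]+)\/([^/#]+?)(?:\.git)?#([0-9a-f]{40})$/i;
  const match = pattern.exec(spec);
  if (!match) return null; // not a GitHub URL pinned to a full commit hash
  const [, owner, repo, sha] = match;
  return `https://github.com/${owner}/${repo}/commit/${sha}`;
}

console.log(
  commitCheckUrl(
    "git+https://github.com/tc39/test262.git#f1ffb1e95fad3468a771a6753595db4e6f55f6bb"
  )
);
// https://github.com/tc39/test262/commit/f1ffb1e95fad3468a771a6753595db4e6f55f6bb
```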

People often lack the patience to review security risk warnings even when they're provided inline, in the PR they're about to approve. I doubt they'd be willing to go out of their way to put together the URL they need to look up.

🍣 A solution that, ehm, scales 🫣

Cringe all you want, I'll squeeze the last drop out of this pun.

So I found out how GitHub decides whether to show the chip, added all the other best practices I've gathered for using git dependencies, and created this tool:

git-safe-dependencies

Credit for the git-safe name goes to Leo

Curious to know what it can do? Are you hooked?

Ok, ok. I'll stop now.


The @lavamoat/git-safe-dependencies validator

What does it do? (as of Jan 2025)

The current version processes your package.json and lockfile with one command and scans workflow YAML files with a separate one.

  • validates that you only directly depend on git repos and actions pinned to a commit ID (GH workflows, package.json direct dependencies)
  • validates that the commit ID belongs to the repository you intended to install from, for both direct and transitive dependencies (lockfile, workflows)
  • makes sure the resolved URL in the lockfile matches package.json (prevents tampering with the lockfile)
  • complains if the git URL is not pointing to GitHub (lockfile)
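To give a feel for the first check in the list above, here's a rough sketch of scanning workflow text for actions that aren't pinned to a commit ID. This is not the tool's actual implementation: `findUnpinnedActions` is a hypothetical name, the parsing is a naive line-by-line regex rather than a YAML parser, and `actions/setup-node@v4` is only there for illustration.

```javascript
// Hypothetical sketch: lists `uses:` entries that are not pinned to a
// full 40-character commit hash. Naive line-based parsing, no YAML parser.
function findUnpinnedActions(workflowText) {
  const findings = [];
  const usesLine = /^\s*(?:-\s*)?uses:\s*["']?([^\s"'#]+)/;
  workflowText.split("\n").forEach((line, index) => {
    const match = usesLine.exec(line);
    if (!match) return;
    const target = match[1];
    // Local actions (./path) and docker:// references are skipped here.
    if (target.startsWith("./") || target.startsWith("docker://")) return;
    const ref = target.includes("@") ? target.split("@").pop() : "";
    if (!/^[0-9a-f]{40}$/i.test(ref)) {
      findings.push({ line: index + 1, target });
    }
  });
  return findings;
}

const workflow = [
  "steps:",
  "  - uses: actions/checkout@47176dbabcf093ccbef4a6689f7c80eb4c7693d6 # v4",
  "  - uses: actions/setup-node@v4",
].join("\n");

console.log(findUnpinnedActions(workflow));
// [ { line: 3, target: 'actions/setup-node@v4' } ]
```

Notice that the checkout line passes this check even though that commit comes from my fork. Pinning alone is not enough, which is why the second check in the list matters.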

What do I want to add?

  • warning if the package name matches a package that exists on npm
  • warning if the git dependency is trying to run scripts on install

It's all free and open source, like all the other protections that we build at LavaMoat.


Git safe friends! 🫡
