Carrie

Posted on Oct 22

New Supply Chain Attack Technique Can Trojanize All Commands (Part 1)

#cybersecurity #opensource

The open-source ecosystem has become a primary target for supply chain attacks due to its widespread use.

Malicious actors often exploit built-in features of open-source packages to automatically distribute and execute harmful code.

They particularly favor two techniques: pre-install scripts that run automatically when installing packages, and seemingly harmless packages that import malicious dependencies.

As these strategies become increasingly recognizable, current security tools and vigilant developers have been able to detect them more quickly. However, one often overlooked but potentially dangerous feature still exists: entry points.

This article explores how attackers can use entry points in multiple programming ecosystems, especially PyPI, to trick victims into running malicious code.

Although this method does not compromise a system as immediately as automatic install scripts or malicious dependencies do, it offers patient attackers a more covert way to infiltrate systems, potentially evading standard security measures.

By understanding this little-known attack vector, we can better defend against the ever-evolving open-source supply chain attacks.

Key Points

1. Entry Points: A powerful feature used to expose package functionality, exploitable in various ecosystems, including PyPI (Python), npm (JavaScript), RubyGems (Ruby), NuGet (.NET), Dart Pub, and Rust crates.

2. Risks: Attackers can use these entry points to execute malicious code when running specific commands, posing a broad risk in the open-source domain.

3. Techniques: Include command hijacking (impersonating popular third-party tools and system commands) and targeting different stages of the development process with malicious plugins and extensions. Each method carries varying degrees of potential success and detection risks.

4. Stealth: Although entry point attacks require user interaction, they give attackers a more covert and persistent way to compromise a system, one that might bypass traditional security checks.

5. This attack vector poses risks to individual developers and enterprises, highlighting the need for more comprehensive Python package security measures.

Understanding Python Entry Points

Entry points are a powerful feature of Python's packaging system, allowing developers to expose specific functionality as CLI commands without users needing to know the exact import path or structure of the package.

Uses of Entry Points

Creating command-line scripts that can be run after the user installs the package.
Defining plugin systems, allowing third-party packages to extend the core package’s functionalities.

The most common type of entry point is console_scripts, which points to a function you want to offer as a command-line tool to users who install your package.

Although designed to enhance modularity and support plugin systems, entry points can become a vector for embedding and executing harmful code when misused by malicious actors. To understand how attackers might exploit Python entry points, we first need to understand how entry points are meant to work.

Defining Entry Points in Package Metadata

The location and format of entry point definitions may vary depending on the package format (wheel or source distribution).

Source Distribution (.tar.gz)

For source distributions, entry points are typically defined in the package’s setup configuration. This can be in the traditional setup.py, setup.cfg, or the more modern packaging approach in pyproject.toml.

Here is an example of defining entry points in setup.py:

from setuptools import setup

setup(
    # ... other setup parameters ...
    entry_points={
        'console_scripts': [
            'my_command=mypackage.module:my_function',
        ],
        'my_package.plugins': [
            'plugin_name=my_package.plugins:PluginClass',
        ],
    },
)

Wheel File (.whl)

In a wheel file (a built package format), entry points are defined in the entry_points.txt file under the .dist-info directory.

Here’s what the entry_points.txt file for the above example might look like:

[console_scripts]
my_command = mypackage.module:my_function

[my_package.plugins]
plugin_name = my_package.plugins:PluginClass
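Because entry_points.txt uses an INI-style layout, it can be inspected with Python's standard configparser module. The following is only a minimal sketch, not necessarily how installers process the file. The helper name read_entry_points is made up for illustration, and SAMPLE is simply the example file shown above.

```python
import configparser

# The example entry_points.txt content from above.
SAMPLE = """\
[console_scripts]
my_command = mypackage.module:my_function

[my_package.plugins]
plugin_name = my_package.plugins:PluginClass
"""


def read_entry_points(text):
    """Map each entry point group to a {name: target} dictionary."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep entry point names case-sensitive
    parser.read_string(text)
    return {section: dict(parser.items(section)) for section in parser.sections()}


if __name__ == "__main__":
    for group, entries in read_entry_points(SAMPLE).items():
        for name, target in entries.items():
            print(f"{group}: {name} -> {target}")
```

Reading this file from a downloaded wheel before installing it is one way to see which commands a package would register.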

The syntax for entry points follows this pattern:

name = package.module:object

name: The name of the entry point (e.g., the command name for a console script).
package.module: The Python module path.
object: The object (function, class, etc.) to use in the module.

In the example above, my_command is a console script created during the installation process. Once the package is installed, when the user types my_command in the terminal, it executes my_function in mypackage.module.

plugin_name is a custom entry point used by my_package to discover plugins. It points to PluginClass in my_package.plugins.
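Conceptually, the launcher created for a console script imports the named module, looks up the named object, and calls it. The sketch below imitates that lookup under simplifying assumptions; real launchers generated by installers differ in detail. The helper name resolve_target is hypothetical, and a standard-library function stands in for mypackage.module:my_function so that the example can run anywhere.

```python
import importlib


def resolve_target(target):
    """Import 'package.module:object' and return the referenced object."""
    module_name, _, attr_path = target.partition(":")
    obj = importlib.import_module(module_name)
    # The object part may be dotted (for example 'Class.method').
    for part in attr_path.split("."):
        if part:
            obj = getattr(obj, part)
    return obj


if __name__ == "__main__":
    # Stand-in target; a real launcher would resolve mypackage.module:my_function.
    func = resolve_target("os.path:basename")
    print(func("/usr/local/bin/my_command"))
```

The point for defenders is that whatever object the entry point names runs with the full privileges of the user who typed the command.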

During package installation, these entry points are recorded in the package metadata. Other packages or tools can then query this metadata to discover and use the defined entry points.
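The recorded metadata can be queried with the standard importlib.metadata module. As a rough sketch, the function below lists every installed entry point in a group together with the distribution that registered it. The function name installed_entry_points is an assumption for this example, and the fallback branch exists only because older Python versions (3.8 and 3.9) return a plain dictionary.

```python
from importlib.metadata import entry_points


def installed_entry_points(group="console_scripts"):
    """Return sorted (name, target, distribution) tuples for one group."""
    eps = entry_points()
    if hasattr(eps, "select"):      # Python 3.10 and later
        selected = eps.select(group=group)
    else:                           # Python 3.8/3.9: dict of group -> list
        selected = eps.get(group, [])
    rows = []
    for ep in selected:
        dist = getattr(ep, "dist", None)
        dist_name = dist.metadata.get("Name", "unknown") if dist is not None else "unknown"
        rows.append((ep.name, ep.value, dist_name))
    return sorted(rows)


if __name__ == "__main__":
    for name, target, dist_name in installed_entry_points():
        print(f"{name:25} {target:45} ({dist_name})")
```

Running it in a given environment shows which package owns each command name, which is the information needed to spot an unexpected owner.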

If attackers can manipulate the metadata of legitimate packages or convince users to install malicious packages, they can execute arbitrary code on the user's system whenever the defined commands or plugins are invoked. The next sections describe several methods attackers might use to trick users into executing malicious code through entry points.

Understanding CLI Commands in Operating Systems

Command-line interface (CLI) commands are the primary way users interact with the operating system through a text-based interface. These commands are interpreted and executed by the shell, which acts as an intermediary between the user and the operating system.

When a user enters a command, the shell follows a specific resolution mechanism to locate and execute the corresponding program. The exact sequence varies slightly between shells, but typically the shell first checks its own aliases, functions, and built-in commands; for anything else, it searches the directories listed in the PATH environment variable in order and runs the first matching executable it finds. Users can view the current PATH by running echo $PATH in the terminal (on Windows, echo %PATH% in Command Prompt or $env:PATH in PowerShell), which displays the list of directories the shell searches for executables.
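The PATH lookup described above can be sketched in a few lines of Python. This is a simplified illustration, not a faithful reimplementation of any shell: it ignores aliases, shell functions, built-ins, and the Windows PATHEXT variable. The function name find_on_path is hypothetical.

```python
import os


def find_on_path(command, path=None):
    """Return every executable named `command`, in PATH search order."""
    if path is None:
        path = os.environ.get("PATH", "")
    matches = []
    for directory in path.split(os.pathsep):
        # An empty PATH entry conventionally means the current directory.
        candidate = os.path.join(directory or os.curdir, command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append(candidate)
    return matches


if __name__ == "__main__":
    # The first element is what the shell would run; later ones are shadowed.
    for position, location in enumerate(find_on_path("python3")):
        print(position, location)
```

If a command name appears more than once in the result, only the first match ever runs, which is exactly the property an attacker relies on.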

Understanding this resolution process is crucial when considering how Python entry points (which can create new commands) might interact with or potentially interfere with existing system commands.

How Attackers Can Abuse Entry Points to Execute Malicious Code

Malicious actors can exploit Python entry points in several ways to trick users into executing harmful code. We will explore some strategies, including command hijacking, malicious plugins, and malicious extensions.

Command Hijacking

Impersonating Popular Third-Party Commands

Malicious packages can use entry points to impersonate widely used third-party tools. This strategy is particularly effective against developers who frequently use these tools in their workflows.

For example, attackers might create a package with a malicious aws entry point. When an unsuspecting developer (who frequently uses AWS services) installs this package and later executes the aws command, the fake aws command could steal their AWS access keys and secrets. This attack could be especially devastating in CI/CD environments where AWS credentials are often stored for automated deployments – potentially giving attackers access to entire cloud infrastructures.

Another example might involve a malicious package impersonating the docker command, targeting developers who work with containerized applications. The fake docker command might secretly send images or container specifications to the attacker’s server during builds or deployments. In microservices architectures, this could expose sensitive service configurations or lead to the theft of proprietary container images.

Other popular third-party commands that might be targeted include, but are not limited to:

npm (Node.js package manager)
pip (Python package installer)
git (version control system)
kubectl (Kubernetes command-line tool)
terraform (infrastructure as code tool)
gcloud (Google Cloud command-line interface)
heroku (Heroku command-line interface)
dotnet (.NET Core command-line interface)

These commands are widely used across various development environments, making them tempting targets for attackers seeking to maximize the impact of their malicious packages.
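One defensive idea that follows from this list is to check whether any installed package registers a console script under one of these well-known names. The sketch below assumes you already have (command, distribution) pairs, for example gathered from package metadata. The allowlist EXPECTED_PROVIDERS and the package names in the demo are illustrative assumptions, not an authoritative mapping.

```python
# Command names taken from the examples in this article.
WATCHLIST = {"aws", "docker", "npm", "pip", "git", "kubectl",
             "terraform", "gcloud", "heroku", "dotnet"}

# Hypothetical allowlist: distributions you expect to provide a watched command.
EXPECTED_PROVIDERS = {"pip": {"pip"}}


def flag_suspicious(scripts, watchlist=WATCHLIST, expected=EXPECTED_PROVIDERS):
    """Return (command, distribution) pairs where a watched command name is
    registered by a distribution that is not on the allowlist."""
    flagged = []
    for command, distribution in scripts:
        allowed = expected.get(command, set())
        if command in watchlist and distribution.lower() not in allowed:
            flagged.append((command, distribution))
    return flagged


if __name__ == "__main__":
    # Made-up sample data; "totally-harmless-utils" is not a real finding.
    sample = [("aws", "totally-harmless-utils"), ("pip", "pip"), ("black", "black")]
    for command, distribution in flag_suspicious(sample):
        print(f"'{command}' is registered by unexpected package '{distribution}'")
```

A flagged entry is a prompt for manual review rather than proof of malice, since the allowlist has to be maintained for each environment.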

Impersonating System Commands

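By using common system command names as entry points, attackers can impersonate basic system tools. Commands like touch, curl, ls, and mkdir could all be hijacked, potentially leading to severe security vulnerabilities when users try to use these basic tools. Shell built-ins such as cd are a different case: the shell handles them itself without searching PATH, so an entry point with that name would normally not take their place.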

While impersonating system commands may give attackers the best chance that a victim eventually runs the malicious code by accident, it also carries the highest risk of failure, because its success depends primarily on the order of the directories in PATH. If the directory containing the malicious entry point appears earlier in PATH than the system directories, the malicious command runs instead of the system command. This scenario is more likely in development environments where local package directories are prioritized, such as an activated virtual environment, which places its own scripts directory at the front of PATH.

Another point to remember is that globally installed packages (requiring root/admin permissions) can shadow or overwrite system commands for all users, while packages installed for a single user affect only that user's environment.
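Whether a given environment is exposed depends on where its scripts directory sits in PATH relative to the system directories. The following sketch checks that ordering under stated assumptions: SYSTEM_DIRS reflects a typical POSIX layout and will not match every system, and the function name shadows_system_dirs is made up for this example.

```python
import os
import sysconfig

# Assumed POSIX system directories; adjust for the platform being audited.
SYSTEM_DIRS = ("/usr/bin", "/bin", "/usr/sbin", "/sbin")


def shadows_system_dirs(scripts_dir, path, system_dirs=SYSTEM_DIRS):
    """True if scripts_dir is searched before every system directory in path."""
    entries = [os.path.normpath(p) for p in path.split(os.pathsep) if p]
    target = os.path.normpath(scripts_dir)
    if target not in entries:
        return False
    system = [os.path.normpath(d) for d in system_dirs]
    positions = [entries.index(d) for d in system if d in entries]
    return bool(positions) and entries.index(target) < min(positions)


if __name__ == "__main__":
    # Directory where the current interpreter installs console scripts.
    scripts_dir = sysconfig.get_path("scripts")
    exposed = shadows_system_dirs(scripts_dir, os.environ.get("PATH", ""))
    print(f"{scripts_dir} precedes system directories: {exposed}")
```

A True result means a console script installed into that directory with a name like ls would be found before the real system binary.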

To be continued...

This article is written by Duyan Intelligence.

I'm Carrie, a cybersecurity engineer and writer, working for SafeLine Team. SafeLine is an open source web application firewall, self-hosted, very easy to use.
