Cloud Supply Chain: when your NPM or PyPI dependency points to an attacker server

The pattern repeats: a team speeds up deliveries, bumps versions frequently, and trusts that “npm install” or “pip install” are harmless operations. In the cloud, that assumption is dangerous. A dependency can execute code in the pipeline, exfiltrate secrets and, above all, try to reach the instance or runner metadata service to obtain temporary credentials and move laterally within the account.

The problem is not only “installing a vulnerable package”. It is allowing your supply chain to point to a server controlled by an attacker: alternative registries, dependencies with direct URLs, typosquatting, or packages that download additional payloads from outside NPM/PyPI. If your CI/CD has Internet egress and credentials with real permissions, the impact stops being theoretical.

What went wrong: the dependency that turns install into an exfiltration channel

The typical failure is not a sophisticated RCE. It is a “practical” decision in a PR: adding an NPM or PyPI dependency that, directly or indirectly, includes an unverified source. In Node.js this is often seen with entries in package.json that point to git+https, to a tarball at a URL, or with lifecycle scripts (for example, preinstall, install, postinstall) that execute code when installing.
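The two patterns described above (dependencies that resolve outside the registry, and install-time lifecycle scripts) can be flagged mechanically. The following is a minimal sketch, not a complete scanner: the function name audit_manifest, the prefix list, and the sample manifest are illustrative assumptions, and the check applies to whichever package.json you feed it (your own or one pulled from a dependency).

```python
import json

# Lifecycle hooks that npm runs automatically during install.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")
# Specifier prefixes that bypass the registry (illustrative, not exhaustive).
NON_REGISTRY_PREFIXES = ("git+", "git://", "http://", "https://", "github:", "file:")
DEPENDENCY_SECTIONS = ("dependencies", "devDependencies", "optionalDependencies")


def audit_manifest(manifest_text: str) -> list[str]:
    """Return human-readable findings for one package.json document."""
    manifest = json.loads(manifest_text)
    findings = []
    for section in DEPENDENCY_SECTIONS:
        for name, spec in manifest.get(section, {}).items():
            if str(spec).startswith(NON_REGISTRY_PREFIXES):
                findings.append(f"{section}: {name} resolves outside the registry ({spec})")
    for hook in INSTALL_HOOKS:
        if hook in manifest.get("scripts", {}):
            findings.append(f"scripts: {hook} runs at install time")
    return findings


# Hypothetical manifest used only to exercise the function.
sample_manifest = json.dumps({
    "dependencies": {
        "left-pad": "^1.3.0",
        "helper": "git+https://example.invalid/helper.git",
    },
    "scripts": {"postinstall": "node setup.js"},
})

for finding in audit_manifest(sample_manifest):
    print(finding)
```

A finding is a prompt for review, not proof of malice: plenty of legitimate packages compile native code in a postinstall hook.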

In Python, the equivalent shows up when installation runs build logic (a source distribution executes its setup.py or build backend on install, whereas a prebuilt wheel does not run package code at install time) or when the package makes network calls as soon as it is imported. The practical outcome in the enterprise is the same: the pipeline executes code that did not go through real review, with network access and, often, with implicit access to credentials (CI environment variables, tokens, or the runner’s temporary cloud credentials).

The most damaging consequence in the cloud appears when that code tries to talk to the metadata service (the link-local endpoint, 169.254.169.254 on AWS, Azure, and Google Cloud) to obtain temporary credentials associated with the instance/runner role. Many organizations assume that “that only exists in production”, but self-hosted runners, build agents on VMs, and Kubernetes nodes often have exactly that kind of identity.

How a malicious package.json looks for credentials in metadata services (and why it works in CI)

An attacker does not need your application to be running; it is enough that the code executes during install. In Node.js, lifecycle scripts allow running commands in the build environment. A malicious package can try to access the metadata endpoint from the runner and, if it succeeds, obtain temporary credentials that it then exfiltrates to a server controlled by the attacker.

This works because in many pipelines there are three conditions at the same time: unrestricted Internet egress, a cloud identity with permissions (sometimes excessive), and no barriers between the build process and internal resources. In an internal investigation, what often surprises people is how trivial the “attacker step” is: a simple HTTP call to the metadata endpoint followed by a POST to an external domain.

Technical signal: access to link-local endpoints during install

In network or egress logs, seeing connections to the link-local range where cloud metadata endpoints live (169.254.0.0/16) during the dependencies phase usually indicates that something in the pipeline is trying to enumerate credentials. In practice, this signal is more useful than looking for “malicious strings” inside a package, because the attacker can obfuscate the code, but cannot avoid talking to the metadata service if their goal is to steal the runner’s cloud identity.
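As a rough illustration of that signal, the sketch below filters egress records for link-local destinations. The log format (timestamp, pipeline phase, destination IP, separated by whitespace) is an assumption made for the example; real flow logs or proxy logs will need their own parsing.

```python
import ipaddress

# IPv4 link-local range; cloud metadata endpoints sit inside it.
LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")


def metadata_hits(log_lines):
    """Return (phase, destination) pairs for records that target link-local IPs."""
    hits = []
    for line in log_lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # not a record in the assumed format
        phase, dest = parts[1], parts[2]
        try:
            address = ipaddress.ip_address(dest)
        except ValueError:
            continue  # destination is a hostname or garbage, skip it
        if address in LINK_LOCAL:
            hits.append((phase, dest))
    return hits


# Hypothetical records in the assumed "timestamp phase destination" format.
sample_log = [
    "2024-01-01T10:00:00Z install 10.0.0.5",
    "2024-01-01T10:00:02Z install 169.254.169.254",
    "2024-01-01T10:00:09Z test 203.0.113.7",
]

for phase, dest in metadata_hits(sample_log):
    print(f"link-local access during {phase}: {dest}")
```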

Real impact: temporary credentials with enough permissions to pivot

In enterprises, roles assigned to runners often have permissions to read artifacts, publish to internal registries, assume other roles, or interact with deployment services. A temporary credential theft here is not “just a token”: it can end in access to repositories, manipulation of signed artifacts, or deployment of altered versions.

Early signals in enterprise: when the supply chain is already pointing outside

Before seeing an incident, there are usually signals in the repository itself. A very common example: dependencies declared as direct URLs or Git repositories instead of packages pinned to a corporate registry. In Node.js, additionally, the presence of scripts in new or little-used packages should raise suspicion, especially when they appear along with changes that “only update dependencies”.

Another signal is operational: builds that suddenly start taking longer and show additional downloads from unusual domains. This happens when a package incorporates “download on install” logic to fetch binaries or payloads. In corporate environments with a proxy, these downloads are sometimes allowlisted as “temporary” exceptions so the build passes, and that exception stays forever.

What I review in dependency PRs

If the pipeline does not have strong guardrails, human review is the last barrier. In practice, it is worth reviewing whether non-standard sources (tarballs/URLs) are introduced, whether installation scripts are enabled, and whether the lockfile changes massively without justification. A huge lockfile is not suspicious by itself, but it is when it drags in “new” packages that are not related to the functional change.
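Part of that lockfile review can be automated. The sketch below assumes an npm lockfile with a top-level "packages" map (lockfileVersion 2 or 3) and lists every entry whose "resolved" URL points at a host other than the one you expect; the function name and hostnames are hypothetical.

```python
import json
from urllib.parse import urlparse


def foreign_resolutions(lock_text: str, allowed_host: str) -> dict[str, str]:
    """Map lockfile entries to their resolved URL when the host is not allowed."""
    lock = json.loads(lock_text)
    findings = {}
    for path, entry in lock.get("packages", {}).items():
        resolved = entry.get("resolved")
        if not resolved:
            continue  # the root project and local links carry no "resolved" URL
        if urlparse(resolved).hostname != allowed_host:
            findings[path] = resolved
    return findings


# Hypothetical lockfile content used only to exercise the function.
sample_lock = json.dumps({
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "app"},
        "node_modules/a": {"resolved": "https://npm.corp.example/a/-/a-1.0.0.tgz"},
        "node_modules/b": {"resolved": "https://cdn.example.invalid/b-1.0.0.tgz"},
    },
})

for path, url in foreign_resolutions(sample_lock, "npm.corp.example").items():
    print(f"{path} resolves to {url}")
```

Running this against the lockfile in a PR makes the "new packages from new hosts" case visible even when the diff is thousands of lines long.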

What I look for in runner network observability

In mature environments, the runner (or its VPC/subnet) has egress logs. There I look for newly seen domains, connections to link-local endpoints during build phases, and POST/PUT patterns to external hosts right after installing dependencies. It is not foolproof, but it reduces detection time from days to minutes.

How to do it in practice: block unverified sources in CI/CD without breaking delivery

The most effective measure is to treat dependency consumption as a controlled channel, not as “open Internet”. In enterprise, this means: an internal registry (or proxy) as the only allowed source, policies that prevent direct resolutions to public NPM/PyPI except for approved cases, and pipelines that fail if they detect dependencies outside policy.

For Node.js, the most immediate control is to configure the corporate registry and forbid installs from arbitrary URLs. For Python, beyond setting index-url and extra-index-url, you must be strict about which indexes are allowed at all. The goal is not only to “cache” dependencies; it is to prevent the manifest from pointing to an attacker server.
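On the Python side, a requirements file can override the index or pull from a URL on its own, regardless of what the runner is configured to use. The following heuristic sketch flags both cases; the function name and sample file are assumptions, and the "://" test is deliberately crude (it catches direct references and VCS URLs, but it is not a full requirements parser).

```python
# Options that change where pip looks for packages.
INDEX_OPTIONS = {"-i", "--index-url", "--extra-index-url", "-f", "--find-links"}


def audit_requirements(text: str) -> list[tuple[int, str, str]]:
    """Return (line number, reason, line) for entries that bypass the default index."""
    findings = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split(" #", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("#"):
            continue
        option = line.split()[0].split("=", 1)[0]
        if option in INDEX_OPTIONS:
            findings.append((number, "index override", line))
        elif "://" in line:
            findings.append((number, "direct URL or VCS source", line))
    return findings


# Hypothetical requirements file used only to exercise the function.
sample_requirements = """\
requests==2.31.0
--extra-index-url https://pypi.example.invalid/simple
helper @ https://example.invalid/helper-1.0.tar.gz
git+https://example.invalid/tool.git#egg=tool
"""

for number, reason, line in audit_requirements(sample_requirements):
    print(f"line {number}: {reason}: {line}")
```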

Configure NPM to use only your registry (and validate that it is enforced)

Concrete action: define a .npmrc in the repository or in the runner base image with registry= pointing to the corporate registry (or a controlled proxy). Then, in the pipeline, add a check that fails if the effective registry is not the expected one (for example, running npm config get registry and comparing it). This prevents a job or script from overwriting the registry “as convenient”.
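A minimal sketch of that pipeline check, assuming a hypothetical corporate registry URL. The comparison is kept separate from the npm call so it can be tested without npm installed.

```python
import subprocess

# Hypothetical value; replace with your corporate registry or proxy.
EXPECTED_REGISTRY = "https://npm.corp.example/"


def registry_matches(effective: str, expected: str = EXPECTED_REGISTRY) -> bool:
    """Compare registries, ignoring surrounding whitespace and a trailing slash."""
    return effective.strip().rstrip("/") == expected.strip().rstrip("/")


def effective_npm_registry() -> str:
    """Ask npm which default registry it would actually use in this environment."""
    result = subprocess.run(
        ["npm", "config", "get", "registry"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print(registry_matches("https://npm.corp.example"))
print(registry_matches("https://registry.npmjs.org/"))
```

In a pipeline step you would call registry_matches(effective_npm_registry()) and exit non-zero when it returns False. Note that this covers only the default registry: scoped entries such as @scope:registry in .npmrc are separate settings and need their own check.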

Configure pip for a single allowed index (and break builds that use another)

Concrete action: set in pip.conf (or equivalent variables) the index-url of the corporate index and avoid extra-index-url except for documented exceptions. In the pipeline, validate the effective configuration with pip config debug and make the job fail if it detects unapproved indexes. In environments with many teams, this automated validation reduces dependency “shadow IT”.
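As a complement to inspecting pip config debug output by eye, the sketch below checks the text of a single pip.conf against an allowed index. The index URL and function name are hypothetical, and this looks at one file only: pip merges several files plus environment variables such as PIP_INDEX_URL and PIP_EXTRA_INDEX_URL, so a real check should cover those too.

```python
import configparser

# Hypothetical value; replace with your corporate index.
ALLOWED_INDEX = "https://pypi.corp.example/simple"


def pip_conf_violations(conf_text: str, allowed_index: str = ALLOWED_INDEX) -> list[str]:
    """Return policy violations found in the text of one pip.conf file."""
    parser = configparser.RawConfigParser()  # raw: URLs may contain '%'
    parser.read_string(conf_text)
    violations = []
    index_seen = False
    for section in parser.sections():
        for key, value in parser.items(section):
            urls = value.split()  # multi-line values hold one URL per line
            if key == "index-url":
                index_seen = True
                for url in urls:
                    if url.rstrip("/") != allowed_index.rstrip("/"):
                        violations.append(f"[{section}] index-url = {url}")
            elif key in ("extra-index-url", "find-links"):
                violations.extend(f"[{section}] {key} = {url}" for url in urls)
    if not index_seen:
        violations.append("index-url is not set, so pip falls back to its default index")
    return violations


# Hypothetical configuration used only to exercise the function.
sample_conf = """\
[global]
index-url = https://pypi.corp.example/simple
extra-index-url = https://pypi.org/simple
"""

for violation in pip_conf_violations(sample_conf):
    print(violation)
```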

If your organization uses self-hosted runners, complement the above with egress control: the runner should be able to talk to the corporate registry and little else during the dependencies phase. This is not theory; it is what stops a malicious package that does slip in from exfiltrating data or downloading payloads from the attacker’s infrastructure.

Runner policies and identity: block access to metadata and reduce the value of what is stolen

Even if you block sources, assume that one day something will happen. The second pillar is that the CI/CD environment is not an “easy hop” to cloud credentials. In cloud, the attacker’s goal is usually the metadata service because it delivers temporary credentials associated with the instance/runner identity. If that identity exists and has permissions, the attacker wants it.

Two practical measures: reduce or eliminate access to metadata from jobs, and give the runner identity minimal, traceable permissions. In AWS, for example, many organizations move to IMDSv2 and restrict access to the metadata endpoint. They also migrate to job-specific ephemeral identities (when the CI model supports it) instead of broad roles tied to a “multi-tenant” VM.

Operational validation: beyond “configuring”, you have to test. A controlled test on the runner that tries to access the metadata endpoint must fail or require a token, and the permissions associated with the runner role must be reviewed as if they were production permissions. In internal audits, it is common to find runners with inherited permissions “for convenience” to unblock urgent deployments.
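One way to run that controlled test is a small probe executed as a job on the runner itself. The sketch below assumes the AWS-style metadata path and interprets the result conservatively: an unauthenticated request that gets HTTP 401 is what an instance enforcing IMDSv2 returns, while no response at all suggests the endpoint is unreachable from the job. The function names and verdict labels are illustrative.

```python
import urllib.error
import urllib.request

# AWS-style instance metadata path; other clouds use different paths and headers.
METADATA_URL = "http://169.254.169.254/latest/meta-data/"


def classify_probe(status):
    """Turn a probe result (HTTP status, or None when unreachable) into a verdict."""
    if status is None:
        return "blocked"
    if status == 401:
        return "token-required"
    if 200 <= status < 300:
        return "open"
    return "inconclusive"


def probe_metadata(url: str = METADATA_URL, timeout: float = 2.0):
    """Send one unauthenticated GET, bypassing any proxy set in the environment."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the endpoint answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # no route, connection refused, or timeout


print(classify_probe(401))
print(classify_probe(None))
```

On a runner, the job would evaluate classify_probe(probe_metadata()) and fail when the verdict is "open". Treat "inconclusive" as a reason to look closer rather than as a pass.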

What I review in AWS to ensure the runner is not an easy target

I review metadata configuration on instances/templates (require tokens, hop limits, etc.) and confirm there are no unexpected routes that allow access from containers or untrusted processes. Then, I inspect the associated role: attached policies, permissions to assume other roles, and access to secrets. If the runner can broadly read secrets, the attacker does not even need metadata: it is enough to run the AWS CLI from the job and enumerate.
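The role inspection can start from a simple pass over each attached policy document. The sketch below takes an IAM policy as a Python dict and lists Allow statements whose actions are wildcards or touch role assumption, secrets, or IAM itself. The prefix list is an illustrative assumption, not a complete risk model, and NotAction and Condition blocks are ignored.

```python
# Action prefixes worth a manual look on a CI runner role (illustrative list).
RISKY_PREFIXES = ("sts:assumerole", "secretsmanager:", "ssm:getparameter", "iam:")


def _as_list(value):
    """IAM allows a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def risky_statements(policy: dict) -> list[tuple[str, object]]:
    """Return (action, resource) pairs for Allow statements that look too broad."""
    findings = []
    for statement in _as_list(policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        for action in _as_list(statement.get("Action")):
            lowered = action.lower()  # IAM action names are case-insensitive
            if lowered == "*" or lowered.endswith(":*") or lowered.startswith(RISKY_PREFIXES):
                findings.append((action, statement.get("Resource")))
    return findings


# Hypothetical policy document used only to exercise the function.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::artifacts/*"},
        {"Effect": "Allow", "Action": ["sts:AssumeRole", "secretsmanager:GetSecretValue"], "Resource": "*"},
    ],
}

for action, resource in risky_statements(sample_policy):
    print(f"review: {action} on {resource}")
```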

What evidence I leave to be able to demonstrate compliance

It is not enough to “say it is blocked”. I leave reproducible evidence: versioned templates/infra configuration (IaC), egress logs showing that during builds you only talk to the allowed registry, and a verification (control) job that fails if the effective registry changes or if there is connectivity to unapproved destinations. This reduces arguments when a team asks to “open the Internet for a moment”.

Recommendations for corporate environments

If your NPM or PyPI dependency can point to an attacker server, the problem is not the specific library: it is that the pipeline has the freedom to execute and communicate out of control. A package.json or dependency configuration can turn install into a channel to look for credentials in metadata services and exfiltrate them with very little noise.

The measures that most often work in enterprise combine two things: source control (corporate registry as the only path, pipeline validation, restricted egress) and impact reduction (runner with minimal identity, no easy access to metadata, with permissions reviewed as critical assets). When both are well applied, supply chain stops being an open door and becomes a governed process.
