Introducing SCA reachability analysis for Python, Go, and C#

90% of the code in modern applications is open source, yet only 12% of that code is actually used. Reachability analysis lets us prioritize the risks that can actually impact our applications.

Ron Harnik

Here’s how open source software (OSS) risk management typically works with your traditional software composition analysis (SCA) tool:

  1. Developers pull in tons of OSS dependencies, which is why 90% of modern applications’ code ends up being open source. You may have visibility into the direct dependencies they are pulling in, but you likely miss the transitive dependencies that those direct dependencies automatically bring in (usually 75x more).
  2. Your SCA tool scans the manifest files and compares them to a known vulnerability database, spitting out 1000s of CVEs. The only prioritization you can do is by severity or other external factors.
  3. You send that list of CVEs over to engineering, and the litigation begins. Are we actually using that function? Is this dependency only used in a test environment? Is this really exploitable?
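
The manifest-matching step in item 2 can be sketched roughly as below. This is a minimal illustration and not how any particular SCA tool is implemented: the `KNOWN_VULNERABLE` table, the advisory ID, and the `scan_manifest` helper are all hypothetical, and the manifest is assumed to use simple `name==version` pins.

```python
# Hypothetical advisory database: (package, version) -> advisory IDs.
# Real tools query a vulnerability database and match version ranges.
KNOWN_VULNERABLE = {
    ("django", "3.2.5"): ["EXAMPLE-SQLI-0001"],
}


def scan_manifest(lines):
    """Return (package, version, advisory_id) for every pinned dependency
    that appears in the advisory table."""
    findings = []
    for line in lines:
        line = line.strip()
        # Skip blanks, comments, and anything that is not an exact pin.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        key = (name.strip().lower(), version.strip())
        for advisory_id in KNOWN_VULNERABLE.get(key, []):
            findings.append((key[0], key[1], advisory_id))
    return findings


manifest = [
    "# requirements.txt style pins (assumed format)",
    "Django==3.2.5",
    "requests==2.31.0",
]
findings = scan_manifest(manifest)
print(findings)
```

Note what the sketch never looks at: the application's own code. A match on name and version says the package is present, not that the vulnerable function is ever called, which is why the questions in item 3 come up.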

Here’s how OSS risk management should work:

  1. Intervene early by helping developers select the best OSS dependencies based on holistic risk scores that consider the code quality, popularity, activity, and security of the projects and their related packages. You can also translate your risk tolerance into policy-as-code with the assistance of IDE plugins, pull-request comments, and more. This leaves you with less technical and security debt to clean up later.
  2. Combine manifest scanning with static analysis that maps how all of this OSS code is actually used within your applications. The output is the same list of 1000s of risks, but with the ability to prioritize the ones that are reachable and exploitable. And since research has shown that only 12% of the code within these OSS packages is typically used by your application, you can easily cut down the vulnerabilities that require remediation by 80%.
  3. The list you send to engineering contains only the truly critical and urgent issues that create most of the risk, with exact evidence of your findings within their code path. No lengthy investigations or litigation meetings required.
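
As a rough sketch of the prioritization described in the list above, assume each finding already carries a severity score and a reachability flag produced by static analysis. The `Finding` fields, the advisory IDs, and the scores are made up for illustration; a real platform will weigh more signals than these two.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    advisory_id: str  # hypothetical identifier
    package: str      # hypothetical package name
    severity: float   # assumed CVSS-style score, 0.0 to 10.0
    reachable: bool   # assumed output of static analysis


def prioritize(findings):
    """Reachable findings first; within each group, highest severity first."""
    return sorted(findings, key=lambda f: (not f.reachable, -f.severity))


findings = [
    Finding("EXAMPLE-1", "pkg-a", 9.8, False),
    Finding("EXAMPLE-2", "pkg-b", 7.5, True),
    Finding("EXAMPLE-3", "pkg-c", 5.3, True),
    Finding("EXAMPLE-4", "pkg-d", 8.1, False),
]

ordered = prioritize(findings)
urgent = [f.advisory_id for f in ordered if f.reachable]
print([f.advisory_id for f in ordered])
print(urgent)
```

Sorting by severity alone would put `EXAMPLE-1` at the top even though nothing in the application can reach it. With the reachability flag as the first sort key, the two reachable findings lead the list and the rest stay visible for later cleanup.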

The key to all of this is reachability analysis - the ability to learn if a direct or transitive dependency is actually being used by the application. If you want to learn exactly how reachability analysis works, please check out the “Reachability Analysis 101” session from LeanAppSec.

The short version is this: Reachability analysis takes vulnerability prioritization to the next level. By using call graphs to show relationships between software functions, you can understand open source library vulnerabilities in their real-world context, i.e. are downstream dependencies actually using them?

By linking call graphs across dependencies, you can trace forward through your application code to see if an attacker could potentially access a given software flaw. Tracing backwards shows what code calls a given function, allowing you to assess operational impacts of changes.
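
Forward tracing can be illustrated with a small breadth-first search over a call graph. This is a sketch under simplifying assumptions, not the analysis Endor Labs runs: the graph is a hand-written dictionary, and every function name in it is hypothetical. Building an accurate call graph from real code is the hard part and is not shown here.

```python
from collections import deque

# Hypothetical call graph linked across an application and two dependencies:
# caller -> list of functions it calls.
CALL_GRAPH = {
    "app.views.report": ["cachelib.cached_query", "app.utils.format_rows"],
    "app.utils.format_rows": [],
    "cachelib.cached_query": ["ormlib.run_query"],
    "ormlib.run_query": ["ormlib.vulnerable_helper"],
    "ormlib.unused_helper": ["ormlib.vulnerable_helper"],
}


def find_call_path(graph, entry_point, target):
    """Breadth-first search from entry_point.

    Returns the shortest call path ending at target, or None if the target
    cannot be reached from the entry point.
    """
    queue = deque([[entry_point]])
    visited = {entry_point}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for callee in graph.get(path[-1], []):
            if callee not in visited:
                visited.add(callee)
                queue.append(path + [callee])
    return None


path = find_call_path(CALL_GRAPH, "app.views.report", "ormlib.vulnerable_helper")
print(" -> ".join(path))
```

The returned path is the evidence a developer needs: it names each hop from application code to the vulnerable function. A finding whose target yields `None` from every entry point is a candidate for de-prioritization.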

This answers questions like:

  • Is my code actually invoking this library and the vulnerable code within it?
  • What parts of my codebase would be affected if we remove or update this dependency?
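
The second question is a backward trace: invert the call graph and collect everything that directly or transitively calls into the dependency. The sketch below assumes a hand-written graph with hypothetical function names (`yamllib.parse` stands in for some third-party function) and is only meant to show the shape of the query.

```python
# Hypothetical call graph: caller -> list of functions it calls.
CALL_GRAPH = {
    "app.api.upload": ["app.config.load"],
    "app.cli.main": ["app.config.load"],
    "app.config.load": ["yamllib.parse"],
    "app.api.health": [],
}


def invert(graph):
    """Build the reverse mapping: callee -> set of direct callers."""
    callers = {}
    for caller, callees in graph.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)
    return callers


def affected_by(graph, function):
    """Return every function that directly or transitively calls `function`."""
    callers = invert(graph)
    affected = set()
    stack = [function]
    while stack:
        current = stack.pop()
        for caller in callers.get(current, ()):
            if caller not in affected:
                affected.add(caller)
                stack.append(caller)
    return affected


impacted = affected_by(CALL_GRAPH, "yamllib.parse")
print(sorted(impacted))
```

Here `app.api.health` does not appear in the result, so a change to the dependency should not affect it, while the upload endpoint and the CLI would both need retesting.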

The precision of call graph-based analysis lets you smoke out the vulnerability scanner findings that pose minimal risk to your product or network. It also enables you to evaluate the safety of library upgrades and the elimination of unnecessary dependencies.

Reachability with Endor Labs

Vulnerability prioritization with reachability analysis across both direct and transitive dependencies is already available on Endor Labs for languages such as Java and Rust, and we’re excited to now extend that support to Python, C#, and Go! Each of these languages and ecosystems works and behaves differently, but the concepts of reachability analysis are the same, as you can see in this Python example:

Django 3.2.5 contains a SQL injection vulnerability, and the application includes Django 3.2.5. In a typical application, this would be one of dozens or hundreds of findings, each of which would need to be reviewed to determine how to prioritize fixes. Our platform now clearly reports several important pieces of information to help developers:

  1. There is a SQL injection vulnerability in Django 3.2.5
  2. Django 3.2.5 is a dependency pulled in by django-cachalot 2.4.0
  3. There is a call path from the application through django-cachalot and Django which exposes the vulnerability to users (full details on the right side of the image)
  4. The vulnerability was fixed in Django 3.2.13
  5. The issue can be remediated by upgrading to a fixed version of Django (3.2.13 or newer)
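
One detail worth getting right when checking "3.2.13 or newer" is that version numbers must be compared numerically, not as strings. The sketch below assumes plain dotted numeric versions such as those in this example; `parse_version` and `is_fixed` are hypothetical helpers, and for real-world version strings (pre-releases, epochs, and so on) the `packaging` library's `Version` class is the more robust choice.

```python
def parse_version(text):
    """Turn '3.2.13' into (3, 2, 13) so components compare as integers.

    Assumes plain dotted numeric versions only.
    """
    return tuple(int(part) for part in text.split("."))


def is_fixed(installed, fixed_in):
    """True if the installed version is at or above the first fixed version."""
    return parse_version(installed) >= parse_version(fixed_in)


# String comparison gets this wrong: '5' sorts after '1' character by character,
# so "3.2.5" looks newer than "3.2.13".
naive_says_fixed = "3.2.5" >= "3.2.13"

print(naive_says_fixed)              # True, which is the wrong answer
print(is_fixed("3.2.5", "3.2.13"))   # False: still vulnerable
print(is_fixed("3.2.13", "3.2.13"))  # True: upgraded to the fixed version
```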

Things work in much the same way for Golang:

In this example, there is an unhandled panic in Go’s yaml package which is exposed through the confd application. The application actually calls the affected Unmarshal function, as described in the call path on the right side of the image. Upgrading to the fixed version of the yaml package will resolve the problem.

The impact of prioritizing vulnerable dependencies

When reachability information is included in findings, there is no need for further research. It is immediately apparent which vulnerabilities are reachable and exposed in the application, and the least-effort fix is clearly explained. There is no need for discussion or negotiation about the impact or need for remediation. Teams can instead apply that effort toward implementing and testing the fix.

By searching for and prioritizing the reachable findings, teams focus efforts on the risks that actually affect the application; unreachable findings are naturally de-prioritized, to be addressed after the immediate risks are remediated.

Vulnerable dependency findings often dominate the alerts coming from AppSec tools, because approximately 80% of the code contained in modern apps comes from open source dependencies. But apps use only 12% of that open source code, so most of those alerts are likely to be located in unused code. Pairing those findings with accurate reachability information enables developers to quickly focus on the findings that represent real risk to the application.

See it in action

Join us on September 19th for a deep dive into:

  • How reachability analysis works
  • Why it is different for each language
  • How reachability-based vulnerability prioritization works with Endor Labs

Save your spot!