This is a very long article. We didn't want to break this up into multiple posts and make yet another generic "What is SBOM" SEO grab, but we do think some context is important. We also truly believe there's a crucial discussion to be had about the state of SBOMs, and where the industry goes from here. Please use the table of contents to skip to the part that's relevant for you!
Why should I care about SBOMs?
What are the SBOM standards & tools that exist today?
What are the problems surrounding SBOM adoption? <- This is the important part
I am not writing this blog as a marketer. I’m a Certified Information Systems Security Professional (CISSP) who has spent my entire career building security products as a product manager and practicing security in the technology and healthcare industries. I am a maintainer of the Center for Internet Security’s benchmark for NGINX and have contributed to several industry standards. I say this to emphasize that I have a strong personal interest in the success of the security industry and I genuinely want the SBOM movement to succeed.
Today, I write to tell you that the state of the industry movement towards SBOMs needs material interventions to be usable at scale for exceedingly basic use cases. I say this in the hopes that it begins a discussion at the industry level that brings us closer to our desired state and to challenge the notion of what that desired state even is.
If your familiarity with SBOMs is beginner to intermediate level please read all of this. The first portion may seem like marketing fluff to an expert but those less familiar with the state of the industry need context to understand the problem. If you are a specialized expert you should skip to the section “What are the problems surrounding SBOM adoption?”
What is an SBOM?
For those of us involved in procuring, operating, or producing software there are many questions we have about our software. What is the cost to maintain this software? What is the likelihood my software will pose a risk to my intended usage of it? Will the usage of this software in some way conflict with another one of my priorities?
A software bill of materials (SBOM) is intended to offer transparency into the software components used by the application and its users. Transparency is the foundation of visibility. Visibility is the foundation of making informed risk management decisions.
Consumers ask similar questions about the food they eat. People often want to know if what they are eating aligns with their lifestyle and health management goals. To help consumers navigate their food choices, the Food and Drug Administration (FDA) requires that packaged food have nutrition fact labels placed on them when they are sold to US consumers. These nutrition fact labels show a list of ingredients that help consumers make purchasing decisions.
SBOMs are the nutrition fact labels of software. Unfortunately, the term SBOM has become so cliche that there is a serious risk that they are used like nutrition labels are today in practice: used to make informed choices by a few who care deeply, prohibitively inaccessible to most, and misleading to the average consumer.
A software bill of materials or SBOM is intended as a means of establishing transparency and trust in our software supply chain. An SBOM is an artifact that lists the software components that are used to create a software application.
The software components in an SBOM may be used to describe third-party or first-party software components and the relationships that they have with each other in the application. This helps someone reading an SBOM to understand the dependency graph of an application.
SBOMs may also contain information about the creators of the software, and metadata about the software components included in the bill of materials, such as licenses, checksum hashes of the software components, and vulnerabilities.
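As a rough illustration of the pieces described above, the sketch below models a tiny SBOM as a Python dictionary, loosely following the CycloneDX JSON layout (`components`, plus `dependencies` entries with `ref` and `dependsOn`). The application, package names, versions, and checksum placeholder are invented for illustration, and the helper `transitive_dependencies` is hypothetical rather than part of any SBOM tool.

```python
from collections import deque

# A hypothetical, heavily trimmed SBOM loosely shaped like CycloneDX JSON.
# All component names, versions, and hash values are made up.
sbom = {
    "bomFormat": "CycloneDX",
    "metadata": {"component": {"bom-ref": "app", "name": "example-app", "version": "1.0.0"}},
    "components": [
        {"bom-ref": "lib-a", "name": "lib-a", "version": "2.3.1",
         "licenses": [{"license": {"id": "MIT"}}]},
        {"bom-ref": "lib-b", "name": "lib-b", "version": "0.9.0",
         "licenses": [{"license": {"id": "Apache-2.0"}}]},
        {"bom-ref": "lib-c", "name": "lib-c", "version": "4.1.0",
         "hashes": [{"alg": "SHA-256", "content": "<checksum placeholder>"}]},
    ],
    "dependencies": [
        {"ref": "app", "dependsOn": ["lib-a", "lib-b"]},
        {"ref": "lib-a", "dependsOn": ["lib-c"]},
        {"ref": "lib-b", "dependsOn": []},
        {"ref": "lib-c", "dependsOn": []},
    ],
}


def transitive_dependencies(doc, root):
    """Return every component reachable from `root`, direct or transitive."""
    graph = {d["ref"]: d.get("dependsOn", []) for d in doc["dependencies"]}
    seen, queue = set(), deque(graph.get(root, []))
    while queue:
        ref = queue.popleft()
        if ref not in seen:
            seen.add(ref)
            queue.extend(graph.get(ref, []))
    return seen


direct = set(sbom["dependencies"][0]["dependsOn"])
everything = transitive_dependencies(sbom, "app")
print("direct:", sorted(direct))
print("transitive only:", sorted(everything - direct))
```

Walking the `dependencies` entries is what lets a reader see that `lib-c` is in the application even though the application never declares it directly.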
SBOMs can be created for many types of software, ranging from SaaS solutions to software libraries and frameworks.
In response to Executive Order (EO) 14028, the National Telecommunications and Information Administration and the Department of Commerce have published what they consider to be the minimum elements of an SBOM that will enable the basic use cases of vulnerability management, software inventory management, and software license management.
These elements include:
- Minimum data fields
- Automation Support
- Practices and Processes
Minimum data fields:
The minimum data fields for an SBOM are defined in the table published by the NTIA below:
It's important to remember that the modern SBOM standards today support significantly more data than the minimum data elements. These minimum elements described in the referenced source are the foundation upon which the SBOM movement is being built and are the core of industry direction.
Automation Support
To make the usage of SBOMs scalable and cost-effective, automation to generate SBOMs, share SBOMs regularly, and augment them with additional data is a necessity. Today, there are standards for SBOM data formats and several tools to automate SBOM generation. Tooling and recommendations, however, for practices and processes surrounding SBOMs are in their infancy and need to mature greatly.
Practices and Processes
The practices and processes surrounding the usage of SBOMs are the crux of how their usage is standardized across the industry. There are practices and processes each organization will need to define with any SBOM effort.
- Frequency of SBOM delivery - How frequently does an SBOM producer provide an SBOM to an SBOM consumer? Should producers provide an SBOM every 3 months or every new version of the software they are providing?
- Depth of SBOM coverage - Does an SBOM include only direct dependencies or transitive dependencies as well?
- Acceptance and communication standards for known unknowns - What known issues are there with the automated data?
- Distribution and delivery of SBOMs - Do software consumers email software producers to get an updated SBOM or is it published somewhere and pushed to consumers?
- Access control for SBOM access - What controls do we put in place to allow organizations to share their SBOMs on a need to know basis?
- Accommodation of Mistakes - Omissions and errors are an inevitable part of any maturity cycle. SBOM is still early in this cycle, so we must define how we handle them as an industry.
Why should I care about SBOMs?
SBOMs provide organizations not only with visibility but also with leading and lagging indicators of risk.
Signal of security program maturity
If your experience at a restaurant is consistently poor, you are probably going to stop going, unless it's the McDonald's less than a mile from home. People use Yelp reviews as a social proxy for a restaurant but they don’t make a decision purely off of the reviews. Other factors like price, location and type of cuisine are factors in the decision.
Like Yelp reviews, SBOMs are leading indicators of security literacy and can be used as a signal of high-quality software development practices by the author of the software. When a software producer is able to deliver an SBOM to their consumers or customers, they are generally signaling that they have mature software delivery practices.
A decision should never be solely based on a single leading indicator but be seen as additional evidence of quality when selecting software.
Awareness of unsupported, outdated, or unmaintained software usage
In general, most consumers would agree that they would prefer to purchase milk, fruit, or any other perishable food well before its expiration date. Food purchased near expiration may have to be thrown away, and replacing it costs you more the next time you need it.
Software ages like milk, not wine. If you are sold a product that contains unsupported, outdated, or unmaintained software, there is an increased likelihood that a security fix or bug fix will take longer to arrive, if it arrives at all. I must stress that this is a leading indicator of risk and does not mean that decisions should be made solely based on this factor.
Vulnerability management for third-party software
When there is reason to believe that food may cause illness or harm to consumers, the government may require a food safety recall. Food that has a recall issued against it is taken off the shelves, and frequently must be disposed of or returned for a refund. Recalls prevent food-borne outbreaks of illness and may be caused by anything from contamination to undeclared allergens on nutrition labels. If you’ve already bought the food it's typically up to you if you want to dispose of the food, request a refund, or even eat it anyways. You can make this decision based on the context of the recall. If you aren’t aware of a recall then you are out of luck.
A food recall notice is similar to a serious vulnerability discovered in software and is only a lagging indicator of risk. You want to know that an issue is present, whether it will affect you as you use the software, and, if it does, how to update the software to a version that is no longer impacted.
SBOMs give you the foundational data required to monitor for recall notices. If a recall notice comes out on one of the software components used in the software you use, then you want to know. SBOMs provide you the foundation to be alerted and work with the producer of that software to fix it by creating and distributing a fix that you can install.
SBOMs have many other use cases beyond transparency, vulnerability management, license compliance, and as an indicator of security maturity. Many of these cases rely on industry-standard formats and go beyond the minimal format. To learn more about the use cases of SBOMs see NTIA’s Roles and Benefits for SBOM Across the Supply Chain report.
What are the SBOM standards & tools that exist today?
The two primary formats SBOM producers provide to consumers are SPDX and CycloneDX.
Each format acts as its own standard but both are able to meet the minimum data type requirements of an SBOM. Both also provide capabilities that greatly exceed those minimum data field requirements for greatly expanded use cases. There are pros and cons associated with each standard.
The table below represents the pros, cons, and focus areas of each standard as of their current version today but only scratches the surface of what we need to move forward as an industry.
What are the problems surrounding SBOM adoption?
The security industry is at serious risk of falling to the institutional imperative. Since the US government signaled that an SBOM must be provided during its procurement process, the word SBOM has become all the rage and has developed a paparazzi-like following.
Contrary to the hype, the adoption of SBOM-related use cases is fairly immature. While industry leaders promoting SBOMs that I’ve spoken to are fairly well grounded in this reality, there is a large amount of misunderstanding among those in the industry on the state of SBOMs today.
I’ve spoken with over 50 security practitioners ranging from security architects and open source enthusiasts to third-party risk management professionals. The results of those conversations have generally been the following:
- There is broad agreement that SBOMs are necessary to help create a holistic inventory of applications in the Enterprise. This inventory will help organizations that have traditionally only had end-to-end visibility into the software components of their first-party applications get similar visibility among their third-party applications.
- Most organizations that I’ve spoken to have not operationalized SBOMs beyond using them as a signal of security program literacy during the procurement process. There are exceptions to this, where programs have begun using them as part of their vulnerability management program, but these programs are in their infancy.
- Security practitioners are aware of SBOMs but only a select few have been hands-on with them.
Despite the fact that most people tend to agree with the directional goals of SBOMs, there are material impediments to the mature adoption of the use cases they intend to solve.
Major SBOM Adoption Impediments
Problem 1: Garbage-in, Garbage-out
Data quality, together with the tooling that produces it, is the single largest barrier to SBOM adoption in the industry. This is where the minimum requirements published by the US government actually create a negative impact on the industry.
The security industry doesn’t need guidance on the bare minimum the lawyers and auditors will let them get away with. It needs strong guidelines and standardization movements to help ensure consistent data requirements to meet the use cases SBOMs intend to serve.
Let's explore basic dependency management use cases to get a better understanding of data requirements relative to the minimum requirements of standards bodies. I want to understand if dependencies in an SBOM are vulnerable, outdated, or unmaintained.
To identify a vulnerability, or to tell whether a package is outdated, I need at a minimum a unique identifier for the package version. This allows me to look up the correct software package, and the correct distribution of it, so that I can retrieve information about the package.
From there I can search public or proprietary datasets to get information about any vulnerabilities associated with the package version. I can also get information about the lifecycle of the software package and its version history to understand if I am on a supported or unreasonably out of date version of the software package. Much of this information is frequently found associated with the package manager or package repository.
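To make that lookup concrete, here is a minimal sketch of what a consumer's tool might do once it has an unambiguous identifier. The two dictionaries are hypothetical stand-ins for a vulnerability dataset and a package registry; every package name, version, and advisory ID is invented, and the dotted-integer version comparison is a simplifying assumption that real ecosystems do not all share.

```python
# Hypothetical in-memory datasets standing in for a vulnerability database
# and a package registry. Every package, version, and advisory ID is invented.
ADVISORIES = {
    ("pypi", "examplelib", "1.2.0"): ["EXAMPLE-2023-0001"],
}
REGISTRY_VERSIONS = {
    ("pypi", "examplelib"): ["1.0.0", "1.2.0", "1.3.0", "2.0.0"],
    ("pypi", "otherlib"): ["0.4.0"],
}


def version_key(version):
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def assess(ecosystem, name, version):
    """Report known advisories and how many releases behind a version is."""
    known = REGISTRY_VERSIONS.get((ecosystem, name))
    if known is None:
        # Without an unambiguous identifier we cannot find the package at all.
        return {"found": False}
    newer = [v for v in known if version_key(v) > version_key(version)]
    return {
        "found": True,
        "advisories": ADVISORIES.get((ecosystem, name, version), []),
        "releases_behind": len(newer),
    }


print(assess("pypi", "examplelib", "1.2.0"))
print(assess("pypi", "Example Library", "1.2.0"))
```

The second call shows the failure mode: a human-readable name that does not match the registry's identifier yields nothing to check, even though the package is the same one.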
If I want to further identify if a software package is unmaintained, I need to take additional steps. I need to link an open-source package back to its source code to begin to infer if the code base is abandoned or if it's stable and mature. This is much harder to do because not all software packages can easily be linked back to their source code or even have that code publicly available but in general, these links exist across packaging and package repository ecosystems at varying degrees of fidelity.
So what minimum data elements do I need to build a tool to help me search for vulnerabilities, outdated releases, and unmaintained software in a software bill of materials provided by a third party, assuming that their software bill of materials has perfect information?
I need a unique identifier to the package version that is being referenced that allows me to know exactly where the package was downloaded from so that I can automatically attempt to download the package and any metadata associated with it and inspect it myself with confidence that I’m referring to the same package version.
This doesn’t mean I need a checksum of the package version. I don’t know how to link that to vulnerabilities without creating a checksum for every package in every downloadable location in existence and mapping a reference database that doesn’t exist to it. The closest thing that exists to achieve this goal is the package url or purl.
But guess what? A package URL covers a lot of use cases successfully but it is also not standardized enough to cover every use case. Using a package URL I have to assume in many cases the download location of the package is standard. For example let's take the package url below:
This tells me that the software package is probably a Java application and reasonably allows me to identify most of the information I need about the package. I have the version, the organization and the package name. But what if this is hosted in a private package repository? I don’t always have information to differentiate between a package that's stored in Maven Central and one that is not. So I am forced to assume that if it's not in Maven Central then it's in a private repository.
This assumption leads to data quality issues. The package url specification allows for qualifiers to help get me this information. The most common qualifier that can help get me over the finish line is the repository_url qualifier. However, qualifiers are optional in the package url specification, and in practice they are neither reliably present nor consistently implemented.
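The difference a qualifier makes can be sketched with a small parser. This is a simplified illustration, not a complete implementation of the package url specification; the function `parse_purl`, the Maven coordinates, and the repository host are all hypothetical.

```python
from urllib.parse import parse_qsl, unquote


def parse_purl(purl):
    """Split a package URL into its parts (simplified; not a full spec parser)."""
    scheme, _, rest = purl.partition(":")
    if scheme != "pkg":
        raise ValueError("not a package URL: " + purl)
    rest, _, subpath = rest.partition("#")
    rest, _, query = rest.partition("?")
    path, _, version = rest.strip("/").rpartition("@")
    if not path:  # no '@' present, so there is no version
        path, version = version, ""
    segments = path.split("/")
    if len(segments) < 2:
        raise ValueError("a package URL needs at least a type and a name")
    return {
        "type": segments[0],
        "namespace": unquote("/".join(segments[1:-1])),
        "name": unquote(segments[-1]),
        "version": unquote(version),
        "qualifiers": dict(parse_qsl(query)),
        "subpath": subpath,
    }


# Hypothetical coordinates and repository host, used only for illustration.
bare = parse_purl("pkg:maven/org.example/example-lib@1.0.0")
qualified = parse_purl(
    "pkg:maven/org.example/example-lib@1.0.0?repository_url=repo.example.com/maven"
)

print(bare["qualifiers"].get("repository_url", "unknown; consumer must guess"))
print(qualified["qualifiers"].get("repository_url"))
```

Both strings are valid identifiers for the same coordinates, but only the second tells a consumer where the package actually came from.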
Let's assume that we accept this limitation as an industry and say for the sake of argument that all we need is the package url without qualifiers.
Is this a required field in SPDX? No
Is this a required field in CycloneDX? No
Is this a required field in the minimum data elements published by NTIA and the Department of Commerce? No
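Because the field is optional everywhere, an SBOM can be perfectly acceptable on paper while still being unusable for lookups. A consumer could run a quick coverage check like the sketch below before relying on a document. The fragment is shaped like CycloneDX JSON (`components` entries with an optional `purl`), the packages are invented, and `components_without_purl` is a hypothetical helper.

```python
# A hypothetical CycloneDX-shaped fragment; every package here is invented.
sbom = {
    "components": [
        {"name": "example-lib", "version": "1.0.0",
         "purl": "pkg:maven/org.example/example-lib@1.0.0"},
        {"name": "example-lib-utils", "version": "1.0.0"},
        {"name": "libexample.so", "version": "unknown"},
    ]
}


def components_without_purl(doc):
    """List the names of components that carry no usable `purl` field."""
    return [
        c.get("name", "<unnamed>")
        for c in doc.get("components", [])
        if not c.get("purl")
    ]


missing = components_without_purl(sbom)
total = len(sbom["components"])
print(f"{total - len(missing)} of {total} components carry a purl")
print("no purl:", missing)
```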
Other data quality concerns exist beyond this.
On page 12 of the NTIA’s document titled “The Minimum Elements for an SBOM” the depth of an SBOM is discussed as follows: “An SBOM should contain all primary (top level) components, with all their transitive dependencies listed. At a minimum, all top-level dependencies must be listed with enough detail to seek out the transitive dependencies recursively.”
To accurately resolve all transitive dependencies, a consumer needs the exact build time of the software package, along with the package manager or build system used to build it.
Unless dependencies are all pinned across a dependency graph or a lockfile is provided, package managers in many software ecosystems will resolve dependencies differently depending on when a package was built. However, no build time is defined in any industry SBOM specification. Instead, specifications provide for the creation timestamp of the SBOM, which at best is a close proxy for build time and may not be close at all. While the resolution of dependencies varies greatly across software ecosystems, the point remains that in many circumstances this task cannot be done with full accuracy without these data elements.
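A toy resolver shows why the timestamp matters. The sketch assumes a single unpinned dependency declared as "any 1.x release" and a made-up release history; real package managers apply ecosystem-specific range rules, so treat `resolve` as a hypothetical simplification.

```python
from datetime import date

# Hypothetical release history for one transitive dependency.
RELEASES = {
    "1.4.0": date(2022, 1, 10),
    "1.4.1": date(2022, 3, 2),
    "1.5.0": date(2022, 6, 20),
    "2.0.0": date(2022, 9, 1),
}


def resolve(major, build_date):
    """Pick the newest release within `major` that existed on `build_date`."""
    candidates = [
        tuple(int(p) for p in v.split("."))
        for v, released in RELEASES.items()
        if released <= build_date
    ]
    matching = [c for c in candidates if c[0] == major]
    if not matching:
        return None
    return ".".join(str(p) for p in max(matching))


at_build = resolve(1, date(2022, 2, 1))      # when the software was built
at_sbom_time = resolve(1, date(2022, 7, 1))  # when the SBOM was generated
print("resolved at build time:", at_build)
print("resolved at SBOM time: ", at_sbom_time)
```

If the consumer only knows the later date, they reconstruct a dependency that was never in the shipped build.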
As for tooling in the industry, it must go above and beyond the minimum standards to achieve basic use cases. Tools are incredibly inconsistent in how they provide data but I don’t blame the tooling.
Without calling out any specific tools or vendors, if you use various SBOM generation tools across the industry you’ll notice a few profound things:
- Sometimes tools use the package URL as the unique identifier of the package. Others do not. Of those that do, the package URL frequently comes with qualifiers that are inconsistently implemented.
- Many tools don’t even provide the package URL because it's not required. They provide names of varying degrees of usefulness ranging from completely unusable to passably guessable to a highly specialized individual.
- The package URL lands in different places in SBOMs of both formats: sometimes in the field defined for the package url, sometimes in the name, and sometimes in a totally different location.
- What counts as a software component is defined differently, and implemented very differently, across tooling providers.
The result is that when you open an SBOM for the same exact software version generated by different tools they can range from materially different in how they display the same data to clearly unusable if they were to be used for common but complex use cases like vulnerability management.
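In practice, consumers end up writing defensive code like the sketch below just to find the identifier. It checks a CycloneDX-style `purl` field, an SPDX-style `externalRefs` entry whose `referenceType` is `purl`, and finally the `name` field. The sample entries and the "tool A/B/C/D" labels are invented, and `find_purl` is a hypothetical helper rather than any tool's API.

```python
def find_purl(entry):
    """Best-effort search for a package URL in one component or package entry."""
    # CycloneDX-style dedicated field.
    if str(entry.get("purl", "")).startswith("pkg:"):
        return entry["purl"]
    # SPDX-style external reference.
    for ref in entry.get("externalRefs", []):
        if ref.get("referenceType") == "purl":
            return ref.get("referenceLocator")
    # Some tools put the purl in the name field instead.
    if str(entry.get("name", "")).startswith("pkg:"):
        return entry["name"]
    return None


# The same hypothetical package as four imaginary tools might describe it.
outputs = {
    "tool A": {"name": "example-lib", "purl": "pkg:maven/org.example/example-lib@1.0.0"},
    "tool B": {"name": "example-lib", "externalRefs": [
        {"referenceType": "purl",
         "referenceLocator": "pkg:maven/org.example/example-lib@1.0.0"}]},
    "tool C": {"name": "pkg:maven/org.example/example-lib@1.0.0"},
    "tool D": {"name": "Example Library (bundled)"},
}

for tool, entry in outputs.items():
    print(tool, "->", find_purl(entry))
```

Three of the four outputs can be reconciled with guesswork; the fourth cannot be matched to anything.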
There's no consistency for placement of the exact same data across tooling and a lack of data needed for various use cases clearly exists. This is a problem that can only be fixed at an industry standards level and it can only be done by standards having a strong opinion on the data they need. Without this, the industry will be left in a constant state of “Garbage in, Garbage out”.
Problem 2: Long-term risks for software vendors are abundant but incentives are mixed
If you’ve been in security for a while, you probably have heard the old adage: “If an auditor asks you if you know what time it is, well that is a yes or no question.” This adage highlights the fact that during an audit you want to give the absolute minimum amount of information that is required of you to provide.
By providing SBOMs to their customers, vendors make the following tradeoffs:
- Vendors may be providing their customers with evidence of business risks associated with the licenses of their third-party components. They are therefore incentivized to not provide this information.
- Vendors open themselves to questions about software components that they use in their applications for a number of reasons, such as if they are maintained, outdated, or vulnerable. They, therefore, accept additional costs by providing SBOMs that are of high quality.
So if vendors take on this cost we need to give them clear use cases, clear data requirements and a means to help them reduce the creation and support costs of managing questions about their SBOMs. All of these must be standardized or every customer will request different information and providing SBOMs won’t be scalable for vendors.
Tooling needs to support those data requirements and account for flexibility among different use cases. Someone who cares greatly about software vulnerabilities might not care about licenses so tooling should support the ability to not provide licenses. If there are going to be tons of requests about security vulnerabilities in an SBOM then there needs to be a cost-effective way for vendors to manage that, such as a less manual VEX.
So with business incentives actually incentivizing vendors to decrease transparency, they need incentives to increase transparency. Today, that incentive is simple. This is a cost of doing business with some of the vendors' customers.
With nutrition labels, marketing incentives were a solution. Organizations sold their products as organic to help sell more effectively and organic foods command a premium price at the grocery store. Why not allow vendors to market their products in a similar way to create some positive reinforcement?
Problem 3: Processes and practices are immature and not institutionalized
The problems SBOMs could solve are compelling. But we are still in the infancy of the SBOM's evolution. You can find content on a huge number of use cases. As an industry, we need to ensure that we don’t boil the ocean and instead focus on the areas where we want to have an outsized impact. We should solve the most important issues in our first iteration and continue to expand in following iterations as practices and processes mature.
Today, you see a huge number of use cases associated with SBOMs but even the most basic ones are still fairly immature across the industry with some localized heroics from some tools. As a result, standards are doing everything under the sun and software development teams lack focus. This isn’t to say all standards and tools are bad but is intended to say that implementations are based on intended practices and today these practices and tooling surrounding them can be incredibly inconsistent.
Rather than highlighting all of the possible things an SBOM could help to solve, the world should focus on solving two or three problems, institutionalizing their use, and expanding from there. The industry is not focused at the moment and as a result, we’ve decreased the likelihood that we will achieve meaningful results.
SBOMs are getting visibility so now is the time to improve
Log4j has given SBOMs an unprecedented level of visibility. Let's not waste a good incident. I haven’t given up hope on SBOMs. However, I do think that the root of the problem can only be solved at the standards level. To help guide the SBOM movement I encourage everyone to work with us to accomplish the following:
- Be more opinionated about the data we need as an industry. This needs to be defined very clearly and without ambiguity. So kick all of the lawyers and lobbyists out of the room for a moment and let the security engineers and software engineers solve the problem.
- Clearly define the data required for each use case. The industry needs a formal listing of what is required and what is nice to have and this needs to be grounded in reality.
- Prioritize use cases and focus on the top ones. As an industry, we need to partner with tooling providers to focus on achieving our top use cases. We can do this by standardizing the requirements for these use cases more clearly and influencing tooling to accommodate the requirements.
- Incentivize vendors to provide transparency. The industry should consider going beyond making providing an SBOM a cost of doing business.
- Provide guidance on prioritized use cases, practices, and processes. To help the industry scale we need to be crystal clear on the data requirements for each individual use case. This needs to be centralized so we don’t have snowflake requirements and can standardize these practices. Let's assemble a team of experts across the industry to do this.
SBOM as a movement is a terrific idea that thus far has been imperfect in execution. Only by coming together as an industry can we set it on a more effective path. I encourage everyone to share their pains of SBOM adoption. So I ask you: What's your SBOM story?