Tern is an open source software composition analysis and SBOM generation tool that generates SBOMs from container images. However, a CycloneDX SBOM generated with this tool was deemed invalid by several SBOM processing tools, such as the CycloneDX Web Tool:
The CycloneDX CLI reported the same result, as shown here:
Upon further investigation, the cause was found in the SBOM’s licenses.
The CycloneDX SBOM format defines a license in two main ways: by ID (with optional additional data) or by Name. The difference between the two turned out to be the difference between a valid SBOM and an invalid one.
A CycloneDX Name is merely a string of characters and can be anything. A CycloneDX License ID, however, must be a valid SPDX License ID, e.g. “GPL-2.0”. With this information, we can now review a specific component within an SBOM created from a Debian Buster container image.
As we can see, the component apt contains a single license, that of the ID “GPLv2+”. However, this is not a valid SPDX License ID. This alone would render the entire SBOM invalid, and there were over 200 such cases in this particular SBOM.
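To find such cases across a whole SBOM, one option is to walk every component and flag license IDs that are not on the SPDX list. The following is a minimal sketch of that check, not part of Tern or the CycloneDX tooling. The function name `find_invalid_license_ids`, the small `KNOWN_SPDX_IDS` set, and the `example-lib` component are assumptions made for illustration; a real check would compare against the full SPDX license list.

```python
# Illustrative stand-in for the SPDX license list (assumption: the real list
# contains hundreds of identifiers; only a handful are included here).
KNOWN_SPDX_IDS = {"GPL-2.0", "GPL-2.0-only", "GPL-2.0-or-later", "MIT", "Apache-2.0"}


def find_invalid_license_ids(sbom):
    """Return (component name, license id) pairs whose id is not a known SPDX ID."""
    invalid = []
    for component in sbom.get("components", []):
        for entry in component.get("licenses", []):
            license_id = entry.get("license", {}).get("id")
            # Entries that only carry a "name" have no "id" and are skipped.
            if license_id is not None and license_id not in KNOWN_SPDX_IDS:
                invalid.append((component.get("name"), license_id))
    return invalid


# Hypothetical SBOM fragment: "apt" mirrors the case described above,
# "example-lib" is invented for contrast.
sample_sbom = {
    "bomFormat": "CycloneDX",
    "components": [
        {"name": "apt", "licenses": [{"license": {"id": "GPLv2+"}}]},
        {"name": "example-lib", "licenses": [{"license": {"id": "MIT"}}]},
    ],
}

print(find_invalid_license_ids(sample_sbom))
```

Only the `apt` entry is reported, because “GPLv2+” is not an SPDX License ID while “MIT” is.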
The cause of this problem is unvalidated license generation. Tern creates the license entries for a CycloneDX SBOM using the data from container images “as is”, without formatting or validating it against the CycloneDX schema, as shown here:
As we can see, the get_license_from_name function simply returns a Python dictionary, with the license variable name as the value of an ID key. This is then assembled into a wider CycloneDX dictionary and output as JSON using the CycloneDXJSON class, as shown here:
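The behavior described above can be sketched as follows. This is a reconstruction from the prose, not a copy of Tern's source: the nested `{"license": {"id": ...}}` shape and the hand-built `component` dictionary are assumptions chosen to match the CycloneDX JSON layout, and the `CycloneDXJSON` class itself is not reproduced.

```python
import json


def get_license_from_name(name):
    """Sketch of the unvalidated behavior: every license string is emitted as an ID."""
    return {"license": {"id": name}}


# Hypothetical component assembly, standing in for the wider CycloneDX dictionary.
component = {
    "type": "library",
    "name": "apt",
    "licenses": [get_license_from_name("GPLv2+")],
}

# Serialized output looks well-formed, but "GPLv2+" is not a valid SPDX License ID.
print(json.dumps(component, indent=2))
```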
This chain of events results in a CycloneDX SBOM that outwardly looks correct, but in reality is invalid.
There are several solutions to this issue. The first and easiest is to have the get_license_from_name function always return the license as a name, as shown here:
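A minimal sketch of this first option, assuming the same dictionary shape as before (the exact structure in Tern may differ):

```python
def get_license_from_name(name):
    """Sketch: always emit the license as a free-form name, never as an ID."""
    return {"license": {"name": name}}


# Both a non-SPDX string and a real SPDX ID end up under the "name" key.
print(get_license_from_name("GPLv2+"))
print(get_license_from_name("GPL-2.0"))
```

Because a name can be any string, the resulting entries always satisfy the schema, regardless of what the container image reported.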
However, this robs the SBOM of potentially valuable license ID information that could otherwise be passed to an SBOM parser.
As such, a more elegant solution is to use the CycloneDX Python Library. With this library, we can take advantage of its component validation functionality and create a validation step:
This modified function first checks the input name variable to determine if it is a valid SPDX License ID. If it is, it returns the license as a license ID. Otherwise, it returns it as a license Name. This can be validated by recreating the original SBOM, and checking the same component.
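The logic of the modified function can be sketched as below. The source relies on the CycloneDX Python Library for the validity check, but the exact call is not shown, so this sketch substitutes a hypothetical `is_valid_spdx_id` helper backed by a small hard-coded set. Both the helper and the `SPDX_IDS` set are assumptions that keep the example self-contained; in practice the library's own check would take their place.

```python
# Stand-in for the SPDX license list (assumption: deliberately incomplete).
SPDX_IDS = frozenset({"GPL-2.0", "GPL-2.0-only", "GPL-2.0-or-later", "MIT", "Apache-2.0"})


def is_valid_spdx_id(value):
    """Hypothetical helper: True if the string is a known SPDX License ID."""
    return value in SPDX_IDS


def get_license_from_name(name):
    """Emit a license ID when the string is a valid SPDX ID, otherwise a name."""
    key = "id" if is_valid_spdx_id(name) else "name"
    return {"license": {key: name}}


print(get_license_from_name("GPL-2.0"))  # kept as an ID
print(get_license_from_name("GPLv2+"))   # falls back to a name
```

This keeps the ID information wherever it is trustworthy and degrades to a name only for strings that would otherwise invalidate the SBOM.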
As we can see, “GPLv2+” is now a license Name. If we run this new SBOM through the CycloneDX Web Tool:
In addition to the CycloneDX CLI:
We can now see the SBOM is valid.