* feat: add Debian archive (.deb) file cataloger
Add a cataloger that parses Debian package (.deb) archive files directly,
allowing Syft to discover packages from .deb files without requiring
them to be installed on the system. This implements issue #3315.
Key features:
- Parse .deb AR archives to extract package metadata
- Support for gzip, xz, and zstd compressed control files
- Extract package metadata from control files
- Process file information from md5sums files
- Mark configuration files from conffiles entries
- Handle trailing slashes in archive member names
Signed-off-by: Alan Pope <alan.pope@anchore.com>
* chore: run go mod tidy to fix failing workflow
Signed-off-by: Alan Pope <alan.pope@anchore.com>
* add license processing to dpkg archive cataloger + add tests
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* update json schema with dpkg archive type
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* update comments
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
---------
Signed-off-by: Alan Pope <alan.pope@anchore.com>
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
Co-authored-by: Alex Goodman <wagoodman@users.noreply.github.com>
* prototype: start bitnami cataloger
Bitnami images have spdx SBOMs at predictable paths, and Syft could more
accurately identify the software in these images by scanning those
SBOMs. Start work on this by forking the sbom-cataloger as a new
bitnami-cataloger.
Signed-off-by: Will Murphy <willmurphyscode@users.noreply.github.com>
* wire up bitnami cataloger to run on images by default
Signed-off-by: Will Murphy <willmurphyscode@users.noreply.github.com>
* feat: add support for Bitnami cataloger
Signed-off-by: juan131 <jariza@vmware.com>
* feat: use a better SPDX sample for unit tests
Signed-off-by: juan131 <jariza@vmware.com>
* bugfix: only report bitnami pkgs
Signed-off-by: juan131 <jariza@vmware.com>
* feat: adapt JSON schema, spdxutil and packagemetadata
Signed-off-by: juan131 <jariza@vmware.com>
* bugfix: integration tests
Signed-off-by: juan131 <jariza@vmware.com>
* feat: implement FileOwner interface
Signed-off-by: juan131 <jariza@vmware.com>
* bugfix: update json schema
Signed-off-by: juan131 <jariza@vmware.com>
* [wip] add bitnami owned files and fix binary package ownership filtering
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* feat: obtain bitnami pkg files based on SPDX relationships tree
Signed-off-by: juan131 <jariza@vmware.com>
* preserve type switches
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* rename bitnami entry metadata type
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* restrict find main pkg logic
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* add missing graalvm source info
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* bugfix: integration tests
Signed-off-by: juan131 <jariza@vmware.com>
* bugfix: mod tidy
Signed-off-by: juan131 <jariza@vmware.com>
---------
Signed-off-by: Will Murphy <willmurphyscode@users.noreply.github.com>
Signed-off-by: juan131 <jariza@vmware.com>
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
Co-authored-by: Will Murphy <willmurphyscode@users.noreply.github.com>
Co-authored-by: Alex Goodman <wagoodman@users.noreply.github.com>
* Use file indexer when scanning with file source
Prevents filesystem walks when scanning a single file, to
optimise memory & scan times in case the scanned file
lives in a directory containing many files.
Signed-off-by: adammcclenaghan <adam@mcclenaghan.co.uk>
* Create filetree resolver
Shared behaviour for resolving indexed filetrees.
Signed-off-by: adammcclenaghan <adam@mcclenaghan.co.uk>
---------
Signed-off-by: adammcclenaghan <adam@mcclenaghan.co.uk>
* add jvm cataloger
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* simplify version selection
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* CPEs from JVM cataloger should be declared
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* ensure package overlap is enabled for sensitive use cases
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* more permissive glob
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
---------
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* fix: only skip tmpfs mounts for some paths
Signed-off-by: Will Murphy <will.murphy@anchore.com>
* refactor and add tests
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* add regression test for archive processing
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* bump to golang 1.22
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* remove rule 1 and add more tests
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
---------
Signed-off-by: Will Murphy <will.murphy@anchore.com>
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
Co-authored-by: Alex Goodman <wagoodman@users.noreply.github.com>
Previously, the file resolver was created from incorrect calls
(path.Join instead of filepath.Join), which resulted in Go license searches
always missing on Windows. Use filepath.* functions when initializing
the Go config, and when the unindexed file resolver is being created.
Signed-off-by: Will Murphy <will.murphy@anchore.com>
* fix: re-use embedded union reader if possible
Previously, because file.LocationReadCloser embeds a ReadCloser that
might be a UnionReader, but doesn't implement the interface itself, the
type assertion would fail and Syft would fall back to io.ReadAll to
enable seeking on the underlying reader, resulting in a potentially
large extra allocation.
Instead, check whether the passed ReadCloser is a
file.LocationReadCloser, and if so, try to use the embedded ReadCloser
as a UnionReader.
Signed-off-by: Will Murphy <will.murphy@anchore.com>
* lint fix
Signed-off-by: Will Murphy <will.murphy@anchore.com>
* Assert that underlying reader is returned
Signed-off-by: Will Murphy <will.murphy@anchore.com>
---------
Signed-off-by: Will Murphy <will.murphy@anchore.com>
* consider fs types for mount points when ignoring system paths
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* address feedback
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
---------
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* survive indexing branches that start with a bad symlink
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* add log statement
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
---------
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
Because we generate a new JSON schema file every time the schema version
changes, the git diff always shows that the file is completely new.
Therefore, every time the file is re-generated, also write the schema to
a stable path, so that the actual changes to the schema are easily
visible in the git diff of the latest schema file.
Signed-off-by: Will Murphy <will.murphy@anchore.com>
* Adding the resolved and integrity fields of yarn.lock to the parsed metadata. This addition is similar to the metadata added when parsing package-lock.json.
Signed-off-by: asi-cider <88270351+asi-cider@users.noreply.github.com>
* fix comment
Signed-off-by: asi-cider <88270351+asi-cider@users.noreply.github.com>
* Adding the Index field to metadata when parsing poetry.lock similarly to the existing Pipfile metadata
Signed-off-by: asi-cider <88270351+asi-cider@users.noreply.github.com>
* fixing struct according to tests
Signed-off-by: asi-cider <88270351+asi-cider@users.noreply.github.com>
* remove old schema change
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* remove empty constants
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* re-generate JSON schema
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* update document ref
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* fix linting
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
---------
Signed-off-by: asi-cider <88270351+asi-cider@users.noreply.github.com>
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
Co-authored-by: Alex Goodman <wagoodman@users.noreply.github.com>
* add detection of ELF security features
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* fix linting
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* update json schema with file executable data
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* update expected fixture when no tty present
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* more detailed differ
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* use json differ
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* fix tests
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* remove json schema addition
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* regenerate json schema
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* fix mimetype set ref
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
---------
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
The previous implementation would leak a goroutine if the caller of
AllLocations stopped iterating early. Now, accept a context so that the
caller can cancel the AllLocations iterator rather than leak the
goroutine.
Signed-off-by: Will Murphy <will.murphy@anchore.com>
* [wip]
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* make the package metadata functions distinct
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* remove metadata type from package core model
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* incorporate review feedback for names
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* add RPM archive metadata and split parser helpers
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* clarify the python package metadata type
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* rename the KB metadata type
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* break hackage and composer types by use case
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* linting fix
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* fix encoding and decoding for syft-json and cyclonedx
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* bump json schema to 11
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* update cyclonedx-json snapshots
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* update cyclonedx-xml snapshots
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* update spdx-json snapshots
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* update spdx-tv snapshots
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* update syft-json snapshots
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* correct metadata type in stack yaml parser test
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* fix bom-ref redactor for cyclonedx-xml
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* add tests for legacy package metadata names
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* regenerate json schema v11
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* fix legacy HackageMetadataType reflect type value check
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* fix linting
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* packagemetadata discovery should account for type shadowing
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* fix linting
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* fix cli tests
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* bump json schema version to v12
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* update json schema to incorporate changes from main
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* add syft-json legacy config option
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* add tests around v11-v12 json decoding
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* add docs for SYFT_JSON_LEGACY
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* rename structs to be compliant with new naming scheme
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
---------
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* split up sbom.Format into encode and decode ops
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* update cmd pkg to inject format configs
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* bump cyclonedx schema to 1.5
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* redact image metadata from github encoder tests
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* add more testing around format decoder identify
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* add test case for format version options
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* fix cli tests
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* fix CLI test
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* [wip] - review comments
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* keep encoder creation out of post load function
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* keep decoder and identify functions
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* add a few more doc comments
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* remove format encoder default function helpers
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* address PR feedback
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* move back to streaming based decode functions
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* with common convention for encoder constructors
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* fix tests and allow for encoders to be created from cli options
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* fix cli tests
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* fix linting
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* buffer reads from stdin to support seeking
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
---------
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
Now that the test fixture pins to a particular digest, there's no need
for platform specific architecture switches in this test.
Signed-off-by: Will Murphy <will.murphy@anchore.com>
Signed-off-by: Joseph Palermo <jpalermo@vmware.com>
Signed-off-by: Chris Selzo <cselzo@vmware.com>
Co-authored-by: Joseph Palermo <jpalermo@vmware.com>
* Add support for parsing .NET assemblies
Signed-off-by: Dan Luhring <dluhring@chainguard.dev>
Former-commit-id: 69c33fe4d77357d843c11590f3b07825bc6249ac
* Add dll and exe files
Signed-off-by: Dan Luhring <dluhring@chainguard.dev>
Former-commit-id: b9d204efa6d2ef385b5fbb7a59a3474ecabea641
* Add PE cataloger to directory catalogers
Signed-off-by: Dan Luhring <dluhring@chainguard.dev>
Former-commit-id: 9711c00d9da92e2887e0c1f92edd740ea5345849
* Don't set language to dotnet for PEs
Signed-off-by: Dan Luhring <dluhring@chainguard.dev>
Former-commit-id: 368313fddac9160d8a06a01ebe8c5ac7990232f5
* Fix spelling of cataloger in constructor
Signed-off-by: Dan Luhring <dluhring@chainguard.dev>
Former-commit-id: e42fd77b2f8b6d42e076a84f6cce386861260941
* Adjust which cases in PE parsing return errors
Signed-off-by: Dan Luhring <dluhring@chainguard.dev>
Former-commit-id: 95b25f8fc3a7d4e18fe30e489b09851f316795ff
* remove build binary from branch
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
Former-commit-id: fa54c0d0aef0998d5520e9f44cae51f5f9cd38a2
* Fix failing CLI tests
Signed-off-by: Dan Luhring <dluhring@chainguard.dev>
---------
Signed-off-by: Dan Luhring <dluhring@chainguard.dev>
Co-authored-by: Alex Goodman <wagoodman@users.noreply.github.com>
* add bubbletea UI
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
* swap pipeline to go 1.20.x and add attest guard for cosign binary
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* update note in developing.md about the required golang version
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* fix merge conflict for windows path handling
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* temp test for attest handler
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* add additional test iterations for background reader
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
---------
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* refactor source API and syft json source block
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
* update source detection and format test utils
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
* generate list of all source metadata types
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* extract base and root normalization into helper functions
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* preserve syftjson model package name import ref
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* alias should not be a pointer
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
---------
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>