* fix(dpkg): extract License field for opkg/ipkg entries
opkg and ipkg use the dpkg cataloger but declare the package License
inline in the status DB (unlike Debian dpkg, where licenses live in
copyright files). The cataloger silently dropped the License field at
mapstructure decode time, so all opkg-managed packages reported empty
licenses.
This adds the field to the intermediate decode struct and the public
DpkgDBEntry, and populates licenses in newDpkgPackage using the alpine
cataloger's pattern: try license.ParseExpression first to keep valid
SPDX expressions whole, fall back to whitespace splitting for
space-separated lists.
Standard Debian dpkg status files never carry a License field per
Debian policy, so the new path is a no-op for them; the existing
copyright-file lookup in addLicenses is unaffected.
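The expression-first, split-second fallback described above can be sketched
roughly as follows. This is an illustration only: parseExpression and
knownIDs are hypothetical stand-ins for a real SPDX expression parser, and
licensesFromField is an invented helper name, not the cataloger's code.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// knownIDs is a tiny, hypothetical subset of SPDX identifiers used only
// to make this sketch self-contained.
var knownIDs = map[string]bool{
	"MIT": true, "Apache-2.0": true, "GPL-2.0-only": true, "BSD-3-Clause": true,
}

// parseExpression is a hypothetical stand-in for a real SPDX parser. It
// accepts known identifiers joined by AND/OR and rejects everything else.
func parseExpression(s string) (string, error) {
	tokens := strings.Fields(s)
	if len(tokens) == 0 || len(tokens)%2 == 0 {
		return "", errors.New("not a license expression")
	}
	for i, tok := range tokens {
		if i%2 == 0 {
			if !knownIDs[tok] {
				return "", errors.New("unknown identifier: " + tok)
			}
		} else if tok != "AND" && tok != "OR" {
			return "", errors.New("unknown operator: " + tok)
		}
	}
	return strings.Join(tokens, " "), nil
}

// licensesFromField keeps a valid expression whole and otherwise falls
// back to treating the field as a space-separated list.
func licensesFromField(field string) []string {
	field = strings.TrimSpace(field)
	if field == "" {
		return nil
	}
	if expr, err := parseExpression(field); err == nil {
		return []string{expr}
	}
	return strings.Fields(field)
}

func main() {
	fmt.Println(licensesFromField("MIT OR Apache-2.0"))
	fmt.Println(licensesFromField("GPL-2.0 LGPL-2.1"))
}
```

Trying the expression parser first matters because a plain whitespace split
would break "MIT OR Apache-2.0" into three unrelated values.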
Closes #4940
Signed-off-by: David Dashti <47575784+Dashtid@users.noreply.github.com>
* remove license from dpkg metadata struct
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* restore format snapshot files
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* add additional tests
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
---------
Signed-off-by: David Dashti <47575784+Dashtid@users.noreply.github.com>
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
Co-authored-by: Alex Goodman <wagoodman@users.noreply.github.com>
---------
Signed-off-by: Yoonho Hann <hnnynh125@gmail.com>
Signed-off-by: Christopher Phillips <32073428+spiffcs@users.noreply.github.com>
Co-authored-by: Christopher Phillips <32073428+spiffcs@users.noreply.github.com>
BPL (Borland Package Library) files use the standard PE/DLL format and
are produced by Delphi and C++Builder. Add the extension to the glob
list so syft picks them up during directory scans without users needing
to rename them to .dll first.
---------
Signed-off-by: jfjrh2014 <jfjrh2014@gmail.com>
Signed-off-by: Christopher Phillips <32073428+spiffcs@users.noreply.github.com>
Co-authored-by: Christopher Phillips <32073428+spiffcs@users.noreply.github.com>
* fix: detect compressed kernel modules (.ko.gz, .ko.xz, .ko.zst)
The linux-kernel-cataloger only matched plain *.ko files, missing
compressed modules produced when CONFIG_MODULE_COMPRESS is enabled
(common on Debian 13 / Ubuntu 24.04+). This resulted in near-zero
module packages being reported for such filesystems.
Changes:
- Add *.ko.gz, *.ko.xz, *.ko.zst glob patterns to both the cataloger
and capabilities.yaml so the file resolver picks up compressed modules
- Add decompressedModuleReader() which detects the extension and
transparently decompresses via compress/gzip, ulikunitz/xz, or
klauspost/compress/zstd before handing the ELF bytes to the existing
parseLinuxKernelModuleMetadata parser
- Promote github.com/klauspost/compress from indirect to direct dependency
- Add unit tests covering all three compression formats plus the
uncompressed baseline, using a programmatically generated minimal ELF
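A minimal sketch of the extension-based dispatch described above, using only
the standard library. It is not the actual decompressedModuleReader: the
names decompressedReader, gzipBytes and readModule are invented for this
example, and the xz and zstd branches are stubbed out because they need the
third-party decoders named in the list.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"strings"
)

// decompressedReader wraps r according to the module's file extension.
// Only gzip is handled; xz and zstd need third-party decoders and are
// reported as unsupported in this sketch.
func decompressedReader(path string, r io.Reader) (io.Reader, error) {
	switch {
	case strings.HasSuffix(path, ".ko.gz"):
		zr, err := gzip.NewReader(r)
		if err != nil {
			return nil, fmt.Errorf("open gzip stream for %s: %w", path, err)
		}
		return zr, nil
	case strings.HasSuffix(path, ".ko.xz"), strings.HasSuffix(path, ".ko.zst"):
		return nil, fmt.Errorf("%s: decoder omitted from this sketch", path)
	default:
		// Plain .ko files pass through untouched.
		return r, nil
	}
}

// gzipBytes compresses data in memory so the example needs no fixture files.
func gzipBytes(data []byte) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	_, _ = zw.Write(data) // writes to a bytes.Buffer do not fail
	_ = zw.Close()
	return buf.Bytes()
}

// readModule returns the decompressed bytes for a module at path. It reads
// everything into memory for brevity; a real parser would likely stream.
func readModule(path string, raw []byte) ([]byte, error) {
	r, err := decompressedReader(path, bytes.NewReader(raw))
	if err != nil {
		return nil, err
	}
	return io.ReadAll(r)
}

func main() {
	out, err := readModule("demo.ko.gz", gzipBytes([]byte("\x7fELF")))
	fmt.Printf("%q %v\n", out, err)
}
```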
Fixes #4721
Signed-off-by: Will Bates <william.bates11@outlook.com>
* address reading archives into memory
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
---------
Signed-off-by: Will Bates <william.bates11@outlook.com>
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
Co-authored-by: Will Bates <william.bates11@outlook.com>
Co-authored-by: Alex Goodman <wagoodman@users.noreply.github.com>
* fix: Allow duplicates in Yarn "Berry" files (#4691)
Yarn lockfiles can have multiple versions resolved for the same package
name. We correctly allow this in Yarn v1 lockfiles but the "Berry"
YAML-format lockfiles were doing deduplication by package name. This
change removes that deduplication.
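To illustrate the difference, here is a simplified sketch. The lockEntry
type and both helper functions are hypothetical and do not mirror the
parser's real types; they only contrast name-keyed deduplication with
keeping every resolved entry.

```go
package main

import "fmt"

// lockEntry is a hypothetical, simplified view of one resolved package.
type lockEntry struct {
	Name    string
	Version string
}

// dedupeByName mimics the removed behaviour: only the first entry seen
// for each package name survives.
func dedupeByName(entries []lockEntry) []lockEntry {
	seen := make(map[string]bool)
	var out []lockEntry
	for _, e := range entries {
		if seen[e.Name] {
			continue
		}
		seen[e.Name] = true
		out = append(out, e)
	}
	return out
}

// keepAll returns every resolved entry, so several versions of the same
// package name are all reported.
func keepAll(entries []lockEntry) []lockEntry {
	out := make([]lockEntry, len(entries))
	copy(out, entries)
	return out
}

func main() {
	entries := []lockEntry{
		{Name: "lodash", Version: "4.17.20"},
		{Name: "lodash", Version: "4.17.21"},
	}
	fmt.Println(len(dedupeByName(entries)), len(keepAll(entries)))
}
```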
Signed-off-by: Calum Leslie <cleslie@atlassian.com>
* fix linting
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
---------
Signed-off-by: Calum Leslie <cleslie@atlassian.com>
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
Co-authored-by: Calum Leslie <cleslie@atlassian.com>
Co-authored-by: Alex Goodman <wagoodman@users.noreply.github.com>
The elixir-binary and elixir-library classifiers' regexes only matched
the bare semver triplet (and a single sub-segment for the library), so
release-candidate elixir images were either missed entirely or had
their version truncated:
    $ syft -q elixir:1.12.0-rc | grep elixir     # nothing
    $ syft -q elixir:1.13.0-rc.0 | grep elixir
    elixir 1.13.0 binary                         # truncated, "-rc.0" lost
Extend the version capture group to optionally include
"-<a-z0-9>+(\\.<digits>)?" so "1.12.0-rc.1", "1.13.0-rc.0", etc. match
exactly as the elixir.app and the binary's ELIXIR_VERSION line have
them.
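The effect of the optional suffix can be checked with a small sketch. The
patterns below are illustrative approximations written for this example,
not the classifiers' actual regexes, and extractVersion is an invented
helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// oldPattern approximates a bare semver-triplet match.
var oldPattern = regexp.MustCompile(`\d+\.\d+\.\d+`)

// newPattern additionally allows an optional "-<a-z0-9>+" pre-release tag
// with an optional ".<digits>" segment after it.
var newPattern = regexp.MustCompile(`\d+\.\d+\.\d+(?:-[a-z0-9]+(?:\.\d+)?)?`)

// extractVersion returns the first match of re in s, or "" if none.
func extractVersion(re *regexp.Regexp, s string) string {
	return re.FindString(s)
}

func main() {
	for _, v := range []string{"1.12.0-rc.1", "1.13.0-rc.0", "1.19.1"} {
		fmt.Printf("%-12s old=%q new=%q\n", v,
			extractVersion(oldPattern, v), extractVersion(newPattern, v))
	}
}
```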
Add a logical fixture under testdata/classifiers/snippets/elixir/
1.12.0-rc.1/linux-amd64 (cloned from the existing 1.19.1 fixture with
just the version strings changed) and register it in
Test_Cataloger_PositiveCases.
Closes #4819
Signed-off-by: Chris (ChrisJr404) <11917633+ChrisJr404@users.noreply.github.com>
Co-authored-by: Chris (ChrisJr404) <11917633+ChrisJr404@users.noreply.github.com>
* fix(debian): only parse machine-readable copyright files with Format header
Only parse debian/copyright files as machine-readable DEP-5 format when
they contain the mandatory Format header field pointing to the copyright
specification URI. Files without this header are free-form text and
should not have License: regex patterns applied to them, which previously
produced nonsensical results like "#", "Permission", "This", "see" for
non-machine-readable files.
The fallback license classifier in the debian cataloger will handle
non-machine-readable files by doing full-text license identification.
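A rough sketch of such a gate is shown below. The function name
hasFormatHeader and the exact matching rule (a Format field in the first
stanza whose value mentions "copyright-format") are assumptions made for
this example and may differ from what the cataloger checks.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// hasFormatHeader reports whether the first stanza of a copyright file
// carries a Format field pointing at the copyright-format specification.
func hasFormatHeader(content string) bool {
	started := false
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := sc.Text()
		if strings.TrimSpace(line) == "" {
			if started {
				// The header stanza ended without a Format field.
				return false
			}
			continue // skip leading blank lines
		}
		started = true
		name, value, ok := strings.Cut(line, ":")
		if ok && strings.EqualFold(strings.TrimSpace(name), "Format") &&
			strings.Contains(value, "copyright-format") {
			return true
		}
	}
	return false
}

func main() {
	dep5 := "Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/\n" +
		"Upstream-Name: demo\n\nFiles: *\nLicense: MIT\n"
	freeForm := "This package was debianized by someone.\n\nLicense: see the file COPYING\n"
	fmt.Println(hasFormatHeader(dep5), hasFormatHeader(freeForm))
}
```

Only files that pass a check like this would go through License: field
parsing; everything else is left to full-text identification.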
Closes #4708
Signed-off-by: Bahtya <bahtya@users.noreply.github.com>
Signed-off-by: Bahtya <bahtayr@gmail.com>
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* decompose parseLicensesFromCopyright to address linting issues
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
---------
Signed-off-by: Bahtya <bahtya@users.noreply.github.com>
Signed-off-by: Bahtya <bahtayr@gmail.com>
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
Co-authored-by: Bahtya <bahtayr@gmail.com>
Co-authored-by: Alex Goodman <wagoodman@users.noreply.github.com>
Empty or whitespace-only .rockspec files cause parseRockspecBlock to
panic with "index out of range" because the existing end-of-data guard
requires len(out) > 0 before returning the "unexpected end of block"
error, letting the bare data[*i] access on the next line crash.
Split the guard so that:
- partial content at end of data still returns the existing error
- empty data (or whitespace-only) returns an empty block cleanly
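The split guard can be sketched on a toy block parser. parseBlock and
isSpace below are hypothetical and far simpler than parseRockspecBlock;
they only show how the end-of-data check is separated from the
partial-content check so that data[*i] is never read out of range.

```go
package main

import (
	"errors"
	"fmt"
)

func isSpace(c byte) bool {
	return c == ' ' || c == '\t' || c == '\n' || c == '\r'
}

// parseBlock collects whitespace-separated words until a closing '}'.
func parseBlock(data []byte, i *int) ([]string, error) {
	var out []string
	for {
		for *i < len(data) && isSpace(data[*i]) {
			*i++
		}
		// Split guard: check for end of data first, then decide whether
		// it is an error (partial content) or a clean empty block.
		if *i >= len(data) {
			if len(out) > 0 {
				return nil, errors.New("unexpected end of block")
			}
			return out, nil
		}
		if data[*i] == '}' { // safe: *i is known to be in range here
			*i++
			return out, nil
		}
		start := *i
		for *i < len(data) && !isSpace(data[*i]) && data[*i] != '}' {
			*i++
		}
		out = append(out, string(data[start:*i]))
	}
}

func main() {
	for _, in := range []string{"", "  \n", "a b }", "a b"} {
		i := 0
		out, err := parseBlock([]byte(in), &i)
		fmt.Printf("%q -> %v, %v\n", in, out, err)
	}
}
```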
Closes #4824.
Signed-off-by: Akihiko Komada <aki1770@gmail.com>
The JRuby project migrated their downloads from S3 to GitHub Releases,
causing the old S3 URLs to return HTTP 403 Forbidden and breaking test
fixture image builds.
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
* fix(javascript): ensure deterministic pnpm lockfile parsing
Replace nondeterministic Go map iteration with sorted key iteration
in both v6 and v9 pnpm lockfile parsers. When multiple lockfile keys
collapse to the same package key after peer dependency stripping, the
unsorted map iteration caused different entries to win on each run,
producing unstable artifact IDs and non-reproducible SBOM output.
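The sorted-iteration pattern looks roughly like this. The key format, the
stripPeerSuffix helper, and the "first key in sorted order wins" rule are
assumptions made for the example; the actual parsers may resolve
collisions differently.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// stripPeerSuffix drops a trailing "(peer@version)" group from a lockfile
// key, e.g. "foo@1.0.0(react@18.2.0)" becomes "foo@1.0.0".
func stripPeerSuffix(key string) string {
	if idx := strings.Index(key, "("); idx >= 0 {
		return key[:idx]
	}
	return key
}

// collapse maps lockfile keys to package keys. Keys are visited in sorted
// order so that the same entry wins on every run when several lockfile
// keys collapse to one package key.
func collapse(entries map[string]string) map[string]string {
	keys := make([]string, 0, len(entries))
	for k := range entries {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	out := make(map[string]string)
	for _, k := range keys {
		pkgKey := stripPeerSuffix(k)
		if _, exists := out[pkgKey]; exists {
			continue // an earlier key in sorted order already won
		}
		out[pkgKey] = entries[k]
	}
	return out
}

func main() {
	entries := map[string]string{
		"foo@1.0.0(react@18.2.0)": "entry-b",
		"foo@1.0.0(react@17.0.0)": "entry-a",
	}
	fmt.Println(collapse(entries)["foo@1.0.0"])
}
```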
Fixes #4648
Signed-off-by: lawrence3699 <lawrence3699@users.noreply.github.com>
* add regression test
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
---------
Signed-off-by: lawrence3699 <lawrence3699@users.noreply.github.com>
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
Co-authored-by: lawrence3699 <lawrence3699@users.noreply.github.com>
Co-authored-by: Alex Goodman <wagoodman@users.noreply.github.com>