* feat: add Debian archive (.deb) file cataloger

  Add a cataloger that parses Debian package (.deb) archive files directly, allowing Syft to discover packages from .deb files without requiring them to be installed on the system. This implements issue #3315.

  Key features:
  - Parse .deb AR archives to extract package metadata
  - Support for gzip, xz, and zstd compressed control files
  - Extract package metadata from control files
  - Process file information from md5sums files
  - Mark configuration files from conffiles entries
  - Handle trailing slashes in archive member names

  Signed-off-by: Alan Pope <alan.pope@anchore.com>

* chore: run go mod tidy to fix failing workflow

  Signed-off-by: Alan Pope <alan.pope@anchore.com>

* add license processing to dpkg archive cataloger + add tests

  Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>

* update json schema with dpkg archive type

  Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>

* update comments

  Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>

---------

Signed-off-by: Alan Pope <alan.pope@anchore.com>
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
Co-authored-by: Alex Goodman <wagoodman@users.noreply.github.com>
JSON Schema
This is the JSON schema for output from the JSON presenters (syft packages <img> -o json). The required inputs for defining the JSON schema are as follows:
- the value of internal.JSONSchemaVersion that governs the schema filename
- the Document struct definition within github.com/anchore/syft/syft/formats/syftjson/model/document.go that governs the overall document shape
- generated AllTypes() helper function within the syft/internal/sourcemetadata and syft/internal/packagemetadata packages
With regard to testing the JSON schema, integration test cases provided by the developer are used as examples to validate that JSON output from Syft is always valid relative to the schema/json/schema-$VERSION.json file.
Versioning
Versioning the JSON schema must be done manually by changing the JSONSchemaVersion constant within internal/constants.go.
This schema is versioned according to the "SchemaVer" guidelines, which diverge slightly from Semantic Versioning to better suit data models.
Given a version number format MODEL.REVISION.ADDITION:
- MODEL: increment when you make a breaking schema change which will prevent interaction with any historical data
- REVISION: increment when you make a schema change which may prevent interaction with some historical data
- ADDITION: increment when you make a schema change that is compatible with all historical data
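The increment rules above can be sketched in Go. This is only an illustration, not code from the Syft repository: the ChangeKind type, the Bump function, and the version strings are hypothetical, and resetting the lower components to zero on a MODEL or REVISION bump is an assumption that follows the usual SchemaVer convention rather than anything stated here.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ChangeKind is a hypothetical classification of a schema change.
type ChangeKind int

const (
	Addition ChangeKind = iota // compatible with all historical data
	Revision                   // may prevent interaction with some historical data
	Model                      // prevents interaction with any historical data
)

// Bump returns the next MODEL.REVISION.ADDITION version for the given kind
// of change. Lower components are reset to zero (an assumption, see above).
func Bump(version string, kind ChangeKind) (string, error) {
	parts := strings.Split(version, ".")
	if len(parts) != 3 {
		return "", fmt.Errorf("expected MODEL.REVISION.ADDITION, got %q", version)
	}
	nums := make([]int, 3)
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return "", fmt.Errorf("invalid component %q in %q", p, version)
		}
		nums[i] = n
	}
	switch kind {
	case Model:
		nums[0], nums[1], nums[2] = nums[0]+1, 0, 0
	case Revision:
		nums[1], nums[2] = nums[1]+1, 0
	case Addition:
		nums[2]++
	default:
		return "", fmt.Errorf("unknown change kind %d", kind)
	}
	return fmt.Sprintf("%d.%d.%d", nums[0], nums[1], nums[2]), nil
}

func main() {
	// The version "1.2.3" is an arbitrary example value.
	next, err := Bump("1.2.3", Addition)
	if err != nil {
		panic(err)
	}
	fmt.Println(next) // prints 1.2.4
}
```

Under these assumptions, adding an optional field to an existing metadata type would be an ADDITION bump, while renaming or removing a field would call for a REVISION or MODEL bump, depending on how much historical data it affects.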
Adding a New pkg.*Metadata Type
When adding a new pkg.*Metadata that is assigned to the pkg.Package.Metadata struct field, you must add a test case to cmd/syft/internal/test/integration/catalog_packages_cases_test.go that exercises the new package type with the new metadata.
Additionally, you must generate a new JSON schema, since the pkg.Package.Metadata field is covered by the schema.
Generating a New Schema
Create the new schema by running make generate-json-schema from the root of the repo:
- If there is not an existing schema for the given version, then the new schema file will be written to schema/json/schema-$VERSION.json
- If there is an existing schema for the given version and the new schema matches the existing schema, no action is taken
- If there is an existing schema for the given version and the new schema does not match the existing schema, an error is shown indicating to increment the version appropriately (see the "Versioning" section)
Note: never delete a JSON schema and never change an existing JSON schema once it has been published in a release! Only add new schemas with a newly incremented version. All previous schema files must be stored in the schema/json/ directory.