This repository was archived by the owner on Mar 27, 2024. It is now read-only.

Single image analysis #20

Merged 8 commits on Aug 18, 2017
12 changes: 10 additions & 2 deletions .container-diff-tests.sh
@@ -7,11 +7,19 @@ while IFS=$' \n\r' read -r flag differ image1 image2 file; do
fi
done < tests/differ_runs.txt

while IFS=$' \n\r' read -r flag analyzer image file; do
go run main.go $image $flag -j > $file
if [[ $? -ne 0 ]]; then
echo "container-diff" "$analyzer" "analyzer failed"
exit 1
fi
done < tests/analyzer_runs.txt

success=0
while IFS=$' \n\r' read -r differ actual expected; do
while IFS=$' \n\r' read -r type analyzer actual expected; do
diff=$(jq --argfile a "$actual" --argfile b "$expected" -n 'def walk(f): . as $in | if type == "object" then reduce keys[] as $key ( {}; . + { ($key): ($in[$key] | walk(f)) } ) | f elif type == "array" then map( walk(f) ) | f else f end; ($a | walk(if type == "array" then sort else . end)) as $a | ($b | walk(if type == "array" then sort else . end)) as $b | $a == $b')
if ! "$diff" ; then
echo "container diff" "$differ" "diff output is not as expected"
echo "container-diff" "$analyzer" "$type" "output is not as expected"
success=1
fi
done < tests/diff_comparisons.txt
145 changes: 96 additions & 49 deletions README.md
@@ -5,12 +5,13 @@ Status](https://travis-ci.org/GoogleCloudPlatform/container-diff.svg?branch=mast

## What is container-diff?

container-diff is an image differ command line tool. container-diff can diff two images along several different criteria, currently including:
container-diff is an image analysis command line tool. container-diff can analyze images along several different criteria, currently including:
- Docker Image History
- Image file system
- apt-get installed packages
- pip installed packages
- npm installed packages
Each of the above analyses can be performed on a single image, or as a diff that compares two images.

This tool can help you as a developer better understand what your images contain and what is changing within them.

@@ -32,8 +33,18 @@ Download the [container-diff-windows-amd64.exe](https://storage.googleapis.com/c

## Quickstart

To use container-diff you need two Docker images (in the form of an ID, tarball, or URL from a repo). Once you have those images you can run any of the following differs:
To use container-diff to perform analysis on a single image, you need one Docker image (in the form of an ID, tarball, or URL from a repo). Once you have that image, you can run any of the following analyzers:

```
container-diff <img> [Run all analyzers]
container-diff <img> -d [History]
container-diff <img> -f [File System]
container-diff <img> -p [Pip]
container-diff <img> -a [Apt]
container-diff <img> -n [Node]
```

To use container-diff to perform a diff analysis on two images, you need two Docker images (in the form of an ID, tarball, or URL from a repo). Once you have those images, you can run any of the following differs:
```
container-diff <img1> <img2> [Run all differs]
container-diff <img1> <img2> -d [History]
@@ -43,12 +54,13 @@ container-diff <img1> <img2> -a [Apt]
container-diff <img1> <img2> -n [Node]
```

You can similarly run many differs at once:
You can similarly run many differs or analyzers at once:

```
container-diff <img1> <img2> -d -a -n [History, Apt, and Node]
```
All of the differ flags with their long versions can be seen below:

All of the analyzer flags with their long versions can be seen below:

| Differ | Short flag | Long Flag |
| ------------------------- |:----------:| ----------:|
@@ -71,30 +83,79 @@ To use the docker client instead of shelling out to your local docker daemon, ad

```container-diff <img1> <img2> -e```

## Analysis Result Format

The JSONs for analysis results are in the following format:
```
{
"Image": "foo",
"AnalyzeType": "Apt",
"Analysis": {}
}
```
The possible structures of the `Analysis` field are detailed below.

### History Analysis

The history analyzer outputs a list of strings, each describing how an image layer was created.

### Filesystem Analysis

The filesystem analyzer outputs a list of strings representing filesystem contents.

### Package Analysis

Package analyzers such as pip, apt, and node inspect the packages installed within the image provided. All package analyses leverage the PackageInfo struct, which contains the version and size for a given package instance, as detailed below:
```
type PackageInfo struct {
Version string
Size string
}
```

#### Single Version Package Analysis

Single version package analyzers (apt) have the following output structure: `map[string]PackageInfo`

In this mapping scheme, each package name is mapped to its PackageInfo as described above.
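
As a rough illustration of consuming this structure, the sketch below decodes a single-version result in Go. It assumes the JSON holds one result object whose keys match the format shown earlier; `AptAnalysis` and `parseAptAnalysis` are hypothetical names, not part of container-diff, and the sample package data is made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PackageInfo mirrors the struct shown in the README.
type PackageInfo struct {
	Version string
	Size    string
}

// AptAnalysis is a hypothetical wrapper used only for decoding; its field
// names follow the JSON keys in the analysis result format.
type AptAnalysis struct {
	Image       string
	AnalyzeType string
	Analysis    map[string]PackageInfo
}

// parseAptAnalysis decodes one single-version analysis result.
func parseAptAnalysis(data []byte) (AptAnalysis, error) {
	var res AptAnalysis
	err := json.Unmarshal(data, &res)
	return res, err
}

func main() {
	// Illustrative input only: the package name, version, and size are invented.
	sample := []byte(`{
		"Image": "foo",
		"AnalyzeType": "Apt",
		"Analysis": {"curl": {"Version": "7.52.1", "Size": "335"}}
	}`)
	res, err := parseAptAnalysis(sample)
	if err != nil {
		panic(err)
	}
	for name, info := range res.Analysis {
		fmt.Printf("%s: version %s, size %s\n", name, info.Version, info.Size)
	}
}
```

If the tool emits several results at once, the same struct could presumably be decoded as a slice instead; that is not confirmed here.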

#### Multi Version Package Analysis

Multi version package analyzers (pip, node) have the following output structure: `map[string]map[string]PackageInfo`

## Output Format
In this mapping scheme, each package name maps to a second map, in which the filesystem path of each unique instance of the package (i.e. unique version and/or size info) is mapped to that package instance's PackageInfo.
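
To make the nested layout more concrete, here is a small Go sketch that walks a `map[string]map[string]PackageInfo` and reports which packages appear at more than one path. The helper `multiInstancePackages` and the sample paths and versions are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"sort"
)

// PackageInfo mirrors the struct shown in the README.
type PackageInfo struct {
	Version string
	Size    string
}

// multiInstancePackages returns the sorted names of packages that were
// found at more than one filesystem path. Hypothetical helper.
func multiInstancePackages(pkgs map[string]map[string]PackageInfo) []string {
	var names []string
	for name, instances := range pkgs {
		if len(instances) > 1 {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	// Outer key: package name. Inner key: path where that instance lives.
	// All values below are invented sample data.
	pkgs := map[string]map[string]PackageInfo{
		"requests": {
			"/usr/lib/python2.7/site-packages":      {Version: "2.9.1", Size: "100"},
			"/app/venv/lib/python2.7/site-packages": {Version: "2.18.4", Size: "120"},
		},
		"six": {
			"/usr/lib/python2.7/site-packages": {Version: "1.10.0", Size: "30"},
		},
	}
	fmt.Println(multiInstancePackages(pkgs))
}
```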


## Diff Result Format

The JSONs for diff results are in the following format:
```
{
"Image1": "foo",
"Image2": "bar",
"DiffType": "Apt",
"Diff": {}
}
```
The possible structures of the `Diff` field are detailed below.

### History Diff

The history differ has the following json output structure:

```
type HistDiff struct {
Image1 string
Image2 string
Adds []string
Dels []string
}
```

### File System Diff
### Filesystem Diff

The files system differ has the following json output structure:
The filesystem differ has the following json output structure:

```
type DirDiff struct {
Image1 string
Image2 string
Adds []string
Dels []string
Mods []string
@@ -105,44 +166,33 @@ type DirDiff struct {

Package differs such as pip, apt, and node inspect the packages contained within the images provided. All package differs currently leverage the PackageInfo struct, which contains the version and size for a given package instance.

```
type PackageInfo struct {
Version string
Size string
}
```

#### Single Version Diffs
#### Single Version Package Diffs

Single version differs (apt) have the following json output structure:

```
type PackageDiff struct {
Image1 string
Packages1 map[string]PackageInfo
Image2 string
Packages2 map[string]PackageInfo
InfoDiff []Info
}
```

Image1 and Image2 are the image names. Packages1 and Packages2 map package names to PackageInfo structs which contain the version and size of the package. InfoDiff contains a list of Info structs, each of which contains the package name (which occurred in both images but had a difference in size or version), and the PackageInfo struct for each package instance.
Packages1 and Packages2 map package names to PackageInfo structs which contain the version and size of the package. InfoDiff contains a list of Info structs, each of which contains the package name (which occurred in both images but had a difference in size or version), and the PackageInfo struct for each package instance.
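
One way to picture the InfoDiff computation is the Go sketch below, which collects packages present in both maps whose version or size differs. This is an illustration under assumptions, not the project's `GetMapDiff`: the `Info` field names and the `changedPackages` helper are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// PackageInfo mirrors the struct shown in the README.
type PackageInfo struct {
	Version string
	Size    string
}

// Info is a hypothetical stand-in for the Info struct described above:
// a package name plus the PackageInfo seen in each image.
type Info struct {
	Package string
	Info1   PackageInfo
	Info2   PackageInfo
}

// changedPackages returns one Info per package that occurs in both maps
// with a different version or size, sorted by package name.
func changedPackages(p1, p2 map[string]PackageInfo) []Info {
	var diffs []Info
	for name, info1 := range p1 {
		info2, ok := p2[name]
		if ok && info1 != info2 {
			diffs = append(diffs, Info{Package: name, Info1: info1, Info2: info2})
		}
	}
	sort.Slice(diffs, func(i, j int) bool { return diffs[i].Package < diffs[j].Package })
	return diffs
}

func main() {
	// Invented sample data for two images.
	packages1 := map[string]PackageInfo{
		"curl": {Version: "7.52.1", Size: "335"},
		"zlib": {Version: "1.2.8", Size: "156"},
	}
	packages2 := map[string]PackageInfo{
		"curl": {Version: "7.55.0", Size: "340"},
		"zlib": {Version: "1.2.8", Size: "156"},
	}
	for _, d := range changedPackages(packages1, packages2) {
		fmt.Printf("%s: %s -> %s\n", d.Package, d.Info1.Version, d.Info2.Version)
	}
}
```

Packages that exist in only one image are deliberately left out of this list, matching the description that InfoDiff covers packages occurring in both images.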

#### Multi Version Diffs
#### Multi Version Package Diffs

The multi version differs (pip, node) support processing images which may have multiple versions of the same package. Below is the json output structure:

```
type MultiVersionPackageDiff struct {
Image1 string
Packages1 map[string]map[string]PackageInfo
Image2 string
Packages2 map[string]map[string]PackageInfo
InfoDiff []MultiVersionInfo
}
```

Image1 and Image2 are the image names. Packages1 and Packages2 map package name to path where the package was found to PackageInfo struct (version and size of that package instance). InfoDiff here is expanded to allow for multiple versions to be associated with a single package.
Packages1 and Packages2 map package name to path where the package was found to PackageInfo struct (version and size of that package instance). InfoDiff here is expanded to allow for multiple versions to be associated with a single package.

```
type MultiVersionInfo struct {
@@ -190,46 +240,43 @@ Version differences: None
```


## Make your own differ
## Make your own analyzer
Review comment (Contributor):

The updates here look really good and seem really clear. Great work!


Feel free to develop your own differ leveraging the utils currently available. PRs are welcome.
Feel free to develop your own analyzer leveraging the utils currently available. PRs are welcome.

### Custom Differ Quickstart
### Custom Analyzer Quickstart

In order to quickly make your own differ, follow these steps:
In order to quickly make your own analyzer, follow these steps:

1. Add your diff identifier to the flags in [root.go](https://github.com/GoogleCloudPlatform/container-diff/blob/ReadMe/cmd/root.go)
2. Determine if you can use existing differ tools. If you can make use of existing tools, you then need to construct the structs to feed into the diff tools by getting all of the packages for each image or the analogous quality to be diffed. To determine if you can leverage existing tools, think through these questions:
- Are you trying to diff packages?
1. Add your analyzer identifier to the flags in [root.go](https://github.com/GoogleCloudPlatform/container-diff/blob/ReadMe/cmd/root.go)
2. Determine if you can use existing analyzing or diffing tools. If you can make use of existing tools, you then need to construct the structs to feed into the tools by getting all of the packages for each image or the analogous quality to be analyzed. To determine if you can leverage existing tools, think through these questions:
- Are you trying to analyze packages?
- Yes: Does the relevant package manager support different versions of the same package on one image?
- Yes: Use `GetMultiVerisonMapDiff` to diff `map[string]map[string]utils.PackageInfo` objects. See [nodeDiff.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/differs/nodeDiff.go#L33) or [pipDiff.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/differs/pipDiff.go#L23) for examples.
- No: Use `GetMapDiff` to diff `map[string]utils.PackageInfo` objects. See [aptDiff.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/differs/aptDiff.go#L29).
- Yes: Implement `getPackages` to collect all versions of all packages within an image in a `map[string]map[string]PackageInfo`. Use `GetMultiVerisonMapDiff` to diff map objects. See [nodeDiff.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/differs/nodeDiff.go#L33) or [pipDiff.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/differs/pipDiff.go#L23) for examples.
- No: Implement `getPackages` to collect all packages within an image in a `map[string]PackageInfo`. Use `GetMapDiff` to diff map objects. See [aptDiff.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/differs/aptDiff.go#L29).
- No: Look to [History](https://github.com/GoogleCloudPlatform/container-diff/blob/ReadMe/differs/historyDiff.go) and [File System](https://github.com/GoogleCloudPlatform/container-diff/blob/ReadMe/differs/fileDiff.go) differs as models for diffing.

3. Write your Diff driver in the `differs` directory, such that you have a struct for your differ type and a method for that differ called Diff:
3. Write your analyzer driver in the `differs` directory, such that you have a struct for your analyzer type and two methods for that analyzer: `Analyze` for single image analysis and `Diff` for comparison between two images:

```
type YourDiffer struct {}
type YourAnalyzer struct {}

func (d YourDiffer) Diff(image1, image2 utils.Image) (DiffResult, error) {...}
func (a YourAnalyzer) Analyze(image utils.Image) (utils.AnalyzeResult, error) {...}
func (a YourAnalyzer) Diff(image1, image2 utils.Image) (utils.DiffResult, error) {...}
```
The arguments passed to your differ contain the path to the unpacked tar representation of the image. That path can be accessed as such: `image1.FSPath`.
The image arguments passed to your analyzer contain the path to the unpacked tar representation of the image, as well as certain configuration information (e.g. environment variables upon image creation and image history).

If using existing package differ tools, you should create the appropriate structs to diff (determined in step 2 - either `map[string]map[string]utils.PackageInfo` or `map[string]utils.PackageInfo`) and then call the appropriate get diff function (also determined in step2 - either `GetMultiVerisonMapDiff` or `GetMapDiff`).
If using existing package differ tools, you should create the appropriate structs to analyze or diff. Otherwise, create your own analyzer which should yield information to fill an AnalyzeResult or DiffResult in the next step.

Otherwise, create your own differ which should yield information to fill a DiffResult in the next step.

4. Create a DiffResult for your differ.
4. Create a result struct following either the AnalyzeResult or DiffResult interface by implementing the following two methods.
```
type DiffResult interface {
GetStruct() DiffResult
OutputText(diffType string) error
}
GetStruct() DiffResult
OutputText(diffType string) error
```

This is where you define how your differ should output for a human readable format (`OutputText`) and as a struct which can then be written to a `.json` file. See [output_utils.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/utils/output_utils.go).
This is where you define how your analyzer's output is rendered, both in a human-readable format (`OutputText`) and as a struct which can then be written to a `.json` file. See [diff_output_utils.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/utils/diff_output_utils.go) and [analyze_output_utils.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/analyze_output_utils.go).

5. Add your differ to the diffs map in [differs.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/differs/differs.go#L22) with the corresponding Differ struct as the value.
5. Add your analyzer to the `analyses` map in [differs.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/differs/differs.go#L22) with the corresponding Analyzer struct as the value.


