
Commit 4dd4125

Improve user doc in REPRO.md
1 parent 4819df4 commit 4dd4125

1 file changed

Lines changed: 68 additions & 46 deletions


REPRO.md

@@ -21,27 +21,45 @@ Install the following Arch Linux packages:
2121
* diffoscope (to optionally check the reproducibility of the rootFS)
2222
* diffoci
2323

24-
## Set required environment variables
24+
## Prepare the build environment
2525

2626
Prepare the build environment by setting the following environment variables:
2727

28-
* IMAGE_BUILD_DATE: The build date of the `repro` image you want to reproduce.
28+
* `BUILD_VERSION`: The build version of the `repro` image you want to reproduce.
2929
For instance, if you're aiming to reproduce the `repro-20260331.0.508794` image:
30-
* `export IMAGE_BUILD_DATE="20260331"`
31-
* IMAGE_BUILD_NUMBER: The build number of the `repro` image you want to reproduce.
32-
For instance, if you're aiming to reproduce the `repro-20260331.0.508794` image:
33-
* `export IMAGE_BUILD_NUMBER="0.508794"`
34-
* ARCHIVE_SNAPSHOT: The date of the Arch Linux repository archive snapshot to build
35-
the image against. This is based on the `IMAGE_BUILD_DATE`:
36-
* `export ARCHIVE_SNAPSHOT=$(date -d "${IMAGE_BUILD_DATE} -1 day" +"%Y/%m/%d")`
37-
* SOURCE_DATE_EPOCH: The value to normalize timestamps with during the build.
38-
This is based on the `IMAGE_BUILD_DATE`:
39-
* `export SOURCE_DATE_EPOCH=$(date -u -d "${IMAGE_BUILD_DATE} 00:00:00" +"%s")`
30+
31+
```bash
32+
export BUILD_VERSION="20260331.0.508794"
33+
```
34+
35+
* `ARCHIVE_SNAPSHOT`: The date of the Arch Linux repository archive snapshot to build
36+
the image against. This is based on the date included in the image's `BUILD_VERSION`:
37+
38+
```bash
39+
export ARCHIVE_SNAPSHOT=$(date -d "${BUILD_VERSION%%.*} -1 day" +"%Y/%m/%d")
40+
```
41+
42+
* `SOURCE_DATE_EPOCH`: The value to normalize timestamps with during the build.
43+
This is based on the date included in the image's `BUILD_VERSION`:
44+
45+
```bash
46+
export SOURCE_DATE_EPOCH=$(date -u -d "${BUILD_VERSION%%.*} 00:00:00" +"%s")
47+
```
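
As a quick sanity check, the derivation of the two values above can be sketched for the example version `20260331.0.508794`. This assumes GNU `date` (coreutils); the `build_date` variable is only an illustrative helper and is not used elsewhere in this guide:

```bash
# Worked example, assuming GNU date (coreutils).
BUILD_VERSION="20260331.0.508794"

# "%%.*" strips the longest suffix starting at the first dot,
# leaving only the date part of the version string.
build_date="${BUILD_VERSION%%.*}"   # illustrative helper variable (assumption)

# Archive snapshot: the day before the build date, formatted as YYYY/MM/DD.
ARCHIVE_SNAPSHOT=$(date -d "${build_date} -1 day" +"%Y/%m/%d")

# Timestamp normalization value: midnight UTC of the build date, in seconds.
SOURCE_DATE_EPOCH=$(date -u -d "${build_date} 00:00:00" +"%s")

echo "build date:        ${build_date}"
echo "ARCHIVE_SNAPSHOT:  ${ARCHIVE_SNAPSHOT}"
echo "SOURCE_DATE_EPOCH: ${SOURCE_DATE_EPOCH}"
```

For this version, the date part is `20260331`, the snapshot is `2026/03/30`, and the epoch is `1774915200`.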
48+
49+
Then clone the [archlinux-docker](https://gitlab.archlinux.org/archlinux/archlinux-docker)
50+
repository and move into it:
51+
52+
```bash
53+
git clone https://gitlab.archlinux.org/archlinux/archlinux-docker.git
54+
cd archlinux-docker
55+
```
56+
57+
Note that all the following instructions assume that you are at the root of the
58+
archlinux-docker repository cloned above.
4059

4160
## Build the rootFS and generate the Dockerfile
4261

43-
From a clone of the [archlinux-docker](https://gitlab.archlinux.org/archlinux/archlinux-docker)
44-
repository, build the rootFS with the required parameters:
62+
Build the rootFS with the required parameters:
4563

4664
```bash
4765
make \
@@ -58,18 +76,22 @@ The following built artifact will be located in `$PWD/output`:
5876

5977
## Optional - Check the rootFS reproducibility
6078

61-
At that point, if the above artifacts built for the image you're aiming to reproduce
62-
are still available for download from the
63-
[archlinux-docker pipelines](https://gitlab.archlinux.org/archlinux/archlinux-docker/-/pipelines)
64-
artifacts, you can optionally compare the content of the `repro.tar.zst.SHA256`
79+
At that point, if the artifacts built for the image you're aiming to reproduce
80+
are still available for download from the rootfs stage of the corresponding
81+
[archlinux-docker pipeline](https://gitlab.archlinux.org/archlinux/archlinux-docker/-/pipelines),
82+
you can optionally compare the content of the `repro.tar.zst.SHA256`
6583
file from the pipeline to the one generated during the above local build (which
6684
should be the same, indicating that the rootFS has been successfully reproduced).
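
The checksum comparison could be scripted along these lines. This is a minimal sketch: `same_checksum` is a hypothetical helper, and it assumes each `.SHA256` file holds the hash as its first whitespace-separated field (the usual `sha256sum` layout), so that differing file paths recorded after the hash do not matter:

```bash
# Hypothetical helper: succeed only if both checksum files carry the same hash.
# Assumes the hash is the first field of the first line (sha256sum-style).
same_checksum() {
    local a b
    # Extract the hash field; bail out with status 2 if a file is unreadable.
    a=$(awk '{print $1; exit}' "$1") || return 2
    b=$(awk '{print $1; exit}' "$2") || return 2
    # An empty hash never counts as a match.
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

For example, with the pipeline's checksum file saved to `/tmp/repro.tar.zst.SHA256` (an assumed download location): `same_checksum /tmp/repro.tar.zst.SHA256 "$PWD/output/repro.tar.zst.SHA256" && echo "rootFS reproduced"`.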
6785

6886
Additionally, you can check differences between the `repro.tar.zst` tarball from
69-
the pipeline and the one built during your local build with `diffoscope`:
70-
`diffoscope /tmp/repro.tar.zst $PWD/output/repro.tar.zst` *(where `/tmp/repro.tar.zst`
71-
is the rootFS tarball downloaded from the pipeline and `$PWD/output/repro.tar.zst` is
72-
the rootFS tarball you just built)*.
87+
the pipeline and the one built during your local build with `diffoscope`
88+
*(where `/tmp/repro.tar.zst` is the rootFS tarball downloaded from the pipeline and
89+
`$PWD/output/repro.tar.zst` is the rootFS tarball you just built)*:
90+
91+
```bash
92+
diffoscope /tmp/repro.tar.zst "$PWD/output/repro.tar.zst"
93+
```
94+
7395
This should show no difference, acting as an additional indicator that the rootFS has been
7496
successfully reproduced.
7597

@@ -84,24 +106,27 @@ podman build \
84106
--source-date-epoch=$SOURCE_DATE_EPOCH \
85107
--rewrite-timestamp \
86108
-f "$PWD/output/Dockerfile.repro" \
87-
-t "archlinux-docker:repro-${IMAGE_BUILD_DATE}.${IMAGE_BUILD_NUMBER}" \
109+
-t "archlinux:repro-$BUILD_VERSION" \
88110
"$PWD/output"
89111
```
90112

91113
The built image will be accessible in your local podman container storage under the name:
92-
`localhost/archlinux-docker:repro-${IMAGE_BUILD_DATE}.${IMAGE_BUILD_NUMBER}`.
114+
`localhost/archlinux:repro-$BUILD_VERSION`.
93115

94116
## Check the image reproducibility
95117

96118
Pull the image you're aiming to reproduce from Docker Hub:
97-
`podman pull docker.io/archlinux/archlinux:repro-${IMAGE_BUILD_DATE}.${IMAGE_BUILD_NUMBER}`
119+
120+
```bash
121+
podman pull docker.io/archlinux/archlinux:repro-$BUILD_VERSION
122+
```
98123

99124
Compare the digest of the image pulled from Docker Hub to the digest of the image you built
100125
locally:
101126

102127
```bash
103-
podman inspect --format '{{.Digest}}' docker.io/archlinux/archlinux:repro-${IMAGE_BUILD_DATE}.${IMAGE_BUILD_NUMBER}
104-
podman inspect --format '{{.Digest}}' localhost/archlinux-docker:repro-${IMAGE_BUILD_DATE}.${IMAGE_BUILD_NUMBER}
128+
podman inspect --format '{{.Digest}}' docker.io/archlinux/archlinux:repro-$BUILD_VERSION
129+
podman inspect --format '{{.Digest}}' localhost/archlinux:repro-$BUILD_VERSION
105130
```
106131

107132
Both digests should be identical, indicating that the image has been successfully reproduced.
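
Rather than comparing the two digests by eye, the check could be wrapped in a small function. This is a sketch only: `digests_match` is a hypothetical helper name, and it relies solely on the `podman inspect --format '{{.Digest}}'` invocation already used above:

```bash
# Hypothetical helper: print both digests and succeed only if they are equal.
digests_match() {
    local upstream local_build
    # Return status 2 if either image cannot be inspected.
    upstream=$(podman inspect --format '{{.Digest}}' "$1") || return 2
    local_build=$(podman inspect --format '{{.Digest}}' "$2") || return 2
    printf 'upstream: %s\nlocal:    %s\n' "$upstream" "$local_build"
    [ "$upstream" = "$local_build" ]
}
```

It would be called with the two image references from this guide, e.g. `digests_match "docker.io/archlinux/archlinux:repro-$BUILD_VERSION" "localhost/archlinux:repro-$BUILD_VERSION"`.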
@@ -110,7 +135,7 @@ Additionally, you can check difference between the image pulled from Docker Hub
110135
the image you built locally with `diffoci`:
111136

112137
```bash
113-
diffoci diff --semantic --verbose podman://docker.io/archlinux/archlinux:repro-${IMAGE_BUILD_DATE}.${IMAGE_BUILD_NUMBER} podman://localhost/archlinux-docker:repro-${IMAGE_BUILD_DATE}.${IMAGE_BUILD_NUMBER}
138+
diffoci diff --semantic --verbose podman://docker.io/archlinux/archlinux:repro-$BUILD_VERSION podman://localhost/archlinux:repro-$BUILD_VERSION
114139
```
115140

116141
This should show no difference, acting as an additional indicator that the image has been
@@ -119,29 +144,26 @@ successfully reproduced *(see the following section about the `--semantic` flag
119144
### Note about `diffoci` requiring the `--semantic` flag (a.k.a "non-strict" mode)
120145

121146
Docker / Podman does not allow two images with the same name & tag combination to be stored
122-
locally, [preventing them to be checked with `diffoci` as-is](https://github.com/reproducible-containers/diffoci/issues/74).
147+
locally, [which in turn makes it impossible to check two images with the same name with
148+
`diffoci`](https://github.com/reproducible-containers/diffoci/issues/74).
123149
To work around this limitation, one of the two images has to be named differently, either by
124-
setting a different name / tag combination at build time or by renaming it post-build
125-
with e.g. `podman tag`.
150+
setting a different name / tag combination at build time (as done in this guide) or by renaming
151+
it post-build with e.g. `podman tag`.
126152

127153
However, the image name & tag combination is automatically reported (and updated in the case
128154
of a renaming) in the image annotations / metadata and it's apparently not possible to fully overwrite
129155
it during build or update it post-build in a straightforward way.
130156
This introduces unavoidable non-determinism
131-
in the image annotations / metadata that `diffoci` will report by default.
132-
See for instance the following `diffoci` output (with the reported difference being introduced by
133-
using `podman tag` to "rename" one of the images with the "-orig" suffix, in order to avoid name collision):
157+
in the image annotations / metadata that `diffoci` will therefore systematically report by default.
158+
See for instance the following `diffoci` output reporting a difference in the image name annotation:
134159

135160
```
136161
Event: "DescriptorMismatch" (field "Annotations")
137162
map[string]string{
138-
"io.containerd.image.name": strings.Join({
139-
"registry.archlinux.org/archlinux/archlinux-docker:repro-repro",
140-
- "-orig",
141-
}, ""),
142-
- "org.opencontainers.image.ref.name": "repro-repro-orig",
143-
+ "org.opencontainers.image.ref.name": "repro-repro",
144-
}
163+
"io.containerd.image.name": strings.Join({
164+
- "docker.io/archlinux/archlinux:repro-20260331.0.508794",
165+
+ "localhost/archlinux:repro-20260331.0.508794",
166+
}, ""),
145167
```
146168

147169
Given that it's currently not possible to have two images with the same name & tag
@@ -154,12 +176,12 @@ This is why we are "forced" to run `diffoci` with the `--semantic` flag
154176
which ignores some attributes, including image name annotations.
155177

156178
While having to run `diffoci` with the `--semantic` flag (for the lack of another option)
157-
just to workaround this image naming technical constraint is unfortunate, we can attest that:
179+
just to work around this technical constraint is unfortunate, we can attest that:
158180

159181
* This limitation is specific to metadata handling in container tooling and does not
160182
affect the actual filesystem contents or runtime behavior of the image.
161-
* The reported difference in the image name annotations is (or is supposed to be, at least) the **only**
162-
difference being reported when comparing the two images.
163-
* These image name annotations are not part of the hashed object when generating the image digest,
164-
meaning that this difference does not go in the way of digest equality between the two images (allowing
183+
* The reported difference in the image name annotation when running `diffoci` in default / strict mode
184+
is (or is supposed to be, at least) the **only** difference being reported when comparing the two images.
185+
* This image name annotation is not part of the hashed object when generating the image digest,
186+
meaning that this difference does not prevent digest equality between the two images (allowing
165187
us to claim bit for bit reproducibility regardless).
