tags: container

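Using COPY in a Dockerfile to overwrite a volume defined by the parent image

1. Problem: changes a Dockerfile makes to a parent image's volume are discarded

In the Dockerfile below, the parent image A declares a volume at /vol. While building image B, we modify a file under /vol. We want that change to persist in the resulting image B, but it does not.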

FROM alpine:3.14 AS A
RUN mkdir "/vol" && echo "FOO" > "/vol/data"
VOLUME /vol

FROM A AS B
RUN sed -i s/FOO/BAR/g "/vol/data" && cat /vol/data
RUN cat /vol/data

Building this Dockerfile shows that, within the same RUN instruction, the content of /vol/data really is changed to BAR. In the next RUN instruction, however, /vol/data is FOO again.

Sending build context to Docker daemon  2.048kB
Step 1/6 : FROM alpine:3.14 AS A
 ---> d4ff818577bc
Step 2/6 : RUN mkdir "/vol" && echo "FOO" > "/vol/data"
 ---> Running in c10987fe0acd
Removing intermediate container c10987fe0acd
 ---> 8a7395e87d34
Step 3/6 : VOLUME /vol
 ---> Running in 611a152bd6ac
Removing intermediate container 611a152bd6ac
 ---> 637ce773f44b
Step 4/6 : FROM A AS B
 ---> 637ce773f44b
Step 5/6 : RUN sed -i s/FOO/BAR/g "/vol/data" && cat /vol/data
 ---> Running in ce1481fd1eb8
BAR
Removing intermediate container ce1481fd1eb8
 ---> 873b7801b7c2
Step 6/6 : RUN cat /vol/data
 ---> Running in bd0105a6d801
FOO
Removing intermediate container bd0105a6d801
 ---> e721c33e07ef
Successfully built e721c33e07ef

2. Cause

The official Docker documentation describes this behavior of volumes as follows:

Changing the volume from within the Dockerfile: If any build steps change the data within the volume after it has been declared, those changes will be discarded.

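In other words, any change a Dockerfile makes to a volume after it has been declared is discarded.

This is easy to understand: each RUN instruction is executed by starting a container. Inside that container the volume is a fresh mount point, separate from the content stored at the volume path in the image. If the changes made to the volume are not kept when the container is committed, the image's content at that path is unaffected.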
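To make the documentation's statement concrete, here is a minimal sketch that was not part of the original experiment. It assumes the same builder behavior as in the log above. The value BAZ is only a hypothetical marker used for illustration.

```dockerfile
FROM alpine:3.14
RUN mkdir /vol && echo "FOO" > /vol/data
# Before VOLUME is declared, /vol is an ordinary directory,
# so this edit is committed into the image layer.
RUN sed -i s/FOO/BAR/g /vol/data
VOLUME /vol
# After the declaration, the documentation says RUN changes under /vol
# are discarded, so this edit is not expected to survive into the image.
RUN sed -i s/BAR/BAZ/g /vol/data
```

In the scenario of this article, the declaration lives in the parent image, so simply reordering the instructions like this is not an option.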

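3. Solution: COPY

We found a similar question, along with a solution, on Stack Overflow.

The COPY instruction can be used to modify the content of a VOLUME.

In the Dockerfile below, image D copies a file from image C, and image E copies a file from the local build context. Both approaches can modify the VOLUME.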

FROM alpine:3.14 AS A
RUN mkdir "/vol" && echo "FOO" > "/vol/data"
VOLUME /vol

FROM A AS B
RUN sed -i s/FOO/BAR/g "/vol/data" && cat /vol/data
RUN cat /vol/data

FROM alpine:3.14 AS C
RUN mkdir "/vol" && echo "BAR" > "/vol/data"

FROM A AS D
COPY --from=C /vol/data /vol/data
RUN cat /vol/data
RUN sed -i s/BAR/OFF/g "/vol/data" && cat /vol/data
RUN cat /vol/data

FROM A AS E
COPY local /vol/data
RUN cat /vol/data
RUN sed -i s/BAR/OFF/g "/vol/data" && cat /vol/data
RUN cat /vol/data

However, the build output also shows that while COPY does modify the VOLUME, a RUN instruction that follows the COPY still cannot.

Sending build context to Docker daemon  3.072kB
Step 1/18 : FROM alpine:3.14 AS A
 ---> d4ff818577bc
Step 2/18 : RUN mkdir "/vol" && echo "FOO" > "/vol/data"
 ---> Using cache
 ---> 8a7395e87d34
Step 3/18 : VOLUME /vol
 ---> Using cache
 ---> 637ce773f44b
Step 4/18 : FROM A AS B
 ---> 637ce773f44b
Step 5/18 : RUN sed -i s/FOO/BAR/g "/vol/data" && cat /vol/data
 ---> Using cache
 ---> 873b7801b7c2
Step 6/18 : RUN cat /vol/data
 ---> Using cache
 ---> e721c33e07ef
Step 7/18 : FROM alpine:3.14 AS C
 ---> d4ff818577bc
Step 8/18 : RUN mkdir "/vol" && echo "BAR" > "/vol/data"
 ---> Using cache
 ---> cfcf70f1aa1d
Step 9/18 : FROM A AS D
 ---> 637ce773f44b
Step 10/18 : COPY --from=C /vol/data /vol/data
 ---> aba2b1406b7d
Step 11/18 : RUN cat /vol/data
 ---> Running in 6d62b136f91d
BAR
Removing intermediate container 6d62b136f91d
 ---> e9b1fc3f97a0
Step 12/18 : RUN sed -i s/BAR/OFF/g "/vol/data" && cat /vol/data
 ---> Running in 0af235286554
OFF
Removing intermediate container 0af235286554
 ---> 9ddce12792c1
Step 13/18 : RUN cat /vol/data
 ---> Running in 8df52a875f51
BAR
Removing intermediate container 8df52a875f51
 ---> 85fbbc714ddc
Step 14/18 : FROM A AS E
 ---> 637ce773f44b
Step 15/18 : COPY local /vol/data
 ---> 14891ecbb893
Step 16/18 : RUN cat /vol/data
 ---> Running in 7e052e9fd375
BAR
Removing intermediate container 7e052e9fd375
 ---> 6e11f30e96a0
Step 17/18 : RUN sed -i s/BAR/OFF/g "/vol/data" && cat /vol/data
 ---> Running in 2148e7554efe
OFF
Removing intermediate container 2148e7554efe
 ---> 9819a6fd0a3e
Step 18/18 : RUN cat /vol/data
 ---> Running in b07e64ef6680
BAR
Removing intermediate container b07e64ef6680
 ---> 8b194e973d50
Successfully built 8b194e973d50
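Since RUN edits under /vol are lost even after a COPY, one possible pattern is to do all RUN-based editing in a stage that has no VOLUME declaration and then COPY the result into the final image. The following is a sketch under that assumption. The stage names prep and final and the /work path are hypothetical and do not appear in the original experiment.

```dockerfile
# Parent image that declares the volume (same as stage A above).
FROM alpine:3.14 AS A
RUN mkdir "/vol" && echo "FOO" > "/vol/data"
VOLUME /vol

# Hypothetical scratch stage: its base image has no VOLUME declaration,
# so changes made by RUN are committed as ordinary layer changes.
FROM alpine:3.14 AS prep
COPY --from=A /vol/data /work/data
RUN sed -i s/FOO/BAR/g /work/data

# Final stage: COPY writes the edited file into the volume path.
FROM A AS final
COPY --from=prep /work/data /vol/data
```

Judging from the behavior of stages D and E above, a RUN cat /vol/data in the final stage should then show the edited content.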

4. Why does COPY work?

Reading the source code, we find that the RUN instruction is implemented by running a container: https://github.com/moby/moby/blob/master/builder/dockerfile/dispatchers.go#L384

func dispatchRun(d dispatchRequest, c *instructions.RunCommand) error {
    ...
    if err := d.builder.containerManager.Run(d.builder.clientCtx, cID, d.builder.Stdout, d.builder.Stderr); err != nil {
        ...
    }

    ...
    return d.builder.commitContainer(d.state, cID, runConfigForCacheProbe)
}

The COPY instruction, by contrast, does not run a container; it writes its content directly into an RWLayer:

https://github.com/moby/moby/blob/471fd27709777d2cce3251129887e14e8bb2e0c7/builder/dockerfile/internals.go#L217

func (b *Builder) performCopy(req dispatchRequest, inst copyInstruction) error {
    ...
    rwLayer, err := imageMount.NewRWLayer()
    ...
        if err := performCopyForInfo(destInfo, info, opts); err != nil {
    ...
    return b.exportImage(state, rwLayer, imageMount.Image(), runConfigWithCommentCmd)
}

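That is where the difference comes from. We outlined above why RUN cannot modify a VOLUME, and the reason COPY can is the counterpart of that explanation: COPY modifies the image's RWLayer and commits the layer change, whereas RUN modifies the volume newly mounted in the container, and that change is not committed.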