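tags: container, source code analysis

docker cp source code analysis

At the time of writing, the latest release is v20.10.6, so all code below is taken from the v20.10.6 tag.

1. Introduction to docker cp

docker cp copies files between a container's filesystem and the host's filesystem. The command has two entry points, docker container cp and docker cp. Both do the same thing; the latter may be removed in a future release.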

https://docs.docker.com/engine/reference/commandline/container_cp/

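2. Source entry points

The CLI receives the cp command's arguments and sends requests to the Docker Engine API.

The entry points in the CLI and in the Engine API are located at: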

https://github.com/docker/cli/blob/v20.10.6/cli/command/container/cp.go#L43

https://github.com/moby/moby/blob/v20.10.6/api/server/router/container/copy.go

3. docker-cli

The cp command is registered in two places:

https://github.com/docker/cli/blob/v20.10.6/cli/command/container/cmd.go#L20

func NewContainerCommand(dockerCli command.Cli) *cobra.Command {
    ...
	cmd.AddCommand(
    ...
		NewCopyCommand(dockerCli),
    ...

https://github.com/docker/cli/blob/v20.10.6/cli/command/commands/commands.go#L96

func AddCommands(cmd *cobra.Command, dockerCli command.Cli) {
    ...
    cmd.AddCommand(
    ...
        hide(container.NewCopyCommand(dockerCli)),
    ...

The first registers docker container cp and the second registers the top-level docker cp. Both call the same function, NewCopyCommand.

https://github.com/docker/cli/blob/v20.10.6/cli/command/container/cp.go#L43

func NewCopyCommand(dockerCli command.Cli) *cobra.Command {
	var opts copyOptions

	cmd := &cobra.Command{
        ...
		Args: cli.ExactArgs(2),
		RunE: func(cmd *cobra.Command, args []string) error {
			if args[0] == "" {
				return errors.New("source can not be empty")
			}
			if args[1] == "" {
				return errors.New("destination can not be empty")
			}
			opts.source = args[0]
			opts.destination = args[1]
			return runCopy(dockerCli, opts)
		},
	}

	flags := cmd.Flags()
	flags.BoolVarP(&opts.followLink, "follow-link", "L", false, "Always follow symbol link in SRC_PATH")
	flags.BoolVarP(&opts.copyUIDGID, "archive", "a", false, "Archive mode (copy all uid/gid information)")
	return cmd
}

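This function defines the cp command's arguments and flags; the actual work is done by runCopy. Only two directions are supported, container to host and host to container, handled by copyFromContainer and copyToContainer respectively. Copying between containers is not supported, and neither is a copy in which neither side names a container.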

https://github.com/docker/cli/blob/v20.10.6/cli/command/container/cp.go#L77

func runCopy(dockerCli command.Cli, opts copyOptions) error {
	srcContainer, srcPath := splitCpArg(opts.source)
	destContainer, destPath := splitCpArg(opts.destination)

    ...

	switch direction {
	case fromContainer:
		return copyFromContainer(ctx, dockerCli, copyConfig)
	case toContainer:
		return copyToContainer(ctx, dockerCli, copyConfig)
	case acrossContainers:
		return errors.New("copying between containers is not supported")
	default:
		return errors.New("must specify at least one container source")

copyFromContainer and copyToContainer mainly assemble the parameters and call the Docker Engine API.
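The direction dispatch in runCopy can be sketched as follows. This is a simplified, Unix-only illustration: splitArg and direction are hypothetical names, and splitArg only approximates what the CLI's splitCpArg does (the real function also handles Windows paths).

```go
package main

import (
	"fmt"
	"strings"
)

// splitArg is a simplified, Unix-only stand-in for the CLI's splitCpArg.
// It splits "CONTAINER:PATH" into its two parts and returns an empty
// container name for plain local paths.
func splitArg(arg string) (container, path string) {
	if strings.HasPrefix(arg, "/") {
		// Absolute local path, for example /tmp/a:b.
		return "", arg
	}
	parts := strings.SplitN(arg, ":", 2)
	if len(parts) == 1 || strings.HasPrefix(parts[0], ".") {
		// No colon, or an explicit relative path such as ./file:name.txt.
		return "", arg
	}
	return parts[0], parts[1]
}

// direction reports which way a copy would go, mirroring the four
// cases of the switch in runCopy.
func direction(src, dst string) string {
	srcContainer, _ := splitArg(src)
	dstContainer, _ := splitArg(dst)
	switch {
	case srcContainer != "" && dstContainer != "":
		return "acrossContainers"
	case srcContainer != "":
		return "fromContainer"
	case dstContainer != "":
		return "toContainer"
	default:
		return "none"
	}
}

func main() {
	fmt.Println(direction("web:/etc/hosts", "./hosts"))
	fmt.Println(direction("./hosts", "web:/etc/hosts"))
	fmt.Println(direction("a:/x", "b:/y"))
	fmt.Println(direction("/tmp/x", "/tmp/y"))
}
```

Only the first two results lead to an API call; the other two correspond to the two error messages in runCopy.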

3.1 copyFromContainer

If the followLink option (-L) was set on the command line, srcPath is replaced with the path the symlink points to.

https://github.com/docker/cli/blob/v20.10.6/cli/command/container/cp.go#L137-L154

if copyConfig.followLink {
    srcStat, err := client.ContainerStatPath(ctx, copyConfig.container, srcPath)

    // If the destination is a symbolic link, we should follow it.
    if err == nil && srcStat.Mode&os.ModeSymlink != 0 {
        linkTarget := srcStat.LinkTarget
        ...
        srcPath = linkTarget
    }

}

Here the symlink target is obtained by ContainerStatPath, which calls the API.

The actual copy is performed by CopyFromContainer, which also calls the API.

https://github.com/docker/cli/blob/v20.10.6/cli/command/container/cp.go#L156

content, stat, err := client.CopyFromContainer(ctx, copyConfig.container, srcPath)

Next, the output returned by the API is written to dstPath.

If dstPath is -, the tar stream is written straight to stdout. https://github.com/docker/cli/blob/v20.10.6/cli/command/container/cp.go#L162-L179

if dstPath == "-" {
    _, err = io.Copy(dockerCli.Out(), content)
    return err
}

If the followLink option was used, srcPath has been changed, so the paths inside the archive have to be rebased to the original name.

preArchive := content
if len(srcInfo.RebaseName) != 0 {
    _, srcBase := archive.SplitPathDirEntry(srcInfo.Path)
    preArchive = archive.RebaseArchiveEntries(content, srcBase, srcInfo.RebaseName)
}
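To make the rebasing step concrete, here is a minimal sketch built on the standard archive/tar package. It is not the implementation of archive.RebaseArchiveEntries; buildTar, rebase and names are hypothetical helpers, and the sketch only rewrites entry names (it ignores hard-link targets).

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// buildTar returns an in-memory tar archive with one small regular file
// per name. It only exists to give the example something to rebase.
func buildTar(entries []string) ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for _, name := range entries {
		body := []byte("data")
		hdr := &tar.Header{Name: name, Typeflag: tar.TypeReg, Mode: 0644, Size: int64(len(body))}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write(body); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// rebase copies the archive entry by entry, replacing the first
// occurrence of oldBase in each entry name with newBase.
func rebase(archive []byte, oldBase, newBase string) ([]byte, error) {
	tr := tar.NewReader(bytes.NewReader(archive))
	var out bytes.Buffer
	tw := tar.NewWriter(&out)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, err
		}
		hdr.Name = strings.Replace(hdr.Name, oldBase, newBase, 1)
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := io.Copy(tw, tr); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return out.Bytes(), nil
}

// names lists the entry names of an archive in order.
func names(archive []byte) ([]string, error) {
	var list []string
	tr := tar.NewReader(bytes.NewReader(archive))
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return list, nil
		}
		if err != nil {
			return nil, err
		}
		list = append(list, hdr.Name)
	}
}

func main() {
	src, err := buildTar([]string{"target/a.txt", "target/sub/b.txt"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	out, err := rebase(src, "target", "link")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(names(out))
}
```

With -L, the daemon archives the link target, so the entries carry the target's base name; rebasing restores the name the user originally asked for.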

Finally, the archive is copied to the destination path.

return archive.CopyTo(preArchive, srcInfo, dstPath)

Although this step calls code from the Docker Engine pkg directory, it runs entirely in the CLI; it essentially extracts the archive to dstPath.

https://github.com/moby/moby/blob/v20.10.6/pkg/archive/copy.go#L402

3.2 copyToContainer

If dstPath is a symlink, the destination path is replaced with the symlink's target.

https://github.com/docker/cli/blob/v20.10.6/cli/command/container/cp.go#L200-L219

dstStat, err := client.ContainerStatPath(ctx, copyConfig.container, dstPath)

// If the destination is a symbolic link, we should evaluate it.
if err == nil && dstStat.Mode&os.ModeSymlink != 0 {
    linkTarget := dstStat.LinkTarget
    if !system.IsAbs(linkTarget) {
        // Join with the parent directory.
        dstParent, _ := archive.SplitPathDirEntry(dstPath)
        linkTarget = filepath.Join(dstParent, linkTarget)
    }

    dstInfo.Path = linkTarget
...
}
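The handling of a relative link target above can be sketched on its own. resolveLinkTarget is a hypothetical helper, and the sketch assumes a Linux container, so it uses the slash-only path package instead of system.IsAbs and filepath.

```go
package main

import (
	"fmt"
	"path"
)

// resolveLinkTarget returns the path a symlink at linkPath refers to.
// Absolute targets are used as they are; relative targets are joined
// with the directory that contains the link.
func resolveLinkTarget(linkPath, target string) string {
	if path.IsAbs(target) {
		return target
	}
	parent := path.Dir(path.Clean(linkPath))
	return path.Join(parent, target)
}

func main() {
	fmt.Println(resolveLinkTarget("/var/log/current", "app.log"))
	fmt.Println(resolveLinkTarget("/var/log/current", "/data/app.log"))
	fmt.Println(resolveLinkTarget("/a/b/link", "../c"))
}
```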

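If srcPath is -, the destination must be a directory (according to the help text this is by design: the tar archive read from stdin is extracted into a directory; why it was designed this way is not clear to me).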

https://github.com/docker/cli/blob/v20.10.6/cli/command/container/cp.go#L236-L242

if srcPath == "-" {
    content = os.Stdin
    resolvedDstPath = dstInfo.Path
    if !dstInfo.IsDir {
        return errors.Errorf("destination \"%s:%s\" must be a directory", copyConfig.container, dstPath)
    }
}

Pack srcPath into an archive (if the followLink option is set, srcPath is replaced with the symlink's target).

// Prepare source copy info.
srcInfo, err := archive.CopyInfoSourcePath(srcPath, copyConfig.followLink)
if err != nil {
    return err
}

srcArchive, err := archive.TarResource(srcInfo)
if err != nil {
    return err
}
defer srcArchive.Close()
dstDir, preparedArchive, err := archive.PrepareArchiveCopy(srcArchive, srcInfo, dstInfo)
if err != nil {
    return err
}
defer preparedArchive.Close()

resolvedDstPath = dstDir
content = preparedArchive

Finally, the archive is copied into the container; the actual copy is performed by CopyToContainer, which calls the API.

options := types.CopyToContainerOptions{
    AllowOverwriteDirWithFile: false,
    CopyUIDGID:                copyConfig.copyUIDGID,
}
return client.CopyToContainer(ctx, copyConfig.container, resolvedDstPath, content, options)

4. docker engine api

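The CLI calls three API client functions: ContainerStatPath, CopyFromContainer and CopyToContainer.

The corresponding server-side handlers are headContainersArchive, getContainersArchive and putContainersArchive.

The flow of each of the three is analysed below.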

4.1 headContainersArchive(ContainerStatPath)

https://github.com/moby/moby/blob/v20.10.6/api/server/router/container/copy.go#L78

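This API takes two parameters: name (the container ID), passed in the request URI, and path, passed in the query string.

path may be srcPath, dstPath, or the path that dstPath points to.

Once name and path have been parsed, the request is handled by ContainerStatPath.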

https://github.com/moby/moby/blob/v20.10.6/daemon/archive.go#L75

func (daemon *Daemon) ContainerStatPath(name string, path string) (stat *types.ContainerPathStat, err error) {
	container, err := daemon.GetContainer(name)
    ...
	stat, err = daemon.containerStatPath(container, path)
    ...
}

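It first mounts the container's filesystem and its volumes, then resolves path under the rootfs into resolvedPath (the location on the host that is passed to Lstat) and absPath (the absolute path as seen inside the container), and finally builds the ContainerPathStat from the two.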

https://github.com/moby/moby/blob/v20.10.6/daemon/archive.go#L152

func (daemon *Daemon) containerStatPath(container *container.Container, path string) (stat *types.ContainerPathStat, err error) {
    ...
	if err = daemon.Mount(container); err != nil {
		return nil, err
	}
    ...
	err = daemon.mountVolumes(container)
    ...
	resolvedPath, absPath, err := container.ResolvePath(path)
    ...
	return container.StatPath(resolvedPath, absPath)
}
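The idea of resolving a container path "inside" the rootfs can be illustrated with a purely lexical sketch. scopedJoin is a hypothetical helper; unlike the daemon's ResolvePath and GetResourcePath it does not follow symlinks, it only shows how anchoring the path at / keeps .. components from climbing above the root.

```go
package main

import (
	"fmt"
	"path"
)

// scopedJoin anchors p at "/" before joining it to root, so that ".."
// components can never climb above root. It is lexical only and does
// not touch the filesystem.
func scopedJoin(root, p string) string {
	abs := path.Join("/", p) // Join also cleans: "/../x" becomes "/x".
	return path.Join(root, abs)
}

func main() {
	root := "/var/lib/docker/overlay2/example/merged" // made-up rootfs location
	fmt.Println(scopedJoin(root, "/etc/hosts"))
	fmt.Println(scopedJoin(root, "../../etc/passwd"))
}
```

Symlinks inside the rootfs still need to be evaluated within the same scope, which is the part this sketch leaves out.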

The ContainerPathStat is built as follows:

https://github.com/moby/moby/blob/v20.10.6/container/archive.go#L51

func (container *Container) StatPath(resolvedPath, absPath string) (stat *types.ContainerPathStat, err error) {
    ...
	lstat, err := driver.Lstat(resolvedPath)
	if err != nil {
		return nil, err
	}

	var linkTarget string
	if lstat.Mode()&os.ModeSymlink != 0 {
		// Fully evaluate the symlink in the scope of the container rootfs.
		hostPath, err := container.GetResourcePath(absPath)
		if err != nil {
			return nil, err
		}

		linkTarget, err = driver.Rel(driver.Path(), hostPath)
		if err != nil {
			return nil, err
		}

		// Make it an absolute path.
		linkTarget = driver.Join(string(driver.Separator()), linkTarget)
	}

	return &types.ContainerPathStat{
		Name:       driver.Base(absPath),
		Size:       lstat.Size(),
		Mode:       lstat.Mode(),
		Mtime:      lstat.ModTime(),
		LinkTarget: linkTarget,
	}, nil
}

Finally, the ContainerPathStat is returned in a response header. https://github.com/moby/moby/blob/v20.10.6/api/server/router/container/copy.go#L64

func setContainerPathStatHeader(stat *types.ContainerPathStat, header http.Header) error {
	statJSON, err := json.Marshal(stat)
	if err != nil {
		return err
	}

	header.Set(
		"X-Docker-Container-Path-Stat",
		base64.StdEncoding.EncodeToString(statJSON),
	)

	return nil
}
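The receiving side has to reverse these two steps: base64-decode the header value, then unmarshal the JSON. The sketch below shows the round trip. pathStat is a local stand-in for types.ContainerPathStat and its JSON tags are an assumption; encodeStat, decodeStat and roundTrip are hypothetical names.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
	"time"
)

// pathStat is a local stand-in for types.ContainerPathStat.
// The JSON tags are assumed, not copied from the Docker source.
type pathStat struct {
	Name       string      `json:"name"`
	Size       int64       `json:"size"`
	Mode       os.FileMode `json:"mode"`
	Mtime      time.Time   `json:"mtime"`
	LinkTarget string      `json:"linkTarget"`
}

const statHeader = "X-Docker-Container-Path-Stat"

// encodeStat stores stat in the header as base64-encoded JSON.
func encodeStat(stat pathStat, header http.Header) error {
	raw, err := json.Marshal(stat)
	if err != nil {
		return err
	}
	header.Set(statHeader, base64.StdEncoding.EncodeToString(raw))
	return nil
}

// decodeStat reverses encodeStat.
func decodeStat(header http.Header) (pathStat, error) {
	var stat pathStat
	raw, err := base64.StdEncoding.DecodeString(header.Get(statHeader))
	if err != nil {
		return stat, err
	}
	err = json.Unmarshal(raw, &stat)
	return stat, err
}

// roundTrip encodes stat into a fresh header and decodes it again.
func roundTrip(stat pathStat) (pathStat, error) {
	header := http.Header{}
	if err := encodeStat(stat, header); err != nil {
		return pathStat{}, err
	}
	return decodeStat(header)
}

func main() {
	stat, err := roundTrip(pathStat{
		Name:       "hosts",
		Size:       174,
		Mode:       os.ModeSymlink | 0777,
		Mtime:      time.Unix(1620000000, 0).UTC(),
		LinkTarget: "/etc/hosts",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(stat.Name, stat.Mode&os.ModeSymlink != 0, stat.LinkTarget)
}
```

Putting the stat in a header lets the same metadata travel with a HEAD request, which has no body, and with a GET request, whose body is already taken by the tar stream.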

4.2 getContainersArchive(CopyFromContainer)

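This API takes two parameters: name (the container ID), passed in the request URI, and path (srcPath), passed in the query string.

After the path has been archived, the data is written to the response body by writeCompressedResponse, which compresses it on the way out.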

https://github.com/moby/moby/blob/v20.10.6/api/server/router/container/copy.go#L115

func (s *containerRouter) getContainersArchive(ctx context.Context, w http.ResponseWriter, r *http.Request, vars map[string]string) error {
	v, err := httputils.ArchiveFormValues(r, vars)
	if err != nil {
		return err
	}

	tarArchive, stat, err := s.backend.ContainerArchivePath(v.Name, v.Path)
	if err != nil {
		return err
	}
	defer tarArchive.Close()

	if err := setContainerPathStatHeader(stat, w.Header()); err != nil {
		return err
	}

	w.Header().Set("Content-Type", "application/x-tar")
	return writeCompressedResponse(w, r, tarArchive)
}
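The body of writeCompressedResponse is not shown here, so the following is only a sketch of how such a helper could work, assuming compression is negotiated through the Accept-Encoding request header. writeMaybeGzip and fetch are hypothetical names, and only gzip is handled.

```go
package main

import (
	"compress/gzip"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// writeMaybeGzip streams body to w and gzip-compresses it when the
// client sent "Accept-Encoding: gzip". Assumed behaviour, not the
// daemon's actual helper.
func writeMaybeGzip(w http.ResponseWriter, r *http.Request, body io.Reader) error {
	if r.Header.Get("Accept-Encoding") != "gzip" {
		_, err := io.Copy(w, body)
		return err
	}
	w.Header().Set("Content-Encoding", "gzip")
	gw := gzip.NewWriter(w)
	if _, err := io.Copy(gw, body); err != nil {
		return err
	}
	return gw.Close()
}

// fetch runs the handler in-process and returns the Content-Encoding
// header together with the decoded body.
func fetch(acceptEncoding, payload string) (encoding, body string, err error) {
	req := httptest.NewRequest(http.MethodGet, "/containers/demo/archive?path=/etc", nil)
	if acceptEncoding != "" {
		req.Header.Set("Accept-Encoding", acceptEncoding)
	}
	rec := httptest.NewRecorder()
	if err := writeMaybeGzip(rec, req, strings.NewReader(payload)); err != nil {
		return "", "", err
	}
	encoding = rec.Header().Get("Content-Encoding")
	var src io.Reader = rec.Body
	if encoding == "gzip" {
		zr, err := gzip.NewReader(rec.Body)
		if err != nil {
			return "", "", err
		}
		defer zr.Close()
		src = zr
	}
	raw, err := io.ReadAll(src)
	return encoding, string(raw), err
}

func main() {
	fmt.Println(fetch("gzip", "tar bytes"))
	fmt.Println(fetch("", "tar bytes"))
}
```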

The archive is produced by daemon.containerArchivePath. https://github.com/moby/moby/blob/v20.10.6/daemon/archive.go#L100

func (daemon *Daemon) ContainerArchivePath(name string, path string) (content io.ReadCloser, stat *types.ContainerPathStat, err error) {
    ctr, err := daemon.GetContainer(name)
    ...
    content, stat, err = daemon.containerArchivePath(ctr, path)
    ...

containerArchivePath creates an archive of path and returns a ContainerPathStat.

The ContainerPathStat is obtained in almost the same way as in headContainersArchive. https://github.com/moby/moby/blob/v20.10.6/daemon/archive.go#L182-L218

func (daemon *Daemon) containerArchivePath(container *container.Container, path string) (content io.ReadCloser, stat *types.ContainerPathStat, err error) {
    ...
    if err = daemon.Mount(container); err != nil {
        return nil, nil, err
    }
    ...
    if err = daemon.mountVolumes(container); err != nil {
        return nil, nil, err
    }

    // Normalize path before sending to rootfs
    path = container.BaseFS.FromSlash(path)

    resolvedPath, absPath, err := container.ResolvePath(path)
    ...
    stat, err = container.StatPath(resolvedPath, absPath)
    ...
}

The key part is the archiving step:

https://github.com/moby/moby/blob/v20.10.6/daemon/archive.go#L231-L263

driver := container.BaseFS
...
opts := archive.TarResourceRebaseOpts(sourceBase, driver.Base(absPath))

data, err := archivePath(driver, sourceDir, opts, container.BaseFS.Path())
...
content = ioutils.NewReadCloserWrapper(data, func() error {
    err := data.Close()
    container.DetachAndUnmount(daemon.LogVolumeEvent)
    daemon.Unmount(container)
    container.Unlock()
    return err
})
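The wrapper at the end of this excerpt defers unmounting and unlocking until the caller has finished reading the stream. A minimal sketch of the same pattern, using only the standard library (cleanupReadCloser, withCleanup and demo are hypothetical names, not the ioutils implementation):

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// cleanupReadCloser reads from the embedded Reader and runs cleanup
// when Close is called.
type cleanupReadCloser struct {
	io.Reader
	cleanup func() error
}

func (c *cleanupReadCloser) Close() error { return c.cleanup() }

// withCleanup attaches a cleanup function to a reader.
func withCleanup(r io.Reader, cleanup func() error) io.ReadCloser {
	return &cleanupReadCloser{Reader: r, cleanup: cleanup}
}

// demo records the order of events: the stream is handed out first and
// the clean-up steps only run once the consumer closes it.
func demo() (data string, events []string) {
	rc := withCleanup(strings.NewReader("tar bytes"), func() error {
		events = append(events, "unmount volumes", "unmount rootfs", "unlock container")
		return nil
	})
	events = append(events, "handler returned stream")
	raw, _ := io.ReadAll(rc)
	rc.Close()
	return string(raw), events
}

func main() {
	data, events := demo()
	fmt.Println(data)
	for _, e := range events {
		fmt.Println(e)
	}
}
```

This explains why getContainersArchive must call tarArchive.Close(): without it the container would stay mounted and locked.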

If the driver (container.BaseFS) implements ArchivePath, that method takes precedence.

At present no localfs implements it; the only implementation I found is remotefs (lcowfs). https://github.com/moby/moby/blob/v20.10.6/daemon/graphdriver/lcow/remotefs.go#L83

Otherwise, chrootarchive.Tar is used.

func archivePath(i interface{}, src string, opts *archive.TarOptions, root string) (io.ReadCloser, error) {
	if ap, ok := i.(archiver); ok {
		return ap.ArchivePath(src, opts)
	}
	return chrootarchive.Tar(src, opts, root)
}
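archivePath relies on Go's optional-interface idiom: a type assertion checks at run time whether the driver offers a specialised method, and a generic fallback is used otherwise. A stripped-down sketch, where localFS, remoteFS and fallbackTar are hypothetical stand-ins and strings take the place of real archives:

```go
package main

import "fmt"

// archiver is the optional capability a driver may provide.
type archiver interface {
	ArchivePath(src string) string
}

// localFS has no ArchivePath method, so the fallback is used for it.
type localFS struct{}

// remoteFS implements archiver and is therefore asked directly.
type remoteFS struct{}

func (remoteFS) ArchivePath(src string) string { return "driver:" + src }

// fallbackTar stands in for chrootarchive.Tar.
func fallbackTar(src string) string { return "chroot-tar:" + src }

// archivePath prefers the driver's own implementation when it exists.
func archivePath(driver interface{}, src string) string {
	if ap, ok := driver.(archiver); ok {
		return ap.ArchivePath(src)
	}
	return fallbackTar(src)
}

func main() {
	fmt.Println(archivePath(localFS{}, "/etc"))
	fmt.Println(archivePath(remoteFS{}, "/etc"))
}
```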

4.2.1 chrootarchive.Tar

https://github.com/moby/moby/blob/v20.10.6/pkg/chrootarchive/archive.go#L101

func Tar(srcPath string, options *archive.TarOptions, root string) (io.ReadCloser, error) {
	if options == nil {
		options = &archive.TarOptions{}
	}
	return invokePack(srcPath, options, root)
}

It runs the docker-tar command to create the archive. https://github.com/moby/moby/blob/v20.10.6/pkg/chrootarchive/archive_unix.go#L178

func invokePack(srcPath string, options *archive.TarOptions, root string) (io.ReadCloser, error) {
    if root == "" {
        return nil, errors.New("root path must not be empty")
    }

    ...

    cmd := reexec.Command("docker-tar", relSrc, root)

docker-tar is implemented by the tar function. https://github.com/moby/moby/blob/v20.10.6/pkg/chrootarchive/init_unix.go#L17

reexec.Register("docker-tar", tar)
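docker-tar is not a separate binary: the daemon re-executes itself under a different argv[0], and the child process looks that name up in a registry of functions. The sketch below imitates that registry in a single process. register and dispatch are hypothetical names that only approximate what pkg/reexec does, and no child process is actually started.

```go
package main

import (
	"fmt"
	"os"
)

// initializers maps a fake program name to the function that should run
// when the process is started under that name.
var initializers = map[string]func(){}

// register adds an entry and refuses duplicates.
func register(name string, fn func()) {
	if _, exists := initializers[name]; exists {
		panic("func already registered under name " + name)
	}
	initializers[name] = fn
}

// dispatch runs the function registered under argv0 and reports whether
// one was found.
func dispatch(argv0 string) bool {
	fn, ok := initializers[argv0]
	if ok {
		fn()
	}
	return ok
}

func main() {
	register("docker-tar", func() {
		fmt.Println("child: would chroot and stream a tar archive to stdout")
	})
	// In a re-executed child, os.Args[0] is the name chosen by the parent.
	if dispatch(os.Args[0]) {
		return
	}
	fmt.Println("parent: normal start-up")
	// Simulate what the child would do when started as "docker-tar".
	dispatch("docker-tar")
}
```

Running the work in a child process is what makes the chroot safe: the chroot only affects the short-lived child, not the daemon itself.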

The implementation calls archive.TarWithOptions (the archive module is not analysed in this article), but performs a chroot first.

func tar() {
    ...
    if err := realChroot(root); err != nil {
        fatal(err)
    }
    ...	
    rdr, err := archive.TarWithOptions(src, &options)
    if err != nil {
        fatal(err)
    }
    defer rdr.Close()

    if _, err := io.Copy(os.Stdout, rdr); err != nil {
        fatal(err)
    }

    os.Exit(0)
}

4.2.2 lcowfs

lcowfs creates the archive by running the command remotefs archivepath path.

https://github.com/moby/moby/blob/v20.10.6/daemon/graphdriver/lcow/remotefs.go#L83

if err := l.runRemoteFSProcess(tarBuf, w, remotefs.ArchivePathCmd, src); err != nil {
...

https://github.com/moby/moby/blob/v20.10.6/daemon/graphdriver/lcow/remotefs.go#L127-L129

func (l *lcowfs) runRemoteFSProcess(stdin io.Reader, stdout io.Writer, args ...string) error {
    ...
    cmd := fmt.Sprintf("%s %s", remotefs.RemotefsCmd, strings.Join(args, " "))
    stderr := &bytes.Buffer{}
    if err := l.currentSVM.runProcess(cmd, stdin, stdout, stderr); err != nil {
        return err
    }
    ...
}

remotefs is a multi-command binary, and archivepath is one of its subcommands. archivepath maps to the following implementation:

https://github.com/moby/moby/blob/v20.10.6/vendor/github.com/Microsoft/opengcs/service/gcsutils/remotefs/remotefs.go#L48

// Commands provide a string -> remotefs function mapping.
// This is useful for commandline programs that will receive a string
// as the function to execute.
var Commands = map[string]Func{
    ...
	ArchivePathCmd:    ArchivePath,
}

archivepath is implemented as follows; the function that actually builds the archive is archive.TarWithOptions. The archive module is not analysed in this article.

https://github.com/moby/moby/blob/v20.10.6/vendor/github.com/Microsoft/opengcs/service/gcsutils/remotefs/remotefs.go#L559

// ArchivePath archives the given directory and writes it to out.
// Args:
// - in = size of json | json of archive.TarOptions
// - args[0] = source directory name
// Out:
// - out = tar file of the archive
func ArchivePath(in io.Reader, out io.Writer, args []string) error {
	if len(args) < 1 {
		return ErrInvalid
	}

	opts, err := ReadTarOptions(in)
	if err != nil {
		return err
	}

	r, err := archive.TarWithOptions(args[0], opts)
	if err != nil {
		return err
	}

	if _, err := io.Copy(out, r); err != nil {
		return err
	}
	return nil
}

4.3 putContainersArchive(CopyToContainer)

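putContainersArchive takes four parameters from the client: name (the container ID) in the request URI, and path (dstPath), noOverwriteDirNonDir and copyUIDGID in the query string. docker cp always sends noOverwriteDirNonDir=true, because the CLI sets AllowOverwriteDirWithFile to false; copyUIDGID=true is sent only when the -a flag is given. https://github.com/moby/moby/blob/v20.10.6/client/container_copy.go#L33-L59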

query.Set("path", filepath.ToSlash(dstPath)) // Normalize the paths used in the API.
// Do not allow for an existing directory to be overwritten by a non-directory and vice versa.
if !options.AllowOverwriteDirWithFile {
    query.Set("noOverwriteDirNonDir", "true")
}

if options.CopyUIDGID {
    query.Set("copyUIDGID", "true")
}

apiPath := "/containers/" + containerID + "/archive"
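As a quick check of what ends up on the wire, the query construction above can be reproduced with net/url alone. archiveQuery and archiveURL are hypothetical helpers, and the container name web is made up.

```go
package main

import (
	"fmt"
	"net/url"
)

// archiveQuery builds the query string for PUT /containers/{id}/archive
// from the same three inputs the client code uses.
func archiveQuery(dstPath string, allowOverwriteDirWithFile, copyUIDGID bool) string {
	query := url.Values{}
	query.Set("path", dstPath)
	if !allowOverwriteDirWithFile {
		query.Set("noOverwriteDirNonDir", "true")
	}
	if copyUIDGID {
		query.Set("copyUIDGID", "true")
	}
	// Encode sorts the keys and percent-encodes the values.
	return query.Encode()
}

// archiveURL prepends the endpoint path.
func archiveURL(containerID, dstPath string, allowOverwriteDirWithFile, copyUIDGID bool) string {
	return "/containers/" + containerID + "/archive?" +
		archiveQuery(dstPath, allowOverwriteDirWithFile, copyUIDGID)
}

func main() {
	// docker cp without -a
	fmt.Println(archiveURL("web", "/tmp", false, false))
	// docker cp -a
	fmt.Println(archiveURL("web", "/tmp", false, true))
}
```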

After parsing the parameters, putContainersArchive passes the request body to ContainerExtractToDir for extraction. https://github.com/moby/moby/blob/v20.10.6/api/server/router/container/copy.go#L135

func (s *containerRouter) putContainersArchive(ctx context.Context, w http.ResponseWriter, r *http.Request, vars map[string]string) error {
    ...
    return s.backend.ContainerExtractToDir(v.Name, v.Path, copyUIDGID, noOverwriteDirNonDir, r.Body)
}

containerExtractToDir again starts by mounting the rootfs and the volumes. https://github.com/moby/moby/blob/v20.10.6/daemon/archive.go#L273-L289

func (daemon *Daemon) containerExtractToDir(container *container.Container, path string, copyUIDGID, noOverwriteDirNonDir bool, content io.Reader) (err error) {
    container.Lock()
    defer container.Unlock()

    if err = daemon.Mount(container); err != nil {
        return err
    }
    defer daemon.Unmount(container)

    err = daemon.mountVolumes(container)
    defer container.DetachAndUnmount(daemon.LogVolumeEvent)
    if err != nil {
        return err
    }

    // Normalize path before sending to rootfs'
    path = container.BaseFS.FromSlash(path)
    driver := container.BaseFS

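It then resolves the destination path through any symlinks and checks whether the result is a directory; if it is not, an error is returned.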

https://github.com/moby/moby/blob/v20.10.6/daemon/archive.go#L315-L318

func (daemon *Daemon) containerExtractToDir(container *container.Container, path string, copyUIDGID, noOverwriteDirNonDir bool, content io.Reader) (err error) {
    ...
    
    resolvedPath, err := container.GetResourcePath(absPath)
    if err != nil {
        return err
    }

    stat, err := driver.Lstat(resolvedPath)
    if err != nil {
        return err
    }

    if !stat.IsDir() {
        return ErrExtractPointNotDirectory
    }

The destination must not be read-only:

https://github.com/moby/moby/blob/v20.10.6/daemon/archive.go#L356-L363

func (daemon *Daemon) containerExtractToDir(container *container.Container, path string, copyUIDGID, noOverwriteDirNonDir bool, content io.Reader) (err error) {
    ...
    toVolume, err := checkIfPathIsInAVolume(container, absPath)
    ...
    if !toVolume && container.HostConfig.ReadonlyRootfs {
        return ErrRootFSReadOnly
    }
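The body of checkIfPathIsInAVolume is elided above, so the following sketch is an assumption about how the two read-only checks fit together: a path inside a volume is governed by that volume's read/write flag, and everything else by ReadonlyRootfs. mount, inMount and checkWritable are hypothetical, and the prefix test is a simplification of the daemon's own path matching.

```go
package main

import (
	"errors"
	"fmt"
	"path"
	"strings"
)

// mount is a minimal stand-in for a container mount point.
type mount struct {
	Destination string
	RW          bool
}

var (
	errVolumeReadOnly = errors.New("mounted volume is marked read-only")
	errRootFSReadOnly = errors.New("container rootfs is marked read-only")
)

// inMount reports whether absPath is the mount destination or lies below it.
func inMount(m mount, absPath string) bool {
	dest := path.Clean(m.Destination)
	return absPath == dest || strings.HasPrefix(absPath, strings.TrimSuffix(dest, "/")+"/")
}

// checkWritable applies the volume check first and the rootfs check second.
func checkWritable(mounts []mount, readonlyRootfs bool, absPath string) error {
	for _, m := range mounts {
		if inMount(m, absPath) {
			if !m.RW {
				return errVolumeReadOnly
			}
			return nil // writable volume: the rootfs flag does not apply
		}
	}
	if readonlyRootfs {
		return errRootFSReadOnly
	}
	return nil
}

func main() {
	mounts := []mount{{Destination: "/data", RW: true}, {Destination: "/config", RW: false}}
	fmt.Println(checkWritable(mounts, true, "/data/upload"))
	fmt.Println(checkWritable(mounts, true, "/etc"))
	fmt.Println(checkWritable(mounts, false, "/config/app"))
}
```

Under this reading, a container started with a read-only rootfs can still receive files through docker cp as long as the destination lies inside a writable volume.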

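extractArchive is then called to unpack the archive. If copyUIDGID is set, the uid and gid of the files are changed to those of the container's main user; otherwise the original uid and gid are kept. https://github.com/moby/moby/blob/v20.10.6/daemon/archive.go#L365-L379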

func (daemon *Daemon) containerExtractToDir(container *container.Container, path string, copyUIDGID, noOverwriteDirNonDir bool, content io.Reader) (err error) {
    ...
    options := daemon.defaultTarCopyOptions(noOverwriteDirNonDir)

    if copyUIDGID {
        var err error
        // tarCopyOptions will appropriately pull in the right uid/gid for the
        // user/group and will set the options.
        options, err = daemon.tarCopyOptions(container, noOverwriteDirNonDir)
        if err != nil {
            return err
        }
    }

    if err := extractArchive(driver, content, resolvedPath, options, container.BaseFS.Path()); err != nil {
        return err
    }

If the driver (container.BaseFS) has an ExtractArchive method, that method is used (at present only lcowfs implements it). Otherwise chrootarchive.UntarWithRoot is called.

https://github.com/moby/moby/blob/v20.10.6/daemon/archive.go#L34

func extractArchive(i interface{}, src io.Reader, dst string, opts *archive.TarOptions, root string) error {
    if ea, ok := i.(extractor); ok {
        return ea.ExtractArchive(src, dst, opts)
    }

    return chrootarchive.UntarWithRoot(src, dst, opts, root)
}

4.3.1 chrootarchive.UntarWithRoot

https://github.com/moby/moby/blob/v20.10.6/pkg/chrootarchive/archive.go#L54

func UntarWithRoot(tarArchive io.Reader, dest string, options *archive.TarOptions, root string) error {
	return untarHandler(tarArchive, dest, options, true, root)
}

If the destination directory does not exist in the container, it is created and chowned, and then invokeUnpack is called. https://github.com/moby/moby/blob/v20.10.6/pkg/chrootarchive/archive.go#L66

func untarHandler(tarArchive io.Reader, dest string, options *archive.TarOptions, decompress bool, root string) error {
    ...
    if _, err := os.Stat(dest); os.IsNotExist(err) {
        if err := idtools.MkdirAllAndChownNew(dest, 0755, rootIDs); err != nil {
            return err
        }
    }
    ...
    return invokeUnpack(r, dest, options, root)
}

It runs the docker-untar command to unpack the archive.

https://github.com/moby/moby/blob/v20.10.6/pkg/chrootarchive/archive_unix.go#L61

func invokeUnpack(decompressedArchive io.Reader, dest string, options *archive.TarOptions, root string) error {
    ...
	cmd := reexec.Command("docker-untar", dest, root)
	cmd.Stdin = decompressedArchive
    ...
	if err := cmd.Start(); err != nil {
		w.Close()
		return fmt.Errorf("Untar error on re-exec cmd: %v", err)
	}
    ...
}

docker-untar is implemented by chrootarchive.untar, which first calls chroot and then archive.Unpack. The archive module is not analysed in this article.

https://github.com/moby/moby/blob/v20.10.6/pkg/chrootarchive/archive_unix.go#L25

func untar() {
	...
	if err := chroot(root); err != nil {
		fatal(err)
	}

	if err := archive.Unpack(os.Stdin, dst, &options); err != nil {
		fatal(err)
	}
	...
}

4.3.2 lcowfs

lcowfs unpacks the archive by invoking extractarchive.

https://github.com/moby/moby/blob/v20.10.6/daemon/graphdriver/lcow/remotefs.go#L77

func (l *lcowfs) ExtractArchive(src io.Reader, dst string, opts *archive.TarOptions) error {
    ...
    if err := l.runRemoteFSProcess(input, nil, remotefs.ExtractArchiveCmd, dst); err != nil {
}

The extractarchive command maps to the ExtractArchive function.

https://github.com/moby/moby/blob/v20.10.6/vendor/github.com/Microsoft/opengcs/service/gcsutils/remotefs/remotefs.go#L47

var Commands = map[string]Func{
    ...
    ExtractArchiveCmd: ExtractArchive,

ExtractArchive in turn calls archive.Untar.

func ExtractArchive(in io.Reader, out io.Writer, args []string) error {
    ...
    if err := archive.Untar(in, args[0], opts); err != nil {
}