Learning Kubernetes (DaemonSet)

Posted on 2020-01-24 in Kubernetes

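DaemonSet is a common and important Kubernetes resource object. A DaemonSet ensures that all nodes in the cluster, or a selected subset of them, each run one copy of a given Pod. When a new node joins the cluster, the Pod is started on it; when a node is removed from the cluster, the Pods on it are cleaned up by the garbage collector. Typical examples are xx-agent, fluentd and node-exporter. DaemonSets show up everywhere in Kubernetes.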

Creating a DaemonSet

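kubectl has no create subcommand for DaemonSets, so one cannot be generated directly from the command line. A common workaround is to generate a Deployment manifest with --dry-run and then edit it into a DaemonSet.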
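The edit step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the manifest is held as a plain dict, the workload name node-agent and its image are made up, and the helper deployment_to_daemonset is hypothetical. It changes kind and drops the fields that exist only on a Deployment.

```python
import copy
import json

# Fields that are valid on a Deployment spec but not on a DaemonSet spec.
DEPLOYMENT_ONLY_FIELDS = ("replicas", "strategy", "progressDeadlineSeconds", "paused")


def deployment_to_daemonset(deployment: dict) -> dict:
    """Return a DaemonSet manifest derived from a Deployment manifest.

    Hypothetical helper: the input dict is left untouched.
    """
    ds = copy.deepcopy(deployment)
    ds["kind"] = "DaemonSet"
    spec = ds.get("spec", {})
    for field in DEPLOYMENT_ONLY_FIELDS:
        spec.pop(field, None)
    # status is filled in by the cluster and should not be submitted.
    ds.pop("status", None)
    return ds


# Roughly what a dry-run Deployment manifest looks like (illustrative values).
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "node-agent", "labels": {"app": "node-agent"}},
    "spec": {
        "replicas": 1,
        "selector": {"matchLabels": {"app": "node-agent"}},
        "strategy": {},
        "template": {
            "metadata": {"labels": {"app": "node-agent"}},
            "spec": {
                "containers": [
                    {"name": "agent", "image": "example.com/node-agent:1.0"}
                ]
            },
        },
    },
    "status": {},
}

daemonset = deployment_to_daemonset(deployment)
print(json.dumps(daemonset, indent=2))
```

The output is JSON rather than YAML only to stay within the standard library; kubectl accepts manifests in either format.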

The kube-proxy DaemonSet YAML file serves as the example here:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    k8s-app: kube-proxy
  name: kube-proxy
  namespace: kube-system
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kube-proxy
  template:
    metadata:
      creationTimestamp: null
      labels:
        k8s-app: kube-proxy
    spec:
      containers:
      - command:
        - /usr/local/bin/kube-proxy
        - --config=/var/lib/kube-proxy/config.conf
        - --hostname-override=$(NODE_NAME)
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        image: registry.aliyuncs.com/google_containers/kube-proxy:v1.17.0
        imagePullPolicy: IfNotPresent
        name: kube-proxy
        resources: {}
        securityContext:
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/lib/kube-proxy
          name: kube-proxy
        - mountPath: /run/xtables.lock
          name: xtables-lock
        - mountPath: /lib/modules
          name: lib-modules
          readOnly: true
      dnsPolicy: ClusterFirst
      hostNetwork: true
      nodeSelector:
        beta.kubernetes.io/os: linux
      priorityClassName: system-node-critical
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: kube-proxy
      serviceAccountName: kube-proxy
      terminationGracePeriodSeconds: 30
      tolerations:
      - key: CriticalAddonsOnly
        operator: Exists
      - operator: Exists
      volumes:
      - configMap:
          defaultMode: 420
          name: kube-proxy
        name: kube-proxy
      - hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
        name: xtables-lock
      - hostPath:
          path: /lib/modules
          type: ""
        name: lib-modules
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
status:
	...

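It looks very similar to a Deployment, but note that the YAML file has no replicas field.

A DaemonSet is still subject to scheduling constraints, so it does not necessarily run on every machine; it may run only on the subset of nodes that satisfy its conditions.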

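DaemonSet scheduling

Before Kubernetes v1.12, DaemonSet Pods were scheduled directly by the DaemonSet controller and bypassed kube-scheduler. That caused problems, such as behavior that was inconsistent with normally scheduled Pods. Since v1.12 they are scheduled by the default kube-scheduler, as the schedulerName field in the YAML file above shows.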

In particular, kube-proxy uses

tolerations:
- key: CriticalAddonsOnly
  operator: Exists
- operator: "Exists"
# The two tolerations above are independent: a taint is tolerated if either one matches

Tolerations let a Pod tolerate taints on a node. Pay special attention to the second entry:

- operator: "Exists"

This toleration has no key and uses operator Exists, so it tolerates any taint. No taint added to a node can therefore evict kube-proxy.
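The matching rule can be sketched as a small Python function. This is a simplified model of the documented toleration semantics, not the Kubernetes source; the function name tolerates and the sample taint dedicated=gpu:NoExecute are made up for illustration.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Simplified model: does this toleration match this taint?"""
    # An empty effect on the toleration matches every effect.
    effect = toleration.get("effect", "")
    if effect and effect != taint.get("effect"):
        return False

    key = toleration.get("key", "")
    operator = toleration.get("operator", "Equal")

    # An empty key is only meaningful with operator Exists; it matches all keys.
    if not key:
        return operator == "Exists"
    if key != taint.get("key"):
        return False
    if operator == "Exists":
        return True
    # Operator Equal: the values must match as well.
    return toleration.get("value", "") == taint.get("value", "")


# The two tolerations from the kube-proxy manifest.
kube_proxy_tolerations = [
    {"key": "CriticalAddonsOnly", "operator": "Exists"},
    {"operator": "Exists"},
]

# An arbitrary taint (illustrative values).
taint = {"key": "dedicated", "value": "gpu", "effect": "NoExecute"}

first_matches = tolerates(kube_proxy_tolerations[0], taint)
second_matches = tolerates(kube_proxy_tolerations[1], taint)
pod_tolerates = any(tolerates(t, taint) for t in kube_proxy_tolerations)
print(first_matches, second_matches, pod_tolerates)
```

The first toleration matches only taints whose key is CriticalAddonsOnly, while the key-less one matches the sample taint regardless of its key, value or effect.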

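kube-proxy is how Kubernetes implements Services and is essential for forwarding traffic to them, so it has to run on every node.

According to the official Kubernetes documentation, the following tolerations are also added to DaemonSet Pods automatically.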

Toleration Key	Effect	Version	Alpha Features	Description
node.kubernetes.io/not-ready	NoExecute	1.8+	TaintBasedEvictions	When TaintBasedEvictions is enabled, DaemonSet Pods will not be evicted when there are node problems such as a network partition.
node.kubernetes.io/unreachable	NoExecute	1.8+	TaintBasedEvictions	When TaintBasedEvictions is enabled, DaemonSet Pods will not be evicted when there are node problems such as a network partition.
node.kubernetes.io/disk-pressure	NoSchedule	1.8+	TaintNodesByCondition	
node.kubernetes.io/memory-pressure	NoSchedule	1.8+	TaintNodesByCondition	
node.kubernetes.io/unschedulable	NoSchedule	1.11+	ScheduleDaemonSetPods, TaintNodesByCondition	When ScheduleDaemonSetPods is enabled, TaintNodesByCondition is necessary to make sure DaemonSet Pods tolerate unschedulable attributes by the default scheduler.
node.kubernetes.io/network-unavailable	NoSchedule	1.11+	ScheduleDaemonSetPods, TaintNodesByCondition, hostNetwork	When ScheduleDaemonSetPods is enabled, TaintNodesByCondition is necessary to make sure DaemonSet Pods that use the host network tolerate network-unavailable attributes by the default scheduler.
node.kubernetes.io/out-of-disk	NoSchedule	1.8+	ExperimentalCriticalPodAnnotation (critical pod only), TaintNodesByCondition	

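None of these conditions affects a DaemonSet. For example, when a machine runs low on disk space, the node controller taints the node with node.kubernetes.io/disk-pressure, effect NoSchedule. Once that taint is present, no new ordinary Pods are scheduled onto the machine, but DaemonSet Pods are still scheduled there because they carry the matching toleration by default.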
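A rough Python model makes the difference between an ordinary Pod and a DaemonSet Pod concrete. It is a sketch under assumptions: the toleration list follows the table above (the out-of-disk row is left out because it applies to critical Pods only), every toleration uses operator Exists, and the names with_daemonset_tolerations and can_schedule are hypothetical.

```python
# Default DaemonSet tolerations, following the table above.
DEFAULT_DS_TOLERATIONS = [
    {"key": "node.kubernetes.io/not-ready", "operator": "Exists", "effect": "NoExecute"},
    {"key": "node.kubernetes.io/unreachable", "operator": "Exists", "effect": "NoExecute"},
    {"key": "node.kubernetes.io/disk-pressure", "operator": "Exists", "effect": "NoSchedule"},
    {"key": "node.kubernetes.io/memory-pressure", "operator": "Exists", "effect": "NoSchedule"},
    {"key": "node.kubernetes.io/unschedulable", "operator": "Exists", "effect": "NoSchedule"},
]
# Added only for Pods that use the host network.
HOST_NETWORK_TOLERATION = {
    "key": "node.kubernetes.io/network-unavailable",
    "operator": "Exists",
    "effect": "NoSchedule",
}


def with_daemonset_tolerations(pod_spec: dict) -> dict:
    """Return a copy of pod_spec with the default tolerations appended."""
    spec = dict(pod_spec)
    tolerations = list(spec.get("tolerations", [])) + DEFAULT_DS_TOLERATIONS
    if spec.get("hostNetwork"):
        tolerations.append(HOST_NETWORK_TOLERATION)
    spec["tolerations"] = tolerations
    return spec


def can_schedule(pod_spec: dict, node_taints: list) -> bool:
    """Simplified check: every blocking taint must be tolerated.

    Only operator Exists is modelled; PreferNoSchedule never blocks.
    """
    for taint in node_taints:
        if taint["effect"] not in ("NoSchedule", "NoExecute"):
            continue
        tolerated = any(
            t.get("key") in (None, "", taint["key"])
            and t.get("effect") in (None, "", taint["effect"])
            for t in pod_spec.get("tolerations", [])
        )
        if not tolerated:
            return False
    return True


node_taints = [{"key": "node.kubernetes.io/disk-pressure", "effect": "NoSchedule"}]
ordinary_pod = {"containers": []}
daemon_pod = with_daemonset_tolerations({"containers": [], "hostNetwork": True})

print(can_schedule(ordinary_pod, node_taints))  # ordinary Pod is blocked
print(can_schedule(daemon_pod, node_taints))    # DaemonSet Pod is admitted
```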

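Evicting DaemonSet Pods

Since DaemonSet Pods get so many tolerations by default, is there any way to evict them?

kubectl drain is commonly used when dealing with node problems. By default, drain does not evict the following four kinds of Pods:

DaemonSet Pods

Static Pods

Pods not managed by a ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet

Pods that use emptyDir storage: drain does not evict them directly because their local data would be lost. Pass --delete-local-data=true (renamed --delete-emptydir-data in newer kubectl releases) to evict them anyway.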

#Set configuration context 
kubectl config use-context ek8s
#Set the node labelled with name=ek8s-node-1 as unavailable and reschedule all the pods running on it
kubectl cordon node
kubectl drain node --ignore-daemonsets=true --delete-local-data=true
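# While draining, kubectl lists the Pods it evicts

# If drain is run on a node that hosts DaemonSet Pods without --ignore-daemonsets=true, it aborts without evicting any Pods. The node is left in the SchedulingDisabled state and the following error is printed
#error: unable to drain node "instance-node1", aborting command...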

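drain refuses to proceed without --ignore-daemonsets=true for a reason. If it evicted the DaemonSet Pods, they would immediately be recreated on the same node, since DaemonSet Pods tolerate the unschedulable taint, and the drain would loop forever.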
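The decision drain makes for each Pod can be modelled roughly as follows. This is a simplified sketch and not kubectl's implementation; the function plan_drain, the Pod names and the dict layout are all hypothetical.

```python
def plan_drain(pods, ignore_daemonsets=False, force=False, delete_local_data=False):
    """Return (pods_to_evict, errors) for a simplified model of kubectl drain.

    If any Pod blocks the drain, nothing is evicted at all.
    """
    to_evict, errors = [], []
    for pod in pods:
        name = pod["name"]
        # Static Pods are managed by the kubelet and are always skipped.
        if pod.get("static"):
            continue
        owner = pod.get("owner")
        if owner == "DaemonSet":
            # Never evicted; blocks the drain unless explicitly ignored.
            if not ignore_daemonsets:
                errors.append(f"{name}: DaemonSet-managed pod")
            continue
        if owner is None and not force:
            errors.append(f"{name}: pod not managed by a controller")
            continue
        if pod.get("empty_dir") and not delete_local_data:
            errors.append(f"{name}: pod with local emptyDir data")
            continue
        to_evict.append(name)
    if errors:
        return [], errors
    return to_evict, []


# Illustrative Pods on one node.
pods = [
    {"name": "kube-proxy-abc", "owner": "DaemonSet"},
    {"name": "kube-apiserver-node1", "static": True},
    {"name": "web-1", "owner": "ReplicaSet"},
    {"name": "cache-1", "owner": "StatefulSet", "empty_dir": True},
    {"name": "debug", "owner": None},
]

blocked, blocked_errors = plan_drain(pods)
evicted, evicted_errors = plan_drain(
    pods, ignore_daemonsets=True, force=True, delete_local_data=True
)
print(blocked_errors)
print(evicted)
```

With no flags the model reports three blocking Pods and evicts nothing. With all three flags it evicts the ReplicaSet, StatefulSet and unmanaged Pods and leaves the DaemonSet and static Pods in place.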

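Updating a DaemonSet

As the kube-proxy YAML file shows, a DaemonSet is also updated with the RollingUpdate strategy, in much the same way as a Deployment. With maxUnavailable: 1, as in the manifest above, at most one node's Pod is unavailable at any point during the update.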
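The effect of maxUnavailable can be pictured with a deliberately coarse Python model. It assumes the update proceeds in discrete rounds, which the real controller does not strictly do, and the function rolling_update_rounds and the node names are hypothetical.

```python
def rolling_update_rounds(nodes, max_unavailable=1):
    """Group nodes into rounds; each round replaces at most max_unavailable Pods."""
    if max_unavailable < 1:
        raise ValueError("max_unavailable must be at least 1")
    return [
        nodes[i:i + max_unavailable]
        for i in range(0, len(nodes), max_unavailable)
    ]


nodes = ["node-1", "node-2", "node-3", "node-4", "node-5"]

# maxUnavailable: 1, as in the kube-proxy manifest: one node at a time.
one_at_a_time = rolling_update_rounds(nodes, max_unavailable=1)
# A larger value finishes in fewer rounds but takes more Pods down at once.
two_at_a_time = rolling_update_rounds(nodes, max_unavailable=2)

print(one_at_a_time)
print(two_at_a_time)
```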

Deleting a DaemonSet

This is simple and takes a single command:

kubectl delete DaemonSet [Name]

References:

https://kubernetes.io

https://www.cnblogs.com/tylerzhou/p/11007427.html

https://www.bmc.com/blogs/kubernetes-daemonset/

https://draveness.me/kubernetes-daemonset

https://banzaicloud.com/blog/drain/