Z.S.K.'s Records

Kubernetes Learning (DaemonSet)

The DaemonSet is a commonly used and important resource object in Kubernetes. A DaemonSet ensures that all (or some) nodes in the cluster run a copy of the same Pod: whenever a new node joins the cluster, the Pod is started on it, and when a node is removed from the cluster, the Pod on it is garbage collected. Typical examples are xx-agent, fluentd and node-exporter; DaemonSets can be found everywhere in Kubernetes.

DaemonSet creation

A DaemonSet cannot be created directly from the kubectl command line. One workaround is to generate a Deployment with --dry-run first and then edit the resulting manifest into a DaemonSet.
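A minimal sketch of that workflow (the fluentd name and image are placeholders; on kubectl older than v1.18, use the bare --dry-run flag instead of --dry-run=client):

kubectl create deployment fluentd --image=fluent/fluentd --dry-run=client -o yaml > fluentd-ds.yaml
# edit fluentd-ds.yaml: change "kind: Deployment" to "kind: DaemonSet",
# then delete the replicas and strategy fields, which a DaemonSet does not accept
kubectl apply -f fluentd-ds.yaml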

Here, the kube-proxy DaemonSet YAML is taken as an example:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    k8s-app: kube-proxy
  name: kube-proxy
  namespace: kube-system
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kube-proxy
  template:
    metadata:
      creationTimestamp: null
      labels:
        k8s-app: kube-proxy
    spec:
      containers:
      - command:
        - /usr/local/bin/kube-proxy
        - --config=/var/lib/kube-proxy/config.conf
        - --hostname-override=$(NODE_NAME)
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        image: registry.aliyuncs.com/google_containers/kube-proxy:v1.17.0
        imagePullPolicy: IfNotPresent
        name: kube-proxy
        resources: {}
        securityContext:
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/lib/kube-proxy
          name: kube-proxy
        - mountPath: /run/xtables.lock
          name: xtables-lock
        - mountPath: /lib/modules
          name: lib-modules
          readOnly: true
      dnsPolicy: ClusterFirst
      hostNetwork: true
      nodeSelector:
        beta.kubernetes.io/os: linux
      priorityClassName: system-node-critical
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: kube-proxy
      serviceAccountName: kube-proxy
      terminationGracePeriodSeconds: 30
      tolerations:
      - key: CriticalAddonsOnly
        operator: Exists
      - operator: Exists
      volumes:
      - configMap:
          defaultMode: 420
          name: kube-proxy
        name: kube-proxy
      - hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
        name: xtables-lock
      - hostPath:
          path: /lib/modules
          type: ""
        name: lib-modules
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
status:
  ...

This looks very similar to a Deployment; notice, however, that the YAML has no replicas field.

A DaemonSet is of course still subject to scheduling constraints, which means its Pods do not necessarily land on every node; they may run only on the subset of nodes that match its requirements.
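For example, a DaemonSet can be limited to a subset of nodes with a nodeSelector in its Pod template. A minimal sketch, where the disk=ssd label is made up for illustration:

spec:
  template:
    spec:
      nodeSelector:
        disk: ssd

Only nodes carrying that label (added with, e.g., kubectl label nodes node1 disk=ssd) will run a copy of the Pod.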

DaemonSet scheduling

Before Kubernetes v1.12, DaemonSet Pods were scheduled directly by the DaemonSet controller rather than by kube-scheduler. Scheduling through the DaemonSet controller caused a number of problems, so from v1.12 onwards DaemonSet Pods are scheduled by the default kube-scheduler, as the schedulerName field in the YAML above shows.

In particular, kube-proxy uses the following tolerations:

tolerations:
- key: CriticalAddonsOnly
  operator: Exists
- operator: Exists
# the two tolerations above apply independently of each other

Tolerations are used here so that the Pod tolerates taints on the node. Pay special attention to the second entry:

- operator: "Exists"

A toleration with only operator: Exists and no key matches every taint, so no taint that you add can evict kube-proxy.
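This is easy to verify on a test node. A sketch, assuming a node named node1 and a made-up taint key:

kubectl taint nodes node1 demo-key=demo-value:NoExecute
# ordinary pods without a matching toleration are evicted from node1,
# but kube-proxy keeps running thanks to the bare "operator: Exists" toleration
kubectl taint nodes node1 demo-key-    # remove the taint again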

As the component that implements Kubernetes Services, kube-proxy is critical for forwarding service traffic, so it must be deployed on every node.

The official Kubernetes documentation lists the tolerations that are added to a DaemonSet by default:

| Toleration Key | Effect | Version | Alpha Features | Description |
| --- | --- | --- | --- | --- |
| node.kubernetes.io/not-ready | NoExecute | 1.8+ | TaintBasedEvictions | When TaintBasedEvictions is enabled, they will not be evicted when there are node problems such as a network partition. |
| node.kubernetes.io/unreachable | NoExecute | 1.8+ | TaintBasedEvictions | When TaintBasedEvictions is enabled, they will not be evicted when there are node problems such as a network partition. |
| node.kubernetes.io/disk-pressure | NoSchedule | 1.8+ | TaintNodesByCondition | |
| node.kubernetes.io/memory-pressure | NoSchedule | 1.8+ | TaintNodesByCondition | |
| node.kubernetes.io/unschedulable | NoSchedule | 1.11+ | ScheduleDaemonSetPods, TaintNodesByCondition | When ScheduleDaemonSetPods is enabled, TaintNodesByCondition is necessary to make sure DaemonSet pods tolerate the unschedulable attribute by the default scheduler. |
| node.kubernetes.io/network-unavailable | NoSchedule | 1.11+ | ScheduleDaemonSetPods, TaintNodesByCondition, host network | When ScheduleDaemonSetPods is enabled, TaintNodesByCondition is necessary to make sure DaemonSet pods that use host network tolerate the network-unavailable attribute by the default scheduler. |
| node.kubernetes.io/out-of-disk | NoSchedule | 1.8+ | ExperimentalCriticalPodAnnotation (critical pod only), TaintNodesByCondition | |

None of the conditions above affects a DaemonSet. For example, if a node runs short of disk space, it is tainted with node.kubernetes.io/out-of-disk, effect=NoSchedule, which means no new Pods may be scheduled onto that node. DaemonSet scheduling, however, is unaffected, because DaemonSet Pods carry the corresponding tolerations by default.
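The taints currently on a node can be inspected directly; node1 below is a placeholder name:

kubectl describe node node1 | grep -A3 Taints
# or, as raw JSON:
kubectl get node node1 -o jsonpath='{.spec.taints}'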

DaemonSet eviction

Given that a DaemonSet carries all these default tolerations, is there any way to evict its Pods at all?

In Kubernetes, drain is frequently used when taking a node out of service for troubleshooting. When running the drain command, four kinds of Pods are not evicted:

  • DaemonSet Pods
  • Static Pods
  • Pods not managed by a ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet
  • Pods using emptyDir storage; drain will not evict these directly either, because of the risk of data loss; pass --delete-local-data=true to force it (see the commands below)
# Set configuration context
kubectl config use-context ek8s
# Set the node labelled with name=ek8s-node-1 as unavailable and reschedule all the pods running on it
kubectl cordon node
kubectl drain node --ignore-daemonsets=true --delete-local-data=true
# drain prints the details of every pod it evicts

# If drain is run against a node hosting DaemonSet pods without the --ignore-daemonsets=true flag,
# the command aborts: no pods are evicted, the node is left in the SchedulingDisabled state,
# and the following error is reported:
# error: unable to drain node "instance-node1", aborting command...

When drain is run without --ignore-daemonsets=true, the command aborts. The reason: if the DaemonSet Pods were evicted, the DaemonSet controller would immediately schedule replacements back onto the node, and eviction and rescheduling would loop forever.
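Once the node has been dealt with, scheduling can be re-enabled (node is a placeholder name, as above):

kubectl uncordon node
# the node leaves the SchedulingDisabled state and accepts new pods again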

DaemonSet updates

As the kube-proxy YAML shows, a DaemonSet is also updated via rollingUpdate, with much the same mechanism as a Deployment.
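A rolling update can be triggered and tracked with the usual rollout commands; the v1.17.1 image tag below is only illustrative:

kubectl -n kube-system set image daemonset/kube-proxy kube-proxy=registry.aliyuncs.com/google_containers/kube-proxy:v1.17.1
kubectl -n kube-system rollout status daemonset/kube-proxy
kubectl -n kube-system rollout undo daemonset/kube-proxy   # roll back if something goes wrong

With maxUnavailable: 1, as in the YAML above, Pods are replaced one node at a time.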

DaemonSet deletion

This one is simple; a single command:

kubectl delete daemonset <name>
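By default the delete cascades to the DaemonSet's Pods. If the Pods should keep running while only the DaemonSet object is removed, cascading can be disabled (the flag spelling varies by kubectl version: --cascade=false in older releases, --cascade=orphan in newer ones):

kubectl delete daemonset <name> --cascade=orphan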


Please credit the original author when reposting: 周淑科 (https://izsk.me)
