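Level 29: Seconds-Level Event Monitoring with kube-eventer
kube-eventer: Alibaba Cloud's open-source tool for seconds-level event monitoring of K8S containers
In this lesson I'll share a gem of an open-source tool for seconds-level event monitoring and alerting on K8S: kube-eventer. It was open-sourced by Alibaba Cloud and, which is rare, it is still being updated.
Introduction
In martial arts, nothing beats speed, and the same is true of monitoring and alerting. Earlier lessons covered prometheus, but something still felt missing: monitoring of the events that are everywhere in K8S. In real production work, Boge has felt first-hand how important event monitoring is, and he relies on it even more heavily than on prometheus. It picks up important event alerts from K8S clusters around the world promptly, so problems get handled in time and the clusters stay stable.
What it monitors: the Events shown in kubectl describe output.
Across n K8S clusters worldwide, you can do without prometheus, but you cannot do without kube-eventer. Without event monitoring you have no confidence, because you don't know what is happening. In production, prometheus is used more for looking at historical data, while 95% of alerts are sent by kube-eventer.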
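To make "the Events shown in kubectl describe output" more concrete, here is a minimal Python sketch that picks Warning events out of an event list. It assumes the input has the shape of parsed kubectl get events -A -o json output (a v1 EventList). The sample data and the function name warning_events are made up for illustration, and this is not how kube-eventer itself is implemented.

```python
import json

# Hypothetical sample shaped like `kubectl get events -A -o json` output.
# Field names follow the core/v1 Event object; all values are illustrative.
SAMPLE = json.loads("""
{
  "items": [
    {"type": "Normal", "reason": "Scheduled", "message": "illustrative message",
     "involvedObject": {"kind": "Pod", "namespace": "test", "name": "test-pod"}},
    {"type": "Warning", "reason": "Failed", "message": "illustrative message",
     "involvedObject": {"kind": "Pod", "namespace": "test", "name": "test-pod"}},
    {"type": "Warning", "reason": "NodeNotReady", "message": "illustrative message",
     "involvedObject": {"kind": "Node", "name": "node-1"}}
  ]
}
""")


def warning_events(event_list, kinds=("Pod",)):
    """Return (namespace, kind, name, reason) for Warning events of the given kinds."""
    picked = []
    for ev in event_list.get("items", []):
        obj = ev.get("involvedObject", {})
        if ev.get("type") == "Warning" and obj.get("kind") in kinds:
            picked.append(
                (obj.get("namespace"), obj.get("kind"), obj.get("name"), ev.get("reason"))
            )
    return picked


if __name__ == "__main__":
    for row in warning_events(SAMPLE):
        print(row)
```

The kinds argument plays the same role as a "Warning events on Pods only" filter. A polling script like this would only see events when it runs, which is why a watcher that pushes alerts within seconds is more useful in production.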
The kube-eventer source is on GitHub:
https://github.com/AliyunContainerService/kube-eventer

Deployment
Below is the complete YAML configuration that Boge actually uses in production.
To look up the API server address: grep boge.com /etc/kubeasz/clusters/test-cn/config.yml
---
apiVersion: v1
data:
  content: >-
    {"EventType": "{{ .Type }}","EventNamespace": "{{
    .InvolvedObject.Namespace }}","EventKind": "{{ .InvolvedObject.Kind }}","EventObject": "{{
    .InvolvedObject.Name }}","EventReason": "{{
    .Reason }}","EventTime": "{{ .LastTimestamp }}","EventMessage": "{{ .Message
    }}"}
kind: ConfigMap
metadata:
  name: kubeeventer-webhook
  namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    name: kube-eventer
  name: kube-eventer-webhook
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kube-eventer
  template:
    metadata:
      labels:
        app: kube-eventer
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      dnsPolicy: ClusterFirstWithHostNet
      serviceAccount: kube-eventer
      containers:
        - image: registry.aliyuncs.com/acs/kube-eventer:v1.2.7-ca03be0-aliyun
          # - image: registry.us-west-1.aliyuncs.com/acs/kube-eventer:v1.2.7-ca03be0-aliyun
          name: kube-eventer
          command:
            - "/kube-eventer"
            # API server address
            - "--source=kubernetes:https://test-cnk8s.boge.com:6443"
            ## e.g., dingtalk sink demo
            #- --sink=dingtalk:[your_webhook_url]&label=[your_cluster_id]&level=[Normal or Warning(default)]&namespaces=[kube-system,kae-app(all)]
            - --sink=webhook:http://alertmanaer-dingtalk-svc.kube-system/b01bdc063/boge/getjson?level=Warning&kinds=Pod&method=POST&header=Content-Type=application/json&custom_body_configmap=kubeeventer-webhook&custom_body_configmap_namespace=kube-system
          env:
            # If TZ is assigned, set the TZ value as the time zone
            - name: TZ
              value: "Asia/Shanghai"
          volumeMounts:
            - name: localtime
              mountPath: /etc/localtime
              readOnly: true
            - name: zoneinfo
              mountPath: /usr/share/zoneinfo
              readOnly: true
          resources:
            requests:
              cpu: 100m
              memory: 100Mi
            limits:
              cpu: 500m
              memory: 250Mi
      hostAliases:
        - hostnames:
            - test-cnk8s.boge.com
          ip: 10.0.1.201
      volumes:
        - name: localtime
          hostPath:
            path: /etc/localtime
        - name: zoneinfo
          hostPath:
            path: /usr/share/zoneinfo
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-eventer
rules:
  - apiGroups:
      - ""
    resources:
      - configmaps
      - events
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube-eventer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-eventer
subjects:
  - kind: ServiceAccount
    name: kube-eventer
    namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-eventer
  namespace: kube-system

Note: if the DingTalk robot uses secret-key signing, the sink address has to start with dingtalk, otherwise the service cannot adapt to it:
--sink=dingtalk:https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxxx&sign=SECxxxxxxxxxxxxxxxxxxxxxxx&level=Warning&kinds=Pod&method=POST&header=Content-Type=application/json&custom_body_configmap=kubeeventer-webhook&custom_body_configmap_namespace=kube-system
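A sink string packs a lot into one line: the sink type before the first colon, the target URL, and then the filters and options as query parameters. The following Python sketch just takes such a string apart so each piece is visible. The helper name parse_sink is hypothetical, and this is an illustration of the format as it appears above, not kube-eventer's own parsing code.

```python
from urllib.parse import urlsplit, parse_qsl


def parse_sink(sink):
    """Split a sink string such as 'dingtalk:https://host/path?a=1&b=2'.

    Returns (sink_type, base_url, params). Illustrative only.
    """
    # Everything before the first ':' is the sink type (webhook, dingtalk, ...).
    sink_type, _, url = sink.partition(":")
    parts = urlsplit(url)
    base_url = f"{parts.scheme}://{parts.netloc}{parts.path}"
    # parse_qsl splits each pair on the first '=' only, so a value such as
    # 'Content-Type=application/json' survives intact.
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    return sink_type, base_url, params


# The dingtalk sink from the note above, with its placeholder token and secret.
SINK = (
    "dingtalk:https://oapi.dingtalk.com/robot/send"
    "?access_token=xxxxxxxxxxxxx&sign=SECxxxxxxxxxxxxxxxxxxxxxxx"
    "&level=Warning&kinds=Pod&method=POST"
    "&header=Content-Type=application/json"
    "&custom_body_configmap=kubeeventer-webhook"
    "&custom_body_configmap_namespace=kube-system"
)

if __name__ == "__main__":
    sink_type, base_url, params = parse_sink(SINK)
    print(sink_type, base_url)
    for key, value in params.items():
        print(f"  {key} = {value}")
```

Reading the parameters this way makes it easier to spot mistakes such as a misspelled ConfigMap name or a namespace that does not match where the ConfigMap was actually created.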
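Deploy it:
kubectl create ns test-kube-eventer
kubectl -n test-kube-eventer apply -f kube-eventer.yaml

Note: these commands use the test-kube-eventer namespace, while the manifest above sets namespace: kube-system. The two have to match, so change the namespace fields in the manifest (including the ClusterRoleBinding subject and custom_body_configmap_namespace) to test-kube-eventer before applying, otherwise kubectl rejects the objects with a namespace mismatch error.

Check that the pod is running:
# kubectl -n test-kube-eventer get pod
NAME                                    READY   STATUS    RESTARTS   AGE
kube-eventer-webhook-76b97bd7b5-nnlsn   1/1     Running   0          107s

To remove it again:
kubectl -n test-kube-eventer delete -f kube-eventer.yaml

Testing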
Create a test pod and watch it:
kubectl create ns test
kubectl -n test apply -f test-kube-eventer-pod.yaml
kubectl -n test get pod -w

Check the kube-eventer logs:
kubectl -n test-kube-eventer logs kube-eventer-webhook-76b97bd7b5-nnlsn

Clean up afterwards:
kubectl -n test delete -f test-kube-eventer-pod.yaml

test-kube-eventer-pod.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
    - name: tomcat-container
      image: tomcat:114514
      ports:
        - containerPort: 8080
      command: ["/bin/bash", "-c", "while true; do sleep 30; done"]  # no-op loop added for illustration only; in practice Tomcat keeps the container running by itself
      lifecycle:
        postStart:
          exec:
            command: ["sh", "-c", "echo 'Pod has been started.'"]  # optional; command run after the container starts
      readinessProbe:  # readiness probe, checks that the Tomcat service is healthy
        httpGet:
          path: /manager/html  # assumes the Tomcat manager page is used as the health check
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 5
      livenessProbe:  # liveness probe, guards against a hung Pod
        httpGet:
          path: /manager/html
          port: 8080
        initialDelaySeconds: 15
        periodSeconds: 20
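
When the test pod produces a Warning event, the receiver behind the webhook sink gets a JSON body with the keys defined in the kubeeventer-webhook ConfigMap. The Python sketch below mimics that mapping so you know what to expect on the receiving side. The function name render_body and the sample event are hypothetical, and the real body is rendered by kube-eventer from the Go template, so details such as the EventTime format and the escaping of quotes inside the message may differ from what json.dumps produces here.

```python
import json


def render_body(event):
    """Build a JSON body with the same keys as the kubeeventer-webhook template.

    `event` is a dict shaped like a core/v1 Event serialized as JSON.
    """
    obj = event.get("involvedObject", {})
    return json.dumps(
        {
            "EventType": event.get("type"),            # {{ .Type }}
            "EventNamespace": obj.get("namespace"),    # {{ .InvolvedObject.Namespace }}
            "EventKind": obj.get("kind"),              # {{ .InvolvedObject.Kind }}
            "EventObject": obj.get("name"),            # {{ .InvolvedObject.Name }}
            "EventReason": event.get("reason"),        # {{ .Reason }}
            "EventTime": event.get("lastTimestamp"),   # {{ .LastTimestamp }}
            "EventMessage": event.get("message"),      # {{ .Message }}
        }
    )


# Hypothetical event for the test pod; the values are illustrative only.
EVENT = {
    "type": "Warning",
    "reason": "Failed",
    "message": "illustrative message",
    "lastTimestamp": "2024-01-01T00:00:00Z",
    "involvedObject": {"kind": "Pod", "namespace": "test", "name": "test-pod"},
}

if __name__ == "__main__":
    print(render_body(EVENT))
```

A receiver that expects exactly these seven keys can validate incoming requests against them before forwarding the alert.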