Level 21: Job and CronJob
From zero to one: a complete guide to deploying Jobs and CronJobs in K8s
Sometimes we want to run a one-off task, or a scheduled task, in K8s. Can this be done? The answer is definitely yes.
job
Let's start with one-off tasks, which K8s calls a Job. Let's get hands-on right away and prepare the yaml config first.
If you don't know how to write the yaml, just run
kubectl create job -h to see command-line creation examples, then export the yaml of the resource it creates as my-job.yaml:
# kubectl -n test-job create job my-job --image=busybox --dry-run -o yaml -- date
W0602 10:47:51.322047 10048 helpers.go:692] --dry-run is deprecated and can be replaced with --dry-run=client.
apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: null
  name: my-job
  namespace: test-job
spec:
  template:
    metadata:
      creationTimestamp: null
    spec:
      containers:
      - command:
        - date
        image: busybox
        name: my-job
        resources: {}
      restartPolicy: Never
status: {}

Cleaned up, our my-job.yaml looks like this:

apiVersion: batch/v1 # 1. batch/v1 is the current apiVersion for Job
kind: Job # 2. specifies that this resource is a Job
metadata:
  name: my-job
spec:
  template:
    metadata:
    spec:
      containers:
      - image: registry.cn-shanghai.aliyuncs.com/acs/busybox:v1.29.2
        name: my-job
        # command: ["echo","Hello, hong."]
        command: ["sh", "-c", "date && echo Hello, hong."]
      restartPolicy: Never # 3. restartPolicy specifies when the container should be restarted. For a Job it can only be Never or OnFailure
The command field is the heart of the task. Unlike a plain Linux system, the command runs inside the cluster and can access internal service domain names.
A pod supports the following three restart policies (restartPolicy):
1. Always: always restart the container after it terminates; this is the default policy.
2. OnFailure: restart the container only when it exits abnormally (non-zero exit code).
3. Never: never restart the container after it terminates.
Always is the default: if no restart policy is configured, Always is assumed.
Create it and check the result:
# kubectl create ns test-job
# kubectl -n test-job apply -f my-job.yaml
job.batch/my-job created
# kubectl -n test-job get jobs.batch
NAME COMPLETIONS DURATION AGE
my-job 1/1 4s 73s
# COMPLETIONS: how many completions have finished
# DURATION: how long this job took to run
# AGE: how long ago this job resource was created
# The job spawns a pod, which enters the Completed status once the task finishes
# kubectl -n test-job get pod
NAME READY STATUS RESTARTS AGE
pod/my-job-s4njd 0/1 Completed 0 2s
# What did it do? Check the logs of the pod the job created
# kubectl -n test-job logs my-job-s4njd
Sun Jun 2 02:59:10 UTC 2024
Hello, hong.
What happens when a job fails?
Let's edit the job's yaml and change the command to one that doesn't exist, and see what happens.
apiVersion: batch/v1 # 1. batch/v1 is the current apiVersion for Job
kind: Job # 2. specifies that this resource is a Job
metadata:
  name: my-job-error
spec:
  template:
    metadata:
    spec:
      containers:
      - image: registry.cn-shanghai.aliyuncs.com/acs/busybox:v1.29.2
        name: my-job
        command: ["echoaaa","Hello, hong."] # a bogus echo command
      restartPolicy: Never # 3. restartPolicy specifies when the container should be restarted. For a Job it can only be Never or OnFailure
Create it:
# kubectl -n test-job apply -f my-job-error.yaml
# Because the job fails and restartPolicy is Never, the failed pods are never restarted. But the job never reaches the completed state, so it keeps creating new pods (up to the backoffLimit, 6 by default) until COMPLETIONS reaches 1/1 - which, in this example, will obviously never happen
# kubectl -n test-job get pod
NAME READY STATUS RESTARTS AGE
my-job-error-88gpz 0/1 StartError 0 58s
my-job-error-bbgzz 0/1 StartError 0 47s
my-job-error-vv2pb 0/1 StartError 0 27s
# kubectl -n test-job get job
NAME COMPLETIONS DURATION AGE
my-job-error 0/1 78s 78s
# Describe one of the pods; the events clearly show that the command does not exist
# kubectl -n test-job describe pod my-job-error-88gpz
Name: my-job-error-88gpz
Namespace: test-job
......
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 106s default-scheduler Successfully assigned test-job/my-job-error-88gpz to 10.0.1.203
Normal Pulled 105s kubelet Container image "registry.cn-shanghai.aliyuncs.com/acs/busybox:v1.29.2" already present on machine
Normal Created 105s kubelet Created container my-job
Warning Failed 105s kubelet Error: failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "echoaaa": executable file not found in $PATH: unknown
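The number of retries before a Job is marked failed is controlled by spec.backoffLimit (6 by default). As a minimal sketch, assuming we wanted to cap the failing example above at two retries (only backoffLimit is new; everything else is the manifest from before):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: my-job-error
spec:
  backoffLimit: 2 # give up and mark the Job as Failed after 2 retries (default is 6)
  template:
    spec:
      containers:
      - image: registry.cn-shanghai.aliyuncs.com/acs/busybox:v1.29.2
        name: my-job
        command: ["echoaaa","Hello, hong."]
      restartPolicy: Never
```

With this set, the controller stops creating replacement pods after the limit is hit and the job reports a Failed condition, rather than retrying indefinitely.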
# Delete this job, or it will keep piling up failed pods
# kubectl -n test-job delete job my-job-error
Now let's switch restartPolicy to OnFailure and observe:
apiVersion: batch/v1
kind: Job
metadata:
  name: my-job-error-onfailure
spec:
  template:
    metadata:
      name: my-job
    spec:
      containers:
      - image: registry.cn-shanghai.aliyuncs.com/acs/busybox:v1.29.2
        name: my-job
        command: ["echoaaa","Hello, boge."]
      restartPolicy: OnFailure

# kubectl -n test-job apply -f my-job-error-onfailure.yaml
# kubectl -n test-job get pod -w
NAME READY STATUS RESTARTS AGE
my-job-error-onfailure-cxmx2 0/1 RunContainerError 1 (5s ago) 6s
my-job-error-onfailure-cxmx2 0/1 CrashLoopBackOff 1 (14s ago) 15s
my-job-error-onfailure-cxmx2 0/1 RunContainerError 2 (1s ago) 16s
my-job-error-onfailure-cxmx2 0/1 CrashLoopBackOff 2 (14s ago) 29s
my-job-error-onfailure-cxmx2 0/1 RunContainerError 3 (1s ago) 41s
my-job-error-onfailure-cxmx2 0/1 CrashLoopBackOff 3 (12s ago) 52s
my-job-error-onfailure-cxmx2 0/1 RunContainerError 4 (1s ago) 90s
# This time no new pods are created; instead the same pod keeps restarting itself in hopes of recovering. Here it has already restarted 3 times, and the count keeps climbing
# Once the retry limit is reached, K8s deletes the pod. Since this is a Job and not a Deployment, it will not start a new pod on its own, so the job is effectively gone
# This shows OnFailure is working - at least we no longer end up with a pile of failed pods
Running jobs in parallel
Prepare the yaml config:
apiVersion: batch/v1
kind: Job
metadata:
  name: my-job-parallelism
spec:
  parallelism: 2 # run 2 pods in parallel
  template:
    metadata:
      name: my-job
    spec:
      containers:
      - image: registry.cn-shanghai.aliyuncs.com/acs/busybox:v1.29.2
        name: my-job
        command: ["echo","Hello, boge."]
      restartPolicy: OnFailure
Create it and check the result:
# kubectl -n test-job apply -f my-job-parallelism.yaml
job.batch/my-job created
# The job started 2 pods in total, and their AGE is identical, so they were clearly created in parallel
# kubectl -n test-job get pod
NAME READY STATUS RESTARTS AGE
my-job-fwf8l 0/1 Completed 0 7s
my-job-w2fxd 0/1 Completed 0 7s
Next, let's combine the two and test completing a fixed total number of tasks in parallel.
apiVersion: batch/v1
kind: Job
metadata:
  name: myjob-compose
spec:
  completions: 6 # total number of pods this job must complete
  parallelism: 2 # run 2 pods concurrently at a time
  template:
    metadata:
      name: myjob
    spec:
      containers:
      - name: hello
        image: registry.cn-shanghai.aliyuncs.com/acs/busybox:v1.29.2
        command: ["sh", "-c", "date && echo Hello, hong."]
      restartPolicy: OnFailure
Create it and check the result:
# kubectl -n test-job apply -f my-job-compose.yaml
# It runs 2 pods concurrently at a time and stops once the total of 6 completions is reached
# kubectl -n test-job get pod
NAME READY STATUS RESTARTS AGE
myjob-compose-82zjn 0/1 Completed 0 7s
myjob-compose-ngv7x 0/1 Completed 0 10s
myjob-compose-p6lpb 0/1 Completed 0 7s
myjob-compose-r6j75 0/1 Completed 0 11s
myjob-compose-t2zlj 0/1 Completed 0 4s
myjob-compose-tfwkp 0/1 Completed 0 4s
# As expected
# kubectl -n test-job get job
NAME COMPLETIONS DURATION AGE
myjob-compose 6/6 11s 33s
# Delete the resource once testing is done
kubectl -n test-job delete job myjob-compose
That wraps up Jobs. In production, a Job is best suited to one-off tasks in CI/CD pipelines; personally I have rarely used this resource in production.
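One more field worth knowing before moving on: finished Jobs and their pods linger until deleted by hand, as we did above. The spec.ttlSecondsAfterFinished field lets the TTL controller clean them up automatically. A minimal sketch (the field is a real Job API field; the name my-job-ttl is made up for illustration):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: my-job-ttl # hypothetical example name
spec:
  ttlSecondsAfterFinished: 100 # delete the Job and its pods 100s after it finishes
  template:
    spec:
      containers:
      - name: my-job
        image: registry.cn-shanghai.aliyuncs.com/acs/busybox:v1.29.2
        command: ["sh", "-c", "date && echo Hello, hong."]
      restartPolicy: Never
```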
cronjob
The Job above is a one-off task. Can we run a task repeatedly on a schedule? Definitely - just like crontab on a Linux system. An extra benefit of a CronJob in K8s is that it is distributed: the pod that executes it can land on any NODE in the cluster (somewhat similar to cronsun).
Let's get hands-on and prepare the cronjob yaml config as my-cron-job.yaml:
apiVersion: batch/v1 # <--------- apiVersion for CronJob
kind: CronJob # <--------- the type of this resource
metadata:
  name: my-cron-job
spec:
  schedule: "* * * * *" # <--------- schedule defines when the Job runs, in the same format as Linux crontab; "* * * * *" means once every minute
  jobTemplate: # <--------- the Job template, same format as the Job above
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: registry.cn-shanghai.aliyuncs.com/acs/busybox:v1.29.2
            # note: with "sh -c" only the first argument after -c is run as the script;
            # the trailing "sh", "-c", "echo ..." become positional parameters, so only date actually executes
            command: ["sh", "-c", "date", "sh", "-c", "echo 'hong like cronjob.'"]
          restartPolicy: OnFailure
After creating it, wait a few minutes and check the results:
# kubectl -n test-job apply -f my-cron-job.yaml
# This shows a summary of the cronjob
# kubectl -n test-job get cronjobs.batch
NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE
my-cron-job * * * * * False 0 66s 2m20s
# Every minute it creates a new pod to run the job task
# kubectl -n test-job get pod -w
NAME READY STATUS RESTARTS AGE
my-cron-job-28621647-2hdhz 0/1 Completed 0 31s
my-cron-job-28621648-7mp8v 0/1 Pending 0 0s
my-cron-job-28621648-7mp8v 0/1 Pending 0 0s
my-cron-job-28621648-7mp8v 0/1 ContainerCreating 0 0s
my-cron-job-28621648-7mp8v 0/1 Completed 0 1s
my-cron-job-28621648-7mp8v 0/1 Completed 0 2s
my-cron-job-28621648-7mp8v 0/1 Completed 0 3s
my-cron-job-28621648-7mp8v 0/1 Completed 0 4s
# kubectl -n test-job logs my-cron-job-28621650-q8zbc
Sun Jun 2 03:30:00 UTC 2024
# Delete the resource once testing is done
# kubectl -n test-job delete cronjobs.batch my-cron-job
cronjob.batch "my-cron-job" deleted
CronJob scheduled tasks have many uses in production - which is why I said above that plain Jobs see little use. Anything that needs to run on a regular schedule can be run as a cronjob in K8s, and with K8s's powerful resource scheduling and self-healing capabilities, we can confidently hand scheduled tasks over to it.
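Before relying on CronJob in production, a few spec fields are worth setting explicitly. A minimal sketch (the field names are real CronJob API fields; the schedule and job body are just the example from above):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: my-cron-job
spec:
  schedule: "*/5 * * * *" # every 5 minutes
  concurrencyPolicy: Forbid # skip a run if the previous one is still going (Allow/Forbid/Replace)
  startingDeadlineSeconds: 60 # if a run can't start within 60s of its slot, count it as missed
  successfulJobsHistoryLimit: 3 # keep only the last 3 successful jobs
  failedJobsHistoryLimit: 1 # keep only the last failed job
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: registry.cn-shanghai.aliyuncs.com/acs/busybox:v1.29.2
            command: ["sh", "-c", "date && echo 'hong like cronjob.'"]
          restartPolicy: OnFailure
```

concurrencyPolicy: Forbid is especially useful for tasks that must not overlap, such as backups or report generation.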