Kruise Rollout发布
1. kruise-rollout简介
金丝雀发布
是逐步将流量导向新版本的应用,以最小化风险。具体过程包括:
- 部署新版本的 Pod。
- 将一部分流量分配到新版本(通常比例很小)。
- 逐步增加新版本的流量比例,直到完成全量发布。
Kruise Rollouts
是一个 Bypass(旁路) 组件,提供 高级渐进式交付功能 。可以通过Kruise Rollouts插件来实现金丝雀发布的能力。
组件 | Kruise Rollouts |
---|---|
核心概念 | 增强现有的工作负载 |
架构 | Bypass |
插拔和热切换 | 是 |
发布类型 | 多批次、金丝雀、A/B测试、全链路灰度 |
工作负载类型 | Deployment、StatefulSet、CloneSet、Advanced StatefulSet、Advanced DaemonSet |
流量类型 | Ingress、GatewayAPI、CRD(需要 Lua 脚本) |
迁移成本 | 无需迁移工作负载和Pods |
HPA 兼容性 | 是 |
2. 安装kruise-rollout
2.1. helm安装kruise-rollout
helm repo add openkruise https://openkruise.github.io/charts/
helm repo update
kubectl create ns openkruise
helm install kruise-rollout openkruise/kruise-rollout -n openkruise
# 升级到指定版本
helm upgrade kruise-rollout openkruise/kruise-rollout --version 0.5.0
查看部署结果
通过helm
和kubectl
的命令可以看到创建了几个crd
和kruise-rollout-controller-manager
的pod来提供rollout的功能。
# helm list -A
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
kruise-rollout openkruise 1 2024-11-24 16:52:36.067327844 +0800 +08 deployed kruise-rollout-0.5.0 0.5.0
# kubectl get po -n kruise-rollout
NAME READY STATUS RESTARTS AGE
kruise-rollout-controller-manager-875654888-rpxds 1/1 Running 0 87s
kruise-rollout-controller-manager-875654888-w75xj 1/1 Running 0 87s
# kubectl get crd |grep kruise
batchreleases.rollouts.kruise.io 2024-11-24T08:52:36Z
rollouthistories.rollouts.kruise.io 2024-11-24T08:52:36Z
rollouts.rollouts.kruise.io 2024-11-24T08:52:36Z
trafficroutings.rollouts.kruise.io 2024-11-24T08:52:36Z
2.2. 安装kubectl-kruise
kubectl-kruise用于执行rollout发版和回退等操作的二进制命令。
kubectl-kruise rollout approve
:执行下一批版本发布kubectl-kruise rollout undo
:回退全部发版批次
wget https://github.com/openkruise/kruise-tools/releases/download/v1.1.7/kubectl-kruise-linux-amd64-v1.1.7.tar.gz
tar -zvxf kubectl-kruise-linux-amd64-v1.1.7.tar.gz
mv linux-amd64/kubectl-kruise /usr/local/bin/
3. 使用指南
先部署一个k8s deployment对象,例如部署10个nginx:1.26的容器。
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: nginx
name: nginx
namespace: default
spec:
replicas: 10
selector:
matchLabels:
app: nginx
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
labels:
app: nginx
spec:
containers:
- image: nginx:1.26
imagePullPolicy: IfNotPresent
name: nginx
ports:
- containerPort: 80
protocol: TCP
3.1. 创建发布策略(rollout对象)
我们以使用多批次灰度发布
为例。
workloadRef
定义作用于哪个deployment对象。strategy
定义发布策略。
以下是发布策略示例描述:
- 在第一批次:只升级 1个Pod;
- 在第二批次:升级 50% 的 Pods,即 5个已更新的Pod;
- 在第三批次:升级 100% 的 Pods,即 10个已更新的Pod。
$ kubectl apply -f - <<EOF
apiVersion: rollouts.kruise.io/v1beta1
kind: Rollout
metadata:
name: rollouts-nginx
namespace: default
spec:
workloadRef:
apiVersion: apps/v1
kind: Deployment
name: nginx
strategy:
canary:
enableExtraWorkloadForCanary: false
steps:
- replicas: 1
- replicas: 50%
- replicas: 100%
EOF
3.2. 升级发布版本(发布第一批次)
升级nginx:1.26到nginx:1.27
kubectl set image deployment nginx nginx=nginx:1.27
查看灰度结果:
# kubectl get rs -L pod-template-hash -w
NAME DESIRED CURRENT READY AGE POD-TEMPLATE-HASH
nginx-68556bc579 1 1 1 8s 68556bc579 # nginx:1.27
nginx-d7f5d89c9 9 9 9 24m d7f5d89c9 # nginx:1.26
查看rollout状态:
rollout对象的状态主要描述了当前处于哪个rollout阶段及总状态和子阶段状态的信息。
# kubectl get rollout rollouts-nginx -oyaml
apiVersion: rollouts.kruise.io/v1beta1
kind: Rollout
metadata:
name: rollouts-nginx
namespace: default
spec:
disabled: false
strategy:
canary:
steps:
- pause: {}
replicas: 1
- pause: {}
replicas: 50%
- pause: {}
replicas: 100%
workloadRef:
apiVersion: apps/v1
kind: Deployment
name: nginx
status:
canaryStatus:
canaryReadyReplicas: 1
canaryReplicas: 1
canaryRevision: 5c9556986d
currentStepIndex: 1
currentStepState: StepPaused # 当前步骤状态
lastUpdateTime: "2024-11-24T11:52:56Z"
message: BatchRelease is at state Ready, rollout-id , step 1
observedWorkloadGeneration: 36
podTemplateHash: d7f5d89c9
rolloutHash: 77cxd69w47b7bwddwv2w7vxvb4xxdbwcx9x289vw69w788w4w6z4x8dd4vbz2zbw
stableRevision: 68556bc579
conditions:
- lastTransitionTime: "2024-11-24T11:52:47Z"
lastUpdateTime: "2024-11-24T11:52:47Z"
message: Rollout is in Progressing
reason: InRolling
status: "True"
type: Progressing
message: Rollout is in step(1/3), and you need manually confirm to enter the next
step # rollout信息
observedGeneration: 2
phase: Progressing # 整体状态
3.3. 发布第二批次
kubectl-kruise rollout approve rollout/rollouts-nginx -n default
查看灰度结果:
# kubectl get rs -L pod-template-hash -w
NAME DESIRED CURRENT READY AGE POD-TEMPLATE-HASH
nginx-68556bc579 5 5 5 7m24s 68556bc579 # nginx:1.27
nginx-d7f5d89c9 5 5 5 32m d7f5d89c9 # nginx:1.26
# kubectl get po
NAME READY STATUS RESTARTS AGE
nginx-68556bc579-24lpl 1/1 Running 0 7m34s # nginx:1.27 第一批次
nginx-68556bc579-2dph8 1/1 Running 0 14s # nginx:1.27 第二批次
nginx-68556bc579-57pqt 1/1 Running 0 14s # nginx:1.27 第二批次
nginx-68556bc579-879s9 1/1 Running 0 14s # nginx:1.27 第二批次
nginx-68556bc579-fwt52 1/1 Running 0 14s # nginx:1.27 第二批次
nginx-d7f5d89c9-5fbfp 1/1 Running 0 30m # 其余为第三批次
nginx-d7f5d89c9-gkz9p 1/1 Running 0 30m
nginx-d7f5d89c9-jhxwl 1/1 Running 0 30m
nginx-d7f5d89c9-vrqfz 1/1 Running 0 30m
nginx-d7f5d89c9-zk2sj 1/1 Running 0 30m
3.4. 发布第三批次
kubectl-kruise rollout approve rollout/rollouts-nginx -n default
查看结果:
# kubectl get rs -L pod-template-hash -w
NAME DESIRED CURRENT READY AGE POD-TEMPLATE-HASH
nginx-68556bc579 10 10 10 12m 68556bc579 # nginx:1.27
nginx-d7f5d89c9 0 0 0 37m d7f5d89c9
3.5. 发布回滚
该回滚会把全部批次的pod都回滚到第一批发版前的版本。
kubectl-kruise rollout undo rollout/rollouts-nginx -n default
4. k8s对象变更
kruise-rollout的原理主要还是劫持Deployment
和ReplicaSet
的操作,例如在升级deployment的时候会把deployment原先的strategy
策略放在annotations中暂存,再增加paused=true
的参数暂停deployment的发布。在rollout全部结束后再恢复原先的deployment参数,主要包括paused
和strategy
。
以下是rollout中间阶段的deployment信息:
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
batchrelease.rollouts.kruise.io/control-info: '{"apiVersion":"rollouts.kruise.io/v1beta1","kind":"BatchRelease","name":"rollouts-nginx","uid":"8f29624b-ab77-49d1-abbe-aa92bb3998e2","controller":true,"blockOwnerDeletion":true}'
deployment.kubernetes.io/revision: "5"
rollouts.kruise.io/deployment-extra-status: '{"updatedReadyReplicas":1,"expectedUpdatedReplicas":1}'
rollouts.kruise.io/deployment-strategy: '{"rollingStyle":"Partition","rollingUpdate":{"maxUnavailable":"25%","maxSurge":"25%"},"partition":1}' # deployment原本的strategy策略
rollouts.kruise.io/in-progressing: '{"rolloutName":"rollouts-nginx"}'
generation: 36
labels:
app: nginx
rollouts.kruise.io/controlled-by-advanced-deployment-controller: "true"
rollouts.kruise.io/stable-revision: 68556bc579
rollouts.kruise.io/workload-type: deployment
name: nginx
namespace: default
spec:
paused: true # 增加了paused参数
progressDeadlineSeconds: 600
replicas: 10
revisionHistoryLimit: 10
5. Rollout流量灰度
5.1. 添加pod 版本标签
为了适配Apisix
或Nginx
等k8s流量网关的灰度逻辑,我们通过k8s service
的方式来动态获取Pod的IP(即endpoint),其中包括全量pod,灰度pod和非灰度pod。因此我们在每次发版的deployment给pod加上所属的版本标签
。
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: nginx
name: nginx
namespace: default
spec:
replicas: 10
selector:
matchLabels:
app: nginx
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
labels:
app: nginx
version: "1.26" # 增加pod对应的版本标签,用于service联动
spec:
containers:
- image: nginx:1.26
imagePullPolicy: IfNotPresent
name: nginx
ports:
- containerPort: 80
protocol: TCP
5.2. 添加灰度pod的service
创建三个service来对应全量pod,灰度pod和非灰度pod。
全量pod的service
apiVersion: v1
kind: Service
metadata:
name: nginx
namespace: default
spec:
selector:
app: nginx # 全量pod的标签
ports:
- name: http
protocol: TCP
port: 80
targetPort: 80
灰度pod的service
apiVersion: v1
kind: Service
metadata:
name: nginx-canary
namespace: default
labels:
version: canary
spec:
selector:
app: nginx
version: "1.27" # 增加灰度版本的标签,对应灰度的pod
ports:
- name: http
protocol: TCP
port: 80
targetPort: 80
非灰度pod的service
apiVersion: v1
kind: Service
metadata:
name: nginx-stable
namespace: default
spec:
selector:
app: nginx
version: "1.26" # 增加非灰度版本的标签,对应非灰度的pod
ports:
- name: http
protocol: TCP
port: 80
targetPort: 80
5.3. 动态获取灰度的endpoint
通过以上service的配置,可以在灰度的过程中动态获取灰度pod的endpoint,然后动态集成到流量网关中。
全量pod的service
:在流量灰度前和流量灰度完成后可以把流量调度到这个service下的pod。灰度pod的service
:可以设置把灰度流量调度到canary的service的pod。非灰度pod的service
:可以把非灰度流量调度到stable的service的pod。
以下是灰度前后endpoint的变化:
# kubectl get endpoints
# 灰度前:canary的pod为0
nginx 10.0.1.139:80,10.0.1.202:80,10.0.1.203:80 + 7 more... 2d
nginx-canary <none> 2d
nginx-stable 10.0.1.139:80,10.0.1.202:80,10.0.1.203:80 + 7 more... 47h
# 灰度中:canary的pod增加
nginx 10.0.1.202:80,10.0.1.203:80,10.0.1.204:80 + 7 more... 2d <none>
nginx-canary 10.0.1.5:80 2d version=canary
nginx-stable 10.0.1.202:80,10.0.1.203:80,10.0.1.204:80 + 6 more... 47h version=stable
# 灰度完成:stable的pod为0
NAME ENDPOINTS AGE
nginx 10.0.1.139:80,10.0.1.202:80,10.0.1.203:80 + 7 more... 2d
nginx-canary 10.0.1.139:80,10.0.1.202:80,10.0.1.203:80 + 7 more... 2d
nginx-stable <none> 47h
6. 总结
k8s的deployment默认支持金丝雀发布
,可以通过kubectl rollout
的命令实现,通过配置deployment中的maxSurge
和maxUnavailable
来控制发版的节奏,可以通过以下命令来控制发版批次:
kubectl rollout pause
:暂停发版kubectl rollout resume
:继续发版kubectl rollout undo
:回滚版本
但相对来讲,控制粒度没有那么精确,而OpenKruise的Rollouts
插件提供了一种更加轻便,可控的发版工具,个人认为最大的优势是配置简单,且无需迁移工作负载和Pods
,可以更好的集成到现有的容器平台中。因此可以通过该工具来实现金丝雀发版(多批次发版),实现灰度的能力。
参考:
- https://openkruise.io/zh/rollouts/introduction
- https://openkruise.io/zh/rollouts/installation
- https://openkruise.io/zh/rollouts/user-manuals/basic-usage
Feedback
Was this page helpful?
Glad to hear it! Please tell us how we can improve.
Sorry to hear that. Please tell us how we can improve.