Anthos Architecture and Deployment Series – Anthos on Bare Metal

Over the past few years the word Kubernetes has swept through the IT world. It is arguably the next landmark technology after virtualization, shining in the cloud-native world to the point of seeming almost omnipotent. At the same time, more and more software and hardware vendors have been rolling out all kinds of Kubernetes integration offerings, keeping the fever going without pause.

From VM > Container > CI/CD > Service Mesh > GitOps > Multi-Cloud

For IT folks like us who started out in networking, whenever we get together with a few old friends we can't help sighing that if you can't write a bit of code these days, or do a little automated deployment and development, it feels like you have fallen out of step with the IT world.

"Agile" sounds like an instant ticket to riches if you just keep up with it, but that is not really the case: most business owners still take a fairly conservative approach to new technology. And it is precisely because of that steady, step-by-step pace that old SEs like us can still be here talking tech with you.

Change never happens in a single leap; only by taking it one step at a time do we get where we are going.

Anthos on Bare Metal is exactly that kind of product: it lets an enterprise build its own on-premises cloud-native environment gradually, without rushing to the cloud with the crowd. So when I heard Google had released it, I was thrilled, because at last I could properly play with a Google-optimized Kubernetes platform on-premises. Let's follow Google's lead and give it a try! I suspect most people who work with Kubernetes started out with the open-source community. While building a Kubernetes environment you will [absolutely] hit many, no, an enormous number of installation problems and error messages, and that is when your engineer's spirit gets tested! Read enough posts and put in enough lab hours, and you can work through the issues one by one, stand the platform up, and enjoy the speed rush that Kubernetes delivers!

That's right! With a single command you can spin up a WordPress blog site and give it the magical ability to heal itself and scale its resources automatically!
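As a rough sketch of what that looks like (assuming Helm is installed and the Bitnami chart repository is used; the release name my-blog is just an example):

helm repo add bitnami https://charts.bitnami.com/bitnami
helm install my-blog bitnami/wordpress

The self-healing comes from the Deployment/ReplicaSet the chart creates; autoscaling can then be layered on top with a HorizontalPodAutoscaler.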

Reality, however, is harsh: most enterprise IT staff do not actually have these skills, because past training has focused on networking and Windows, and few MIS folks voluntarily touch the trickier Linux systems. That is why, before an enterprise steps into cloud-native territory, some serious catching up may be needed.

Here is a quick self-test. If you have heard of and played with all of the technologies below, congratulations, it is time to start playing with Kubernetes!

  • Linux operating systems (Ubuntu / CentOS)
  • Virtualization platforms (VMware / Hyper-V)
  • Containers
  • Proxies / load balancers
  • A code editor (Visual Studio Code)
  • Version control (Git)

More than half of them unfamiliar? That is fine too, because next we will show you how Anthos on Bare Metal helps you build an enterprise-grade Kubernetes environment, with Google's bmctl management tool handling the automated deployment for us.

Enough talk! Let's start with the system architecture.

[Figure: Anthos on Bare Metal system architecture. Image source: Google]

For a standard production-grade Kubernetes cluster, we generally recommend at least 3 masters and at least 5 workers to provide the system resources your workloads need. If this is only a test environment, a mini cluster of 1 master + 1 worker is perfectly fine.

Hardware resource requirements:

Anthos Bare Metal has minimum system resource requirements, so when preparing the node environment take care not to drop below them. As for disk type, you can freely choose HDD or SSD (higher IOPS) based on your workload characteristics.

[Table: Anthos Bare Metal hardware resource requirements]

Operating system requirements:

Anthos Bare Metal currently supports only the following operating systems and versions, so keep this in mind when choosing:

  • CentOS 8.1
  • CentOS 8.2
  • CentOS 8.3
  • CentOS 8.4
  • RHEL 8.1
  • RHEL 8.2
  • RHEL 8.3
  • RHEL 8.4
  • Ubuntu 18.04
  • Ubuntu 20.04

Software tuning requirements:

Each operating system has its own tuning requirements, so once the OS is installed, remember to go through the corresponding adjustments (an Ubuntu example follows the lists below).

CentOS
  • Set SELinux to permissive
  • Disable firewalld
  • Install Docker 19.03 or later
  • Keep time in sync via NTP
RHEL
  • Register the system for updates
  • Set SELinux to permissive
  • Disable firewalld
  • Install Docker 19.03 or later
  • Keep time in sync via NTP
Ubuntu
  • Disable AppArmor
  • Disable ufw
  • Install Docker 19.03 or later
  • Keep time in sync via NTP
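For reference, a minimal Ubuntu 20.04 sketch of the tuning above, run on every node (the distro docker.io package is assumed to satisfy the 19.03+ requirement, so verify the version after installing):

sudo systemctl stop apparmor && sudo systemctl disable apparmor    # disable AppArmor
sudo ufw disable                                                   # disable ufw
sudo apt-get update && sudo apt-get install -y docker.io           # then check: docker --version
sudo timedatectl set-ntp true                                      # keep the clock synced via NTP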

Networking requirements:

The network architecture for Anthos on bare metal is very simple: a single subnet can serve both the external L4 load balancing and the master/worker node traffic. Of course, many enterprises prefer to separate the service segment from the host segment; that is no problem either, just place a firewall in front and you are set.

Load balancer requirements:

Anthos on bare metal ships with MetalLB built in, and the related installation and setup are completed automatically during provisioning, unlike other open-source K8s clusters where you have to install it yourself before you get L4 load-balancing traffic forwarding.
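Once the cluster is up, requesting an L4 LB IP looks roughly like this (a sketch only, using a hypothetical nginx deployment named hello-web; the external IP is handed out from the MetalLB address pool defined later in the cluster config):

kubectl create deployment hello-web --image=nginx
kubectl expose deployment hello-web --port=80 --type=LoadBalancer
kubectl get svc hello-web        # EXTERNAL-IP should come from the MetalLB pool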

GCP platform setup requirements:

Before formally deploying the environment with bmctl, you need to enable the relevant Anthos APIs, generate service account key tokens, and finally write the key paths into the config so the deployment can pick up the corresponding permissions (a hedged sketch of the manual route follows the reference link below).

Deployment reference: https://cloud.google.com/anthos/clusters/docs/bare-metal/1.8/installing/configure-sa
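Done by hand, the gist is roughly as follows (the API list here is only partial and the service account name bm-gcr plus the project ID my-gcp-project are placeholders; the linked doc has the full list of APIs and the IAM roles each key needs):

gcloud services enable anthos.googleapis.com anthosgke.googleapis.com \
  gkeconnect.googleapis.com gkehub.googleapis.com \
  cloudresourcemanager.googleapis.com logging.googleapis.com monitoring.googleapis.com
gcloud iam service-accounts create bm-gcr
gcloud iam service-accounts keys create bm-gcr.json \
  --iam-account=bm-gcr@my-gcp-project.iam.gserviceaccount.com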

Note: If this feels like a hassle, you can also let bmctl take care of all of the above for you.

Anthos cluster deployment modes:

[Table: Anthos cluster deployment mode comparison]

Anthos clusters support three deployment modes (Standalone / Multi Cluster / Hybrid Cluster). In most cases we choose Standalone; only when your environment needs multiple user clusters should you pick Multi Cluster or Hybrid Cluster to centrally manage and deploy those user clusters. Hybrid Cluster can be seen as a variant of Multi Cluster: it can also run shared workloads, typically monitoring and analytics components, on the hybrid cluster's own workers.

Admin workstation requirements:

Since the automated deployment is driven by the bmctl command set, we need to prepare one extra Linux host (Ubuntu is recommended) and do a bit of additional preparation (a consolidated sketch follows this list):

  1. Generate a public/private key pair with ssh-keygen and write the public key into .ssh/authorized_keys on every master/worker host
  2. Install the gcloud SDK: https://cloud.google.com/sdk/docs/install
  3. Install the latest version of bmctl: https://cloud.google.com/anthos/clusters/docs/bare-metal/1.8/downloads
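Pulled together, the workstation prep looks roughly like this (a sketch only: root login on the nodes, the snap-based SDK install, and the 1.8.2 bucket path are assumptions, so grab the exact bmctl path for your version from the downloads page above):

ssh-keygen -t rsa -f ~/.ssh/id_rsa -N ""
for node in 10.200.0.4 10.200.0.5 10.200.0.6; do ssh-copy-id -i ~/.ssh/id_rsa.pub root@$node; done
sudo snap install google-cloud-sdk --classic
gsutil cp gs://anthos-baremetal-release/bmctl/1.8.2/linux-amd64/bmctl .
chmod +x bmctl && sudo mv bmctl /usr/local/bin/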

Phew! That took some effort, but the basic environment is ready. Now for the main event: deploying the cluster!

Step 1: On the admin workstation, obtain application-default credentials for the GCP project (first confirm that the account you log in with has the Owner role on the project).

gcloud auth application-default login
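It is also worth confirming which account and project are active before continuing; a small sketch, with my-gcp-project as the placeholder project ID:

gcloud config set project my-gcp-project     # placeholder project ID
gcloud config get-value project              # confirm the active project
gcloud auth list                             # confirm the active account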

Step 2: Compile a configuration worksheet, which will make editing the template config later much easier.

Configuration item | Setting | Description
sshPrivateKeyPath | ~/.ssh/id_rsa | Private key the admin workstation uses to log in to the nodes and automate environment setup
Cluster Type | Standalone | Deployment mode of the cluster
Master Node IP | 10.200.0.4 | Master node IP; add more entries if you have multiple masters
Pod IP-range | 192.168.0.0/16 | IP range for Kubernetes pods; should not overlap with any subnet already in use on your network
Service IP-range | 10.96.0.0/20 | IP range for Kubernetes services; should not overlap with any subnet already in use on your network
controlPlaneVIP | 10.200.0.71 | Kubernetes api-server VIP; kubectl will later manage the cluster through this address
ingressVIP | 10.200.0.72 | Default ingress VIP; if you configure an ingress later, L7 LB access and forwarding goes through this IP
MetalLB IP-pool | 10.200.0.72-10.200.0.90 | Address pool for the external L4 LB; kubectl expose can later request an IP dynamically from this pool for L4 LB access and forwarding
GCP Logging & Monitoring | location: asia-east1, enableApplication: true | Enable application logging and metrics and ship them to GCP for storage and analysis
Worker Nodes IP | 10.200.0.5, 10.200.0.6 | Worker node IPs; add more entries if you have multiple workers
Configuration worksheet

Step 3: Generate the cluster template config with bmctl; using the project credentials obtained in Step 1, it will automatically enable the APIs and create the service accounts and keys.

export CLOUD_PROJECT_ID=$(gcloud config get-value project)
bmctl create config -c STD-C01 --enable-apis \
  --create-service-accounts --project-id=$CLOUD_PROJECT_ID

Step 4: Edit the cluster template config

gcrKeyPath: /bmctl/bmctl-workspace/.sa-keys/my-gcp-project-anthos-baremetal-gcr.json
sshPrivateKeyPath: ~/.ssh/id_rsa
gkeConnectAgentServiceAccountKeyPath: /bmctl/bmctl-workspace/.sa-keys/my-gcp-project-anthos-baremetal-connect.json
gkeConnectRegisterServiceAccountKeyPath: /bmctl/bmctl-workspace/.sa-keys/my-gcp-project-anthos-baremetal-register.json
cloudOperationsServiceAccountKeyPath: /bmctl/bmctl-workspace/.sa-keys/my-gcp-project-anthos-baremetal-cloud-ops.json
---
apiVersion: v1
kind: Namespace
metadata:
  name: cluster-STD-C01
---
apiVersion: baremetal.cluster.gke.io/v1
kind: Cluster
metadata:
  name: STD-C01
  namespace: cluster-STD-C01
spec:
  # Cluster type. This can be:
  #   1) admin:      to create an admin cluster. This can later be used to create user clusters.
  #   2) user:       to create a user cluster. Requires an existing admin cluster.
  #   3) hybrid:     to create a hybrid cluster that runs admin cluster components and user workloads.
  #   4) standalone: to create a cluster that manages itself, runs user workloads, but does not manage other clusters.
  type: standalone
  # Anthos cluster version.
  anthosBareMetalVersion: 1.8.2
  # GKE connect configuration
  gkeConnect:
    projectID: $GOOGLE_PROJECT_ID
  # Control plane configuration
  controlPlane:
    nodePoolSpec:
      nodes:
      # Control plane node pools. Typically, this is either a single machine
      # or 3 machines if using a high availability deployment.
      - address: 10.200.0.4
  # Cluster networking configuration
  clusterNetwork:
    # Pods specify the IP ranges from which pod networks are allocated.
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    # Services specify the network ranges from which service virtual IPs are allocated.
    # This can be any RFC 1918 range that does not conflict with any other IP range
    # in the cluster and node pool resources.
    services:
      cidrBlocks:
      - 10.96.0.0/20
  # Load balancer configuration
  loadBalancer:
    # Load balancer mode can be either 'bundled' or 'manual'.
    # In 'bundled' mode a load balancer will be installed on load balancer nodes during cluster creation.
    # In 'manual' mode the cluster relies on a manually-configured external load balancer.
    mode: bundled
    # Load balancer port configuration
    ports:
      # Specifies the port the load balancer serves the Kubernetes control plane on.
      # In 'manual' mode the external load balancer must be listening on this port.
      controlPlaneLBPort: 443
    # There are two load balancer virtual IP (VIP) addresses: one for the control plane
    # and one for the L7 Ingress service. The VIPs must be in the same subnet as the load balancer nodes.
    # These IP addresses do not correspond to physical network interfaces.
    vips:
      # ControlPlaneVIP specifies the VIP to connect to the Kubernetes API server.
      # This address must not be in the address pools below.
      controlPlaneVIP: 10.200.0.71
      # IngressVIP specifies the VIP shared by all services for ingress traffic.
      # Allowed only in non-admin clusters.
      # This address must be in the address pools below.
      ingressVIP: 10.200.0.72
    # AddressPools is a list of non-overlapping IP ranges for the data plane load balancer.
    # All addresses must be in the same subnet as the load balancer nodes.
    # Address pool configuration is only valid for 'bundled' LB mode in non-admin clusters.
    addressPools:
    - name: pool1
      addresses:
      # Each address must be either in the CIDR form (1.2.3.0/24)
      # or range form (1.2.3.1-1.2.3.5).
      - 10.200.0.72-10.200.0.90
    # A load balancer node pool can be configured to specify nodes used for load balancing.
    # These nodes are part of the Kubernetes cluster and run regular workloads as well as load balancers.
    # If the node pool config is absent then the control plane nodes are used.
    # Node pool configuration is only valid for 'bundled' LB mode.
    # nodePoolSpec:
    #   nodes:
    #   - address:
  # Proxy configuration
  # proxy:
  #   url: http://[username:password@]domain
  #   # A list of IPs, hostnames or domains that should not be proxied.
  #   noProxy:
  #   - 127.0.0.1
  #   - localhost
  # Logging and Monitoring
  clusterOperations:
    # Cloud project for logs and metrics.
    projectID: $GOOGLE_PROJECT_ID
    # Cloud location for logs and metrics.
    location: asia-east1
    # Whether collection of application logs/metrics should be enabled (in addition to
    # collection of system logs/metrics which correspond to system components such as
    # Kubernetes control plane or cluster management agents).
    enableApplication: true
  # Storage configuration
  storage:
    # lvpNodeMounts specifies the config for local PersistentVolumes backed by mounted disks.
    # These disks need to be formatted and mounted by the user, which can be done before or after
    # cluster creation.
    lvpNodeMounts:
      # path specifies the host machine path where mounted disks will be discovered and a local PV
      # will be created for each mount.
      path: /mnt/localpv-disk
      # storageClassName specifies the StorageClass that PVs will be created with. The StorageClass
      # is created during cluster creation.
      storageClassName: local-disks
    # lvpShare specifies the config for local PersistentVolumes backed by subdirectories in a shared filesystem.
    # These subdirectories are automatically created during cluster creation.
    lvpShare:
      # path specifies the host machine path where subdirectories will be created on each host. A local PV
      # will be created for each subdirectory.
      path: /mnt/localpv-share
      # storageClassName specifies the StorageClass that PVs will be created with. The StorageClass
      # is created during cluster creation.
      storageClassName: local-shared
      # numPVUnderSharedPath specifies the number of subdirectories to create under path.
      numPVUnderSharedPath: 5
  # NodeConfig specifies the configuration that applies to all nodes in the cluster.
  nodeConfig:
    # podDensity specifies the pod density configuration.
    podDensity:
      # maxPodsPerNode specifies at most how many pods can be run on a single node.
      maxPodsPerNode: 250
    # containerRuntime specifies which container runtime to use for scheduling containers on nodes.
    # containerd and docker are supported.
    containerRuntime: containerd
  # KubeVirt configuration, uncomment this section if you want to install kubevirt to the cluster
  # kubevirt:
  #   # if useEmulation is enabled, hardware accelerator (i.e relies on cpu feature like vmx or svm)
  #   # will not be attempted. QEMU will be used for software emulation.
  #   # useEmulation must be specified for KubeVirt installation
  #   useEmulation: false
  # Authentication; uncomment this section if you wish to enable authentication to the cluster with OpenID Connect.
  # authentication:
  #   oidc:
  #     # issuerURL specifies the URL of your OpenID provider, such as "https://accounts.google.com". The Kubernetes API
  #     # server uses this URL to discover public keys for verifying tokens. Must use HTTPS.
  #     issuerURL:
  #     # clientID specifies the ID for the client application that makes authentication requests to the OpenID
  #     # provider.
  #     clientID:
  #     # clientSecret specifies the secret for the client application.
  #     clientSecret:
  #     # kubectlRedirectURL specifies the redirect URL (required) for the gcloud CLI, such as
  #     # "http://localhost:[PORT]/callback".
  #     kubectlRedirectURL:
  #     # username specifies the JWT claim to use as the username. The default is "sub", which is expected to be a
  #     # unique identifier of the end user.
  #     username:
  #     # usernamePrefix specifies the prefix prepended to username claims to prevent clashes with existing names.
  #     usernamePrefix:
  #     # group specifies the JWT claim that the provider will use to return your security groups.
  #     group:
  #     # groupPrefix specifies the prefix prepended to group claims to prevent clashes with existing names.
  #     groupPrefix:
  #     # scopes specifies additional scopes to send to the OpenID provider as a comma-delimited list.
  #     scopes:
  #     # extraParams specifies additional key-value parameters to send to the OpenID provider as a comma-delimited
  #     # list.
  #     extraParams:
  #     # proxy specifies the proxy server to use for the cluster to connect to your OIDC provider, if applicable.
  #     # Example: https://user:password@10.10.10.10:8888. If left blank, this defaults to no proxy.
  #     proxy:
  #     # deployCloudConsoleProxy specifies whether to deploy a reverse proxy in the cluster to allow Google Cloud
  #     # Console access to the on-premises OIDC provider for authenticating users. If your identity provider is not
  #     # reachable over the public internet, and you wish to authenticate using Google Cloud Console, then this field
  #     # must be set to true. If left blank, this field defaults to false.
  #     deployCloudConsoleProxy:
  #     # certificateAuthorityData specifies a Base64 PEM-encoded certificate authority certificate of your identity
  #     # provider. It's not needed if your identity provider's certificate was issued by a well-known public CA.
  #     # However, if deployCloudConsoleProxy is true, then this value must be provided, even for a well-known public
  #     # CA.
  #     certificateAuthorityData:
  # Node access configuration; uncomment this section if you wish to use a non-root user
  # with passwordless sudo capability for machine login.
  # nodeAccess:
  #   loginUser:
---
# Node pools for worker nodes
apiVersion: baremetal.cluster.gke.io/v1
kind: NodePool
metadata:
  name: node-pool-1
  namespace: cluster-STD-C01
spec:
  clusterName: STD-C01
  nodes:
  - address: 10.200.0.5
  - address: 10.200.0.6

Step 5: Run the actual deployment with bmctl. Once the kubeconfig is generated, you can log in to the cluster and start using it!

bmctl create cluster -c STD-C01
export KUBECONFIG=~/bmctl-workspace/STD-C01/STD-C01-kubeconfig
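A quick sanity check once the kubeconfig is in place (just a sketch):

kubectl get nodes -o wide
kubectl get pods -A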

Step 6: Create a service account and role and bind them together. Then grab the token and paste it into GCP Console > Anthos > select STD-C01 > login > token auth, and from then on we can manage the on-prem K8s cluster straight from the GCP Console.

kubectl create serviceaccount gke-connect-admin
kubectl create clusterrolebinding gke-connect-admin-clusterrolebinding --clusterrole cluster-admin --serviceaccount=default:gke-connect-admin
kubectl get secrets
NAME TYPE DATA AGE
default-token-pkhk5 kubernetes.io/service-account-token 3 15h
gke-connect-admin-token-pdns6 kubernetes.io/service-account-token 3 14h
echo Token: $(kubectl get secret --namespace default gke-connect-admin-token-pdns6 -o jsonpath="{.data.token}" | base64 --decode)
Token: eyJhbGciOiJSUzI1NiIsImtpZCI6IlJfTzFqODdUVFByN2lidzhQMGc5djdTaDJYWlNTcU1sQUtiODU1TXltemcifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJkZWZhdWx0Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZWNyZXQubmFtZSI6ImdrZS1jb25uZWN0LWFkbWluLXRva2VuLXBkbnM2Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImdrZS1jb25uZWN0LWFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zxxxxxxxxxxxxxxxxxxxxxxxxWlkIjoiMmJmMjcxYzUtMjc2ZC00MjRjLWExODUtNWM3NjJiZWI0Y2Q3Iiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50OmRlZmF1bHQ6Z2tlLWNvbm5lY3QtYWRtaW4ifQ.l8a6hOLPDsl5LYP5aZrBAlaik8eCJFAYGmU0xr81BizgJ0ASb4YiCGSOKzgLGfZ8xq2OGDtrjTD3BG6DnZyKsnPihg1xYVnitZSVcXhQneGwQeFBXqRg81Wp6qfVqdJDB1Kr9dMZUzg0zVD9mOt9VNZhrMnAVTI5KO1KZBC4yBDwX2QLO5dHYdvg-RRsZZEK0SUt7Bu1rXoAWuMlwKRmaczGW3lAJkI87OT0H9xtL0_wc2UhQEkQl-P5e08MshKjSEcgdqOcLkZQxUsny7IsoIKQ6QT0XGOjWNNkc0kOcwrHAsxyna1tPRvTy7EvLicJ3ywcGWBFzQ63hs2zpSC-5w

Installing a Kubernetes environment can really be this simple! Why not give it a try yourself?

Looking back at the deployment, Anthos on Bare Metal has clearly simplified many of the complicated steps; even enabling the GCP APIs and creating the IAM permissions is automated! On top of that, the bmctl deployment tool includes a preflight check mechanism, which gives us a much clearer picture during the build of which parts need adjusting or fixing.
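As an aside, the checks can also be run on their own. The exact subcommands and flags vary by bmctl version, so treat the lines below as an assumption and confirm with bmctl check --help:

bmctl check preflight -c STD-C01      # re-run the preflight checks against the cluster config
bmctl check cluster -c STD-C01        # health-check a cluster that is already running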

Many of the core projects are also being folded into the Anthos feature ecosystem one after another; with just a little configuration you get fully automated deployment, which is fantastic! And when you hit a technical problem, you can open a case and have Google's experts help you solve it and point you in the right direction, so you reach the correct fix and improvement that much faster.

That is about it for today! In upcoming posts we will cover the other Anthos deployment options, so stay tuned. Follow our events page and Line channel to get our latest updates the moment they go out!

Published: 2021-09-13 | James Wu

More hybrid-cloud solutions: hybrid cloud planning and deployment

羽昇國際 – Enterprise cloud adoption planning and assessment services
