跳到主要内容

Kubernetes Operator 数据采集

前置条件

前置条件
  1. 需要安装好 light-agent,具体安装步骤查看 Lighthouse 平台【数据采集】
  2. 区分 light-agent 是否安装在 Kubernetes 集群中
    • 如果 light-agent 安装在 Kubernetes 集群中,在安装 light-agent 的时候,已经将镜像导入到 Kubernetes 可以访问的仓库
    • 如果 light-agent 安装在 Kubernetes 集群外,需要将 kube-state-metrics 的镜像导入到 Kubernetes 可以访问的仓库。具体步骤查看 Lighthouse 平台【数据采集】-> 【Kubernetes】-> 推送镜像到 Kubernetes 的镜像仓库

获取 cert-manager、opentelemetry-operator 的 yaml 文件

注意
  • REGISTRY_REPO: 请替换为您的具体的镜像仓库地址 (需要加上项目名称, 比如 192.168.2.99/light-agent)
  • REGISTRY_CRED: dockerconfig-secret 密钥。请参考Kubenetes 安装 light-agent
  • envsubst: 环境变量替换工具。里面包含的变量不需要修改
# <Lighthouse IP> 是 Lighthouse 平台的 ip,请替换为实际的 ip。默认端口号是: 80, 一般不需要修改。
curl -O http://<Lighthouse IP>/downloads/k8s/cert-manager-template.yaml
REGISTRY_REPO=<your_docker_registry_url> REGISTRY_CRED=<registry_crea_str> envsubst '${REGISTRY_REPO} ${REGISTRY_CRED}' <cert-manager-template.yaml >cert-manager.yaml

curl -O http://<Lighthouse IP>/downloads/k8s/opentelemetry-operator-template.yaml
REGISTRY_REPO=<your_docker_registry_url> REGISTRY_CRED=<registry_crea_str> envsubst '${REGISTRY_REPO} ${REGISTRY_CRED}' <opentelemetry-operator-template.yaml >opentelemetry-operator.yaml

安装 cert-manager

OpenTelemetry Operator 需要先安装证书管理器

[root@master tpls]# kubectl apply -f cert-manager.yaml
namespace/cert-manager created
customresourcedefinition.apiextensions.k8s.io/certificaterequests.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/challenges.acme.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/clusterissuers.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/issuers.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/orders.acme.cert-manager.io created
serviceaccount/cert-manager-cainjector created
serviceaccount/cert-manager created
serviceaccount/cert-manager-webhook created
clusterrole.rbac.authorization.k8s.io/cert-manager-cainjector created
clusterrole.rbac.authorization.k8s.io/cert-manager-controller-issuers created
clusterrole.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers created
clusterrole.rbac.authorization.k8s.io/cert-manager-controller-certificates created
clusterrole.rbac.authorization.k8s.io/cert-manager-controller-orders created
clusterrole.rbac.authorization.k8s.io/cert-manager-controller-challenges created
clusterrole.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim created
clusterrole.rbac.authorization.k8s.io/cert-manager-cluster-view created
clusterrole.rbac.authorization.k8s.io/cert-manager-view created
clusterrole.rbac.authorization.k8s.io/cert-manager-edit created
clusterrole.rbac.authorization.k8s.io/cert-manager-controller-approve:cert-manager-io created
clusterrole.rbac.authorization.k8s.io/cert-manager-controller-certificatesigningrequests created
clusterrole.rbac.authorization.k8s.io/cert-manager-webhook:subjectaccessreviews created
clusterrolebinding.rbac.authorization.k8s.io/cert-manager-cainjector created
clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-issuers created
clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers created
clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-certificates created
clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-orders created
clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-challenges created
clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim created
clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-approve:cert-manager-io created
clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-certificatesigningrequests created
clusterrolebinding.rbac.authorization.k8s.io/cert-manager-webhook:subjectaccessreviews created
role.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection created
role.rbac.authorization.k8s.io/cert-manager:leaderelection created
role.rbac.authorization.k8s.io/cert-manager-tokenrequest created
role.rbac.authorization.k8s.io/cert-manager-webhook:dynamic-serving created
rolebinding.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection created
rolebinding.rbac.authorization.k8s.io/cert-manager:leaderelection created
rolebinding.rbac.authorization.k8s.io/cert-manager-cert-manager-tokenrequest created
rolebinding.rbac.authorization.k8s.io/cert-manager-webhook:dynamic-serving created
service/cert-manager-cainjector created
service/cert-manager created
service/cert-manager-webhook created
deployment.apps/cert-manager-cainjector created
deployment.apps/cert-manager created
deployment.apps/cert-manager-webhook created
mutatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created
validatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created

等待证书证书管理器安装完成

[root@master tpls]# kubectl wait --for=condition=Ready pods -n cert-manager -l app.kubernetes.io/instance=cert-manager
#有如下输出表示安装完成
pod/cert-manager-57d855897b-b2czv condition met
pod/cert-manager-cainjector-5c7f79b84b-5pshp condition met
pod/cert-manager-webhook-657b9f664c-vvvs5 condition met
[root@master tpls]# kubectl get pods -n cert-manager
#确认证书管理器正确安装
NAME READY STATUS RESTARTS AGE
cert-manager-57d855897b-b2czv 1/1 Running 0 9m8s
cert-manager-cainjector-5c7f79b84b-5pshp 1/1 Running 0 9m8s
cert-manager-webhook-657b9f664c-vvvs5 1/1 Running 0 9m8s

安装 opentelemetry-operator

[root@master tpls]# kubectl apply -f opentelemetry-operator.yaml
namespace/opentelemetry-operator-system created
customresourcedefinition.apiextensions.k8s.io/instrumentations.opentelemetry.io created
customresourcedefinition.apiextensions.k8s.io/opampbridges.opentelemetry.io created
customresourcedefinition.apiextensions.k8s.io/opentelemetrycollectors.opentelemetry.io created
serviceaccount/opentelemetry-operator-controller-manager created
role.rbac.authorization.k8s.io/opentelemetry-operator-leader-election-role created
clusterrole.rbac.authorization.k8s.io/opentelemetry-operator-manager-role created
clusterrole.rbac.authorization.k8s.io/opentelemetry-operator-metrics-reader created
clusterrole.rbac.authorization.k8s.io/opentelemetry-operator-proxy-role created
rolebinding.rbac.authorization.k8s.io/opentelemetry-operator-leader-election-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/opentelemetry-operator-manager-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/opentelemetry-operator-proxy-rolebinding created
service/opentelemetry-operator-controller-manager-metrics-service created
service/opentelemetry-operator-webhook-service created
deployment.apps/opentelemetry-operator-controller-manager created
certificate.cert-manager.io/opentelemetry-operator-serving-cert created
issuer.cert-manager.io/opentelemetry-operator-selfsigned-issuer created
mutatingwebhookconfiguration.admissionregistration.k8s.io/opentelemetry-operator-mutating-webhook-configuration created
validatingwebhookconfiguration.admissionregistration.k8s.io/opentelemetry-operator-validating-webhook-configuration created
[root@master tpls]# kubectl get pods -n opentelemetry-operator-system
#确认opentelemetry-operator安装完成
NAME READY STATUS RESTARTS AGE
opentelemetry-operator-controller-manager-58c68ff5c6-plkzn 2/2 Running 0 21s
[root@master tpls]# kubectl logs opentelemetry-operator-controller-manager-58c68ff5c6-plkzn -n opentelemetry-operator-system
#确认opentelemetry-operator正确安装【通过日志查看无ERROR级别日志】
Defaulted container "manager" out of: manager, kube-rbac-proxy
{"level":"INFO","timestamp":"2024-12-17T06:15:13.084660532Z","message":"Starting the OpenTelemetry Operator","opentelemetry-operator":"0.114.1","opentelemetry-collector":"ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector:0.114.0","opentelemetry-targetallocator":"ghcr.io/open-telemetry/opentelemetry-operator/target-allocator:0.114.1","operator-opamp-bridge":"ghcr.io/open-telemetry/opentelemetry-operator/operator-opamp-bridge:0.114.1","auto-instrumentation-java":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:1.33.5","auto-instrumentation-nodejs":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:0.53.0","auto-instrumentation-python":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:0.48b0","auto-instrumentation-dotnet":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-dotnet:1.2.0","auto-instrumentation-go":"ghcr.io/open-telemetry/opentelemetry-go-instrumentation/autoinstrumentation-go:v0.17.0-alpha","auto-instrumentation-apache-httpd":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-apache-httpd:1.0.4","auto-instrumentation-nginx":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-apache-httpd:1.0.4","feature-gates":"operator.collector.default.config,-operator.collector.targetallocatorcr,-operator.golang.flags,operator.observability.prometheus,-operator.sidecarcontainers.native,-operator.targetallocator.fallbackstrategy,-operator.targetallocator.mtls","build-date":"2024-12-05T14:36:38Z","go-version":"go1.22.9","go-arch":"amd64","go-os":"linux","labels-filter":[],"annotations-filter":[],"enable-multi-instrumentation":true,"enable-apache-httpd-instrumentation":true,"enable-dotnet-instrumentation":true,"enable-go-instrumentation":false,"enable-python-instrumentation":true,"enable-nginx-instrumentation":true,"enable-nodejs-instrumentation":true,"enable-java-instrumentation":true,"create-openshift-dashboard":false,"zap-message-key":"message","zap-level-key":"level","zap-time-key":"timestamp","zap-level-format":"uppercase"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.085761306Z","logger":"setup","message":"the env var WATCH_NAMESPACE isn't set, watching all namespaces"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.291129199Z","logger":"setup","message":"Prometheus CRDs are not installed, skipping adding to scheme."}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.291223365Z","logger":"setup","message":"Openshift CRDs are not installed, skipping adding to scheme."}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.291239316Z","logger":"setup","message":"Cert-Manager is not available to the operator, skipping adding to scheme."}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.292364665Z","logger":"controller-runtime.builder","message":"Registering a mutating webhook","GVK":"opentelemetry.io/v1beta1, Kind=OpenTelemetryCollector","path":"/mutate-opentelemetry-io-v1beta1-opentelemetrycollector"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.293091003Z","logger":"controller-runtime.webhook","message":"Registering webhook","path":"/mutate-opentelemetry-io-v1beta1-opentelemetrycollector"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.293280876Z","logger":"controller-runtime.builder","message":"Registering a validating webhook","GVK":"opentelemetry.io/v1beta1, Kind=OpenTelemetryCollector","path":"/validate-opentelemetry-io-v1beta1-opentelemetrycollector"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.293506657Z","logger":"controller-runtime.webhook","message":"Registering webhook","path":"/validate-opentelemetry-io-v1beta1-opentelemetrycollector"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.294022343Z","logger":"controller-runtime.webhook","message":"Registering webhook","path":"/convert"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.294060324Z","logger":"controller-runtime.builder","message":"Conversion webhook enabled","GVK":"opentelemetry.io/v1beta1, Kind=OpenTelemetryCollector"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.294156082Z","logger":"controller-runtime.builder","message":"Registering a mutating webhook","GVK":"opentelemetry.io/v1alpha1, Kind=Instrumentation","path":"/mutate-opentelemetry-io-v1alpha1-instrumentation"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.294287297Z","logger":"controller-runtime.webhook","message":"Registering webhook","path":"/mutate-opentelemetry-io-v1alpha1-instrumentation"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.294369558Z","logger":"controller-runtime.builder","message":"Registering a validating webhook","GVK":"opentelemetry.io/v1alpha1, Kind=Instrumentation","path":"/validate-opentelemetry-io-v1alpha1-instrumentation"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.294491744Z","logger":"controller-runtime.webhook","message":"Registering webhook","path":"/validate-opentelemetry-io-v1alpha1-instrumentation"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.294748464Z","logger":"controller-runtime.webhook","message":"Registering webhook","path":"/mutate-v1-pod"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.294839746Z","logger":"controller-runtime.builder","message":"Registering a mutating webhook","GVK":"opentelemetry.io/v1alpha1, Kind=OpAMPBridge","path":"/mutate-opentelemetry-io-v1alpha1-opampbridge"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.295040562Z","logger":"controller-runtime.webhook","message":"Registering webhook","path":"/mutate-opentelemetry-io-v1alpha1-opampbridge"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.295402574Z","logger":"controller-runtime.builder","message":"Registering a validating webhook","GVK":"opentelemetry.io/v1alpha1, Kind=OpAMPBridge","path":"/validate-opentelemetry-io-v1alpha1-opampbridge"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.295563447Z","logger":"controller-runtime.webhook","message":"Registering webhook","path":"/validate-opentelemetry-io-v1alpha1-opampbridge"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.2956694Z","logger":"setup","message":"starting manager"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.295963356Z","logger":"controller-runtime.metrics","message":"Starting metrics server"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.296114435Z","logger":"controller-runtime.webhook","message":"Starting webhook server"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.296298087Z","logger":"controller-runtime.metrics","message":"Serving metrics server","bindAddress":"127.0.0.1:8080","secure":false}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.296046786Z","message":"starting server","name":"health probe","addr":"[::]:8081"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.296885436Z","logger":"controller-runtime.certwatcher","message":"Updated current TLS certificate"}
I1217 06:15:13.296963 1 leaderelection.go:254] attempting to acquire leader lease opentelemetry-operator-system/9f7554c3.opentelemetry.io...
{"level":"INFO","timestamp":"2024-12-17T06:15:13.297181963Z","logger":"controller-runtime.webhook","message":"Serving webhook server","host":"","port":9443}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.297655721Z","logger":"controller-runtime.certwatcher","message":"Starting certificate watcher"}
I1217 06:15:13.706142 1 leaderelection.go:268] successfully acquired lease opentelemetry-operator-system/9f7554c3.opentelemetry.io
{"level":"INFO","timestamp":"2024-12-17T06:15:13.706767102Z","logger":"collector-upgrade","message":"looking for managed instances to upgrade"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.706950261Z","logger":"instrumentation-upgrade","message":"looking for managed Instrumentation instances to upgrade"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.707645102Z","message":"Starting EventSource","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge","source":"kind source: *v1alpha1.OpAMPBridge"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.707760238Z","message":"Starting EventSource","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge","source":"kind source: *v1.ConfigMap"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.707781113Z","message":"Starting EventSource","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge","source":"kind source: *v1.ServiceAccount"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.707798046Z","message":"Starting EventSource","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge","source":"kind source: *v1.Service"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.707812065Z","message":"Starting EventSource","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge","source":"kind source: *v1.Deployment"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.707823304Z","message":"Starting Controller","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.708510704Z","message":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1beta1.OpenTelemetryCollector"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.708638225Z","message":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.ConfigMap"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.708685139Z","message":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.ServiceAccount"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.708709076Z","message":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.Service"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.708744207Z","message":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.Deployment"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.708771699Z","message":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.DaemonSet"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.708811888Z","message":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.StatefulSet"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.708827911Z","message":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.Ingress"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.708850083Z","message":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v2.HorizontalPodAutoscaler"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.708922276Z","message":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.PodDisruptionBudget"}
{"level":"INFO","timestamp":"2024-12-17T06:15:13.708953283Z","message":"Starting Controller","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector"}
{"level":"INFO","timestamp":"2024-12-17T06:15:14.218120231Z","logger":"collector-upgrade","message":"no instances to upgrade"}
{"level":"INFO","timestamp":"2024-12-17T06:15:14.251332517Z","message":"Starting workers","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","worker count":1}
{"level":"INFO","timestamp":"2024-12-17T06:15:14.254833405Z","logger":"instrumentation-upgrade","message":"no instances to upgrade"}
{"level":"INFO","timestamp":"2024-12-17T06:15:14.255402031Z","message":"Starting workers","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge","worker count":1}
[root@master tpls]#