Collect Docker metrics with Prometheus
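Prometheus is an open-source systems monitoring and alerting toolkit. You can configure Docker as a Prometheus target. This topic shows you how to configure Docker, set up Prometheus to run as a Docker container, and monitor your Docker instance using Prometheus.
Warning: The available metrics and the names of those metrics are in active development and may change at any time.
Currently, you can only monitor Docker itself. You cannot yet monitor your application using the Docker target.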
Configure Docker
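To configure the Docker daemon as a Prometheus target, you need to specify the metrics-addr setting. The best way to do this is through daemon.json, which is located at one of the following locations by default. If the file does not exist, create it.
- Linux: /etc/docker/daemon.json
- Windows Server: C:\ProgramData\docker\config\daemon.json
- Docker Desktop for Mac / Docker Desktop for Windows: Click the Docker icon in the toolbar, select Preferences, then select Daemon. Click Advanced.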
If the file is currently empty, paste the following:
{
  "metrics-addr" : "127.0.0.1:9323",
  "experimental" : true
}
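If the file is not empty, add those two keys, making sure that the resulting file is valid JSON. In particular, every line must end with a comma (,) except for the last line.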
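When daemon.json already holds other settings, hand-editing makes it easy to leave a stray or missing comma. As a minimal sketch (not part of the original instructions), the two keys could instead be merged programmatically so that the result is always valid JSON. The function name merge_metrics_settings and the sample content are hypothetical; the key names and values are the ones shown in this section.

```python
import json


def merge_metrics_settings(existing_text, metrics_addr="127.0.0.1:9323"):
    """Return daemon.json content with the two metrics keys added.

    existing_text is the current file content; an empty or
    whitespace-only string is treated as an empty configuration.
    """
    # json.loads raises json.JSONDecodeError if the current file is invalid.
    config = json.loads(existing_text) if existing_text.strip() else {}
    if not isinstance(config, dict):
        raise ValueError("daemon.json must contain a JSON object")
    config["metrics-addr"] = metrics_addr
    config["experimental"] = True
    # json.dumps takes care of commas and quoting in the output.
    return json.dumps(config, indent=2) + "\n"


# Hypothetical existing content, used only for illustration.
sample = '{"debug": true}'
print(merge_metrics_settings(sample))
```

You would then write the returned text back to the daemon.json path for your platform and restart Docker.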
Save the file, or in the case of Docker Desktop for Mac or Docker Desktop for Windows, save the configuration. Restart Docker.
Docker now exposes Prometheus-compatible metrics on port 9323.
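To confirm that the endpoint is live, you can request it and list the metric names it returns. The sketch below assumes the 127.0.0.1:9323 address configured above and the /metrics path that Prometheus scrapes by default. The helper names fetch_metrics and metric_names are hypothetical, and the sample text is hand-written for illustration rather than captured from a real daemon.

```python
import re
import urllib.request


def fetch_metrics(url="http://127.0.0.1:9323/metrics", timeout=5):
    """Download the raw metrics text from the Docker daemon."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def metric_names(exposition_text):
    """Return the sorted, de-duplicated metric names in the text."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines such as "# HELP" and "# TYPE".
        if not line or line.startswith("#"):
            continue
        # The name ends at the first "{" (labels) or whitespace (value).
        names.add(re.split(r"[{\s]", line, maxsplit=1)[0])
    return sorted(names)


# Illustrative input only; real output depends on your Docker version.
SAMPLE = """\
# HELP engine_daemon_network_actions_seconds Illustrative help text.
engine_daemon_network_actions_seconds_count{action="allocate"} 3
engine_daemon_network_actions_seconds_count{action="connect"} 1
"""
print(metric_names(SAMPLE))
```

On a host where Docker has been restarted with the new settings, metric_names(fetch_metrics()) should return the names currently exposed. Because the metrics are still in development, treat that list as subject to change.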
Configure and run Prometheus
Prometheus runs as a Docker service on a Docker swarm.
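Prerequisites
- One or more Docker engines are joined into a Docker swarm, using docker swarm init on one manager and docker swarm join on the other managers and worker nodes.
- You need an internet connection to pull the Prometheus image.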
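Copy one of the following configuration files and save it to /tmp/prometheus.yml (Linux or Mac) or C:\tmp\prometheus.yml (Windows). This is a stock Prometheus configuration file, except for the Docker job definition added at the bottom of the file. Docker Desktop for Mac and Docker Desktop for Windows need slightly different configurations.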
Linux:
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first.rules"
  # - "second.rules"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'docker'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['localhost:9323']
Docker Desktop for Mac:
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first.rules"
  # - "second.rules"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['host.docker.internal:9090'] # Only works on Docker Desktop for Mac

  - job_name: 'docker'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['docker.for.mac.host.internal:9323']
Docker Desktop for Windows:
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first.rules"
  # - "second.rules"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['host.docker.internal:9090'] # Only works on Docker Desktop for Windows

  - job_name: 'docker'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['192.168.65.1:9323']
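The three files above are identical apart from the scrape targets. As a rough illustration of that difference (a sketch, not a replacement for the files), the snippet below renders only the scrape_configs section for a given platform. The TARGETS table copies the addresses from the files above; the function name render_scrape_configs and the platform keys are hypothetical.

```python
# Scrape targets copied from the three configuration files above.
TARGETS = {
    "linux": {
        "prometheus": "localhost:9090",
        "docker": "localhost:9323",
    },
    "mac": {
        "prometheus": "host.docker.internal:9090",
        "docker": "docker.for.mac.host.internal:9323",
    },
    "windows": {
        "prometheus": "host.docker.internal:9090",
        "docker": "192.168.65.1:9323",
    },
}


def render_scrape_configs(platform):
    """Return the scrape_configs YAML section for one platform."""
    targets = TARGETS[platform]  # raises KeyError for unknown platforms
    lines = ["scrape_configs:"]
    for job in ("prometheus", "docker"):
        lines.append(f"  - job_name: '{job}'")
        lines.append("    static_configs:")
        lines.append(f"      - targets: ['{targets[job]}']")
    return "\n".join(lines)


print(render_scrape_configs("windows"))
```

The global and rule_files sections are the same on every platform, so only this final section needs to change when you move the file between hosts.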
Next, start a single-replica Prometheus service using this configuration.
Linux or Docker Desktop for Mac:
$ docker service create --replicas 1 --name my-prometheus \
    --mount type=bind,source=/tmp/prometheus.yml,destination=/etc/prometheus/prometheus.yml \
    --publish published=9090,target=9090,protocol=tcp \
    prom/prometheus
Docker Desktop for Windows (PowerShell):
PS C:\> docker service create --replicas 1 --name my-prometheus `
    --mount type=bind,source=C:/tmp/prometheus.yml,destination=/etc/prometheus/prometheus.yml `
    --publish published=9090,target=9090,protocol=tcp `
    prom/prometheus
Verify that the Docker target is listed at http://localhost:9090/targets/.
If you use Docker Desktop for Mac or Docker Desktop for Windows, you cannot access the endpoint URLs directly.
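The same check can be scripted against the Prometheus HTTP API, which reports target state at /api/v1/targets. The sketch below assumes Prometheus is published on localhost:9090 as in the service definition above; fetch_targets and job_health are hypothetical helper names, and the sample response is a hand-written fragment, not captured output.

```python
import json
import urllib.request


def fetch_targets(base_url="http://localhost:9090", timeout=5):
    """Call the Prometheus targets API and return the decoded JSON body."""
    url = f"{base_url}/api/v1/targets"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def job_health(targets_response, job="docker"):
    """Map each scrape URL of the given job to its reported health."""
    active = targets_response.get("data", {}).get("activeTargets", [])
    return {
        target["scrapeUrl"]: target["health"]
        for target in active
        if target.get("labels", {}).get("job") == job
    }


# Hand-written response fragment for illustration only.
SAMPLE = {
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "labels": {"job": "prometheus", "instance": "localhost:9090"},
                "scrapeUrl": "http://localhost:9090/metrics",
                "health": "up",
            },
            {
                "labels": {"job": "docker", "instance": "localhost:9323"},
                "scrapeUrl": "http://localhost:9323/metrics",
                "health": "up",
            },
        ]
    },
}
print(job_health(SAMPLE))
```

Against a running service, job_health(fetch_targets()) should contain one entry for the docker job. A health value other than "up" usually means Prometheus cannot reach the address in the targets list.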
Use Prometheus
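Create a graph. Click the Graphs link in the Prometheus UI. Choose a metric from the combo box to the right of the Execute button, and click Execute.
[Screenshot: Prometheus graph of engine_daemon_network_actions_seconds_count]
The graph in the screenshot shows a pretty idle Docker instance. Your graph might look different if you are running active workloads.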
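To make the graph more interesting, create some network actions by starting a service with 10 tasks that just ping Docker non-stop (you can change the ping target to anything you like):
$ docker service create \
    --replicas 10 \
    --name ping_service \
    alpine ping docker.com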
Wait a few minutes (the scrape interval in this configuration is 15 seconds) and reload your graph.
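If you prefer to read the numbers rather than reload the graph, the Prometheus HTTP API accepts the same expression at /api/v1/query. This is a minimal sketch assuming Prometheus is reachable at localhost:9090; query_url, run_query, and latest_values are hypothetical helper names, and the sample response is hand-written with made-up values.

```python
import json
import urllib.parse
import urllib.request


def query_url(expression, base_url="http://localhost:9090"):
    """Build the instant-query URL for a PromQL expression."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": expression})


def run_query(expression, base_url="http://localhost:9090", timeout=5):
    """Send the instant query and return the decoded JSON body."""
    with urllib.request.urlopen(query_url(expression, base_url), timeout=timeout) as response:
        return json.load(response)


def latest_values(query_response):
    """Return (labels, value) pairs from an instant-query response."""
    results = query_response.get("data", {}).get("result", [])
    # Each sample is [timestamp, "value-as-string"].
    return [(item["metric"], float(item["value"][1])) for item in results]


# Hand-written response with made-up values, for illustration only.
SAMPLE = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "__name__": "engine_daemon_network_actions_seconds_count",
                    "job": "docker",
                },
                "value": [1700000000.0, "12"],
            }
        ],
    },
}
for labels, value in latest_values(SAMPLE):
    print(labels["__name__"], value)
```

Calling latest_values(run_query("engine_daemon_network_actions_seconds_count")) before and after starting ping_service gives two sets of counter values to compare.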
When you are ready, stop and remove the ping_service service, so that you are not flooding a host with pings for no reason.
$ docker service remove ping_service
Wait a few minutes and you should see the graph fall back to the idle level.
Next steps
- Read the Prometheus documentation
- Set up some alerts