Collect Docker metrics with Prometheus

Estimated reading time: 8 minutes

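Prometheus is an open-source systems monitoring and alerting toolkit. You can configure Docker as a Prometheus target. This topic shows you how to configure Docker, set up Prometheus to run as a Docker container, and monitor your Docker instance using Prometheus.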

Warning: The available metrics and the names of those metrics are in active development and may change at any time.

Currently, you can only monitor Docker itself. You cannot yet monitor your application using the Docker target.

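Configure Docker

To configure the Docker daemon as a Prometheus target, you need to specify the metrics-addr setting. The best way to do this is through daemon.json, which is located at one of the following locations by default. If the file does not exist, create it.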

Linux: /etc/docker/daemon.json
Windows Server: C:\ProgramData\docker\config\daemon.json
Docker Desktop for Mac / Docker Desktop for Windows: Click the Docker icon in the toolbar, select Preferences, then select Daemon. Click Advanced.

If the file is currently empty, paste the following:

{
  "metrics-addr" : "127.0.0.1:9323",
  "experimental" : true
}

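If the file is not empty, add those two keys, making sure that the resulting file is valid JSON. Note that every line ends with a comma (,) except for the last line.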
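Merging the two keys by hand is where stray or missing commas tend to creep in. The following is a minimal sketch, not part of Docker itself, that adds the keys to existing daemon.json content and re-serializes it so the result is always valid JSON. The function name add_metrics_settings and the "debug" key in the example are illustrative assumptions.

```python
import json


def add_metrics_settings(existing_text: str, addr: str = "127.0.0.1:9323") -> str:
    """Return daemon.json content with the two metrics keys added."""
    # Treat an empty or whitespace-only file as an empty JSON object.
    config = json.loads(existing_text) if existing_text.strip() else {}
    if not isinstance(config, dict):
        raise ValueError("daemon.json must contain a JSON object")
    # The two keys described in this section.
    config["metrics-addr"] = addr
    config["experimental"] = True
    # json.dumps handles comma placement, so the output is valid JSON.
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Hypothetical existing file with one unrelated setting.
    print(add_metrics_settings('{"debug": true}'))
```

Any keys already present are kept as they are; only the two metrics keys are added or overwritten.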

Save the file, or in the case of Docker Desktop for Mac or Docker Desktop for Windows, save the configuration. Restart Docker.

Docker now exposes Prometheus-compatible metrics on port 9323.
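One way to confirm that the endpoint responds is to fetch it from the Docker host and look at the samples. The sketch below assumes the address configured above and the default /metrics path, and it parses only the simple "series value" line shape of the Prometheus text format, without timestamps. The helper names fetch_metrics and parse_metrics are illustrative.

```python
import urllib.request


def fetch_metrics(url: str = "http://127.0.0.1:9323/metrics", timeout: float = 5.0) -> str:
    """Download the raw metrics text from the Docker daemon."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_metrics(text: str) -> dict:
    """Map each series (name plus labels) to its float value.

    Assumes lines of the form 'series value' with no trailing timestamp.
    Comment lines (# HELP, # TYPE) and blank lines are skipped.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the last space so label values containing spaces survive.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples
```

Calling parse_metrics(fetch_metrics()) on the Docker host should return a dictionary of series names and values if the daemon was restarted with the settings above.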

Configure and run Prometheus

Prometheus runs as a Docker service on a Docker swarm.

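Prerequisites

One or more Docker engines are joined into a Docker swarm, using docker swarm init on one manager and docker swarm join on the other managers and worker nodes.

You need an internet connection to pull the Prometheus image.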

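Copy one of the following configuration files and save it to /tmp/prometheus.yml (Linux or Mac) or C:\tmp\prometheus.yml (Windows). This is a stock Prometheus configuration file, except for the addition of the Docker job definition at the bottom of the file. Docker Desktop for Mac and Docker Desktop for Windows need a slightly different configuration.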

Docker for Linux:

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
      monitor: 'codelab-monitor'

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first.rules"
  # - "second.rules"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'docker'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9323']

Docker Desktop for Mac:

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
      monitor: 'codelab-monitor'

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first.rules"
  # - "second.rules"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['host.docker.internal:9090'] # Only works on Docker Desktop for Mac

  - job_name: 'docker'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['docker.for.mac.host.internal:9323']

Docker Desktop for Windows:

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
      monitor: 'codelab-monitor'

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first.rules"
  # - "second.rules"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['host.docker.internal:9090'] # Only works on Docker Desktop for Windows

  - job_name: 'docker'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['192.168.65.1:9323']

Next, start a single-replica Prometheus service using this configuration.

Docker for Linux and Docker Desktop for Mac:

$ docker service create --replicas 1 --name my-prometheus \
    --mount type=bind,source=/tmp/prometheus.yml,destination=/etc/prometheus/prometheus.yml \
    --publish published=9090,target=9090,protocol=tcp \
    prom/prometheus

Docker Desktop for Windows (PowerShell):

PS C:\> docker service create --replicas 1 --name my-prometheus `
    --mount type=bind,source=C:/tmp/prometheus.yml,destination=/etc/prometheus/prometheus.yml `
    --publish published=9090,target=9090,protocol=tcp `
    prom/prometheus

Verify that the Docker target is listed at http://localhost:9090/targets/.

(Screenshot: Prometheus targets page)

If you use Docker Desktop for Mac or Docker Desktop for Windows, you cannot access the endpoint URLs directly.
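The same check can also be scripted against the Prometheus HTTP API instead of the targets page. The sketch below assumes Prometheus is published on localhost:9090 as in the service definition above and reads the /api/v1/targets endpoint. The helper names fetch_targets and target_health are illustrative.

```python
import json
import urllib.request


def fetch_targets(base_url: str = "http://localhost:9090", timeout: float = 5.0) -> dict:
    """Fetch the target list from the Prometheus HTTP API."""
    with urllib.request.urlopen(base_url + "/api/v1/targets", timeout=timeout) as resp:
        return json.load(resp)


def target_health(payload: dict, job: str = "docker") -> list:
    """Return (scrapeUrl, health) pairs for the active targets of one job."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return [
        (t.get("scrapeUrl"), t.get("health"))
        for t in targets
        if t.get("labels", {}).get("job") == job
    ]
```

With the job name 'docker' from the configuration file, target_health(fetch_targets()) should list the Docker endpoint with the health value "up" once the first scrape has succeeded.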

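Use Prometheus

Create a graph. Click the Graphs link in the Prometheus UI. Choose a metric from the combo box to the right of the Execute button, and click Execute. The screenshot below shows the graph for engine_daemon_network_actions_seconds_count.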

(Screenshot: Prometheus engine_daemon_network_actions_seconds_count report)

The graph above shows a fairly idle Docker instance. If you are running active workloads, your graph may look different.
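The metric can also be read without the UI through the Prometheus HTTP query API. This is a minimal sketch that assumes Prometheus is reachable at localhost:9090. It builds an instant-query URL for /api/v1/query and extracts the values from a response of result type "vector". The helper names query_url and vector_values are illustrative.

```python
from urllib.parse import urlencode


def query_url(expr: str, base_url: str = "http://localhost:9090") -> str:
    """Build an instant-query URL for a PromQL expression."""
    return f"{base_url}/api/v1/query?{urlencode({'query': expr})}"


def vector_values(payload: dict) -> list:
    """Return (labels, value) pairs from an instant-query response.

    Each result carries its sample as [timestamp, "value"], with the
    value encoded as a string, so it is converted to float here.
    """
    results = payload.get("data", {}).get("result", [])
    return [(r["metric"], float(r["value"][1])) for r in results]
```

The URL returned by query_url("engine_daemon_network_actions_seconds_count") can be opened with any HTTP client, and the decoded JSON body passed to vector_values.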

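To make the graph more interesting, create some network actions by starting a service with 10 tasks that just ping Docker non-stop (you can change the ping target to anything you like):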

$ docker service create \
  --replicas 10 \
  --name ping_service \
  alpine ping docker.com

Wait a few minutes (the configuration above sets the scrape interval to 15 seconds), then reload your graph.

(Screenshot: Prometheus engine_daemon_network_actions_seconds_count report)

When you are ready, stop and remove the ping_service service, so that you are not flooding a host with pings for no reason.

$ docker service remove ping_service

Wait a few minutes and you should see the graph fall back to the idle level.

Next steps

Read the Prometheus documentation
Set up some alerts
