VMAlertmanagerConfig
The example in this document covers the alert-muting use case; other uses are still to be added.
CRD
yaml
KIND:     VMAlertmanagerConfig
VERSION:  operator.victoriametrics.com/v1beta1

RESOURCE: spec <Object>

DESCRIPTION:
     VMAlertmanagerConfigSpec defines configuration for VMAlertmanagerConfig

FIELDS:
   - <string>

   inhibit_rules <[]Object>
     InhibitRules will only apply for alerts matching the resource's namespace.

   mute_time_intervals <[]Object>
     MuteTimeInterval - global mute time See
     https://prometheus.io/docs/alerting/latest/configuration/#mute_time_interval

   receivers <[]Object>
     Receivers defines alert receivers. without defined Route, receivers will be
     skipped.

   route <Object>
     Route definition for alertmanager, may include nested routes.

   time_intervals <[]Object>
     ParsingError contents error with context if operator was failed to parse
     json object from kubernetes api server TimeIntervals modern config option,
     use it instead of mute_time_intervals
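
As a rough orientation for how the top-level fields listed above relate to each other, the following minimal sketch wires one receiver, one route, and one named time interval together. The resource name `minimal-example`, the namespace `monitoring`, the receiver name, the webhook URL, and the interval name `lunch_break` are all placeholders chosen for illustration, not values from a real cluster.

```yaml
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMAlertmanagerConfig
metadata:
  name: minimal-example          # hypothetical name
  namespace: monitoring          # hypothetical namespace
spec:
  # Per the field description, receivers are skipped unless a route is defined.
  receivers:
    - name: default-receiver     # hypothetical receiver name
      webhook_configs:
        - url: http://example.invalid/hook   # placeholder URL
  route:
    receiver: default-receiver   # required; refers to the receiver defined above
    mute_time_intervals:
      - lunch_break              # refers to a name under spec.time_intervals
  # time_intervals is the option the CRD recommends over spec.mute_time_intervals.
  time_intervals:
    - name: lunch_break
      time_intervals:
        - times:
            - start_time: "12:00"
              end_time: "13:00"
```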
KIND:     VMAlertmanagerConfig
VERSION:  operator.victoriametrics.com/v1beta1

RESOURCE: route <Object>

DESCRIPTION:
     Route definition for alertmanager, may include nested routes.

FIELDS:
   active_time_intervals <[]string>
     ActiveTimeIntervals Times when the route should be active These must match
     the name at time_intervals

   continue <boolean>
     Continue indicating whether an alert should continue matching subsequent
     sibling nodes. It will always be true for the first-level route if
     disableRouteContinueEnforce for vmalertmanager not set.

   group_by <[]string>
     List of labels to group by.

   group_interval <string>
     How long to wait before sending an updated notification.

   group_wait <string>
     How long to wait before sending the initial notification.

   matchers <[]string>
     List of matchers that the alert’s labels should match. For the first
     level route, the operator adds a namespace: "CRD_NS" matcher.
     https://prometheus.io/docs/alerting/latest/configuration/#matcher

   mute_time_intervals <[]string>
     MuteTimeIntervals for alerts

   receiver <string> -required-
     Name of the receiver for this route.

   repeat_interval <string>
     How long to wait before repeating the last notification.

   routes <[]>
     Child routes.
     https://prometheus.io/docs/alerting/latest/configuration/#route
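
To make the route fields above more concrete, here is a hedged sketch of a first-level route with two child routes, one of which is only active during a named interval. It is a fragment of `spec` only: the receivers `team-default`, `team-pager`, and `team-chat` would still have to be defined under `spec.receivers`, and all of these names, as well as `office_hours`, are hypothetical. The `monday:friday` range follows the Alertmanager weekday-range convention and is assumed to be passed through unchanged.

```yaml
spec:
  route:
    receiver: team-default                 # required on every route
    group_by: ["alertname", "namespace"]
    group_wait: 30s
    group_interval: 5m
    repeat_interval: 4h
    # No explicit namespace matcher here: per the field description, the
    # operator adds one for the first-level route.
    routes:
      # Child route: critical alerts go to the pager and stop matching here.
      - receiver: team-pager
        matchers:
          - severity="critical"
        continue: false
      # Child route: warnings are delivered only while office_hours is active.
      - receiver: team-chat
        matchers:
          - severity="warning"
        active_time_intervals:
          - office_hours                   # must match a name under time_intervals
  time_intervals:
    - name: office_hours
      time_intervals:
        - times:
            - start_time: "09:00"
              end_time: "18:00"
          weekdays: ["monday:friday"]
          location: "Asia/Shanghai"
```

`active_time_intervals` and `mute_time_intervals` are opposites: the first names the windows in which a route may notify, the second names the windows in which it stays silent.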
KIND:     VMAlertmanagerConfig
VERSION:  operator.victoriametrics.com/v1beta1

RESOURCE: time_intervals <[]Object>

DESCRIPTION:
     ParsingError contents error with context if operator was failed to parse
     json object from kubernetes api server TimeIntervals modern config option,
     use it instead of mute_time_intervals

     MuteTimeInterval for alerts

FIELDS:
   name <string>
     Name of interval

   time_intervals <[]Object> -required-
     TimeIntervals interval configuration
KIND:     VMAlertmanagerConfig
VERSION:  operator.victoriametrics.com/v1beta1

RESOURCE: time_intervals <[]Object>

DESCRIPTION:
     TimeIntervals interval configuration

     TimeInterval defines intervals of time

FIELDS:
   days_of_month <[]string>
     DayOfMonth defines list of numerical days in the month. Days begin at 1.
     Negative values are also accepted. for example, ['1:5', '-3:-1']

   location <string>
     Location in golang time location form, e.g. UTC

   months <[]string>
     Months defines list of calendar months identified by a case-insentive name
     (e.g. ‘January’) or numeric 1. For example, ['1:3', 'may:august',
     'december']

   times <[]Object>
     Times defines time range for mute

   weekdays <[]string>
     Weekdays defines list of days of the week, where the week begins on Sunday
     and ends on Saturday.

   years <[]string>
     Years defines numerical list of years, ranges are accepted. For example,
     ['2020:2022', '2030']

Example
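Multiple VMAlertmanagerConfig objects can be created. They are then evaluated in a fixed order, sorted by metadata.name.
Receivers in different VMAlertmanagerConfig objects may share the same receivers.name.
When several VMAlertmanagerConfig objects use identical matchers, decide case by case whether to enable continue, so that each object can apply its own mute window.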
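
One possible reading of the remark about `continue` is sketched below: two objects match the same alert but deliver to different endpoints, each with its own quiet window. With `continue: true` on the first route, the alert is also evaluated against the second object. All object names, the namespace, the URLs, and the interval names are hypothetical. Note that, according to the CRD field description, `continue` is forced to true on first-level routes unless `disableRouteContinueEnforce` is set on the VMAlertmanager, so the explicit value may only matter when that option is enabled.

```yaml
# Sorted by metadata.name, so "a-..." is evaluated before "b-...".
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMAlertmanagerConfig
metadata:
  name: a-es-yellow-chat           # hypothetical
  namespace: monitoring            # hypothetical
spec:
  receivers:
    - name: notify                 # same receiver name as in the second object
      webhook_configs:
        - url: http://example.invalid/chat    # placeholder URL
  route:
    receiver: notify
    continue: true                 # keep evaluating the next object
    matchers:
      - alertname="ElasticsearchClusterYellow"
    mute_time_intervals:
      - quiet_midday
  time_intervals:
    - name: quiet_midday
      time_intervals:
        - times:
            - start_time: "11:00"
              end_time: "14:00"
          location: "Asia/Shanghai"
---
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMAlertmanagerConfig
metadata:
  name: b-es-yellow-oncall         # hypothetical
  namespace: monitoring            # hypothetical
spec:
  receivers:
    - name: notify
      webhook_configs:
        - url: http://example.invalid/oncall  # placeholder URL
  route:
    receiver: notify
    matchers:
      - alertname="ElasticsearchClusterYellow"   # identical matcher
    mute_time_intervals:
      - quiet_evening
  time_intervals:
    - name: quiet_evening
      time_intervals:
        - times:
            - start_time: "20:00"
              end_time: "22:00"
          location: "Asia/Shanghai"
```

If the first route did not continue, an alert matched there would never reach the second object, and its mute window would have no effect.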
a-a-infra-uat-evfk-esyellow-inhibition-test is an example used to test the mute time configuration for the Elasticsearch cluster yellow-status alert.
yaml
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMAlertmanagerConfig
metadata:
  name: a-a-infra-uat-evfk-esyellow-inhibition-test
  namespace: insight-system
spec:
  # Define the alert receivers
  receivers:
    - name: test-100-receivers
      webhook_configs:
        - send_resolved: true
          url: http://172.31.142.251:8080/prometheusalert?type=dd&tpl=prd-dingidng
  # Alert routing: match specific alerts and reference the mute time interval
  route:
    continue: false       # stop matching further routes/configs after a match
    receiver: test-100-receivers
    group_wait: 30s       # wait 30s after the first alert of a group, then send the batch
    group_interval: 5m    # interval before sending again when new alerts join the group
    repeat_interval: 1h   # interval for re-sending an unchanged alert
    # Match the specific alert to be muted
    matchers:
      - namespace="evfk-system"
      - alertgroup="murphy-test-evfk"
      - alertname="ElasticsearchClusterYellow"
      - severity="info"
    # Reference the mute time interval: matching alerts are muted during daily_11_to_14
    mute_time_intervals:
      - daily_11_to_14
  # Mute time interval definitions
  time_intervals:
    - name: daily_11_to_14
      # Nested time interval configuration
      time_intervals:
        - times:
            - start_time: "11:00"
              end_time: "14:00"
          # Set the time zone explicitly to Beijing time (avoids a UTC offset)
          location: "Asia/Shanghai"
          # No restriction on day of month, month, year, or weekday (active every day)
          days_of_month: []
          months: []
          years: []
          weekdays: []
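
The example above leaves `days_of_month`, `months`, `years`, and `weekdays` empty, so the window applies every day. As a hedged variant, the fragment below restricts the windows with those fields. It shows `spec.time_intervals` only, and the interval names are hypothetical. The range syntax is taken from the CRD field descriptions; the weekday names and the `monday:friday` range follow the Alertmanager convention and are assumed to be accepted unchanged.

```yaml
time_intervals:
  # Lunch window on working days only.
  - name: weekday_11_to_14
    time_intervals:
      - times:
          - start_time: "11:00"
            end_time: "14:00"
        weekdays: ["monday:friday"]
        location: "Asia/Shanghai"
  # Whole weekend, or the last three days of any month.
  - name: weekend_or_month_end
    time_intervals:
      # Entry 1: all day Saturday and Sunday (no times given).
      - weekdays: ["saturday", "sunday"]
        location: "Asia/Shanghai"
      # Entry 2: negative values count from the end of the month.
      - days_of_month: ["-3:-1"]
        location: "Asia/Shanghai"
```

Following Alertmanager semantics, the fields inside one entry must all match, while the separate entries of a named interval are alternatives: the interval applies when any one of them matches.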