Outages like: |
[db-postgresql1] cpu usage is over 99%
2021-05-30 02:21 (15 minutes)
|
Found similar: |
[nosql2] cpu usage is over 99%
possible false positive due to
Node cl1ojovqhk3qm0ungvip-ujoh had UDP packet errors.
Restarting the collectors on this node helped.
Useful graphs:
https://alfa.okmetric.com/okmeter/graph?duration=1h&graph_config=group_by%3A%20source_hostname%0Aoptions%3A%0A%20%20y_title%3A%20connections%0Alines%3A%0A%20%20-%20expression%3A%20%22sum_by(instance%2C%20metric(name%3D%27collector.tcp_collector.connections.count%27))%20*%20max_by(instance%2C%20defined(metric(name%3D%27collector.tcp_collector.connections.gc.time.sum%27)))%22%0A%20%20%20%20type%3A%20area%0A%20%20-%20expression%3A%20%22sum(max_by(instance%2C%20metric(name%3D%27collector.tcp_collector.connections.threshold%27))%20%20*%20max_by(instance%2C%20defined(metric(name%3D%27collector.tcp_collector.connections.gc.time.sum%27))))%22%0A%20%20%20%20legend%3A%20configured%20threshold%0Aevents%3A%0A%20%20-%20collector-errors%0A%20%20-%20tcp_collector_critical%0Atitle%3A%20Active%20connections%0A
https://alfa.okmetric.com/okmeter/hosts/cl1ojovqhk3qm0ungvip-ujoh/netstat?&duration=1h
2021-05-27 18:01 (1 minute)
|
[db-postgresql1] cpu usage is over 99%
possible false positive due to
Node cl1ojovqhk3qm0ungvip-ujoh had UDP packet errors.
Restarting the collectors on this node helped.
Useful graphs:
https://alfa.okmetric.com/okmeter/graph?duration=1h&graph_config=group_by%3A%20source_hostname%0Aoptions%3A%0A%20%20y_title%3A%20connections%0Alines%3A%0A%20%20-%20expression%3A%20%22sum_by(instance%2C%20metric(name%3D%27collector.tcp_collector.connections.count%27))%20*%20max_by(instance%2C%20defined(metric(name%3D%27collector.tcp_collector.connections.gc.time.sum%27)))%22%0A%20%20%20%20type%3A%20area%0A%20%20-%20expression%3A%20%22sum(max_by(instance%2C%20metric(name%3D%27collector.tcp_collector.connections.threshold%27))%20%20*%20max_by(instance%2C%20defined(metric(name%3D%27collector.tcp_collector.connections.gc.time.sum%27))))%22%0A%20%20%20%20legend%3A%20configured%20threshold%0Aevents%3A%0A%20%20-%20collector-errors%0A%20%20-%20tcp_collector_critical%0Atitle%3A%20Active%20connections%0A
https://alfa.okmetric.com/okmeter/hosts/cl1ojovqhk3qm0ungvip-ujoh/netstat?&duration=1h
2021-05-27 17:37 (18 minutes)
|
[db-postgresql1] cpu usage is over 99%
possible false positive due to
Node cl1ojovqhk3qm0ungvip-ujoh had UDP packet errors.
Restarting the collectors on this node helped.
Useful graphs:
https://alfa.okmetric.com/okmeter/graph?duration=1h&graph_config=group_by%3A%20source_hostname%0Aoptions%3A%0A%20%20y_title%3A%20connections%0Alines%3A%0A%20%20-%20expression%3A%20%22sum_by(instance%2C%20metric(name%3D%27collector.tcp_collector.connections.count%27))%20*%20max_by(instance%2C%20defined(metric(name%3D%27collector.tcp_collector.connections.gc.time.sum%27)))%22%0A%20%20%20%20type%3A%20area%0A%20%20-%20expression%3A%20%22sum(max_by(instance%2C%20metric(name%3D%27collector.tcp_collector.connections.threshold%27))%20%20*%20max_by(instance%2C%20defined(metric(name%3D%27collector.tcp_collector.connections.gc.time.sum%27))))%22%0A%20%20%20%20legend%3A%20configured%20threshold%0Aevents%3A%0A%20%20-%20collector-errors%0A%20%20-%20tcp_collector_critical%0Atitle%3A%20Active%20connections%0A
https://alfa.okmetric.com/okmeter/hosts/cl1ojovqhk3qm0ungvip-ujoh/netstat?&duration=1h
2021-05-27 17:26 (4 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 17:11 (4 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 16:46 (9 minutes)
|
[nosql2] cpu usage is over 99%
2021-05-27 16:44 (1 minute)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 16:16 (7 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 15:36 (36 minutes)
|
[db-postgresql1] cpu usage is over 99%
possible false positive due to
okmeter outage
2021-05-27 15:21 (6 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 14:56 (16 minutes)
|
[nosql2] cpu usage is over 99%
2021-05-27 14:01 (1 minute)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 14:01 (34 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 13:41 (12 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 13:31 (5 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 13:18 (4 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 13:05 (1 minute)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 12:51 (7 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 12:41 (4 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 12:26 (5 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 12:06 (9 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 11:56 (2 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 11:22 (15 minutes)
|
[db-postgresql1] cpu usage is over 99%
possible false positive due to
Brief loss of metrics. https://alfa.okmetric.com/okmeter/graph?duration=1h&graph_config=options%3A%0A%20%20y_title%3A%20metrics%2Fsecond%0Alines%3A%0A%20%20-%20expression%3A%20%27top(5%2C%20sum_by(site_name%2C%20metric(name%3D%5B%22gokserver.collector.metrics_written%22%2C%20%22gokserver.tcp_collector.metrics_written.rate%22%2C%20%22collector.tcp_collector.metrics_written.rate%22%5D)))%27%0A%20%20%20%20type%3A%20area%0A%20%20%20%20legend%3A%20%27%25(site_name)s%27%0Aevents%3A%0A%20%20-%20collector-errors%0A%20%20-%20tcp_collector_critical%0Atitle%3A%20Metrics%20written%0A
2021-05-27 11:12 (2 minutes)
|
[db-postgresql1] cpu usage is over 99%
possible false positive due to
Brief loss of metrics. https://alfa.okmetric.com/okmeter/graph?duration=1h&graph_config=options%3A%0A%20%20y_title%3A%20metrics%2Fsecond%0Alines%3A%0A%20%20-%20expression%3A%20%27top(5%2C%20sum_by(site_name%2C%20metric(name%3D%5B%22gokserver.collector.metrics_written%22%2C%20%22gokserver.tcp_collector.metrics_written.rate%22%2C%20%22collector.tcp_collector.metrics_written.rate%22%5D)))%27%0A%20%20%20%20type%3A%20area%0A%20%20%20%20legend%3A%20%27%25(site_name)s%27%0Aevents%3A%0A%20%20-%20collector-errors%0A%20%20-%20tcp_collector_critical%0Atitle%3A%20Metrics%20written%0A
2021-05-27 11:02 (1 minute)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 10:41 (4 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 10:01 (4 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 09:36 (9 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 09:26 (6 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 09:16 (5 minutes)
|
[nosql2] cpu usage is over 99%
2021-05-27 08:55 (1 minute)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 08:51 (7 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 08:41 (6 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 08:26 (6 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 08:06 (13 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 07:37 (18 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 07:26 (3 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 07:06 (8 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 06:46 (15 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 06:26 (10 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 06:16 (4 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 05:51 (2 minutes)
|
[nosql2] cpu usage is over 99%
possible false positive due to
There was some kind of spike on the graphs, but everything is flat and healthy now.
2021-05-27 05:44 (1 minute)
|
[db-postgresql1] cpu usage is over 99%
possible false positive due to
There was some kind of spike on the graphs, but everything is flat and healthy now.
2021-05-27 05:41 (2 minutes)
|
[db-postgresql1] cpu usage is over 99%
possible false positive due to
There was some kind of spike on the graphs, but everything is flat and healthy now.
2021-05-27 05:26 (1 minute)
|
[db-postgresql1] cpu usage is over 99%
possible false positive due to
There was some kind of spike on the graphs, but everything is flat and healthy now.
2021-05-27 05:20 (2 minutes)
|
[db-postgresql1] cpu usage is over 99%
possible false positive due to
There was some kind of spike on the graphs, but everything is flat and healthy now.
2021-05-27 05:08 (1 minute)
|
[db-postgresql1] cpu usage is over 99%
possible false positive due to
There was some kind of spike on the graphs, but everything is flat and healthy now.
2021-05-27 04:56 (4 minutes)
|
[db-postgresql1] cpu usage is over 99%
possible false positive due to
There was some kind of spike on the graphs, but everything is flat and healthy now.
2021-05-27 04:41 (4 minutes)
|
[db-postgresql1] cpu usage is over 99%
possible false positive due to
There was some kind of spike on the graphs, but everything is flat and healthy now.
2021-05-27 04:26 (7 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 04:06 (5 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 03:26 (15 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 03:02 (6 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 02:51 (6 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 02:41 (2 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 02:31 (6 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 02:16 (4 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 01:46 (1 minute)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 01:33 (5 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 01:20 (8 minutes)
|
[nosql1] cpu usage is over 99%
2021-05-27 00:40 (1 minute)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 00:31 (27 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-27 00:01 (16 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 23:56 (1 minute)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 23:41 (4 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 23:31 (2 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 23:21 (4 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 23:06 (4 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 23:01 (1 minute)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 22:41 (6 minutes)
|
[nosql2] cpu usage is over 99%
2021-05-26 22:31 (1 minute)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 22:31 (2 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 22:16 (8 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 22:06 (2 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 21:51 (4 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 21:36 (2 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 21:16 (2 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 20:56 (2 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 20:46 (6 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 20:31 (4 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 20:21 (2 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 20:06 (1 minute)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 19:37 (23 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 19:17 (6 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 19:06 (4 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 18:56 (6 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 18:46 (3 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 18:36 (6 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 18:26 (4 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 18:11 (3 minutes)
|
[nosql2] cpu usage is over 99%
2021-05-26 18:01 (1 minute)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 17:41 (26 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 17:18 (3 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 16:57 (5 minutes)
|
[nosql1] cpu usage is over 99%
2021-05-26 16:33 (1 minute)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 16:21 (7 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 16:16 (1 minute)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 16:01 (7 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 15:46 (2 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 15:31 (11 minutes)
|
[db-postgresql1] cpu usage is over 99%
2021-05-26 15:01 (22 minutes)
|