|
3c02a16714
|
Prometheus: remove waiting time for KubeNodeUnreachable Alert
continuous-integration/drone/push Build is passing
|
2024-02-01 22:44:57 +01:00 |
|
|
921306dcdc
|
PROMETHEUS: move alerts to this repo to allow modifications
continuous-integration/drone/push Build is passing
|
2024-02-01 22:00:37 +01:00 |
|
|
a2a306c195
|
add inhibition rule to alertmanager
continuous-integration/drone/push Build is passing
|
2024-01-31 16:19:42 +01:00 |
|
|
30b7c96833
|
add ECC alert (closes #19)
continuous-integration/drone/push Build is passing
|
2024-01-29 19:28:16 +01:00 |
|
|
0be2949c50
|
rework storage to reduce backup load
continuous-integration/drone/push Build is passing
|
2024-01-26 13:39:13 +01:00 |
|
|
dad89f524c
|
prometheus/values.yaml: Prevent all replicas on the same node
continuous-integration/drone/push Build is passing
|
2023-12-25 10:19:03 +01:00 |
|
|
5a9bb1850e
|
change alert inhibition rules
continuous-integration/drone/push Build is passing
|
2023-12-18 17:33:47 +01:00 |
|
|
4c6bf59f9e
|
prometheus/values.yaml: avoid all pods on the same node
continuous-integration/drone/push Build is passing
|
2023-11-26 20:41:43 +01:00 |
|
|
11f471a711
|
prometheus/alerts.yaml: increase temperature limit to 90
continuous-integration/drone/push Build is passing
|
2023-11-25 18:21:45 +01:00 |
|
|
cf76be1d39
|
add longhorn monitoring
continuous-integration/drone/push Build is passing
|
2023-11-24 20:32:50 +01:00 |
|
|
a441ff630b
|
Prometheus: change DiskspaceLow Alert
continuous-integration/drone/push Build is passing
|
2023-11-23 20:35:46 +01:00 |
|
|
2207baf8e2
|
fix type error
continuous-integration/drone/push Build is passing
|
2023-10-23 18:32:49 +02:00 |
|
|
8c5f6beca7
|
add label to prometheus namespace
continuous-integration/drone/push Build is passing
|
2023-10-23 18:31:47 +02:00 |
|
|
e5cd0a214f
|
Tell Prometheus to only pick up rules from namespaces with label "prometheus: yolokube"
continuous-integration/drone/push Build is passing
|
2023-10-23 18:05:29 +02:00 |
|
|
53be807c0b
|
Update prometheus/ingress.yaml
|
2023-09-20 22:15:41 +02:00 |
|
|
d22605c1d9
|
fix alertmanager
|
2023-09-15 01:43:41 +02:00 |
|
|
94c2a34aac
|
try to fix prometheus 2
|
2023-08-31 00:29:12 +02:00 |
|
|
778306127f
|
try to fix prometheus
try to fix prometheus 2
try to fix prometheus 3
|
2023-08-30 22:56:03 +02:00 |
|
|
ffaf6a079e
|
put alertmanager config back into helm values
|
2023-08-30 21:27:13 +02:00 |
|
|
69dde5d035
|
enable persistence for grafana
|
2023-06-29 12:02:54 +02:00 |
|
|
deba86906d
|
revert memory rule changes (back to 80%)
Signed-off-by: Tom Neuber <tomneuber@web.de>
|
2023-06-24 18:50:24 +02:00 |
|
|
812cd1efa6
|
Alerting: edit rules for storage low
|
2023-06-24 09:56:07 +02:00 |
|
|
78793ed440
|
Monitoring: change prometheus values to prevent sync-loop in argo
|
2023-06-24 07:28:58 +02:00 |
|
|
c4033903b4
|
Monitoring: add node tag to node-exporter metrics
|
2023-06-23 19:19:48 +02:00 |
|
|
fd6cc7ef3d
|
add etcdbackup alerts
|
2023-06-22 19:59:25 +02:00 |
|
|
d75cb6b7b6
|
change memory rule
|
2023-06-20 14:41:39 +02:00 |
|
|
e63707d16c
|
try to fix prometheus deployment 6 (final) (for now)
|
2023-06-20 13:15:28 +02:00 |
|
|
c706f9b61e
|
try to fix prometheus deployment 5
|
2023-06-20 10:08:22 +02:00 |
|
|
23a3a50c3d
|
try to fix prometheus deployment 4
|
2023-06-20 09:25:19 +02:00 |
|
|
953ee8e085
|
try to fix prometheus deployment 3
|
2023-06-20 09:06:13 +02:00 |
|
|
549cfac957
|
try to fix prometheus deployment 2
|
2023-06-20 08:56:06 +02:00 |
|
|
8c065d71ce
|
try to fix prometheus deployment
|
2023-06-20 08:54:19 +02:00 |
|
|
d5985f50b5
|
change prometheus to prometheus-operator with kube-prometheus, this includes grafana
|
2023-06-20 08:43:48 +02:00 |
|
|
93b8c785a1
|
add the ingress class to the ingresses to improve compatibility in the early stages of the cluster creation, where the default class is not yet propagated.
|
2023-06-19 07:15:19 +02:00 |
|
|
79a22afb98
|
changes to the ingresses
|
2023-06-19 07:01:14 +02:00 |
|
|
115c128c60
|
further trim down the ingress resource to the bare minimum
|
2023-06-19 00:50:20 +02:00 |
|
|
82294d3cf5
|
fck nginx-ingress, use loadbalancer for https
|
2023-06-18 05:18:02 +02:00 |
|
|
03df120bbe
|
switch prometheus to letsencrypt staging to debug the basicauth ingresses
|
2023-06-18 04:26:53 +02:00 |
|
|
9972c1598c
|
reload alertmanager config automatically
|
2023-06-17 09:11:18 +02:00 |
|
|
d9743788cb
|
increase repeat interval
|
2023-06-17 07:25:56 +02:00 |
|
|
02a123fe1f
|
Fix basicauth issue (finally)
|
2023-04-23 04:34:05 +02:00 |
|
|
7cc71f7ccd
|
more experimenting with regex rules
|
2023-04-23 01:10:51 +02:00 |
|
|
58a545ed11
|
potentially fix basicauth issue
|
2023-04-23 00:54:24 +02:00 |
|
|
8c6ea5fb15
|
Revert "Revert "disable basicauth for acme challenges 🍇""
This reverts commit b3ccda1728.
|
2023-04-23 00:42:43 +02:00 |
|
|
b3ccda1728
|
Revert "disable basicauth for acme challenges 🍇"
This reverts commit 84f3e623b5.
|
2023-04-21 09:58:38 +02:00 |
|
|
84f3e623b5
|
disable basicauth for acme challenges 🍇
|
2023-04-21 09:34:41 +02:00 |
|
|
f66674a69a
|
fix newline issue in alert 😶‍🌫️
|
2023-04-21 08:33:55 +02:00 |
|
|
928ef62122
|
fix alerts 🤕
|
2023-04-21 08:18:39 +02:00 |
|
|
414b7d9318
|
add alert for unhealthy pod
|
2023-04-21 07:59:28 +02:00 |
|
|
a7578bf430
|
edit in alert PrometheusTargetMissing (again)
|
2023-04-21 07:53:09 +02:00 |
|