Commit graph

60 commits

Author SHA1 Message Date
30b7c96833
add ECC alert (closes #19)
continuous-integration/drone/push Build is passing
2024-01-29 19:28:16 +01:00
0be2949c50
rework storage to reduce backup load
continuous-integration/drone/push Build is passing
2024-01-26 13:39:13 +01:00
dad89f524c
prometheus/values.yaml: Prevent all replicas on the same node
continuous-integration/drone/push Build is passing
2023-12-25 10:19:03 +01:00
5a9bb1850e
change alert inhibition rules
continuous-integration/drone/push Build is passing
2023-12-18 17:33:47 +01:00
4c6bf59f9e
prometheus/values.yaml: avoid all pods on the same node
continuous-integration/drone/push Build is passing
2023-11-26 20:41:43 +01:00
11f471a711
prometheus/alerts.yaml: increase temperature limit to 90
continuous-integration/drone/push Build is passing
2023-11-25 18:21:45 +01:00
cf76be1d39
add longhorn monitoring
continuous-integration/drone/push Build is passing
2023-11-24 20:32:50 +01:00
a441ff630b
Prometheus: change DiskspaceLow Alert
continuous-integration/drone/push Build is passing
2023-11-23 20:35:46 +01:00
2207baf8e2
fix type error
continuous-integration/drone/push Build is passing
2023-10-23 18:32:49 +02:00
8c5f6beca7
add label to prometheus namespace
continuous-integration/drone/push Build is passing
2023-10-23 18:31:47 +02:00
e5cd0a214f
Tell Prometheus to only pick up rules from namespaces with label "prometheus: yolokube"
continuous-integration/drone/push Build is passing
2023-10-23 18:05:29 +02:00
53be807c0b prometheus/ingress.yaml updated 2023-09-20 22:15:41 +02:00
d22605c1d9
fix alertmanager 2023-09-15 01:43:41 +02:00
94c2a34aac
try to fix prometheus 2 2023-08-31 00:29:12 +02:00
778306127f
try to fix prometheus
try to fix prometheus 2

try to fix prometheus 3
2023-08-30 22:56:03 +02:00
ffaf6a079e
put alertmanager config back into helm values 2023-08-30 21:27:13 +02:00
69dde5d035
enable persistence for grafana 2023-06-29 12:02:54 +02:00
deba86906d revert memory rule changes (back to 80%)
Signed-off-by: Tom Neuber <tomneuber@web.de>
2023-06-24 18:50:24 +02:00
812cd1efa6
Alerting: edit rules for storage low 2023-06-24 09:56:07 +02:00
78793ed440
Monitoring: change prometheus values to prevent sync-loop in argo 2023-06-24 07:28:58 +02:00
c4033903b4
Monitoring: add node tag to node-exporter metrics 2023-06-23 19:19:48 +02:00
fd6cc7ef3d
add etcdbackup alerts 2023-06-22 19:59:25 +02:00
d75cb6b7b6
change memory rule 2023-06-20 14:41:39 +02:00
e63707d16c
try to fix prometheus deployment 6 (final) (for now) 2023-06-20 13:15:28 +02:00
c706f9b61e
try to fix prometheus deployment 5 2023-06-20 10:08:22 +02:00
23a3a50c3d
try to fix prometheus deployment 4 2023-06-20 09:25:19 +02:00
953ee8e085
try to fix prometheus deployment 3 2023-06-20 09:06:13 +02:00
549cfac957
try to fix prometheus deployment 2 2023-06-20 08:56:06 +02:00
8c065d71ce
try to fix prometheus deployment 2023-06-20 08:54:19 +02:00
d5985f50b5
change prometheus to prometheus-operator with kube-prometheus, this includes grafana 2023-06-20 08:43:48 +02:00
93b8c785a1
add the ingress class to the ingresses to improve compatibility in the early stages of the cluster creation, where the default class is not yet propagated. 2023-06-19 07:15:19 +02:00
79a22afb98
changes to the ingresses 2023-06-19 07:01:14 +02:00
115c128c60
further trim down the ingress ressource to the bare minimum 2023-06-19 00:50:20 +02:00
82294d3cf5
fck nginx-ingress, use loadbalancer for https 2023-06-18 05:18:02 +02:00
03df120bbe
switch prometheus to letsencrypt staging to debug the basicauth ingresses 2023-06-18 04:26:53 +02:00
9972c1598c
reload alertmanager config automatically 2023-06-17 09:11:18 +02:00
d9743788cb
increase repeat interval 2023-06-17 07:25:56 +02:00
02a123fe1f
Fix basicauth issue (finaly) 2023-04-23 04:34:05 +02:00
7cc71f7ccd
more experimenting with regex rules 2023-04-23 01:10:51 +02:00
58a545ed11
potentially fix basicauth issue 2023-04-23 00:54:24 +02:00
8c6ea5fb15
Revert "Revert "disable basicauth for acme challenges 🍇""
This reverts commit b3ccda1728.
2023-04-23 00:42:43 +02:00
b3ccda1728
Revert "disable basicauth for acme challenges 🍇"
This reverts commit 84f3e623b5.
2023-04-21 09:58:38 +02:00
84f3e623b5
disable basicauth for acme challenges 🍇 2023-04-21 09:34:41 +02:00
f66674a69a
fix newline issue in alert 😶‍🌫️ 2023-04-21 08:33:55 +02:00
928ef62122
fix alerts 🤕 2023-04-21 08:18:39 +02:00
414b7d9318
add alert for unhealthy pod 2023-04-21 07:59:28 +02:00
a7578bf430
edit in alert PrometheusTargetMissing (again) 2023-04-21 07:53:09 +02:00
8503d2b46f
edit Alert PrometheusTargetMissing 2023-04-21 05:09:02 +02:00
706e4605fb
start scraping the /var/log directory for prometheus text files 2023-04-20 06:03:40 +02:00
477bbfbd06
fix issue in file system alert 2023-04-20 05:54:18 +02:00