Commit graph

1524 commits

Author SHA1 Message Date
243ab78b65
core-deployments.yaml: add node-labeler to core deployments 🥳 2023-07-09 14:07:58 +02:00
300d39e29e
add node-labeler 2023-07-09 14:02:49 +02:00
69dde5d035
enable persistence for grafana 2023-06-29 12:02:54 +02:00
deba86906d
revert memory rule changes (back to 80%)
Signed-off-by: Tom Neuber <tomneuber@web.de>
2023-06-24 18:50:24 +02:00
812cd1efa6
Alerting: edit rules for storage low 2023-06-24 09:56:07 +02:00
78793ed440
Monitoring: change prometheus values to prevent sync-loop in argo 2023-06-24 07:28:58 +02:00
63d60746f3
longhorn ingress with basicauth 2023-06-24 07:23:03 +02:00
c4033903b4
Monitoring: add node tag to node-exporter metrics 2023-06-23 19:19:48 +02:00
fd6cc7ef3d
add etcdbackup alerts 2023-06-22 19:59:25 +02:00
d75cb6b7b6
change memory rule 2023-06-20 14:41:39 +02:00
e63707d16c
try to fix prometheus deployment 6 (final) (for now) 2023-06-20 13:15:28 +02:00
c706f9b61e
try to fix prometheus deployment 5 2023-06-20 10:08:22 +02:00
23a3a50c3d
try to fix prometheus deployment 4 2023-06-20 09:25:19 +02:00
953ee8e085
try to fix prometheus deployment 3 2023-06-20 09:06:13 +02:00
549cfac957
try to fix prometheus deployment 2 2023-06-20 08:56:06 +02:00
8c065d71ce
try to fix prometheus deployment 2023-06-20 08:54:19 +02:00
d5985f50b5
change prometheus to prometheus-operator with kube-prometheus, this includes grafana 2023-06-20 08:43:48 +02:00
2a75bbe501
test argo 2023-06-20 07:41:29 +02:00
93b8c785a1
add the ingress class to the ingresses to improve compatibility in the early stages of the cluster creation, where the default class is not yet propagated. 2023-06-19 07:15:19 +02:00
c22ebeb2f2
change host domain for argo 2023-06-19 07:02:22 +02:00
79a22afb98
changes to the ingresses 2023-06-19 07:01:14 +02:00
8a1f144a52
fix namespace 2023-06-19 04:47:13 +02:00
cf6a1f0922
longhorn 2023-06-19 04:43:29 +02:00
115c128c60
further trim down the ingress resource to the bare minimum 2023-06-19 00:50:20 +02:00
ca710e2013
edit ingress resources for the test deployments 2023-06-18 11:11:21 +02:00
82294d3cf5
fck nginx-ingress, use loadbalancer for https 2023-06-18 05:18:02 +02:00
03df120bbe
switch prometheus to letsencrypt staging to debug the basicauth ingresses 2023-06-18 04:26:53 +02:00
9972c1598c
reload alertmanager config automatically 2023-06-17 09:11:18 +02:00
d9743788cb
increase repeat interval 2023-06-17 07:25:56 +02:00
fe03ff3dd6
add grafana values 2023-06-14 07:54:39 +02:00
6efe28a8ef
add configmap for argocd 2023-04-23 05:57:57 +02:00
02a123fe1f
Fix basicauth issue (finally) 2023-04-23 04:34:05 +02:00
7cc71f7ccd
more experimenting with regex rules 2023-04-23 01:10:51 +02:00
58a545ed11
potentially fix basicauth issue 2023-04-23 00:54:24 +02:00
8c6ea5fb15
Revert "Revert "disable basicauth for acme challenges 🍇""
This reverts commit b3ccda1728.
2023-04-23 00:42:43 +02:00
b3ccda1728
Revert "disable basicauth for acme challenges 🍇"
This reverts commit 84f3e623b5.
2023-04-21 09:58:38 +02:00
84f3e623b5
disable basicauth for acme challenges 🍇 2023-04-21 09:34:41 +02:00
f66674a69a
fix newline issue in alert 😶‍🌫️ 2023-04-21 08:33:55 +02:00
928ef62122
fix alerts 🤕 2023-04-21 08:18:39 +02:00
414b7d9318
add alert for unhealthy pod 2023-04-21 07:59:28 +02:00
a7578bf430
edit in alert PrometheusTargetMissing (again) 2023-04-21 07:53:09 +02:00
8503d2b46f
edit Alert PrometheusTargetMissing 2023-04-21 05:09:02 +02:00
706e4605fb
start scraping the /var/log directory for prometheus text files 2023-04-20 06:03:40 +02:00
477bbfbd06
fix issue in file system alert 2023-04-20 05:54:18 +02:00
517f048d2b
add more alerts 2023-04-20 05:40:46 +02:00
2db14ec4f8
fix config-map for alertmanager 2023-04-20 04:57:03 +02:00
141278c3a9
add image tag to custom alertmanager image 2023-04-20 04:40:10 +02:00
dafd4cd295
use our own alertmanager image 2023-04-20 04:31:47 +02:00
cca32125a4
add alertmanager config 2023-04-20 01:39:49 +02:00
1230d5c05e
add tom access 🤫 2023-04-06 20:04:05 +02:00