HealthChecks.io

Create a new check

  • These settings allow for a system restart without triggering the alerts

  • Period:

    • 5 minutes

    • The expected time between pings

  • Grace Time:

    • 10 minutes

    • When a check is late, how long to wait to send an alert

Bash script

Used to monitor if Prometheus and Alertmanager are running, and if the entire machine is offline or unresponsive.

  • Edit UNIQUE_ID

Configure CRON

  • Run every 1 minute.

Integrations

Integrates with PagerDuty for alerts.

Last updated