Introducing Uptime Monitoring in iLert

06 Mar, 2020

After a three-month closed beta, we’re making Uptime Monitoring in iLert generally available. Best of all, uptime monitoring is included in all our existing plans at no additional cost.

Why Uptime Monitoring?

Why are we adding uptime monitoring to iLert? After all, our customers are already using iLert for uptime monitoring through our integrations. iLert has always been about increasing the availability of your services by reducing mean time to acknowledge. We help you respond to incidents faster by integrating with your tools and extending them with advanced alerting, on-call schedules and call routing. Uptime monitoring is about availability in the strictest sense: it answers the question of whether your service or website is available from a user’s perspective. This fits like a glove with our mission of helping operations teams run always-on services and deliver a seamless online experience to their customers.

Our uptime monitoring comes with all the features you need and is fully integrated with iLert.

Multiple check types with up to 1-minute check intervals and custom timeouts

We’re starting with HTTP, TCP, UDP, and ICMP checks, each with a configurable check interval and timeout. The check interval determines how often iLert checks your service for availability. Let’s quickly go over each check type:

  1. HTTP: this check performs a GET request on your website or API endpoint and expects a status code of 2xx or 3xx.

  2. TCP / UDP: these checks connect to a specific host and port using the TCP or UDP protocol and verify that the network endpoint is responding.

  3. ICMP: this is a ping check. It sends ICMP packets to a host and considers the host down if it doesn’t respond.
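To make the pass/fail rules above concrete, here is a minimal sketch of an HTTP and a TCP check using only the Python standard library. It illustrates the idea and is not iLert’s implementation; the function names (`status_is_up`, `http_check`, `tcp_check`) and the default timeouts are assumptions made for this example.

```python
import socket
import urllib.error
import urllib.request


def status_is_up(status_code: int) -> bool:
    """Treat any 2xx or 3xx status code as a passing HTTP check."""
    return 200 <= status_code < 400


def http_check(url: str, timeout: float = 10.0) -> bool:
    """Send a GET request and report whether the status is 2xx or 3xx."""
    try:
        # urlopen follows redirects, so most 3xx responses are resolved
        # to the status of the final page before we see them.
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return status_is_up(response.status)
    except urllib.error.HTTPError as err:
        # Raised for statuses urllib does not handle itself (4xx, 5xx, 304).
        return status_is_up(err.code)
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, or timeout.
        return False


def tcp_check(host: str, port: int, timeout: float = 5.0) -> bool:
    """Report whether a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

In this sketch the timeout plays the role of the timeout threshold: a call such as `tcp_check("example.com", 443, timeout=2.0)` counts the endpoint as down if no connection is established within two seconds.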

Multiple Regions and False Positive Prevention

When creating an uptime monitor, you can pick from two regions: US or EU. We recommend picking the region closest to your users. If you’re serving users in both regions, we recommend creating two uptime monitors so you can monitor response time in each region individually. To prevent false alerts, we run the checks from two geographically separate locations in each region. For example, a monitor created in the EU region runs checks from Germany and Ireland. Whenever a check from one location fails, the monitor consults the other location for a second check. The monitor is considered down only if the checks from both locations fail. This keeps a problem local to a single check location from being reported as an outage.
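The two-location confirmation can be sketched as follows. This is a simplified illustration rather than iLert’s actual code: `monitor_is_down` and the `Check` type are hypothetical names, and each check is modeled as a function that returns `True` when the service responded.

```python
from typing import Callable

# A check is any zero-argument function returning True when the service is up.
Check = Callable[[], bool]


def monitor_is_down(first_location: Check, second_location: Check) -> bool:
    """Report an outage only if checks from both locations fail."""
    if first_location():
        # The first location reached the service, so no second check is needed.
        return False
    # The first location failed: ask the other location to confirm.
    return not second_location()
```

Note that the second location is consulted only after a failure, so a healthy service costs one check per interval.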

Root Cause Analysis Checks

Whenever an uptime monitor confirms an outage, we run additional diagnostic checks depending on the type of connection problem. For example, if a connection couldn’t be established, we run a traceroute, and if the HTTP status code doesn’t match, we show you the full HTTP response, including its headers.

Best-in-class alerting with automatic escalations and on-call scheduling

An uptime monitor gets all the alerting capabilities that you know and love from iLert. Because an uptime monitor is tied to an alert source in iLert, you can leverage all its features. To name a few:

  1. Advanced alerting: instantly get notified via SMS, push and voice notifications

  2. Outbound integrations: use one of your ticketing or ChatOps integrations to have a ticket created in JIRA or a message posted in Slack whenever an uptime monitor is down

  3. Maintenance windows: automatically pause your uptime monitors during maintenance. Maintenance windows will be reflected as such in uptime reports and won’t appear as downtime

  4. Notification priorities & support hours: use different alerting rules based on the priority you define on an uptime monitor or based on support hours

In addition to that, you can configure an uptime monitor to only create an incident after a certain number of failed checks. That way, you’ll be alerted only when your website is down for X minutes.
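As a rough sketch of how such a threshold can work, the counter below opens an incident only after a configured number of consecutive failed checks and resets on any successful check. The class name `FailureThreshold` and its parameter are assumptions for illustration; the post does not describe how iLert implements this internally.

```python
class FailureThreshold:
    """Track consecutive failed checks and signal when to open an incident."""

    def __init__(self, failures_before_incident: int) -> None:
        if failures_before_incident < 1:
            raise ValueError("threshold must be at least 1")
        self.failures_before_incident = failures_before_incident
        self.consecutive_failures = 0

    def record(self, check_passed: bool) -> bool:
        """Record one check result; return True when an incident should open."""
        if check_passed:
            # Any successful check resets the streak.
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        # Fire exactly once, at the moment the threshold is reached.
        return self.consecutive_failures == self.failures_before_incident
```

With a 1-minute check interval and a threshold of 3, a short blip that recovers after one or two failed checks never opens an incident, while a sustained outage opens one on the third consecutive failure.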