As we head into the holiday season, the ilert team is doing the opposite of slowing down; we’re ramping up. Over the past weeks, we’ve shipped a wave of impactful improvements across alerting, AI-powered automation, mobile app, and status pages. From major upgrades that reshape how teams triage incidents to smaller refinements that remove daily friction, this release is packed with updates designed to make on-call and operations smoother, smarter, and faster. Let’s dive in.
AI SRE: Your knowledgeable incident buddy
You probably remember us talking about ilert Responder – ilert's first intelligent agent that provides actionable insights during incidents. In the last few months, we introduced many more features, powerful agents, and capabilities, which are now all gathered under ilert AI SRE. So, what exactly has changed?
As the previous version did, ilert AI SRE can analyze logs, correlate metrics, check recent code changes, and propose recommended actions to you and your team to resolve the incident. Moreover, ilert agents can now also act autonomously, if you give permission.
While it might sound wild to give access to a production environment to AI, you will be surprised by how many issues require manual and quick fixes, rather than intellectual work. To reduce the burden of hand-operated tasks performed in the middle of the night and gain more valuable time for long-term sustainable fixes, you can start giving AI SRE gradual access and enable automatic actions such as rollbacks to the previous healthy version or restarting a service. To make it easier for you to identify different levels of agentic autonomy, we introduced three stages in our Agentic Incident Management Guide.
Under the hood, ilert AI SRE becomes useful because it integrates deeply with your existing monitoring, observability, and deployment tools. That means you don’t need to change your stack; you connect your existing tools and let the agent work across them. Everything starts with deployment events, as they allow the agent to correlate alerts with recent code changes and rollouts, which are often key signals for identifying root causes. You can check the article on how to introduce your CI & CD pipelines to ilert, if you haven't done this before.
The next step is to familiarize the agent with your observability data. For this, you will need to connect it to tools such as Grafana, Prometheus, or Elastic, which is a straightforward process. As a final setup step, you need to set the Root Cause Analysis Policy for the agent. We recommend beginning with a manual trigger to see the agent's performance.
Once the SRE agent is in place and the first incident occurs, you can communicate with it via chat on the right side of the alert view, just as if you were talking to a colleague. Check the live demo of ilert AI SRE at Oredev Conference in Malmö to see agentic incident response in action.
If you want to be among the first to try ilert AI SRE incident response, just drop us a message at support@ilert.com.
Connect Claude, Cursor, and other MCP clients to ilert
With the release of the ilert MCP Server, integrating your alerting and incident management workflows into AI assistants has become seamless. The MCP server implements the Model Context Protocol, an open standard that lets tools like Claude, Cursor, or any other MCP-compatible client interact with ilert over a unified interface. Through this setup, your assistant can securely list alerts, inspect on-call schedules, acknowledge or resolve alerts, and create incidents – all with proper permissions and audit trails.
Connecting is straightforward: you generate an API key in ilert, then configure your MCP client using a remote HTTP transport. Find more detailed instructions in the ilert documentation. Once configured, ilert appears in the client’s tool list and becomes available directly inside the assistant’s interface. This reduces context-switching, shortens time to resolution, and embeds incident response directly into your team's AI-powered workflow.
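To make the configuration step more concrete, here is a minimal sketch that generates the kind of JSON entry many MCP clients read for a remote HTTP server. Everything specific in it is an assumption: the `mcpServers` layout is modeled on common MCP clients, the entry name `ilert` and the `Authorization: Bearer` header format are illustrative, and the URL is a placeholder. Take the real endpoint and header format from the ilert documentation.

```python
import json
import os


def build_mcp_config(server_url: str, api_key: str) -> dict:
    """Return a client config entry for a remote HTTP MCP server.

    The "mcpServers" layout is an assumption modeled on common MCP
    clients; check your client's documentation for the exact schema.
    """
    return {
        "mcpServers": {
            "ilert": {  # hypothetical entry name
                "url": server_url,
                # Assumed header format, not taken from the ilert docs.
                "headers": {"Authorization": f"Bearer {api_key}"},
            }
        }
    }


# Placeholder URL: use the endpoint given in the ilert documentation.
config = build_mcp_config(
    server_url="https://example.invalid/mcp",
    api_key=os.environ.get("ILERT_API_KEY", "<your-api-key>"),
)
print(json.dumps(config, indent=2))
```

Reading the key from an environment variable keeps it out of files that might end up in version control.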
With the alert merge feature, you can combine existing alerts into a single main alert with one click. Merging stops duplicate escalations and notifications instantly, keeps responders aligned on one thread of communication, and preserves full traceability by keeping merged alerts available in the audit log. The result is a cleaner incident workspace, more accurate reporting, and a better foundation for AI SRE features – including automated merge recommendations during root-cause analysis.
Alert merge works hand-in-hand with event grouping: events merge into alerts, and alerts can now merge into one primary alert. Clear, intentional, and built to reflect how teams actually troubleshoot in the real world.
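The two-level relationship described above (events roll up into alerts, alerts roll up into one primary alert) can be pictured with a small data-model sketch. This is not ilert's internal model; the class, field names, and the `MERGED` status are hypothetical and only illustrate how merged alerts can stay traceable while they stop escalating.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Alert:
    alert_id: int
    summary: str
    event_count: int = 1               # events already grouped into this alert
    status: str = "PENDING"            # hypothetical status names
    merged_into: Optional[int] = None  # id of the primary alert, if merged
    merged_ids: List[int] = field(default_factory=list)


def merge_alerts(primary: Alert, others: List[Alert]) -> Alert:
    """Fold `others` into `primary` while keeping them traceable."""
    for alert in others:
        if alert.alert_id == primary.alert_id:
            continue  # an alert cannot merge into itself
        alert.merged_into = primary.alert_id
        alert.status = "MERGED"  # merged alerts no longer escalate
        primary.merged_ids.append(alert.alert_id)
        primary.event_count += alert.event_count
    return primary


db_down = Alert(1, "Database unreachable", event_count=3)
timeouts = Alert(2, "API timeouts", event_count=2)
merge_alerts(db_down, [timeouts])
print(db_down.merged_ids, db_down.event_count)  # [2] 5
```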
Filter alerts by labels for faster, targeted triage
The alert list now supports powerful label-based filtering, making it easier to zero in on exactly the alerts you care about. You can build filters using label keys and values with autocomplete, combine multiple conditions, and instantly see active filters represented in a compact ICL-style syntax. Editing filters is just a click away, and the same experience is available on mobile, so teams can slice their alert stream by environment, region, service, or any custom label from anywhere.
This brings far more precision to alert triage, especially for larger environments where labels are the primary way teams organize data across systems.
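Conceptually, combining several label conditions works like the sketch below: an alert matches when every selected label key carries one of the allowed values. The alert structure and label names are made up for illustration, and the sketch does not reproduce the ICL syntax shown in the interface.

```python
from typing import Dict, List, Set


def matches(labels: Dict[str, str], conditions: Dict[str, Set[str]]) -> bool:
    """True when every condition key is present with an allowed value."""
    return all(labels.get(key) in allowed for key, allowed in conditions.items())


def filter_alerts(alerts: List[dict], conditions: Dict[str, Set[str]]) -> List[dict]:
    return [a for a in alerts if matches(a.get("labels", {}), conditions)]


# Hypothetical alerts with custom labels.
alerts = [
    {"id": 1, "labels": {"env": "prod", "region": "eu-central"}},
    {"id": 2, "labels": {"env": "staging", "region": "eu-central"}},
    {"id": 3, "labels": {"env": "prod", "region": "us-east"}},
]
prod_eu = filter_alerts(alerts, {"env": {"prod"}, "region": {"eu-central"}})
print([a["id"] for a in prod_eu])  # [1]
```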
More alert filtering options
You can now also filter alerts by priority in both the ilert interface and mobile app. Whether you’re triaging from your desk or on the go, it’s easy to focus on the most critical alerts first and cut through noise from lower-priority issues.
Transparent alert grouping
To remove confusion caused by mismatched event counts, we’ve unified how grouped events are displayed across the platform. Previously, event grouping via alertKey and alert-source-based grouping were treated separately, leading to different totals in the alert list and alert detail views. The updated design consolidates these into a single, consistent event count, with clear grouping states and a detailed breakdown available in the Event grouping dialog. This ensures users always see one accurate number, regardless of the grouping method, and can easily understand how and when events were combined.
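The idea of one total with a breakdown per grouping method can be sketched in a few lines. The `grouped_by` field and its values are hypothetical and do not reflect the actual ilert data model; the point is only that the total is derived from the same records as the breakdown, so the two can never disagree.

```python
from collections import Counter
from typing import List


def event_summary(events: List[dict]) -> dict:
    """Return a single total plus a per-method breakdown."""
    breakdown = Counter(e["grouped_by"] for e in events)
    return {"total": sum(breakdown.values()), "breakdown": dict(breakdown)}


# Hypothetical events attached to one alert.
events = [
    {"id": "e1", "grouped_by": "initial"},
    {"id": "e2", "grouped_by": "alert_key"},
    {"id": "e3", "grouped_by": "alert_key"},
    {"id": "e4", "grouped_by": "alert_source"},
]
summary = event_summary(events)
print(summary["total"])  # 4
```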
New Wait node for Event flows
Event Flows gain a powerful new control step: the Wait node. This addition lets teams pause a flow either for a specific duration or until the start or end of defined support hours. It brings precise timing control to automation and enables smarter workflows, such as delaying non-urgent actions outside business hours or spacing out retries with fixed wait times. The node respects support-hour configurations, including holiday exceptions, giving teams predictable, context-aware behavior.
This enhancement builds on the foundation introduced in our recent deep dive into Event Flows. The Wait node expands what’s possible with flow automation, helping teams design more reliable, human-friendly processes.
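To illustrate what "wait until the start of support hours" means in practice, here is a simplified sketch of the calculation. It is not the Wait node's implementation: the weekday 09:00-17:00 window, the function names, and the use of naive datetimes (no time zone handling) are assumptions made to keep the example short.

```python
from datetime import date, datetime, time, timedelta
from typing import Set

# Hypothetical support hours: Monday-Friday, 09:00-17:00.
SUPPORT_DAYS = {0, 1, 2, 3, 4}  # datetime.weekday(): Monday is 0
SUPPORT_START = time(9, 0)
SUPPORT_END = time(17, 0)


def in_support_hours(now: datetime, holidays: Set[date]) -> bool:
    return (
        now.weekday() in SUPPORT_DAYS
        and now.date() not in holidays
        and SUPPORT_START <= now.time() < SUPPORT_END
    )


def wait_until_support_start(now: datetime, holidays: Set[date]) -> datetime:
    """Return `now` if support is open, else the next opening time."""
    if in_support_hours(now, holidays):
        return now
    day = now.date()
    # If today's window has not opened yet, today is still a candidate.
    if now.time() >= SUPPORT_START:
        day += timedelta(days=1)
    # Skip weekends and holiday exceptions.
    while day.weekday() not in SUPPORT_DAYS or day in holidays:
        day += timedelta(days=1)
    return datetime.combine(day, SUPPORT_START)


# Friday evening: the flow resumes on Monday morning.
print(wait_until_support_start(datetime(2025, 12, 19, 18, 30), set()))
# 2025-12-22 09:00:00
```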
Responsive grid layout for large-scale status pages
Status pages now support a third layout option – the responsive grid – built for organizations managing hundreds or even thousands of services.
The new layout introduces a high-density grid optimized for large service catalogs. On wide screens, services are arranged in up to 12 columns within a 1536px content width, creating a clean, scannable overview. As the screen size decreases, the grid adapts seamlessly: tablets display fewer columns, and mobile switches to an icon-only mode for maximum clarity. Crucially, this layout supports all key elements such as active incidents, past incidents, metrics, and service grouping, ensuring teams can communicate status effectively at any scale.
For enterprises with sprawling architectures, the responsive grid makes status pages both performant and user-friendly, turning massive service inventories into a readable, navigable experience.
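The numbers above imply a tile width of roughly 128px at full width (1536px divided by 12 columns). The sketch below shows how a column count could be derived from the viewport width under that assumption. The minimum tile width and the 640px icon-only breakpoint are hypothetical values chosen for illustration, not the ones the status page actually uses.

```python
MAX_COLUMNS = 12           # from the layout description
MAX_CONTENT_WIDTH = 1536   # px, from the layout description
MIN_TILE_WIDTH = MAX_CONTENT_WIDTH // MAX_COLUMNS  # 128 px, an assumption
ICON_ONLY_BELOW = 640      # px, hypothetical mobile breakpoint


def grid_layout(viewport_width: int) -> dict:
    """Derive column count and display mode from the viewport width."""
    content_width = min(viewport_width, MAX_CONTENT_WIDTH)
    columns = max(1, min(MAX_COLUMNS, content_width // MIN_TILE_WIDTH))
    return {"columns": columns, "icon_only": viewport_width < ICON_ONLY_BELOW}


for width in (1920, 768, 375):
    print(width, grid_layout(width))
# 1920 {'columns': 12, 'icon_only': False}
# 768 {'columns': 6, 'icon_only': False}
# 375 {'columns': 2, 'icon_only': True}
```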
Mobile app news
Handling coverage requests on mobile just got smoother. Until now, many users didn’t realize that the top section in the coverage request flow acted only as a search filter. This meant they still had to manually adjust each identified shift in the list below before sending the request – a common point of confusion reported by several customers.
With the latest update, ilert mobile now applies the selected search boundaries to all matching shifts by default. You can still fine-tune individual shifts if needed, but the default behavior now reflects the intent expressed in the filter. The result: fewer taps, less ambiguity, and a more intuitive coverage request experience.
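One way to picture the new default is as clipping every overlapping shift to the selected time window. The sketch below shows that interpretation with plain start/end pairs; it is an illustration of the idea rather than the app's actual logic, and the function name is hypothetical.

```python
from datetime import datetime
from typing import List, Tuple

Shift = Tuple[datetime, datetime]


def apply_boundaries(shifts: List[Shift], window_start: datetime,
                     window_end: datetime) -> List[Shift]:
    """Clip each overlapping shift to the selected window."""
    requests = []
    for start, end in shifts:
        clipped_start = max(start, window_start)
        clipped_end = min(end, window_end)
        if clipped_start < clipped_end:  # skip shifts outside the window
            requests.append((clipped_start, clipped_end))
    return requests


shifts = [
    (datetime(2025, 12, 24, 8), datetime(2025, 12, 24, 20)),
    (datetime(2025, 12, 25, 8), datetime(2025, 12, 25, 20)),
    (datetime(2025, 12, 27, 8), datetime(2025, 12, 27, 20)),
]
# Ask for cover from Dec 24, 12:00 to Dec 25, 12:00.
covered = apply_boundaries(shifts, datetime(2025, 12, 24, 12),
                           datetime(2025, 12, 25, 12))
print(len(covered))  # 2
```

The first shift is requested from 12:00 to 20:00, the second from 08:00 to 12:00, and the third is left untouched because it falls outside the window.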
The heartbeat list in the mobile app no longer appears empty: we’ve migrated both the list and detail view from relying on alert sources with integration-type filters to using the dedicated Heartbeat Monitors API. This ensures your monitors are displayed correctly and in real time, aligned with how heartbeats are managed across the platform.
And a few minor but still eye- and heart-pleasing updates.
We revamped the outbound integrations (also familiar to you as alert actions) catalog. You can now see all features relevant to each connection, and it's easier to navigate through the list.
Additionally, alert action logs now show which alert and alert source each action relates to, and you can filter by these references to drill into exactly what happened, faster.
Status page email notifications now support Markdown, making it easier to format updates clearly and consistently. Bold text, lists, links, and other lightweight formatting options render correctly in outgoing emails, so teams can share structured, readable incident updates without switching tools or rewriting content.
Custom processing rules templates now behave in a way that better matches how teams actually use them: conditions only evaluate as true when a real template is present (for alertKey or any of the create/accept/resolve actions). Combined with new out-of-the-box templates for the most-used integrations, this means less guesswork, fewer “empty” conditions, and faster rollout of consistent, high-quality alert payloads.
And finally, our ilert mascot – the blue froggy – has a fresh look across the entire interface. Enjoy its brighter, more colorful style every time you open ilert.
The end of the year brings pressure. (Oh, we know!) Customer demand spikes, response expectations stay high, and engineering teams are juggling production issues, releases, and time off. For many teams, this is when on-call becomes chaotic: schedules break, notifications hit at the wrong time, and coverage gaps appear exactly when you can’t afford them.
ilert's Holidays and Support hours features were built to fix that. They simplify on-call management, protect your team’s time, and keep organizations running smoothly.
In this article, we’ll cover how these features help you stay in control, prevent burnout, and create predictable, reliable schedules even during the busiest seasons. And at the end, you’ll find a practical bonus chapter: how to stay healthy and sane during the pre-holiday rush.
End of the year – a crash-test for your on-call routine
On-call management gets complicated when support expectations vary across regions, customers, or time zones. Add holidays and PTO season on top, and teams often resort to spreadsheets, Slack pings, or “please cover my shift” chaos.
End-of-year operations already demand tight focus. These small improvisations add up, pulling engineers and managers away from solving the real problems and into administrative firefighting.
Holidays and Support hours solve this by giving teams precise, automated control over when alerts should trigger and who should handle them. The result: fewer interruptions, cleaner routing, and schedules that reflect real-world availability.
Let's first look at the Support hours feature.
The Support hours feature in ilert lets teams define exactly when alerts should fire and how they should be routed using time-based rules. It acts as a guardrail that checks whether an event happens during business hours, outside them, or during specially defined windows. This allows teams to tailor behavior depending on urgency: critical incidents can escalate 24/7, while lower-severity issues can quietly wait until the next morning.
Support Hours can be simple (weekdays 9–5) or highly structured with multiple blocks, time zones, and logic layers. They’re ideal for organizations with different SLA tiers, global customer bases, or engineering teams who want to protect nights and weekends from non-urgent noise.
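The routing behavior described here, where critical alerts always notify and lower-priority ones wait for the support window, can be sketched as a small decision function. The window, the priority names, and the return values are hypothetical, and a fixed UTC+1 offset stands in for a real time zone to keep the example self-contained; production code would use a proper time zone database such as Python's `zoneinfo`.

```python
from datetime import datetime, time, timedelta, timezone

# Hypothetical support window: weekdays 09:00-17:00 at a fixed UTC+1 offset.
SUPPORT_TZ = timezone(timedelta(hours=1))
OPEN, CLOSE = time(9, 0), time(17, 0)


def route(event_time: datetime, priority: str) -> str:
    """Decide whether to notify immediately or defer the alert."""
    local = event_time.astimezone(SUPPORT_TZ)
    in_hours = local.weekday() < 5 and OPEN <= local.time() < CLOSE
    if priority == "HIGH" or in_hours:
        return "notify_now"
    return "defer_until_support_hours"


night = datetime(2025, 12, 22, 2, 0, tzinfo=timezone.utc)  # Monday, 03:00 local
print(route(night, "LOW"))   # defer_until_support_hours
print(route(night, "HIGH"))  # notify_now
```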
Holidays take this concept further. Internally, we jokingly call them “Support Hours on steroids.”
They let teams automatically exclude national holidays, company-wide days off, or regional observances from regular support windows. Instead of manually adjusting schedules every time a holiday rolls around, ilert matches your service hours with the relevant holiday calendar and automatically adapts routing.
This is especially powerful for distributed teams with country-specific holidays, or anyone who was accidentally paged on Christmas morning. Holidays ensure your escalation flow reflects the real world: fewer surprises for engineers, and cleaner operational coverage when the office lights are off.
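For distributed teams, the effect of region-specific holiday calendars can be illustrated with a short sketch: a day only counts as a support day for a region when it is a weekday and not on that region's holiday list. The region codes, the calendar contents, and the function names are illustrative assumptions, not ilert configuration.

```python
from datetime import date
from typing import Dict, List, Set

# Hypothetical holiday calendars per region.
HOLIDAYS: Dict[str, Set[date]] = {
    "DE": {date(2025, 12, 25), date(2025, 12, 26)},
    "US": {date(2025, 12, 25)},
}


def is_support_day(day: date, region: str) -> bool:
    """Weekdays count as support days unless the region has a holiday."""
    return day.weekday() < 5 and day not in HOLIDAYS.get(region, set())


def available_regions(day: date, regions: List[str]) -> List[str]:
    return [r for r in regions if is_support_day(day, r)]


print(available_regions(date(2025, 12, 25), ["DE", "US"]))  # []
print(available_regions(date(2025, 12, 26), ["DE", "US"]))  # ['US']
```

On December 26 only the US team would be inside its support window, so non-urgent alerts for the German team would wait.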
Here is why your future self will thank you
We built Support hours and Holidays for practical reasons. We receive numerous support tickets daily, but December is a unique peak season, and we understand the challenges our customers face. Do any of these sound familiar to you?
Coverage gaps appear at the worst time. With engineers taking well-earned time off, gaps in coverage become inevitable unless schedules adapt automatically. Without that automation, the burden shifts unevenly; someone ends up carrying extra on-call weight, or – worst of all – an engineer gets notified during their holiday. These gaps hurt morale and directly slow down incident response when uptime matters most.
Manual fixes drain time you don’t have. Year-end work already demands focus, yet many teams still scramble through spreadsheets or Slack threads to patch schedules. These last-minute adjustments consume engineering time that should be spent on stabilizing systems, shipping features, or preparing for traffic spikes – not babysitting calendars.
Escalations become noisy and inaccurate. When support hours and holidays aren’t fully integrated into the on-call logic, alerts fire at the wrong moments. People get pinged outside business hours, or urgent issues quietly fall through the cracks. In a peak season full of customer activity, misrouted alerts escalate quickly into incidents, impacting customers and leaving unpleasant red spots on your status pages.
International teams feel the complexity even more. Distributed teams deal with a patchwork of national holidays, cultural observances, and regional support-hour rules. Without a system that adapts to each region’s calendar, some teams get overloaded while others are unintentionally under-utilized. This imbalance becomes especially dangerous when global usage spikes.
Customers don’t pause their expectations. Even while internal teams slow down or go on vacation, SLAs keep ticking. Customers expect the same level of responsiveness, and any misalignment between support contracts and on-call coverage becomes painfully visible. Poorly controlled support hours during the busiest season don’t just inconvenience engineers – they damage trust.
Bonus: 8 tips for engineers on duty to avoid burnout during the pre-holiday rush
Enhance your alerts. Clean up noisy monitors, retire outdated checks, and tune thresholds before traffic spikes. A quieter, more accurate alert setup pays off massively during high-stress weeks.
Automate what you can. Small automations, such as log parsers, deployment scripts, and error-notification filters, save hours when systems get noisy.
Use rotations effectively. Make sure on-call responsibilities are distributed fairly, and that holidays are properly reflected so no one works a stretch longer than they should.
Rehearse failovers and edge cases. Run quick simulations or tabletop exercises with your team. Knowing how systems behave under load removes guesswork when real issues hit.
Configure safety nets. Enable auto-remediation where appropriate, make sure backup contacts are defined, and double-check that escalations route correctly if someone is unavailable.
Share context proactively. Post short updates in Slack or your incident channel about ongoing issues, infrastructure changes, or known risks. The next person on-call shouldn’t have to rediscover what you already know.
Lean on your tools. Features like Support Hours and Holidays exist to reduce mental load. Let them do the heavy lifting so you don’t have to think about schedules or routing.
If an incident happens, debrief faster. Short, focused post-incident reviews help teams resolve patterns quickly without sinking hours into analysis.
Modern SRE and IT operations run on two truths: you must see problems the way users do, and you must respond fast. With the new ilert and Ekara integration, you can turn Ekara’s powerful synthetic and real-user insights into actionable alerts and incidents in ilert – routed to the right on-call engineer, enriched with context, and communicated to stakeholders via status pages. The result: fewer surprises, faster recoveries, and happier users.
What is Ekara?
Ekara by the French company ip-label is a digital experience monitoring platform that combines synthetic monitoring (robots) and Real User Monitoring to detect and diagnose issues across web, mobile, APIs, business apps, and voice/IVR – deployed as SaaS, hybrid, or fully on-prem. Ekara offers no-code journey scripting, Edge/branch monitoring, and options like Flow AI and AI Incident Guard. The platform processes billions of measurements daily and is used by 400+ customers across 25 countries.
Ekara is used by enterprises across e-commerce, travel, finance, public sector, and contact centers to see performance the way users do. PVCP Group, for example, models key booking journeys to catch issues before they hurt conversions. Contact centers and telecoms run Ekara’s IVR/Voice probes to validate call flows and speech quality. Hybrid IT teams monitor thick-client and Citrix apps alongside web and APIs, including from edge sites with Ekara Pod. In short, it helps diverse teams spot real user problems early and act fast.
Why connect Ekara to ilert?
Ekara detects problems and sends alerts. ilert turns those alerts into actionable notifications and gets the right people moving fast if issues have a business impact.
Faster response: When Ekara sends an event, ilert notifies the on-call teams via voice, SMS, push, Slack, or Microsoft Teams. No manual steps, no guesswork.
Less noise, clearer focus: Similar alerts from the same scenario or region are grouped into one. Teams identify one problem to fix, rather than multiple duplicates.
AI that speeds you up: ilert offers powerful AI features, all designed to reduce the time to resolution. ilert AI summarizes incoming information, so responders start with context, not a blank page. It also helps prepare clear status updates and later assembles blameless postmortems. AI is integrated into every stage of incident response to reduce manual burden and enable teams to react quickly.
Keep everyone informed: Using ilert public and private status pages, teams can keep customers and stakeholders informed.
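The "less noise" point above comes down to grouping incoming events by a shared key. Here is a minimal sketch of that idea, assuming events carry `scenario` and `region` fields. Those field names and the sample data are hypothetical and are not the documented Ekara payload.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_events(events: List[dict]) -> Dict[Tuple[str, str], List[dict]]:
    """Group events that share the same scenario and region."""
    groups = defaultdict(list)
    for event in events:
        key = (event["scenario"], event["region"])
        groups[key].append(event)
    return dict(groups)


# Hypothetical monitoring events.
events = [
    {"scenario": "checkout", "region": "eu", "message": "Step 3 failed"},
    {"scenario": "checkout", "region": "eu", "message": "Step 3 failed again"},
    {"scenario": "login", "region": "us", "message": "Timeout"},
]
groups = group_events(events)
print(len(events), "events ->", len(groups), "alerts")  # 3 events -> 2 alerts
```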
Step-by-step setup
Create Ekara's Alert source in ilert. Copy the Webhook URL.
Configure Ekara to send alerts. Choose the events to forward: failure, recovery, threshold breach, and SLO alert.
Test and verify. Trigger a test failure in Ekara. Confirm an alert opens in ilert and pages the current on-call.
A complete step-by-step guide is available at doc.ilert.com. If you experience any issues or have questions, feel free to reach out to the ilert support team at support@ilert.com.