Looking Back at Devoxx France: No More Sleepless Nights — How AI Is Transforming Our DevOps On-Call Duties
It is 3:00 AM. The alert goes off. It is not a critical outage, just a full disk, a CPU load spike, or a temperamental microservice. Yet an on-call engineer has to wake up, log into the infrastructure, and analyze the metrics. The result: an hour of lost sleep for a trivial action, and tens of minutes of downtime for the client.
What if a sovereign artificial intelligence could diagnose, act, and document this incident before your engineer even opens their eyes?
That is exactly the technological revolution we showcased at Devoxx France, the largest independent developer conference in Europe. In front of an audience of technical experts, Jean-Philippe Fourès, our VP Product at Iguana Solutions, presented nAIghts Watch: our AIOps solution designed to radically transform DevOps on-call shifts.
The Real Cost of an IT Incident: Investigation Time
As a hosting and managed services provider operating a large number of heterogeneous servers for critical clients, we face the same challenge as any SRE team: the time-consuming management of "Level 1" alerts.
Typically, an incident of this type takes between 45 minutes and 1 hour from alert to resolution. The real bottleneck is not the technical complexity of the outage but the first 25 to 30 minutes of unavoidable latency:
- waking up the on-call engineer,
- acknowledging the alert,
- connecting securely to the servers,
- manually collecting and analyzing logs and metrics.
During this half-hour, the service remains unavailable, directly impacting the business. We developed nAIghts Watch to eliminate this idle time and to spare our teams from exhaustion.
nAIghts Watch: AIOps Powering Investigations
To solve this problem, our teams developed an AIOps system based on a cutting-edge multi-agent architecture. Rather than drowning a single AI in an overly broad context and risking hallucinations, we divided the workload:
- The Metrics Agent: collects and analyzes the last hour of data from our systems (Prometheus, VictoriaMetrics).
- The Logs Agent: works in parallel to extract and analyze log streams via Fluentd or Loki.
- The Judge (Super Agent): merges the two reports, makes a decision, establishes a diagnosis, and generates a comprehensive Root Cause Analysis (RCA) report.
As soon as an incident ticket is opened in our Jira, the investigation starts automatically. The result: by the time our on-call engineer logs in, the investigation is already finished, and a clear report is waiting for them on Slack.
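
The division of labor described above can be illustrated with a minimal sketch. It is not the nAIghts Watch implementation: the function names, the `AgentReport` structure, and the canned findings are all hypothetical, and the two agents return placeholder text instead of querying a metrics or log backend. It only shows the shape of the flow, with two agents running in parallel and a judge merging their reports.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class AgentReport:
    source: str       # which agent produced the report
    summary: str      # short human-readable finding
    suspicious: bool  # whether the agent found an anomaly


def metrics_agent(ticket_id: str) -> AgentReport:
    # Placeholder: a real agent would analyze the last hour of metrics.
    return AgentReport("metrics", f"{ticket_id}: disk usage at 97% on /var", True)


def logs_agent(ticket_id: str) -> AgentReport:
    # Placeholder: a real agent would extract and analyze log streams.
    return AgentReport("logs", f"{ticket_id}: repeated 'No space left on device'", True)


def judge(reports: list[AgentReport]) -> str:
    # Merge the individual reports into a single diagnosis.
    findings = [r for r in reports if r.suspicious]
    if not findings:
        return "No anomaly found; escalate to the on-call engineer."
    lines = [f"- [{r.source}] {r.summary}" for r in findings]
    return "Probable root cause, based on:\n" + "\n".join(lines)


def investigate(ticket_id: str) -> str:
    # Run both agents in parallel, then hand their reports to the judge.
    with ThreadPoolExecutor(max_workers=2) as pool:
        metrics_future = pool.submit(metrics_agent, ticket_id)
        logs_future = pool.submit(logs_agent, ticket_id)
        reports = [metrics_future.result(), logs_future.result()]
    return judge(reports)


if __name__ == "__main__":
    print(investigate("OPS-1234"))
```

Keeping each agent's context narrow, as in this sketch, is what lets the judge reason over two short reports rather than over raw metrics and logs.
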
Absolute Security and Sovereign Infrastructure AI
At Iguana Solutions, the confidentiality of our clients' data is a red line. Handing the keys to production over to an artificial intelligence requires guarantees that public cloud APIs simply could not offer us.
Our commitment: your data stays with us. We made the strategic choice to deploy a sovereign AI infrastructure: we run open-weight models (such as Qwen) hosted directly on our own private H200 GPU clusters. In addition, nAIghts Watch operates within a strict framework defined by our SREs:
- no direct SSH access and no execution of arbitrary shell commands,
- the AI interacts with the servers via an in-house MCP (Model Context Protocol) server,
- only pre-validated and highly targeted functions are exposed (mostly read-only, with very rare remediation actions allowed, such as restarting a specific service).
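
To make the idea of "pre-validated functions only" concrete, here is a minimal allow-list sketch in plain Python. It does not use the MCP SDK and does not reflect the actual in-house server: the tool names, the `RESTARTABLE_SERVICES` set, and the returned strings are assumptions chosen for illustration. The point is that the model can only call functions that were explicitly registered, and that the rare remediation action is itself restricted to named services.

```python
from typing import Callable, Dict

# Hypothetical allow-list of services the AI may restart.
RESTARTABLE_SERVICES = {"nginx"}

# Registry of the only functions exposed to the model.
TOOLS: Dict[str, Callable[..., str]] = {}


def tool(func: Callable[..., str]) -> Callable[..., str]:
    # Register a function as an exposed, pre-validated tool.
    TOOLS[func.__name__] = func
    return func


@tool
def read_disk_usage(host: str) -> str:
    # Read-only placeholder: a real tool would query the host.
    return f"disk usage requested for {host}"


@tool
def restart_service(host: str, service: str) -> str:
    # Remediation is limited to explicitly allowed services.
    if service not in RESTARTABLE_SERVICES:
        raise PermissionError(f"restart of {service!r} is not allowed")
    return f"restart of {service} requested on {host}"


def call_tool(name: str, **kwargs: str) -> str:
    # Anything that was not registered is refused; there is no shell fallback.
    if name not in TOOLS:
        raise PermissionError(f"tool {name!r} is not exposed")
    return TOOLS[name](**kwargs)


if __name__ == "__main__":
    print(call_tool("read_disk_usage", host="web-01"))
    print(call_tool("restart_service", host="web-01", service="nginx"))
```

In this sketch, a request for an unregistered tool such as `run_shell`, or a restart of a service outside the allow-list, raises `PermissionError` instead of reaching the server.
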
The alliance between our SREs and artificial intelligence operates under total human control.
Measurable Results for Our Teams and Clients
The deployment of nAIghts Watch has had a spectacular impact on our on-call operations:
| Incident type | Processing time before | Processing time with nAIghts Watch |
|---|---|---|
| Autonomous resolution (Level 1) | 60 minutes | 3 to 7 minutes |
| Assisted resolution (Human + AI) | 60 minutes | 23 minutes |
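
For readers who prefer relative figures, the durations in the table can be converted into percentage reductions. The helper below is only an arithmetic sketch; `reduction_percent` is a name introduced here, and the inputs are the minute values from the table.

```python
def reduction_percent(before_min: float, after_min: float) -> float:
    # Relative time saved, as a percentage of the original duration.
    return round((before_min - after_min) / before_min * 100, 1)


if __name__ == "__main__":
    # Autonomous resolution: 60 minutes down to between 7 and 3 minutes.
    print(reduction_percent(60, 7))   # 88.3
    print(reduction_percent(60, 3))   # 95.0
    # Assisted resolution: 60 minutes down to 23 minutes.
    print(reduction_percent(60, 23))  # 61.7
```

In other words, autonomous resolution cuts processing time by roughly 88% to 95%, and assisted resolution by roughly 62%.
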
The benefits observed at Iguana Solutions:
- 80% reduction in unnecessary nighttime alerts.
- 350 hours saved per month on processing repetitive alerts.
- Premium documentation: every incident, even minor ones, now generates a comprehensive report for our clients.
Contrary to what one might assume, our goal at Iguana Solutions is not to eliminate jobs (in fact, we are actively hiring!). Our goal is to eliminate tedious work. Instead of being exhausted by 3:00 AM alerts, our engineers dedicate their energy to innovation, architecture, and supporting our clients.
Go Further with Iguana Solutions
Curious to see how AI can redefine your production standards?
- 📺 Relive the full Devoxx France talk: discover the technical details of our implementation and our exclusive benchmarks by watching the presentation video on YouTube. Watch on YouTube →
- ⚙️ Discover the technical solution: dive into our AIOps capabilities on the dedicated nAIghts Watch page. Explore nAIghts Watch →
- 🚀 Transform your on-call shifts: want to know how this sovereign AI can integrate into your information system, or are you looking to join a cutting-edge technical team? Contact our Iguana Solutions experts →