On August 1, 2023, between 2:00 PM and 4:30 PM (UTC), one of your critical web applications experienced service degradation and was intermittently unresponsive, impacting customer access to the platform. The incident was triggered by a surge in traffic, which increased server load and reduced application responsiveness.
2:00 PM: DevOps engineers log in to Rakuten SixthSense and open the APM dashboard.
2:01 PM: They quickly identify a sharp increase in incoming HTTP requests and suspect a traffic surge.
2:05 PM: The DevOps engineers also identify a significant increase in HTTP 4XX error rates.
2:15 PM: They quickly check the error traces from the dashboard.
2:16 PM: Database query performance is presented in the same frame, alongside the application metrics.
2:17 PM: The application-database trace reveals an unexpected query pattern: a slow query that is causing database contention.
2:30 PM: The root cause is identified as an inefficient query introduced in the recent code deployment.
3:00 PM: The problematic query is optimized and applied as an emergency patch to the production environment.
3:05 PM: The engineers check performance again in the dashboard.
From a single Rakuten SixthSense dashboard, the SRE team is able to quickly validate the application metrics, trace the errors, and, from the span information, reach the database metrics that lead to the inefficient database query.
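To make the drill-down concrete, the following is a minimal Python sketch of the same correlation: start from the traces that contain an HTTP error, then look at the database spans inside those traces for slow queries. It does not use or represent the Rakuten SixthSense API. The `Span` record, its field names, the `slow_ms` threshold, and the sample query text are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Span:
    """Hypothetical, simplified span record (not a SixthSense data model)."""
    trace_id: str
    kind: str            # "http" or "db" (illustrative labels)
    name: str            # route for HTTP spans, statement for DB spans
    duration_ms: float
    status_code: int = 0  # HTTP status for "http" spans, 0 otherwise


def slow_queries_behind_errors(spans, slow_ms=500.0):
    """Group slow DB span durations by query, limited to traces with an HTTP error."""
    # Step 1: collect the traces that contain at least one HTTP 4XX/5XX span.
    error_traces = {
        s.trace_id for s in spans if s.kind == "http" and s.status_code >= 400
    }
    # Step 2: within those traces, keep DB spans at or above the threshold.
    slow = defaultdict(list)
    for s in spans:
        if s.kind == "db" and s.trace_id in error_traces and s.duration_ms >= slow_ms:
            slow[s.name].append(s.duration_ms)
    return dict(slow)


# Made-up sample data: two failing requests share the same slow query.
spans = [
    Span("t1", "http", "GET /orders", 2300.0, 429),
    Span("t1", "db", "SELECT * FROM orders WHERE status = ?", 2100.0),
    Span("t2", "http", "GET /health", 12.0, 200),
    Span("t2", "db", "SELECT 1", 1.0),
    Span("t3", "http", "GET /orders", 1900.0, 408),
    Span("t3", "db", "SELECT * FROM orders WHERE status = ?", 1750.0),
]

result = slow_queries_behind_errors(spans)
for query, durations in result.items():
    print(f"{query}: {len(durations)} slow executions, max {max(durations):.0f} ms")
```

In this sketch the healthy trace `t2` is ignored, and the one query shared by the failing traces surfaces as the candidate root cause, which mirrors the step from error traces to the slow query in the timeline above.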
Rakuten SixthSense's powerful distributed and method-level tracing capabilities have redefined the management of distributed systems. The technology, akin to a GPS for your requests, dramatically reduces RCA times, minimizes costs, and bolsters system reliability. Indeed, SixthSense has emerged as the backbone of observability tools in the ever-evolving landscape of distributed systems.