Resolving a performance bottleneck through service throttling

Written by Anees Noor | Sep 22, 2025 9:00:00 AM

In complex system architectures, performance bottlenecks often occur in unexpected places. This case study documents the analysis and resolution of significant slowdowns in the critical login service, AuthService. The root cause was not an error in the service itself but a seemingly independent internal service that triggered a cascade effect by sharing a database resource. The following report describes the methodical approach used to identify the problem, the strategy used to validate the hypothesis, and the implementation of an effective solution through targeted rate limiting.


Slowdowns were observed in the system's login service, AuthService, during routine operation. Further investigation traced the problem to an internal service called ActivityQueryService, which employees frequently use to retrieve customer activity logs.

Although both services function independently at the application level, they share a backend resource: the audit database.

With increasing concurrent use of ActivityQueryService, a cascading effect on system performance was triggered, noticeably degrading login response times in particular. Although ActivityQueryService itself remained functional, its behavior under load affected more critical workflows such as user authentication.


Symptoms

  • Audit database connection pool exhaustion: The DB connection pool reached its limit during peak load and blocked incoming requests (see the sketch after this list).

  • Increased 500/503/504 errors: A significant increase in these errors was observed during peak loads.

  • Cross-service performance degradation: Login flows, although not directly dependent on ActivityQueryService, were affected by noticeable timeouts and slowdowns.
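
For illustration, the sketch below shows how a capped connection pool shared by several services can turn one service's load into another service's timeouts. The SQLAlchemy-style engine, the pool settings, the connection string and the query are assumptions made for the example and do not reflect the system's actual configuration.

```python
from sqlalchemy import create_engine, text

# One engine (and therefore one connection pool) shared by several services.
audit_engine = create_engine(
    "postgresql://user:pass@audit-db/audit",  # hypothetical DSN
    pool_size=20,       # hard cap on pooled connections
    max_overflow=0,     # no burst connections beyond the cap
    pool_timeout=5,     # seconds a caller waits for a free connection
)

def fetch_activity_logs(customer_id: int):
    # Every concurrent call holds a pooled connection for the duration of
    # the query; with enough of them in flight, unrelated queries from other
    # services start waiting on pool_timeout and eventually fail.
    with audit_engine.connect() as conn:
        return conn.execute(
            text("SELECT * FROM activity_log WHERE customer_id = :cid"),
            {"cid": customer_id},
        ).fetchall()
```

With max_overflow=0 the cap is strict, which makes the contention between services easy to reproduce in a test environment.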


Validation strategy

To validate and isolate the issue, a comprehensive series of load tests was designed and executed:

  • Scenarios before and after a fix (pre-fix vs. post-fix)

    • Isolated execution of AuthService

    • Isolated execution of ActivityQueryService

    • Simultaneous execution of both services

  • Monitored metrics

    • Average response time per API

    • Maximum and average number of open connections to the audit database

    • Error rates and types

    • Throughput (requests/sec)

All tests were conducted in a controlled environment to ensure repeatability. Metrics and results were tracked via monitoring dashboards and backend logs.
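
The following sketch illustrates what such a scenario can look like: it issues concurrent requests against an endpoint and reports average latency, error rate and throughput. The endpoint URLs, concurrency level and request counts are placeholders and not the actual test harness that was used.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

import requests

def hit(url: str) -> tuple[float, int]:
    """Issue one request and return (latency in seconds, status code)."""
    start = time.perf_counter()
    try:
        resp = requests.get(url, timeout=10)
        return time.perf_counter() - start, resp.status_code
    except requests.RequestException:
        return time.perf_counter() - start, 0  # network-level failure

def run_scenario(name: str, url: str, concurrency: int, total: int) -> None:
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(hit, [url] * total))
    elapsed = time.perf_counter() - start

    latencies = [lat for lat, _ in results]
    failures = [code for _, code in results if code == 0 or code >= 500]
    print(f"{name}: avg latency {statistics.mean(latencies):.3f}s, "
          f"error rate {len(failures) / len(results):.1%}, "
          f"throughput {len(results) / elapsed:.1f} req/s")

# Isolated scenarios; the combined scenario would run both request mixes
# against the system at the same time.
run_scenario("AuthService login", "https://test.internal/auth/login", 50, 500)
run_scenario("ActivityQueryService", "https://test.internal/activity/query", 50, 500)
```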


Solution: introduction of a limit of 10 concurrent requests

A fixed limit of 10 concurrent ActivityQueryService requests per node was implemented. Requests that exceeded this threshold were either queued or answered with controlled error responses (a minimal sketch of this pattern follows the list below).

This change ensured that:

  • Connections to the audit database remained available and were not monopolized by a single service.

  • Resource access for critical services such as login was guaranteed.

  • System-wide contention was proactively mitigated.
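
The sketch below outlines such a per-node concurrency limit, assuming a threaded request handler. The cap of 10 and the queue-or-reject behavior follow the description above; the names, the queueing window and the response shape are illustrative assumptions.

```python
import threading
from contextlib import contextmanager

MAX_CONCURRENT_ACTIVITY_QUERIES = 10   # per-node cap described above
QUEUE_WAIT_SECONDS = 2.0               # how long an excess request may queue

_slots = threading.BoundedSemaphore(MAX_CONCURRENT_ACTIVITY_QUERIES)

class Throttled(Exception):
    """Raised when no slot frees up within the queueing window."""

@contextmanager
def activity_query_slot():
    # acquire() with a timeout gives the "queue" behaviour: the request
    # waits briefly for a free slot instead of hitting the audit DB at once.
    if not _slots.acquire(timeout=QUEUE_WAIT_SECONDS):
        raise Throttled("ActivityQueryService is at capacity")
    try:
        yield
    finally:
        _slots.release()

def handle_activity_query(customer_id: int):
    try:
        with activity_query_slot():
            return query_audit_db(customer_id)   # hypothetical DB call
    except Throttled:
        # Controlled error instead of exhausting the shared connection pool.
        return {"status": 503, "error": "temporarily throttled, retry later"}

def query_audit_db(customer_id: int):
    ...   # placeholder for the actual audit-database query
```

Bounding the wait with a timeout keeps the queue short, so excess requests fail fast with a controlled error instead of piling up in front of the audit database.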


Results

  • The login response time was improved by 30%.

  • The use of the audit DB was stabilized.

    • The maximum number of connections used fell by 50%.

    • The average number of connections used fell by over 45%.

  • The error distribution showed an improvement.

    • 500/503 errors at login were reduced by over 60%.

    • A slight increase in ActivityQueryService errors was observed due to the introduced limit (expected behavior).


Trade-offs

Although the response times of ActivityQueryService almost doubled under load, this trade-off was deemed acceptable to ensure overall system stability. This increase in response time was an expected consequence of throttling and queuing, but did not affect any business-critical workflows.


Conclusions for developers & architects

  • Throttling is not the same as slowing down: Intelligent rate limiting helps to maintain core functionalities and avoid overload.

  • Shared backend resources can act as hidden dependencies: Understanding shared components is essential for a robust architecture and to avoid side effects.

  • Tests should reflect realistic user flows: Simulating concurrent, real-world usage scenarios yields more insight than testing each service in isolation.

  • Controlled errors are preferable to system-wide failures: Allowing non-essential processes to fail in a controlled manner maintains system integrity.

The key message of this case study is that sometimes the most effective way to accelerate system performance is to strategically slow down certain components.

