Slow queries are problematic because they can cause delays for individual users and lead to timeouts or degraded user experiences. These queries usually occur infrequently, and their total resource consumption is often relatively small. In certain cases, like batch processing jobs, a slow query might not cause any issues at all. However, in
interactive applications, where users expect a fast response, a query taking 10 seconds to execute is generally unacceptable.
Furthermore, in
high-concurrency environments, even infrequent slow queries can trigger system-wide issues. For example, a poorly written query running 5 times per day may not seem like a huge problem, but if it causes locking on an important table, it can lead to
max connection exhaustion and prevent other queries from executing. This domino effect can ultimately lead to:
- Connection exhaustion at the database: As queries pile up waiting for locks to clear, all available connections are consumed.
- Failure at other system layers: Web servers, app servers, and queue systems can also exhaust their worker/connection limits, triggering cascading failures.
- Auto-scaling limits: Even if the system is designed to auto-scale, it can only handle a limited amount of load. Moreover, auto-scaling may not react quickly enough to prevent failure, especially when the core issue is lock contention, not raw CPU load.
In such cases, a single slow query can cause significant issues in high-concurrency systems, and addressing them is critical to maintaining system stability.