We run a backend that gets hit by a lot of clients at the same time. At first, the system looked fine. Then traffic spiked and everything collapsed into 502s. Users were angry, services were healthy, and Aurora was the bottleneck.

The wall we hit

We scaled our ECS tasks to keep up:

We pushed Aurora’s max_connections up to ~7,000 and gave the database more resources. It helped for a moment, but it wasn’t a real fix. Every time we scaled the app, we hit the limit again. More connections meant more memory and more cost.

We needed pooling.

Why RDS Proxy looked perfect (but wasn’t)

RDS Proxy is built for exactly this problem: it smooths traffic spikes, reuses connections, and protects Aurora from connection churn. We enabled it expecting instant relief.

Instead, the connection count still climbed. We weren’t pooling as much as we should have. That led us to the root cause: pinned connections.

flowchart LR
  Clients --> ECS[ECS Tasks / Workers]
  ECS -->|many connections| Aurora[(Aurora)]
  ECS --> Proxy[RDS Proxy]
  Proxy --> Aurora
  ECS -. "session SET pins" .-> Proxy
  ECS -. "SET LOCAL in txn" .-> Proxy

The real culprit: session state

Our app enforces row-level security. For each request, we set session state like this:

SET my.tenant_id = '...';

That works for security, but RDS Proxy treats session-level state as a reason to pin the connection. Once pinned, the connection stays bound to that client session and can’t return to the pool until the session ends. In practice, every worker ends up hoarding its own connection, and the proxy can’t do its job.

This behavior is documented: any session state can cause a pin, which kills pooling.
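To make the effect of pinning concrete, here is a toy model in Python. It is only a sketch: `ToyProxy`, `run_transaction`, and `backends_needed` are hypothetical names invented for illustration, and the logic is a deliberate simplification, not how RDS Proxy is implemented. It also assumes clients run one after another, so an unpinned backend connection is always free for the next client.

```python
class ToyProxy:
    """Toy model of a pooling proxy. Illustrative only, not RDS Proxy's real logic."""

    def __init__(self):
        self.idle = []      # backend connections free for reuse
        self.pinned = {}    # client name -> backend connection held until disconnect
        self.opened = 0     # total backend connections opened so far

    def _checkout(self, client):
        # A pinned client always gets its own backend connection back.
        if client in self.pinned:
            return self.pinned[client]
        if self.idle:
            return self.idle.pop()
        self.opened += 1
        return f"backend-{self.opened}"

    def run_transaction(self, client, sets_session_state):
        backend = self._checkout(client)
        if sets_session_state:
            # Session-level state ties this backend to this client.
            self.pinned[client] = backend
        elif client not in self.pinned:
            # No session state: the backend goes back to the pool.
            self.idle.append(backend)

    def disconnect(self, client):
        # Ending the client session releases a pinned backend.
        backend = self.pinned.pop(client, None)
        if backend is not None:
            self.idle.append(backend)


def backends_needed(sets_session_state, clients=50):
    """Count backend connections opened for `clients` sequential workers."""
    proxy = ToyProxy()
    for i in range(clients):
        proxy.run_transaction(f"worker-{i}", sets_session_state)
    return proxy.opened
```

In this model, 50 workers that set session state end up holding 50 backend connections, while 50 workers that leave no session state behind share a single one.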

How Django’s database wrapper works (and why it mattered)

Django sits between your code and the database through its database wrapper. The wrapper manages when a connection is opened, when a transaction begins, and when it ends. That means it is the safest place to add tenant state without leaking it across requests.

In practice, the wrapper does this for each worker:

That “setup SQL” hook is where we set the tenant for row-level security. The key insight was simple: use transaction‑scoped state, not session‑scoped state.

The fix: move state into transactions

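A value set with SET LOCAL lasts only until the end of the current transaction. In autocommit mode there is no enclosing transaction block, so the setting never takes hold: PostgreSQL emits a warning and the value is not visible to the next statement.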

So we moved our RLS setup into the transaction boundary:

SET LOCAL my.tenant_id = '...';

Because the statement is scoped to the transaction, it doesn’t create long-lived session state. When the transaction ends, the state disappears, the connection is no longer pinned, and RDS Proxy can safely reuse it for the next request.
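The scoping difference between SET and SET LOCAL can be sketched with a small toy model in Python. This is an illustration under stated assumptions, not PostgreSQL’s implementation: `ToySession` and its methods are hypothetical names, and the model ignores details such as rollbacks and savepoints.

```python
class ToySession:
    """Toy model of setting scope in one database session. Illustrative only."""

    def __init__(self):
        self.session_settings = {}  # survive across transactions (plain SET)
        self.local_settings = {}    # discarded at transaction end (SET LOCAL)
        self.in_transaction = False

    def begin(self):
        self.in_transaction = True

    def commit(self):
        # Transaction-scoped values vanish when the transaction ends.
        self.local_settings.clear()
        self.in_transaction = False

    def set(self, name, value):
        # Plain SET: the value stays until the session ends.
        self.session_settings[name] = value

    def set_local(self, name, value):
        # Outside a transaction block the value does not stick.
        if self.in_transaction:
            self.local_settings[name] = value

    def current_setting(self, name):
        return self.local_settings.get(name, self.session_settings.get(name))

    def has_session_state(self):
        # Long-lived state is what makes a connection unsafe to share.
        return bool(self.session_settings)


session = ToySession()
session.begin()
session.set_local("my.tenant_id", "42")
inside = session.current_setting("my.tenant_id")  # "42" while the transaction is open
session.commit()
after = session.current_setting("my.tenant_id")   # None once it has ended
```

After the commit the session carries no leftover state, which is the property that lets a pooler hand the connection to another client.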

Implementation sketch (Django)

This is the pattern we used: set the tenant at the start of each transaction. Nothing needs to be cleared manually, because the setting expires when the transaction ends.

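from django.db import connection, transaction

def run_tenant_query(tenant_id, fn):
    """Run fn() inside a transaction whose RLS tenant is tenant_id."""
    with transaction.atomic():
        with connection.cursor() as cursor:
            # SET LOCAL expires at commit or rollback, so no cleanup is needed.
            # str() keeps non-string IDs (for example UUIDs) quoted as plain text.
            cursor.execute("SET LOCAL my.tenant_id = %s", [str(tenant_id)])
        # fn() must run its queries on this connection, inside this block.
        return fn()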

Make sure your request path enters a transaction before any queries run, so the SET LOCAL scope is correct.
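One way to enforce that rule is a small guard that refuses to run SET LOCAL outside a transaction. This is a hedged sketch rather than the code we shipped: `set_tenant_for_transaction` is a hypothetical helper, and it assumes the object passed in behaves like Django’s `django.db.connection`, which exposes an `in_atomic_block` attribute and a `cursor()` context manager.

```python
def set_tenant_for_transaction(connection, tenant_id):
    """Run SET LOCAL only when a transaction is already open.

    `connection` is assumed to behave like Django's `django.db.connection`.
    """
    if not connection.in_atomic_block:
        # Without an enclosing transaction the setting would be lost
        # before the next query, so fail loudly instead.
        raise RuntimeError(
            "SET LOCAL needs an open transaction; "
            "wrap the call in transaction.atomic()"
        )
    with connection.cursor() as cursor:
        cursor.execute("SET LOCAL my.tenant_id = %s", [str(tenant_id)])
```

Calling it as `set_tenant_for_transaction(connection, tenant_id)` at the top of a `transaction.atomic()` block turns a silent scoping mistake into an immediate error.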

sequenceDiagram
  participant Req as Request
  participant App as Django App
  participant Wrap as DB Wrapper
  participant Proxy as RDS Proxy
  participant DB as Aurora

  Req->>App: start request
  App->>Wrap: begin transaction
  Wrap->>Proxy: acquire pooled connection
  Wrap->>Proxy: SET LOCAL my.tenant_id
  Proxy->>DB: execute queries
  App->>Wrap: commit transaction
  Wrap->>Proxy: release connection

The result

After switching to SET LOCAL inside the Django wrapper:

Takeaway

RDS Proxy is powerful, but it can’t share a connection that carries session state. If your app sets state for security or tenancy, make sure it happens inside a transaction using SET LOCAL. The change is small, but the impact is massive.

References