GHSA-76RV-2R9V-C5M6
Vulnerability from github – Published: 2026-02-25 22:31 – Updated: 2026-02-25 22:31
Summary
All rate limit buckets for a single entity share the same DynamoDB partition key (namespace/ENTITY#{id}). A high-traffic entity can exceed DynamoDB's per-partition throughput limits (~1,000 WCU/sec), causing throttling that degrades service for that entity — and potentially co-located entities in the same partition.
Details
Each acquire() call performs a TransactWriteItems (or UpdateItem in speculative mode) against items sharing the same partition key. For cascade entities, this doubles to 2-4 writes per request (child + parent). At sustained rates above ~500 req/sec for a single entity, DynamoDB's adaptive capacity may not redistribute fast enough, causing ProvisionedThroughputExceededException.
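To make the write pattern concrete, here is a minimal sketch of a per-request debit against a single partition key, assuming a hypothetical table layout. The table name, attribute names, and helper function are illustrative, not zae-limiter's actual schema or code.

```python
# Illustration only: hypothetical table and attribute names, not zae-limiter's schema.
# Every bucket for one entity hangs off the same partition key, so every
# acquire() for that entity becomes a write to the same DynamoDB partition.
import boto3

dynamodb = boto3.client("dynamodb")

def debit_bucket(namespace: str, entity_id: str, resource: str, amount: int) -> None:
    pk = f"{namespace}/ENTITY#{entity_id}"  # single hot partition key per entity
    dynamodb.update_item(
        TableName="rate-limits",  # hypothetical table name
        Key={"PK": {"S": pk}, "SK": {"S": f"#BUCKET#{resource}"}},
        UpdateExpression="SET tokens = tokens - :amt",
        ConditionExpression="tokens >= :amt",
        ExpressionAttributeValues={":amt": {"N": str(amount)}},
    )
    # Past roughly 1,000 WCU/sec against this one key, DynamoDB throttles with
    # ProvisionedThroughputExceededException even if the table as a whole has headroom.
```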
The library has no built-in mitigation:
- No partition key sharding/salting
- No write coalescing or batching
- No client-side admission control before hitting DynamoDB (a caller-side stopgap is sketched after this list)
- RateLimiterUnavailable is raised but the caller has already been delayed
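As a caller-side stopgap for the missing admission control noted above, a process-local token bucket can shed excess traffic before it ever reaches DynamoDB. This is a minimal sketch and not part of zae-limiter; the gate rate, the surrounding names, and the commented acquire() call are assumptions.

```python
# Illustration only: a caller-side admission gate, not part of zae-limiter.
# A coarse local token bucket bounds the write rate any single process can
# send toward the hot DynamoDB partition.
import threading
import time

class LocalAdmission:
    def __init__(self, rate_per_sec: float, burst: float):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = burst
        self.updated = time.monotonic()
        self.lock = threading.Lock()

    def try_admit(self) -> bool:
        with self.lock:
            now = time.monotonic()
            self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
            self.updated = now
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False

# Hypothetical usage around the library's acquire() call:
# gate = LocalAdmission(rate_per_sec=200, burst=50)
# if gate.try_admit():
#     limiter.acquire(...)   # only now touch DynamoDB
# else:
#     reject_locally()       # fail fast, no DynamoDB write
```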
Impact
- Availability: High-traffic entities experience elevated latency and rejected requests beyond what their rate limits specify
- Fairness: Other entities sharing the same DynamoDB partition may experience collateral throttling
- Multi-tenant risk: In a shared LLM proxy scenario, one tenant's burst traffic could degrade service for others
Reproduction
1. Create an entity with high rate limits (e.g., 100,000 rpm)
2. Send sustained traffic at 1,000+ req/sec to a single entity (a load-generation sketch follows this list)
3. Observe the DynamoDB ThrottledRequests CloudWatch metric increasing
4. Observe acquire() latency spikes and RateLimiterUnavailable exceptions
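A load-generation sketch for the steps above. The acquire() signature and limiter handle are hypothetical stand-ins (the real zae-limiter API may differ); the point is sustained concurrent traffic against one entity while watching ThrottledRequests in CloudWatch.

```python
# Reproduction load sketch: sustained traffic against a single entity.
# The acquire() call below is a hypothetical stand-in, not zae-limiter's documented API.
import concurrent.futures
import time
from collections import Counter

ENTITY_ID = "tenant-hot"   # one entity with high limits (e.g., 100,000 rpm)
TARGET_RPS = 1000
DURATION_SEC = 120

def one_request(limiter) -> str:
    try:
        limiter.acquire(entity=ENTITY_ID, resource="tokens", amount=1)  # hypothetical call
        return "ok"
    except Exception as exc:  # e.g., RateLimiterUnavailable once DynamoDB starts throttling
        return type(exc).__name__

def run(limiter) -> Counter:
    outcomes: Counter = Counter()
    deadline = time.monotonic() + DURATION_SEC
    with concurrent.futures.ThreadPoolExecutor(max_workers=64) as pool:
        while time.monotonic() < deadline:
            batch = [pool.submit(one_request, limiter) for _ in range(TARGET_RPS // 10)]
            outcomes.update(f.result() for f in concurrent.futures.as_completed(batch))
            time.sleep(0.1)  # ~TARGET_RPS while DynamoDB keeps up; watch ThrottledRequests climb
    return outcomes
```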
Remediation Design: Pre-Shard Buckets
- Move buckets to PK={ns}/BUCKET#{entity}#{resource}#{shard}, SK=#STATE — one partition per (entity, resource, shard)
- Auto-inject wcu:1000 reserved limit on every bucket — tracks DynamoDB partition write pressure in-band (name may change during implementation)
- Shard doubling (1→2→4→8) triggered by client on wcu exhaustion or proactively by aggregator
- Shard 0 at suffix #0 is source of truth for shard_count. Aggregator propagates to other shards
- Original limits stored on bucket, effective limits derived: original / shard_count. Infrastructure limits (wcu) not divided
- Shard selection: random/round-robin. On application limit exhaustion, retry on another shard (max 2 retries); see the sketch after this list
- Lazy shard creation on first access
- Bucket discovery via GSI3 (KEYS_ONLY) + BatchGetItem. GSI2 for resource aggregation unchanged
- Cascade: parent unaware, protected by own wcu
- Aggregator: parse new PK format, key by shard_id, effective limits for refill, filter wcu from snapshots
- Clean break migration: schema version bump, old buckets ignored, new buckets created on first access
- $0.625/M preserved on hot path
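Read purely as an illustration of the design above (not the shipped implementation), the key scheme, effective-limit derivation, and shard-retry selection could look roughly like this; all function and field names are assumed.

```python
# Sketch of the pre-shard design; names are illustrative, not zae-limiter's code.
import random

def bucket_pk(ns: str, entity: str, resource: str, shard: int) -> str:
    # One partition per (entity, resource, shard).
    return f"{ns}/BUCKET#{entity}#{resource}#{shard}"

def effective_limit(original: int, shard_count: int, name: str) -> int:
    # Application limits are split across shards; the infrastructure limit ("wcu")
    # stays whole because each shard is its own partition.
    if name == "wcu":
        return original
    return original // shard_count

def acquire_with_shard_retry(try_acquire, shard_count: int, max_retries: int = 2) -> bool:
    # Random shard selection; on application-limit exhaustion, retry on another shard.
    shards = list(range(shard_count))
    random.shuffle(shards)
    for shard in shards[: max_retries + 1]:
        if try_acquire(shard):
            return True
    return False

# Example: 100,000 rpm split across 4 shards -> 25,000 rpm effective per shard,
# while the wcu:1000 reserved limit remains 1000 on every shard.
assert bucket_pk("prod", "tenant-42", "tokens", 0) == "prod/BUCKET#tenant-42#tokens#0"
assert effective_limit(100_000, 4, "rpm") == 25_000
assert effective_limit(1_000, 4, "wcu") == 1_000
```

Keeping the wcu limit undivided is what lets each shard track its own partition's write pressure independently.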
{
"affected": [
{
"database_specific": {
"last_known_affected_version_range": "\u003c= 0.10.0"
},
"package": {
"ecosystem": "PyPI",
"name": "zae-limiter"
},
"ranges": [
{
"events": [
{
"introduced": "0"
},
{
"fixed": "0.10.1"
}
],
"type": "ECOSYSTEM"
}
]
}
],
"aliases": [
"CVE-2026-27695"
],
"database_specific": {
"cwe_ids": [
"CWE-770"
],
"github_reviewed": true,
"github_reviewed_at": "2026-02-25T22:31:10Z",
"nvd_published_at": "2026-02-25T15:20:52Z",
"severity": "MODERATE"
},
"details": "## Summary\n\nAll rate limit buckets for a single entity share the same DynamoDB partition key (`namespace/ENTITY#{id}`). A high-traffic entity can exceed DynamoDB\u0027s per-partition throughput limits (~1,000 WCU/sec), causing throttling that degrades service for that entity \u2014 and potentially co-located entities in the same partition.\n\n## Details\n\nEach `acquire()` call performs a `TransactWriteItems` (or `UpdateItem` in speculative mode) against items sharing the same partition key. For cascade entities, this doubles to 2-4 writes per request (child + parent). At sustained rates above ~500 req/sec for a single entity, DynamoDB\u0027s adaptive capacity may not redistribute fast enough, causing `ProvisionedThroughputExceededException`.\n\nThe library has no built-in mitigation:\n- No partition key sharding/salting\n- No write coalescing or batching\n- No client-side admission control before hitting DynamoDB\n- `RateLimiterUnavailable` is raised but the caller has already been delayed\n\n## Impact\n\n- **Availability**: High-traffic entities experience elevated latency and rejected requests beyond what their rate limits specify\n- **Fairness**: Other entities sharing the same DynamoDB partition may experience collateral throttling\n- **Multi-tenant risk**: In a shared LLM proxy scenario, one tenant\u0027s burst traffic could degrade service for others\n\n## Reproduction\n\n1. Create an entity with high rate limits (e.g., 100,000 rpm)\n2. Send sustained traffic at 1,000+ req/sec to a single entity\n3. Observe DynamoDB `ThrottledRequests` CloudWatch metric increasing\n4. Observe `acquire()` latency spikes and `RateLimiterUnavailable` exceptions\n\n## Remediation Design: Pre-Shard Buckets\n\n- Move buckets to `PK={ns}/BUCKET#{entity}#{resource}#{shard}, SK=#STATE` \u2014 one partition per (entity, resource, shard)\n- Auto-inject `wcu:1000` reserved limit on every bucket \u2014 tracks DynamoDB partition write pressure in-band (name may change during implementation)\n- Shard doubling (1\u21922\u21924\u21928) triggered by client on `wcu` exhaustion or proactively by aggregator\n- Shard 0 at suffix `#0` is source of truth for `shard_count`. Aggregator propagates to other shards\n- Original limits stored on bucket, effective limits derived: `original / shard_count`. Infrastructure limits (`wcu`) not divided\n- Shard selection: random/round-robin. On application limit exhaustion, retry on another shard (max 2 retries)\n- Lazy shard creation on first access\n- Bucket discovery via GSI3 (KEYS_ONLY) + BatchGetItem. GSI2 for resource aggregation unchanged\n- Cascade: parent unaware, protected by own `wcu`\n- Aggregator: parse new PK format, key by shard_id, effective limits for refill, filter `wcu` from snapshots\n- Clean break migration: schema version bump, old buckets ignored, new buckets created on first access\n- **$0.625/M preserved on hot path**",
"id": "GHSA-76rv-2r9v-c5m6",
"modified": "2026-02-25T22:31:10Z",
"published": "2026-02-25T22:31:10Z",
"references": [
{
"type": "WEB",
"url": "https://github.com/zeroae/zae-limiter/security/advisories/GHSA-76rv-2r9v-c5m6"
},
{
"type": "ADVISORY",
"url": "https://nvd.nist.gov/vuln/detail/CVE-2026-27695"
},
{
"type": "WEB",
"url": "https://github.com/zeroae/zae-limiter/commit/481ce44d818d66e31d8837bc48519660ce4c267f"
},
{
"type": "PACKAGE",
"url": "https://github.com/zeroae/zae-limiter"
},
{
"type": "WEB",
"url": "https://github.com/zeroae/zae-limiter/releases/tag/v0.10.1"
}
],
"schema_version": "1.4.0",
"severity": [
{
"score": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:L",
"type": "CVSS_V3"
}
],
"summary": "zae-limiter: DynamoDB hot partition throttling enables per-entity Denial of Service"
}