Rate limits & quotas
The API enforces two distinct budgets. Rate limits protect the platform from bursty clients and reset every minute. Quotas are standing allowances tied to your plan, chiefly the monthly action allowance. Both surface the same way: response headers you can watch, and a 429 with Retry-After when a budget is exhausted.
During early access, limits are deliberately generous and are not enforced against normal integration work. Final limits for your workload are agreed during onboarding, and anything you negotiate there supersedes the defaults on this page. If a default below is tight for your use case, say so; raising it is configuration, not engineering.
Rate limit headers
Every response includes the state of the rate window the request was counted against:
| Header | Meaning |
|---|---|
| X-RateLimit-Limit | The size of the window for this route class, in requests per minute. |
| X-RateLimit-Remaining | Requests left in the current window, after this one. |
| X-RateLimit-Reset | Unix timestamp (seconds) at which the window resets. |
| Retry-After | Sent only with 429 and lock-related 409 responses: seconds to wait before retrying. |
HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 994
X-RateLimit-Reset: 1782172860
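A client usually wants these values as numbers rather than raw header lines. The sketch below assumes the response headers were saved to a file, for example with curl -D as in the retry loop later on this page. The function names header_value and seconds_until_reset are illustrative and not part of any Fibric tooling.

```shell
# Print the value of one numeric response header from a saved header dump.
# Names are matched case-insensitively; spaces and the trailing CR are stripped,
# so this is only suitable for single-token values such as the rate-limit headers.
header_value() {
  # $1 = header file, $2 = header name
  grep -i "^$2:" "$1" | tail -n 1 | cut -d: -f2- | tr -d ' \r'
}

# Seconds until the window resets, never negative.
seconds_until_reset() {
  # $1 = header file, $2 = current Unix time, e.g. $(date +%s)
  reset=$(header_value "$1" "X-RateLimit-Reset")
  delta=$((reset - $2))
  [ "$delta" -lt 0 ] && delta=0
  echo "$delta"
}
```

A typical call would be seconds_until_reset headers.txt "$(date +%s)" right after a request made with -D headers.txt.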
How limits are scoped
All rate limits and quotas apply per tenant, not per API key. Splitting traffic across keys does not raise a budget; the keys share the tenant's windows. This is deliberate: the limits protect the executor and the source systems behind your connectors, and those do not care how many keys the requests arrived on.
- Two services using two keys against one tenant draw from the same buckets. Budget for their combined rate.
- Resellers operating multiple tenants get an independent budget per tenant; one noisy tenant cannot starve another. This is the same wall that isolates data, applied to capacity.
- Route classes are independent. Exhausting the events write budget has no effect on plan approvals or receipt reads.
To see which bucket a request was counted against, read the headers on the response itself; the limit value identifies the class.
curl -sI "https://api.fibric.io/v1/events?limit=1" \
-H "Authorization: Bearer sk_live_3f9c2a7b8e1d4f60a2c9" | grep -i "x-ratelimit"
# X-RateLimit-Limit: 1000
# X-RateLimit-Remaining: 997
# X-RateLimit-Reset: 1782172860
Default limits by endpoint group
Limits apply per tenant, per route class, on a one-minute sliding window. Reads and writes are budgeted separately, so a listing loop can never starve your ingest and vice versa.
| Endpoint group | Reads / min | Writes / min | Notes |
|---|---|---|---|
| Events | 1,000 | 600 | Ingest (POST /v1/events) has its own 600/min budget, separate from all other writes. Batch upstream if you sustain more. |
| Operators | 1,000 | 120 | Create, update, pause, resume share the write budget. |
| Connectors | 1,000 | 120 | POST /v1/connectors/:id/test counts as a write; it calls the source system. |
| Actions & plans | 1,000 | 120 | Approve, veto, and undo share the write budget. Executed actions are metered by the action allowance, not the rate limit. |
| Receipts | 1,000 | 60 | Writes here are export-job creation only, further capped by the concurrent export limit. |
All read routes across the API share one additional global ceiling of 2,000 reads per minute per tenant, so five saturated groups top out at 2,000 reads per minute in aggregate rather than the 5,000 their individual budgets would otherwise allow.
Burst behavior
Windows are enforced with a token bucket, not a hard per-second gate. Each bucket refills continuously at the per-minute rate and holds a burst reserve of 2× the per-minute limit. In practice:
- A quiet client can burst up to twice the listed rate for a short stretch, for example draining a backlog of 1,200 events in one push, and is throttled only if the sustained rate stays above the refill rate.
- A client that runs flat-out at the limit has no reserve; any spike above the line returns 429 immediately. X-RateLimit-Remaining reflects the bucket, so it can read above the nominal limit right after a quiet period. Treat the headers, not the table, as the live truth.
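The bucket arithmetic can be sketched with integer math. This is an illustrative model only: it assumes continuous refill at limit/60 tokens per second and a capacity of 2× the limit, as described above, and it treats a push as all-or-nothing. The real accounting may differ in detail, and the function name bucket_step is hypothetical.

```shell
# Model one step of a token bucket: refill for some seconds, then try to send a push.
# Prints the tokens left afterwards, or "429" if the bucket cannot cover the push.
bucket_step() {
  # $1 = tokens before, $2 = elapsed seconds, $3 = per-minute limit, $4 = requests to send
  capacity=$(( $3 * 2 ))              # burst reserve: 2x the per-minute limit
  tokens=$(( $1 + $2 * $3 / 60 ))     # continuous refill at limit/60 per second
  [ "$tokens" -gt "$capacity" ] && tokens=$capacity
  if [ "$4" -gt "$tokens" ]; then
    echo "429"
  else
    echo $(( tokens - $4 ))
  fi
}
```

Under this model, a full ingest bucket at the 600/min default holds 1,200 tokens, so bucket_step 1200 0 600 1200 drains it to zero. One second later only 10 tokens have refilled, so an eleven-request push in that second is refused.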
Handling 429
A rate-limited response carries the standard error envelope with code rate_limited and a Retry-After header:
HTTP/1.1 429 Too Many Requests
Retry-After: 12
X-RateLimit-Limit: 600
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1782172872
{
"error": {
"type": "rate_limit_error",
"code": "rate_limited",
"message": "Event ingest is limited to 600 requests per minute for this tenant.",
"doc_url": "https://fibric.io/docs/limits#handling-429",
"request_id": "req_c418e97d"
}
}
Handling rules, in order of importance:
- Honor Retry-After exactly. It is computed from the actual bucket state; sleeping longer wastes throughput, sleeping less guarantees another 429.
- Retry with the same Idempotency-Key. A rate-limited request was never processed, so the retry is the first attempt, and the key protects you if a proxy disagrees.
- Add exponential backoff with jitter for repeated 429s. If two consecutive retries are limited, you are sustained over budget, not unlucky; double the wait each time and cap around 60 seconds.
- Shed proactively. When X-RateLimit-Remaining drops below ~10% of X-RateLimit-Limit, slow the producer instead of waiting for the wall.
The official SDKs implement all four behaviors by default; hand-rolled clients should reproduce them. The reference loop, in shell form:
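factor=1  # multiplier for the wait; doubles after each consecutive 429
while true; do
  status=$(curl -s -o resp.json -w "%{http_code}" -D headers.txt \
    -X POST https://api.fibric.io/v1/events \
    -H "Authorization: Bearer $FIBRIC_KEY" \
    -H "Content-Type: application/json" \
    -H "Idempotency-Key: magento:SO-10884:v7" \
    -d @envelope.json)
  [ "$status" != "429" ] && break
  # Retry-After is the floor for the wait; tr keeps only the digits, dropping the trailing CR.
  retry_after=$(grep -i "^retry-after:" headers.txt | tr -dc "0-9")
  backoff=$(( ${retry_after:-1} * factor ))
  [ "$backoff" -gt 60 ] && backoff=60              # cap around 60 seconds
  [ "$factor" -lt 64 ] && factor=$((factor * 2))   # exponential, not linear
  sleep $((backoff + RANDOM % 3))                  # jitter (RANDOM requires bash)
done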
The same Idempotency-Key rides every attempt, so even a retry that races a proxy replay cannot ingest the event twice.
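The loop above only reacts once a 429 has arrived. The fourth rule, proactive shedding, can be sketched as a check the producer runs on the headers of each successful response. The 10% threshold comes from the rule above; the function name should_shed and the one-second pause in the usage comment are assumptions for illustration.

```shell
# Succeeds (exit status 0) when fewer than ~10% of the window's requests remain.
should_shed() {
  # $1 = X-RateLimit-Remaining, $2 = X-RateLimit-Limit
  [ $(( $1 * 10 )) -lt "$2" ]
}

# Usage inside a producer loop (values read from the latest response headers):
#   if should_shed "$remaining" "$limit"; then sleep 1; fi
```

Integer math avoids floating point: remaining × 10 < limit is the same test as remaining < 10% of limit.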
Action allowance quotas
Executed actions, meaning side effects the executor actually applies with an ALLOW or approved ALERT disposition, are metered against your plan's monthly allowance. This is a quota, not a rate limit: it does not reset per minute, and it is what your bill is computed from.
| Plan | Included actions / month | Overage |
|---|---|---|
| Early access | Uncapped during the program | Free during early access |
| Team, $240/mo | Included allowance per your plan | $0.01 per additional action |
What counts, and what does not:
- Counted: each action that applies to a source system, including undos, which are compensating actions in their own right.
- Not counted: reads, event ingest, plan proposals, and refused work. BLOCK verdicts and DEDUP dispositions are receipted but free; you are never billed for something the kernel refused to do or already did once. When single-flight and idempotency collapse a flood (the 657-message pattern that motivated them), you pay for the one action that ran, not the 657 that were proposed.
Overage is on by default so operators never stall mid-incident; you can set a hard cap in the console instead, in which case actions beyond the cap return 429 with code quota_exceeded and the plans remain in proposed for approval after the cap is raised. Premium connectors (from $29/source/mo) and operator packs (from $49/operator/mo) are licensed separately and have no per-action component.
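As a worked example of the overage arithmetic: at $0.01 per additional action, the charge in cents equals the number of actions beyond the included allowance. The included allowance is plan-specific and not stated on this page, so the sketch takes it as a parameter. The function name and the figures in the note below are made-up inputs.

```shell
# Overage charge in cents for one month.
OVERAGE_CENTS_PER_ACTION=1   # $0.01 per additional action (Team plan rate from the table)

overage_cents() {
  # $1 = billable (applied) actions this month, $2 = included allowance for your plan
  extra=$(( $1 - $2 ))
  [ "$extra" -lt 0 ] && extra=0
  echo $(( extra * OVERAGE_CENTS_PER_ACTION ))
}
```

With a hypothetical allowance of 10,000 actions, overage_cents 10657 10000 gives 657 cents, or $6.57; any month at or under the allowance gives 0.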
Tracking usage
The allowance is metered from the receipt ledger, which means you can reconcile it yourself: every billable action is a receipt with outcome applied, and nothing else bills. To count the current month's billable actions:
curl -s "https://api.fibric.io/v1/receipts?outcome=applied&since=2026-07-01T00:00:00Z&limit=100" \
-H "Authorization: Bearer sk_live_3f9c2a7b8e1d4f60a2c9" | jq '.data | length'
Walk next_cursor for the full count, or use a receipt export for a month-end statement. The number you compute this way is the number you are billed for; there is no separate metering system to trust.
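Walking the cursor can be sketched as a loop over pages. The response fields .data and next_cursor are taken from this page, but the name of the query parameter that carries the cursor (cursor below) is an assumption, as are the function names; check the receipts reference before relying on it. The sketch needs jq and expects the key in $FIBRIC_KEY.

```shell
# Fetch one page of applied receipts.
# Assumption: the cursor is passed back as ?cursor=<value>.
fetch_receipts_page() {
  # $1 = since timestamp (ISO 8601), $2 = cursor (empty for the first page)
  url="https://api.fibric.io/v1/receipts?outcome=applied&since=$1&limit=100"
  [ -n "$2" ] && url="$url&cursor=$2"
  curl -s "$url" -H "Authorization: Bearer $FIBRIC_KEY"
}

# Sum the length of .data across pages until next_cursor is null or absent.
count_applied_receipts() {
  # $1 = since timestamp
  total=0
  cursor=""
  while :; do
    page=$(fetch_receipts_page "$1" "$cursor") || return 1
    total=$(( total + $(printf '%s' "$page" | jq '.data | length') ))
    cursor=$(printf '%s' "$page" | jq -r '.next_cursor // empty')
    [ -z "$cursor" ] && break
  done
  echo "$total"
}
```

Calling count_applied_receipts 2026-07-01T00:00:00Z would then print the month-to-date total. Each page is a read, so a long walk draws on the Receipts read budget.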
Other quotas
| Quota | Default | When exceeded |
|---|---|---|
| Concurrent receipt export jobs | 2 per tenant | 429 quota_exceeded; wait for a running job to reach complete or failed. |
| Event payload size | 256 KB | 400 invalid_request; put large artifacts in your own store and reference them from the payload. |
| Active operators | 25 per tenant | 429 quota_exceeded on create or resume; raised on request during onboarding. |
| Installed connectors | 25 per tenant | 429 quota_exceeded on install; raised on request during onboarding. |
| API keys | 20 per tenant | 429 quota_exceeded on key creation; revoke unused keys or ask for more. |
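Because rate limits and quotas both answer with 429, a client should branch on the error code rather than on the status alone. The sketch below maps the status and code pairs that appear on this page to a coarse next step. The labels retry, hold, and fix and the function name next_step are illustrative choices, not API values.

```shell
# Decide what to do with a failed response, from its HTTP status and error code.
# The code can be read from the error envelope with: jq -r '.error.code' resp.json
next_step() {
  # $1 = HTTP status, $2 = error.code from the response body
  case "$1:$2" in
    429:rate_limited)    echo "retry" ;;    # wait Retry-After, reuse the Idempotency-Key
    429:quota_exceeded)  echo "hold" ;;     # wait for capacity to free up, or have the quota raised
    400:invalid_request) echo "fix" ;;      # e.g. payload over 256 KB: change the request
    *)                   echo "unknown" ;;  # anything not covered on this page
  esac
}
```

For example, next_step 429 quota_exceeded prints hold: a timed retry would not help until an export job finishes, a key is revoked, or the quota is raised.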
Upstream source-system limits
Fibric's limits are not the only limits in the loop. Every connector acts against a source system with rate limits of its own: a commerce API, a ticketing API, a building-management gateway. The executor paces outbound tool calls to stay inside each source system's published limits, and this pacing is separate from, and often tighter than, anything on this page.
- When a source system throttles a tool call, the executor backs off and retries under the action's own idempotency_key; the action stays in flight rather than failing to your client. You see the delay, not an error.
- If the source system stays unavailable past the executor's retry budget, the action disposes as failed with ok: false and the upstream error in the action's error field, and the receipt records it. Direct API calls that depend on the source system, such as a connector test, surface it as 502 connector_upstream_error.
- Connector pacing is per connector instance. Two connectors to two store fronts of the same vendor pace independently.
The practical consequence: sizing your ingest rate against the tables above is necessary but not sufficient. If an operator can propose actions faster than the source system will accept them, the executor's single-flight queue absorbs the difference, and the receipt timestamps show the true applied rate.
How limits change
Limit changes follow the same discipline as the API surface. Raises take effect immediately and are not announced per tenant. Reductions to a default never apply retroactively to tenants already onboarded: your effective limits are the ones agreed at onboarding, and any reduction reaches you only with at least 30 days of notice and a migration path. The changelog records default changes; the headers on every response record your live values, which always win over this page.
The cheapest request is the one you do not send. Forward raw source events and let operators reason, rather than pre-chunking one change into many envelopes; use since filters instead of full re-lists; and let idempotency keys, not client-side bookkeeping, dedupe your retries.