Rate limits & quotas

The API enforces two distinct budgets. Rate limits protect the platform from bursty clients and reset every minute. Quotas are standing allowances tied to your plan, chiefly the monthly action allowance. Both surface the same way: response headers you can watch, and a 429 with Retry-After when a budget is exhausted.

Early access

During early access, limits are deliberately generous and are not enforced against normal integration work. Final limits for your workload are agreed during onboarding, and anything you negotiate there supersedes the defaults on this page. If a default below is tight for your use case, say so; raising it is configuration, not engineering.

Rate limit headers

Every response includes the state of the rate window the request was counted against:

Header	Meaning
`X-RateLimit-Limit`	The size of the window for this route class, in requests per minute.
`X-RateLimit-Remaining`	Requests left in the current window, after this one.
`X-RateLimit-Reset`	Unix timestamp (seconds) at which the window resets.
`Retry-After`	Sent only with `429` and lock-related `409` responses: seconds to wait before retrying.

http · rate headers on a normal response

HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 994
X-RateLimit-Reset: 1782172860
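
A client can turn these headers into a pacing decision. The following is a minimal sketch, assuming the response headers were saved to a file (for example with `curl -D`). The helper names `header_value` and `seconds_until_reset` are illustrative and not part of any Fibric tooling.

```bash
# header_value NAME FILE: print the value of an HTTP header (case-insensitive),
# with spaces and the trailing carriage return stripped.
header_value() {
  grep -i "^$1:" "$2" | head -n 1 | cut -d: -f2- | tr -d ' \r'
}

# seconds_until_reset FILE NOW: seconds from NOW (Unix time) until the window
# resets, never negative.
seconds_until_reset() {
  reset=$(header_value "X-RateLimit-Reset" "$1")
  delta=$((reset - $2))
  [ "$delta" -lt 0 ] && delta=0
  echo "$delta"
}

# Sample headers copied from the response above, written to a temp file.
hdr=$(mktemp)
printf 'HTTP/1.1 200 OK\r\nX-RateLimit-Limit: 1000\r\nX-RateLimit-Remaining: 994\r\nX-RateLimit-Reset: 1782172860\r\n' > "$hdr"

header_value "X-RateLimit-Remaining" "$hdr"   # prints 994
seconds_until_reset "$hdr" 1782172830         # prints 30
```

In a live client, pass `$(date +%s)` as the second argument instead of a fixed timestamp.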

How limits are scoped

All rate limits and quotas apply per tenant, not per API key. Splitting traffic across keys does not raise a budget; the keys share the tenant's windows. This is deliberate: the limits protect the executor and the source systems behind your connectors, and those do not care how many keys the requests arrived on.

Two services using two keys against one tenant draw from the same buckets. Budget for their combined rate.
Resellers operating multiple tenants get an independent budget per tenant; one noisy tenant cannot starve another. This is the same wall that isolates data, applied to capacity.
Route classes are independent. Exhausting the events write budget has no effect on plan approvals or receipt reads.

To see which bucket a request was counted against, read the headers on the response itself; the limit value identifies the class.

bash · inspect the budget without spending a write

curl -sI "https://api.fibric.io/v1/events?limit=1" \
  -H "Authorization: Bearer sk_live_3f9c2a7b8e1d4f60a2c9" | grep -i "x-ratelimit"
# X-RateLimit-Limit: 1000
# X-RateLimit-Remaining: 997
# X-RateLimit-Reset: 1782172860

Default limits by endpoint group

Limits apply per tenant, per route class, on a one-minute sliding window. Reads and writes are budgeted separately, so a listing loop can never starve your ingest and vice versa.

Endpoint group	Reads / min	Writes / min	Notes
Events	`1,000`	`600`	Ingest (`POST /v1/events`) has its own 600/min budget, separate from all other writes. Batch upstream if you sustain more.
Operators	`1,000`	`120`	Create, update, pause, resume share the write budget.
Connectors	`1,000`	`120`	`POST /v1/connectors/:id/test` counts as a write; it calls the source system.
Actions & plans	`1,000`	`120`	Approve, veto, and undo share the write budget. Executed actions are metered by the action allowance, not the rate limit.
Receipts	`1,000`	`60`	Writes here are export-job creation only, further capped by the concurrent export limit.

All read routes across the API also share a global ceiling of 2,000 reads per minute per tenant, so five saturated groups cannot compound into 5,000 reads per minute.
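
The interaction between the per-group budgets and the global ceiling is plain arithmetic. The sketch below uses the default values from the table and ignores burst reserve; the function name `effective_read_budget` is made up for illustration.

```bash
# effective_read_budget N: aggregate reads/min available when N endpoint
# groups are read flat-out, using the defaults documented above.
effective_read_budget() {
  per_group=1000        # reads/min per endpoint group (default)
  global_ceiling=2000   # reads/min per tenant across all read routes
  total=$(( $1 * per_group ))
  [ "$total" -gt "$global_ceiling" ] && total=$global_ceiling
  echo "$total"
}

effective_read_budget 1   # prints 1000
effective_read_budget 5   # prints 2000, not 5000
```

If your tenant negotiated different limits during onboarding, substitute those values; the headers remain the authoritative source.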

Burst behavior

Windows are enforced with a token bucket, not a hard per-second gate. Each bucket refills continuously at the per-minute rate and holds a burst reserve of 2× the per-minute limit. In practice:

A quiet client can burst up to twice the listed rate for a short stretch, for example draining a backlog of 1,200 events in one push, and is throttled only if the sustained rate stays above the refill rate.
A client that runs flat-out at the limit has no reserve; any spike above the line returns 429 immediately.
X-RateLimit-Remaining reflects the bucket, so it can read above the nominal limit right after a quiet period. Treat the headers, not the table, as the live truth.
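
The burst behavior can be reasoned about with a small client-side model. This is a sketch under stated assumptions, not the server's implementation: it treats the bucket capacity as 2× the per-minute limit and the refill as `limit / 60` tokens per second, using integer arithmetic. The function name `bucket_step` is hypothetical.

```bash
# bucket_step TOKENS ELAPSED REQUESTS LIMIT
# Refill the bucket for ELAPSED seconds, then attempt REQUESTS at once.
# Prints: accepted rejected tokens_left
bucket_step() {
  tokens=$1; elapsed=$2; requests=$3; limit=$4
  capacity=$((2 * limit))                     # assumed burst reserve: 2x the limit
  tokens=$((tokens + elapsed * limit / 60))   # refill at the per-minute rate
  [ "$tokens" -gt "$capacity" ] && tokens=$capacity
  accepted=$requests
  [ "$accepted" -gt "$tokens" ] && accepted=$tokens
  echo "$accepted $((requests - accepted)) $((tokens - accepted))"
}

# A quiet client with a full events-write bucket (600/min) pushes 1,200 at once.
bucket_step 1200 0 1200 600   # prints "1200 0 0": the whole backlog is accepted
# Ten seconds later it tries 200 more; only 100 tokens have refilled.
bucket_step 0 10 200 600      # prints "100 100 0": half are rejected
```

The model is useful for sizing a backlog drain ahead of time; at run time, `X-RateLimit-Remaining` is the value to trust.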

Handling 429

A rate-limited response carries the standard error envelope with code rate_limited and a Retry-After header:

http · 429 rate_limited

HTTP/1.1 429 Too Many Requests
Retry-After: 12
X-RateLimit-Limit: 600
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1782172872

{
  "error": {
    "type": "rate_limit_error",
    "code": "rate_limited",
    "message": "Event ingest is limited to 600 requests per minute for this tenant.",
    "doc_url": "https://fibric.io/docs/limits#handling-429",
    "request_id": "req_c418e97d"
  }
}

Handling rules, in order of importance:

1. Honor Retry-After exactly. It is computed from the actual bucket state; sleeping longer wastes throughput, sleeping less guarantees another 429.
2. Retry with the same Idempotency-Key. A rate-limited request was never processed, so the retry is the first attempt, and the key protects you if a proxy disagrees.
3. Add exponential backoff with jitter for repeated 429s. If two consecutive retries are limited, you are sustained over budget, not unlucky; double the wait each time and cap around 60 seconds.
4. Shed proactively. When X-RateLimit-Remaining drops below ~10% of X-RateLimit-Limit, slow the producer instead of waiting for the wall.

The official SDKs implement all four behaviors by default; hand-rolled clients should reproduce them. The reference loop, in shell form:

bash · retry loop honoring Retry-After with capped backoff

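attempt=0
while true; do
  status=$(curl -s -o resp.json -w "%{http_code}" -D headers.txt \
    -X POST https://api.fibric.io/v1/events \
    -H "Authorization: Bearer $FIBRIC_KEY" \
    -H "Content-Type: application/json" \
    -H "Idempotency-Key: magento:SO-10884:v7" \
    -d @envelope.json)
  [ "$status" != "429" ] && break
  # Seconds the server asked us to wait; fall back to 1 if the header is absent
  wait=$(grep -i "^retry-after:" headers.txt | tr -dc "0-9")
  wait=${wait:-1}
  # First retry honors Retry-After as sent; each repeat doubles the wait
  backoff=$((wait << attempt))
  cap=$((wait > 60 ? wait : 60))   # cap near 60 s, but never below Retry-After
  [ "$backoff" -gt "$cap" ] && backoff=$cap
  [ "$attempt" -lt 6 ] && attempt=$((attempt + 1))   # stop growing once past the cap
  sleep $((backoff + RANDOM % 3))   # jitter
done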

The same Idempotency-Key rides every attempt, so even a retry that races a proxy replay cannot ingest the event twice.
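
The retry loop covers the reactive rules; the shed-proactively rule needs a check on successful responses as well. A minimal sketch, assuming the caller has already read `X-RateLimit-Remaining` and `X-RateLimit-Limit` from the last response; `should_shed` is an illustrative name.

```bash
# should_shed REMAINING LIMIT: succeed (exit 0) when fewer than 10% of the
# budget remains, i.e. when the producer should slow down.
should_shed() {
  [ $(( $1 * 10 )) -lt "$2" ]
}

if should_shed 59 600; then echo "slow down"; else echo "ok"; fi     # prints "slow down"
if should_shed 994 1000; then echo "slow down"; else echo "ok"; fi   # prints "ok"
```

Multiplying by 10 instead of dividing keeps the comparison exact in integer arithmetic. Exactly 10% remaining is treated as not yet below the threshold.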

Action allowance quotas

Executed actions (side effects the executor actually applies with an ALLOW or approved ALERT disposition) are metered against your plan's monthly allowance. This is a quota, not a rate limit: it does not reset per minute, and it is what your bill is computed from.

Plan	Included actions / month	Overage
Early access	Uncapped during the program	Free during early access
Team, $240/mo	Included allowance per your plan	$0.01 per additional action

What counts, and what does not:

Counted: each action that applies to a source system, including undos, which are compensating actions in their own right.
Not counted: reads, event ingest, plan proposals, and refused work. BLOCK verdicts and DEDUP dispositions are receipted but free; you are never billed for something the kernel refused to do or already did once. When single-flight and idempotency collapse a flood (the 657-message pattern that motivated them), you pay for the one action that ran, not the 657 that were proposed.

Overage is on by default so operators never stall mid-incident. You can set a hard cap in the console instead; actions beyond the cap then return 429 with code quota_exceeded, and the affected plans remain in the proposed state, ready for approval once the cap is raised. Premium connectors (from $29/source/mo) and operator packs (from $49/operator/mo) are licensed separately and have no per-action component.
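
Overage on the Team plan is a one-line calculation. The sketch below works in whole cents to stay in integer arithmetic. The included allowance is not stated on this page, so the `10000` used in the example is a made-up figure for illustration only.

```bash
# overage_cents ACTIONS INCLUDED: overage charge in cents, at $0.01 per
# action beyond the included allowance.
overage_cents() {
  extra=$(( $1 - $2 ))
  [ "$extra" -lt 0 ] && extra=0
  echo "$extra"   # one cent per additional action
}

# 10000 is a hypothetical included allowance, not a documented value.
overage_cents 10657 10000   # prints 657, i.e. $6.57
overage_cents 9000 10000    # prints 0
```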

Tracking usage

The allowance is metered from the receipt ledger, which means you can reconcile it yourself: every billable action is a receipt with outcome applied, and nothing else bills. To count the current month's billable actions:

bash · count billable actions this month from receipts

curl -s "https://api.fibric.io/v1/receipts?outcome=applied&since=2026-07-01T00:00:00Z&limit=100" \
  -H "Authorization: Bearer sk_live_3f9c2a7b8e1d4f60a2c9" | jq '.data | length'

Walk next_cursor for the full count, or use a receipt export for a month-end statement. The number you compute this way is the number you are billed for; there is no separate metering system to trust.
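
Walking the cursor can be sketched as a loop that sums page lengths until no cursor comes back. Several details here are assumptions: that `next_cursor` sits at the top level of the response next to `data`, that it is null or absent on the last page, and that it is passed back as a `cursor` query parameter. Check the receipts endpoint reference before relying on any of them. The function names are illustrative.

```bash
# fetch_page SINCE CURSOR: print one page of applied receipts as JSON.
# The `cursor` query parameter name is an assumption.
fetch_page() {
  curl -s "https://api.fibric.io/v1/receipts?outcome=applied&since=$1&limit=100${2:+&cursor=$2}" \
    -H "Authorization: Bearer $FIBRIC_KEY"
}

# count_applied SINCE: total receipts with outcome=applied since SINCE,
# following next_cursor until the last page.
count_applied() {
  total=0
  cursor=""
  while :; do
    page=$(fetch_page "$1" "$cursor")
    n=$(printf '%s' "$page" | jq '.data | length')
    total=$(( total + ${n:-0} ))
    # Assumed location of the cursor; empty output ends the loop.
    cursor=$(printf '%s' "$page" | jq -r '.next_cursor // empty')
    [ -z "$cursor" ] && break
  done
  echo "$total"
}
```

Call it as `count_applied 2026-07-01T00:00:00Z`. Keeping `fetch_page` separate lets you swap in a stub and exercise the loop without touching the API.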

Other quotas

Quota	Default	When exceeded
Concurrent receipt export jobs	`2` per tenant	`429 quota_exceeded`; wait for a running job to reach `complete` or `failed`.
Event payload size	`256 KB`	`400 invalid_request`; put large artifacts in your own store and reference them from the payload.
Active operators	`25` per tenant	`429 quota_exceeded` on create or resume; raised on request during onboarding.
Installed connectors	`25` per tenant	`429 quota_exceeded` on install; raised on request during onboarding.
API keys	`20` per tenant	`429 quota_exceeded` on key creation; revoke unused keys or ask for more.
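
The payload size quota is the easiest one to check before sending. A minimal sketch, reading 256 KB as 256 × 1024 bytes; that interpretation is an assumption, so leave headroom if you are close to the line. `payload_ok` is an illustrative name and the sample envelope is invented.

```bash
# payload_ok FILE: succeed when FILE is within the event payload limit.
payload_ok() {
  max_bytes=$((256 * 1024))          # assumed meaning of "256 KB"
  size=$(( $(wc -c < "$1") ))        # arithmetic expansion strips wc's padding
  [ "$size" -le "$max_bytes" ]
}

# Invented sample envelope, for illustration only.
sample=$(mktemp)
printf '{"type":"order.updated"}' > "$sample"
if payload_ok "$sample"; then echo "within limit"; else echo "too large"; fi   # prints "within limit"
```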

Upstream source-system limits

Fibric's limits are not the only limits in the loop. Every connector acts against a source system with rate limits of its own: a commerce API, a ticketing API, a building-management gateway. The executor paces outbound tool calls to stay inside each source system's published limits, and this pacing is separate from, and often tighter than, anything on this page.

When a source system throttles a tool call, the executor backs off and retries under the action's own idempotency_key; the action stays in flight rather than failing to your client. You see the delay, not an error.
If the source system stays unavailable past the executor's retry budget, the action disposes as failed with ok: false and the upstream error in the action's error field, and the receipt records it. Direct API calls that depend on the source system, such as a connector test, surface it as 502 connector_upstream_error.
Connector pacing is per connector instance. Two connectors to two store fronts of the same vendor pace independently.

The practical consequence: sizing your ingest rate against the tables above is necessary but not sufficient. If an operator can propose actions faster than the source system will accept them, the executor's single-flight queue absorbs the difference, and the receipt timestamps show the true applied rate.

How limits change

Limit changes follow the same discipline as the API surface. Raises take effect immediately and are not announced per tenant. Reductions to a default never apply retroactively to tenants already onboarded: your effective limits are the ones agreed at onboarding, and any reduction reaches you only with at least 30 days of notice and a migration path. The changelog records default changes; the headers on every response record your live values, which always win over this page.

Design for the quota, not around it

The cheapest request is the one you do not send. Forward raw source events and let operators reason, rather than pre-chunking one change into many envelopes; use since filters instead of full re-lists; and let idempotency keys, not client-side bookkeeping, dedupe your retries.
