Onboarding flow

The onboarding flow has three actors and two server boundaries. Understanding both makes it much easier to debug.

| Actor | Role |
| --- | --- |
| Your platform | The Kirimdev customer running the integration. Creates the customer, generates the link, listens for webhooks. |
| End-customer | The tenant on your platform (e.g. an Acme Logistics branch). Opens the link and completes Meta Embedded Signup. |
| Kirimdev | The server hosting app.kirimdev.com/onboard/{token} and the /api/public/onboarding/* endpoints. |

| Boundary | Auth | Purpose |
| --- | --- | --- |
| Public API (/v1/customers/*) | Bearer API key | Your platform creates / lists / patches / revokes setup links here. |
| Public onboarding (/api/public/onboarding/*) | No auth header; bearer token in body | End-customer’s browser hits these. Token is the credential. |

Six phases, each independently testable. Skip to What runs where for the per-endpoint detail.

| # | Phase | Trigger | Result |
| --- | --- | --- | --- |
| 1 | Create customer | POST /v1/customers from your platform | Customer row in pending status. Fires customer.created. |
| 2 | Generate link | POST /v1/customers/{id}/setup_links from your platform | Setup link active. Plaintext token returned once. Fires customer.setup_link.created. |
| 3 | Deliver link | Your platform emails / SMSs / DMs the setup_url to the tenant | Tenant has the URL. |
| 4 | Resolve | Tenant opens the URL. Browser POSTs /api/public/onboarding/resolve | Token validated, CSRF nonce minted, Meta Facebook app config returned. |
| 5 | Meta Embedded Signup | Browser runs FB.login(config_id) | Tenant authorises Kirimdev’s Facebook app, browser receives an OAuth code. |
| 6 | Callback | Browser POSTs /api/public/onboarding/callback with { token, nonce, code } | Meta token exchange, WhatsApp account inserted, link consumed, customer flips pending → active. Fires customer.setup_link.consumed + customer.onboarded. Browser redirected to success_redirect_url. |
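
Phases 1 and 2 are the only ones your platform drives directly, so a short sketch may help. The paths, the Bearer API key, and the setup_url field come from the tables above; the base URL, the `name` body field, the empty setup-link body, and the top-level `id` / `setup_url` response fields are assumptions made for illustration.

```typescript
// Hypothetical base URL: the text above names only the paths.
const API_BASE = "https://api.example.invalid";

interface ApiRequest {
  method: "POST";
  url: string;
  headers: Record<string, string>;
  body: string;
}

function authHeaders(apiKey: string): Record<string, string> {
  return { Authorization: `Bearer ${apiKey}`, "Content-Type": "application/json" };
}

// Phase 1: POST /v1/customers. The `name` body field is an assumption.
function createCustomerRequest(apiKey: string, name: string): ApiRequest {
  return {
    method: "POST",
    url: `${API_BASE}/v1/customers`,
    headers: authHeaders(apiKey),
    body: JSON.stringify({ name }),
  };
}

// Phase 2: POST /v1/customers/{id}/setup_links. The empty body is an assumption.
function createSetupLinkRequest(apiKey: string, customerId: string): ApiRequest {
  return {
    method: "POST",
    url: `${API_BASE}/v1/customers/${encodeURIComponent(customerId)}/setup_links`,
    headers: authHeaders(apiKey),
    body: JSON.stringify({}),
  };
}

// Sends a prepared request and parses the JSON response.
async function send<T>(req: ApiRequest): Promise<T> {
  const res = await fetch(req.url, { method: req.method, headers: req.headers, body: req.body });
  if (!res.ok) throw new Error(`${req.method} ${req.url} failed with ${res.status}`);
  return (await res.json()) as T;
}

// Runs phases 1 and 2 and returns the URL to deliver in phase 3.
async function createCustomerAndLink(apiKey: string, name: string): Promise<string> {
  const customer = await send<{ id: string }>(createCustomerRequest(apiKey, name));
  const link = await send<{ setup_url: string }>(createSetupLinkRequest(apiKey, customer.id));
  return link.setup_url; // the plaintext token is returned once, so deliver it right away
}
```

Keeping request construction separate from sending makes each phase testable without network access.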

/api/public/onboarding/resolve is browser-driven and has two side effects: a rate-limit increment and a nonce mint.

  1. Per-token-prefix rate limit (Redis INCR + EXPIRE). Reject 429 if over 30 calls / 60s.
  2. Look up the token via token_prefix index, argon2-verify against token_hash.
  3. If the link is revoked / consumed / expired → return 410 with the reason.
  4. Mint a random 18-byte nonce, store in Redis under customer-onboard-nonce:<prefix> with 10-minute TTL.
  5. Return { customer: { id, name }, facebook: { appId, configId }, nonce, expires_at, success_redirect_url, failure_redirect_url }.
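
The two side effects in steps 1 and 4 can be sketched against an in-memory stand-in for Redis. This is an illustration under stated assumptions, not Kirimdev's implementation: the `FakeRedis` class, the `onboard-rate:` key name, and the base64url encoding of the nonce are hypothetical, while the 30 calls / 60 s limit, the 18-byte nonce, the `customer-onboard-nonce:<prefix>` key, and the 10-minute TTL come from the steps above.

```typescript
import { randomBytes } from "node:crypto";

// In-memory stand-in for the Redis commands the steps rely on.
class FakeRedis {
  private store = new Map<string, { value: string; expiresAt: number }>();

  get(key: string, now: number): string | undefined {
    const entry = this.store.get(key);
    if (entry && entry.expiresAt <= now) {
      this.store.delete(key); // lazily drop expired keys
      return undefined;
    }
    return entry?.value;
  }

  // SET with a TTL; overwrites any previous value.
  set(key: string, value: string, ttlMs: number, now: number): void {
    this.store.set(key, { value, expiresAt: now + ttlMs });
  }

  // INCR, with EXPIRE applied only when the key is created (fixed window).
  incr(key: string, ttlMs: number, now: number): number {
    const current = this.get(key, now);
    if (current === undefined) {
      this.set(key, "1", ttlMs, now);
      return 1;
    }
    const entry = this.store.get(key)!;
    entry.value = String(Number(current) + 1);
    return Number(entry.value);
  }
}

const RATE_LIMIT = 30;
const RATE_WINDOW_MS = 60_000;
const NONCE_TTL_MS = 10 * 60_000;

// Step 1: true while the prefix stays within 30 calls per 60 s; false maps to a 429.
function allowRequest(redis: FakeRedis, prefix: string, now: number): boolean {
  return redis.incr(`onboard-rate:${prefix}`, RATE_WINDOW_MS, now) <= RATE_LIMIT;
}

// Step 4: mint an 18-byte nonce; a later mint overwrites the earlier one.
function mintNonce(redis: FakeRedis, prefix: string, now: number): string {
  const nonce = randomBytes(18).toString("base64url");
  redis.set(`customer-onboard-nonce:${prefix}`, nonce, NONCE_TTL_MS, now);
  return nonce;
}
```

Passing `now` explicitly keeps the sketch deterministic; a real handler would let Redis own the clock.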

/api/public/onboarding/callback is also browser-driven, but it has many side effects: Meta calls, DB writes, webhook emits, and a queue enqueue.

  1. Per-token-prefix rate limit (same bucket as /resolve).
  2. Read the nonce from Redis and compare. Reject 400 invalid_nonce if missing or mismatched. Delete the nonce on first use.
  3. Resolve the token (same path as /resolve).
  4. Exchange Meta’s code for an access_token via Graph API.
  5. Resolve the WABA id (from debug_token) and the phone number (from /{waba}/phone_numbers).
  6. Run checkWhatsAppAccountLimit(team_id). Reject 403 if over quota.
  7. Probe checkCoexistenceStatus to detect SMB / WhatsApp Business App pairing.
  8. Insert whatsapp_accounts with customer_id set, credentials encrypted, status connecting.
  9. Atomically flip the link from active to consumed. If we lose the race to a parallel callback, delete the orphan WhatsApp account and return 409 link_already_consumed.
  10. Flip customers.status from pending to active if it was pending.
  11. Publish customer.setup_link.consumed and customer.onboarded webhook events.
  12. Enqueue the onboarding BullMQ job that subscribes the WABA to our app webhook and (if coexistence) kicks off the contact sync.
  13. Build the success_redirect_url with appended query params and return { accountId, customerId, status: 'pending', redirect_url }.

The browser then calls window.location.replace(redirect_url) to bounce the tenant back to your own app.
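
Seen from the browser, phases 4 to 6 form one short sequence. The sketch below injects its dependencies so it stays independent of the Facebook SDK and of any HTTP client. The `Deps` interface and the `{ token }` body for /resolve are assumptions; the endpoint paths, the `{ token, nonce, code }` callback body, and the `redirect_url` field come from the text above.

```typescript
interface ResolveResponse {
  facebook: { appId: string; configId: string };
  nonce: string;
}

interface CallbackResponse {
  redirect_url: string;
}

// Hypothetical seams: in a real page, `post` would wrap fetch, `login` would
// wrap the Embedded Signup popup and resolve with the OAuth code, and
// `redirect` would call window.location.replace.
interface Deps {
  post: (path: string, body: unknown) => Promise<unknown>;
  login: (configId: string) => Promise<string>;
  redirect: (url: string) => void;
}

async function runOnboarding(token: string, deps: Deps): Promise<void> {
  // Phase 4: validate the token and obtain the nonce plus the Meta config.
  const resolved = (await deps.post("/api/public/onboarding/resolve", { token })) as ResolveResponse;

  // Phase 5: the tenant authorises the app; the browser receives an OAuth code.
  const code = await deps.login(resolved.facebook.configId);

  // Phase 6: a single callback attempt, with no automatic retry.
  const done = (await deps.post("/api/public/onboarding/callback", {
    token,
    nonce: resolved.nonce,
    code,
  })) as CallbackResponse;

  deps.redirect(done.redirect_url);
}
```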

Why a sync handler instead of an async worker?

The callback is synchronous from /oauth/access_token through the DB writes because:

  • The customer’s browser is waiting for the response — they need an immediate redirect.
  • Meta access_token is short-lived; we can’t park the work and pick it up later.
  • The orphan-cleanup compensation only works inside the same handler.

The slow Meta calls (webhook subscription, contact sync, history sync) are handed off to the onboarding queue so the callback typically returns in 2-3 seconds.
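
The orphan-cleanup compensation is easiest to see next to the atomic link flip it depends on. The following is a minimal in-memory sketch, not the actual handler: the `Store` shape and function names are hypothetical, and two sequential calls stand in for two parallel callbacks.

```typescript
type LinkStatus = "active" | "consumed";

interface Store {
  links: Map<string, LinkStatus>;
  accounts: Map<string, { customerId: string }>;
}

// Compare-and-set: succeeds for exactly one caller. In SQL this is typically a
// conditional UPDATE (SET status = 'consumed' WHERE status = 'active')
// followed by a check of the affected row count.
function consumeLink(store: Store, linkId: string): boolean {
  if (store.links.get(linkId) !== "active") return false;
  store.links.set(linkId, "consumed");
  return true;
}

type CallbackResult =
  | { status: 200; accountId: string }
  | { status: 409; error: "link_already_consumed" };

function finishCallback(
  store: Store,
  linkId: string,
  customerId: string,
  accountId: string,
): CallbackResult {
  // Insert the WhatsApp account first, as in the callback steps.
  store.accounts.set(accountId, { customerId });

  // Try to win the link; the loser compensates by deleting its own insert.
  if (!consumeLink(store, linkId)) {
    store.accounts.delete(accountId);
    return { status: 409, error: "link_already_consumed" };
  }
  return { status: 200, accountId };
}
```

Because the insert and its compensating delete happen in the same function, the loser never leaves a stray account row behind, which is the property an async worker would make harder to guarantee.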

The two relevant events fire after the synchronous DB writes commit but before the slow worker job runs:

| Event | Carries | Use it for |
| --- | --- | --- |
| customer.setup_link.consumed | customer_id, setup_link, account_id | Audit trail / link analytics |
| customer.onboarded | customer_id, account_id, phone_number_id, phone_number | Business signal. Update your CRM, surface the new number in your UI, kick off welcome automation. |

Both events are delivered before the contact sync completes. The account is status: 'connecting' at the moment the webhook fires; it transitions to connected once the worker finishes (typically a few seconds later). If you send messages immediately on receipt, retry with backoff until the account is connected.
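
A capped exponential backoff is one way to wait for the account to become connected. In this sketch the `getStatus` callback is hypothetical (the text does not say how your platform reads the account status), and the 500 ms base, 8 s cap, and five attempts are arbitrary illustrative values.

```typescript
type AccountStatus = "connecting" | "connected";

// Delay before each retry: base, 2x base, 4x base, ... capped at capMs.
function backoffDelays(maxAttempts: number, baseMs = 500, capMs = 8000): number[] {
  return Array.from({ length: maxAttempts }, (_, i) => Math.min(capMs, baseMs * 2 ** i));
}

// Polls until the account is connected or the delay schedule runs out.
// Returns true when it is safe to start sending.
async function waitUntilConnected(
  accountId: string,
  getStatus: (id: string) => Promise<AccountStatus>,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
  delays: number[] = backoffDelays(5),
): Promise<boolean> {
  for (const delay of delays) {
    if ((await getStatus(accountId)) === "connected") return true;
    await sleep(delay);
  }
  // One last check after the final wait.
  return (await getStatus(accountId)) === "connected";
}
```

With the defaults, `backoffDelays(5)` yields 500, 1000, 2000, 4000, and 8000 ms, which comfortably covers a worker that typically finishes within a few seconds.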

| Symptom | Cause | What we return |
| --- | --- | --- |
| 410 at /resolve | Token revoked / consumed / expired | { error } where error is 'revoked', 'consumed', or 'expired'. Browser shows inline error. No redirect. |
| 400 invalid_nonce | Browser lost / never received nonce, or tried to replay | Inline error. No redirect. |
| 400 token_exchange_failed | Meta rejected the OAuth code | { error, redirect_url? }. Redirect to failure_redirect_url if configured. |
| 400 no_waba_found / phone_lookup_failed / no_phone_found | The granted scope didn’t include a WABA, or the WABA has no phone numbers yet | Same shape. Often resolved by retrying Embedded Signup with the right scopes. |
| 403 account_limit_reached | Your org has hit its WhatsApp account quota | { error, current, limit, redirect_url? }. Upgrade your plan. |
| 409 link_already_consumed | A parallel callback won the race | Orphan WA account is rolled back. Generate a new link. |

On every error after the token resolves, the callback returns the failure_redirect_url (if configured) so the tenant doesn’t see a dead-end Kirimdev page. See Redirect URLs for the query param contract.
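
On the browser side, the contract above reduces to one rule: follow redirect_url when the error response carries one, otherwise show the error inline. A small sketch, where the `NextStep` type and the message strings are hypothetical UI choices rather than part of the API:

```typescript
interface OnboardingError {
  status: number;
  error: string;
  redirect_url?: string;
}

type NextStep =
  | { kind: "redirect"; url: string }
  | { kind: "inline"; message: string };

// Hypothetical UI copy keyed by the error codes listed in the table above.
const MESSAGES: Record<string, string> = {
  invalid_nonce: "This session is no longer valid. Reopen the setup link.",
  link_already_consumed: "This link has already been used. Ask for a new one.",
};

function nextStepFor(err: OnboardingError): NextStep {
  // Errors raised after the token resolves may carry the failure redirect.
  if (err.redirect_url) return { kind: "redirect", url: err.redirect_url };
  // Everything else (for example 410 at /resolve or invalid_nonce) stays inline.
  return { kind: "inline", message: MESSAGES[err.error] ?? `Onboarding failed: ${err.error}` };
}
```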

| Race | Outcome |
| --- | --- |
| Two browsers open the same link, both complete signup | First-committer wins. Loser sees 409 link_already_consumed; the orphan WhatsApp account row is deleted. |
| Browser retries /callback with the same code | Meta returns a fresh access token for the same code (within their grace window). The DB writes are idempotent on link status, but the WhatsApp account insert is NOT idempotent: a re-run produces a second account row. Mitigation: clients should not retry; the spec is “exactly once”. |
| /resolve called many times | Safe. Only side effects are the rate-limit increment and a new nonce overwriting the previous. |
  • Customers — what gets created server-side.
  • Setup links — token format and lifecycle.
  • Redirect URLs — query param contract on the success/failure bounce.
  • Webhooks — payload shapes for the two onboarding events.