How to choose an identity provider: The engineering team's evaluation framework

A practical IdP evaluation framework built from real enterprise requirements. Covers protocol depth, migration, multi-tenancy, AI-readiness, and the criteria most checklists miss.

Yijun
Developer

Most identity provider comparison articles are written by identity providers. Shocking, right? They list features their product has, skip the ones it doesn't, and call it an "objective guide."

This isn't that.

We've reviewed dozens of real enterprise evaluation requests — the actual spreadsheets and RFP documents that procurement teams send to vendors. The patterns are clear: engineering teams consistently underweight the criteria that matter most and overweight the ones that matter least.

The result? Teams pick an IdP based on a demo, discover the migration story is a nightmare six months in, and start evaluating again.

Here's the evaluation framework we wish someone had given us before we started. It's built for engineering teams at B2B SaaS companies — the ones building products, not the ones buying workforce SSO for their employees.

Quick answer: What makes or breaks an IdP decision

If you're skimming, here's the short version:

  1. Protocol depth matters more than feature count. Supporting "OAuth2" means nothing. Which grant types? Can you customize token claims? Can you become an OIDC provider yourself?
  2. Migration capability is the #1 underrated criterion. If you can't migrate your existing users without forcing password resets, the IdP is unusable — no matter how good everything else looks.
  3. Multi-tenancy must be native, not bolted on. If organization models and per-tenant configurations require workarounds, you'll be fighting the system forever.
  4. AI-readiness isn't future planning — it's a 12-month requirement. Token exchange, agent identity, delegated scopes. If the IdP doesn't support these, you'll be back here evaluating again next year.

The rest of this guide walks through each evaluation dimension in detail, with specific questions to ask and red flags to watch for.

Who this guide is for (and who it's not for)

This is for you if:

  • You're a CTO, VP of Engineering, or platform architect at a 50-300 person B2B SaaS company
  • You have 100K+ existing users and can't afford a disruptive migration
  • You're moving upmarket into enterprise customers who need SSO, org models, and audit logs
  • You need to write a technical evaluation report and want a framework that doesn't come from a vendor

This is NOT for you if:

  • You're looking for workforce IAM (employee SSO to internal tools) — that's a different buying decision
  • You're a startup with 500 users and no enterprise customers yet — pick whatever has the best SDK and move on
  • You need identity verification (KYC/KYB) — that's a separate category entirely

Dimension 1: Protocol capabilities — Not just "supports OAuth2"

Every IdP on the market will tell you they support OAuth2 and OIDC. That's table stakes. The real questions are about depth.

Grant types: Which ones and why it matters

Must-have:

  • Authorization Code + PKCE — The only flow you should use for browser-based and mobile apps. If a vendor still recommends Implicit flow, walk away. PKCE isn't optional — it's a security requirement.
  • Client Credentials — For service-to-service communication. Your backend services need to authenticate with each other without a user present.
  • Refresh Token — Sounds basic, but the implementation details vary wildly. Can you configure rotation? Expiration? Can you revoke a specific refresh token without nuking the entire session?
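
To make the PKCE requirement concrete, here is a minimal sketch of what a client does before it starts the Authorization Code flow. It uses the S256 method from RFC 7636; the function names are illustrative, not part of any vendor SDK.

```python
import base64
import hashlib
import secrets


def code_challenge_s256(code_verifier: str) -> str:
    """Derive the S256 code_challenge: base64url(SHA-256(verifier)) without padding."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple:
    """Return (code_verifier, code_challenge) for one authorization request."""
    # 32 random bytes encode to a 43-character verifier (RFC 7636 allows 43-128).
    raw = secrets.token_bytes(32)
    code_verifier = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return code_verifier, code_challenge_s256(code_verifier)


verifier, challenge = make_pkce_pair()
print(len(verifier), challenge)
```

The client sends the challenge with the authorization request and the verifier with the token request. An IdP that lets a public client skip this step is treating PKCE as optional.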

Increasingly critical:

  • Token Exchange (RFC 8693) — This is the grant type that enables AI agent authentication, impersonation flows, and delegation. If it's missing today, ask about the roadmap. If there's no roadmap, that's a red flag.

OIDC Provider capability

Here's a question most teams don't think to ask: Can you use this IdP as an OIDC Provider, not just as a relying party that consumes other providers?

Why it matters: As your SaaS grows, partners and customers may want to use your identity system to log into their own tools. You need to issue tokens, manage consent, and handle third-party app registrations. If your IdP only lets you consume external identity providers but can't act as one, you'll hit a wall when you need to federate outward.

Ask:

  • Does the IdP expose an OpenID Connect Discovery endpoint (/.well-known/openid-configuration) that you can white-label?
  • Can you register first-party and third-party applications with different trust levels?
  • Can first-party apps skip the consent screen while third-party apps require it?
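
One quick way to test protocol depth before a sales call is to read what the vendor's discovery document advertises. The sketch below checks an already-fetched document (a plain dict) for the grant types and PKCE method discussed above. The metadata field names come from the OpenID Connect Discovery and OAuth 2.0 Authorization Server Metadata specs; the sample document and the list of required grants are assumptions for illustration.

```python
REQUIRED_GRANTS = {"authorization_code", "client_credentials", "refresh_token"}
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"


def protocol_gaps(discovery: dict) -> list:
    """List capabilities that the discovery document does not advertise."""
    # Per OpenID Connect Discovery, an omitted grant_types_supported
    # defaults to authorization_code and implicit.
    grants = set(discovery.get("grant_types_supported", ["authorization_code", "implicit"]))
    pkce_methods = discovery.get("code_challenge_methods_supported", [])

    gaps = [f"grant type not advertised: {g}" for g in sorted(REQUIRED_GRANTS - grants)]
    if "S256" not in pkce_methods:
        gaps.append("PKCE S256 not advertised")
    if TOKEN_EXCHANGE_GRANT not in grants:
        gaps.append("token exchange (RFC 8693) not advertised")
    return gaps


# Hypothetical discovery document, trimmed to the fields we inspect.
sample = {
    "issuer": "https://idp.example.com",
    "grant_types_supported": ["authorization_code", "refresh_token", "client_credentials"],
    "code_challenge_methods_supported": ["S256"],
}
print(protocol_gaps(sample))
```

A missing entry is a prompt for a question, not proof of absence: some servers support features they don't advertise, so treat the output as a list of things to ask the vendor about.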

JWT customization

The token is the contract between your IdP and your services. If you can't customize it, every downstream service needs to make additional API calls to figure out what a user is allowed to do.

Ask:

  • Can you add custom claims to access tokens and ID tokens?
  • Can you embed organization context (which tenant the user is operating in) directly in the token?
  • Can you define custom scopes that map to your application's permission model?
  • Are claims computed at token issuance time, or can they be dynamically populated via a webhook or script?

A token that carries { "org_id": "org_123", "role": "admin", "auth_level": 2 } means your API middleware can make authorization decisions in one line. A token that only carries { "sub": "user_456" } means every service needs to call back to the IdP or a database to figure out the rest. At scale, that can be the difference between roughly 2ms and 200ms per request.
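
The "one line" decision can be sketched as follows. This assumes the token's signature, issuer, audience, and expiry have already been verified and that the claims arrive as a dict. The claim names match the example above, while the role ladder is a hypothetical one.

```python
# Hypothetical role ladder; a real product would define its own.
ROLE_RANK = {"viewer": 0, "member": 1, "admin": 2}


def is_allowed(claims: dict, org_id: str, min_role: str, min_auth_level: int = 1) -> bool:
    """Decide from verified token claims alone, with no call back to the IdP."""
    return (
        claims.get("org_id") == org_id
        and ROLE_RANK.get(claims.get("role"), -1) >= ROLE_RANK[min_role]
        and claims.get("auth_level", 0) >= min_auth_level
    )


rich = {"sub": "user_456", "org_id": "org_123", "role": "admin", "auth_level": 2}
thin = {"sub": "user_456"}

print(is_allowed(rich, "org_123", "admin", min_auth_level=2))
print(is_allowed(thin, "org_123", "viewer"))
```

With the thin token the check can only fail closed, which is why services in that setup end up adding a lookup before every decision.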

Dimension 2: Authentication flows — The details that kill you

Every IdP supports email/password and social login. Congratulations, you've narrowed the field to... all of them.

The differentiation is in the details most demo scripts don't cover.

The sign-up flow

  • Post-registration auto-login: After a user signs up, are they automatically logged in? Or do they see a login page again? Forcing a user to log in immediately after registering is a conversion killer. You'd be surprised how many IdPs get this wrong.
  • Custom registration fields: Can you collect role, company name, or use case during sign-up? Or do you need a separate onboarding flow after the fact?
  • Progressive profiling: Can you collect additional information over multiple sessions, rather than demanding everything upfront?

The login flow

  • login_hint support: When a user clicks a link from a marketing email, can you pre-fill their email address? This sounds trivial. It's not. It's the difference between a 40% and 60% conversion rate on email campaigns.
  • Organization-specific authentication policies: Can Org A require SAML SSO while Org B uses email/password? Can you enforce MFA for enterprise tenants only? If per-tenant auth policies require code changes instead of configuration changes, you'll burn engineering cycles every time you onboard an enterprise customer.
  • Branding customization: Can you customize the login experience per tenant? Not just logo and colors — full CSS control, custom domains, and white-labeled emails. "Hosted UI vs. bring your own UI" should be a choice you make, not a limitation you accept.
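
For reference, login_hint is a standard OIDC authorization request parameter, so supporting it should not require anything vendor-specific. A minimal sketch of building the authorization URL, with placeholder endpoint, client, and redirect values:

```python
from urllib.parse import urlencode


def build_authorization_url(authorize_endpoint, client_id, redirect_uri,
                            code_challenge, state, login_hint=None):
    """Build an Authorization Code + PKCE request, optionally pre-filling the email."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": "openid profile email",
        "state": state,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
    }
    if login_hint:
        params["login_hint"] = login_hint
    return f"{authorize_endpoint}?{urlencode(params)}"


# All values below are placeholders for illustration.
url = build_authorization_url(
    "https://idp.example.com/oidc/auth",
    client_id="my-app",
    redirect_uri="https://app.example.com/callback",
    code_challenge="E9Melhoa2OwvFrEMTJguCHaoeK1t8URWbuGJSstw-cM",
    state="xyz123",
    login_hint="ada@example.com",
)
print(url)
```

During a proof-of-concept, open a URL like this and check whether the hosted sign-in page actually pre-fills the address. The parameter being accepted does not guarantee that the UI uses it.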

What most checklists miss

  • Silent authentication: When a token expires, can the app silently refresh without redirecting the user? What happens if the refresh token is also expired — is there a fallback (like a sliding session via iframe)?
  • Account linking: A user signs up with Google, then tries to log in with email. Are these two accounts, or one? How does the IdP handle identity merging? Get this wrong and you'll have phantom duplicate accounts forever.
  • Passwordless options: Magic links, passkeys, WebAuthn. Not because everyone needs them today, but because your enterprise customers will ask within 6 months.
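
To make the account-linking question concrete, here is a sketch of one common policy: attach a new sign-in method to an existing account only when the provider asserts a verified email. This is an illustrative policy with hypothetical names, not how any particular IdP behaves, and it is worth asking each vendor which policy they apply.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    account_id: str
    email: str
    identities: set = field(default_factory=set)  # e.g. {"google", "password"}


def resolve_account(accounts: dict, provider: str, email: str, email_verified: bool) -> Account:
    """Return the account for this sign-in, linking by verified email where possible."""
    key = email.strip().lower()
    existing = accounts.get(key)
    if existing is not None:
        if not email_verified:
            # Linking on an unverified email would let an attacker claim the account.
            raise PermissionError("email not verified; refusing to link identities")
        existing.identities.add(provider)
        return existing
    account = Account(account_id=f"acct_{len(accounts) + 1}", email=key, identities={provider})
    accounts[key] = account
    return account


accounts = {}
first = resolve_account(accounts, "google", "Ada@example.com", email_verified=True)
second = resolve_account(accounts, "password", "ada@example.com", email_verified=True)
print(first is second, sorted(second.identities))
```

The unverified-email branch is the part to test with real vendors, because that is where both duplicate accounts and account takeovers come from.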

Dimension 3: Session and token management — The deep water

This is where evaluations get separated from demos. Session and token management is boring until it breaks — and when it breaks, your entire user base gets logged out simultaneously.

Cookie security

Not glamorous. Absolutely critical.

  • HttpOnly, Secure, SameSite attributes: All three must be set correctly. Any IdP that doesn't set HttpOnly on session cookies is not ready for production.
  • Cross-subdomain support: If your app runs on app.yourproduct.com and your API on api.yourproduct.com, can sessions span subdomains? How?
  • Third-party cookie restrictions: Safari and Firefox already block third-party cookies by default, and Chrome lets users restrict them. How does the IdP handle cross-origin auth flows without third-party cookies? If the answer is "we're working on it," that's not good enough.
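
As a reference point when inspecting a vendor's Set-Cookie headers, this is roughly what a correctly attributed session cookie looks like. The sketch uses Python's standard http.cookies module; the cookie name, value, and domain are placeholders.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session"] = "opaque-session-id"      # placeholder value
cookie["session"]["httponly"] = True         # not readable from JavaScript
cookie["session"]["secure"] = True           # only sent over HTTPS
cookie["session"]["samesite"] = "Lax"        # withheld on most cross-site requests
cookie["session"]["domain"] = "yourproduct.com"  # shared by app. and api. subdomains
cookie["session"]["path"] = "/"

header_value = cookie["session"].OutputString()
print("Set-Cookie:", header_value)
```

Setting Domain to the parent domain is one way sessions can span subdomains. It also exposes the cookie to every subdomain, so ask how the vendor scopes it.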

Remember me and persistent sessions

Your users want to stay logged in for weeks, not minutes. But a 180-day persistent session has very different security implications than a 30-minute session.

Ask:

  • Can you configure session duration independently from token lifetime?
  • Is there a "remember me" option that extends the session while keeping token lifetimes short?
  • Can you force re-authentication for sensitive operations without terminating the session?

Refresh token security

  • Token rotation: Does the IdP rotate refresh tokens on each use? (It should.)
  • Encrypted storage: Are refresh tokens encrypted at rest?
  • Revocation granularity: Can you revoke a single device's session without revoking all sessions?
  • Configurable expiration: Different apps need different refresh token lifetimes. Can you configure this per-application, or is it a global setting?
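
Rotation and revocation granularity are easier to evaluate with a mental model of the mechanics. Below is an in-memory sketch of rotation with reuse detection, where each device gets its own token "family". The class and method names are hypothetical, and a real IdP would persist and hash these tokens rather than keep them in a dict.

```python
import secrets


class RefreshTokenStore:
    """Toy store: one token family per device, rotated on every use."""

    def __init__(self):
        self._tokens = {}            # token -> {"family": str, "used": bool}
        self._revoked_families = set()

    def issue(self, family_id: str) -> str:
        token = secrets.token_urlsafe(32)
        self._tokens[token] = {"family": family_id, "used": False}
        return token

    def rotate(self, token: str) -> str:
        record = self._tokens.get(token)
        if record is None or record["family"] in self._revoked_families:
            raise PermissionError("unknown or revoked refresh token")
        if record["used"]:
            # A replayed token suggests theft: revoke this device's family only.
            self._revoked_families.add(record["family"])
            raise PermissionError("refresh token reuse detected; family revoked")
        record["used"] = True
        return self.issue(record["family"])


store = RefreshTokenStore()
laptop_1 = store.issue("laptop")
phone_1 = store.issue("phone")
laptop_2 = store.rotate(laptop_1)   # normal refresh: old token retired, new one issued
```

Replaying laptop_1 after this point revokes the laptop family while the phone keeps working, which is the per-device granularity the questions above are probing for.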

Dimension 4: Authorization model — Beyond basic RBAC

Role-Based Access Control is the baseline. If the IdP doesn't support RBAC, it's not worth evaluating. But for B2B SaaS, RBAC alone isn't enough.

Organization-scoped permissions

Your users belong to organizations. Their permissions within each organization are different from their platform-level permissions.

A user might be an admin in Org A and a viewer in Org B. The same user, two different role contexts. If the IdP can't model this natively, you'll build a parallel permissions system in your application — and now you have two sources of truth.

Questions:

  • Can you define roles at the organization level, not just the user level?
  • Can a single user have different roles in different organizations?
  • Is the current organization context embedded in the token, so your API knows which org the user is operating in?
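
The underlying data model is small. A sketch, assuming roles hang off the (user, organization) membership rather than off the user, with hypothetical IDs and claim names:

```python
# Role is a property of the membership, not of the user.
memberships = {
    ("user_456", "org_a"): "admin",
    ("user_456", "org_b"): "viewer",
}


def org_scoped_claims(user_id: str, org_id: str) -> dict:
    """Build the claims for a token issued in the context of one organization."""
    role = memberships.get((user_id, org_id))
    if role is None:
        raise LookupError(f"{user_id} is not a member of {org_id}")
    return {"sub": user_id, "org_id": org_id, "role": role}


print(org_scoped_claims("user_456", "org_a"))
print(org_scoped_claims("user_456", "org_b"))
```

If the IdP can only store one role per user, this mapping ends up living in your application, which is the second source of truth described above.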

Multi-level authorization (auth_level)

For financial applications, healthcare, or any product where certain operations carry higher risk: not all authenticated sessions are equal.

Viewing a dashboard? Session cookie is fine. Initiating a wire transfer? You need a fresh MFA verification, even if the user is already logged in.

This is step-up authentication, and it requires the concept of an authentication level (auth_level) as a first-class citizen in the token system.

Ask:

  • Can the token carry an auth_level claim that your backend can check?
  • Can you trigger step-up authentication from your application without forcing a full re-login?
  • Does the auth_level have its own expiration, independent of the session?

If your IdP doesn't support this natively, you'll end up building it yourself — which is exactly the kind of identity logic you're buying an IdP to avoid.
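
On the backend, the step-up check can be as small as the sketch below. The auth_level claim follows the naming used in this guide; auth_level_exp is a hypothetical claim standing in for "the elevated level has its own expiration", and real IdPs may express this differently (for example through acr and auth_time).

```python
import time
from typing import Optional


def effective_auth_level(claims: dict, now: float) -> int:
    """An expired elevation falls back to the baseline level 1."""
    level = claims.get("auth_level", 1)
    if level > 1 and now >= claims.get("auth_level_exp", 0):
        return 1
    return level


def needs_step_up(claims: dict, required_level: int, now: Optional[float] = None) -> bool:
    """True when the caller must complete a fresh MFA challenge first."""
    now = time.time() if now is None else now
    return effective_auth_level(claims, now) < required_level


claims = {"sub": "user_456", "auth_level": 2, "auth_level_exp": 1_000}
print(needs_step_up(claims, required_level=2, now=900))    # elevation still valid
print(needs_step_up(claims, required_level=2, now=1_001))  # elevation expired
```

Note that the session itself is untouched in both cases. Only the sensitive operation is gated, which is the behavior to confirm with the vendor.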

Token-based authorization decisions

The ideal: your API middleware reads the token, sees the user's org, role, and auth level, and makes the authorization decision without hitting any external service.

The reality with many IdPs: the token tells you who the user is, but you need a separate API call to figure out what they're allowed to do.

That separate call adds latency, creates a dependency, and introduces a failure mode. At 1,000 requests per second, you don't want your authorization check making network hops.

Dimension 5: Migration — The criterion that decides everything

Here's a pattern nobody likes to talk about: most IdP evaluations stall not because the new IdP isn't good enough, but because the team can't figure out how to migrate its existing users.

If you have 100K+ users, migration isn't a "nice to have" — it's the entire project.

Three migration strategies (and which ones the IdP must support)

1. Bulk import with existing password hashes

Your users have passwords hashed with bcrypt, argon2, or whatever your current system uses. Can the IdP import those hashes directly and verify passwords against them?

If yes: users log in with their existing password, and nothing changes from their perspective. Best case scenario.

If no: every user gets a "reset your password" email. You will lose 30-50% of your user base in the migration. This is not hypothetical — we've seen it happen.
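
What "import the hash and verify against it" means in practice: the new system stores the algorithm, parameters, salt, and digest exactly as exported, then recomputes the digest at login. The sketch below uses PBKDF2 only because it ships with Python's standard library; bcrypt and argon2 work the same way conceptually but need third-party packages. The record layout is an assumption, not any vendor's import format.

```python
import hashlib
import hmac
import os


def verify_imported_hash(password: str, record: dict) -> bool:
    """Check a password against a hash exported from the legacy system."""
    if record["algorithm"] != "pbkdf2_sha256":
        raise ValueError(f"unsupported algorithm: {record['algorithm']}")
    candidate = hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        bytes.fromhex(record["salt_hex"]),
        record["iterations"],
    )
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, bytes.fromhex(record["hash_hex"]))


# Simulate one exported row from a legacy user table.
salt = os.urandom(16)
legacy_record = {
    "algorithm": "pbkdf2_sha256",
    "iterations": 100_000,
    "salt_hex": salt.hex(),
    "hash_hex": hashlib.pbkdf2_hmac("sha256", b"correct horse", salt, 100_000).hex(),
}
print(verify_imported_hash("correct horse", legacy_record))
```

The detail to pin down with a vendor is the parameter space: the algorithm name alone is not enough if your salts, iteration counts, or encodings differ from what their importer expects.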

2. Progressive (lazy) migration

Instead of migrating all users at once, you migrate them one by one as they log in. The first login hits your old system, verifies the password, and creates the user in the new IdP. Every subsequent login goes directly to the new IdP.

This is the safest approach for large user bases, but it requires the IdP to support:

  • A custom authentication hook that calls your legacy system
  • The ability to create users on-the-fly during login
  • Tracking which users have been migrated vs. which haven't
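
The three requirements above fit together roughly like this. It is a sketch of the control flow only: legacy_verify stands in for whatever call reaches your old system, the in-memory dict stands in for the new IdP's user store, and all names are hypothetical.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


class LazyMigrator:
    def __init__(self, legacy_verify):
        self.legacy_verify = legacy_verify   # callable(email, password) -> bool
        self.new_users = {}                  # email -> (salt, password_hash)

    def login(self, email: str, password: str) -> bool:
        record = self.new_users.get(email)
        if record is not None:
            # Already migrated: the legacy system is never consulted again.
            salt, stored = record
            return hmac.compare_digest(hash_password(password, salt), stored)
        if not self.legacy_verify(email, password):
            return False
        # First successful login: create the user on the fly with a fresh hash.
        salt = os.urandom(16)
        self.new_users[email] = (salt, hash_password(password, salt))
        return True

    def migrated_count(self) -> int:
        return len(self.new_users)


def fake_legacy_verify(email: str, password: str) -> bool:
    # Stand-in for a call to the old system's login API.
    return (email, password) == ("ada@example.com", "correct horse")


migrator = LazyMigrator(fake_legacy_verify)
print(migrator.login("ada@example.com", "correct horse"), migrator.migrated_count())
```

Users who never log in are never migrated by this path, so most teams pair it with a cutoff date and a bulk import or reset for the remainder.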

3. Dual-write (running systems in parallel)

During transition, both the old and new identity systems are active. Writes go to both, reads gradually shift to the new system. This provides a rollback path but adds operational complexity.

Migration red flags

  • "We support CSV import" — This means bulk import of user profiles, not passwords. You'll still need password resets.
  • "We have a migration guide" — Read it carefully. If it says "users will need to set a new password," that's the 30-50% user loss scenario.
  • No mention of hash compatibility — If the vendor hasn't thought about password hash migration, they haven't worked with teams at your scale.

Questions to ask

  • Which password hash algorithms do you support for import? (bcrypt, argon2, scrypt, PBKDF2, custom?)
  • Can we run a progressive migration where users are migrated on first login?
  • Can we track migration progress — what percentage of users have been migrated?
  • What's the rollback strategy if migration doesn't go well?
  • Can we maintain session continuity — so users don't get logged out during migration?

If the vendor can't answer these confidently, they haven't done this before. Move on.

Dimension 6: Multi-tenancy — Native vs. bolted on

B2B SaaS means multi-tenancy. Your customers are organizations with multiple users, roles, and access policies. The IdP needs to understand this natively.

What "native multi-tenancy" means

  • Organization as a first-class entity: Not a custom field on the user profile, but a proper data model with its own ID, configuration, and membership list.
  • Per-organization authentication policies: Org A uses SAML SSO with their corporate IdP. Org B uses email/password with mandatory MFA. Org C uses Google Workspace login. All configured via UI or API, not code changes.
  • Organization invitations and membership management: Admins within each org can invite users, assign roles, and remove members. The IdP handles the invitation flow, email verification, and role assignment.
  • Organization-scoped tokens: When a user operates within an organization, the token includes the org context. Your API knows which org's data to return.

The "custom metadata" workaround

Some IdPs don't have a native organization model. They suggest using custom user metadata (user.app_metadata.org_id = "123") as a workaround.

This falls apart fast:

  • A user belonging to multiple orgs requires array management in metadata
  • No built-in invitation or membership flow
  • No per-org auth policies
  • No org-scoped tokens — you have to infer context from other signals
  • Audit logs don't know about organizations

If the vendor says "you can model organizations using our metadata fields," that's the identity equivalent of storing relational data in a JSON column. It works until it doesn't.

Questions to ask

  • Is the organization model native or built on top of user metadata?
  • Can users belong to multiple organizations simultaneously?
  • Can we configure different authentication requirements per organization?
  • Are organization-scoped roles and permissions supported natively?
  • Can organization admins manage members via a self-service UI?
  • Does the token include organization context?

Dimension 7: AI-readiness — The criterion nobody's asking yet

Twelve months ago, "AI agent authentication" wasn't on any evaluation checklist. Today, if you're building AI features into your product — copilots, autonomous agents, AI-driven workflows — your IdP needs to handle a new type of identity: the agent.

Why agents break the traditional model

Traditional auth has two actors: the user and the application. OAuth2 was designed around this.

AI agents introduce a third: a non-human entity that acts on behalf of a user, with constrained permissions, and needs its own audit trail.

  • An agent isn't a user — it doesn't have a password or a browser session
  • An agent isn't a machine-to-machine service — it's acting on behalf of a specific user
  • An agent needs scoped, time-limited permissions — not the full access the user has

What your IdP needs to support

Token Exchange (RFC 8693): The agent presents its own credential plus the user's authorization, and receives a scoped token. The token carries: who (the user), what (the agent), scope (the permission boundary), and when (the expiration).

Agent as a client type: The agent should be modeled as a proper OAuth2 client with its own client_id, not a hack using API keys or shared user tokens.

Delegated scope management: The user can grant specific permissions to the agent — read but not write, access to certain resources but not others.

Audit distinction: Your logs must differentiate between "user did X" and "agent did X on behalf of user." If you can't distinguish these, you'll fail your next SOC2 audit when the auditor asks "who made this change?"
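
For orientation, this is roughly what the exchange and the resulting audit distinction look like on the wire. The parameter names and the act (actor) claim come from RFC 8693. Whether a given IdP expects the agent's credential as an actor_token or through client authentication is vendor-specific, so treat the request shape as an assumption to verify.

```python
def build_token_exchange_request(user_token: str, agent_token: str,
                                 scope: str, audience: str) -> dict:
    """Form parameters for a POST to the token endpoint (RFC 8693)."""
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": user_token,     # who the agent acts for
        "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "actor_token": agent_token,      # who is actually acting
        "actor_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "scope": scope,                  # narrower than the user's own access
        "audience": audience,
    }


def describe_actor(claims: dict) -> str:
    """Turn token claims into the 'who did this' string for an audit log."""
    actor = claims.get("act")
    if actor:
        return f"agent {actor['sub']} on behalf of user {claims['sub']}"
    return f"user {claims['sub']}"


request = build_token_exchange_request("user-token", "agent-token",
                                       "invoices:read", "https://api.example.com")
print(describe_actor({"sub": "user_456", "act": {"sub": "agent_42"}}))
print(describe_actor({"sub": "user_456"}))
```

If the issued token carries no act claim or equivalent, your services have no way to tell the two cases apart, and the audit distinction has to be rebuilt outside the token.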

MCP (Model Context Protocol) compatibility

MCP is becoming the standard protocol for AI agents to interact with tools and services. If your IdP supports OAuth-based authentication for MCP servers, agents can authenticate properly through the protocol layer rather than through API keys or shared secrets.

Questions to ask

  • Do you support OAuth2 Token Exchange?
  • Can agents be modeled as distinct client types?
  • Can tokens carry delegation information (who authorized the agent, what scope)?
  • Do audit logs distinguish agent actions from human actions?
  • Is there MCP server integration or OAuth support for agent-to-tool authentication?

If the vendor hasn't thought about this, they're building for 2020. You're planning for 2026.

Dimension 8: Non-functional requirements — The stuff that keeps you up at night

Features sell. Operations decide whether you renew.

Performance

  • Authentication throughput: Can the system handle 100+ auth requests per second during peak? What about 1,000+?
  • Token validation latency: If your services validate JWTs locally (as they should), this is sub-millisecond. But if the IdP requires introspection calls, what's the P99 latency?
  • Scale ceiling: What's the maximum Monthly Active Users (MAU) supported? Is there a demonstrated track record at your target scale?

Compliance

  • SOC2 Type II: Not Type I. Type II means they've been audited over a period, not just at a point in time. If they only have Type I, ask when Type II is expected.
  • Audit logs: Every authentication event, permission change, and admin action logged. Can you export logs to your SIEM? Are logs immutable?
  • Data residency: Can you specify which region stores user data? For EU customers, this isn't optional.

Reliability

  • Uptime SLA: 99.9% sounds good until you realize that's 8.76 hours of downtime per year. 99.99% is about 53 minutes. For authentication, the front door of your application, the difference matters.
  • Failover: What happens during a provider outage? Is there a fallback mechanism? Multi-region deployment?
  • Incident history: Check their status page history. Not what they promise — what actually happened.
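
The SLA arithmetic is worth running against whatever number a vendor quotes. A small sketch, assuming a 365-day year and that every minute of allowed downtime is actually used:

```python
HOURS_PER_YEAR = 365 * 24  # 8,760


def downtime_hours_per_year(sla_percent: float) -> float:
    """Maximum downtime per year that still satisfies the SLA."""
    return HOURS_PER_YEAR * (1 - sla_percent / 100)


for sla in (99.9, 99.95, 99.99):
    hours = downtime_hours_per_year(sla)
    print(f"{sla}% -> {hours:.2f} hours ({hours * 60:.0f} minutes) per year")
```

Also check the measurement window in the contract. A monthly SLA and an annual SLA with the same percentage allow very different worst-case outages.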

Data portability

  • User export: Can you export all user data, including metadata, organization memberships, and roles? In what format?
  • Standards compliance: Are they using standard protocols (OIDC, SCIM) that make migration to a different provider feasible?
  • No lock-in signals: Proprietary APIs, custom protocols, non-standard token formats — these are lock-in indicators. The more proprietary the integration, the harder it is to leave.

The evaluation matrix: A practical scoring framework

After evaluating across all dimensions, you need a way to compare. Here's a priority framework:

P1 — Deal-breakers (must pass or disqualify)

| Criterion | Why it's P1 |
| --- | --- |
| Password hash import or progressive migration | Can't use it if you can't migrate |
| Authorization Code + PKCE support | Security baseline |
| Native organization model | B2B SaaS requirement |
| SOC2 Type II or clear path to it | Enterprise customers will ask |
| 99.9%+ uptime SLA | Auth down = product down |

P2 — Strongly preferred (significant engineering effort if missing)

| Criterion | Why it's P2 |
| --- | --- |
| Custom JWT claims | Avoids per-request permission lookups |
| Per-org authentication policies | Enterprise customer onboarding |
| Organization-scoped roles and tokens | Multi-tenant authorization |
| Refresh token rotation and revocation | Security best practice |
| Hosted UI + custom UI option | Flexibility for different use cases |

P3 — Important (plan for within 12 months)

| Criterion | Why it's P3 |
| --- | --- |
| Token Exchange (RFC 8693) | AI agent authentication |
| OIDC Provider capability | Partner federation |
| Step-up authentication / auth_level | Financial or high-risk operations |
| SCIM provisioning | Enterprise customer directory sync |
| Passkey / WebAuthn support | Passwordless direction |

P4 — Nice to have (won't block decision)

| Criterion | Why it's P4 |
| --- | --- |
| Built-in analytics dashboard | Can build from audit logs |
| White-labeled email templates | Convenience feature |
| Visual flow builder | Convenience feature |
| Pre-built social connectors (beyond top 5) | Long tail providers |

How to use this: Start with P1. If a vendor fails any P1 criterion, stop evaluating them. Then score P2 and P3 categories. The vendor with the best combined P2+P3 score is your answer.
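
The procedure above translates directly into a small script you can keep next to your evaluation spreadsheet. This is a sketch under assumptions: the 0-5 scoring scale, the optional weights, and the vendor data are all made up for illustration.

```python
def rank_vendors(vendors: dict, p2_weight: float = 1.0, p3_weight: float = 1.0) -> list:
    """Drop any vendor that fails a P1 criterion, then rank by combined P2 + P3 score."""
    ranked = []
    for name, result in vendors.items():
        if not all(result["p1"].values()):
            continue  # a single P1 failure disqualifies
        score = (p2_weight * sum(result["p2"].values())
                 + p3_weight * sum(result["p3"].values()))
        ranked.append((name, score))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical vendors; P1 is pass/fail, P2 and P3 are scored 0-5 per criterion.
vendors = {
    "vendor_a": {"p1": {"migration": False, "pkce": True},
                 "p2": {"custom_claims": 5}, "p3": {"token_exchange": 5}},
    "vendor_b": {"p1": {"migration": True, "pkce": True},
                 "p2": {"custom_claims": 4}, "p3": {"token_exchange": 3}},
    "vendor_c": {"p1": {"migration": True, "pkce": True},
                 "p2": {"custom_claims": 5}, "p3": {"token_exchange": 4}},
}
print(rank_vendors(vendors))
```

The weights default to 1.0 to match the plain P2+P3 sum described above. Raise p3_weight if, for example, AI features are already on your near-term roadmap.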

Common evaluation mistakes

We've seen teams make the same mistakes repeatedly. Here's how to avoid them:

Mistake 1: Evaluating on features, not architecture

A feature comparison table tells you what exists. It doesn't tell you how it's built. An IdP might "support" organizations by storing org IDs in user metadata. That checks the box on a spreadsheet but creates real problems in production.

Fix: For every feature, ask "how is this implemented?" — not just "do you have this?"

Mistake 2: Ignoring migration until after selection

Teams pick the "best" IdP, start implementation, and then realize they can't migrate their users without a password reset campaign. Now they're either stuck with a bad migration experience or starting the evaluation over.

Fix: Make migration capability your first filter, not your last.

Mistake 3: Over-indexing on the demo

Every vendor's demo is polished. It shows the happy path with a clean database and zero edge cases. Your production environment has users with merged accounts, weird unicode in profile fields, and sessions that shouldn't exist but do.

Fix: Ask for a proof-of-concept with your actual data. Import 1,000 real users and run your actual authentication flows.

Mistake 4: Not involving the right people

If only the platform team evaluates, they'll pick what's technically cleanest. If only product evaluates, they'll pick what's easiest to integrate. If only security evaluates, they'll pick what has the most compliance checkboxes.

Fix: The evaluation team should include platform engineering, product, and security. Each owns different P1/P2 criteria.

Mistake 5: Forgetting you'll need to leave someday

Vendor lock-in is real. Proprietary SDKs, custom APIs, non-standard token formats — they all make migration harder later.

Fix: Prefer IdPs that use standard protocols (OIDC, OAuth2, SCIM). Your future self will thank you.

FAQ

How long does an IdP evaluation typically take?

For a thorough evaluation including proof-of-concept testing, expect 5-7 weeks. Rushing it leads to the mistakes we outlined above, particularly the migration oversight. Budget 2 weeks for requirements gathering, 2-3 weeks for vendor evaluation and PoC, and 1-2 weeks for stakeholder alignment.

Should we build our own auth instead?

It depends on your stage. If you have fewer than 10,000 users and no enterprise customers, a lightweight auth library might be fine. But once you need SSO, multi-tenancy, MFA, and compliance documentation, the maintenance cost of homegrown auth exceeds the cost of a dedicated IdP. We've seen engineering teams dedicate 2-3 full-time engineers to maintaining custom auth systems, which is roughly $300-500K/year in opportunity cost.

What's the difference between CIAM and workforce IAM?

Customer Identity and Access Management (CIAM) is what your product's end users interact with — sign-up, login, profile management. Workforce IAM is what your employees use to access internal tools (Okta for your company's Slack, Google Workspace, etc.). They're different buying decisions with different evaluation criteria. This guide is about CIAM.

How important is open-source vs. proprietary?

Open-source IdPs offer transparency (you can audit the code), portability (self-host if needed), and community contributions. Proprietary IdPs may offer more polished UIs and managed services. The key question isn't "open vs. closed" — it's "can I leave if I need to?" Open-source solutions tend to make leaving easier because the data model and APIs are public.

When should AI agent authentication be a P1 instead of P3?

If you're already building AI features that access user data on behalf of users (copilots, automated workflows, AI assistants), move it to P1 now. If AI features are on your 6-12 month roadmap, keep it at P3 but weight it heavily. If AI isn't on your radar, it can drop to P4, but revisit in 6 months.

How do we evaluate pricing when vendors use different models?

Most IdPs price by Monthly Active Users (MAU). But "MAU" definitions vary — some count any login, others count unique users, some count M2M tokens separately. Get the vendor to quote your specific scenario: X users, Y organizations, Z M2M connections, with your expected authentication volume. Compare total cost, not unit price.

The bottom line

Choosing an identity provider is an infrastructure decision, not a feature decision. You're committing to a system that will handle every user's first interaction with your product, every permission check in your API, and every audit log entry your compliance team reviews.

The evaluation framework above covers what actually matters — not the marketing bullet points. Use it to filter candidates quickly (P1 criteria first), evaluate deeply (P2 and P3 criteria with proof-of-concept testing), and make a decision that holds up for years, not months.

The teams that get this right share one thing in common: they treat identity as infrastructure, not as a feature to ship and forget.