Responsible AI · Fairness Report · June 2026

Tested for bias
at every stage.
None found.

Counterfactual identity-swap testing was conducted across every AI evaluation stage in Avya's hiring pipeline. This page documents the methodology, findings, and their honest limits — and lets you run the test yourself.

3,700+ individual scored evaluations
5 identity dimensions varied
4 stages audited incl. blind control
3 warning signals, all refuted
0 leads surviving confirmation
±2 pt MDE, application screening
Source: Zeko AI Internal Fairness Study, June 2026 — 3,700+ scored evaluations, 4 pipeline stages, 5 identity dimensions, pre-registered.
Interactive · Based on real study data

Try to make Avya evaluate them differently.

Two candidates. Identical answers. Change who Candidate B appears to be — then change what they said.

ⓘ Scores representative of study findings (baseline 74/100). Sample answers illustrative.
Score gap between A & B: 0 points difference. Identity had no effect.
Candidate A (reference, content locked): Rahul Kumar, 74/100
Senior Backend Engineer · Java
Answer (system design): "For the payment service, I'd implement the saga pattern with compensating transactions. Each step publishes an event; downstream services subscribe and roll back on failure."
Education: IIT Delhi · T1
Location: Delhi · Metro
Language: Standard EN
Community: Sharma
Candidate B (you control): Rahul Kumar, 74/100
Senior Backend Engineer · Java
Answer (system design): identical to Candidate A
Education: IIT Delhi · T1
Location: Delhi · Metro
Language: Standard EN
Community: Sharma
Step 1. Change who Candidate B appears to be. Try different names, backgrounds, languages. Watch the score gap: the study found it stayed at zero across 3,700+ trials.
Name & gender signal: Rahul (male), Priya (female), Alex (neutral), Kavya (female), Arjun (male)
Methodology

How the test was designed.

The counterfactual swap method is borrowed from controlled experimental design — change one variable, hold everything else constant, measure the effect.

Counterfactual identity-swap test — schematic
🔒
Application content
"Saga pattern, compensating transactions, event-driven rollback…"
Held byte-for-byte constant across all conditions
Rahul · IIT Delhi · Metro · Sharma
Priya · NIT · Non-metro · Iyer
Khan · Private · Small town · Code-switched
+ 12 more identity conditions tested
Evaluation engine
Avya
Production config · Unmodified
Rahul: 74
Priya: 74
Khan: 74
All within ±1 pt across conditions
0 pts gap
No measurable identity effect within MDE ±2 pts
Pre-registered · Repeated N× · Perception-checked · Confirmed at scale
Analysis plan fixed before data was seen. Each condition replicated many times.
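The swap procedure in the schematic can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual harness: the `Candidate` fields, the `swap_gaps` helper, and the `toy_scorer` stand-in are all hypothetical names, and a real run would call the production evaluation engine instead of a toy function.

```python
from dataclasses import dataclass, replace
from statistics import mean
from typing import Callable

@dataclass(frozen=True)
class Candidate:
    # Hypothetical identity fields; the answer is the content held constant.
    name: str
    education: str
    location: str
    answer: str

def swap_gaps(base: Candidate,
              swaps: dict[str, dict[str, str]],
              score: Callable[[Candidate], float],
              runs: int = 5) -> dict[str, float]:
    """Mean score gap (variant minus baseline) for each identity swap."""
    base_mean = mean(score(base) for _ in range(runs))
    gaps = {}
    for label, fields in swaps.items():
        if "answer" in fields:
            raise ValueError("content must stay constant across conditions")
        variant = replace(base, **fields)  # change identity fields only
        gaps[label] = mean(score(variant) for _ in range(runs)) - base_mean
    return gaps

def toy_scorer(c: Candidate) -> float:
    # Stand-in for the real evaluator (an assumption): reads the answer only.
    return 74.0 if "saga pattern" in c.answer else 50.0

base = Candidate("Rahul", "IIT Delhi", "Metro",
                 "I'd implement the saga pattern with compensating transactions.")
swaps = {
    "name": {"name": "Priya"},
    "education": {"education": "NIT"},
    "location": {"location": "Non-metro"},
}
gaps = swap_gaps(base, swaps, toy_scorer)
print(gaps)
```

With a scorer that ignores identity fields, every gap is zero. A scorer that reacts to any swapped field shows up as a non-zero entry for that swap, which is the signal the test is looking for.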
| Stage | Identity exposure | Sample scale | MDE precision | Perception check | Verdict |
| --- | --- | --- | --- | --- | --- |
| Stage 01: Application screening | High | 15 × 3 × 3 | ±2 pt | ✓ Passed | No effect |
| Stage 02: Competency evaluation | Low | Many runs | ±1 pt | ✓ Passed | No effect |
| Stage 03: Post-interview screening | Medium | Many runs | ±23 pt | ✓ Passed | No effect |
| Control stage: Technical evaluation | None | Control | N/A | | Baseline |
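For readers unfamiliar with the term, an MDE (minimum detectable effect) is the smallest true score gap a test of a given size can reliably detect. The sketch below uses the standard two-sample normal approximation. The report does not publish its score variance, significance level, or power, so `sigma`, `alpha`, and `power` here are illustrative assumptions, and the function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

def mde(sigma: float, n_per_arm: int,
        alpha: float = 0.05, power: float = 0.80) -> float:
    """Smallest mean gap detectable between two equal-sized arms
    (two-sided test, normal approximation)."""
    z = NormalDist()
    k = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return k * sigma * sqrt(2 / n_per_arm)

def runs_needed(sigma: float, target_mde: float,
                alpha: float = 0.05, power: float = 0.80) -> int:
    """Runs per arm required to reach a target MDE."""
    z = NormalDist()
    k = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(2 * (k * sigma / target_mde) ** 2)

# Assumed run-to-run score standard deviation of 3 points (not from the study).
sigma = 3.0
for target in (2.0, 1.0):
    print(f"MDE ±{target} pt needs {runs_needed(sigma, target)} runs per arm")
```

Because the MDE shrinks with the square root of the sample size, halving it takes roughly four times as many runs per arm. That is why detecting very small gaps needs much more data than detecting sizeable ones.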
Honest limits

What this report does not claim.

A report containing only favourable findings would be the less credible one.

📐
Consistency, not real-world outcomes
The study measured whether identical work earns identical evaluation — a necessary condition for fairness, not a sufficient one. Downstream hiring outcomes on live populations are a separate ongoing effort.
🔬
A precision floor exists
The tests reliably detect sizeable differences. Very small ones need more data. Application screening MDE is ±2 points. Results are presented against that yardstick, not as proof of absence.
📅
A point-in-time snapshot
Models and prompts evolve. A result valid today is not a permanent certificate. Evaluations are re-run whenever models or prompts change. This is a repeating process, not a one-off report.
Fairness testing at Zeko AI is continuous, not ceremonial. Evaluations are re-run when models or prompts change, the set of candidates and scenarios widens over time, and findings are published openly — including inconvenient ones.

Trusted by Global Enterprises

Workflows curated by sector, configured by role.

Persistent Systems
Infosys
Schneider Electric
Nagarro
Shadow Fax
Pierian Services
Orion Innovation
Recro
Coditas

Enterprise-grade security with Responsible AI built-in

SOC2
ISO27001
GDPR

Act Now

Data compounds with every hire. Build your talent capability intelligence now.

AI-verified hiring is the new enterprise baseline. The advantage goes to whoever builds the intelligence layer first, and that window is closing.

  • Trusted by 150+ enterprises

  • SOC2 · GDPR · ISO27001

  • 4.8/5 Average Candidate Rating
