Fonteum Research · Methodology Disclosure
Provider Directory Accuracy Index
Version: pdai/v1
DOI: Pending Zenodo deposit (§sprint3-zenodo-doi-pipeline)
Overview
The Provider Directory Accuracy Index (PDAI) measures the degree to which three federal provider data sources agree on four key fields for each individual NPI. A high agreement score indicates that the same provider's information is consistent across federal sources — a necessary (though not sufficient) condition for directory accuracy. A low score flags systematic data quality problems that can lead to ghost networks and No Surprises Act compliance risk.
Fonteum does not independently verify provider information. This is a cross-source agreement score — not a ground-truth accuracy measurement.
Data sources
CMS NPPES — National Plan and Provider Enumeration System
Self-reported provider data. The authoritative NPI registry. Providers are responsible for keeping their NPPES record current; update frequency varies by practice. Source tier: Tier-2 (federal public records, bulk download).
CMS Care Compare — Provider data
CMS-curated provider information published via the Care Compare portal. Reflects CMS enrollment records and curation workflows. Source tier: Tier-2 (federal public records).
CMS PECOS — Provider Enrollment, Chain, and Ownership System
Medicare enrollment data. Represents the billing and enrollment record that determines Medicare payment eligibility. Source tier: Tier-2 (federal public records).
Scored fields
| Field | Definition | Normalization |
|---|---|---|
| practice_address | Primary practice city + state | Lowercase; city normalized; state as 2-letter USPS code |
| primary_specialty | Primary taxonomy code (NUCC 10-character alphanumeric) | Exact code match; no synonym expansion in v1 |
| org_affiliation | Organizational CCN or parent NPI | 6-char CCN or 10-digit organizational NPI, uppercase |
| telecom | Primary phone number | 10-digit digits only, no formatting |
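The normalization rules in the table can be sketched roughly as follows. This is an illustrative sketch, not the production pipeline: the function names, the partial `STATE_CODES` lookup, the tuple form of the normalized address, and the handling of a leading US country code are all assumptions made for the example.

```python
import re

# Hypothetical, deliberately partial lookup; a real pipeline would need
# every USPS state and territory code.
STATE_CODES = {"california": "CA", "new york": "NY", "texas": "TX"}


def normalize_state(raw):
    """Return a 2-letter code, or None when the value cannot be mapped."""
    if raw is None:
        return None
    value = raw.strip()
    if len(value) == 2 and value.isalpha():
        # Assumption: 2-letter inputs are accepted without validation.
        return value.upper()
    return STATE_CODES.get(value.lower())


def normalize_city(raw):
    """Assumption: 'city normalized' means trim, collapse spaces, lowercase."""
    if raw is None:
        return None
    city = re.sub(r"\s+", " ", raw.strip()).lower()
    return city or None


def normalize_practice_address(city, state):
    """Return (city, state), or None if either part is missing."""
    c, s = normalize_city(city), normalize_state(state)
    return None if c is None or s is None else (c, s)


def normalize_org_affiliation(raw):
    """Accept a 6-character CCN or a 10-digit NPI, uppercased."""
    if raw is None:
        return None
    value = raw.strip().upper()
    if re.fullmatch(r"[0-9A-Z]{6}", value) or re.fullmatch(r"\d{10}", value):
        return value
    return None


def normalize_telecom(raw):
    """Keep digits only; return None unless exactly 10 digits remain."""
    if raw is None:
        return None
    digits = re.sub(r"\D", "", raw)
    # Assumption: an 11-digit number starting with 1 carries a US country code.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits if len(digits) == 10 else None
```

Returning None for values that cannot be normalized keeps them out of the exact-match comparison rather than counting them as disagreements; whether pdai/v1 treats such values this way is not stated in this disclosure.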
Scoring algorithm (pdai/v1)
- NPI intersection: Include only NPIs present in ≥2 of the three sources. NPIs in only one source are excluded from scoring (no comparison possible).
- Pairwise field comparison: For each NPI, compare every pair of sources in which the NPI appears: one pair when it appears in two sources, and all three pairs (NPPES × Care Compare, NPPES × PECOS, Care Compare × PECOS) when it appears in all three. Each pair contributes one comparison per field.
- Field agreement rate: Count comparisons where both sources have a non-null value and the values match exactly. Agreement rate = matches / comparisons.
- Coverage filter: Fields where fewer than 50% of matched NPIs have a non-null value in ≥2 sources are excluded from the composite score (insufficient coverage to be representative).
- Composite score: Simple average of field agreement rates for fields that pass the coverage filter.
- Insufficient sample: Scopes with fewer than 100 matched NPIs return no score and are flagged insufficient_sample = true.
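The scoring steps above can be sketched for a single scope as follows. This is a minimal sketch under stated assumptions, not the reference implementation: the `score_scope` function, its input shape, and its return shape are hypothetical, and the sketch assumes that a source pair enters the denominator only when both sides are non-null.

```python
from itertools import combinations

FIELDS = ("practice_address", "primary_specialty", "org_affiliation", "telecom")


def score_scope(records, min_sample=100, min_coverage=0.5):
    """Score one scope.

    records maps NPI -> {source name -> {field -> normalized value or None}}.
    This input shape is hypothetical and chosen only for the example.
    """
    # NPI intersection: keep NPIs present in at least two sources.
    matched = {npi: srcs for npi, srcs in records.items() if len(srcs) >= 2}
    if len(matched) < min_sample:
        return {"insufficient_sample": True, "composite": None, "fields": {}}

    rates = {}
    for field in FIELDS:
        comparisons = matches = covered = 0
        for sources in matched.values():
            values = [rec.get(field) for rec in sources.values()]
            # Coverage: the NPI has a non-null value in at least two sources.
            if sum(v is not None for v in values) >= 2:
                covered += 1
            # Pairwise comparison over every available source pair.
            for a, b in combinations(values, 2):
                # Assumption: pairs with a null on either side are skipped.
                if a is not None and b is not None:
                    comparisons += 1
                    matches += a == b
        # Coverage filter: drop fields below the coverage threshold.
        if covered / len(matched) < min_coverage or comparisons == 0:
            continue
        rates[field] = matches / comparisons

    # Composite: simple average over the fields that passed the filter.
    composite = sum(rates.values()) / len(rates) if rates else None
    return {"insufficient_sample": False, "composite": composite, "fields": rates}
```

Because the composite is an unweighted mean, a field that barely clears the coverage filter carries the same weight as a fully covered one, which is worth keeping in mind when comparing scopes.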
Aggregation scopes
- National: All matched NPIs across all states.
- State: NPIs filtered by practice_address state component. A provider with a multi-state practice is included in each state where they have a matched record.
- Specialty: NPIs filtered by primary_specialty NUCC taxonomy code.
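Scope membership could be assembled along these lines. The sketch assumes a hypothetical record shape in which practice_address is a normalized (city, state) tuple, and it assumes that an NPI joins the scope of every state or specialty code that appears in any of its source records; the methodology above does not say how disagreeing sources are resolved when assigning a scope.

```python
from collections import defaultdict


def build_scopes(records):
    """Group matched NPIs into national, state, and specialty scopes.

    records maps NPI -> {source name -> {field -> normalized value or None}}
    (hypothetical shape). Returns {(scope kind, key) -> set of NPIs}.
    """
    scopes = defaultdict(set)
    for npi, sources in records.items():
        # Every matched NPI belongs to the national scope.
        scopes[("national", None)].add(npi)
        for rec in sources.values():
            address = rec.get("practice_address")
            if address is not None:
                # State component of the (city, state) tuple.
                scopes[("state", address[1])].add(npi)
            specialty = rec.get("primary_specialty")
            if specialty is not None:
                scopes[("specialty", specialty)].add(npi)
    return dict(scopes)
```

Using sets means a multi-state provider is counted once per state it appears in, and never twice within the same scope.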
Exclusions
- NPIs present in only one source.
- Organizational NPIs (Type-2 NPIs) — scored separately in a future methodology version; excluded from v1 to avoid mixing individual and group practices.
- Scopes with fewer than 100 matched NPIs (insufficient sample).
- Fields with null values in both sources (no comparison possible).
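The first two exclusions amount to an eligibility filter applied before any scoring. A minimal sketch, assuming a hypothetical `eligible_npis` helper and input shapes: NPPES encodes individuals as Entity Type Code 1 and organizations as 2, and the sketch additionally assumes that NPIs with an unknown entity type are dropped.

```python
from collections import Counter


def eligible_npis(source_npis, entity_type):
    """Return the NPIs that may be scored under the v1 exclusions.

    source_npis maps source name -> set of NPIs found in that source.
    entity_type maps NPI -> NPPES Entity Type Code ("1" or "2").
    Both shapes are hypothetical and chosen for the example.
    """
    # Count how many sources each NPI appears in.
    counts = Counter(npi for npis in source_npis.values() for npi in npis)
    return {
        npi
        for npi, n_sources in counts.items()
        # Keep NPIs in two or more sources that are Type-1 (individual).
        # Assumption: a missing entity type is treated as not eligible.
        if n_sources >= 2 and entity_type.get(npi) == "1"
    }
```

The remaining two exclusions (small scopes and fields that are null on both sides) are applied later, at scoring time, because they depend on the scope and the field being compared.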
Version history
pdai/v1
Initial release. Four scored fields: practice_address, primary_specialty, org_affiliation, telecom. Simple average composite with 50% coverage filter. National + state + specialty scopes. Insufficient sample threshold: 100 NPIs.
Limitations
- Agreement is a proxy for accuracy: a field that is wrong in the same way in every source still scores as full agreement.
- Address granularity is city + state only in v1; ZIP-level and street-level matching are deferred.
- Specialty codes are matched exactly; taxonomy synonym expansion is not performed. Source-specific code mapping variations may inflate disagreement counts.
- Snapshot timing differs across sources — NPPES, Care Compare, and PECOS have different update cadences. Field-level staleness is not currently accounted for.
- Fonteum does not independently verify, inspect, or certify any provider. This index describes data consistency, not provider quality.
Compliance posture
We do not sell rankings, and we do not accept payment to move a provider up the list. For final hire decisions, verify licensing, insurance, and references directly with the applicable licensing or credentialing body.
No bulk-licensing source family is currently ingested for this vertical. Hire-time checking still routes through the body named above.