In addition, 20.7% of the race data did not conform to the standard; the largest category was data that were missing.
We quantified the variability in the harmonized race and ethnicity data in the N3C Data Enclave by analyzing the conformance to health care standards for such data.
Differences in how race and ethnicity data are conceptualized and encoded by health care institutions can affect the quality of the data in aggregated clinical databases.
Transparency about how data have been transformed can help users make accurate analyses and inferences and eventually better guide clinical care and public policy.
face a substantial challenge in the form of data heterogeneity, stemming from varying data collection, documentation, and coding practices
The encodings used to represent race and ethnicity vary across institutions and data models and require specialized harmonization
The lack of consistency in how data about race and ethnicity are collected and structured by health care organizations
The Institute of Medicine’s landmark report on racial and ethnic disparities in health care, Unequal Treatment: Confronting Racial and Ethnic Disparities in Healthcare, highlighted the need for standardized collection and reporting of race and ethnicity data.
Data standardization and harmonization is one of the best tools for combating heterogeneity and ensuring that observed signals are genuine.
assess how different health care systems in various locations collect and conceptualize information about their patients’ race and ethnicity a
discuss race and ethnicity from the perspective of data standards and database harmonization.
The standard most commonly used by health care systems to collect and organize data about race and ethnicity was created for the 2000 US Census.
The CDC added encodings to the OMB Standard; both are shown in Table 1. To maintain clarity and consistency, we used these terms throughout this paper.
The 1997 OMB classification system was then adopted with minor changes by Health Level Seven International (HL7), the creator of the standard most widely used by health care systems to transmit and receive health records;
any references to “the health care standard” in our paper refer to how this information is currently structured in HL7 Fast Healthcare Interoperability Resources (FHIR).
The current health care standard uses terminology in a manner different from how it is used colloquially.
For the purposes of collecting and organizing self-reported patient demographic data, race and ethnicity are considered distinct concepts, and ethnicity refers only to Hispanic or Latino origin.
Because the health care standard treats race and ethnicity as separate concepts, it is recommended that the question about Hispanic or Latino origin be presented first when gathering demographic information from patients.
5 minimum categories for race: (1) AI/AN, (2) Asian, (3) Black or African American, (4) Native Hawaiian or Other Pacific Islander, and (5) White.
For patients who identify as multiracial, the 1997 OMB Standard and the Institute of Medicine Subcommittee both recommend allowing for the selection of more than one race rather than offering a single “multiracial” category
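The structure described in these highlights (five minimum race categories, with more than one selectable per patient) can be illustrated with a small conformance check. This is only a minimal sketch, not the method used in the paper: the function names, the three labels ("conformant", "missing", "nonconformant"), and the sample records are all hypothetical, and treating an empty selection as "missing" is an assumption made for illustration.

```python
from collections import Counter

# The five minimum race categories of the 1997 OMB Standard.
OMB_RACE_CATEGORIES = frozenset({
    "American Indian or Alaska Native",
    "Asian",
    "Black or African American",
    "Native Hawaiian or Other Pacific Islander",
    "White",
})


def classify_race(values):
    """Label one patient's recorded race selections (hypothetical labels).

    `values` is a list so that more than one race can be selected,
    rather than collapsing the patient into a single "multiracial" value.
    """
    if not values:
        return "missing"
    if all(value in OMB_RACE_CATEGORIES for value in values):
        return "conformant"
    return "nonconformant"


def conformance_summary(records):
    """Return the share of records that falls under each label."""
    counts = Counter(classify_race(values) for values in records)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


# Made-up sample records, for illustration only.
records = [
    ["White"],
    ["Asian", "White"],   # multiple selections still conform
    [],                   # nothing recorded
    ["Multiracial"],      # a single catch-all value does not conform
]
summary = conformance_summary(records)
```

Keeping each patient's race as a list of selections keeps the check aligned with the recommendation to allow more than one race. Ethnicity would be stored and checked in a separate field, since the standard treats it as a distinct concept.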