The Policy Evaluation and Research Linkage Initiative (PERLI) is a four-year project funded by the National Science Foundation that will establish powerful new datasets of linked administrative microdata from California state and local government sources.
Social, behavioral, and economic (SBE) research could inform solutions to poverty, income inequality, housing shortages, and other entrenched problems. Government administrative microdata are increasingly vital to SBE research and have enabled recent breakthroughs in these fields. Yet the potential of these data remains unfulfilled in the United States, where legal and technical barriers have led researchers to rely largely on national survey data (whose reliability has been called into question by several recent papers) and on administrative microdata from other countries, whose contexts differ substantially from that of the US. The lack of accessible, large-scale, linked US administrative microdata limits SBE research, producing findings of uncertain accuracy that may undermine the very public policy decisions they seek to inform.
PERLI will expand the data resources available for quantitative SBE research. Using administrative microdata from California, the world’s fifth-largest economy, we will create linked longitudinal datasets that open new pathways to understanding the causes and consequences of poverty and economic mobility, and to assessing whether government interventions help households succeed.
PERLI will also refine and disseminate methods of data linkage that do not depend on access to personally identifiable information, thus expanding the universe of data available for research, and providing tools for accessing it in a streamlined fashion while safeguarding privacy.
The datasets will be available to researchers through a streamlined and secure virtual environment, accompanied by data documentation, analysis files, and guidance on privacy-preserving linkage methods.
PERLI will create three datasets:
“Life Course” dataset — Expected 2022
This dataset will link anonymized longitudinal data on individuals interacting with a vast array of public services in Sonoma County, California. This will capture interactions with criminal justice, health, behavioral health and human services, and housing programs. This dataset will facilitate research on complex, cross-domain issues such as poverty, homelessness, and mental illness, and inform more comprehensive policy solutions.
“Safety Net” dataset — Expected 2022
This dataset will link anonymized longitudinal data about individual and household participation in a wide range of state and federal social safety net programs. These data will facilitate research on the efficacy of the existing safety net, successful exits from government support, and shortfalls in eligible take-up.
“Household Economics” dataset — Expected 2023
This dataset will link anonymized longitudinal household financial information for millions of Americans, including household income, debt, assets, and credit histories, using the newly created UC Consumer Credit Panel. These data will facilitate research on pathways into and out of poverty, the role of debt in weathering and triggering economic shocks, and household mobility and financial well-being.
Resources for Researchers
PERLI will also develop resources to help researchers understand and use these data:
Data documentation — Each dataset will be accompanied by relevant and detailed data documentation, developed by CPL in partnership with each data provider. These will include descriptions of variables, codebooks, and any known issues of reliability or missingness.
Data linkage methods — As part of PERLI, CPL will establish a webpage dedicated to sharing information about data linkage methods. Data linkage is often essential to projects using administrative data, and can greatly influence the results of some studies, but is too rarely discussed by SBE researchers.
For the linkages required for this project, we will use a privacy-protective record linkage (PPRL) technique that masks identifying information with one-way hashes but still allows individual records to be linked across datasets, even when transcription errors lead to small differences in the way that identifiers are recorded. CPL will advance the use of this method in SBE research by testing methods and by disseminating resources such as relevant code, linkage metrics, and a white paper for SBE researchers interested in applying this method to their own data.
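To give a sense of how hashed identifiers can still be compared when they contain small errors, here is a minimal Python sketch of one commonly discussed PPRL approach: encoding the character bigrams of an identifier into a keyed Bloom filter and comparing filters with a Dice coefficient. This is an illustration only, not a description of the specific technique PERLI will use, which this page does not detail. The shared key, the filter length, the number of hash functions, and the function names are all assumptions made for the example.

```python
import hashlib
import hmac

# All values below are illustrative assumptions, not PERLI parameters.
FILTER_BITS = 256                      # length of each Bloom filter
NUM_HASHES = 4                         # bit positions set per bigram
SECRET_KEY = b"hypothetical-shared-key"  # would be agreed on by the data holders


def bigrams(value: str) -> set[str]:
    """Split a normalized identifier into padded two-character chunks."""
    padded = f"_{value.strip().lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def encode(value: str) -> frozenset[int]:
    """Return the set of Bloom filter bit positions for an identifier.

    Each bigram is hashed with a keyed one-way function (HMAC-SHA256),
    so the original text is not stored in the encoding.
    """
    positions = set()
    for gram in bigrams(value):
        for k in range(NUM_HASHES):
            message = f"{k}:{gram}".encode("utf-8")
            digest = hmac.new(SECRET_KEY, message, hashlib.sha256).digest()
            positions.add(int.from_bytes(digest[:8], "big") % FILTER_BITS)
    return frozenset(positions)


def dice(a: frozenset[int], b: frozenset[int]) -> float:
    """Dice coefficient between two encodings (1.0 means identical)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    reference = encode("Katherine")
    typo = encode("Kathrine")      # one dropped letter
    unrelated = encode("Michael")
    print("typo similarity:     ", round(dice(reference, typo), 2))
    print("unrelated similarity:", round(dice(reference, unrelated), 2))
```

Because a one-letter transcription error changes only a few bigrams, most bit positions still match and the similarity stays high, while an unrelated name shares few positions. In practice a linkage would compare several encoded fields and choose a match threshold empirically; that step is omitted here.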
Accessing the datasets
The PERLI datasets will be hosted on CPL’s Secure Data Hub, a virtual enclave environment designed for secure analysis and research using sensitive administrative microdata. Only approved users have access and only for approved projects. User activities are monitored, logged, and audited.
We will begin posting applications for research projects using the PERLI datasets in 2021. Please check back here or sign up for our updates list below. Because final approval lies with the contributing data partners, we cannot make guarantees about the likelihood that a proposal will be accepted.
To stay up to date on PERLI, please subscribe to our updates list.
- Oct 1, 2020 — Press release: California Policy Lab Awarded $2M National Science Foundation Grant