The University of California Consumer Credit Panel (UCCCP) is a new dataset of anonymized consumer credit information, created for the purpose of studying consumer financial well-being and identifying trends among California households related to credit, debt, income, and mobility. The UCCCP was created in 2020 through a partnership between the California Policy Lab, the Student Borrower Protection Center, and the Student Loan Law Initiative. The dataset is designed for use by researchers affiliated with the University of California or the California Policy Lab. The data can inform research on a variety of topics including economic mobility, health and financial well-being, social mobility, the impact of student debt, California’s housing challenges, and more.
More about the dataset
The UCCCP is a longitudinal panel of approximately 40 million consumers starting in 2004 and continuing quarterly through 2019. Updates to the data on a quarterly basis are anticipated but will depend on funding. The sample comprises anonymized credit records of a nationally representative 1% sample of U.S. adult consumers with credit records along with a full sample of 100% of Californians with credit histories. The dataset also includes records from consumers who shared an address or an account (e.g., co-signers) with those in the sample. Data elements include demographic information about consumers, credit scores, and raw tradeline-level information about each loan or collections item, including payment history, credit limits and balances, and various information about the type and status of those tradelines, including collections and deferments.
While the UCCCP is similar to existing credit panels by the Federal Reserve Bank of New York and the Consumer Financial Protection Bureau, it also has three distinct advantages for researchers:
1. The size of the sample and the oversampling of California consumers.
2. The granularity of the data.
3. A streamlined process (through CPL) for potentially linking the UCCCP data with other California data.
The data originates from one of the three nationwide consumer reporting agencies. Before being provided to the UCCCP, the data was stripped of any information that might reveal consumers’ identities, such as names, addresses, and Social Security numbers.
Accessing the Data
The UCCCP is hosted on CPL’s Secure Data Hub, which is a virtual enclave environment designed for secure analysis and research of sensitive administrative microdata. Only approved users have access and only for approved projects. User activities are monitored, logged, and audited.
There is a cost to use the data. Interested researchers should send questions and inquiries to: firstname.lastname@example.org.
Potential users should note that some requests may not be approved and that, as of March 2020, the data were still being cleaned and assembled.
While we are accepting research inquiries immediately, we don’t expect the data to become available until at least summer 2020. Please check this page for updates.
Frequently Asked Questions
Can you describe the data in more detail? For example, what variables are in the data?
We do not yet have detailed data documentation available to share with potential users. However, our data is similar to credit panels held by the Federal Reserve Bank of New York and the Consumer Financial Protection Bureau, so it may help to read up on those data.
The NY Fed describes its data here: An Introduction to the New York Fed Consumer Credit Panel. One main difference from the NY Fed’s data, besides sampling, is that the UCCCP contains tradeline-level information from the credit bureau in addition to person-level information.
Here is a sampling of research using consumer credit panel data:
For each consumer in each archive, there are four files: one on consumer characteristics, one on tradelines, one on inquiries, and one on public records.
- Consumer characteristics include credit score, geography, gender, month and year of birth, marital status, occupation and education codes, household count, and an indicator of homeownership status. We plan to perform reliability tests on some of these data, which are modeled/estimated by the credit bureau.
- “Tradelines” are loans or other reported credit products, and we receive several variables describing each tradeline, such as loan type, balance amount, minimum payment, credit limit, open and closure dates, and a multi-year monthly payment history.
- Hard inquiries for credit (i.e., credit checks) are tracked by date, dollar amount, and type of business.
- Public records include bankruptcy records, including the type of bankruptcy, the filing date, and the amounts of assets and liabilities.
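The four-file layout described above can be pictured as one record per consumer per archive. The following is only a rough sketch: detailed documentation is not yet available, so every class and field name here is a hypothetical placeholder rather than the actual UCCCP schema, and only a few of the variables listed above are shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# NOTE: all names below are hypothetical placeholders, not the real UCCCP schema.


@dataclass
class ConsumerCharacteristics:
    credit_score: Optional[int] = None
    zip5: Optional[str] = None
    birth_year: Optional[int] = None
    birth_month: Optional[int] = None


@dataclass
class Tradeline:
    loan_type: str
    balance: float
    credit_limit: Optional[float] = None


@dataclass
class Inquiry:
    inquiry_date: str
    amount: Optional[float] = None
    business_type: Optional[str] = None


@dataclass
class PublicRecord:
    bankruptcy_type: str
    filing_date: str


@dataclass
class ConsumerArchive:
    """One consumer in one quarterly archive, with its four component files."""
    consumer_id: str
    archive: str  # e.g. "2004-03"
    characteristics: ConsumerCharacteristics
    tradelines: List[Tradeline] = field(default_factory=list)
    inquiries: List[Inquiry] = field(default_factory=list)
    public_records: List[PublicRecord] = field(default_factory=list)


def total_balance(record: ConsumerArchive) -> float:
    """Sum the balances across all of a consumer's tradelines in one archive."""
    return sum(t.balance for t in record.tradelines)
```

Keeping tradelines, inquiries, and public records as lists reflects that a consumer may have zero or many of each in a given archive.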
Can you identify individuals in the data?
No. The data are anonymized so that consumer privacy is maintained. There are no names, addresses, Social Security numbers, birth dates, or other personally identifying information in the data.
For what years do you have the data?
We have quarterly extracts going back to 2004. The first archive is from March 2004, and we are receiving archives through the present from March, June, September, and December of each year. We plan to continue purchasing the data going forward, pending funding availability.
Some of the demographic data is unavailable for archives before June 2010.
What is the most detailed geography for which you have data?
The UCCCP data has 5-digit ZIP and Census Block Group for each record.
Can you describe the sampling methodology in greater detail?
There are two samples, one nationwide and one from California.
The National Sample: For each archive, we first select all records with a “consumer pin” ending in one of two two-digit numbers (e.g., 24 or 56). The consumer pin is assigned sequentially by the credit bureau and we will be testing to ensure that it creates a representative nationwide sample.
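As a minimal sketch of the suffix rule described above, the selection could look like the following. The field name `consumer_pin`, the function names, and the suffix values are assumptions for illustration; the text gives 24 and 56 only as examples, not as the suffixes actually used.

```python
from typing import Dict, FrozenSet, Iterable, List

# Hypothetical suffixes, taken from the "e.g." in the text; not the real ones.
EXAMPLE_SUFFIXES: FrozenSet[str] = frozenset({"24", "56"})


def in_national_sample(consumer_pin: str,
                       suffixes: FrozenSet[str] = EXAMPLE_SUFFIXES) -> bool:
    """Return True when the last two characters of the pin match a chosen suffix."""
    return consumer_pin[-2:] in suffixes


def select_national_sample(records: Iterable[Dict[str, str]],
                           suffixes: FrozenSet[str] = EXAMPLE_SUFFIXES) -> List[Dict[str, str]]:
    """Keep only the records whose consumer pin ends in one of the suffixes."""
    return [r for r in records if in_national_sample(r["consumer_pin"], suffixes)]
```

The pin is handled as a string so that the comparison is on its trailing digits rather than on a numeric value.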
The California Sample: We first selected all consumers who had a California address in at least one of the sixty quarterly archives between March 2004 and December 2019. We have data for those consumers from all archives, even from archives in which they are not located in California. The resulting sample includes “always” residents of California as well as “comers” to California during the 2004-19 period and “leavers” from California from 2004 to the present and into the future.
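The "ever in California" rule can be sketched in two passes: first collect the consumers with a California address in any archive, then keep all of their rows regardless of where they lived in a given quarter. The field names `consumer_id`, `archive`, and `state` are hypothetical and stand in for whatever the actual files use.

```python
from typing import Dict, List, Set


def ever_in_california(records: List[Dict[str, str]]) -> Set[str]:
    """IDs of consumers with a California address in at least one archive."""
    return {r["consumer_id"] for r in records if r["state"] == "CA"}


def california_sample(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep every archive row for consumers who were ever in California,
    including rows from quarters when they lived elsewhere."""
    keep = ever_in_california(records)
    return [r for r in records if r["consumer_id"] in keep]
```

Under this rule a consumer who moved out of California after one quarter, or moved in only in the last quarter, stays in the sample for every archive in which they appear.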
Household Members and Associated Borrowers: For both the National and California Samples, we also have data for consumers who share the same address (a maximum of 8 co-habitants) during that archive (Household Members). We also have data for consumers who are on the same tradelines, such as co-signers (Associated Borrowers). These Household Members and Associated Borrowers are distinguished within the data, and we only have data for them during the archives in which they are associated with the Sample members.
Can the UCCCP data be linked to other data? If so, how?
Yes, in some circumstances.
The UCCCP data can readily be linked with other data at the ZIP-5 or Census Block Group level, or at higher levels of geography.
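A geographic link of this kind amounts to a left join on the geography key. The sketch below uses plain Python dictionaries; the field names `zip5` and `zip_attributes` and the example attribute are assumptions for illustration, not UCCCP variables.

```python
from typing import Any, Dict, List


def link_by_zip(credit_rows: List[Dict[str, Any]],
                zip_table: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Left join: attach ZIP-level attributes to each credit row.

    Rows whose ZIP code is missing from zip_table get None for the attributes.
    """
    linked = []
    for row in credit_rows:
        merged = dict(row)  # copy so the input rows are left unchanged
        merged["zip_attributes"] = zip_table.get(row["zip5"])
        linked.append(merged)
    return linked
```

ZIP codes are kept as strings here so that codes with leading zeros, which occur in the national sample, are not altered.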
In addition, we have arranged for a streamlined process by which UCCCP data can be linked with other data at the individual level. This process requires that each data provider encrypt identifiers in the same manner, so that the encrypted values can be matched and linked on CPL’s servers without the underlying identifiers ever being seen. The process requires an additional $5,000 fee and is subject to approval by the data providers.
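The general idea of matching on identically encrypted identifiers can be illustrated as follows. The actual method used for the UCCCP is not described here, so this sketch substitutes a keyed hash (HMAC-SHA256) with a key shared among the providers; the function names, the normalization step, and the key are all assumptions.

```python
import hashlib
import hmac
from typing import Dict


def encrypt_identifier(identifier: str, shared_key: bytes) -> str:
    """Turn an identifier into a token using a keyed hash (an assumed method).

    Every provider must normalize and hash in exactly the same way,
    otherwise the same person would produce different tokens.
    """
    normalized = identifier.strip().replace("-", "")
    return hmac.new(shared_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def link_on_tokens(left: Dict[str, dict], right: Dict[str, dict]) -> Dict[str, dict]:
    """Join two providers' records on tokens alone; raw identifiers are never needed."""
    return {tok: {**left[tok], **right[tok]} for tok in left.keys() & right.keys()}
```

In this sketch each provider would run `encrypt_identifier` on its own systems and send only the tokens and attributes, so the linking step works without the original identifiers.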
Who is eligible to access the data?
Faculty, students, and employees of the University of California are eligible to access the data, but their specific use of the data must first be approved by the credit bureau and CPL.
What is the approval process for accessing the data?
If you are interested in conducting research with the UCCCP, please fill out this form: UCCCP Research Request Form.
Some research projects may not be approved. Projects will be reviewed at the start of each month, so please be patient if you’ve submitted an inquiry before that time.
Even those who are approved for access may experience delays in getting access to the data as we get our hosting environment set up. We appreciate your patience.
What is the cost for accessing the data?
We are charging users $5,000 per project to access the data. This helps recoup our costs of purchasing and hosting the data.