This report documents the methodology and technical approach used to link Survey of Earned Doctorates (SED) data, collected by the National Center for Science and Engineering Statistics (NCSES) within the National Science Foundation, to a new source of research data (UMETRICS [Measuring the Impacts of Research on Innovation, Competitiveness, and Science]) processed by the Institute for Research on Innovation and Science (IRIS), and it summarizes the results of that linkage.
This data linkage research shows that it is feasible to combine the two data sources, even though they differ in population coverage, time periods, and identifiers. The implementation approach can inform similar efforts, as well as the implementation of the Foundations for Evidence-Based Policymaking Act of 2018. The data linkage approach, while initially time-consuming, is replicable and can scale to additional years and data sets. Measuring linkage quality when combining administrative and survey data requires defining both the relevant population to be matched and the match rate, and those definitions will differ depending on the goal of the linkage. The value of the linkage effort is considerable in terms of (1) expanding understanding of survey responses to key questions of interest, notably by using grant funding data to enrich understanding of reported sources of graduate school financial support, and (2) adding new measures that reflect the dynamics of doctorate recipients' research experiences. There is substantial potential to engage researchers and the academic community to inform the linkage quality results, as well as to expand and enhance the value of the data linkage project documented in this report.
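The point that a match rate depends on how the relevant population is defined can be illustrated with a minimal sketch. The records, field names (`institution_in_admin_data`, `matched`), and the `match_rate` helper below are hypothetical and are not drawn from the SED or UMETRICS data; they only show how the same set of links yields different rates under different denominators.

```python
def match_rate(records, in_population):
    """Share of records in the chosen population that were linked."""
    eligible = [r for r in records if in_population(r)]
    if not eligible:
        raise ValueError("population is empty")
    matched = sum(1 for r in eligible if r["matched"])
    return matched / len(eligible)


# Hypothetical survey records: every field and value here is made up.
records = [
    {"id": 1, "institution_in_admin_data": True, "matched": True},
    {"id": 2, "institution_in_admin_data": True, "matched": True},
    {"id": 3, "institution_in_admin_data": True, "matched": False},
    {"id": 4, "institution_in_admin_data": False, "matched": False},
    {"id": 5, "institution_in_admin_data": False, "matched": False},
]

# Denominator 1: every survey record.
rate_all = match_rate(records, lambda r: True)

# Denominator 2: only records from institutions the administrative
# source covers, i.e. records that could have matched at all.
rate_covered = match_rate(records, lambda r: r["institution_in_admin_data"])

print(f"all records:      {rate_all:.2f}")
print(f"covered records:  {rate_covered:.2f}")
```

In this toy example the same two links give a rate of 0.40 against all records and about 0.67 against covered records, which is why the population definition needs to be stated alongside any reported match rate.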