The IGERT (Integrative Graduate Education and Research Training) Big Data Social Science Ph.D. Training Program

BDSS-IGERT was an interdisciplinary research and training program for Penn State PhD students interested in data-intensive and computation-intensive research based on large or complex datasets arising from human interaction. BDSS-IGERT had a core mission of enabling a new type of scientist, defining a new model and role for the social sciences in both research and education, reflected now in Penn State’s new Program in Social Data Analytics (SoDA) and Center for Social Data Analytics (C-SoDA).

BDSS-IGERT trainees received a $30,000 annual stipend and additional resources to complete a two-year training program involving the graduate curriculum in Social Data Analytics (SoDA), interdisciplinary research rotations, and externships. Students could also participate as “affiliates” of the program through a variety of mechanisms. From 2012 to 2018, BDSS-IGERT directly supported over 50 PhD student fellows and affiliates from political science, geography, statistics, sociology, information science, human development, health policy, criminology, computer science, and demography.

BDSS-IGERT was funded by a $3m IGERT grant from the National Science Foundation in 2012, with over $3m of additional programmatic and infrastructure support from the College of Liberal Arts, the Office of the Vice President for Research, the Social Science Research Institute, the Institute for CyberScience, the Department of Political Science, the Quantitative Social Science Initiative, and private donors. Participation of international students was made possible through support from the College of Liberal Arts, the Quantitative Social Science Initiative, and partner colleges: the College of Earth & Mineral Sciences, the College of Engineering, the College of Health & Human Development, the College of Information Sciences & Technology, and the Eberly College of Science.

BDSS-IGERT was developed by the Penn State Quantitative Social Science Initiative (QuaSSI), in partnership with over 90 faculty across the Penn State campus, as well as nonacademic partners in industry, government, and nonprofits.

BDSS-IGERT Training Program Requirements

The BDSS-IGERT training program consisted of four core elements:

  1. Curriculum. Integrated with their home PhD program requirements, BDSS-IGERT trainees completed an interdisciplinary curriculum, now instantiated as the dual-title PhD and doctoral minor in Social Data Analytics. Please see the SoDA Graduate Program page for more detail.
  2. Research rotations. Trainees spent the academic year as participants in interdisciplinary social data analytics research hosted by faculty, projects, and labs across Penn State. Trainees and research rotation hosts were advised to manage this as a roughly quarter-time (10 hours per week) RAship. Most research rotation hosts were members of the SoDA Graduate Faculty. Trainees were matched to hosts primarily through the “Speed Dating / Matchmaking” event, held annually in September, and trainees were expected to rotate across the social science / non-social science boundary during their two years. Each spring semester, trainees and hosts presented a poster discussing the project and its progress to that point. These research rotations have resulted in an impressive volume and breadth of interdisciplinary research co-authored by BDSS-IGERT students. Please see the Research page for more detail.
  3. Externships. Trainees were required to take up externships in their two BDSS-IGERT summers, engaging in social data analytics research outside of Penn State. At least one of these was to be in a nonacademic research setting such as private industry, government agencies, or nonprofits. BDSS-IGERT trainees took up externships in a wide variety of locations across the globe, resulting in both ongoing research collaborations and job opportunities. 
  4. Community. BDSS-IGERT trainees and other students engaged in a wide variety of other collaborative and community events. These included taking (and offering) training workshops, participating in hackathons and other research challenge events, and meeting with a variety of outside visitors hosted through the BDSS-IGERT Speaker and Event Series. The center for BDSS-IGERT community activity was the DataBasement in the Sparks Building on central campus. The result of a $1m renovation in 2013, the DataBasement is a flexible collaboratory, acting both as a student lab with access to computational infrastructure (including a high-throughput fiber optic research network, visualization wall, and Hadoop cluster) and as the site for lectures, workshops, poster sessions, hackathons, and other events. The DataBasement is now the home of the Center for Social Data Analytics.

Big Data Social Science IGERT - Outcomes Report

The following is the summary outcomes report for the general public, provided to the National Science Foundation in December 2019.

Intellectual Merit - Scholarly productivity and impact

The Big Data Social Science IGERT (BDSS-IGERT) directly catalyzed over 300 reported publications and related scientific products (including software, data, and patents) co-authored by IGERT Trainees, Associates, and Affiliates. As of November 2019, Google Scholar reports over 3,000 citations to this research.

Intellectual Merit - Student recognition

BDSS-IGERT Trainees, Associates, and Affiliates have been recognized for the intellectual merit of their research through awards. These include two winners of the American Statistical Association’s Gertrude Cox Women in Statistics Award, one winner of the American Sociological Association’s Outstanding Dissertation in Progress Award, multiple winners of conference paper and poster awards including from the Population Association of America, the American Statistical Association, and the Political Networks Society, and multiple winners of competitive fellowships including a NASA Space Grant Fellowship and an NIH Pathways T32 Predoctoral Fellowship.

Broader Impacts - Defining and leading the field of Social Data Analytics

BDSS-IGERT has led the development of the new field of Social Data Analytics, a social science-centered approach to the science of learning from socially-generated data. The Graduate Program in Social Data Analytics (SoDA), established in 2016, offers a dual-title PhD for students in Political Science, Sociology, Statistics, Human Development & Family Studies, and Informatics, as well as a graduate minor, available to students in any Penn State PhD program (which have included to date Criminology, Communications, Communications Arts & Sciences, Geography, Marketing, Psychology, Rural Sociology, and Tourism Management). In addition, BDSS-IGERT and the graduate program in SoDA were the catalyst for the development of a new undergraduate degree at Penn State, a Bachelor of Science in Social Data Analytics. In 2018, Penn State opened the Center for Social Data Analytics (C-SoDA), providing an ongoing institutional structure in support of the interdisciplinary community built under BDSS-IGERT.

Broader Impacts - Enabling a new type of scientist

Explicit objectives of BDSS-IGERT, and the SoDA graduate program based on it, are to enable a new type of scientist, and to provide a new and expanded notion of the role of social science in graduate education and in society. BDSS-IGERT’s success in this endeavor is now evident in the scope and quality of the placements of our students in both academic and industry positions in data science, analytics, and social science methodology. Nonacademic employers now include Google (x4), Verisk Maplecroft (x3), Facebook (x2), IBM Research, NASA, RTI International, RAND, SAIS, IARPA, and the Office of the Director of National Intelligence. Academic employers now include Harvard, Stanford, Carnegie Mellon, Columbia, Ohio State, UCLA, NYU (x2), Pittsburgh, Johns Hopkins, Georgia, Rochester Institute of Technology, SMU, and Miami (Ohio). BDSS-IGERT students have also been recognized with prestigious competitive fellowships dedicated directly to the broader societal impacts of data science, with two holders of Data Science for Public Good fellowships and two holders of Data Science for Social Good fellowships.

Broader Impacts - Broadening participation

BDSS-IGERT had an explicit objective at the outset of broadening participation in data science and quantitative social science. Of the 36 funded Trainees and Associates, 18 (50%) were women, and of the 30 NSF-funded US citizen & permanent resident Trainees, 7 (23%) were from traditionally underrepresented groups (three of African descent, two of Latinx descent, and two of Native American descent). These students include two winners of the American Statistical Association’s Gertrude Cox Women in Statistics Award, one winner of the Sloan Foundation Exemplary Mentoring Award, and one student named a White House Champion of Change and appointed to the Pennsylvania Governor’s Commission for Women. This commitment and impact have carried through to the Social Data Analytics degree programs based on BDSS-IGERT, which have much more extensive student participation from women and underrepresented groups than is currently typical in data science programs.