The INSPIRE U2 Program: Training Students in Big Data and Statistics Using RStudio

[This article was first published on RStudio Blog, and kindly contributed to R-bloggers].

Logos of the National Science Foundation, Spelman College, and RStudio

This is a guest post from Dr. A. Nayena Blankson, Professor of Psychology at Spelman College. Dr. Blankson is the Director of the INSPIRE U2 summer site (NSF Award #1852056), as well as a researcher, consultant, and award-winning instructor.

About the INSPIRE U2 Program

From television game shows to children’s movies, the importance of data is evident. For example, in an episode of the television game show “Wheel of Fortune” that aired on May 28, 2018, the category was Occupation and the puzzle was DATA SCIENTIST. In the children’s movie Cars 3, a female Data Analyst character presents the results of her research right at the start of the movie. The need for individuals who are statistically informed and trained in data analysis is increasingly recognized.

However, there is a critical need for diverse perspectives in statistical inquiry. Women of color and minorities are less likely to pursue careers in statistical fields because of a lack of uniquely stimulating opportunities and appropriate support within the college environment. In recognition of the barriers that underrepresented students may face in pursuit of science degrees, there has been increased development of “pipeline” programs aimed at strengthening opportunities for underrepresented undergraduates.

Funded in 2018 by the National Science Foundation (NSF Award #1852056), the Increasing Statistical Preparation in Research Education for Underrepresented Undergraduates (INSPIRE U2 Program) is a Research Experiences for Undergraduates site at Spelman College. The interdisciplinary program aims to provide a learning pathway that will set underrepresented female students on a track towards graduate studies and careers in statistical and data fields. It is expected that this initiative will: 1) increase student interest in advanced degree programs; 2) provide support and mentorship to students; and 3) serve as a pipeline for entry into advanced degree programs.

Each student will have the opportunity to conduct an independent research project, and in doing so, students will develop the skills, confidence, and inspiration to pursue advanced statistics opportunities within the sciences. Key innovations include: 1) the merging of two evidence-based training approaches, specifically the former Quantitative Training for Underrepresented Groups program and the Passion-Driven Statistics curriculum; 2) training in the flexible application of knowledge; 3) analysis of data in real world contexts; and 4) intensive one-on-one mentoring and support.

2021 INSPIRE U2 Program Session

The 2021 INSPIRE U2 Program ran from June 7 to July 30. Eleven students participated in the program, and their majors ranged from biology to journalism. Over the course of the eight-week program, INSPIRE U2 Scholars participated in a series of activities including professional development sessions, weekly mindfulness sessions (led by Dr. Natalie Watson-Singleton), and a Statistics Bootcamp using RStudio (taught by me). INSPIRE U2 Scholars also worked on an independent research project using freely available Big Data sets. They developed their own research questions, conducted data analyses to answer those research questions, and presented their work.

Per Brown, Davis, and McClendon (1999), there are three essential components of mentoring underrepresented students: role modeling, role molding, and collegial friendships. The INSPIRE U2 program stressed all three components. In particular, knowing that mentorship plays a large role in student outcomes, the program took great care to place students into teams that included student mentors (Peer Mentors and Junior Mentors) and faculty mentors (Senior Mentors). Senior Mentors were Dr. Alexandria Hadd (Spelman College), Dr. Lisa C. Dierker (Wesleyan University), Dr. Bhikhari Tharu (Spelman College), Dr. Lisa L. Harlow (University of Rhode Island), Dr. Mentewab Ayalew (Spelman College), and myself (Dr. A. Nayena Blankson). Kathleen Bostic, a third-year biology major at Spelman, served as the Junior Mentor.

Partnership With RStudio

Additionally, through the establishment of a partnership with RStudio prior to the start of the summer program, RStudio Mentors were added to the planned student teams. The RStudio Mentors were Edgar Ruiz, Jeff Allen, Mara Averick, Mine Çetinkaya-Rundel, Curtis Kephart, and Jesse Mostipak. Students met with their Senior Mentors for at least one hour per week. Meetings with RStudio Mentors at times occurred jointly with Senior Mentors so that all team members were on the same page regarding the research topic and research question. Scholars also met independently with their RStudio Mentors to discuss their data wrangling, visualization, and analysis code, along with other topics depending on the student and mentor. In the Bootcamp sessions, the Scholars were introduced to R and RStudio along with statistical topics such as analysis of variance, analysis of covariance, and multiple regression. By working with their Senior and RStudio mentors, Scholars were able to go beyond the main lessons learned in the Bootcamp.

All Scholars presented their research projects in a Summer Research Symposium at the end of the summer. Research topics ranged from plant biology to income and race differences in police contact. The presentations were outstanding, and that is an understatement. This is all the more notable because, at the time of the program, eight of the 11 Scholars had just finished their first year of college. Most had never taken a statistics course before the summer program. Even more pertinent is the fact that none of the Scholars had ever used R or RStudio before the summer program. Students experienced the full breadth of conducting analyses with secondary data, which can be very messy; data are not always what you expect them to be, and sometimes information is missing. Some Scholars started off with one data set, learned the details about how those data were collected, and examined the sample and variables, only to realize that the data set might not be the best for answering their proposed research question. They sifted through many research articles, data sets, and variables. They reformatted and restructured their data sets for analyses. They checked their data for missing values, transformed variables, and more. That students were able to accomplish so much in the span of a few short weeks highlights the impact that such programs can make on the educational and career trajectories of students.

Due to COVID, the inaugural Summer 2020 program was cancelled. To avoid a second year of cancellation, the 2021 program was entirely virtual. Prior to COVID, plans for the program included virtual mentoring. Therefore, shifting entirely to a virtual environment for the 2021 program was not completely off course from the original plans. Moreover, it allowed greater flexibility in the availability of both the Senior Mentors and the RStudio Mentors for the program. Overall, the program was an incredible success, according to an internal evaluation report. Scholars benefited from having a team of mentors who supported them on their research projects, including their RStudio Mentors.

Student Testimonials

“I was ecstatic to get the chance to combine my two favorite subjects, mathematics and computer science, and see how I can use both outside the classroom settings!” (Samantha Armijo, Dominican University)

“The Inspire U2 Program was the first time I applied statistics to the real world. My project was on healthcare costs and how gender, age, and type of service might affect it. It was also my first time using RStudio. Having a background in coding, RStudio was easy for me to pick up. It was a lot of fun learning how to beautify my data with all sorts of colors, graphs, and fonts! It was also fun manipulating data for the first time and drawing conclusions based on your data. The Inspire U2 Program and RStudio taught me a lot about statistics, and I look forward to using what I learned in my future studies. Having a head start already puts me ahead of the curve and I feel more confident in conducting research and manipulating data.” (Angel Bryant, Howard University)

“INSPIRE U2 challenged me to research with purpose and persist through obstacles. Thank you, Dr. Blankson, Dr. Tharu, Mr. Allen, and all my mentors and peers for this memorable experience. I encourage my peers to think critically about the role data plays into your everyday life and then analyze data through independent projects and research programs like INSPIRE U2!” (Michel Ruiz-Fuentes, Smith College)

Learn More

For more information about the INSPIRE U2 Program, visit sites.spelman.edu/inspireu2-reu. Applications for the 2022 program will open on November 15, 2021.

This post is cross-posted on Dr. A. Nayena Blankson’s blog.
