Statistics and R at the Intel ISEF Science Fair

June 24, 2014

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

by Wayne Smith, Ph.D., California State University, Northridge

Editor's note: This post was abstracted from the monthly newsletter of the Southern California Chapter of the ASA.

On May 13th and 14th, the Intel International Science and Engineering Fair (Intel ISEF), the world’s largest international pre-college competition, was held at the Los Angeles Convention Center.

I was blessed with the opportunity to represent the American Statistical Association (ASA).  As one of approximately 30 statisticians, I helped judge the statistics-related elements of numerous prescient and empirical projects presented by high school students from around the world.  These students had already won other local and regional science and engineering competitions.  We selected first, second, and third place winners, but 16 student teams in total received special recognition and goodie bags filled with software, books, and other items.

The photograph below shows the first place winner, Soham Daga from New York, who used Google Trends to develop a model to predict the likelihood of mortgage delinquency. An interview with Soham can be found here.

[Photograph: ISEF first place winner in statistics]

I have no doubt that a lasting affinity with statistical professionals and supporting organizations will be a tangible outcome for these motivated, young researchers.

I was energized and transformed by the breadth and depth of the research methods and concomitant inferential analysis applied to address pressing issues in areas as diverse as health care, energy, sustainability, material science, pharmacology, biochemistry, financial economics, and many others.  Along with my ASA colleagues, I discussed projects with students as young as 15.  As one might expect, many of the high school seniors are attending top research universities in the fall.  I was especially impressed with the rich diversity of students, including groups of students from Qatar, Egypt, Tunisia, Brazil, Japan, Russia, and historically underrepresented areas in the U.S. such as Fresno, CA.  Some of the students' work has been ongoing for more than a year, and the students offered background literature (with references!), purposeful hypotheses, detailed analysis and results (occasionally with tool manifests and explanatory code), and integrated conclusions.

Of the 80 or so projects I reviewed, I observed applications of the general linear model; repeated measures; logistic regression; non-parametric measures; classification, feature extraction, and dimensionality reduction; sundry machine learning approaches; and Monte Carlo simulations.  I was equally impressed by these students' abilities in fundamental research tasks such as locating and using open source software (e.g., R), understanding and coherently explaining potential I/O and computational bounds, finding and interpreting peer-reviewed literature, and seeking out the assistance of relevant industry professionals.  Additionally, the students' ebullient entrepreneurial spirit in the design and execution of physical proof-of-concept prototypes and related statistical experiments was especially noteworthy.  I came away from each project and each student/team discussion with a new understanding of a thorny issue, a vision for what the solution space and product and process possibilities might be, and, perhaps most germane for a college instructor, a renewed calibration for the knowledge, skills, and abilities of a tapestry of young people in the broad areas of mathematical, statistical, and computational sciences.  I felt visceral pride in the statistical calling of many of these young finalists, and I know that they will craft much social, intellectual, and economic value for many decades to come.

A side benefit of service at this event was the opportunity to interact with academic and professional colleagues representing a variety of statistical-education interests.  In particular, I'd like to thank Madeline Bauer (USC/Keck), Theresa Utlaut (Intel), Jo Hardin (Pomona College), and Olga Korosteleva (CSULB) for their guidance in the judging process.  At this event one can interact with professionals from dozens of other professional societies and technology firms as well.

This Intel-sponsored event circulates annually among three U.S. cities.  I strongly recommend that individuals with a general interest in statistics and data science volunteer at this event and at local SCASA and OCLBASA events in the future.

Many, many thanks to all the statisticians who participated as judges and/or behind the scenes!  Thanks to the ASA for the cash prizes and thanks to Chapman & Hall/CRC, JMP, Minitab, O’Reilly Media, Revolution Analytics, Sage, Stata, and Taylor & Francis for the donated books, magazines, software, and other items.
