Visualizing Diabetes Clinical Studies Data

[This article was first published on R – NYC Data Science Academy Blog, and kindly contributed to R-bloggers.]


Diseases are part of human life. We have been dealing with them throughout history: from the plague that struck Athens, Greece in 430 BC (sometimes attributed to smallpox) to the mosquito-borne Zika virus that has affected the Americas in recent years. Fortunately, modern medicine has helped alleviate the human condition by providing ways to fight various diseases using scientific methods. At the heart of those methods is the clinical study: research involving human participants that is intended to advance medical knowledge.

I present an R Shiny prototype called the Diabetes Studies App. The app is designed to provide useful information about diabetes clinical studies in the United States. The information may be useful in the following ways:

(1) You, a loved one, a friend, or someone you know, may be suffering from diabetes and may be interested in participating in a clinical study or want to know more about the types of treatments available or that could be available in the future.

(2) You are interested in medicine and want to know more about real-life clinical studies.

(3) You are a part of the medical community and want to learn about clinical study sponsors and the state of medical treatments which may affect your organization’s constituents directly or indirectly.

(4) You are interested in investing in pharmaceutical products and you want to learn about sponsor data and pipelines in product development.

You can find more details about the Diabetes Studies App on my GitHub page.


This project was inspired by previous work by Xiao Jia. The dataset was obtained from ClinicalTrials.gov, a repository for clinical studies in the United States and around the world. It is a resource provided by the U.S. National Library of Medicine.

To keep the prototype manageable, the scope of this work was limited to one disease (diabetes), sponsors based in the United States, and the two main study types (clinical trials and observational studies). Briefly, a clinical trial is usually conducted to test, say, the efficacy of a drug or treatment before the drug is marketed to the public, while an observational study is usually conducted to assess safety when the drug is already on the market.

App Features and Insights:

The app’s main page shows five features in the left sidebar: Intro, Studies Info, Annual Data, Sponsor Data, and Map. The Intro presents videos that provide background information about diabetes and the difference between a clinical trial and an observational study. Below, I describe selected insights from the other four features.

The second feature, Studies Info, presents bar charts of the number of clinical studies broken down by a selected attribute: sponsor type, diabetes condition type, intervention type, status of studies, enrollment, phase (for clinical trials only), or duration. Bar charts are presented separately for clinical trials and observational studies. Some insights from this feature are:

– There are more diabetes clinical trials than observational studies.
– The studies are sponsored mostly by Industry (pharmaceutical companies) and Others (academic and other institutions and non-profit organizations).
– The most common intervention types (what the studies are investigating) are drugs, behavior, and devices.

The third feature is Annual Data, where you will find the number of studies started or completed per year. Each page presents line graphs by sponsor type (left graph) and intervention type (right graph). If you select “Studies by Start Year”, you will learn from the graph on the left that:

– Most studies are sponsored by Industry and Others, as we have seen earlier.
– There was a large increase in the number of clinical studies sponsored by Industry and Others from 2000 to 2010. A possible reason is that diabetes became a more serious public health problem in those years, affecting many more people than before and creating an urgent need for new treatments.
– There appears to be a decline in the number of studies sponsored by Industry after 2010, while the number of studies by Others remains relatively stable. Perhaps this means that Industry no longer sees diabetes as a profitable disease area to invest in? It may be worth researching further why this is happening.
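The line graphs in Annual Data come down to counting studies per year within each group. The app itself is written in R Shiny and its code is not shown here, so the following is only a minimal Python sketch of that counting step. The records, the field names (`start_year`, `sponsor_type`), and the helper `count_by_year_and_group` are hypothetical and made up for illustration.

```python
from collections import Counter

# Hypothetical study records; these values are invented, not taken from the dataset.
studies = [
    {"start_year": 2005, "sponsor_type": "Industry"},
    {"start_year": 2005, "sponsor_type": "Industry"},
    {"start_year": 2005, "sponsor_type": "Others"},
    {"start_year": 2012, "sponsor_type": "Industry"},
    {"start_year": 2012, "sponsor_type": "Others"},
    {"start_year": None, "sponsor_type": "Others"},  # missing start year
]


def count_by_year_and_group(records, year_field, group_field):
    """Return {group: {year: count}}, skipping records that lack either field."""
    counts = Counter()
    for rec in records:
        year = rec.get(year_field)
        group = rec.get(group_field)
        if year is None or group is None:
            continue
        counts[(group, year)] += 1

    # Reshape into one year -> count series per group, ordered by group and year.
    series = {}
    for (group, year), n in sorted(counts.items()):
        series.setdefault(group, {})[year] = n
    return series


annual = count_by_year_and_group(studies, "start_year", "sponsor_type")
for group, by_year in annual.items():
    print(group, by_year)
```

Each inner dictionary corresponds to one line on the graph. Passing a different grouping field (for example, a hypothetical `intervention_type`) would give the series for the right-hand graph.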

The fourth feature, Sponsor Data, presents a table of summary data by individual sponsor. The table includes the number of studies per sponsor as well as summary data on the number of participants (enrollment total, mean, minimum, and maximum) and duration (mean, minimum, and maximum, in years). If you sort the table in descending order of number of studies per sponsor, you will learn that:

– There are seven Industry sponsors in the top ten.
– AstraZeneca has the most studies (168).
– The National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), an institute within the National Institutes of Health (NIH), is in third place.
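The Sponsor Data table is a group-by summary: one row per sponsor, with a count of studies and statistics on enrollment and duration. As a rough illustration only (the app is built in R Shiny, and this is not its code), here is a Python sketch of that aggregation. The sponsor names, numbers, field names, and the helper `summarize_by_sponsor` are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records; sponsors and figures are invented for illustration.
studies = [
    {"sponsor": "Sponsor A", "enrollment": 120, "duration_years": 2.0},
    {"sponsor": "Sponsor A", "enrollment": 80, "duration_years": 1.0},
    {"sponsor": "Sponsor B", "enrollment": 300, "duration_years": 3.5},
]


def summarize_by_sponsor(records):
    """Build one summary row per sponsor, sorted by number of studies (descending)."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["sponsor"]].append(rec)

    rows = []
    for sponsor, recs in grouped.items():
        enrollment = [r["enrollment"] for r in recs]
        duration = [r["duration_years"] for r in recs]
        rows.append({
            "sponsor": sponsor,
            "n_studies": len(recs),
            "enrollment_total": sum(enrollment),
            "enrollment_mean": mean(enrollment),
            "enrollment_min": min(enrollment),
            "enrollment_max": max(enrollment),
            "duration_mean": mean(duration),
            "duration_min": min(duration),
            "duration_max": max(duration),
        })

    # Same ordering as sorting the table by number of studies, largest first.
    rows.sort(key=lambda row: row["n_studies"], reverse=True)
    return rows


summary = summarize_by_sponsor(studies)
for row in summary:
    print(row)
```

Real data would also need a decision on how to treat studies with missing enrollment or duration; this sketch assumes every record has both.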

The fifth and last feature is Map, which presents a map of the sponsor locations within the United States. You can filter by study type or sponsor type and zoom in or out. The sponsor locations initially form large clusters, but if you zoom in, the clusters split into smaller ones. With this feature, you can determine whether there is a clinical study sponsor in your area.
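To give a feel for why clusters split as you zoom in, here is a small Python sketch that bins locations into a grid whose cells shrink with each zoom level. This is a deliberately simplified stand-in: it is not the clustering algorithm used by the app's map, which this article does not describe. The coordinates and the function `grid_clusters` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sponsor locations as (latitude, longitude) pairs.
locations = [
    (40.70, -74.00),   # two nearby points
    (40.75, -73.95),
    (42.36, -71.06),   # a third point in the same region
    (34.05, -118.24),  # a point far from the others
]


def grid_clusters(points, zoom):
    """Group points by grid cell; the cell side (in degrees) halves per zoom level."""
    cell = 360.0 / (2 ** zoom)
    clusters = defaultdict(list)
    for lat, lon in points:
        # Shift coordinates to be non-negative, then find the cell index.
        key = (int((lat + 90.0) // cell), int((lon + 180.0) // cell))
        clusters[key].append((lat, lon))
    return dict(clusters)


for zoom in (3, 8):
    clusters = grid_clusters(locations, zoom)
    sizes = sorted((len(members) for members in clusters.values()), reverse=True)
    print(f"zoom {zoom}: {len(clusters)} clusters, sizes {sizes}")
```

At the coarser zoom level the three nearby points share one cell. At the finer level the third point falls into its own cell, which mirrors the splitting behavior described above.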


The Diabetes Studies App was created in approximately two weeks. With additional time and effort, the app could be enhanced by including: (1) other diseases such as cancers, heart diseases, and skin diseases; (2) sponsors based outside the United States; and (3) other study types such as registry studies. Further features could also be added, such as summaries of disease areas and clinical study phases by sponsor. These would help pharmaceutical investors gauge long-term investment opportunities among the different sponsors and would enhance the business utility of the app.


Notes on the Studies Info feature:

– In Condition Type, the Diabetes category includes studies that did not distinguish between types of diabetes (Type 1 or Type 2). These are likely Type 2 diabetes studies because it is the most common type. The Other category includes studies of less common diabetes types, such as gestational diabetes, and studies that examine other diseases in diabetes patients rather than diabetes itself.
– In Phase (Clinical Trial), the phase category N/A (or not applicable) represents trials without FDA-defined phases, including trials of devices or behavioral interventions.

