Silent Spring Institute Developer Blog

Bayesian Binomial Test in R

January 10, 2018 | Silent Spring Institute Developer Blog

Summary: in this post, I implement an R function for computing \( P(\theta_1 > \theta_2) \), where \( \theta_1 \) and \( \theta_2 \) are beta-distributed random variables. This is useful for estimating the probability that one binomial proportion is greater than another. I am working on a project in which I need to compare two ... [Read more...]
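The excerpt above is cut off before any code appears, so the following is only a rough sketch of the general idea, not the post's R function. It estimates the probability that one beta-distributed proportion exceeds another by Monte Carlo sampling, written in Python with the standard library. The function name `prob_theta1_greater`, the uniform Beta(1, 1) priors, and the example counts are all assumptions made for illustration.

```python
import random


def prob_theta1_greater(a1, b1, a2, b2, n_draws=100_000, seed=0):
    """Monte Carlo estimate of P(theta_1 > theta_2) for independent
    theta_1 ~ Beta(a1, b1) and theta_2 ~ Beta(a2, b2)."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    hits = 0
    for _ in range(n_draws):
        # Draw one value from each beta distribution and compare them.
        if rng.betavariate(a1, b1) > rng.betavariate(a2, b2):
            hits += 1
    return hits / n_draws


# Hypothetical data: 30 successes in 100 trials vs 20 in 100,
# each combined with a uniform Beta(1, 1) prior (an assumption).
p = prob_theta1_greater(1 + 30, 1 + 70, 1 + 20, 1 + 80)
print(p)
```

Sampling is the simplest way to approximate this probability; the accuracy improves with `n_draws`, and an exact or numerically integrated solution may be preferable when many comparisons are needed.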

A Bayesian approach to modelling censored data

July 4, 2017 | Silent Spring Institute Developer Blog

For this case, we can write Bayes' formula as: The two components in the numerator are: The probability of the data given a \( \mu \) and \( \sigma \), also called the likelihood function: \( p(y|\mu,\sigma) \) The probability of a given \( \mu \) and \( \sigma \), before seeing any data; also called the ... [Read more...]

Unlocking Data in PDFs

April 5, 2017 | Silent Spring Institute Developer Blog

Unfortunately, there is a lot of data released on the web in the form of PDF files. Scraping data out of PDFs is much harder than scraping from a web page; web pages have structure, in the form of HTML, that you can usually leverage to extract struc... [Read more...]

Tracking Packages in Excel

March 22, 2017 | Silent Spring Institute Developer Blog

We’ve been shipping hundreds of packages as part of our Detox Me Action Kit project, and we wanted a way to track all of them without having to manually enter their tracking numbers on the UPS website. Through the UPS REST API and some VBA hacking, we were able ... [Read more...]

NHANES made simple with RNHANES

November 21, 2016 | Silent Spring Institute Developer Blog

Scientists spend a lot of time “munging” data. Finding, cleaning, and managing datasets can take up the majority of the time it takes to complete an analysis. Tools that make the munging process easier can save scientists a lot of time. We are tackling a small part of this problem ... [Read more...]

Fitting censored log-normal data

November 1, 2016 | Silent Spring Institute Developer Blog

Data are censored when samples from one or both ends of a continuous distribution are cut off and assigned one value. Environmental chemical exposure datasets are often left-censored: instruments can detect levels of chemicals down to a limit, underneath which you can’t say for sure how much of the ... [Read more...]
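As a rough illustration of how left-censoring can enter a model fit, the sketch below writes out a log-likelihood for log-normal data with a single detection limit: detected values contribute the log-normal density, and each non-detect contributes the probability of falling below the limit. This is a standard textbook formulation written in Python, not necessarily the method used in the post (which is truncated here); the function names and the single shared limit `lod` are assumptions.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def censored_lognormal_loglik(mu, sigma, detected, n_censored, lod):
    """Log-likelihood of left-censored log-normal data.

    mu, sigma  -- mean and standard deviation of log(y)
    detected   -- observed values above the detection limit
    n_censored -- number of samples reported as below the limit
    lod        -- detection limit (assumed to be the same for all samples)
    """
    ll = 0.0
    for y in detected:
        z = (math.log(y) - mu) / sigma
        # Log-normal density of y: normal density of log(y) divided by y.
        ll += -0.5 * z * z - math.log(y * sigma * math.sqrt(2.0 * math.pi))
    if n_censored > 0:
        # Each non-detect contributes P(y < lod).
        z_lod = (math.log(lod) - mu) / sigma
        ll += n_censored * math.log(normal_cdf(z_lod))
    return ll
```

Maximizing this function over `mu` and `sigma` (for example with a general-purpose optimizer) gives estimates that use the non-detects without substituting an arbitrary value for them.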

ggplot2 axis limit gotchas

October 31, 2016 | Silent Spring Institute Developer Blog

Setting axis limits in ggplot has behaviour that may be unexpected: any data that falls outside of the limits is ignored, instead of just being hidden. This means that if you apply a statistic or calculation on the data, like plotting a box and whi... [Read more...]
