Blog Archives

Assessing Causality from Observational Data using Pearl’s Structural Causal Models

In 20th-century statistics classes, it was common to hear the statement: “You can never prove causality.” As a result, researchers published results saying “x is associated with y” as a way of circumventing the issue of causality while implicitly suggesting that the association is causal. As an example from my former discipline, political science, there was an interest...

Developing R Packages with usethis and GitLab CI: Part III

February 17, 2019

While developing your R package, you will want to make sure the code it contains is as clean as possible and that your package build and testing times are as efficient as you can make them. There are a number of tricks and tools at your disposal to accomplish these aims. This post, the third in a series that...

Tracking private R dependencies with packrat & git submodules

September 16, 2018

Here at Methods we often use RStudio’s packrat package to version our package dependencies and help ensure our work is reproducible. Packrat handles public packages on CRAN or GitHub just fine, but we have a lot of internal packages hosted privately on GitLab that we’d like to have packrat manage like the rest of our dependencies. This comes up...

Developing R Packages with usethis and GitLab CI: Part II

September 3, 2018

This post, the second part in a series that covers R package development, will define the important concept of continuous integration (CI) and demonstrate the advantages of using CI within GitLab. GitLab, a version control code repository, offers many services to its users, including the ability to set up CI services for R programmers and software developers in private...

Developing R Packages with usethis and GitLab CI: Part I

The best way to share your R code with others is to create a package. Whether you want to share your functions with team members, clients, or all interested R users, bundling up your functions into a package is the way to go. Luckily, there are great tools available that make this process relatively smooth and easy. This series...

A Tour of Timezones (& Troubles) in R

In any programming tool, dates, times, and timezones are hard. Deceptively hard. They’ve been shaped by politics and whimsy for hundreds of years: timezones can shift with minimal notice, countries have skipped or repeated certain days, some are offset by weird increments, some observe Daylight Saving Time, leap years, leap seconds, the list goes on. Luckily, we rarely need...

Be Aware of Bias in RF Variable Importance Metrics

Random forests are typically used as “black box” models for prediction, but they can return relative importance metrics associated with each feature in the model. These can be used to aid interpretability and give a sense of which features are powering the predictions. Importance metrics can also assist in feature selection in high-dimensional data. Careful attention should be...

Bias Adjustment for Rare Events Logistic Regression in R

Rare events are often of interest in statistics and machine learning. Mortality caused by a prescription drug may be uncommon but of great concern to patients, providers, and manufacturers. Predictive models in finance may be focused on forecasting when equities move substantially, something quite rare relative to the more quotidian shifts in prices. Logistic-type models (logit models in econometrics,...

Highlights from rstudio::conf 2018

The second-annual rstudio::conf was held in San Diego at the end of January, bringing together a wide range of speakers, topics, and attendees. Covering all of it would require several people and a lot of space, but I’d like to highlight two broad topics that received a lot of coverage: new tools for shiny and enhanced modeling capabilities for...

Delta Method Standard Errors

January 9, 2018

Logistic regression produces results that are typically interpreted in one of two ways: predicted probabilities or odds ratios. Odds are the ratio of the probability that something happens to the probability that it doesn’t happen. An odds ratio is the ratio of two odds, each calculated at a different score for \(X\). There are strengths and weaknesses to either choice. Predicted probabilities are...
