Some everyday data tasks: a few hints with R

We all work with data frames, and it is important to know how to reshape them to meet our needs. I think there are at least four routine tasks that we need to be able to accomplish: subsetting, sorting, casting, and melting. Obviously, there is a wide array of possibilities; I’ll just mention a few that I regularly use. Subsetting the...
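The four tasks named in this excerpt can be sketched in base R. This is only a minimal illustration on a made-up data frame, not the post's own code (the post may well rely on other packages); the data frame `df`, its column names, and its values are all assumptions.

```r
# Toy data frame; names and values are invented for illustration.
df <- data.frame(
  id    = c(1, 2, 3),
  group = c("a", "b", "a"),
  x     = c(10, 30, 20),
  y     = c(5, 15, 25)
)

# Subsetting: keep the rows in group "a" and only two columns.
sub <- df[df$group == "a", c("id", "x")]

# Sorting: order the rows by decreasing x.
sorted <- df[order(-df$x), ]

# Melting (wide -> long): one row per id/variable pair.
long <- reshape(
  df,
  direction = "long",
  varying   = c("x", "y"),
  v.names   = "value",
  timevar   = "variable",
  times     = c("x", "y"),
  idvar     = "id"
)

# Casting (long -> wide): back to one column per variable.
wide <- reshape(
  long,
  direction = "wide",
  idvar     = "id",
  timevar   = "variable",
  v.names   = "value"
)
```

Base R's `reshape()` is used here only to avoid extra dependencies; dedicated reshaping packages offer a friendlier interface for the melting and casting steps.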

Changes in BCEA

March 26, 2019

I’ve just updated the GitHub version of BCEA. Andrea has done, as usual, some very nice work – this time he’s mainly focussed on the graphical engine underlying the graphs produced by BCEA to post-process the outcome of the economic model. The main changes are the following: Added plot rendering via plotly (using the option graph="plotly") to each of the functions ceplane.plot, eib.plot, ceac.plot, evi.plot, and info.rank. Included...

Bio7 3.0 Released

March 26, 2019

27.03.2019 A new release of Bio7 is available which is built upon Eclipse 4.11 and the latest Java OpenJDK. This new version comes bundled with OpenJDK 12, supports the dynamic compilation of Java 11 and fixes several annoying bugs on MacOSX (e.g., shutdown crashes). The R interface has been improved and the R-Shell now updates

R Studio Shortcuts and Tips

March 26, 2019

How can you work faster in R Studio? Do you really want to know? In this article, I would like to share some of my favorite productivity features of R Studio along with their respective shortcuts. I will also provide information about some other useful tools and techniques. I also prepared Article R Studio Shortcuts...

Rome Was Not Built In A Day But widgetcard Was!

March 26, 2019

I saw a second post on turning htmlwidgets into interactive Twitter Player cards and felt somewhat compelled to make creating said entities a bit easier, so posited the following: Wld this be useful packaged up, #rstats? https://t.co/sfqlWnEeJV https://t.co/troKzmzTNv (TLDR/V: Single function to turn an HTML widget into a deployable interactive Twitter card) pic.twitter.com/uahB52YfE2 - boB Rudis (@hrbrmstr)...

Could you be the next graduate Mango?

March 26, 2019

At Mango, we firmly believe that any decision can be better made using analytics and data. We also know that a company’s success is increasingly dependent on becoming data-driven. That’s where we come in. Our mission is to empower organisations to make informed decisions using data science and advanced analytics to drive bigger gains, lower costs, and optimise performance....

Inverse Statistics – and how to create Gain-Loss Asymmetry plots in R

March 26, 2019

Asset returns have certain statistical properties, also called stylized facts. Important ones are: Absence of autocorrelation: basically, the direction of the return on one day doesn’t tell you anything useful about the direction on the next day. Fat tails: returns are not normal, i.e. there are many more extreme events than there would be if …

Koning Filip lijkt op … (King Filip looks like …)

March 26, 2019

Last call for the course on Text Mining with R, held next week in Leuven, Belgium, on April 1-2. You can view the course description and subscribe online (https://lstat.kuleuven.be/training/coursedescriptions/text-mining-with-r). One of the things you'll learn ... is that King Filip of Belgium is similar to public expenses if we just look at open data from questions and answers in...

Use RStudio Server in a Virtual Environment with Docker in Minutes!

March 25, 2019

A fundamental aspect of the reproducible research framework is that a (statistical) analysis can be reproduced; that is, given a set of instructions (or a script file), another analyst with the same raw data can obtain exactly the same results. This idea may seem intuitive, but in practice it can be difficult to achieve in …

R package developers, why should you care about R-hub?

March 25, 2019

tl;dr: You should care about R-hub if you care about R CMD check, on all operating systems, for free, and don’t ever want to leave R. So, why use the R-hub package builder? It is useful: as an R package developer, you probably regularly use R CMD check to detect common problems in your package, and you might even love that command....

February 2019: “Top 40” New CRAN Packages

March 25, 2019

One hundred and fifty-one new packages arrived at CRAN in February. Here are my “Top 40” picks organized into eight categories: Bioinformatics, Data, Machine Learning, Medicine, Statistics, Time Series, Utilities and Visualization. Bioinformatics: Cascade v1.7: Implements a modeling tool allowing gene selection, reverse engineering, and prediction in cascade networks. See Jung et al. (2014) for details, along with a Package Introduction...

R: Birthday Problem

March 25, 2019

An interesting and classic probability question is the birthday problem. The birthday problem asks how many individuals must be in one location for there to be a 50% probability that at least two individuals in the group share a birthday...
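As a rough sketch of the calculation behind this question: assuming 365 equally likely birthdays and ignoring leap years, the probability of at least one shared birthday among n people is one minus the probability that all n birthdays differ. The helper name `p_shared` is hypothetical and not taken from the post; `qbirthday()` is part of R's stats package and is used here only as a cross-check.

```r
# Probability that at least two of n people share a birthday,
# assuming `days` equally likely birthdays (an idealization).
p_shared <- function(n, days = 365) {
  # Product of (365/365) * (364/365) * ... * ((365 - n + 1)/365)
  # is the probability that all n birthdays are distinct.
  1 - prod((days - seq_len(n) + 1) / days)
}

# Evaluate for group sizes 1 to 60 and find the smallest group
# for which the probability reaches 50%.
probs <- sapply(1:60, p_shared)
n_min <- which(probs >= 0.5)[1]

# Cross-check with the built-in quantile function for this problem.
n_builtin <- qbirthday(prob = 0.5, classes = 365, coincident = 2)
```

Under these assumptions both `n_min` and `n_builtin` come out as 23, the classic answer.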

Markov chain Monte Carlo doesn’t “explore the posterior”

March 25, 2019

First some background, then the bad news, and finally the good news. Spoiler alert: The bad news is that exploring the posterior is intractable; the good news is that we don’t need to explore all of it. Sampling to characterize the posterior There’s a misconception among Markov chain Monte Carlo (MCMC) practitioners that the purpose

Unleash the potential of Recommender Systems

March 25, 2019

Recommender systems are one of the most popular algorithms in data science today. In this tutorial, we will build a movie recommender system.

What is the interpretation of the diagonal for a ROC curve

March 25, 2019

Last Friday, we discussed the use of ROC curves to describe the goodness of a classifier. I said that I would post a brief paragraph on the interpretation of the diagonal. If you look around, some say that it describes the “strategy of randomly guessing a class”, that it is obtained with “a diagnostic test that is no...
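The "randomly guessing a class" reading quoted in this excerpt can be checked with a small simulation. This is a sketch under stated assumptions, not the post's own argument: a classifier that predicts "positive" with probability p, ignoring the data entirely, should have a false positive rate and a true positive rate both close to p, which places it on the diagonal. The sample size, class balance, seed, and grid of p values are arbitrary choices.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n <- 1e5                         # number of simulated observations
truth <- rbinom(n, 1, 0.3)       # true classes; 30% positives (arbitrary)
p_grid <- c(0.1, 0.25, 0.5, 0.75, 0.9)

# For each p, guess "positive" with probability p regardless of the
# true class, then measure the resulting error rates.
rates <- t(sapply(p_grid, function(p) {
  guess <- rbinom(n, 1, p)
  c(p   = p,
    fpr = mean(guess[truth == 0]),   # false positive rate
    tpr = mean(guess[truth == 1]))   # true positive rate
}))

# Each row of `rates` is one point in ROC space; up to simulation
# noise, fpr and tpr are both roughly equal to p.
print(round(rates, 3))
```

Plotting `rates[, "fpr"]` against `rates[, "tpr"]` should therefore trace out points lying near the line from (0, 0) to (1, 1).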

Operator Notation for Data Transforms

March 25, 2019

As of version 1.0.8, cdata implements an operator notation for data transforms. The idea is simple, yet powerful. First let’s start with some data.

Critical Thinking in Data Science

March 25, 2019

Hugo Bowne-Anderson, the host of DataFramed, the DataCamp podcast, recently interviewed Debbie Berebichez, a physicist, TV host, and data scientist who is currently the Chief Data Scientist at Metis in NY. Introducing Debbie Berebichez. Hugo: Hi there, Debbie, and welcome to DataFramed. Debbie: ...

ggCorpIdent: Stylize ggplot2 Graphics in Your Corporate Design

March 25, 2019

This is the add-on to our recently published R Markdown template for business reports. Since we work with ggplot2 on a daily basis and use it in nearly all of our projects, we designed a ggplot2 theme in our corporate design. That is, it uses our font,...

quantmod_0.4-14 on CRAN

March 25, 2019

I just pushed a new release of quantmod to CRAN! I'm most excited about the update to getSymbols(), which no longer throws an error and stops processing if there's a problem with one ticker symbol. Now getSymbols() will import all the data it can and provide an informative error message for any ticker symbols it could not import. Another cool feature is that getQuote() can now import quotes...

Play with the cyphr package

March 24, 2019

The cyphr package seems to be a good choice for a small research group that shares sensitive data over the internet (e.g., DropBox). I ran a simple experiment myself to make sure it can actually serve my purpose. I did my experiment on two computers (using openssl): I created the test data on my Linux workstation running Manjaro then I tried to...

Getting started with emmeans

Package emmeans (formerly known as lsmeans) is enormously useful for folks wanting to do post hoc comparisons among groups after fitting a model. It has a very thorough set of vignettes (see the vignette topics here), is very flexible with a ton of options, and works out of the box with a lot of different model objects (and can...

Formatted correlation output with effect sizes

March 24, 2019

One of the most time-consuming parts of data analysis in science is copy-pasting specific values from R output into a manuscript or a report. This task is frustrating, prone to errors, and increases the variability of statistical reporting. At the same time, standardizing practices of what and how to report is crucial for reproducibility and clarity....

Get the Office Quotes in R with the dundermifflin Package

March 24, 2019

Introduction: I am happy to share a fun project I put together this weekend: a new R package called dundermifflin. If you can’t guess from the name, it will give you quotes from The Office whenever you want! This package was inspired by the goodshirt package, which gives users quotes from The Good Place. I was able to great...

Summer Interns 2019

March 24, 2019

We received almost 400 applications for our 2019 internship program from students with very diverse backgrounds. After interviewing several dozen people and making some very difficult decisions, we are pleased to announce that these twelve interns have accepted positions with us for this summer: Therese Anders: Calibrated Peer Review. Prototype tools to conduct experiments to see whether calibrated peer...

nice student project

March 24, 2019

In all of my undergraduate classes, I require a term project, done in groups of 3-4 students. Though the topic is specified, it is largely open-ended, a level of “freedom” that many students are unaccustomed to. However, some adapt quite well. The topic this quarter was to choose a CRAN package that does not use …

Writing clean and readable R code the easy way

March 24, 2019

Writing R code, especially for non-programmers like myself, can be a daunting task. You start out really motivated, trying to follow some naming convention, formatting your code lines in the most readable way, and keeping your lines to a manageable size, but when the code lines start to increase and coding problems arise, when you start to …
