Web app for individual party vote from the 2014 New Zealand election study

Last week I posted some analysis of individual voting behaviour in New Zealand’s 2014 general election. In that post, I used logistic regression in four different models to predict the probability of an individual giving party vote to each of the fo...

Come see RStudio at an event near you next week!

May 12, 2017

We love to engage with R and RStudio users online in webinars and communities because it is so efficient for everyone. But sometimes it’s great to meet in person, too! Next week RStudio will be in Miami, Baltimore and Chicago. We wanted to let you know in case you’ll be there at the same time

Analyzing the home advantage in English soccer, with R

May 12, 2017

It's well-known that the home team has an advantage in soccer (or football, as it's called in England). But which teams have made the most of their home-field advantage over the years? Evolutionary biologist (and Liverpool fan) Joe Gallagher analyzed the percentage of points won in the UK Premier League (which awards 3 points for a win and one...

Problems of causal inference after selecting of controls

May 12, 2017

By Gabriel Vasconcelos. Inference after model selection: in many cases, when we want to estimate some causal relationship between two variables, we have to solve the problem of selecting the right control variables. If we fail, our results will be …

R Weekly Bulletin Vol – VIII

May 12, 2017

This week’s R bulletin covers plotting topics such as saving a plot, adding a grid, and plotting multiple data sets in a single plot. Hope you like this R weekly bulletin. Enjoy reading! Shortcut keys: 1. Run current document – Ctrl+Alt+R; 2. Run from document beginning to current line – Ctrl+Alt+B; 3. Run...

Shiny Application Layouts Exercises (Part-7)

May 12, 2017

Shiny Application Layouts – Conditional Panel. In the seventh part of our series we will use the rnorm() function to create a UI with a Conditional Panel. This type of panel is visible only when the value of a JavaScript expression is true. The JS expression is re-evaluated every time shiny runs with a different...

How to go about interpreting regression cofficients

May 12, 2017

Following my post about logistic regressions, Ryan got in touch about one part of building logistic regression models that I didn’t cover in much detail: interpreting regression coefficients. This post will hopefully help Ryan (and others) out. The post How to go about interpreting regression cofficients appeared first on Locke Data.

New Book in Text Analysis for R

May 11, 2017

Books on R tend to be aimed at either statisticians or programmers. There is a dearth of material to help those in typically less quantitative fields access the powerful tools in the R ecosystem. Enter Text Analysis with R for Students of Literature. I haven't done a deep read of the book,

Looking Forward to R/Finance 2017

May 11, 2017

R / Finance 2017 starts next Friday, and once again, I am excited about going. It’s true that there are quite a few fun and informative R gatherings these days, but R / Finance is a “big deal” because it is the “real deal”. Finance has been, and remains, one of the driving applications underlying the R language. (A...

Parsing Text for Emotion Terms: Analysis & Visualization Using R

May 11, 2017

Recently, I read a post by Michael Toth on a sentiment analysis of Mr Warren Buffett’s annual shareholder letters from the past 40 years. In that post, only five of the annual shareholder letters showed negative net sentiment scores, whereas a majority of the letters (88%) displayed a positive net sentiment score. Toth noted...

Analyzing data on CRAN packages

May 11, 2017

There's a handy new function in R 3.4.0 for anyone interested in data about CRAN packages. It's not documented, but it's pretty simple: tools::CRAN_package_db() returns a data frame with one row for every package on CRAN and 65 columns of data on those packages, as shown below.

> names(tools::CRAN_package_db())
"Package" "Version" "Priority" "Depends" "Imports" "LinkingTo" "Suggests"...

R in Insurance 2017 Programme online

May 11, 2017

The programme for the 2017 R in Insurance conference in Paris has been published. Talks will discuss new ideas and research with applications in life and general insurance, from network analysis, reserving and pricing to catastrophe modelling, followe...

tidyquant: New Tools for Performing Financial Analysis within the Tidy Ecosystem

In advance of upcoming Business Science talks on tidyquant at R/Finance and EARL San Francisco, we are releasing a technical paper entitled “New Tools For Performing Financial Analysis within the ‘Tidy’ Ecosystem”. The technical paper covers an...

Stack Overflow Trends

May 10, 2017

Developer Q&A site Stack Overflow recently introduced Stack Overflow Trends, a useful tool for tracking the growth and decline in the rate of questions asked on various topics (by their Stack Overflow tag). For example, you can see that activity around both R and Python has been increasing over the past 8 years: As you'd expect from a general...

Accessing and Manipulating Biological Databases Exercises (Part-3)

May 10, 2017

In the exercises below we cover how to manipulate biological data using the seqinr package. Install packages: seqinr. Answers to the exercises are available here. If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page. Exercise 1: Read...

Which linear model is best?

May 10, 2017

Recently I have been working on a Kaggle competition where participants are tasked with predicting Russian housing prices. In developing a model for the challenge, I came across a few methods for selecting the best regression model for a given dataset. Let’s load up some data and take a look.

Text Analysis with R for Students of Literature

About the book: I obtained a copy of this book by Matthew Jockers through my university's access to Springer. You can also get a copy from Amazon. This book is short and to the point. I would actually strongly recommend it to anyone interested ...

Pretty histograms with ggplot2

May 10, 2017

@drsimonj here to make pretty histograms with ggplot2! In this post you’ll learn how to create histograms like this:

The data

Let’s simulate data for a continuous variable x in a data frame d:

set.seed(070510)
d
x
#> 1 1.3681661
#> 2 -0.0452337
#> 3 0.0290572
#> 4 -0.8717429
#> 5 0.9565475
#> 6 -0.5521690

Basic...

Euler Problem 20: Large Integer Factorials

May 10, 2017

A proposed solution in the R language to Euler Problem 20: find the sum of the digits in the factorial of 100, that is, 100 × 99 × ... × 3 × 2 × 1. The post Euler Problem 20: Large Integer Factorials appeared first on The Devil is in the Data.

Everyone knows that loops in R are to be avoided, but vectorization is not always possible

May 9, 2017

It goes without saying that there are always many ways to solve a problem in R, but clearly some ways are better (for example, faster) than others. Recently, I found myself in a situation where I could not find a way to avoid using a loop, and I was immediately concerned, knowing that I would want this code to...

Mapping Quandl Macroeconomic Data

May 9, 2017

In previous posts, we built a map to access global ETFs and a simple Shiny app to import and forecast commodities data from Quandl. Today, we will begin a project that combines those previous apps. Our end goal is to build an interactive map to access macroeconomic data via Quandl, allowing the user to choose an economic indicator and...

niceOverPlot, or when the number of dimensions does matter

Hi there! Over the last few months, my lab-mate Irene Villa (see more of her work here!) and I have been discussing ecological niche overlap. The niche concept dates back to ideas first proposed by ornithologist J. Grinnell (1917). Later on, G.E. Hutchinson (1957) defined the ecological niche of a species as the n-dimensional hyper-volume of ecological...

Predicting Hospital Length of Stay using SQL Server R Services

May 9, 2017

Last week, my Microsoft colleagues Bharath Sankaranarayan and Carl Saroufim presented a live webinar showing how you can predict a patient's length of stay at a hospital using SQL Server R Services. The recorded webinar is available for on-demand viewing now. (Registration is required to view.) The webinar is based on the Machine Learning Solution Template Predicting Length of...

Better block sampling in MCMC with the Automated Factor Slice Sampler

May 9, 2017

One nice feature of NIMBLE’s MCMC system is that a user can easily write new samplers from R, combine them with NIMBLE’s samplers, and have them automatically compiled to C++ via the NIMBLE compiler. We’ve observed that block sampling using a simple adaptive multivariate random walk Metropolis-Hastings sampler doesn’t always work well in practice, so

Looking for a Programming or Statistics High Level Course? MIT Open Course Ware.

Although MIT OCW has been operating for more than 15 years, I think this post is worth writing because many people still do not know it exists. MIT OpenCourseWare (OCW) is a web-based publication of virtually all MIT course co...

Shiny Applications Layouts Exercises (Part-6)

May 9, 2017

Shiny Applications Layouts – Absolutely-positioned panel. In the sixth part of our journey through Shiny App Layouts we will meet absolutely-positioned panels. These are panels that can optionally be made draggable and placed wherever you want in the interface. Moreover, you can put anything in them, including inputs and outputs. This part can be...

Load a Python/pandas data frame from an HDF5 file into R

May 9, 2017

The title is self-descriptive, so I will not dwell on the issue at length before showing the code. Just a small note: to my knowledge, there is only one public snippet out there that addresses this particular problem. It uses the Bioc package rhdf5 and you can find it here. The main problem is that it only works when…

datasauRus now on CRAN

May 9, 2017

datasauRus is a package storing the datasets from the paper Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing. It’s a useful package for: having a dinosaur dataset; showing a dinosaur-related variant of... The post datasauRus now on CRAN appeared first on Locke Data.

Reports or Newspapers – The Two Sides of Healthcare Priorities

May 9, 2017

Both the World Health Organization’s statistical profile of Qatar and the much more detailed Annual Health Report of the Department of Epidemiology and Medical Statistics of the State of Qatar show beyond the shadow of a doubt that cardiovascular diseases, diabetes, hypertension, obesity and other metabolic/noncommunicable diseases are the...
