Monthly Archives: October 2016

Weighted Effect Coding: Dummy coding when size matters

October 31, 2016

If your regression model contains a categorical predictor variable, you commonly test the significance of its categories against a preselected reference category. If all categories have (roughly) the same number of observations, you can also ...

Read more »

Poster/cheatsheet for R/BioC package genomation

October 31, 2016

We prepared a poster/cheatsheet for the Bioconductor package genomation, a package for summarizing and annotating genomic intervals. Users can visualize and quantify genomic intervals over pre-defined functional regions, such as promoters, exons, introns, etc. The genomic intervals represent regions with a defined chromosome position, which may be associated with a score, such as aligned reads from HT-seq...

Read more »

Detecting outliers and fraud with R and SQL Server on my bank account data – Part 1

October 31, 2016

Detecting outliers and fraudulent behaviour (transactions, purchases, events, actions, triggers, etc.) takes a great deal of experience and a statistical/mathematical background. One of the samples Microsoft provided with the release of SQL Server 2016 used the simple logic of Benford’s law. This law works great with naturally occurring numbers and can be applied across any kind … Continue...

Read more »
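
As a rough illustration of the Benford’s law idea mentioned in the entry above, here is a minimal Python sketch. It is not the Microsoft sample or the post’s R/SQL Server code. It only compares the expected first-digit frequencies, log10(1 + 1/d), with those observed in a stand-in data set. The powers of 2 used here are an assumption chosen because they are known to follow the law closely, and the names `benford_expected` and `leading_digit` are hypothetical helpers.

```python
import math
from collections import Counter


def benford_expected(digit):
    """Expected share of values whose first digit is `digit` (1..9)."""
    return math.log10(1 + 1 / digit)


def leading_digit(n):
    """First decimal digit of a non-zero integer."""
    return int(str(abs(n))[0])


# Stand-in data (an assumption): the first 1000 powers of 2.
values = [2 ** k for k in range(1, 1001)]

counts = Counter(leading_digit(v) for v in values)
observed = {d: counts[d] / len(values) for d in range(1, 10)}
expected = {d: benford_expected(d) for d in range(1, 10)}

for d in range(1, 10):
    print(f"{d}: observed {observed[d]:.3f}, expected {expected[d]:.3f}")
```

With real transaction amounts, a large gap between the observed and expected shares would be a reason to look more closely, not proof of fraud.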

ggmail + forecast = how many emails I will get tomorrow?

October 31, 2016

During eRum 2016, Adam Zagdański gave a very good tutorial about time series modeling. Among other things, I learned that the forecast package (created by Rob Hyndman) got cool new plots based on the ggplot2 package. Let’s use it to play with mailbox statistics for my Gmail account! 1. Get the data Follow this … Read...

Read more »

xlim_tree: set x axis limits for only Tree panel

October 30, 2016

A ggtree user recently asked me the following question in the Google group: I try to plot long tip labels in ggtree and usually adjust them using xlim(), however when creating a facet_plot xlim affects all plots and minimizes them. Is it possible to work around this and only affect the tree and its tip labels leaving the other plots...

Read more »

Fastest Way to Add New Variables to A Large Data.Frame

October 30, 2016

Read more »

Becoming The Intern

October 30, 2016

I was not always this famous… And with this I mean that a year ago only my colleagues knew I did stuff in R and now I’m reaching a slightly wider audience. Some of this is definitely due to me interning for Hadley Wickham and helping prepare the ne...

Read more »

ratio-of-uniforms [#2]

October 30, 2016

Following my earlier post on Kinderman and Monahan’s (1977) ratio-of-uniforms method, I must confess I remain quite puzzled by the approach. Or rather by its consequences. When looking at the set A of (u,v)’s in R⁺×X such that 0≤u²≤ƒ(v/u), as discussed in the previous post, it can be represented by its parameterised boundary u(x)=√ƒ(x), v(x)=x√ƒ(x)    x

Read more »
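
To make the set A from the ratio-of-uniforms entry above concrete, here is a minimal Python sketch, not the post’s own code. It draws (u,v) uniformly on a bounding rectangle, keeps the pairs with u² ≤ ƒ(v/u), and returns v/u. The target ƒ(x)=exp(−x²/2) (an unnormalised standard normal) is an assumption chosen for illustration, as are the function name and the seed. For this ƒ the boundary u(x)=√ƒ(x), v(x)=x√ƒ(x) gives the bounds 0<u≤1 and |v|≤√(2/e).

```python
import math
import random


def ratio_of_uniforms(f, u_max, v_min, v_max, n, rng):
    """Draw n values from the density proportional to f.

    (u, v) is uniform on [0, u_max] x [v_min, v_max]; pairs that fall in
    A = {(u, v): 0 < u, u^2 <= f(v/u)} are accepted and v/u is returned.
    """
    samples = []
    while len(samples) < n:
        u = rng.uniform(0.0, u_max)
        v = rng.uniform(v_min, v_max)
        if u > 0.0 and u * u <= f(v / u):
            samples.append(v / u)
    return samples


def f(x):
    # Illustrative target (an assumption): unnormalised standard normal.
    return math.exp(-0.5 * x * x)


u_max = 1.0                        # sup of sqrt(f(x)), reached at x = 0
v_bound = math.sqrt(2.0 / math.e)  # sup of |x| * sqrt(f(x)), at x = +/- sqrt(2)

rng = random.Random(2016)
draws = ratio_of_uniforms(f, u_max, -v_bound, v_bound, 20000, rng)

sample_mean = sum(draws) / len(draws)
sample_var = sum((x - sample_mean) ** 2 for x in draws) / len(draws)
print(f"mean ~ {sample_mean:.3f}, variance ~ {sample_var:.3f}")
```

The sample mean and variance should land near 0 and 1, which is a quick check that the accepted ratios follow the intended distribution.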

What is wrong with lift curves

October 30, 2016

The first part of our Marketing Analytics Using R course covers campaign analysis with test and control groups and campaign optimisation using lift curves and predicted responses. Among the many topics covered, we discuss what is wrong with lift curves. They are a standard tool in marketing to select a target group for a campaign based on predicted response...

Read more »

The Bayesian approach to ridge regression

October 30, 2016

In a previous post, we demonstrated that ridge regression (a form of regularized linear regression that attempts to shrink the beta coefficients toward zero) can be very effective at combating overfitting and can lead to a much more generalizable model. This approach… Continue reading →

Read more »
