If your regression model contains a categorical predictor variable, you commonly test the significance of its categories against a preselected reference category. If all categories have (roughly) the same number of observations, you can also ...

We prepared a poster/cheatsheet for the Bioconductor package genomation, a package for summarizing and annotating genomic intervals. Users can visualize and quantify genomic intervals over pre-defined functional regions, such as promoters, exons, introns, etc. The genomic intervals represent regions with a defined chromosome position, which may be associated with a score, such as aligned reads from HT-seq...

Detecting outliers and fraudulent behaviour (transactions, purchases, events, actions, triggers, etc.) takes a great deal of experience and a solid statistical/mathematical background. One of the samples Microsoft provided with the release of the new SQL Server 2016 used the simple logic of Benford’s law. This law works great with naturally occurring numbers and can be applied across any kind … Continue...
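As a rough illustration of the idea behind Benford’s law (not the SQL Server sample mentioned in the post), the sketch below compares observed leading-digit frequencies with the expected share log10(1 + 1/d). The helper names `benford_expected`, `first_digit` and `first_digit_freqs` are hypothetical, and the powers of two are only stand-in data, chosen because they are known to follow the law closely.

```python
import math
from collections import Counter


def benford_expected(d):
    """Expected share of leading digit d (1..9) under Benford's law."""
    return math.log10(1 + 1 / d)


def first_digit(x):
    """Leading non-zero digit of a non-zero number."""
    # Scientific notation puts the leading digit first, e.g. '1.234e+02'.
    return int(f"{abs(x):.15e}"[0])


def first_digit_freqs(values):
    """Observed share of each leading digit 1..9, ignoring zeros."""
    digits = [first_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


# Stand-in data (an assumption for this sketch): the first 100 powers of two.
values = [2 ** k for k in range(1, 101)]
observed = first_digit_freqs(values)

for d in range(1, 10):
    print(d, round(observed[d], 3), round(benford_expected(d), 3))
```

In a fraud-screening setting one would replace `values` with the amounts under review and look for digits whose observed share departs strongly from the expected one; how large a departure counts as suspicious is a separate modelling decision.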

During eRum 2016, Adam Zagdański gave a very good tutorial about time series modeling. Among other things, I learned that the forecast package (created by Rob Hyndman) got cool new plots based on the ggplot2 package. Let’s use it to play with mailbox statistics for my gmail account! 1. Get the data Follow this … Read...

A ggtree user recently asked me the following question in the Google group: I try to plot long tip labels in ggtree and usually adjust them using xlim(), however when creating a facet_plot xlim affects all plots and minimizes them. Is it possible to work around this and only affect the tree and its tip labels leaving the other plots...

Following my earlier post on Kinderman’s and Monahan’s (1977) ratio-of-uniform method, I must confess I remain quite puzzled by the approach. Or rather by its consequences. When looking at the set A of (u,v)’s in R⁺×X such that 0≤u²≤ƒ(v/u), as discussed in the previous post, it can be represented by its parameterised boundary u(x)=√ƒ(x),v(x)=x√ƒ(x) x
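To make the set A concrete, here is a minimal sketch of ratio-of-uniforms sampling, assuming the unnormalised standard normal density ƒ(x)=exp(−x²/2) as the target (a choice made for this illustration, not taken from the post). For that ƒ, A fits inside the rectangle 0<u≤1, |v|≤√(2/e), so one can draw (u,v) uniformly on the rectangle, keep the pairs with u²≤ƒ(v/u), and return v/u. The name `rou_sample` is hypothetical.

```python
import math
import random


def f(x):
    """Unnormalised standard normal density (assumed target for this sketch)."""
    return math.exp(-0.5 * x * x)


# Bounding rectangle for A = {(u, v): 0 <= u^2 <= f(v/u)}:
# u <= sup sqrt(f(x)) = 1 and |v| <= sup |x| * sqrt(f(x)) = sqrt(2/e).
U_MAX = 1.0
V_MAX = math.sqrt(2.0 / math.e)


def rou_sample(rng):
    """Draw one value by rejection from the bounding rectangle of A."""
    while True:
        u = rng.uniform(0.0, U_MAX)
        v = rng.uniform(-V_MAX, V_MAX)
        # Accept only points that fall inside A; then v/u has density f.
        if u > 0.0 and u * u <= f(v / u):
            return v / u


rng = random.Random(1)
draws = [rou_sample(rng) for _ in range(20000)]
mean = sum(draws) / len(draws)
var = sum((x - mean) ** 2 for x in draws) / len(draws)
print(round(mean, 2), round(var, 2))
```

The sample mean and variance should land close to 0 and 1, which is a quick sanity check that uniform draws on A do produce the target distribution through the ratio v/u.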

The first part of our Marketing Analytics Using R course covers campaign analysis with test and control groups and campaign optimisation using lift curves and predicted responses. Among the many topics covered, we discuss what is wrong with lift curves. They are a standard tool in marketing to select a target group for a campaign based on predicted response...
