This post is just a summary of some interesting online discussion from last week around open access publishing. I learned a few things about definitions and PubMed/PMC filters. It all begins with an opinion piece, “Open access is tiring out peer reviewers.” With a title like that you might ...
We have a data set dat with multiple observations per subject. We want to create a subset of this data such that each subject (with ID giving the unique identifier for the subject) contributes the observation where the variable X takes its maximum value for that subject. An R ... [Read more...]
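The task described above (one row per subject, taken where X is largest) can be sketched in a few lines. This is a Python illustration under stated assumptions, not the R solution the post goes on to give: the sample rows, the function name `max_row_per_subject`, and the choice to keep the first row on ties are all hypothetical.

```python
# Hypothetical sample data: several observations per subject ID.
dat = [
    {"ID": "a", "X": 3.1},
    {"ID": "a", "X": 5.4},
    {"ID": "b", "X": 2.0},
    {"ID": "b", "X": 1.2},
    {"ID": "c", "X": 7.7},
]


def max_row_per_subject(rows, id_key="ID", value_key="X"):
    """Return one row per subject: the row where value_key is largest.

    Ties keep the first row encountered (an assumption; the excerpt
    does not say how ties should be handled).
    """
    best = {}
    for row in rows:
        key = row[id_key]
        # Replace the stored row only when a strictly larger value appears.
        if key not in best or row[value_key] > best[key][value_key]:
            best[key] = row
    return list(best.values())


subset = max_row_per_subject(dat)
```

Subjects come back in the order they first appear in the data, because Python dictionaries preserve insertion order.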
I frequently come across criticisms of PowerPoint as a presentation tool, which is interesting to me given the ubiquity of its use across industries. When I worked as a data analyst prior to coming to TCU, I frequently prepared PowerPoints using a company template for my boss’s presentations or ... [Read more...]
If you're a current graduate or undergraduate student and have a knack for data visualization, why not submit a paper to the 2014 ASA Statistical Graphics Student Paper Competition? Many of the past winners used R to create interesting displays of data, or created a new package for R (general statistical ... [Read more...]
My book titled R and Data Mining – Examples and Case Studies now has its Chinese version, translated by researchers at South China University of Technology, and published by China Machine Press in September 2014. It is sold in China … [Read more...]
01.12.2014 A new Windows version of Bio7 is available. Many new features have been added to Bio7 2.0 which is now based on Eclipse 4.4. Highlights are the improved R editor, the R debugging Graphical User Interface (using the standard R debugger), the integrated Java Development Tools (JDT) to create Java simulation models ... [Read more...]
Christmas is soon upon us and here are some gift ideas for your statistically inclined friends (or perhaps for you to put on your own wish list). If you have other suggestions please leave a comment! :)
1. Games of probability
A recently released game where probability takes the main role is ... [Read more...]
The other night I was reading Lesk’s Introduction to Protein Science, when I came across this diagram:
A lattice model represents the structure of a protein as a connected set of points distributed at discrete and regular positions in space, with simplified interaction rules for calculating the energies of ...
The next meeting of Warsaw R Enthusiasts (SER = Spotkania Entuzjastów R) will take place on December 8. We are going to start with a talk on spatial statistics by Roger Bivand (R Foundation / NHH, author of many R packages). The second talk, by Bartosz Meglicki (IPI PAN), will introduce the SERATRON – fusion of ... [Read more...]
I was playing with some non-security-oriented R+Shiny code the other day, and thought that Shiny apps would be even more useful if they were double-clickable applications that you could “just run” (provided R was installed on the target system) instead of having to cut/paste code into R. Now, I ... [Read more...]
Recently, Packt Publishing published the book R Object-oriented Programming. The eleven-chapter book covers topics ranging from basic data types in R to more advanced methods such as simulation and writing functions. Different data types (i.e. integer, character...
Before we can do some quant analysis, we need to get some relevant data - and the web is a good place to start. Sometimes the data can be downloaded in a standard format like .csv files or available via an API e.g. http://www.quandl.com but often ... [Read more...]
I read the post 'race for the warmest year' at sargasso.nl. It used a plot, originating from Ed Hawkins, to see how 2014 progressed toward being the warmest year. Obviously I wanted to make the same plot using R. In addition, I wondered which parts of the year had... [Read more...]
In my last post I mentioned that I started using RSQLite to store computed results. No rocket science here, but my feeling is that this might be useful to others, hence, this post. This can be done using any database, but I will use (R)SQLite as an illustration. Let’... [Read more...]
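To make the idea of keeping computed results in a database concrete, here is a minimal sketch using Python's built-in `sqlite3` module rather than RSQLite. The table name `results`, the key/value layout, and the helper `get_or_compute` are assumptions made for illustration; the excerpt does not show the schema the post actually uses.

```python
import sqlite3


def get_or_compute(conn, key, compute):
    """Return the stored value for key, computing and storing it if absent.

    The table layout (text key, real value) is a hypothetical choice.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results (key TEXT PRIMARY KEY, value REAL)"
    )
    row = conn.execute(
        "SELECT value FROM results WHERE key = ?", (key,)
    ).fetchone()
    if row is not None:
        return row[0]  # already computed earlier, so skip the work
    value = compute()
    conn.execute("INSERT INTO results (key, value) VALUES (?, ?)", (key, value))
    conn.commit()
    return value


# An in-memory database keeps the example self-contained;
# a file path would make the results persist between sessions.
conn = sqlite3.connect(":memory:")
calls = []


def slow_mean():
    calls.append(1)  # record that the computation actually ran
    return sum(range(1, 101)) / 100


first = get_or_compute(conn, "mean_1_100", slow_mean)
second = get_or_compute(conn, "mean_1_100", slow_mean)  # served from the table
```

The second call finds the stored row and never invokes `slow_mean`, which is the point of caching results this way.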
A week ago, Conrad provided another minor release 4.550.0 of Armadillo which has since received one minor correction in 4.550.1.0. As before, I had created a GitHub-only pre-release of his pre-release which was tested against the almost one hundred CRAN dependents of our RcppArmadillo package. This passed fine as usual, and results ... [Read more...]
The CRAN Task View system is a fine project which Achim Zeileis initiated almost a decade ago. It is described in a short R Journal article in Volume 5, Number 1. I have been editor / maintainer of the Finance task view essentially since the very beginning of these CRAN Task Views, and ... [Read more...]
I just published a new interactive visualization in my series of basic statistical concepts and techniques. This time I have tried to explain confidence intervals for means. This visualization shows a simulation of repeated sampling from a normal dist... [Read more...]
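The repeated-sampling idea behind that visualization can be sketched as a small simulation: draw many samples from a normal distribution, build an interval around each sample mean, and count how often the interval contains the true mean. This is a Python sketch under assumptions, not the code behind the visualization. The function name `ci_coverage`, the parameter defaults, and the normal quantile 1.96 (used in place of a t quantile) are all illustrative choices.

```python
import random
import statistics


def ci_coverage(mu=0.0, sigma=1.0, n=30, reps=2000, z=1.96, seed=1):
    """Estimate how often a nominal 95% interval for the mean covers mu.

    Uses the normal quantile z with the sample standard deviation,
    which is an approximation; a t quantile would be more exact for small n.
    """
    rng = random.Random(seed)  # seeded so the run is reproducible
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        half_width = z * statistics.stdev(sample) / n ** 0.5
        if mean - half_width <= mu <= mean + half_width:
            hits += 1
    return hits / reps


coverage = ci_coverage()
```

With these settings the estimated coverage should land close to, and typically a little under, 0.95, because the normal quantile is slightly too narrow when the standard deviation is estimated from 30 observations.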
by Joseph Rickert H2O.ai held its first H2O World conference over two days at the Computer History Museum in Mountain View, CA. Although the main purpose of the conference was to promote the company's rich set of Java-based machine learning algorithms and announce their new products ... [Read more...]