Supervised Classification, beyond the logistic

March 5, 2015

In our data-science class, after discussing limitations of the logistic regression, e.g. the fact that the decision boundary is a straight line, we mentioned possible natural extensions. Let us consider our (now) standard dataset

clr1 <- c(rgb(1,0,0,1),rgb(0,0,1,1))
clr2 <- c(rgb(1,0,0,.2),rgb(0,0,1,.2))
x <- c(.4,.55,.65,.9,.1,.35,.5,.15,.2,.85)
y <- c(.85,.95,.8,.87,.5,.55,.5,.2,.1,.3)
z <- c(1,1,1,1,1,0,0,1,0,0)
df <- data.frame(x,y,z)
plot(x,y,pch=19,cex=2,col=clr1)

One can consider a quadratic...
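
Since the excerpt cuts off at the quadratic idea, here is a minimal sketch of what such an extension can look like (my own illustration, not necessarily the post's code): add quadratic terms to the logistic regression so the estimated decision boundary becomes a conic rather than a straight line.

reg <- glm(z ~ x + y + I(x^2) + I(y^2) + I(x*y), data = df, family = binomial)
# evaluate the predicted probability of class 1 on a fine grid
u <- seq(0, 1, length = 101)
grid <- expand.grid(x = u, y = u)
p <- predict(reg, newdata = grid, type = "response")
# add the (curved) 50% boundary to the scatterplot drawn above
contour(u, u, matrix(p, 101, 101), levels = .5, add = TRUE)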

Read more »

Supervised Classification, discriminant analysis

March 3, 2015

Another popular technique for classification (or at least, one that used to be popular) is (linear) discriminant analysis, introduced by Ronald Fisher in 1936. Consider the same dataset as in our previous post

> clr1 <- c(rgb(1,0,0,1),rgb(0,0,1,1))
> x <- c(.4,.55,.65,.9,.1,.35,.5,.15,.2,.85)
> y <- c(.85,.95,.8,.87,.5,.55,.5,.2,.1,.3)
> z <- c(1,1,1,1,1,0,0,1,0,0)
> df <- data.frame(x,y,z)
> plot(x,y,pch=19,cex=2,col=clr1)

The main interest of...
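
For readers who want to try discriminant analysis before clicking through, a minimal sketch using the standard MASS package (an assumption on my part; the post may proceed differently):

library(MASS)
fit <- lda(z ~ x + y, data = df)      # Fisher's linear discriminant analysis
fit$means                             # estimated group means
fit$scaling                           # coefficients of the linear discriminant
predict(fit, newdata = df)$class      # in-sample class predictions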

Read more »

Plotly Graphs with Domino’s New R Notebook

March 3, 2015

by Matt Sundquist, co-founder of Plotly. Domino's new R Notebook and Plotly's R API let you code, make interactive R and ggplot2 graphs, and collaborate entirely online. Here is the Notebook in action: Published R Notebook. To execute this Notebook, or to build your own, head to Domino's Plotly Project. The GIF below shows how to get started: choose...
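
For orientation, a minimal sketch of the ggplot2-to-Plotly step this workflow relies on (written against the plotly R package's ggplotly(); the exact API used in the notebook may differ):

library(ggplot2)
library(plotly)
p <- ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) + geom_point()
ggplotly(p)    # converts the static ggplot2 figure into an interactive Plotly graph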

Read more »

Supervised Classification, Logistic and Multinomial

March 2, 2015

In our Data Science course, we will start to discuss classification techniques (in the context of supervised models). Consider the following case, with 10 points and two classes (red and blue)

> clr1 <- c(rgb(1,0,0,1),rgb(0,0,1,1))
> clr2 <- c(rgb(1,0,0,.2),rgb(0,0,1,.2))
> x <- c(.4,.55,.65,.9,.1,.35,.5,.15,.2,.85)
> y <- c(.85,.95,.8,.87,.5,.55,.5,.2,.1,.3)
> z <- c(1,1,1,1,1,0,0,1,0,0)
> df <- data.frame(x,y,z)
> plot(x,y,pch=19,cex=2,col=clr1)

To get...
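
As a quick sketch of the logistic fit the title refers to (my own illustration, not necessarily the post's code), the model is estimated with glm() and returns class probabilities rather than hard labels:

reg <- glm(z ~ x + y, data = df, family = binomial)
summary(reg)$coefficients                          # intercept and slopes
predict(reg, newdata = data.frame(x = .5, y = .5),
        type = "response")                         # predicted probability that z = 1 at (.5, .5)

For more than two classes, the multinomial counterpart is nnet::multinom(z ~ x + y, data = df).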

Read more »

R Markdown Tutorial by RStudio and DataCamp

March 1, 2015

In collaboration with Garrett Grolemund, RStudio’s teaching specialist, DataCamp has developed a new interactive course to facilitate reproducible reporting of your R analyses. R Markdown enables you to generate reports straight from your R code, documenting your work as an HTML, PDF, or Microsoft Word document. This course is part of DataCamp’s R training path, but can...
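
As a reminder of the workflow the course teaches (a generic sketch, with a hypothetical file name, not material from the course itself):

library(rmarkdown)
render("analysis.Rmd", output_format = "html_document")   # or "pdf_document", "word_document"

The same .Rmd source can thus be compiled to HTML, PDF, or Word without touching the analysis code.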

Read more »

Using Tables for Statistics on Large Vectors

March 1, 2015

This is the first post I’ve written in a while. I have been somewhat radio silent on social media, but I’m jumping back in. Now, I work with brain images, which can have millions of elements (referred to as voxels). Many of these elements are zero (for background). We want to calculate basic statistics on...
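
The idea the excerpt is heading towards can be sketched as follows (my own hedged reconstruction, since the post is cut off here): when a long vector contains mostly repeated values, tabulate it once and compute statistics from the value/count pairs instead of the raw data.

vec  <- c(rpois(1e4, lambda = 10), rep(0L, 1e6))   # mostly zeros, as with voxels
tab  <- table(vec)                                 # counts per distinct value
vals <- as.numeric(names(tab))
m    <- sum(vals * tab) / sum(tab)                 # mean recovered from the table
all.equal(m, mean(vec))                            # agrees with the direct computation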

Read more »

One weird trick to compile multipartite dynamic documents with Rmarkdown

February 28, 2015

This afternoon I stumbled across this one weird trick: an undocumented part of the YAML headers that gets processed when you click the ‘knit’ button in RStudio. Knitting turns an Rmarkdown document into a specified format, using the rmarkdown package’s render function to call pandoc (a universal document converter written in Haskell). If you...
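
For context, this is the documented part of that pipeline (a generic sketch; the undocumented YAML trick itself is in the post): the header declares the output format, and rmarkdown::render() drives knitr and pandoc, which is exactly what the ‘knit’ button does.

# report.Rmd (hypothetical file) begins with a YAML header such as:
# ---
# title: "My report"
# output: html_document
# ---
rmarkdown::render("report.Rmd")   # knit the chunks, then convert via pandoc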

Read more »

Visualizing Clusters

February 24, 2015

Consider the following dataset, with (only) ten points

x=c(.4,.55,.65,.9,.1,.35,.5,.15,.2,.85)
y=c(.85,.95,.8,.87,.5,.55,.5,.2,.1,.3)
plot(x,y,pch=19,cex=2)

We want to get – say – two clusters, or more specifically, two sets of observations, each of them sharing some similarities. Since the number of observations is rather small, it is actually possible to get an exhaustive list of all partitions, and to minimize some criterion, such...
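
Since the excerpt stops just before the enumeration, here is a hedged sketch of such an exhaustive search (my own illustration): encode each two-cluster partition as a 0/1 assignment vector and keep the one minimising the within-cluster sum of squares.

X <- cbind(x, y)
wss <- function(a) {                       # within-cluster sum of squares
  sum(sapply(unique(a), function(k) {
    Xk <- X[a == k, , drop = FALSE]
    sum(scale(Xk, scale = FALSE)^2)
  }))
}
assignments <- as.matrix(expand.grid(rep(list(0:1), 10)))   # all 2^10 labellings
scores <- apply(assignments, 1, wss)
best <- assignments[which.min(scores), ]
plot(x, y, pch = 19, cex = 2, col = best + 1)               # colour by optimal cluster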

Read more »

k-means clustering and Voronoi sets

February 22, 2015

In the context of k-means, we want to partition the space of our observations into k classes: each observation belongs to the cluster with the nearest mean. Here “nearest” is in the sense of some norm, usually the ℓ2 (Euclidean) norm. Consider the case where we have 2 classes, the means being respectively the 2 black dots. If we partition based...
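
To make the Voronoi picture concrete, a sketch of my own (using base R's kmeans on the toy x, y data from the previous posts, not necessarily the post's code): colour every point of a fine grid by its nearest centre; the two coloured regions are the Voronoi cells of the means, separated by a straight line.

km <- kmeans(cbind(x, y), centers = 2)
u  <- seq(0, 1, length = 201)
g  <- expand.grid(u = u, v = u)
d1 <- (g$u - km$centers[1, 1])^2 + (g$v - km$centers[1, 2])^2   # squared distance
d2 <- (g$u - km$centers[2, 1])^2 + (g$v - km$centers[2, 2])^2   # to each centre
plot(g$u, g$v, pch = 15, cex = .4,
     col = ifelse(d1 < d2, rgb(1, 0, 0, .2), rgb(0, 0, 1, .2)))
points(x, y, pch = 19, cex = 2)
points(km$centers, pch = 19, cex = 2)                            # the two means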

Read more »

12 nifty tips for scientists who use computers

February 16, 2015

Simple things are good. Here is a list of 12 things that I find simple and useful, yet which not many of my colleagues use. The list is R-biased. Knitr: an intuitive tool to integrate R and text to make reports with fancy fonts, figures, syntax-highlighted R code and equations. If …
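
As a one-line illustration of the knitr tip (generic, with a hypothetical file name, not taken from the post):

knitr::knit("report.Rmd")   # runs the embedded R chunks and writes report.md
# pandoc / rmarkdown (or RStudio's Knit button) then turns that into HTML or PDF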

Read more »