3887 search results for "git"

Visualize a random forest that classifies digits

August 5, 2012

My last post uses random forest proximity to visualize a set of diamond shapes (the random forest is trained to distinguish diamonds from non-diamonds). This time I looked at the digits data set that Kaggle is using as the basis of a competition for "ge...

Sourcing Code from GitHub

July 10, 2012

In previous posts I described how to input data stored on GitHub directly into R. You can do the same thing with source code stored on GitHub. Hadley Wickham has actually made the whole process easier by combining the getURL, textConnection, and source commands into one function: source_url. This is in his devtools...

Update to Data on Github Post: Solution to an RCurl problem

June 14, 2012

A reader of my most recent post tried the R code I had written to download the data set of electoral disproportionality from the GitHub repository. However, it didn’t work for them. After entering disproportionality.data <- getURL(url) they go...

Mindoro Digital Elevation Map

June 14, 2012

I saw a map produced by a former student of mine using commercial GIS software. Using the R raster package and data from diva-gis.org, I produced the same map.

Data on GitHub: The easy way to make your data available

June 13, 2012

GitHub is designed for collaborating on coding projects. Nonetheless, it is also a potentially great resource for researchers to make their data publicly available. Specifically, you can use it to: store data in the cloud for future use (for free), track ...

Digitize linear and (semi-)log scale graphs with multiple point sets

June 5, 2012

While working on a paper, I needed data from a graph that was not mine and for which no underlying table had been published. With today's software packages, however, it is not very difficult to digitize a figure yourself. I remembered rea...

Github Follower Graph with R

May 17, 2012

Graph a GitHub user's followers (and followers' followers). Each programming language tends to develop its own idiomatic set of data structures. In R, data frames are often the structure of choice. JSON (a subset of JavaScript) has emerged a...

GitHub data analysis

May 15, 2012

A few weeks ago GitHub announced that its timeline data is available on BigQuery for analysis. Moreover, it is offering prizes for the best visualization of the data. Despite my limited art skills and minimal chances of winning a beauty contest, I decided to crunch the GitHub data and run an analysis. After an initial trial of the BigQuery service, I found hard...

Google BigQuery and the Github Data Challenge

May 1, 2012

GitHub has made data on its code repositories, developer updates, forks, etc. from the public GitHub timeline available for analysis, and is offering prizes for the most interesting visualization of the data. Sounds like a great challenge for R programmers! The R language is currently the 26th most popular on GitHub (up from #29 in December), and it would...

Probit/Logit Marginal Effects in R

April 23, 2012

The common approach to estimating a binary dependent variable regression model is to use either the logit or probit model. Both are forms of generalized linear models (GLMs), which can be seen as modified linear regressions that allow the dependent variable to originate from non-normal distributions. The coefficients in a linear regression model are marginal...
