4028 search results for "git"

GitHub data analysis

May 15, 2012

A few weeks ago GitHub announced that its timeline data is available on BigQuery for analysis. Moreover, it is offering prizes for the best visualization of the data. Despite my limited art skills and minimal chances of winning a beauty contest, I decided to crunch the GitHub data and run some analysis. After an initial trial of the BigQuery service, I found hard...

Read more »

Google BigQuery and the Github Data Challenge

May 1, 2012

GitHub has made data on its code repositories, developer updates, forks, etc. from the public GitHub timeline available for analysis, and is offering prizes for the most interesting visualization of the data. Sounds like a great challenge for R programmers! The R language is currently the 26th most popular on GitHub (up from #29 in December), and it would...

Read more »

Probit/Logit Marginal Effects in R

April 23, 2012

The common approach to estimating a binary dependent variable regression model is to use either the logit or probit model. Both are forms of generalized linear models (GLMs), which can be seen as modified linear regressions that allow the dependent variable to originate from non-normal distributions. The coefficients in a linear regression model are marginal

Read more »
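
The excerpt above breaks off just as it turns to marginal effects. As a rough illustration of the idea (not the post's own R code): in a logit or probit model a coefficient is not itself the marginal effect, because the change in predicted probability depends on where the linear predictor sits. The sketch below evaluates the standard formulas, `beta_j * p * (1 - p)` for logit and `beta_j * phi(xb)` for probit. The function names, the coefficient value, and the linear predictor values are all hypothetical, chosen only for illustration.

```python
import math


def logit_marginal_effect(beta_j, linear_predictor):
    """dP/dx_j for a logit model: beta_j * p * (1 - p)."""
    p = 1.0 / (1.0 + math.exp(-linear_predictor))
    return beta_j * p * (1.0 - p)


def probit_marginal_effect(beta_j, linear_predictor):
    """dP/dx_j for a probit model: beta_j * standard normal density at xb."""
    density = math.exp(-0.5 * linear_predictor ** 2) / math.sqrt(2.0 * math.pi)
    return beta_j * density


# Hypothetical coefficient and linear predictor values (assumptions, not data
# from the post). The same coefficient gives a different marginal effect at
# each point, unlike in a linear regression.
beta_j = 0.8
for xb in (-2.0, 0.0, 2.0):
    print(xb, logit_marginal_effect(beta_j, xb), probit_marginal_effect(beta_j, xb))
```

Both effects are largest at a linear predictor of zero (a predicted probability of 0.5) and shrink toward the tails, which is why marginal effects are usually reported at the mean of the covariates or averaged over the sample.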

R Source on GitHub

April 10, 2012

I added R source code from v0.49 to v2.15.0 to a GitHub repository: r-source. Each release is tagged by version number. This is an easy and accessible way to browse the R source and diff against prior versions. I couldn’t find a suitable alternative. ...

Read more »

knitr, github, and a new phase for the lab notebook

March 21, 2012

I have recently modified the basic workflow of my lab notebook since discovering knitr. Before, I would write code files which I could track on github, push figures created by the code to flickr, and then write a notebook entry on wordpress describing what I was doing. I’d embed each figure I wanted into the

Read more »

Visualising F1 Telemetry Data and Plotting Latitude and Longitude with ggplot Map Projections in R

March 14, 2012

Why don’t X-Y plots of latitude and longitude data look “right” compared to traditional map views? For example, here’s an X-Y scatterplot of some of Jenson Button’s McLaren telemetry data from the 2010 Australian Formula One Grand Prix: The image was generated, from a data file hosted on Google Spreadsheets, using the following R script,

Read more »

A Crash Course in git for Data Scientists

March 10, 2012

I really like git. It’s the first versioning tool I’ve ever used so I have nothing else to compare it to, but in the world of statistical model building where iteration is constant (and almost never a strict linear progression)...

Read more »

github with Multiple Accounts: An Analyst Perspective

March 10, 2012

After using github for data mining competitions and a project on statistical language models, I found I enjoyed it so much I wanted to use it at work too. The trick is there’s a lot of overlap between what I...

Read more »

Show me the data! Or how to digitize plots

February 27, 2012

I had mentioned the Guardian's data blog and the need for more data journalism earlier here. What I really like about the Guardian's approach in particular is that they share the data of their articles and encourage readers to use it. Of course there ar...

Read more »

R-Function to Source all Functions from a GitHub Repository

January 1, 2012

Here's a function that sources all scripts from an arbitrary GitHub repository. At the moment the function downloads the whole repo and sources functions kept in a folder named "Functions"; this may be adapted for everyone's own purpose. # Script name: ...

Read more »