5119 search results for "git"

Ode to Systematic Clusters

September 25, 2013

Extending the d3 remixes of the fine work at Systematic Investor, I thought it would be fun to do some dimple.js and nvd3 scatterplots of clusters of PIMCO mutual funds. As always, I welcome thoughts, comments, and suggestions. Click here or...


US High School Graduation Rates, using googleVis

September 25, 2013

Here is an analysis and some graphics using googleVis on high school graduation rates in the United States. A report from the US Department of Education on increasing high school graduation rates was published earlier this year (Jan 2013) and some summary graphics were shown on the Department’s blog (http://www.ed.gov/blog/2013/01/high-school-graduation-rate-at-highest-level-in-three-decades). However, the above blog only shows one static...


Introducing imagemetrics

September 25, 2013

In my recent projects, I had the opportunity to work with Professor Raphaël Proulx, who introduced me to several metrics commonly used in landscape ecology for quantifying image texture. In order to make my life easier, I decided to implement...


Using R to Solve a Geography Puzzle

September 25, 2013

The puzzle: find two points inside the United States such that (1) both points are in the same state, and (2) the straight line segment (shortest great circle) connecting them crosses the largest number of distinct states. This came up during a recent road trip through Pennsylvania, Maryland, West Virginia, and Virginia, where I noticed that it’s possible...


R as a command-line tool for data science

September 24, 2013

Data Scientist Jeroen Janssens recently published a useful list of 7 data science tools that you can use from the command line. This doesn't just mean they're convenient tools for command-line junkies: it also means you can easily chain them together with data sources for offline, automated processes. Included in the list are JSON processing tools (jq, json2csv), the...


Patterns in the Ivy II: Beyond the Giant Component

September 24, 2013

Last week’s post on the metal collaboration network brought attention largely to the “giant component”: the largest subgraph in a network where all actors have at least one path to all other actors. In large networks, even sparse ones, giant components typically emerge and include the majority of actors in the network. While focusing on the…
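
To make the definition above concrete, here is a minimal Python sketch (not the original post's code) that finds the giant component of a small undirected network with a breadth-first search. The edge list and the function names `connected_components` and `giant_component` are hypothetical, chosen only for illustration.

```python
from collections import deque


def connected_components(edges):
    """Return the connected components of an undirected graph as a list of sets."""
    # Build an adjacency map from the edge list.
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects every actor reachable from `start`.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


def giant_component(edges):
    """Return the largest connected component (assumes at least one edge)."""
    return max(connected_components(edges), key=len)


# Hypothetical collaboration ties between actors (made-up data).
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F"), ("G", "H")]
print(sorted(giant_component(edges)))
```

In this toy network the actors A through D form the giant component, while the pairs E-F and G-H are the smaller components that lie beyond it.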


A speed test comparison of plyr, data.table, and dplyr

September 23, 2013

Guest post by Jake Russ. For a recent project I needed to make a simple sum calculation on a rather large data frame (0.8 GB, 4+ million rows, and ~80,000 groups). As an avid user of Hadley Wickham’s packages, my first …


Citations for using Stan?

September 23, 2013

Bob writes: If you have papers that have used Stan, we’d love to hear about them. We finally got some submissions, so we’re going to start a list on the web site for 2.0 in earnest. You can either mail them to the list, to me directly, or just update the issue (at least until...


Building models over rolling time periods

September 23, 2013

Often I have some idea for a trading system that is of the form “does some particular aspect of the last n periods of data have any predictive use for subsequent periods.” I generally like to work with nice units of time, such as 4 weeks or 6 months, rather than 30 or 126 days. It probably doesn’t...


Going to Plot Some Proportions? Why not Flog ’em First?

September 23, 2013

Fractions and proportions can be difficult to plot nicely for a number of reasons: If the proportions are based on small counts (e.g., two of his three computing devices were Apple products) then the calculated proportions will only take on a number of discrete values. Depending on what you have measured there might be many proportions close to the...
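
The point about small counts can be illustrated with a short Python sketch (an illustration only, not code from the original post). The helper name `possible_proportions` is hypothetical; it lists every proportion that a count out of n items can produce.

```python
from fractions import Fraction


def possible_proportions(n):
    """All proportions k/n that a count k out of n items can produce."""
    return sorted({Fraction(k, n) for k in range(n + 1)})


# With three computing devices, only four distinct proportions are possible.
for p in possible_proportions(3):
    print(p, float(p))
```

With n = 3 the only attainable values are 0, 1/3, 2/3, and 1, so many observations will land on exactly the same few points in a plot.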
