1286 search results for "excel"

Visualizing Agricultural Subsidies by Kentucky County

December 12, 2010

In this post, I provide results from my first full-blown application of R to read, merge, clean, subset, manipulate, analyze, and visualize data on agricultural subsidies by Kentucky county. This is very similar to the work I do on a daily basis, and it was a great test of how well these tasks can be done with open-source tools...

Google, The Brew’s On Me

December 3, 2010

While drinking these fine liquids may up your trip count to the john, they can also make your R reports better. Not because they allow you to reach your Ballmer Peak… but because each of these elixirs is the direct result of brewing. Okay, I’m rea...

Life Is Short, Use Python

November 24, 2010

I started to play with Python two weeks ago because of R's limitations in handling large data. A friend of mine suggested I try Python since I frequently have to massage data: "Python is the best choice, trust me", he...

Learn Logistic Regression (and beyond)

November 23, 2010

One of the current best tools in the machine learning toolbox is the 1930s statistical technique called logistic regression. We explain how to add professional-quality logistic regression to your analytic repertoire and describe a bit beyond that. A statistical analyst working on data tends to deliberately start simple and move cautiously to more complicated methods.

My First R Package: infochimps

November 20, 2010

I have finally taken the plunge and created my first R package! As frequent readers will know, I often sing the praises of infochimps, a startup out of Austin, TX attempting to be the world’s data clearinghouse. While infochimps is an excellent resource for data sets, they also provide their own set of excellent data

Is there a Market for Premium R Packages?

November 19, 2010

Nathan Yau, of the excellent FlowingData blog, recently asked on his Twitter stream: "I wonder if there’s a market for premium R packages, like there is for, say, @wordpress themes and plugins." There are some great packages available for R, all of which are currently free. I think it would be great if authors like

Postdoc in Wharton

November 16, 2010

Just received this email from José Bernardo about an exciting postdoc position at Wharton: POST-DOCTORAL FELLOW – DEPARTMENT OF STATISTICS, THE WHARTON SCHOOL. The Department of Statistics at The Wharton School of the University of Pennsylvania is seeking candidates for a Post-Doctoral Fellowship. This research fellowship provides full funding without any teaching requirements at a

Feature selection: All-relevant selection with the Boruta package

November 15, 2010

Feature selection is an important step in practical commercial data mining, which is often characterised by data sets with far too many variables for model building. There are two main approaches to selecting the features (variables) we will use for the analysis:...

Isarithmic History of the Two-Party Vote

November 15, 2010

A few weeks ago, I shared a series of choropleth maps of U.S. presidential election returns, illustrating the relative support for Democratic, Republican, and third-party candidates since 1920. The granularity of these county-level results led me to wonder whether it would be possible to develop an isarithmic map of presidential voting using the …
