Posts Tagged ‘Uncategorized’

Distribution of Oft-Used Bash Commands

June 1, 2012

Browsing commandlinefu.com today, I came across this little one-liner to display which commands I use most often. Here’s what I got: Yep, seems legit. I navigate and look at files a whole bunch (ls, cd, cat), and I do a butt tonne of editing (vim). I sudo like a boss, hop onto various servers (ssh),

Are scatterplots too complex for lay folks?

May 23, 2012

Usually, I like to write about the solutions to problems I’ve had, but today I only have a problem to write about. This is the second research job I’ve had outside of academia, and in both cases I’ve met with …

CFP: the 10th Australasian Data Mining Conference (AusDM 2012)

May 20, 2012

The Tenth Australasian Data Mining Conference (AusDM 2012), Sydney, Australia, 5-7 December 2012. http://ausdm12.togaware.com/ Data mining, the art and science of intelligent analysis of (usually large) data sets for meaningful (and previously unknown) insights, is now being actively applied in …

Bar Graph Colours That Work Well

May 17, 2012

Ever since I started using ggplot2 more often at work to make graphs, I’ve realized something about the use of colour in bar graphs vs. dot plots: When I’m looking at a graph displayed on the brilliant Viewsonic …

An Example of Social Network Analysis with R using Package igraph

May 16, 2012

By Yanchang Zhao, RDataMining.com. This post presents an example of social network analysis with R using package igraph. The data analyzed is the Twitter text data of @RDataMining used in the Text Mining example, and it can be downloaded …

Functions ddply and melt make plotting summary stats in R more tolerable

May 15, 2012

The main reason I have usually used Excel to make my plots at work is that I had difficulty feeding summary stats from R into a plotting function. One thing I learned this week is how …

An embarrassing admission; Copy pasting tables with text containing spaces from Excel to R

May 11, 2012

I can’t believe I didn’t learn this earlier, but I never knew how to accurately copy tables containing text with spaces from Excel and paste them into a data frame in R without generating confusion …

Book “R and Data Mining: Examples and Case Studies” on CRAN

May 9, 2012

By Yanchang Zhao, RDataMining.com. A draft of my book, “R and Data Mining: Examples and Case Studies”, is now available on CRAN at http://cran.r-project.org/other-docs.html. It is scheduled to be published by Elsevier in late 2012. Its latest version can be …

Memory Management in R, and SOAR

May 8, 2012

The more I’ve worked with my really large data set, the more cumbersome the work has become for my work computer. Keep in mind I’ve got a quad core with 8 gigs of RAM. With growing irritation at how slow …

Heartbeat of a Cycling City: Bixi data at Hack/Reduce

May 8, 2012

The recent Hack/Reduce hackathon in Montreal was a tonne of fun. Our team tackled a data set consisting of Bixi (Montreal’s bicycle share system) station states at one-minute temporal resolution. We used Hadoop and MapReduce to pull out some features of user behaviours. One of the things we extracted was the flux at
