583 search results for "SQL"

Venue Recommendation – A Simple Use Case Connecting R and Neo4j

April 7, 2013

Last month I attended the CeBIT trade fair in Hannover. Besides the so-called “shareconomy”, there was another main topic across all exhibition halls: Big Data. This subject is not completely new, and I think that a lot of you already have experience with some of the tools associated with Big Data. But due to the great...

Data visualization with R and ggplot2

March 28, 2013

I’m working on a one-hour ggplot2 lecture for the San Diego R users group, which I will post here when I’m done. I think there are many great intro to R data visualization resources out there so I’ll only share working examples on my blog. A retail chain client employs a few hundred field agents who perform

Build a search engine in 20 minutes or less

March 27, 2013

…or your money back.
author = "Ben Ogorek"
Twitter = "@baogorek"
email = paste0(sub("@", "", Twitter), "@gmail.com")
Setup
Pretend this is Big Data:
doc1 <- "Stray cats are running all over the place. I see 10 a day!"
doc2 <- "Cats are killers. They...

Python vs R vs SPSS … Can’t All Programmers Just Get Along?

March 26, 2013

Programmers have long been very proud of and loyal to their tools, and often very vocal about them. This has led to well-contested rivalries and “fights” about which tool is better: emacs or vi; Java or C++; Perl or Python; Django or Rails; …

Massive online data stream mining with R

A few weeks ago, the stream package was released on CRAN. It allows you to do real-time analytics on data streams. This can be very useful if you are working with large datasets that are already hard to fit in RAM completely, let alone to build a statistical model on without running into RAM problems. Most of...

Baseball Statistics with R – Batting Average

March 18, 2013

I'm working on a new book about the R programming language. R is a language that is designed for use with statistics and data. I use it to analyze sports and social networking. I thought that it would be fun to write the book focusing on baseball statistics using data from Major League Baseball. This post...

column-store R or: how i learned to stop worrying and love monetdb

March 18, 2013

"Combining R's sophisticated calculations and MonetDB's excellent data access performance is a no-brainer. One gets the best of two (open source) worlds with minimal hassle." - Dr. Hannes Mühleisen
"oh wow that was fast like a cheetah with a jetpack or something" - anthony damico
why try monetdb + r
a speed test of four analysis commands on sixty-seven million...

Using maps and ggplot2 to visualize college hockey championships

March 13, 2013

Short: I plot the frequency of college hockey championships by state using the maps package and ggplot2.
Note: this example is based heavily on the example provided at http://www.dataincolour.com/2011/07/maps-with-ggplot2/
Data reference: http://en.wikipedia.org/wiki/NCAA_Men%27s_Ice_Hockey_Championship
Question of interest: As a good Minnesotan, I've believed for quite some time that the colder, Northern states enjoy a competitive advantage when it...

Getting flexible with SAP HANA

Most of you might not be aware of a feature introduced in SAP HANA SPS5. This new feature is called "Flexible Tables", which means that you can define a table that will grow depending on your needs. Let's see an example... You define a table with ID, NA...

ddply in action

March 7, 2013

Top Batting Averages Over Time
Reference: http://www.baseball-databank.org/
Short: I'm going to use plyr and ggplot2 to look at how top batting averages have changed over time.
First load the data:
options(width = 100)
library(ggplot2)
## Warning message: package 'ggplot2' was built under R version 2.14.2
library(plyr)
data(baseball)
head(baseball)
## ...
