585 search results for "SQL"

Ack! Duplicates in the Data!

May 3, 2012

As I mentioned in a previous post, I compiled the data set that I’m currently working on in PostgreSQL. To get this massive data set, I had to write a query that was massive by dint of the number of …

Google BigQuery and the Github Data Challenge

May 1, 2012

GitHub has made data on its code repositories, developer updates, forks, etc. from the public GitHub timeline available for analysis, and is offering prizes for the most interesting visualization of the data. Sounds like a great challenge for R programmers! The R language is currently the 26th most popular on GitHub (up from #29 in December), and it would...

The R-Podcast Episode 6: Importing Data from External Sources

April 29, 2012

In this episode: Listener feedback and importing data from external sources into R. We dive into the basics of importing delimited text files using read.table and its variants. We also discuss recommendations for importing MS Excel spreadsheet files, relational databases such as MySQL, data from HTML tables, and files produced by other statistical computing packages.

soilDB Demo: Processing SSURGO Attribute Data with SDA_query()

April 26, 2012

A quick example of how to use the USDA-NRCS soil data access query facility (SDA), via the soilDB package for R. The following code describes how to get component-level so...

Sanitizing data in SAP HANA with R

From April 10 to April 11, my team (Anne, Juergen and I) hosted an InnoJam in Boston. It was a really great event, but the data provided by the City of Boston wasn't exactly in the best shape, so we put in a lot of effort (with the help of the SAP Guru...

Using SNA in Predictive Modeling

April 10, 2012

In a previous post, I described the basics of social network analysis. I plan to extend that example here with an application in predictive analytics. Let's suppose we have the following network (visualized in R). Suppose we have used the igraph package ...

Quick off the mark

April 9, 2012

With none of the top teams overimpressing this season, Alan Pardew’s performance with Newcastle, especially in the transfer market, is likely to see him receive considerable recognition in the Manager of the Year award. Recent acquisition Papiss Cissé has proved particularly fruitful, with a brace against Swansea last time out taking him to nine

R package ETLUtils @ CRAN – easy loading into ffdf

The R package ETLUtils is now available for download at its CRAN repository. It's a package that facilitates ETL in situations where you need to interact with SQL databases in a corporate environment. Basically it currently focusses on t...

The race for speed at the data layer

April 6, 2012

The competition amongst database vendors to create the fastest, most powerful "data layer" — the hardware and software to provide storage for Big Data with high-performance data processing — is clearly heating up. The Netezza appliance has been so successful that IBM has been racing to keep up with demand. SAP is also seeing success with its HANA in-memory...

Introduction to ORE Embedded R Script Execution

April 2, 2012

This Oracle R Enterprise (ORE) tutorial, on embedded R execution, is the third in a series to help users get started using ORE. See these links for the first tutorial on the transparency layer and second tutorial on the statistics engine. Oracle R Enterprise is a component in the Oracle Advanced Analytics Option of Oracle Database Enterprise...
