570 search results for "sql"

Delete rows from R data frame

October 8, 2009

Deleting rows from a data frame in R is easy by combining simple operations. Let’s say you are working with the built-in data set airquality and need to remove rows where the ozone is NA (also called null, blank or missing). The method is a conce...
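
The excerpt is cut off before it shows the author's method, so the following is only a minimal sketch of one common base-R approach, not necessarily the one used in the post. It assumes the ozone column is named Ozone, as it is in the built-in airquality data set; the name aq_clean is made up for illustration.

```r
# airquality ships with base R (package "datasets")
data(airquality)

# Build a logical index that is TRUE where Ozone is present,
# then keep only those rows (all columns are retained).
has_ozone <- !is.na(airquality$Ozone)
aq_clean <- airquality[has_ozone, ]

# Compare row counts before and after the deletion
nrow(airquality)
nrow(aq_clean)
```

The same row indexing works for any logical condition, so "deleting" rows amounts to keeping the rows where the condition is TRUE.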

Export Data Frames To Multi-worksheet Excel File

October 6, 2009

A few weeks ago I needed to export a number of data frames to separate worksheets in an Excel file. Although one could output CSV files from R and then import them into Excel manually or with the help of VBA, I was after a more streamlined solution, as I would need to repeat this process...

Survive R

September 29, 2009

New PDF slides version (presented at the Bay Area R Users Meetup, October 13, 2009). We at Win-Vector LLC appear to like R a bit more than some of our, perhaps wiser, colleagues (see: Choose your weapon: Matlab, R or something else? and R and data). While we do like R (see: Exciting...

How to Import MS Excel Data into R

September 26, 2009

As Sir Francis Bacon said, “Histories make men wise; poets witty; the mathematics subtile; natural philosophy deep; moral grave; logic and rhetoric able to contend.” And Windows stupid. He should have added the last sentence if he were a Windows user in this age. 1. Avoid Using M$ Excel. A lot of R users often ask this question:

SAS: “The query requires remerging summary statistics back with the original data”

September 22, 2009

Coming from a background writing SQL code directly for “real” RDBMS (Microsoft SQL Server, MySQL, and SQLite), I was initially confused when SAS would give me the following ‘note’ for a simple summary PROC SQL query: proc sql; create table undel_monthly as select year(date) as year, month(date) as month, ...

Aggregating SSURGO Data in R

September 10, 2009

Premise: SSURGO is a digital, high-resolution (1:24,000) soil survey database produced by the USDA-NRCS. It is one of the largest and most complete spatial databases in the world, and is available for nearly the entire USA at no cost. These data are distributed as a combination of geographic and text data, representing soil map units and their...

ClipPath copies filename and path from windows for loading into R

September 4, 2009

I wish I had discovered this long ago. Loading data into R or MySQL requires you to specify the full path to the file. If you do this on a Windows machine there are two annoyances. First, if you save something to your desktop, the path to your desktop is really long. Second, Windows by default uses backslashes...
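
To illustrate the backslash annoyance, here is a small sketch of converting a Windows-style path into the forward-slash form that R accepts. It does not show the ClipPath tool itself; the path and the variable names are hypothetical and chosen only for the example.

```r
# A hypothetical Windows path. Inside an R string literal each
# backslash has to be escaped, which is part of the annoyance.
win_path <- "C:\\Users\\me\\Desktop\\data.csv"

# Replace every literal backslash with a forward slash.
# fixed = TRUE treats the pattern as plain text, not a regular expression.
r_path <- gsub("\\", "/", win_path, fixed = TRUE)

print(r_path)
```

The converted string can then be passed to functions such as read.csv(), which accept forward slashes on Windows as well.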

A Fast Intro to PLYR for R

August 27, 2009

I’m not dead yet! Although it has been rumored that I am. The new job is going great and I’m thrilled to be with a new firm doing interesting work alongside smart people. It makes me seem smarter by simple association. There’s been a lot going on recently in the R user community. There was an

Select operations on R data frames

July 26, 2009

The R language is weird - particularly for those coming from a typical programmer's background, which likely includes OO languages in the curly-brace family and relational databases using SQL. A key data structure in R, the data.frame, is used somethin...
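
For readers arriving from SQL, a rough sketch of how a SELECT with a WHERE clause maps onto data.frame indexing may help. The excerpt is truncated, so this is an illustration under my own assumptions rather than the post's example; it uses the built-in airquality data set, and the names may_rows and may_rows2 are made up.

```r
data(airquality)

# Roughly: SELECT Ozone, Temp FROM airquality WHERE Month = 5
# The row index plays the role of WHERE, the column index the role of SELECT.
may_rows <- airquality[airquality$Month == 5, c("Ozone", "Temp")]

# subset() expresses the same query with less repetition
may_rows2 <- subset(airquality, Month == 5, select = c(Ozone, Temp))

head(may_rows)
```

Both forms return a data.frame, so the result can be filtered or joined again just like the original.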

Massively parallel database for analytics

July 22, 2009

This is by far the best description of why traditional parallel databases (like Teradata, Greenplum et al.) are an evolutionary dead end. But much more than a theoretical discussion, they have built a solution which they call HadoopDB. It is based on Hadoop, PostgreSQL, and Hive and is completely open source. Alternative, column-based backends to PostgreSQL...
