579 search results for "SQL"

If you are into large data and work a lot with package ff

August 8, 2012

The ff package is a great and efficient way of working with large datasets. One of the main reasons I prefer it over other packages for working with large datasets is that it is a complete set of tools. When comparing it to the other open source 'bigdata' packages in R it is not...

Data Parallelism Using Oracle R Enterprise

August 2, 2012

Modern computer processors are adequately optimized for many statistical calculations, but large data operations may require hours or days to return a result.  Oracle R Enterprise (ORE), a set of R packages designed to process large data computations in Oracle Database, can run many R operations in parallel, significantly reducing processing time. ORE supports parallelism through the transparency layer,...

ScraperWiki in R

July 29, 2012

ScraperWiki describes itself as an online tool for gathering, cleaning and analysing data from the web. It takes a programming-oriented approach: users can implement ETL processes in Python, PHP or Ruby, share these processes with the community (or pay for privacy) and schedule automated runs. The software behind the service is open source, and there is...

Success does not require understanding

July 23, 2012

I took part in the second Data Science London Hackathon last weekend (also my second hackathon), and it was a very different experience from the first. Once again Carlos and his team really looked after us. The data was released 24 hours before the competition started and even though I had spent less

The R packages in a data scientist’s toolbox

July 17, 2012

John Myles White, self-described "statistics hacker" and co-author of "Machine Learning for Hackers", was interviewed recently by The Setup. In the interview, he describes some of his go-to R packages for data science: Most of my work involves programming, so programming languages and their libraries are the bulk of the software I use. I primarily program in R,...

Using R for classification in small-N studies

July 14, 2012

Rick Davies just wrote an interesting post which combined thoughts on QCA (and multi-valued QCA or mvQCA) and classification trees with thoughts on INUS causation and classification trees. The question was something like: how can we look at a small-to-medium set of cases (like a dozen or a hundred countries or development programs) and tease

In case you missed it: June 2012 Roundup

July 11, 2012

In case you missed them, here are some articles from June of particular interest to R users. The FDA goes on the record that it's OK to use R for drug trials. A review of talks at the useR! 2012 conference. Using the negative binomial distribution to convert monthly fecundity into the chances of having a baby in a...

Sourcing Code from GitHub

July 10, 2012

In previous posts I described how to input data stored on GitHub directly into R. You can do the same thing with source code stored on GitHub. Hadley Wickham has actually made the whole process easier by combining the getURL, textConnection, and source commands into one function: source_url. This is in his devtools...

Trend and Spatial Pattern of Poverty in the Philippines

July 9, 2012

In a teaching demo that I conducted, I discussed how R can be used to analyze trends and spatial patterns of poverty incidence in the Philippines. Playing with the data I got from the National Statistical Coordination Board, below is what I got...
