418 search results for "hadoop"

From cats to zombies, Wednesday at useR2015

July 1, 2015

The morning opened with a speaker I was too bleary-eyed to identify, possibly the dean of the University of Aalborg. Anyway, he said that this is the largest ever useR conference, and the first ever in a Nordic country. Take that, Norway! Also, considering that there are now quite a

Chronicles from useR! – day 1

Today I woke up for the first time in Aalborg. My colle

useR 2015: Computational

July 1, 2015

These are my initial notes from useR 2015. I will/may revise when I have time.
Computational Performance; Chair: Dirk Eddelbuettel
Running R+Hadoop using Docker Containers (E. James Harner)
Introduction
Big data architectures:
HDFS/Hadoop: software framework for distributed storage and distributed processing
Tachyon/Spark: uses in-memory
Rc2 server (R cloud computing)
Has an editor & output panel.

Exploring SparkR

A colleague from work asked me to investigate Spark and R. So the most obvious thing to do was to investigate SparkR ;-) I installed Scala, Hadoop, Spark and SparkR...not sure Hadoop is needed for this...but I wanted to have the full picture -...

SparkR: Distributed data frames with Spark and R

June 12, 2015

R is now integrated with Apache Spark, the open-source cluster computing framework. The Databricks blog announced this week that yesterday's release of Spark 1.4 would include SparkR, "an R package that allows data scientists to analyze large datasets and interactively run jobs on them from the R shell". The SparkR 1.4 announcement led with the news: Spark 1.4 introduces...

Estimating Analytics Software Market Share by Counting Books

June 9, 2015

Below is the latest update to The Popularity of Data Analysis Software. Books: The number of books published on each software package or language reflects its relative popularity. Amazon.com offers an advanced search method which works well for all the software except R …

R in a 64 bit world

June 8, 2015

32 bit data structures (pointers, integer representations, single precision floating point) have been past their “best before date” for quite some time. R itself moved to a 64 bit memory model some time ago, but still has only 32 bit integers. This is going to get more and more awkward going forward. What is R …

Any R code as a cloud service: R demonstration at BUILD

June 5, 2015

At last month's BUILD conference for Microsoft developers in San Francisco, R was front-and-center on the keynote stage. In the keynote, Microsoft CVP Joseph Sirosh introduced the "language of data": open source R. Sirosh encouraged the audience to learn R, saying "if there is a single language that you choose to learn today .. let it be R". The...

Update on Snowdoop, a MapReduce Alternative

May 29, 2015

In blog posts a few months ago, I proposed an alternative to MapReduce, e.g. to Hadoop, which I called “Snowdoop.” I pointed out that systems like Hadoop and Spark are very difficult to install and configure, are either too primitive (Hadoop) or too abstract (Spark) to program, and above all, are SLOW. Spark is of …

SparkR preview by Vincent Warmerdam

May 28, 2015

SparkR preview in RStudio: Apache Spark is the hip new technology on the block. It allows you to write scripts in a functional style and the technology behind it will allow you to run iterative tasks very quickly on a cluster of machines. It’s benchmarked to be quicker than Hadoop for most machine learning use
