545 search results for "Hadoop"

So you want to be a data scientist

August 10, 2016

From HuffingtonPost. The New York Times made it look so easy. Take a few courses in data science and a web-based startup will readily pay top dollar for your newly acquired skills. Since the McKinsey Global Institute reported on the impending shortage of data crunchers, the wannabe data scientists are searching for...

Read more »

Deep Learning Part 1: Comparison of Symbolic Deep Learning Frameworks

August 9, 2016

by Anusua Trivedi, Microsoft Data Scientist. Background and Approach: This blog series is based on my upcoming talk on re-usability of Deep Learning Models at the Hadoop+Strata World Conference in Singapore. The series will be in several parts, in which I describe my experiences and go deep into the reasons behind my choices. Deep learning is an emerging...

Read more »

New cheat-sheet for the dplyrXdf package

August 8, 2016

Hadley Wickham's dplyr package is an amazing tool for restructuring, filtering, and aggregating data sets using its elegant grammar of data manipulation. By default, it works on in-memory data frames, which means you're limited to the amount of data you can fit into R's memory. Hadley also provided an extension mechanism to make dplyr work with external data sources,...

Read more »

stacksurveyr: An R package with the 2016 Developer Survey Results

July 18, 2016

This year, more than fifty thousand programmers answered the Stack Overflow 2016 Developer Survey, in the largest survey of professional developers in history. Last week Stack Overflow released the full (anonymized) results of the survey at stackoverf...

Read more »

New Release of partools Package

July 17, 2016

My new release of partools is now on CRAN. The package is aimed at doing parallel data science in what I call an “un-MapReduce” manner. It takes the point of view that MapReduce-based frameworks such as Hadoop and Spark are fine for the types of applications their designers had in mind, namely rather simple SQL … Continue...

Read more »

Notes from the Kölner R meeting, 9 July 2016

July 13, 2016

Last Thursday the Cologne R user group came together again. This time, our two speakers arrived from Bavaria to talk about Spark and R Server. Introduction to Apache Spark: Dubravko Dulic gave an introduction to Apache Spark and why Spark might be of interest to data scientists using...

Read more »

Introducing the free Microsoft R Client

July 11, 2016

Over the years, we've shared several posts on using the ScaleR package to import, process, visualize and analyze large data sets with R. Until now, you needed to have access to a Microsoft R Server license to take advantage of the package. Now, you can use all of the capabilities of ScaleR free of charge with Microsoft R Client...

Read more »

In case you missed it: June 2016 roundup

July 8, 2016

In case you missed them, here are some articles from June of particular interest to R users. A preview of the tutorials presented at the useR! 2016 conference. An "advanced beginner's" guide to R published by ComputerWorld includes guides on data wrangling, visualization, and data APIs. Microsoft R Server now runs on Apache Spark, bringing high performance to big-data...

Read more »

Euro 2016 analytics: Who’s playing the toughest game?

July 1, 2016

I am really enjoying the UEFA Euro 2016 football competition, not least because our national team has done pretty well so far. That’s why, after browsing the statistics section of the official EURO 2016 website for a while, I decided to do some analysis on the data they share. Just to be clear from the beginning: we are not talking...

Read more »

Microsoft Analytics in 2016

June 23, 2016

If you had asked me two years ago if Microsoft was a serious vendor for data science and analytics infrastructure and tools, I would have laughed. At the time their offering seemed to me to consist of Excel against SQL Server. There is nothing really wrong (or exciting) about SQL Server, but friends don’t let friends use Excel for...

Read more »
