Articles by Data Science Notes - R

Read from hdfs with R. Brief overview of SparkR.

February 19, 2016 | Data Science Notes - R

Disclaimer: originally I planned to write a post about R functions and packages that can read data from HDFS (with benchmarks), but in the end it became more of an overview of SparkR capabilities. Nowadays working with “big data” almost always means working with the Hadoop ecosystem. A few years ago this ... [Read more...]

Analyzing texts with text2vec package.

November 8, 2015 | Data Science Notes - R

Over the last few weeks I have been actively working on text2vec (formerly tmlite), an R package that provides tools for fast text vectorization and state-of-the-art word embeddings. This project is an experiment for me: what can a single person do in a particular area? After these hard weeks, ... [Read more...]

Introducing tmlite – new framework for text mining in R

September 15, 2015 | Data Science Notes - R

IMPORTANT NOTE: Code from this post is outdated (the package APIs have changed). See this post. Today I am pleased to present tmlite, a small but fast and robust package for text-mining tasks in R. It is not yet available on CRAN, but you can install it directly from GitHub:
devtools::install_github("dselivanov/tmlite")
... [Read more...]

Locality Sensitive Hashing In R Part 1

January 1, 2015 | Data Science Notes - R

Introduction. In the next few posts I will try to explain the basic concepts of the Locality Sensitive Hashing technique. Note that I will try to follow a general functional programming style, so I will use R’s higher-order functions instead of the traditional *apply family of functions (I suppose this post will ... [Read more...]

Rmongodb 1.8.0

November 1, 2014 | Data Science Notes - R

Today I’m introducing a new version of rmongodb (which I have started to maintain): v1.8.0. Install it from GitHub:
library(devtools)
install_github("mongosoup/[email protected]")
The release version will be uploaded to CRAN shortly. This release brings a lot of improvements to rmongodb. rmongodb now handles arrays correctly. mongo.bson.to.list() has been rewritten from scratch. R’s ... [Read more...]
