1026 search results for "sql"

How to use SparkR in Cloudera Hadoop

Suppose you are an avid R user, and you would like to use SparkR in Cloudera Hadoop; unfortunately, as of the latest CDH version (5.7), SparkR is still not supported (and, according to a recent discussion in the Cloudera forums, we shouldn’t expect this to happen anytime soon). Is there anything you can do? Well, indeed there is. In...

Read more »

Get Involved with the R Consortium

April 14, 2016

by Joseph Rickert. The R Consortium, the non-profit trade organization formed under the Linux Foundation to support the R language and the R Community, is beginning to build real momentum. First of all, two new companies recently joined the Consortium: Avant, which provides online personal and auto loans, and Procogia, a consulting firm that helps companies make data-driven business...

Read more »

Merging Dataframes Exercises

April 14, 2016

When combining separate dataframes in the R programming language into a single dataframe, using the cbind() function usually requires use of the match() function. To simulate the database joining functionality in SQL, the merge() function in R accomplishes dataframe merging with the following protocols: “Inner Join” where the left table has matching rows from one,

Read more »

Microsoft Data Science VM now available as a Linux instance

April 13, 2016

Microsoft's Linux Data Science Virtual Machine is now available for use on the Azure Marketplace. Like the Windows-based instance of the Data Science VM, this pre-built system based on Linux CentOS 7.2 includes all the tools you'll need to analyze data, including Microsoft R Open, Anaconda Python, Jupyter Notebooks and a PostgreSQL database instance. It also includes a suite...

Read more »

Learn R By Intensive Practice – Part 2

April 13, 2016

This is a continuation of part 1 of the Learn R By Intensive Practice Series. In this part, we step up the game and learn a number of key concepts such as lists, sampling, data frames, etc. At the end of each video, you will solve a practice challenge based on what you learnt. 11. Get specific...

Read more »

Determining the Number of Factors with Parallel Analysis in R

April 12, 2016

Tom Schmitt, April 12, 2016. As discussed on page 308 and illustrated on page 312 of Schmitt (2011), a first essential step in Factor Analysis is to determine the appropriate number of factors with Parallel Analysis in R. The data consists of 26 psychological tests administered by Holzinger and Swineford (1939) to 145 students and...

Read more »

Clandestine DNS lookups with gdns

April 10, 2016

Google recently announced their DNS-over-HTTPS API, which “enhances privacy and security between a client and a recursive resolver, and complements DNSSEC to provide end-to-end authenticated DNS lookups”. The REST API they provided was pretty simple to wrap into a package and I tossed in some SPF functions that I had lying around to bulk it...

Read more »

R Markdown & Bloggin’: Part 1 – Inserting Code

April 8, 2016

Reino Bruner, April 8, 2016. As a data scientist, I find the vast majority of the useful content I produce just gets stored into a rainy-day folder until I have further need for it. I think it would be more beneficial if I brought some of the functions, processes, and knowledge I have developed...

Read more »

In case you missed it: March 2016 roundup

April 8, 2016

In case you missed them, here are some articles from March of particular interest to R users. Reviews of new CRAN packages RtutoR, lavaan.shiny, dCovTS, glmmsr, GLMMRR, MultivariateRandomForest, genie, kmlShape, deepboost and rEDM. You can now create and host Jupyter notebooks based on R, for free, in Azure ML Studio. Calculating learning curves for predictive models with doParallel. An...

Read more »

Answers to FAQ about SparkR for R users

April 5, 2016

Many people keep asking me whether I have tried SparkR, whether it is worth using, whether it is sexy, or what it is at all. I felt that a list of frequently asked questions (FAQ) about what Spark/SparkR is would help many R scientists to understand this Big Data buzz-tool. I have gathered information from the...

Read more »
