Introduction to R for Data Science :: Session 3

May 23, 2016

(This article was first published on The Exactness of Mind, and kindly contributed to R-bloggers)

Welcome to the third session of Introduction to R for Data Science! The course is co-organized by Data Science Serbia and Startit. You will find all course material (R scripts, data sets, SlideShare presentations, readings) on these pages. Check out the Course Overview to access the learning material presented thus far.

Data Science Serbia Course Pages [in Serbian]

Startit Course Pages [in Serbian]

Lecturers

Summary of Session 3, 12 May 2016 :: Introduction to R: Lists and Functions in R

Introduction to lists and functions in R. R is a high-level programming language in which the list data type is used a lot. We will introduce this dynamic data type during this session. R is also a functional programming language: everything that happens in R is a call to, and an execution of, some function. Even “operators” in R, as simple as “+” or “-”, are functions. In this session we learn how to write R functions. We then combine functions and lists to learn about the rather handy lapply() function. Then we proceed to demonstrate the usage of its cousin apply(), which applies a function to a matrix across its dimensions (rows or columns).
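
As a rough preview of these ideas, here is a minimal sketch that is not taken from the course script. The names `scores`, `add_ten`, `res` and `m` are made up for illustration. It builds a small list, calls an operator as an ordinary function, and uses lapply() and apply():

```r
# A small list holding a numeric vector and a character vector
# (hypothetical example data, not from the course script)
scores <- list(numbers = c(1, 2, 3), words = c("data", "science"))
length(scores)          # 2: the list has two elements
class(scores$numbers)   # "numeric"

# Operators are functions: these two calls are equivalent
2 + 3
"+"(2, 3)

# A user-defined function of one argument
add_ten <- function(x) x + 10
add_ten(5)              # 15

# lapply() applies a function to each list element and returns a list
res <- lapply(list(1, 2, 3), add_ten)
unlist(res)             # 11 12 13

# apply() works across the dimensions of a matrix: 1 = rows, 2 = columns
m <- matrix(1:6, nrow = 2)
apply(m, 1, sum)        # row sums: 9 12
apply(m, 2, sum)        # column sums: 3 7 11
```

The script below walks through each of these steps in more detail.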

Intro to R for Data Science SlideShare :: Session 3

Introduction to R for Data Science :: Session 3 from Goran S. Milovanovic

R script :: Session 3

########################################################
# Introduction to R for Data Science
# SESSION 3 :: 12 May, 2016
# Data Science Community Serbia + Startit
# :: Goran S. Milovanović and Branko Kovač ::
########################################################
 
# clear all
rm(list=ls());
 
# It's time to speak about lists
num_vct <- c(2:5) # just another num vector
chr_vct <- c("data", "science") # char vector
data_frame <- data.frame(x = c("a", "b", "c", "d"), y = c(1:4)) # simple df
 
lista <- list(data_frame, num_vct, chr_vct) # and this is a list
lista # this is our list
 
str(lista) # about a list
length(lista)
 
as.list(chr_vct) # another way to create a list
 
# Lists manipulation
names(lista) <- c("data", "numbers", "words")
 
lista[3] # 3rd element?
lista[[3]] # 3rd element?
 
is.list(lista[3]) # is this a list?
is.list(lista[[3]]) # and this?
 
class(lista[[3]]) # also a list? don't be so sure!
 
lista$words # we can also extract an element this way
lista[["words"]] # or even like this
 
length(lista$words) # 2 as expected
 
lista[["words"]][1] # digging even deeper
 
lista$new_elem <- c(TRUE, FALSE, FALSE, TRUE) # add new element
 
length(lista) # now list has 4 elements
lista$new_elem <- NULL # but we can remove it easily
 
new_vect <- unlist(lista) # creating a vector from list
 
# Introduction to Functions in R
# (w. less formalism but tips & tricks added)
 
# elementary: a definition of a function in R
fun <- function(x) x+10;
fun(5)
 
# taking two arguments
fun2 <- function(x,y) x+y;
fun2(3,4)
 
# using "{" and "}" to enclose multiple R expressions in the function body
fun <- function(x,y) {
  a <- sum(x);
  b <- sum(y);
  a-b
}
r <- c(5,4,3);
q <- c(1,1,1);
fun(r,q)
fun(c(5,4,3),c(1,1,1)) # the same call, with the vectors passed inline
# NOTE: "{" and "}" are generally used in R to mark the beginning and the end of a block
# a function is a function:
is.function(fun);
is.function(log); # log is built-in
 
# printing a function to access its source code:
fun
log # try: is.primitive(log); this one is written in C, belongs to the base package - it's "under the hood"
 
# Built in functions + functional programming ("Everything is a function...")
"^"(2,2)
"^"(2,3) # magic! - how do you do that?
2^2
2^3
# the difference between "operators" and "functions" in R: none. Everything is a function:
"+"(2,2) # Four?
2+2 # yeah, right
# Oh but I love this
"-"("+"(3,5),2);
"&"(">"(2,2),TRUE);
"&"(">"(3,2),TRUE); # punishment: write all your lab code for this week in this fashion...
# built in functions:
x <- 16;
sqrt(x);
x <- c(1,2,3,4,5,6,7,8,9);
mean(x);
# watch for NAs in statistics (!)
x <- c(1,2,3,4,5,6,7,8,NA);
mean(x); # NA
mean(x, na.rm = TRUE); # right!
median(x); # NA again
sd(x); # NA again
sum(x); # NA again
sum(x, na.rm = TRUE); # a-ha!
 
# Lexical scoping in R + nested functions
# example taken from: http://adv-r.had.co.nz/Functions.html
# "Advanced R" by Hadley Wickham
# ";"s added by GSM
x <- 1;
h <- function() {
  y <- 2;
  i <- function() {
    z <- 3
    c(x, y, z)
  }
  i();
}
h();
 
# Messing up argument names (never do this in nested functions unless you have to)
rm(x, h);
x <- 1;
h <- function(x) {
  y <- x+1
  i <- function(x) {
    z <- x+2;
    z
  }
  z <- i(x);
  c(x,y,z)
}
h(x)
 
# Two things that come in handy: lapply and apply
# Step 1: here's a list:
aList <- list(c(1,2,3),
              c(4,5,6),
              c(7,8,9),
              c(10,11,12));
# Step 2: I want to apply the following function:
myFun <- function(x) {
  x[1]+x[2]-x[3]
}
# to all elements of the aList list, and get the result as a list again. Here it is
# (the function is written inline below; lapply(aList, myFun) gives the same result):
res <- lapply(aList, function(x) {
  x[1]+x[2]-x[3]
});
unlist(res) # to get a vector
rm(myFun);
 
# Now say I've got a matrix
myMat <- matrix(c(1,2,3,4,5,6,7,8,9),
                  nrow=3,
                  ncol=3);
# btw
is.function(matrix);
# reminder
class(myMat);
typeof(myMat);
# now, I want the sums of all rows:
rsMyMat <- apply(myMat, 1, function(x) {
  sum(x)
});
rsMyMat;
is.list(rsMyMat) # FALSE: apply() returned a plain numeric vector here, just beautiful
# for columns:
csMyMat <- apply(myMat, 2, function(x) {
  sum(x)
});
# with existing functions such as sum(), this will do:
rsMyMat1 <- apply(myMat, 1, sum);
rsMyMat1
csMyMat1 <- apply(myMat, 2, sum);
csMyMat1
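
A hedged aside that is not part of the original course script: when the applied function returns a single number per element, base R's sapply() and vapply() return a vector directly, and rowSums() and colSums() are built-in shortcuts for the apply(..., sum) calls above. The sketch below rebuilds aList, myFun and myMat so that it runs on its own; the names resList, resVec and resChecked are made up for illustration.

```r
# Rebuild the objects from the script so this block is self-contained
aList <- list(c(1,2,3), c(4,5,6), c(7,8,9), c(10,11,12))
myFun <- function(x) x[1] + x[2] - x[3]

resList <- lapply(aList, myFun)                 # a list of four numbers
resVec <- sapply(aList, myFun)                  # simplified to a numeric vector: 0 3 6 9
resChecked <- vapply(aList, myFun, numeric(1))  # same values, but the result type is checked

myMat <- matrix(c(1,2,3,4,5,6,7,8,9), nrow = 3, ncol = 3)
rowSums(myMat)   # 12 15 18, same as apply(myMat, 1, sum)
colSums(myMat)   # 6 15 24, same as apply(myMat, 2, sum)
```

vapply() is the stricter choice in scripts, because it stops with an error if the function ever returns something other than the declared single number.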

Readings :: Session 4 [19 May 2016, @Startit.rs, 19h CET]

Chapters 1 – 10, The Art of R Programming, Norman Matloff

