Introduction to R for Absolute Beginners

October 13, 2014

(This article was first published on Mango Solutions Shop, and kindly contributed to R-bloggers)

By Chris Campbell – Senior Consultant, UK.
As this was the first ever Manchester R meeting, the topics were designed to help people make a start in R. We started off with a pre-session workshop: a brief introduction to R for absolute beginners. Moving from graphical user interfaces (GUIs) to the console can be daunting at first, so starting with the simplest tasks is essential. This workshop will show you some bread and butter commands.

> # We type commands at the prompt (>)
> # Text after a hash (#) is a comment
> # This text won't be interpreted by R
> 
> # All other text is interpreted
>
> 3 * 5
[1] 15
> 
> # We can create an object with the assign operator (<-)
> 
> x <- 42
> 
> # We can print its value by typing its name
> 
> x
[1] 42
> 
> # We can perform operations using mathematical syntax
> 
> x * 5 
[1] 210
> x ^ 3
[1] 74088

In R we do most of our work with objects. Learning how to work with data objects is essential to everything we do in R. Vectors are a commonly used simple data object. Harnessing the power of vector calculations is a key skill for R users.

> # This type of object, a vector, can contain multiple values
> # The c function combines values
>
> stock <- c(68486, 38831, 56415, 44117)
> 
> stock
[1] 68486 38831 56415 44117
> 
> stock * 100 / 75623
[1] 90.56240 51.34813 74.60032 58.33807
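
As a small illustration of vector calculations, the sketch below reuses the stock vector from above. The baseline values in the last step are made up for the example and are not part of the workshop data.

```r
# Same stock values as in the transcript above
stock <- c(68486, 38831, 56415, 44117)

# Summary functions reduce a vector to a single value
total <- sum(stock)      # 207849
average <- mean(stock)   # 51962.25

# Arithmetic is applied element by element, so one expression
# converts every value to a percentage of the total
share <- stock * 100 / total
round(share, 1)

# Two vectors of equal length are combined position by position
baseline <- c(60000, 40000, 50000, 45000)   # hypothetical values
change <- stock - baseline
change                   # 8486 -1169 6415 -883
```

No loop is needed in any of these steps; the operation is applied to the whole vector at once.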

An essential skill is extracting values from data objects.

> # Square brackets let us select elements
> 
> stock[2]
[1] 38831
> 
> # We can extract elements by index, or position along the vector
> 
> stock[c(1, 3)]
[1] 68486 56415
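
Square brackets accept a few other kinds of index as well. The following is a minimal sketch using the same stock vector; the replacement value is invented for illustration.

```r
stock <- c(68486, 38831, 56415, 44117)

# A negative index drops that position instead of selecting it
stock[-1]              # 38831 56415 44117

# A colon builds a sequence of consecutive positions
stock[2:4]             # 38831 56415 44117

# Assigning into a selection replaces just those elements.
# We work on a copy so the original vector is left unchanged.
adjusted <- stock
adjusted[2] <- 40000   # hypothetical corrected value
adjusted               # 68486 40000 56415 44117
```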

Another useful data structure is the data frame. There are some useful features of data frames that can catch beginners out. The most common difficulty new users have when working with data frames for the first time is that text strings can be converted to factors. This was the default behaviour before R 4.0.0, and still happens whenever stringsAsFactors = TRUE is set. Factors are an incredibly useful data class, but require some practice, and need to be treated with care.

> # We can store 2 dimensional data in a data frame
> 
> stockTime <- data.frame(Time = c(0, 2, 5, 7), Value = stock, 
+     Rainfall = c(44, 12, 15, 15))
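
To see why factors need care, here is a small sketch with a made-up data frame (the sites object and its columns are hypothetical, not part of the workshop). The stringsAsFactors argument is set explicitly so that the result does not depend on the version of R being used.

```r
# Hypothetical data: both columns are entered as text
sites <- data.frame(Site = c("north", "south", "north"),
    Count = c("10", "20", "10"),
    stringsAsFactors = TRUE)

class(sites$Site)        # "factor"
levels(sites$Site)       # "north" "south"

# A factor stores integer codes for its levels, so as.numeric()
# returns the codes, not the numbers written in the text
as.numeric(sites$Count)                  # 1 2 1

# Converting to character first recovers the original numbers
as.numeric(as.character(sites$Count))    # 10 20 10
```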

Two-dimensional objects need an additional index when extracting information. Just like referencing a spreadsheet cell by row and column, we refer to the contents of a data frame using square brackets containing the row index, then the column index.

> # when we use square brackets with 2 dimensional 
> # objects, we need to provide 2 arguments:
> # rows, then columns
> 
> stockTime[3, 3]
[1] 15
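
Either index can also be left empty to take everything along that dimension. A short sketch, rebuilding the same data frame so that it runs on its own:

```r
stock <- c(68486, 38831, 56415, 44117)
stockTime <- data.frame(Time = c(0, 2, 5, 7), Value = stock,
    Rainfall = c(44, 12, 15, 15))

# Leaving the column index empty returns the whole row
# (the result is a one-row data frame)
stockTime[3, ]

# Leaving the row index empty returns the whole column (a vector)
stockTime[, 3]         # 44 12 15 15

# The dollar operator extracts a column by name
stockTime$Rainfall     # 44 12 15 15

# dim() reports the number of rows, then the number of columns
dim(stockTime)         # 4 3
```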

There is a lot of great functionality that we can use when working with data objects. When did our stock value reach a certain value? We can perform logical tests on columns of our data frame, and use these to select data.

> # We can also select elements using logical values
> 
> stock == 1
[1] FALSE FALSE FALSE FALSE
>
> stock == 38831
[1] FALSE  TRUE FALSE FALSE
>
> stockTime[stock == 38831, c(1, 3)]
  Time Rainfall
2    2       12
> 
> # we can also refer to rows or columns by name
> 
> stockTime[stock == 38831, "Value"]
[1] 38831
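
Logical tests are not limited to exact matches. As a sketch of how one might ask when the stock value exceeded some level, the example below uses an arbitrary threshold of 50000 (an assumption made for illustration only).

```r
stock <- c(68486, 38831, 56415, 44117)
stockTime <- data.frame(Time = c(0, 2, 5, 7), Value = stock,
    Rainfall = c(44, 12, 15, 15))

threshold <- 50000     # hypothetical level of interest

# The comparison returns one TRUE or FALSE per row
above <- stockTime$Value > threshold
above                  # TRUE FALSE TRUE FALSE

# which() converts the logical values to row positions
which(above)           # 1 3

# TRUE counts as 1, so sum() counts the matching rows
sum(above)             # 2

# Tests can be combined with & (and) or | (or)
stockTime[above & stockTime$Rainfall < 20, "Time"]   # 5
```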

One of the most gratifying features of R as a new user is how quickly one can create plots.

> # We can create a plot with plot
> 
> plot(Value ~ Time, data = stockTime)


Learning R is more like an exploration of a new language than mastering a static syllabus. Getting help is an essential part of working in R at all levels of experience. The user documentation can sometimes seem a little terse, but there is a wealth of excellent books, web resources, mailing lists, and friendly consultants that can help with problems large and small.

> # We can get help with question mark (?)
>
> ?plot
starting httpd help server ... done
>
> # Or drop us an email
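
A few other built-in tools can help when the name of a function is not known, or when a help page is hard to follow. This is only a sketch of some options; the topics chosen here are arbitrary.

```r
# apropos() lists objects whose names match a pattern,
# here everything starting with "read."
found <- apropos("^read\\.")
found

# args() shows the arguments a function accepts
args(data.frame)

# example() runs the examples from the bottom of a help page
example(mean)

# Not run here: help.search("regression"), or the shorthand
# ??regression, searches the installed documentation by keyword
```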

New users should also know about the suite of functions for writing out fixed-size images. A useful help topic is png, the portable network graphics device.

Understanding how to provide arguments to functions is another essential concept. Arguments are pieces of information, provided in a variety of formats, that modify the behaviour of functions.

> # arguments provide information to functions
> # they are separated by commas
>
> plot(Value ~ Time, data = stockTime, type = "l")
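
The png device mentioned earlier is a good place to practise supplying arguments. The sketch below writes the plot to a file; the file name, image size and plot labels are assumptions chosen for illustration, and the file goes to the session's temporary directory so that nothing is overwritten.

```r
stock <- c(68486, 38831, 56415, 44117)
stockTime <- data.frame(Time = c(0, 2, 5, 7), Value = stock,
    Rainfall = c(44, 12, 15, 15))

# Hypothetical output file in the temporary directory
outFile <- file.path(tempdir(), "stock.png")

# Open the device; width and height are in pixels by default
png(filename = outFile, width = 600, height = 400)

# Named arguments can be given in any order.
# type = "b" draws both points and lines.
plot(Value ~ Time, data = stockTime, type = "b",
    main = "Stock value over time", xlab = "Time", ylab = "Value")

# Closing the device finishes writing the file
dev.off()

file.exists(outFile)
```

Nothing appears on screen while the png device is open; the plot is drawn into the file instead.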


I hope this was a useful tutorial for those who are thinking of taking the plunge into R. While there is certainly a learning curve, you will be able to create new plots, new analyses and new ideas that would not be possible with graphical software. Learn, share, evolve!

 
