Introduction to Statistical Methods in R

[This article was first published on r – Lunean, and kindly contributed to R-bloggers.]

Data analyses are the product of many different tasks, and statistical methods are one key aspect of any data analysis. There is a common workflow in the related areas of informatics, data mining, data science, machine learning, and statistics. The workflow tasks include data preparation, the development of predictive mathematical models, and the interpretation and preparation of analysis results (including the development of visualizations to communicate findings).

The presentation covers the last two steps of this workflow. It provides reproducible code examples and walks through many common statistical methods used to explore data, with examples in R: regression, clustering (e.g., K-means and hierarchical), and dimensionality reduction (e.g., principal component analysis (PCA)).
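As a rough illustration of the clustering and dimensionality reduction methods named above, here is a minimal sketch in base R. It is not taken from the presentation: the built-in iris data, the choice of three clusters, and all variable names are assumptions made for the example.

```r
# Built-in data: four numeric measurements on 150 iris flowers (assumed example data).
x <- scale(iris[, 1:4])          # centre and scale each column

# K-means with 3 clusters; nstart reruns the algorithm from several random starts.
set.seed(42)
km <- kmeans(x, centers = 3, nstart = 25)
table(km$cluster, iris$Species)  # compare clusters with the known species

# One rough quality measure: share of the total sum of squares between clusters.
km_quality <- km$betweenss / km$totss

# Hierarchical clustering on Euclidean distances, cut into 3 groups.
hc <- hclust(dist(x), method = "complete")
hc_groups <- cutree(hc, k = 3)

# PCA: x is already centred and scaled, so no further scaling is requested.
pca <- prcomp(x, center = FALSE, scale. = FALSE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# The PCA scores are the data multiplied by the rotation (loadings) matrix.
scores_by_hand <- x %*% pca$rotation
```

Comparing `scores_by_hand` with `pca$x` is one way to see the transformation PCA applies, and `var_explained` gives the proportion of variance carried by each component.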

Novice users are shown how to navigate the resulting R objects to extract specific elements of interest, such as correlation p-values and regression coefficients. The presentation also tackles some of the key concerns about these introductory methods by providing guidance on interpreting analysis results: understanding the approximately 10 values returned by a simple linear regression; why missing values matter and how to handle them through imputation in real-world problems; determining the quality of clustering results; and understanding the data transformations that take place in dimension reduction methods. It also introduces more sophisticated methodologies, such as the regularized regression methods LASSO, Ridge, and Elastic Net, along with R packages that implement them, such as glmnet.
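To make the idea of navigating result objects concrete, the sketch below pulls coefficients and p-values out of a fitted regression and a correlation test, then applies a simple mean imputation. It is an illustration only, not the presentation's code: the built-in mtcars data, the model `mpg ~ wt`, and the rows set to missing are all assumptions chosen for the example.

```r
# Simple linear regression on the built-in mtcars data (assumed example data).
fit <- lm(mpg ~ wt, data = mtcars)
names(fit)                 # components stored in the fitted "lm" object
s <- summary(fit)
names(s)                   # components of the summary object

coefs <- coef(s)           # matrix: Estimate, Std. Error, t value, Pr(>|t|)
slope <- coefs["wt", "Estimate"]
slope_p <- coefs["wt", "Pr(>|t|)"]
r_squared <- s$r.squared

# Correlation test: the result is a list of class "htest".
ct <- cor.test(mtcars$mpg, mtcars$wt)
cor_estimate <- unname(ct$estimate)
cor_p <- ct$p.value

# Mean imputation on a copy with values deliberately set to NA
# (mtcars itself has no missing values; rows 3 and 7 are arbitrary).
cars_na <- mtcars
cars_na$wt[c(3, 7)] <- NA
cars_na$wt[is.na(cars_na$wt)] <- mean(cars_na$wt, na.rm = TRUE)
```

Calling `names()` or `str()` on a result is usually the quickest way to discover what it contains. Mean imputation is used here only because it is short; it tends to understate variability, so other imputation approaches may suit real analyses better.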

Using these statistical methods for modeling can help users understand their data sets, and the methods can be combined with other parts of R and RStudio to build interactive analyses with the Shiny R package.


The post Introduction to Statistical Methods in R appeared first on Lunean.
