# Articles by Easy Guides

### Regression Analysis Essentials For Machine Learning

March 21, 2018 |

Regression analysis consists of a set of machine learning methods that allow us to predict a continuous outcome variable (y) based on the value of one or multiple predictor variables (x). Briefly, the goal of a regression model is to build a mathematical equation that defines y as a function of ... [Read more...]
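As a minimal sketch of the idea in this teaser, the base R code below fits a linear model that expresses y as a function of one predictor x. The built-in `mtcars` data set and the choice of `mpg` as outcome and `wt` as predictor are assumptions made for illustration only; they are not taken from the article.

```r
# Illustrative assumption: predict fuel consumption (mpg) from car weight (wt)
fit <- lm(mpg ~ wt, data = mtcars)

# Estimated equation: mpg = intercept + slope * wt
coef(fit)

# Predict the outcome for new values of the predictor
new_cars <- data.frame(wt = c(2.5, 3.5))
pred <- predict(fit, newdata = new_cars)
pred
```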

### ggpubr: Create Easily Publication Ready Plots

September 14, 2017 |

The ggpubr R package facilitates the creation of beautiful ggplot2-based graphs for researchers with non-advanced programming backgrounds. The current material presents a collection of articles for simply creating and customizing publication-ready... [Read more...]

### The Ultimate Guide To Partitioning Clustering

September 6, 2017 |

In this first volume of simplyR, we are excited to share our Practical Guides to Partitioning Clustering. The course materials contain 3 chapters organized as follows: K-Means Clustering Essentials. Contents: K-means basic ideas; K-means algorithm ... [Read more...]

### Practical Guide to Principal Component Methods in R

August 24, 2017 |

Introduction: Although there are several good books on principal component methods (PCMs) and related topics, we felt that many of them are either too theoretical or too advanced. This book provides solid practical guidance to summarize, visu... [Read more...]

### simplyR

August 19, 2017 |

simplyR is a web space where we’ll be posting practical and easy guides for solving important real-world problems using the R programming language. As we aren’t fans of unnecessary complications, we’ll keep the content of our tutorials / R codes as simpl... [Read more...]

### Elegant correlation table using xtable R package

August 7, 2017 |

Correlation matrix analysis is an important method to find dependence between variables. Computing a correlation matrix and drawing a correlogram are explained here. The aim of this article is to show you how to get the lower and the upper triangular part of a correlation matrix. We will also use the xtable ... [Read more...]
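The following base R sketch shows one way to extract the lower and upper triangular parts of a correlation matrix. The `mtcars` columns used here are an illustrative assumption, and the xtable formatting step covered by the article is not shown.

```r
# Illustrative assumption: correlate four numeric columns of mtcars
cor_mat <- round(cor(mtcars[, c("mpg", "disp", "hp", "wt")]), 2)

# Upper triangle: blank out the diagonal and everything below it
upper <- cor_mat
upper[lower.tri(upper, diag = TRUE)] <- NA

# Lower triangle: blank out the diagonal and everything above it
lower <- cor_mat
lower[upper.tri(lower, diag = TRUE)] <- NA

upper
lower
```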

### Saving High-Resolution ggplots: How to Preserve Semi-Transparency

August 4, 2017 |

This article describes solutions for preserving semi-transparency when saving ggplot2-based graphs in a high-quality PostScript (.eps) file format. Contents: Create a ggplot with semi-transparent color; Save ggplots with semi-transparent colors; Use cairo-based PostScript graphics devices; Export to PowerPoint. Create a ggplot with semi-transparent color: To illustrate this, ... [Read more...]

### F-Test: Compare Two Variances in R

August 2, 2017 |

The F-test is used to assess whether the variances of two populations (A and B) are equal. Contents: When to use the F-test?; Research questions and statistical hypotheses; Formula of the F-test; Compute the F-test in R; R function; Import and check your data in R; Preliminary test to check F-test assumptions; Compute ... [Read more...]
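As a hedged illustration of comparing two variances, the sketch below uses the base R function `var.test()`. The built-in `ToothGrowth` data set and the grouping by `supp` are assumptions chosen for this example, not necessarily the data used in the article.

```r
# Illustrative assumption: compare the variance of tooth length (len)
# between the two supplement groups (OJ and VC) in ToothGrowth
res <- var.test(len ~ supp, data = ToothGrowth)

# F statistic: ratio of the two sample variances (OJ over VC)
res$statistic

# p-value for the null hypothesis that the variances are equal
res$p.value
```

The F-test assumes that both groups are normally distributed, which is why a preliminary normality check is usually run first.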

### ggplot2 – Easy way to mix multiple graphs on the same page

July 26, 2017 |

To arrange multiple ggplot2 graphs on the same page, the standard R functions - par() and layout() - cannot be used. The basic solution is to use the gridExtra R package, which comes with the following functions: grid.arrange() and arrangeGrob() to arrange multiple ggplots on one page; marrangeGrob() for ... [Read more...]

### Bar Plots and Modern Alternatives

June 28, 2017 |

This article describes how to easily create basic and ordered bar plots using ggplot2-based helper functions available in the ggpubr R package. We’ll also present some modern alternatives to bar plots, including lollipop charts and Cleveland’s dot plots. Note that the approach to build a bar plot, ... [Read more...]

### Facilitating Exploratory Data Visualization: Application to TCGA Genomic Data

June 12, 2017 |

In genomic fields, it’s very common to explore the gene expression profile of one or a list of genes involved in a pathway of interest. Here, we present some helper functions in the ggpubr R package to facilitate exploratory data analysis (EDA) for life scientists. Exploratory Data visualization: Gene ... [Read more...]

### Add P-values and Significance Levels to ggplots

June 8, 2017 |

In this article, we’ll describe how to easily i) compare the means of two or more groups and ii) automatically add p-values and significance levels to a ggplot (such as box plots, dot plots, bar plots and line plots). Contents: Prerequisites; Methods for comparing means; R functions to add ... [Read more...]

### fastqcr: An R Package Facilitating Quality Controls of Sequencing Data for Large Numbers of Samples

April 11, 2017 |

Introduction: High-throughput sequencing data can contain hundreds of millions of sequences (also known as reads). The raw sequencing reads may contain PCR primers, adaptors, low quality bases, duplicates and other contaminants coming from the experimental protocols. As these may affect the results of downstream analysis, it’s essential to ... [Read more...]

### Survminer Cheatsheet to Create Easily Survival Plots

March 23, 2017 |

We recently released survminer version 0.3, which includes many new features to help in visualizing and summarizing survival analysis results. In this article, we present a cheatsheet for survminer, created by Przemysław Biecek, and provide an overview of the main functions. survminer cheatsheet: The cheatsheet can be downloaded from STHDA ... [Read more...]

### survminer 0.3.0

March 20, 2017 |

I’m very pleased to announce that survminer 0.3.0 is now available on CRAN. survminer makes it easy to create elegant and informative survival curves. It also includes functions for summarizing and graphically inspecting the Cox proportional hazards model assumptions. This is a big release and a special thanks goes to ... [Read more...]

### Factoextra R Package: Easy Multivariate Data Analyses and Elegant Visualization

February 19, 2017 |

factoextra is an R package that makes it easy to extract and visualize the output of exploratory multivariate data analyses, including: Principal Component Analysis (PCA), which is used to summarize the information contained in continuous (i.e., quantitative) multivariate data by reducing the dimensionality of the data without losing important information. ... [Read more...]

### Text mining and word cloud fundamentals in R : 5 simple steps you should know

February 11, 2017 |

Text mining methods allow us to highlight the most frequently used keywords in a paragraph of text. One can create a word cloud, also referred to as a text cloud or tag cloud, which is a visual representation of text data. The procedure of creating word clouds is very simple in R ... [Read more...]

### Practical Guide to Cluster Analysis in R – Book

February 7, 2017 |

Introduction: Large amounts of data are collected every day from satellite images, bio-medical, security, marketing, web search, geo-spatial or other automatic equipment. Mining knowledge from these big data far exceeds human abilities. Clustering is one of the important data mining methods for discovering knowledge in multidimensional data. The goal ... [Read more...]

### Survival Analysis

December 12, 2016 |

Survival analysis corresponds to a set of statistical methods for investigating the time it takes for an event of interest to occur. In this chapter, we start by describing how to fit survival curves and how to perform logrank tests comparing the survival time of two or more groups of ... [Read more...]
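A minimal sketch of the two steps named in this teaser, fitting survival curves and running a logrank test, is given below. It assumes the survival package is installed and uses its `lung` data set grouped by `sex`; both choices are illustrative assumptions rather than details taken from the chapter.

```r
library(survival)

# Kaplan-Meier survival curves, one per group (assumed grouping: sex)
km_fit <- survfit(Surv(time, status) ~ sex, data = lung)
print(km_fit)

# Logrank test comparing survival times between the two groups
lr_test <- survdiff(Surv(time, status) ~ sex, data = lung)
print(lr_test)
```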

### survminer 0.2.4

December 12, 2016 |

I’m very pleased to announce survminer 0.2.4. It comes with many new features and minor changes. Install survminer with:
`install.packages("survminer")`
To load the package, type this:
`library(survminer)`
Contents: New features; Minor changes; Bug fixes; Summary of survival curves; Plot survival curves; Determine the optimal cutpoint for continuous variables; Facet the output ... [Read more...]