# Simulating some synthetic data.

[This article was first published on **Data R Value**, and kindly contributed to R-bloggers.]

In many cases we need data with certain characteristics to develop a model, conduct research, test an algorithm, or simply to practice.

Here I show an example of how to generate synthetic data, which you can adapt to generate your own.

We will need the ggplot2 library to display our data:

    library(ggplot2)

Now we define the dimensions of the array we need:

    lrows <- 3035
    lcols <- 11

In this case, 3035 rows and 11 columns.

Now we create the array, initially filled with zeros in every entry:

    syn_data <- array(data = 0, dim = c(lrows, lcols))

Our data look like this:
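One way to look at the array at this stage is with base R's `dim()` and `head()`. This is only a minimal sketch; the array is rebuilt here so the snippet runs on its own.

```r
# Rebuild the zero-filled array from the steps above
lrows <- 3035
lcols <- 11
syn_data <- array(data = 0, dim = c(lrows, lcols))

# Dimensions: rows, then columns
dim(syn_data)      # 3035 11

# First three rows; every entry is still 0 at this point
head(syn_data, 3)

# A two-dimensional array is a matrix in R
is.matrix(syn_data)  # TRUE
```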

Now let’s name each field (column):

    colnames(syn_data) <- c("ONE", "NUMBER", "R1", "R2", "R3",
                            "R4", "R5", "R6", "NINE", "TEN", "ELEVEN")

Now let’s assign values to some columns.

    syn_data[, 2] <- seq(lrows, 1)
    syn_data[, 1] <- runif(lrows, 0.0, 7.5)
    syn_data[, 9] <- runif(lrows, 10, 100)
    syn_data[, 10] <- runif(lrows, 5.0, 50)
    syn_data[, 11] <- runif(lrows, 30.0, 60.0)

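Reading the script line by line shows what each column receives: column 2 (NUMBER) holds a descending sequence from 3035 to 1, while columns 1, 9, 10 and 11 hold uniform random values between 0 and 7.5, 10 and 100, 5 and 50, and 30 and 60 respectively.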
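Because `runif()` draws new random numbers on every run, the generated data changes each time the script is executed. If you want reproducible data, one option is to fix the seed with `set.seed()`. The sketch below assumes that goal; the seed value 42 is arbitrary and not part of the original script, and the vector names `one`, `nine` and `one_again` are only illustrative.

```r
set.seed(42)  # arbitrary seed (an assumption, not in the original script)

lrows <- 3035
one  <- runif(lrows, 0.0, 7.5)   # same call as column 1 (ONE)
nine <- runif(lrows, 10, 100)    # same call as column 9 (NINE)

# The observed ranges fall inside the requested bounds
range(one)
range(nine)

# Resetting the seed reproduces exactly the same draws
set.seed(42)
one_again <- runif(lrows, 0.0, 7.5)
identical(one, one_again)  # TRUE
```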

Now, for columns R1 to R6, I want to assign a random integer between 1 and 56 to each entry. We use the following nested `for` and `while` loops:

    for (i in 1:lrows) {
      j <- 1
      # columns 3 to 8 correspond to R1 to R6
      while (j <= 6) {
        syn_data[i, j + 2] <- sample(1:56, 1)
        j <- j + 1
      }
    }
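The nested loops draw one value at a time. As a possible improvement, the same columns can be filled in a single vectorized call to `sample()`. This is a sketch of an alternative, not the original script: it assumes the draws are independent (hence `replace = TRUE`), which matches calling `sample(1:56, 1)` once per cell.

```r
lrows <- 3035
lcols <- 11
syn_data <- array(data = 0, dim = c(lrows, lcols))

# Draw all lrows * 6 integers at once and fill columns 3 to 8 (R1 to R6).
# replace = TRUE keeps each draw independent, as in the loop version.
syn_data[, 3:8] <- sample(1:56, lrows * 6, replace = TRUE)

# All values lie between 1 and 56
range(syn_data[, 3:8])
```

For 3035 rows the difference in run time is small, but the vectorized form scales better and is shorter to read.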

So far we have our synthetic data. Now let’s do some treatments.

First we convert the array to a data frame:

    syn_data <- as.data.frame(syn_data)

Now let us calculate the mean of each row from R1 to R6 and accumulate these means in a vector:

    # preallocate the vector of row means
    smeans <- numeric(lrows)
    for (i in 1:lrows) {
      # mean of columns 3 to 8 (R1 to R6) for row i
      smeans[i] <- sum(syn_data[i, 3:8]) / 6
    }
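Base R's `rowMeans()` computes the same quantity without an explicit loop. The sketch below compares both approaches on freshly generated data; columns R1 to R6 are filled with a single `sample()` call purely to keep the snippet short, and the name `smeans_loop` is only illustrative.

```r
lrows <- 3035
syn_data <- array(data = 0, dim = c(lrows, 11))
syn_data[, 3:8] <- sample(1:56, lrows * 6, replace = TRUE)
syn_data <- as.data.frame(syn_data)

# Loop version, as in the article
smeans_loop <- numeric(lrows)
for (i in 1:lrows) {
  smeans_loop[i] <- sum(syn_data[i, 3:8]) / 6
}

# Vectorized version: one call over columns 3 to 8
smeans <- rowMeans(syn_data[, 3:8])

# rowMeans() keeps the row names, so drop them before comparing
all.equal(smeans_loop, unname(smeans))
```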

Finally we perform a visualization of the vector of means:
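One possible way to visualize the means with ggplot2 is a histogram. This is a sketch under assumptions: the choice of a histogram, the number of bins, the colours, the labels, and the names `means_df`, `row_mean` and `p` are all illustrative rather than taken from the original script. The means are regenerated here so the snippet runs on its own.

```r
library(ggplot2)

# Regenerate six random integer columns and their row means
lrows <- 3035
r_values <- matrix(sample(1:56, lrows * 6, replace = TRUE), nrow = lrows)
smeans <- rowMeans(r_values)

# ggplot2 works on data frames, so wrap the vector in one
means_df <- data.frame(row_mean = smeans)

p <- ggplot(means_df, aes(x = row_mean)) +
  geom_histogram(bins = 30, fill = "steelblue", colour = "white") +
  labs(title = "Distribution of row means (R1 to R6)",
       x = "Row mean", y = "Count")

print(p)
```

Since each mean averages six independent uniform integers, the histogram should look roughly bell-shaped and centred near 28.5.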

This is a very crude example and is actually inefficient but it is a start. It is up to you to improve it and adapt it to your needs.

You can download the script for this example from:

https://github.com/pakinja
