434 search results for "boxplot"

The 5th Tribe, Support Vector Machines and caret

October 15, 2015

by Joseph Rickert. In his new book, The Master Algorithm, Pedro Domingos takes on the heroic task of explaining machine learning to a wide audience and classifies machine learning practitioners into 5 tribes, each with its own fundamental approach to learning problems. To the 5th tribe, the analogizers, Pedro ascribes the Support Vector Machine (SVM) as its master algorithm...

Hypothesis-Driven Development Part V: Stop-Loss, Deflating Sharpes, and Out-of-Sample

September 24, 2015

This post will demonstrate a stop-loss rule inspired by Andrew Lo’s paper “When Do Stop-Loss Rules Stop Losses?” Furthermore, it …

Convergence and Asymptotic Results

September 24, 2015

Last week, in our mathematical statistics course, we saw the law of large numbers (proven in the probability course), which claims that given a collection of i.i.d. random variables, with … To visualize that convergence, we can use

> m=100
> mean_samples=function(n=10){
+ X=matrix(rnorm(n*m),nrow=m,ncol=n)
+ return(apply(X,1,mean))
+ }
> B=matrix(NA,100,20)
> for(i in 1:20){
+ B[,i]=mean_samples(i*10)
+ }
> colnames(B)=as.character(seq(10,200,by=10))
> boxplot(B)

It is...
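As a rough numerical companion to the boxplot code in this excerpt, the sketch below checks that the spread of the sample means shrinks roughly like 1/sqrt(n) for standard normal draws. The helper name `sd_of_means`, the seed, and the sample sizes are assumptions chosen for illustration; they do not come from the original post.

```r
# Illustrative sketch (assumed names and sizes, not from the original post).
set.seed(42)                      # arbitrary seed for reproducibility
m <- 1000                         # number of simulated samples per size

# Standard deviation of m sample means, each computed from n N(0,1) draws
sd_of_means <- function(n) {
  X <- matrix(rnorm(n * m), nrow = m, ncol = n)
  sd(rowMeans(X))
}

sizes <- c(10, 40, 160)
observed <- sapply(sizes, sd_of_means)
expected <- 1 / sqrt(sizes)       # theoretical sd of the mean for N(0,1)

print(round(cbind(n = sizes, observed, expected), 3))
```

Quadrupling the sample size should roughly halve the observed spread, which is the same narrowing the boxplots show from left to right.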

Fitting a neural network in R; neuralnet package

September 23, 2015

Neural networks have always been one of the most fascinating machine learning models in my opinion, not only because of the fancy backpropagation algorithm, but also because of their complexity (think of deep learning with many hidden layers) and their structure inspired by the brain. Neural networks have not always been popular, partly because they were...

Predicting creditability using logistic regression in R: cross validating the classifier (part 2)

September 15, 2015

Now that we have fitted the classifier and run some preliminary tests, we need to run some cross-validation methods to get a grasp of how our model is doing when predicting creditability. Cross-validation is a model evaluation method that does not u...
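To make the idea concrete, here is a minimal sketch of 5-fold cross-validation for a logistic regression in base R. The post's credit dataset is not reproduced in this excerpt, so the data are simulated; the variable names (`x1`, `x2`, `y`), the number of folds, and the 0.5 cut-off are all assumptions and may differ from what the original author used.

```r
# Illustrative sketch: 5-fold cross-validation of a logistic regression.
# The data are simulated; variable names are hypothetical, not from the post.
set.seed(1)
n <- 500
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(0.5 + 1.5 * dat$x1 - 1 * dat$x2))

k <- 5
folds <- sample(rep(1:k, length.out = n))   # random fold label for each row

acc <- numeric(k)
for (i in 1:k) {
  train <- dat[folds != i, ]                # fit on k - 1 folds
  test  <- dat[folds == i, ]                # evaluate on the held-out fold
  fit   <- glm(y ~ x1 + x2, family = binomial, data = train)
  p     <- predict(fit, newdata = test, type = "response")
  acc[i] <- mean((p > 0.5) == (test$y == 1))
}

print(acc)        # accuracy on each held-out fold
print(mean(acc))  # cross-validated accuracy estimate
```

Each observation is used for testing exactly once, so the averaged accuracy is measured only on data the corresponding fit never saw.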

Hypothesis Driven Development Part III: Monte Carlo In Asset Allocation Tests

September 10, 2015

This post will show how to use Monte Carlo to test for signal intelligence. Although I had rejected this strategy …

Hypothesis-Driven Development Part II

September 8, 2015

This post will evaluate signals based on the rank regression hypotheses covered in the last post. The last time around, …

Free R Help

September 3, 2015

Today I am giving away 10 sessions of free, online, one-on-one R help. My hope is to get a better understanding of how my readers use R, and the issues they face when working on their own projects. The sessions will take place online over the next two weeks and last 30-60 minutes each. I just purchased Screenhero, ...

Use box plots to assess the distribution and to identify the outliers in your dataset

August 14, 2015

After you check the distribution of the data by plotting the histogram, the second thing to do is to look for outliers. Identifying the outliers is important because an association you find in your analysis might be explained by the presence of outliers. The best tool to identify the outliers is...
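As a small illustration of the approach named in the post's title, the sketch below flags outliers with base R's `boxplot.stats()`, which applies the usual 1.5 * IQR whisker rule by default. The sample values are made up for this example and are not taken from the original post.

```r
# Illustrative sketch with made-up data (not from the original post).
x <- c(12, 14, 15, 15, 16, 17, 18, 19, 20, 45)

# boxplot.stats() uses coef = 1.5 by default (the 1.5 * IQR whisker rule)
stats <- boxplot.stats(x)
print(stats$stats)  # lower whisker, lower hinge, median, upper hinge, upper whisker
print(stats$out)    # points lying beyond the whiskers

# The same rule drives the points drawn individually on the plot
boxplot(x, horizontal = TRUE, main = "Made-up sample with one outlier")
```

Here the value 45 lies beyond the upper whisker, so it is the only point reported in `stats$out` and the only one drawn separately on the plot.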

Importing the New Zealand Income Survey SURF

August 14, 2015

The quest for income microdata. For a separate project, I've been looking for source data on income and wealth inequality. Not aggregate data like Gini coefficients or the percentage of income earned by the bottom 20% or top 1%, but the sources used to calculate those things. Because it's sensitive personal financial data either from surveys or tax...
