Articles by chenangen

Solutions for Multicollinearity in Regression(2)

February 16, 2014 |

This post continues the discussion of multicollinearity in regression. First, it is necessary to introduce how to calculate the VIF and the condition number in software such as R, which is straightforward. The vif() function in the car package and kappa() can be used to calculate the VIF and condition ... [Read more...]

Plot 3D Topographic Map in R

February 7, 2014 |

Many packages provide functions for plotting maps, such as ggmap, GEOmap, and rworldmap. For visualizing a 2D topographic map, here is a good example. A 3D topographic map can also be plotted easily with some excellent functions and packages. The ... [Read more...]

Solutions for Multicollinearity in Regression(1)

February 3, 2014 |

In multiple regression analysis, multicollinearity is a common phenomenon in which two or more predictor variables are highly correlated. If there is an exact linear relationship (perfect multicollinearity) among the independent variables, the rank of X is less than k+1 (assuming there are k predictor variables), and the ... [Read more...]

Visualization of AQI

February 2, 2014 |

The day before yesterday was the Spring Festival, one of the most famous Chinese festivals, and setting off firecrackers outdoors on New Year's Eve is a traditional custom. However, firecrackers severely pollute the environment and cause hazy weather. Of course, the pollution in different provinces is not the ... [Read more...]

Playing Financial Data Series(1)

January 24, 2014 |

Recently I have become interested in financial data such as stock prices and exchange rates. Many models are available to fit, analyze, and predict these types of data. For instance, the basic time series model ARIMA(p,d,q), the GARCH model, and multivariate time series ... [Read more...]

The number of clusters in Hierarchical Clustering

January 22, 2014 |

Cluster analysis is widely applied in data analysis, and hierarchical clustering is a simple and important clustering method. In brief, hierarchical clustering methods use the elements of a proximity matrix to generate a tree diagram, or dendrogram. From the tree diagram, we can draw our own conclusions about ... [Read more...]

Happy new year

December 29, 2013 |

Although 2013 was not perfect for me, it still gave me a lot of happiness and valuable experiences worth recalling. In 2014, numerous difficult problems need to be solved. Application is still a headache, and the final tests are also troublesome. Nevertheless, 2014 is full of hope. ... [Read more...]

High frequency words in TOEFL

December 27, 2013 |

In general, the TOEFL (Test of English as a Foreign Language) is not an easy test for Chinese students, including me. Relatively speaking, the reading section is a little easier than the other sections (listening, speaking, writing). Interestingly, when I prepared for my TOEFL test, I found that some important words appeared frequently ... [Read more...]

PCA or SPCA or NSPCA?

November 15, 2013 |

Principal component analysis (PCA) is one of the classical methods in multivariate statistics. It is now also widely used for data processing and dimension reduction. Beyond statistics, PCA has numerous applications in engineering, biology, and other fields. There are two main optimal properties of PCA, ... [Read more...]