Blog Archives

Model Segmentation with Cubist

March 18, 2015

Cubist is a tree-based model with an OLS regression attached to each terminal node and is somewhat similar to the mob() function in the party package (https://statcompute.wordpress.com/2014/10/26/model-segmentation-with-recursive-partitioning). Below is a demonstration of the cubist() model with the classic Boston housing data.

Read more »

Download Federal Reserve Economic Data (FRED) with Python

December 10, 2014

In operational loss calculations, it is important to use the CPI (Consumer Price Index) to adjust historical losses. Below is an example showing how to download CPI data directly from the Federal Reserve Bank of St. Louis and then calculate monthly and quarterly CPI adjustment factors with Python.
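As a rough illustration of what a CPI adjustment factor is, the sketch below restates each period in the prices of the latest period by dividing the latest CPI by that period's CPI. The CPI values are made-up placeholders rather than FRED data, the download step is left out, and the function names and the choice of the latest period as the base are assumptions, not necessarily what the original post does.

```python
from collections import defaultdict

# Hypothetical monthly CPI index values keyed by "YYYY-MM".
# These are illustrative placeholders, not actual FRED data.
cpi = {
    "2014-01": 100.0,
    "2014-02": 100.5,
    "2014-03": 101.0,
    "2014-04": 101.5,
    "2014-05": 102.0,
    "2014-06": 102.5,
}


def monthly_factors(cpi):
    """Factor that restates each month's losses in the latest month's prices."""
    base = cpi[max(cpi)]  # "YYYY-MM" keys sort chronologically
    return {month: base / value for month, value in cpi.items()}


def quarterly_factors(cpi):
    """Same idea, based on the average CPI within each quarter."""
    buckets = defaultdict(list)
    for month, value in cpi.items():
        year, mm = month.split("-")
        quarter = f"{year}Q{(int(mm) - 1) // 3 + 1}"
        buckets[quarter].append(value)
    averages = {q: sum(v) / len(v) for q, v in buckets.items()}
    base = averages[max(averages)]  # latest quarter is the base
    return {q: base / avg for q, avg in averages.items()}


monthly = monthly_factors(cpi)
quarterly = quarterly_factors(cpi)
```

A historical loss would then be multiplied by the factor of the period in which it occurred to express it in current prices.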

Read more »

Query Pandas DataFrame with SQL

November 1, 2014

Similar to the sqldf package, which provides a seamless interface between SQL statements and R data.frames, pandasql allows Python users to query pandas DataFrames with SQL. Below are some examples showing how to use pandasql for SELECT / AGGREGATE / JOIN operations. More information is available on GitHub (https://github.com/yhat/pandasql).

Read more »

Flexible Beta Modeling

October 27, 2014

Read more »

Model Segmentation with Recursive Partitioning

October 26, 2014

Read more »

Estimating a Beta Regression with The Variable Dispersion in R

October 19, 2014

Read more »

Fitting Lasso with Julia

October 7, 2014

Julia Code R Code

Read more »

By-Group Aggregation in Parallel

October 4, 2014

Similar to the row search, by-group aggregation is another perfect use case to demonstrate the power of split-and-conquer with parallelism. The example below shows that a homebrew by-group aggregation with the foreach package, albeit inefficiently coded, is still a lot faster than the summarize() function in the Hmisc package.
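The split-and-conquer pattern itself can be sketched in a few lines of Python: split the rows by group key, summarize each group independently in a worker pool, and combine the results. This is not the post's R code; the helper names and the toy rows are hypothetical, and a thread pool is used only to keep the sketch self-contained. For CPU-bound work in CPython, a process pool would be the more likely choice for an actual speedup.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_by_group(rows):
    """Split (key, value) rows into one list of values per group key."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return groups


def summarize(item):
    """Summarize a single group; runs independently of all other groups."""
    key, values = item
    return key, {"n": len(values), "mean": sum(values) / len(values)}


def aggregate_parallel(rows, workers=4):
    """Split the rows by group, summarize each group in a worker, combine."""
    groups = split_by_group(rows)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(summarize, groups.items()))


# Toy (group, value) rows, purely illustrative.
rows = [("a", 1.0), ("b", 10.0), ("a", 3.0), ("b", 20.0), ("c", 5.0)]
result = aggregate_parallel(rows)
```

Because each group is summarized without reference to any other group, the per-group step needs no coordination between workers, which is what makes by-group aggregation a natural fit for parallel execution.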

Read more »

Vector Search vs. Binary Search

October 1, 2014

Read more »

Row Search in Parallel

September 28, 2014

I’ve always wondered whether the efficiency of row search can be improved if the whole data.frame is split into chunks and the row search is then conducted within each chunk in parallel. In the R code below, a comparison is done between the standard row search and the parallel row search with the FOREACH

Read more »
