### Credit Risk Modelling using Machine Learning: A Gentle Introduction

August 2, 2020

Assume you are given a dataset for a large bank and you are tasked with coming up with a credit risk score for each ...

### What’s the Difference Between Instagram and TikTok? Using Word Embeddings to Find Out

August 1, 2020

TL;DR: Instagram - TikTok = Photos, Photographers and Selfies; TikTok - Instagram = Witchcraft and Teens. But read the whole post to find out why! Purpose: The original intent of this post was to learn to train my own Word2Vec model; however, as is a running theme, my laptop is ...

### Weathering the Storm

August 1, 2020

Covid-19 began battering the financial markets in February. Which sectors are faring best? I’ll compare each sector in the S&P 500 with the overall market. And I’ll baseline each at 100% as of February 19th, 2020 so we can see which have recovered lost ground.
# Packages the snippet relies on: tq_get() comes from tidyquant,
# the pipe and data verbs from dplyr.
# Note: get = "quandl" needs a Quandl API key, set beforehand with quandl_api_key().
library(tidyquant)
library(dplyr)

# Quandl EOD codes: the S&P 500 ETF (SPY) plus the eleven sector ETFs.
symbols <-
  c(
    "EOD/SPY",
    "EOD/XLV",
    "EOD/XLK",
    "EOD/XLE",
    "EOD/XLF",
    "EOD/XLC",
    "EOD/XLI",
    "EOD/XLY",
    "EOD/XLP",
    "EOD/XLRE",
    "EOD/XLU",
    "EOD/XLB"
  )

# Baseline date used for the comparison.
from <- "2020-02-19"

# Download the prices and label each symbol as market or sector.
eod_sectors <-
  tq_get(symbols, get = "quandl", from = from) %>%
  group_by(symbol) %>%
  mutate(
    type = if_else(symbol == "EOD/SPY", "Market", "Sector"),
    sector = case_when(
      symbol == "EOD/SPY"  ~ "S&P 500",
      symbol == "EOD/XLB"  ~ "Materials",
      symbol == "EOD/XLE"  ~ "Energy",
      symbol == "EOD/XLU"  ~ "Utilities",
      symbol == "EOD/XLI"  ~ "Industrial",
      symbol == "EOD/XLRE" ~ "Real Estate",
      symbol == "EOD/XLV"  ~ "Health",
      symbol == "EOD/XLK"  ~ "Technology",
      symbol == "EOD/XLF"  ~ "Financial",
      symbol == "EOD/XLC"  ~ "Communication",
      symbol == "EOD/XLY"  ~ "Consumer Discretionary",
      symbol == "EOD/XLP"  ~ "Consumer Staples",
      TRUE                 ~ "Other"
    )
  ) %>%
  ungroup()
With all that ...
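
The baselining idea described in this excerpt (each series set to 100% on the first date) can be sketched as follows. This is a minimal illustration in base R, not the post's own code: the `prices` data frame, its column names, the tickers, and the price values are all made up for the example.

```r
# Toy data: two hypothetical tickers with invented closing prices.
prices <- data.frame(
  symbol = rep(c("AAA", "BBB"), each = 3),
  date   = rep(as.Date(c("2020-02-19", "2020-03-19", "2020-04-17")), times = 2),
  close  = c(50, 40, 45, 200, 150, 210)
)

# Sort so that the first row within each symbol is the baseline date.
prices <- prices[order(prices$symbol, prices$date), ]

# Divide every close by the first close of the same symbol and scale to 100.
prices$indexed <- ave(
  prices$close,
  prices$symbol,
  FUN = function(x) 100 * x / x[1]
)

print(prices)
```

With this scaling, a value above 100 means the series has recovered past its level on the baseline date, which makes series with very different price levels directly comparable.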

### Choroplethr v3.6.4 is now on CRAN (and the Future of Choroplethr)

August 1, 2020

Choroplethr v3.6.4 is now on CRAN. This is the first update to the package in two years, and was necessary because of a recent change to the tigris package, which choroplethr uses to make Census Tract maps. I also took this opportunity to add new example demographic data for Census ...

### Visualisation options to show growth in occupations in the Australian health industry by @ellis2013nz

August 1, 2020

Visualising growth in occupations in one industry: A chart is doing the rounds purporting to show that the number of administrators working in health care in the USA has grown much faster than the number of physicians - more than 3,000% growth from 1970 to ...

### How to manage credentials and secrets safely in R

August 1, 2020

If you have ever received an embarrassing message with a warning saying that you may have published your credentials or secrets when publishing your code, you know what ...
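
One common way to keep secrets out of source code, sketched here in base R, is to read them from environment variables (for example, set in a `.Renviron` file that is never committed). This is a swapped-in illustration and not necessarily the approach the post takes; the helper `get_secret()` and the variable name `MY_API_KEY` are hypothetical.

```r
# Hypothetical helper: fetch a secret from an environment variable and
# fail loudly if it is missing, instead of hard-coding it in the script.
get_secret <- function(name) {
  value <- Sys.getenv(name, unset = NA)
  if (is.na(value) || !nzchar(value)) {
    stop("Environment variable '", name, "' is not set.", call. = FALSE)
  }
  value
}

# For demonstration only: in practice the variable would come from
# .Renviron or the shell, never from the script itself.
Sys.setenv(MY_API_KEY = "not-a-real-key")
api_key <- get_secret("MY_API_KEY")
```

Failing early when the variable is missing avoids silently sending an empty key to a remote service.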

### Why R? 2020 (Remote) Call for Papers Extended

July 31, 2020

We decided to give you one more week to submit a talk or a workshop to the Call for Papers for the 2020.whyr.pl remote conference. Please fill in the form at 2020.whyr.pl/submit/ if you are interested in active participation. The new deadline for submissi...

### fairmodels: let’s fight with biased Machine Learning models (part 1 — detection)

July 31, 2020

Author: Jakub Wiśniewski. TL;DR: The fairmodels R package facilitates bias detection through model visualizations. It implements a few mitigation strategies that could reduce the bias. It enables easy-to-use checks for fairness metrics and comparison between different ...

### Explainable ‘AI’ using Gradient Boosted randomized networks Pt2 (the Lasso)

July 30, 2020

Explainable 'AI' using Gradient Boosted randomized networks Pt2 (the Lasso).

### Feature Leakage, and the case to identify it with EDA vs. Machine Learning

July 30, 2020

This is a corrected version of an earlier post on the same topic. This version contains the correct links to the original post, to facilitate discussion via comments. On one of my projects, my team and I were tasked with building a mortgage leads gener...

### Handling R6 objects in C++

July 30, 2020

Introduction: When we are using R6 objects and want to introduce some C++ code in our project, we may also want to interact with these objects using Rcpp. With this objective in mind, the key to interacting with R6 objects is that they are essentially...

### I like to MVO it!

July 30, 2020

In our last post, we ran through a bunch of weighting scenarios using our returns simulation. This resulted in three million portfolios comprised in part, or total, of four assets: stocks, bonds, gold, and real estate. These simulations relaxed the allocation constraints to allow us to exclude assets, yielding a ...

### R Package Integration with Modern Reusable C++ Code Using Rcpp – Part 3

July 30, 2020

Daniel Hanson is a full-time lecturer in the Computational Finance & Risk Management program within the Department of Applied Mathematics at the University of Washington. In the previous post in this series, we looked at some design considerations when integrating standard and reusable C++ code into an R package. Specific emphasis ...

### Spatial GLMM(s) using the INLA Approximation

July 30, 2020

The INLA Approach to Bayesian models: The Integrated Nested Laplace Approximation, or INLA, approach is a recently developed, computationally simpler method for fitting Bayesian models (Rue et al., 2009), compared to traditional Markov Chain Monte Carlo (MCMC) approaches. INLA fits models that are classified as latent Gaussian models, which are applicable ...

### rfm 0.2.2

July 30, 2020

We’re excited to announce the release of rfm 0.2.2 on CRAN! rfm provides tools for customer segmentation using Recency Frequency Monetary value analysis. It includes a Shiny app for interactive segmentation. You can install rfm with:
install.packages("rfm")
In this blog post, we will summarize the changes implemented in the current (0.2.2) ...
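
To make the idea concrete, here is a minimal base R sketch of what recency, frequency and monetary value mean for a transaction log. It deliberately does not use the rfm package's API; the `orders` data, its column names, and the analysis date are invented for illustration.

```r
# Toy transaction log with invented customers, dates and amounts.
orders <- data.frame(
  customer = c("a", "a", "b", "c", "c", "c"),
  date     = as.Date(c("2020-01-05", "2020-06-20", "2020-03-10",
                       "2020-05-01", "2020-06-01", "2020-07-15")),
  amount   = c(20, 35, 100, 10, 15, 30)
)

# Hypothetical reference date against which recency is measured.
analysis_date <- as.Date("2020-07-30")

# Split the log by customer and summarise each piece.
by_customer <- split(orders, orders$customer)
rfm_table <- do.call(rbind, lapply(by_customer, function(d) {
  data.frame(
    customer  = d$customer[1],
    recency   = as.numeric(analysis_date - max(d$date)),  # days since last order
    frequency = nrow(d),                                  # number of orders
    monetary  = sum(d$amount)                             # total spend
  )
}))

print(rfm_table)
```

Segmentation then typically proceeds by binning each of the three columns into scores, which is the part a dedicated package automates.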