The Case for tidymodels

April 20, 2020 | 0 Comments

If you are a data scientist with a built-out set of modeling tools that you know well, and which are almost always adequate for getting your work done, it is probably difficult for you to imagine what would induce you to give them up. Changing out what works is a ...
[Read more...]

How to showcase CSS+JS+HTML snippets with Hugo?

April 20, 2020 | 0 Comments

I’ve recently found myself having to write a bit of CSS or JS for websites made with Hugo. Note for usual readers: it is a topic not directly related to R, but you might have played with either or both CSS and JS for your R blog or Shiny ... [Read more...]

Automatic Code Cleaning in R with Rclean

April 20, 2020 | 0 Comments

“Leave the code cleaner than you found it.” – R.C. Martin in Clean Code. The R language has become very popular among scientists and analysts because it enables the rapid development of software and empowers scientific investigation. However, regardless of the language used, data analysis is usually complicated. Because of ... [Read more...]

A case against pipes in R and what to do instead

April 20, 2020 | 0 Comments

Pipes (%__%) are great for improving readability of lengthy data processing scripts, but I’m beginning to learn they have some weaknesses when it comes to large and complex data processing. We are ... [Read more...]

random generators produce ties

April 20, 2020 | 0 Comments

“…an essential part of understanding how many ties these RNGs produce is to understand how many ties one expects in 32-bit integer arithmetic.” A sort of a birthday-problem paper for random generators by Markus Hofert on arXiv as to why they produce ties. As shown for instance in the R ... [Read more...]

Q is for qplot versus ggplot

April 20, 2020 | 0 Comments

Two years ago, when I did Blogging A to Z of R, I talked about qplots. qplots are great for quick plots - which is why they're named as such - because they use variable types to determine the best plot to generate. For instance, if I give it a ...
[Read more...]

The Treachery of Models

April 20, 2020 | 0 Comments

In Magritte’s famous 1929 painting The Treachery of Images, a pipe is depicted with the caption “Ceci n’est pas une pipe”, French for “This is not a pipe”. The seemingly dissonant statement under what is a very clearly depicted pipe forces the viewer to confront the distinction between the ...
[Read more...]

How to Run Python’s Scikit-Learn in R in 5 minutes

April 19, 2020 | 0 Comments

The 2 most popular data science languages - Python and R - are often pitted as rivals. This couldn’t be further from the truth. Data scientists who learn to use the strengths of both languages are valuable because they have NO LIMITS. Machine Le...
[Read more...]

Israeli elections on Twitter

April 19, 2020 | 0 Comments

Introduction: Israel had its 3rd election within 12 months on March 2, 2020. This is because our Knesset - Hebrew term for house of representatives - wasn’t able to form or hold a government after each of the previous elections. As I won’t get into the politics of why they didn’...
[Read more...]

Analyzing the US Masters 2020 1-Hour ePostal Results

April 19, 2020 | 0 Comments

Every year US Masters Swimming runs an event called the 1 Hour ePostal. Rules are simple - athletes swim as many lengths of a 25-yard (or longer) swimming pool as possible in one hour, during the month of February, and without the aid of equipment beyond a legal suit and goggles. ...
[Read more...]

Risk premia

April 19, 2020 | 0 Comments

Our last post discussed using the discounted cash flow model (DCF) as a method to set return expectations that one would ultimately employ in building a satisfactory portfolio. We noted that if one were able to have a reasonably good estimate of the cash flow growth rate of an asset, ...
[Read more...]

Use R & GitHub as a Workout planner

April 19, 2020 | 0 Comments

Over the years, I’ve been trying a bunch of different applications and methods to stay motivated to work out. But every time it’s the same: at some point the application is great but limited if you do not pay, or the workouts are repetitive, or simpl...
[Read more...]

dowhy library exploration

April 19, 2020 | 0 Comments

It is not often that I find myself thinking “man, I wish we had in R that cool python library!”. That is however the case with the dowhy library which “provides a unified interface for causal inference methods and automatically tests many assumptions, thus making inference accessible to non-experts”. Luckily ...
[Read more...]

New R package: GetCVMData

April 19, 2020 | 0 Comments

Package GetCVMData is an alternative to GetDFPData. Both have the same objective: fetch corporate data of Brazilian companies trading at B3, but diverge in their source. While GetDFPData imports data directly from the DFP and FRE systems, GetCVMData uses the CVM ftp site for grabbing compiled .csv files. When doing ... [Read more...]

Financial Datasets Available in the Website

April 19, 2020 | 0 Comments

I’ve been researching financial data for over 10 years and have compiled a great many tables. Most of these come from my R packages and have been used for creating class material, doing research and even writing a book. These files were mostly found in many copies across different ... [Read more...]

No excuse not to be a Bayesian anymore

April 19, 2020 | 0 Comments

My first encounter with Bayesian statistics was around 10 years ago, when I was doing my econometrics master’s degree. I was immediately very interested in the Bayesian approach to fitting econometric models, because, when you’re reading about Bayesian approaches, it just sounds so easy and natural. You have a ... [Read more...]

prrd 0.0.3: More improvements

April 19, 2020 | 0 Comments

Back in early 2018, the prrd package was introduced as release 0.0.1, uploaded to CRAN, and updated once as release 0.0.2. I have used it extensively for every CRAN release of Rcpp, RcppArmadillo, RcppEigen, BH, and possibly others. The idea of prrd ...
[Read more...]

KNN With Pokemon

April 18, 2020 | 0 Comments

This analysis introduces the K-Nearest Neighbor (KNN) machine learning algorithm using the familiar Pokemon dataset. By the end of this blog post you should have an understanding of the following: what the KNN machine learning algorithm is; how to program the algorithm in R; a bit more about Pokemon. If ...
[Read more...]

COVID-19 Canada Data Explorer Tool

April 18, 2020 | 0 Comments

In the times of pandemic, the data community can help in many ways, including by developing instruments to track and break down the data on the spread of the dreaded coronavirus disease. The COVID-19 Canada Data Explorer app was built with R, including Shiny and Leaflet, to process the official ...
[Read more...]

How to make PowerPoint Slides PPT using RStudio in 2020

April 18, 2020 | 0 Comments

I wanted to make this short tutorial because there are a lot of old tutorials available on the Internet to help you make a PowerPoint using R. But some time back RStudio made this new option available that makes it extremely easy and simple...
[Read more...]