Parsing code
Parsing R files
Parsing markdown files
Binding files together
Organizing snippets
This post is a follow-up to my previous post, "Identifying R Functions & Packages Used in GitHub Repos", which introduced funspotr.
funspotr can also be applied to gists:
By functions or packages used…? https://t.co/kbSLOpQZLF
— Bryan ... [Read more...]

Documenting #rstats posts
Examples
Julia Silge Blog
David Robinson Tidy Tuesday
R for Data Science Chapters
Bryan Shalloway Blog
TLDR: funspotr provides helpers for spotting the functions and packages in R and R Markdown files and associated GitHub repositories. See Examples for catalogues of the functions/packages used in posts by ... [Read more...]

NBA Playoffs and the Lakers
Data Prep
Scraping Betting Markets
Steps
Joining with FiveThirtyEight data
Analysis
How much does FiveThirtyEight differ from markets?
Closing Thought
Appendix
Potential Reasons for the Difference
Calculating percentiles of diff
TLDR: FiveThirtyEight’s forecasts of NBA playoff berths seem to hold up OK against betting markets. ...

[Read more...]
Macro in the Shell
Example
Setting-up Guard Rails
Closing
Appendix
Related Alternative
Other Resources
There is many a data science meme degrading Excel:
(Google Sheets seems to have escaped most of the memes here.)
While I no longer use ...

[Read more...]
Quantile Regression
Example
Quantile Regression Forest
Review
Performance
Coverage
Interval Width
Closing Notes
Appendix
Residual Plots
Other Charts
In this post I will build prediction intervals using quantile regression, more specifically, quantile regression forests. This is my third post on prediction intervals. Prior posts:
Understanding Prediction Intervals (Part 1)
Simulating Prediction ...

[Read more...]
Rough Idea
Inspiration
Procedure
Example
Simulate Prediction Interval
Review
Interval Width
Coverage
Closing Notes
Appendix
Conformal Inference
Other Examples Using Simulation
Confusion With Confidence Intervals
Adjusting Procedure
Alternative Procedure With CV
Part 1 of my series of posts on building prediction intervals used data held-out from model training to evaluate the ...

[Read more...]
Providing More Than Point Estimates
Considering Uncertainty
Observation Specific Intervals
A Few Things to Know About Prediction Intervals
Prediction Intervals and Confidence Intervals
Analytic Method of Calculating Prediction Intervals
Visual Comparison of Prediction Intervals and Confidence Intervals
Inference or Prediction?
Cautions With Overfitting
Generalizability
Review Prediction Intervals
Coverage
Interval Width
...

[Read more...]
Model Performance Metrics
Lending Data Example
Starter Code
Weighting by Classification Outcomes
Metrics Across Decision Thresholds
Weighting by Observations
Closing note
Appendix
Weights of Observations During and Prior to Modeling
Notes on Cost Sensitive Classification
Weighted Classification Metrics
Questions on Cost Sensitive Classification
Arriving at Weights
Weighting in predictive modeling ...

[Read more...]
Create Data
Association of ‘feature’ and ‘target’
Resample
Build Models
Rescale Predictions to Predicted Probabilities
Appendix
Density Plots
Lift Plot
Comparing Scaling Methods
TLDR: In classification problems, under and over sampling techniques shift the distribution of predicted probabilities towards the minority class. If your problem requires accurate probabilities you will need to adjust ...

[Read more...]
Simple Example
Applying Incentives
Takeaways of Resulting Distribution
Think Carefully About Assumptions
How to Set Assumptions
Appendix
Simple Assumptions
Trade-offs
In this post I will use incentives for sales representatives in pricing to provide examples of factors to consider when attempting to influence an existing distribution.
For instance, if you ...

[Read more...]
Load data
Feature Engineering & Data Splits
Lag Based Features (Before Split, use dplyr or similar)
Data Splits
Other Features (After Split, use recipes)
Model Specification and Training
Model Evaluation
Appendix
Model Building with Hyperparam...

[Read more...]
What influences price?
Simple linear regression model
Inference and challenges
Violation of model assumptions
The tug-of-war between colinear inputs
Improving model fit, considerations
Closing notes and tips
Appendix
Pricing challenges
Future... [Read more...]

Function expecting one column
Functions allowing multiple columns
Older approaches
Appendix
dplyr, the foundational tidyverse package, makes a trade-off: it is easy to code in interactively, at the expense of being more difficult to create...

[Read more...]
Learning R’s %>%
Using the pipe operator (%>%) is one of my favorite things about coding in R and the tidyverse. However when it was first shown to me, I couldn’t understand what the #rstats nut describing it was so enthusiastic about. They t... [Read more...]

Overview
I. Nest and pivot
II. Expand combinations
III. Filter redundancies
IV. Map function(s)
V. Return to normal dataframe
VI. Bind back to data
Functionalize
Example creating & evaluating features
When is this approach inappropriate?
Appen...

[Read more...]