Solutions to working with small sample sizes

March 10, 2020 | Paul van der Laken

Both in science and business, we often experience difficulties collecting enough data to test our hypotheses, either because target groups are small or hard to access, or because data collection entails prohibitive costs. Such obstacles may result in data sets that are too small for the complexity of the statistical ...
Simulating data with Bayesian networks, by Daniel Oehm

February 11, 2020 | Paul van der Laken

Daniel Oehm wrote this interesting blog post about how to simulate realistic data using a Bayesian network. Bayesian networks are a type of probabilistic graphical model that uses Bayesian inference for probability computations. They aim to model conditional dependence, and therefore causation, by representing conditional dependencies as edges in a ...
Learn Julia for Data Science

February 10, 2020 | Paul van der Laken

Most data scientists favor Python as a programming language these days. However, there’s also still a large group of data scientists coming from a statistics, econometrics, or social science background and therefore favoring R, the programming language they learned in university. Now there’s a new kid on the block: ...
Why Gordon Shotwell uses R

January 6, 2020 | Paul van der Laken

This blog post by Gordon Shotwell has passed through my Twitter feed a couple of times now and I thought I’d share it here. In it, Gordon presents his reasons for using R, describing R’s four unique selling points, and outlining a ...
Anomaly Detection Resources

December 19, 2019 | Paul van der Laken

Carnegie Mellon PhD student Yue Zhao curates this great GitHub repository of anomaly detection resources. The repository consists of tools for multiple languages (R, Python, MATLAB, Java) and resources in the form of books and academic papers, online courses and videos, outlier datasets, algorithms and applications ...
