Articles by matloff

R > Python: a Concrete Example

November 20, 2018 | matloff

I like both Python and R, and teach them both, but for data science R is the clear choice. When asked why, I always note that R (a) was written by statisticians for statisticians, (b) has a built-in matrix type and matrix manipulations, (c) has great graphics, both base and CRAN, (d) has excellent parallelization facilities, etc. ...
[Read more...]

Example of Overfitting

November 16, 2018 | matloff

I occasionally see queries on various social media about overfitting: what is it, and so on. I’ll post an example here. (I mentioned it at my talk the other night on our novel approach to missing values, but had a bug in the code. Here is the correct account.) The ... [Read more...]

Manifold Visualization: Second Example

October 1, 2018 | matloff

In last night’s post, I introduced prVis(), a new visualization tool which we have invented, available in our polyreg package. Recall that prVis() is intended as a simpler alternative to recent visualization tools like t-SNE and UMAP. Here I will post another example. The dataset is prgeng, included in ...
[Read more...]

Manifold Visualization: Polynomials to the Rescue

October 1, 2018 | matloff

Our arXiv paper and the associated R package polyreg caused a bit of a stir, both pro and con, when we first announced them here in June. The discussion even spread as far as Twitter, Reddit and Hacker News. We’ll be announcing a revised paper, and various new features ...
[Read more...]

What, No Parentheses?

August 25, 2018 | matloff

I’m about to show you an R trick. Various readers may find it cool, useful and interesting, or stupid, useless and an evil deed undermining the sanctity of R’s functional programming nature (“All bow”). But I hope many of you will find the material here rather intriguing if ... [Read more...]

Update on Polynomial Regression in Lieu of Neural Nets

July 1, 2018 | matloff

There was quite a reaction to our paper, “Polynomial Regression as an Alternative to Neural Nets” (by Cheng, Khomtchouk, Matloff and Mohanty), leading to discussions/debates on Twitter, Reddit, Hacker News and so on. Accordingly, we have posted a revised version of the paper. Some of the new features: Though ...
[Read more...]

Neural Networks Are Essentially Polynomial Regression

June 20, 2018 | matloff

You may be interested in my new arXiv paper, joint work with Xi Cheng, an undergraduate at UC Davis (now heading to Cornell for grad school); Bohdan Khomtchouk, a postdoc in biology at Stanford; and Pete Mohanty, a Science, Engineering & Education Fellow in statistics at Stanford. The paper is ... [Read more...]

Women in R

June 8, 2018 | matloff

Last week I gave one of the keynote addresses at R/Finance 2018 in Chicago. I considered it an honor and a pleasure to be there, both because of the stimulating intellectual exchange and the fine level of camaraderie and hospitality that prevailed. I mentioned at the start of my talk ... [Read more...]

Xie Yihui, R Superstar and Mensch

February 23, 2018 | matloff

Yesterday a friend told me, “Yihui has written the most remarkably open blog post, and you’ve got to read it.” I did and it was. Though my post here is not about R per se, it is about a great contributor to R, our Yihui, Dr. of Statistics and (...
[Read more...]

cdparcoord: Parallel Coordinates Plots for Categorical Data

September 4, 2017 | matloff

My students, Vincent Yang and Harrison Nguyen, and I have developed a new data visualization package, cdparcoord, available now on CRAN. It can be viewed as an extension of the freqparcoord package written by a former grad student, Yingkang Xie, and me, which I have written about before in this ... [Read more...]

Wrong on an Astronomical Scale

August 20, 2017 | matloff

I recently posted an update regarding our R package revisit, aimed at partially remedying the reproducibility crisis, both in the sense of (a) providing transparency to data analyses and (b) flagging possible statistical errors, including misuse of significance testing. One person commented to me that it may not be important ...
[Read more...]

Update on Our ‘revisit’ Package

August 16, 2017 | matloff

On May 31, I made a post here about our R package revisit, which is designed to help remedy the reproducibility crisis in science. The intended user audience includes reviewers of research manuscripts submitted for publication, scientists who wish to confirm the results in a published paper, and explore alternate analyses, ... [Read more...]

Understanding Overhead Issues in Parallel Computation

July 29, 2017 | matloff

In my talk at useR! earlier this month, I emphasized the fact that a major impediment to obtaining good speed from parallelizing an algorithm is systems overhead of various kinds, including: Contention for memory/network. Bandwidth limits — CPU/memory, CPU/network, CPU/GPU. Cache coherency problems. Contention for I/O ... [Read more...]

My Presentation at useR! 2017, Etc.

July 8, 2017 | matloff

I gave a talk titled “Parallel Computation in R: What We Want, and How We (Might) Get It” at last week’s useR! 2017 conference in Brussels. You can view my slides here, and I believe the conference organizers said the videos would be placed online, though I am not sure of that. ... [Read more...]

A Partial Remedy to the Reproducibility Problem

May 31, 2017 | matloff

Several years ago, John Ioannidis jolted the scientific establishment with an article titled “Why Most Published Research Findings Are False.” He had concerns about inattention to statistical power, multiple inference issues and so on. Most people had already been aware of all this, of course, but that conversation opened the ... [Read more...]

Online But In-Class Examinations (with an R Example)

April 15, 2017 | matloff

About a year and a half ago, some students and I wrote OMSI, Online Measurement of Student Insight, an online software tool to improve examinations for students and save instructors lots of time and drudgery currently spent on administering exams. It is written in a mixture of Python and R. (Python because it ...
[Read more...]

A Python-Like walk() Function for R

April 8, 2017 | matloff

A really nice function available in Python is walk(), which recursively descends a directory tree, calling a user-supplied function in each directory within the tree. It might be used, say, to count the number of files, or maybe to remove all small files and so on. I had students in ... [Read more...]
