Articles by hrbrmstr

New CRAN Package Announcement: splashr

August 29, 2017 | hrbrmstr

I’m pleased to announce that splashr is now on CRAN. (That image was generated with splashr::render_png(url = "https://cran.r-project.org/web/packages/splashr/")). The package is an R interface to the Splash JavaScript rendering service. It works in a similar fashion to Selenium but is fear ...
[Read more...]

Unbottling “.msg” Files in R

August 25, 2017 | hrbrmstr

There was a discussion on Twitter about the need to read in “.msg” files using R. The “MSG” file format is one of the many binary abominations created by Microsoft to lock folks and users into their platform and tools. Thankfully, they (eventually) provided documentation for the MSG file format ...
[Read more...]

Reticulating Readability

August 24, 2017 | hrbrmstr

I needed to clean some web HTML content for a project. I usually use hgr::clean_text() for that, and it generally works pretty well. The clean_text() function uses an XSLT stylesheet to try to remove all non-“main text content” from an HTML document and it usually ... [Read more...]

Caching httr Requests? This means WAR[C]!

August 22, 2017 | hrbrmstr

I’ve blathered about my crawl_delay project before and am just waiting for a rainy weekend to be able to crank out a follow-up post on it. Working on that project involved sifting through thousands of Web Archive (WARC) files. While I have a nascent package on GitHub to ... [Read more...]

R⁶ — Reticulating Parquet Files

August 1, 2017 | hrbrmstr

The reticulate package provides a very clean & concise interface bridge between R and Python which makes it handy to work with modules that have yet to be ported to R (going native is always better when you can do it). This post shows how to use reticulate to create parquet ... [Read more...]

R⁶ — General (Attys) Distributions

July 25, 2017 | hrbrmstr

Matt @stiles is a spiffy data journalist at the @latimes and he posted an interesting chart on U.S. Attorneys General longevity (given that the current US AG is on thin ice): Only Watergate and the Civil War have prompted shorter tenures as AG (if Sessions were to leave now). ...
[Read more...]

R⁶ — Disproving Approval

June 18, 2017 | hrbrmstr

I couldn’t let this stand unchallenged: The new Rasmussen Poll, one of the most accurate in the 2016 Election, just out with a Trump 50% Approval Rating.That's higher than O's #'s!— Donald J. Trump (@realDonaldTrump) June 18, 2017 Rasmussen makes their Presidential polling data available for both ? & O. Why not compare their ...
[Read more...]

Keeping Users Safe While Collecting Data

June 13, 2017 | hrbrmstr

I caught a mention of this project by Pete Warden on Four Short Links today. If his name sounds familiar, he’s the creator of the DSTK, an O’Reilly author, and now works at Google. A decidedly clever and decent chap. The project goal is noble: crowdsource and make ...
[Read more...]

Engaging the tidyverse Clean Slate Protocol

June 10, 2017 | hrbrmstr

I caught the 0.7.0 release of dplyr on my home CRAN server early Friday morning and immediately set out to install it since I’m eager to finish up my sergeant package and get it on CRAN. “Tidyverse” upgrades aren’t trivial for me as I tinker quite a bit with ...
[Read more...]

R⁶ — Scraping Images To PDFs

June 5, 2017 | hrbrmstr

I’ve been doing intermittent prep work for a follow-up to an earlier post on store closings and came across this CNN Money “article” on it. Said “article” is a deliberately obfuscated or lazily crafted series of GIF images that contain all the Radio Shack impending store closings. It’s ... [Read more...]

Drilling Into CSVs — Teaser Trailer

May 31, 2017 | hrbrmstr

I used reading a directory of CSVs as the foundational example in my recent post on idioms. During my exchange with Matt, Hadley and a few others — in the crazy Twitter thread that spawned said post — I mentioned that I’d personally “just use Drill”. I’ll use this post ... [Read more...]

R⁶ — Idiomatic (for the People)

May 23, 2017 | hrbrmstr

NOTE: I’ll do my best to ensure the next post will have nothing to do with Twitter, and this post might not completely meet my R⁶ criteria. A single, altruistic, nigh exuberant R tweet about slurping up a directory of CSVs devolved quickly — at least in my opinion, and ... [Read more...]

A Very Palette-able Post

May 21, 2017 | hrbrmstr

Many of my posts seem to begin with a link to a tweet, and this one falls into that pattern: And @_inundata is already working on a #rstats palette. https://t.co/bNfpL7OmVl— Timothée Poisot (@tpoi) May 21, 2017 I’d seen the Ars Tech post about the named color ...
[Read more...]