Blog Archives

Getting data from PDFs the easy way with R

August 24, 2018

Earlier this year, a new R package called tabulizer was released, which lets you automatically extract tables and text from PDFs. Note that this package only works if the PDF’s text is selectable (i.e. typed), so it won’t work for scanned PDFs or image files converted to PDFs. If you don’t ...


How to get live stock prices with Python

July 31, 2018

In a previous post, I gave an introduction to the yahoo_fin package. The latest version of the package includes new functionality that lets you scrape live (real-time) stock prices from Yahoo Finance. In this article, we’ll go through a couple of ways of getting real-time data from Yahoo Finance for stocks, as well as how ...


How to download image files with RoboBrowser

July 16, 2018

In a previous post, we showed how RoboBrowser can be used to fill out online forms for getting historical weather data from Wunderground. This article will talk about how to use RoboBrowser to batch download collections of image files from Pexels, a site that offers free downloads. If you’re looking to work with images, or ...


R: How to create, delete, move, and more with files

July 11, 2018

Though Python is usually favored over R for system administration tasks, R is actually quite useful in this regard. In this post we’re going to talk about using R to create, delete, move, and obtain information on files.

How to get and change the current working directory

Before working with files, it’s usually ...


ICA on Images with Python

June 23, 2018

What is Independent Component Analysis (ICA)?

If you’re already familiar with ICA, feel free to skip below to how we implement it in Python. ICA is a type of dimensionality reduction algorithm that transforms a set of variables to a new set of components; it does so ...


Coding with the Yahoo_fin Package

January 24, 2018

The yahoo_fin package contains functions to scrape stock-related data from Yahoo Finance and NASDAQ. Official documentation is available, but the post below provides a few more in-depth examples. All of the functions in yahoo_fin are ...


Timing Python Processes

January 14, 2018

Python processes can be timed with several different packages. One of the most common ways is to use the standard library module time, which we’ll demonstrate with an example. However, another package that is very useful for timing a process, and particularly for telling you how far along a process has come, is tqdm.
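
As a rough illustration of timing with the standard library's time module, here is a minimal sketch. It is not taken from the original post: the helper time_call and the workload sum_of_squares are hypothetical names invented for this example, and time.perf_counter is one reasonable choice of clock among several.

```python
import time


def time_call(func, *args, **kwargs):
    """Run func once and return (result, elapsed_seconds).

    Hypothetical helper for illustration; not part of any library.
    """
    start = time.perf_counter()   # high-resolution clock for measuring intervals
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def sum_of_squares(n):
    """Example workload (an assumption): sum i*i for i in 0..n-1."""
    return sum(i * i for i in range(n))


total, seconds = time_call(sum_of_squares, 100_000)
print(f"sum_of_squares took {seconds:.4f} seconds")
```

For progress reporting rather than a single elapsed time, the third-party tqdm package can wrap an iterable, as in tqdm(range(n)), to display a progress bar while the loop runs.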


Underrated R Functions

December 30, 2017

I wanted to write a post about a couple of handy functions in R that don’t always get the recognition they deserve. This article will talk about a few functions that form part of R’s core functional programming capabilities. R has thousands of functions, so this is just a short list, and I’ll probably write ...


Vectorize Fuzzy Matching

December 11, 2017

One of the best things about R is its ability to vectorize code. This allows you to run code much faster than you would if you were using a for or while loop. In this post, we’re going to show you how to use vectorization to speed up fuzzy matching. First, a little bit of ...


Running R Code in Parallel

October 14, 2017

Background

Running R code in parallel can be very useful for speeding up performance. Basically, parallelization allows you to run multiple processes in your code simultaneously, rather than iterating over a list one element at a time or running a single process at a time. Thankfully, running R code in parallel is relatively simple ...

