Blog Archives

Comparing rental prices in the US market

December 10, 2017

Rental Prices in the US

As a fortunate resident of New York City, I get to enjoy the many activities offered by a city that never really turns off its lights. The price to pay for such entertainment, however, is the crazy rental fees. I have stopped counting the number of conversations I have had with friends about some of...


NFL Series

October 7, 2017

If you have previously attempted to analyze NFL data, it is likely that you have tried to scrape ESPN or football-reference, which provide a wealth of statistics surrounding game data. However, if you ever wanted to obtain truly in-depth data, then it...


Building a Kafka and Spark Streaming pipeline – Part I

September 24, 2016

Many companies across a multitude of industries are currently maintaining data pipelines used to ingest and analyze large data streams. In effect, the proper implementation of such pipelines belongs to the realm of “data engineering”, and represent...


A map of elevators in NYC

August 30, 2016

Not too long ago, I came across a random tweet pointing to a GitHub repository full of miscellaneous datasets. I was immediately excited by the various data science and visualization problems that I could take on and quickly decided to get my hand...


The biggest liars in US politics

June 10, 2016

Who lies the most in US politics? Most Americans, and anyone who follows US politics, will be aware of the tremendous changes and volatility that have struck the US political landscape in the past year. The ascent of Donald Trump from a billionaire entertainer to a fully fledged presidential candidate, alongside the unexpected popularity of Bernie Sanders and the nomination...


Data science with Docker

April 29, 2016

Using Docker to facilitate your data science pipelines

Until recently, like many fellow data scientists I have talked to, I built data science pipelines on my local machine or a remote host while relying on virtual environments. In doing so, I ensured some degree of replicability by keeping track of language versions, library versions, and so on. While...


Player and roster similarity in the NBA

March 16, 2016

Recently, professional sports associations and teams have made big strides towards leveraging data to inform both personnel and on-the-field decision making. While the four major leagues (NBA, NFL, MLB, NHL) vary in terms of where they are in that process, most people would argue that the NBA is at the forefront of this movement. If you have never heard...


Tracking Social Issues and Topics in Presidential Speeches

October 22, 2015

Scraping presidential transcripts

To begin, we must scrape the content of all presidential speeches recorded in American history. To do that, I’ll rely on the very handy BeautifulSoup library, and eventually store all data in a pandas dataframe that will be persisted in a pickle file.

# import required libraries to scrape presidential transcripts
from bs4 import BeautifulSoup
import pandas as pd
import...


Cloning a graph in Python

October 9, 2015

If you have ever played around with Algorithms & Data Structures, then you have most likely heard of Leetcode.com, which contains a number of famous (or infamous) technical questions. One of my favorites there is the graph clone question, which can be briefly stated as: Clone an undirected graph. Each node in the graph contains a label and...
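
To make the problem concrete, here is a minimal sketch of one common way to clone a graph, using a breadth-first traversal and a dictionary that maps each original node to its copy. It is not necessarily the approach taken in the full post. The excerpt is cut off before the complete problem statement, so the `Node` class below (a label plus a list of neighbors) and the `clone_graph` name are assumptions made for illustration.

```python
from collections import deque


class Node:
    """Hypothetical node type: a label plus a list of neighboring nodes."""

    def __init__(self, label):
        self.label = label
        self.neighbors = []


def clone_graph(start):
    """Return a copy of the connected graph reachable from `start`."""
    if start is None:
        return None

    # Map each original node to its copy so every node is cloned exactly once.
    copies = {start: Node(start.label)}
    queue = deque([start])

    while queue:
        original = queue.popleft()
        for neighbor in original.neighbors:
            if neighbor not in copies:
                copies[neighbor] = Node(neighbor.label)
                queue.append(neighbor)
            # Wire the copy of `original` to the copy of `neighbor`.
            copies[original].neighbors.append(copies[neighbor])

    return copies[start]


# Build a small triangle graph: a - b, b - c, c - a.
a, b, c = Node("a"), Node("b"), Node("c")
a.neighbors = [b, c]
b.neighbors = [a, c]
c.neighbors = [a, b]

clone = clone_graph(a)
```

The dictionary doubles as the "visited" set, which is what keeps the traversal from looping forever on cycles such as the triangle above.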


Prison Locations around the USA

September 8, 2015

I recently discovered the enigma.io resource, a repository of freely available public data with the following goals (as stated on their website): The volume of data created by governments and businesses is growing exponentially. Organizations struggle ju...

