Blog Archives

Natural language processing tutorial

June 25, 2013

Introduction: This will serve as an introduction to natural language processing. I adapted it from slides for a recent talk at Boston Python. We will go from tokenization to feature extraction to creating a model using a machine learning algorithm. The goal is to provide a reasonable baseline on top of which more complex natural language processing can be...

Read more »

My talk at Boston Python

June 25, 2013

I just gave a talk at Boston Python about natural language processing in general, and edX ease and discern in particular. You can find the presentation source here, and the web version of it here. There is a video of it here. Nelle Varoquaux and Micha...

Read more »

How Many Data Scientists Are There?

August 9, 2012

I've seen a lot of articles lately about “Big Data” and the looming “talent gap.” This article from the Wall Street Journal is a good example. It cites a McKinsey estimate that we will need 1.5 million more managers and analysts who are conversant with “big data.” Of...

Read more »

Tracking US Sentiments Over Time In Wikileaks

June 18, 2012

Introduction: I recently posted about using the Wikileaks cable corpus to find word use patterns, both over time and in secret cables vs unclassified cables. I received a lot of good suggestions for further topics to pursue with the corpus, and probably the most interesting was the idea to do sentiment analysis over time on a variety of...

Read more »

Finding word use patterns in Wikileaks cables

June 12, 2012

6/18: A follow-up to this post is now available here. Recent Discoveries: When I was a diplomat, I was always interested in the Wikileaks cables and what could be done with them. Unfortunately, I never got a chance to look at the site in depth, due to security policies. Now that the ex- is firmly prepended to diplomat...

Read more »

NBA Playoffs Update 5 (5-4)

June 9, 2012

This is the sixth post in my series on predicting the NBA playoffs with an algorithm. After the Boston loss in their last game, the algorithm is now 5-4 in the playoffs. Hopefully it is correct tonight! Open Sourcing the Code: I have had a couple of re...

Read more »

NBA Playoff Predictions Update 4 (5-3)

June 7, 2012

This is update 4 to my original post about predicting the NBA playoffs with R. With the Thunder beating the Spurs and the Heat losing to the Celtics, the algorithm went 1-1 on predictions, making it 5-3 so far. Making some improvements: I have been posting for some time about incorporating more data into the models, and I...

Read more »

NBA Playoff Predictions Update 3 (4-2)

June 5, 2012

This is my third update to my original post on predicting the NBA playoffs with an algorithm. Here are updates 1 and 2. The algorithm correctly predicted a Boston win, but missed on the Spurs/Thunder game, so it is currently 4-2. Haven't had any time...

Read more »

NBA Playoff Predictions Update 2 and Results (3-1)

June 3, 2012

This is my second follow-up to my previous two posts about predicting NBA games with an algorithm, and my first update to the algorithm. The algorithm's record is now 3-1, as it correctly predicted Boston and Oklahoma City as winners of the...

Read more »

Predicting NBA Playoff Games – Results and Update 1

June 1, 2012

Game Results: I recently made a post about developing an algorithm to predict the NBA playoffs, and I concluded with two predictions. Although Miami beat the Celtics to make my algorithm 1-0 in terms of predictions, it fell to 1-1 when the Thunder beat th...

Read more »
