Blog Archives

Machine learning for hackers

October 23, 2012

How do you prefer to learn new material: deep theoretical background first and practice later, or do you like to break things in order to fix them? If the latter is your way of learning, you will most likely enjoy Machine Learning for Hackers. The book has chapters on machine learning

Read more »

Garmin data visualization

October 4, 2012

People rage when governments initiate surveillance projects like CleanIT, yet they share very private data without a second thought. I have to admit that some data leaks are well buried in the process. Take, for example, Garmin, which produces GPS training devices for runners. In order to see your workouts you are forced to upload

Read more »

RStudio server through ssh

August 10, 2012

The R language has numerous IDEs: RStudio, a Vim plugin, an Eclipse plugin. RStudio really shines for R, although the Vim plugin may suit R well if you are a Vim guru. Eclipse? Who needs such a behemoth? It turns out a student in Ljubljana badly needs it. Most of the time I use a remote server for R

Read more »

Building a presentation, report or paper in R

August 1, 2012

If you need to build a presentation, you obviously have the following options: PowerPoint-like presentation software, online engines, and LaTeX. The first two are beloved by business people, and the third is widely used in academia. The objective of the first group is a shiny presentation, in contrast to the second, where asceticism and demand for automation are

Read more »

How to track Twitter unfollowers in R

July 18, 2012

I have a Twitter account, and it is relatively easy to see new followers or subscribers. However, I was looking for a way to find out who the unfollowers are. I have noticed that some (un)subscriptions happen in bulk, which made me think that either I tweeted some bullshit and upset a bunch of people or spam bots work

Read more »

Data mining for network security and intrusion detection

July 16, 2012

In preparation for the “Haxogreen” hackers' summer camp, which takes place in Luxembourg, I was exploring the world of network security. My motivation was to find out how data mining applies to network security and intrusion detection. The Flame virus, Stuxnet, and Duqu proved that static, signature-based security systems are not able to detect very advanced, government-sponsored

Read more »

My first competition at Kaggle

July 2, 2012

For me, Kaggle is becoming a social network for data scientists, as stackoverflow.com or github.com are for programmers. If you are a data scientist, machine learner, or statistician, you had better have a profile there; otherwise you do not exist. Nevertheless, I won’t bet on as rosy a future for data scientists as journalists suggest (sexy job for next

Read more »

GitHub data analysis

May 15, 2012

A few weeks ago GitHub announced that its timeline data is available on BigQuery for analysis. Moreover, it offers prizes for the best visualization of the data. Despite my limited art skills and minimal chances of winning a beauty contest, I decided to crunch the GitHub data and run an analysis. After an initial trial of the BigQuery service, I found hard

Read more »

Machine learning for identification of cars

April 22, 2012

There is plenty of data on the internet, but it is raw data. Think for a second about public surveillance cameras: useful for checking the traffic on a route or in a busy place, but anything else? What if you want to know how many cars are on the route? How many cars were there yesterday at the same time?

Read more »

How to organize R user group

April 18, 2012

The first thing you have to do is estimate how many users will be interested in a local R group. I would say that out of one million inhabitants you can expect 10-20 users. Based on this rough number, you will know what challenges await you. If you expect 100 or more users, you have

Read more »