November 2010

Feature selection: Using the caret package

November 16, 2010 | Allan Engelhardt

Feature selection is an important step for practical commercial data mining, which is often characterised by data sets with far too many variables for model building. In a previous post we looked at all-relevant feature selection using the Boruta package, while in this post we consider the same (artificial, toy) ...
[Read more...]

Data Science meets Humanities

November 16, 2010 | David Smith

There's an interesting article in the NYT today about the emerging discipline of "digital humanities": extracting digital data from historical archives to answer questions from the Arts and Humanities. From the article: Members of a new generation of digitally savvy humanists argue it is time to stop looking for inspiration ... [Read more...]

Postdoc in Wharton

November 16, 2010 | xi'an

Just received this email from José Bernardo about an exciting postdoc position in Wharton: POST-DOCTORAL FELLOW – DEPARTMENT OF STATISTICS, THE WHARTON SCHOOL The Department of Statistics at The Wharton School of the University of Pennsylvania is seeking candidates for a Post-Doctoral Fellowship. This research fellowship provides full funding without any ... [Read more...]

Loops in R: Think different

November 15, 2010 | David Smith

Especially for programmers that come to R from other languages, R sometimes gets dinged about the speed of its for loops. But a lot of the time, where you might have needed an iterative loop in another language to solve a specific task, you don't need a for loop in ... [Read more...]

Isarithmic History of the Two-Party Vote

November 15, 2010 | d sparks

A few weeks ago, I shared a series of choropleth maps of U.S. presidential election returns, illustrating the relative support for Democratic, Republican, and third-party candidates since 1920. The granularity of these county-level results led me to wonder whether it would be possible to develop an isarithmic map ... [Read more...]

Introducing Monte Carlo in PaRis

November 14, 2010 | xi'an

As already announced on Statisfaction, I will start a short [14 hour] course in English based on Introducing Monte Carlo Methods with R at ENSAE next Tuesday. The slides were written by George Casella for a course he gave in Italy last spring and he kindly agreed on making them available ... [Read more...]

ZAT! 2010

November 13, 2010 | romain francois

Tomorrow is the last day to enjoy the first edition of Montpellier's ZAT! (Zones Artistiques Temporaires). I was there this afternoon and tonight, but I found it much more picture worthy tonight: Other people have also taken pictures and sha... [Read more...]

Reporting Standard Errors for USL Coefficients

November 13, 2010 | Neil Gunther

In a recent Guerrilla CaP Group discussion, Baron S. wrote: "... Using gnuplot against the dataset I gave, I get sigma 0.0207163 +/- 0.001323 (6.385%), kappa 0.000861226 +/- 5.414e-05 (6.287%)." The Gnuplot output includes the errors for each of the universal scalability law (USL) coefficients. A question about the magnitude of these errors ... [Read more...]

My Day at ACM Data Mining Camp III

November 13, 2010 | Ryan Rosario

My first time at ACM Data Mining Camp was so awesome that I was thrilled to make the trip up to San Jose for the November 2010 version. In July, I gave a talk at the Emerging Technologies for Online Learning Symposium conference with a faculty member in the Department of ... [Read more...]

Programming with R – Checking Data Types

November 13, 2010 | Ralph

There are a number of useful functions in R that test the variable type or convert between different variable types. These can be used to validate function input to ensure that sensible answers are returned from a function or to ensure that the function doesn’t fail. Following on from ... [Read more...]

Because it’s Friday: Asteroids

November 12, 2010 | David Smith

A huge mass of rock hurtling in from space could really make a mess of your weekend plans. So it's comforting to know that the world's astronomers are out there keeping an eye out for any potential earth-grazers. See their discoveries over the past 30 years in this beautifully-designed animation: Earth crossers ... [Read more...]

New R User Group in Cincinnati / Dayton

November 12, 2010 | David Smith

The latest local R user group to join the fold is CinDay RUG, serving the Cincinnati/Dayton area in Ohio. The group was founded by Stu Rodgers, who decided to set it up after posting a query on LinkedIn and finding several other R users in the area. Even if ... [Read more...]

Update: Forbes wants your R stories by Nov 17

November 12, 2010 | David Smith

I mentioned recently that Forbes is seeking stories about R for a forthcoming issue. Well, the story will now be in the December issue (bumped up from the January issue), so be sure to post your stories about R to the Mean Business blog by November 17. Forbes: Names ... [Read more...]