2042 search results for "regression"

Building a DGA Classifier: Part 3, Model Selection

October 6, 2014

This is part three of a three-part blog series on building a DGA classifier. The series is split into the three phases of building a classifier: 1) Data preparation, 2) Feature engineering, and 3) Model selection (this post). Back in part 1, we prepared the data, and we are starting with a nice clean list of domains labeled as either legitimate (“legit”) or generated by an algorithm (“dga”)...

TBATS with regressors

October 5, 2014

I’ve received a few emails about including regression variables (i.e., covariates) in TBATS models. As TBATS models are related to ETS models, tbats() is unlikely to ever include covariates as explained here. It won’t actually complain if you include an xreg argument, but it will ignore it. When I want to include covariates in a...

I don’t want to learn R! SPSS is fine! (responses)

October 2, 2014

I frequently find myself thinking of the best way to convince an SPSS user to make the switch to R. In the process, I came up with the following three most common objections from SPSS users, along with my responses. Objection #1) I can’t use R because I don’t know how to program. Response: Of ...

Building a DGA Classifier: Part 2, Feature Engineering

October 2, 2014

This is part two of a three-part blog series on building a DGA classifier. The series is split into the three phases of building a classifier: 1) Data preparation, 2) Feature engineering, and 3) Model selection. Back in part 1, we prepared the data, and we are starting with a nice clean list of domains labeled as either legitimate (“legit”) or generated by...

Why are we still teaching T-tests?

September 30, 2014

The following post by Norm Matloff originally appeared on his blog, Mad(Data)Scientist, on September 15th. We rarely republish posts that have appeared on other blogs; however, the questions that Norm raises about the teaching of statistics, and his assertion that "R's statistical procedures are centered far too much on significance testing," deserve a second look. Moreover,...

Example 2014.11: Contrasts the basic way for R

September 30, 2014

As we discuss in section 6.1.4 of the second edition, R and SAS handle categorical variables and their parameterization in models quite differently. SAS treats them on a procedure-by-procedure basis, which leads to some odd differences in capabilities and default parameterizations. For example, in the logistic procedure, the default is effect cell coding, while in the genmod...

seeking altruistic social scientists, demographers, survey researchers

September 30, 2014

hi everyone, please share this: if you are an experienced user of a publicly-available survey data set from any country or international organization, let's work together on some user-friendly code and a short blog post for http://asdfree.com. ...

Recognizing Patterns in the Purchase Process by Following the Pathways Marked By Others

September 27, 2014

Herbert Simon's "ant on the beach" does not search for food in a straight line because the environment is not uniform with pebbles, pools and rough terrain. At least the ant's decision making is confined to the 3-dimensional space defining the beach. C...

Estimating Generalization Error with the PRESS statistic

September 25, 2014

As we’ve mentioned on previous occasions, one of the defining characteristics of data science is the emphasis on the availability of “large” data sets, which we define as “enough data that statistical efficiency is not a concern” (note that a “large” data set need not be “big data,” however you choose to define it). In...

DescTools: a new R "misc package"

September 25, 2014

by Joseph Rickert. One of the most difficult things about R, a problem that is particularly vexing to beginners, is finding things. This is an unintended consequence of R's spectacular, but mostly uncoordinated, organic growth. The R core team does a superb job of maintaining the stability and growth of the R language itself, but the innovation engine for...
