Formula for Kickstarter Success: Copious Planning (just like real life)

September 24, 2012

It seems like the press can’t wait two days between awful, depressing articles about how Kickstarter encourages fraud, broken promises, and tears. Today I’d like to take you through a project that shipped; let’s see what we can learn. I interviewed Diana Rodgers of Wonder Threads. Diana sought funding to expand her tech … Continue reading...

Read more »

Transition probabilities when adjacent sequence items must be different

September 24, 2012

Generating a random sequence from a fixed set of items is a common requirement, e.g., given the items A, B and C we might generate the sequence BACABCCBABC. Often the randomness is tempered by requirements such as each item appearing a given number of times in a sequence of a given length,
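As a rough illustration of the constraint in the title (not the method from the post itself), here is a minimal R sketch that draws each item uniformly from the items other than the previous one. The function name random_no_repeat is hypothetical, and the sketch does not enforce a fixed count per item.

```r
# Hypothetical helper: random sequence in which adjacent items always differ.
# Assumes `items` is a character vector with at least two distinct values.
random_no_repeat <- function(items, n) {
  stopifnot(length(unique(items)) >= 2, n >= 1)
  items <- unique(items)
  out <- character(n)
  out[1] <- items[sample.int(length(items), 1)]
  for (i in seq_len(n - 1) + 1) {
    # Exclude the previous item, then pick uniformly among the rest
    candidates <- setdiff(items, out[i - 1])
    out[i] <- candidates[sample.int(length(candidates), 1)]
  }
  out
}

set.seed(42)
s <- random_no_repeat(c("A", "B", "C"), 11)
paste(s, collapse = "")
```

With three items, each step moves to one of the two other items with probability 1/2, so item frequencies are balanced only on average.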

Read more »

Top 20 Data Visualization Tools

September 24, 2012

Every researcher or practitioner of quality (or pretty much any other subject, for that matter) needs a great toolbox packed with flexible visualization tools. I am very happy to see this list

Read more »

Learn R and Python, and Have Fun Doing It

September 24, 2012

If you need to catch up on all those years you spent not learning how to code (you need to know how to code), here are a few resources to help you quickly learn R and Python, and have a little fun doing it. First, the free online Coursera course Co...

Read more »

From continuous to categorical

September 24, 2012

During data analysis, it is often super useful to turn continuous variables into categorical ones. In Stata you would do something like this:
gen catvar=0
replace catvar=1 if contvar>0 & contvar<=3
replace catvar=2 if contvar>3 & co...
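A rough R counterpart, assuming the base function cut() and made-up cut points, might look like the sketch below. The upper bound of 6 and the labels are assumptions, since the Stata snippet is truncated, and the sample values are invented for illustration.

```r
# Illustrative data (invented values)
contvar <- c(0.5, 3, 3.5, 6, 7)

# cut() uses right-closed intervals by default: (0,3] and (3,6].
# The break at 6 is an assumed upper bound, not taken from the post.
catvar <- cut(contvar, breaks = c(0, 3, 6), labels = c("low", "high"))

table(catvar, useNA = "ifany")
```

Note that values outside the breaks become NA here, whereas the Stata version starts every observation at 0.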

Read more »

Data Frames and Transactions

September 24, 2012

Transactions are a very useful tool in data mining. They provide a way to mine itemsets or rules on datasets. In R the data must be in transactions form. If the data is only available in a data.frame, then to coerce the data frame to transactions the researcher may use the

Read more »

Coursera’s free online R course starts today

September 24, 2012

Coursera offers a number of on-line courses, all available for free and taught by experts in their fields. Today, the course Computing for Data Analysis begins. Taught by Johns Hopkins Biostatistics professor (and co-author of the Simply Statistics blog) Roger Peng, the course will teach you how to program in R and use the language for data analysis. Here's...

Read more »

An R Users’ Group in Davis

September 24, 2012

I’m excited to share that we’ve started a new R users’ group at UC Davis! Right now our main purpose is to run weekly 2-hour work/hack sessions where R users can get together to work through problems. More info here

Read more »

Example 10.3: Enhanced scatterplot with marginal histograms

September 24, 2012

Back in example 8.41 we showed how to make a graphic combining a scatterplot with histograms of each variable. A commenter suggested we change the R graphic to allow post-hoc plotting of, for example, lowess lines. In addition, there are further refinements to be made. In this R-only entry, we'll make the figure...

Read more »

Use GBIF and googleVis to Make Maps with Species Occurrence Data

September 24, 2012

This is a short follow-up on THIS posting. I will briefly show how to use the dismo and googleVis packages to plot species occurrences on an interactive Google map, like the one below (HERE is the R-script).

Computing kook density in R

September 24, 2012

Do you ever see strange lights in the sky? Do you wonder what really goes on in Area 51? Would you like to use your R hacking skills to get to the bottom of the whole UFO conspiracy? Of course, you would! UFO data from infochimps is the focus of a dat...

Read more »

qgraph version 1.1.0 and how to simply make a GUI using ‘rpanel’

September 24, 2012

Last week I updated the ‘qgraph’ package to version 1.1.0, available on CRAN now. Besides some internal changes (especially the self-loops have been substantially improved) the most important change is the addition of a GUI, which can be … Continue reading →

Read more »

The fear-index: is the VIX efficient to be warned about high volatility? (Finance & Systematic Processus)

September 24, 2012

Simple visually-weighted regression plots

September 24, 2012

There has recently been a lot of discussion of so-called “visually-weighted regression” plots. Folk hero Hadley Wickham suggests that such plots would be easy to implement with ggplot2, and so I have attempted to prove him right. The approa...

Read more »

New Zealand school performance: beyond the headlines

September 24, 2012

I like the idea of having data on school performance, not to directly rank schools—hard, to say the least, at this stage—but because we can start having a look at the factors influencing test results. I imagine the opportunity in … Continue reading →

Read more »

Variance targeting in garch estimation

September 24, 2012

What is variance targeting in garch estimation? And what is its effect? Previously, related posts are: A practical introduction to garch modeling; Variability of garch estimates; garch estimation on impossibly long series. The last two of these show the variability of garch estimates on simulated series where we know the right answer. In response to … Continue reading...

Read more »

Popularity indicator, with images (NFL)

September 23, 2012

It’s Friday night, there’s nothing good on TV, mmm conditions are perfect for shaggin about in R. So I’m an NFL fan, and (shameless plug) avid fan of this NFL podcast. They run their own pickem league which unless users … Continue reading →

Read more »

Universal portfolio, part 11

September 23, 2012

First an apology, the links to the Universal Portfolio paper have stopped working.  This is because the personal webpage of Thomas Cover at Stanford has been taken down, but fortunately the content moved elsewhere.  The new link is Universal ...

Read more »

Minimum Correlation Algorithm Example

September 23, 2012

Today I want to follow up with the Minimum Correlation Algorithm Paper post and show how to incorporate the Minimum Correlation Algorithm into your portfolio construction work flow and also explain why I like the Minimum Correlation Algorithm. First, let’s load the ETF’s data set used in the Minimum Correlation Algorithm Paper using the Systematic

Read more »

Video: Analyzing Big Data using Oracle R Enterprise

September 23, 2012

Learn how Oracle R Enterprise is used to generate new insight and new value to business, answering not only what happened, but why ...

Read more »

Football model; plots and usage

September 23, 2012

After reading data, making a predictions display and building a football data model, it is time to validate the model a bit more (regression plots) and put it to use. It appears that the regression plots in the car package were not ...

Read more »

Project Euler — problem 20

September 23, 2012

It’s been quite a while since my last post on Euler problems. Today a visitor posted a nice solution to the second problem, which encouraged me to keep solving these problems. Just for fun! 10! = 10 * 9 * … * 3 * 2 * 1 … Continue reading →

Read more »

The infamous apply function

September 23, 2012

For R beginners, the apply() function seems like a secret doorway into programming bliss. It seems so powerful, and yet, beyond reach. For those just starting out, examples of how to use apply() can really help with the intuition of how to h...
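For a flavour of the kind of example that helps, here is a minimal base R sketch (not taken from the post) that applies a function over the rows and columns of a small matrix. The matrix and variable names are made up for illustration.

```r
# A 2 x 3 matrix, filled column by column: rows are (1, 3, 5) and (2, 4, 6)
m <- matrix(1:6, nrow = 2)

# MARGIN = 1 applies the function to each row, MARGIN = 2 to each column
row_sums <- apply(m, 1, sum)
col_sums <- apply(m, 2, sum)

# Any function works, including an anonymous one: here, the range of each row
row_ranges <- apply(m, 1, function(r) max(r) - min(r))

print(row_sums)
print(col_sums)
print(row_ranges)
```

For plain sums, rowSums() and colSums() are the faster built-ins; apply() earns its keep when the function is something custom.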

Read more »

Text Analysis Tutorial on Spam Email in R

September 23, 2012

Hi everyone – I just wrote a tutorial on text analysis in R using the tm and wordcloud packages. Thought some of you here might be interested in it: text-analysis-75-925

Read more »

Spacing measures: heterogeneity in numerical distributions

Numerically-coded data sequences can exhibit a very wide range of distributional characteristics, including near-Gaussian (historically, the most popular working assumption), strongly asymmetric, light- or heavy-tailed, multi-modal, or discrete (e.g., count data).  In addition, numerically coded values can be effectively categorical, either ordered, or unordered.  A specific example that illustrates the range of distributional behavior often seen in a collection...

Read more »

Maximum likelihood estimates for multivariate distributions

September 22, 2012

Consider our loss-ALAE dataset and, as in Frees & Valdez (1998), let us fit a parametric model in order to price a reinsurance treaty. The dataset is the following:
> library(evd)
> data(lossalae)
> Z=lossalae
> X=Z;Y=Z
The first step can be to estimate marginal distributions, independently. Here, we consider lognormal distributions for both components:
> Fempx=function(x) mean(X<=x)
>...

Read more »

Good programming practices in R

September 22, 2012

I write sloppy R scripts. It is a byproduct of working with a high-level language that allows you to quickly write functional code on the fly (see this post for a nice description of the problem in Python code) and the result of my limited formal training in computer programming. The lack of formal training

Read more »

KLEMS (1)

September 22, 2012

This post is actually a homework I did. The data file contains input use, output, quantities, costs, and prices for total U.S. nondurable manufacturing for 1949-2001. The data are defined as follows: , , , , = Inputs corresponding to capital, labor, energy, materials, and purchased services, = represents total output, = respective quantity indexes, ...

Read more »
