# Articles by mrtnj

### ‘Simulating genetic data with R: an example with deleterious variants (and a pun)’

June 16, 2019

A few weeks ago, I gave a talk at the Edinburgh R users group EdinbR on the RAGE paper. Since this is an R meetup, the talk concentrated on the mechanics of genetic data simulation, with the paper as a case study. I showed off some of what Chris ...

### What single step does with relationship

May 28, 2019

We had a journal club about the single step GBLUP method for genomic evaluation a few weeks ago. In this post, we’ll make a few graphs of how the single step method models relatedness between individuals. Imagine you want to use genomic selection in a breeding program that already ...

### Using R: plotting the genome on a line

March 31, 2019

Imagine you want to make a Manhattan-style plot or anything else where you want a series of intervals laid out on one axis after one another. If it’s actually a Manhattan plot you may have a friendly R package that does it for you, but here is how to ...
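One way to lay intervals out after one another is to give each chromosome an offset equal to the summed length of the chromosomes before it. The following is a minimal base R sketch under that assumption, not necessarily the approach taken in the post; the chromosome lengths, marker positions, and column names are all made up for illustration.

```r
# Hypothetical chromosome lengths and marker positions, made up for illustration
chr_lengths <- c(chr1 = 100, chr2 = 80, chr3 = 50)
markers <- data.frame(chr = c("chr1", "chr2", "chr2", "chr3"),
                      pos = c(10, 5, 70, 25))

# Each chromosome starts where the previous ones end:
# the first offset is 0, the rest are cumulative sums of the preceding lengths
offsets <- c(0, cumsum(chr_lengths)[-length(chr_lengths)])
names(offsets) <- names(chr_lengths)

# Position on the shared axis = position within chromosome + chromosome offset
markers$genome_pos <- markers$pos + unname(offsets[as.character(markers$chr)])

markers
```

The `genome_pos` column can then go straight on the x axis of a plot, with the offsets marking where each chromosome begins.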

### Showing a difference in means between two groups

January 13, 2019

Visualising a difference in mean between two groups isn’t as straightforward as it should be. After all, it’s probably the most common quantitative analysis in science. There are two obvious options: we can either plot the data from the two groups separately, or we can show the estimate of ...

### Using R: the best thing I’ve changed about my code in years

December 1, 2018

Hopefully, one’s coding habits are constantly improving. If you feel any doubt about yourself, I suggest looking back at something you wrote in 2011. One thing I’ve changed recently that made my life so much better is a simple silly thing: meaningful names for index and counter variables. Take a ...
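As a small illustration of the idea, here is a nested loop where the index names say what they count instead of being `i` and `j`. The data and the names `family_ix` and `litter_ix` are hypothetical, invented for this sketch rather than taken from the post.

```r
# Hypothetical data: litter sizes per family, made up for illustration
family_sizes <- list(fam_a = c(3, 5), fam_b = c(2, 2, 4))

totals <- numeric(length(family_sizes))

# The index names state what is being iterated over,
# so the nested subsetting below is easier to read than with i and j
for (family_ix in seq_along(family_sizes)) {
  for (litter_ix in seq_along(family_sizes[[family_ix]])) {
    totals[family_ix] <- totals[family_ix] +
      family_sizes[[family_ix]][litter_ix]
  }
}

totals
```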

### Using R: reshape2 to tidyr

December 17, 2017

Tidy data — it’s one of those terms that tend to confuse people, and certainly confused me. It’s Codd’s third normal form, but you can’t go around telling that to people and expect to be understood. One form is “long”, the other is “wide”. One form is “...

### Scripting for data analysis (with R)

July 30, 2017

Course materials (GitHub). This was a PhD course given in the spring of 2017 at Linköping University. The course was organised by the graduate school Forum scientium and was aimed at people who might be interested in using R for data analysis. The materials developed from a part of a ...

### Summer of data science 1: Genomic prediction machines #SoDS17

July 9, 2017

Genetics is a data science, right? One of my Summer of data science learning points was to play with out of the box prediction tools. So let’s try out a few genomic prediction methods. The code is on GitHub, and the simulated data are on Figshare. Genomic selection is ...

### Using R: When using do in dplyr, don’t forget the dot

May 21, 2017

There will be a few posts about switching from plyr/reshape2 for data wrangling to the more contemporary dplyr/tidyr. My most common use of plyr looked something like this: we take a data frame, split it by some column(s), and use an anonymous function to do something useful. ...

### Mutation, selection, and drift (with Shiny)

May 14, 2017

Imagine a gene that comes in two variants, where one of them is deleterious to the carrier. This is not so hard to imagine, and it is often the case. Most mutations don’t matter at all. Of those that matter, most are damaging. Next, imagine that the mutation happens ...

### Using R: a function that adds multiple ggplot2 layers

April 23, 2017

Another interesting thing that an R course participant identified: Sometimes one wants to make a function that returns multiple layers to be added to a ggplot2 plot. One could think that just adding them and returning would work, but it doesn’t. I think it has to do with how + ...

### Using R: Don’t save your workspace

April 2, 2017

To everyone learning R: Don’t save your workspace. When you exit an R session, you’re faced with the question of whether or not to save your workspace. You should almost never answer yes. Saving your workspace creates an image of your current variables and functions, and saves them ...

### It seems dplyr is overtaking correlation heatmaps

March 8, 2017

(… on my blog, that is.) For a long time, my correlation heatmap with ggplot2 was the most viewed post on this blog. It still leads the overall top list, but by far the most searched and visited post nowadays is this one about dplyr (followed by its sibling about ...

### Using R: tibbles and the t.test function

February 12, 2017

A participant in the R course I’m teaching showed me a case where a tbl_df (the new flavour of data frame provided by the tibble package; standard in new RStudio versions) interacts badly with the t.test function. I had not seen this happen before. The reason is ...

### Balancing a centrifuge

June 11, 2016

I saw this cute little paper on arXiv about balancing a centrifuge: Peil & Hauryliuk (2010) A new spin on spinning your samples: balancing rotors in a non-trivial manner. Let us have a look at the maths of balancing a centrifuge. The way I think most people (including myself) balance their samples ...

### Toying with models: The Game of Life with selection

February 29, 2016

Conway’s Game of Life is probably the most famous cellular automaton, consisting of a grid of cells developing according to simple rules. Today, we’re going to add mutation and selection to the game, and let patterns evolve. The fate of a cell depends on the number of cells that ...
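For reference, the standard rules without any mutation or selection can be sketched in a few lines of base R. This is only an illustration of Conway's ordinary rules, not the code from the post; the function name `life_step` is made up, and cells beyond the edge of the grid are assumed to be dead.

```r
# One generation of the standard Game of Life on a 0/1 integer matrix.
# Assumption for this sketch: everything outside the grid counts as dead.
life_step <- function(grid) {
  n <- nrow(grid)
  m <- ncol(grid)

  # Pad with a border of dead cells so every cell has eight neighbours
  padded <- matrix(0L, n + 2, m + 2)
  padded[2:(n + 1), 2:(m + 1)] <- grid

  # Count live neighbours by summing the eight shifted copies of the grid
  neighbours <- matrix(0L, n, m)
  for (row_shift in -1:1) {
    for (col_shift in -1:1) {
      if (row_shift == 0 && col_shift == 0) next
      neighbours <- neighbours +
        padded[(2:(n + 1)) + row_shift, (2:(m + 1)) + col_shift]
    }
  }

  # A live cell survives with 2 or 3 live neighbours;
  # a dead cell comes alive with exactly 3
  alive <- (grid == 1 & (neighbours == 2 | neighbours == 3)) |
    (grid == 0 & neighbours == 3)
  matrix(as.integer(alive), n, m)
}

# A "blinker": three cells in a row that flip between horizontal and vertical
blinker <- matrix(0L, 5, 5)
blinker[3, 2:4] <- 1L
life_step(blinker)
```

Applying `life_step` twice to the blinker brings it back to where it started, which makes it a handy sanity check.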

### Toying with models: The Luria–Delbrück fluctuation test

February 19, 2016

I hope that Genetics will continue running expository papers about their old classics, like this one by Philip Meneely about Luria & Delbrück (1943). Luria & Delbrück performed an experiment on bacteriophage resistance in Escherichia coli, growing bacterial cultures, exposing …

### R in genomics @ SciLifeLab, Solna

March 24, 2015

Dear diary, I went to the Stockholm R useR group meetup on R in genomics at the Stockholm node of SciLifeLab. It was nice. If I worked a bit closer, I would attend meetups all the time. I even got to be pretentious with my notebook while waiting for ...

### Finding the distance from ChIP signals to genes

July 4, 2014

I’ve had a couple of months off from blogging. Time for some computer-assisted biology! Robert Griffin asks on Stack Exchange about finding the distance between HP1 binding sites and genes in Drosophila melanogaster. We can get a rough idea with some public chromatin immunoprecipitation data, R and the wonderful ...

### More fun with %.% and %>%

March 27, 2014

The %.% operator in dplyr allows one to put functions together without lots of nested parentheses. The flanking percent signs are R’s way of denoting infix operators; you might have used %in% which corresponds to the match function or %*% which is matrix multiplication. The %.% operator is also called chain, and ...
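To make the point about percent signs concrete, here is a small base R sketch. It checks that `%in%` gives the same answer as a call to `match`, and then defines a toy infix operator. The name `%then%` is hypothetical, invented for this example; it is not dplyr's `%.%` or any other package's operator, only a stand-in for how such an operator can be built.

```r
# %in% corresponds to match(): TRUE wherever match() finds a position
x <- c("a", "b", "z")
lookup <- c("a", "b", "c")
in_result <- x %in% lookup
match_result <- !is.na(match(x, lookup))
identical(in_result, match_result)

# Any function whose name is flanked by percent signs can be used infix.
# %then% is a made-up operator that passes the left side to the function on the right.
`%then%` <- function(lhs, f) f(lhs)

# Reads left to right, the same as sum(sqrt(c(1, 4, 9)))
chained <- c(1, 4, 9) %then% sqrt %then% sum
chained
```

Because user-defined `%...%` operators group from left to right, the chained form is evaluated in reading order without nested parentheses.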