An R flaw: unexpected attribute droppings

[This article was first published on Odd Hypothesis, and kindly contributed to R-bloggers].

Today I was putting some code together that made plots from slices of a 3-dimensional array object aa. A couple of the dimensions in aa had names defined by named vectors. For example:

    > aa = array(runif(2*3*4),
                 dim=c(2,3,4),
                 dimnames=list(id  = c(good='id1', evil='id2'),
                               x   = c(1,2,3),
                               var = c(up='a', dn='b', lt='c', rt='d')))
    > str(aa)
     num [1:2, 1:3, 1:4] 0.0138 0.2942 0.7988 0.3465 0.8751 ...
     - attr(*, "dimnames")=List of 3
      ..$ id : Named chr [1:2] "id1" "id2"
      .. ..- attr(*, "names")= chr [1:2] "good" "evil"
      ..$ x  : chr [1:3] "1" "2" "3"
      ..$ var: Named chr [1:4] "a" "b" "c" "d"
      .. ..- attr(*, "names")= chr [1:4] "up" "dn" "lt" "rt"

Thus, I could access “aliases” for dimension names in id and var by:

    > names(dimnames(aa)$id)
    [1] "good" "evil"
    > names(dimnames(aa)$var)
    [1] "up" "dn" "lt" "rt"

The code I wrote would iterate over the 3rd dimension, using the resulting 2D arrays to produce a series of plots with matplot(). To make legends more readable, I made use of the names attribute for dimnames as above. In the first version, I used apply() to do the iterating:

    > apply(aa, 3, function(xy) {
        x = as.numeric(dimnames(xy)$x)
        matplot(x, y=t(xy))
        legend('topleft', legend=names(dimnames(xy)$id), fill=1:nrow(xy))

        NULL
      })

This worked perfectly fine. Later, however, I decided it would be more informative to use the names in the iterating dimension for a plot title, so I refactored a bit to use sapply():

    > sapply(1:dim(aa)[3], function(k) {
        xy = aa[,,k]

        x = as.numeric(dimnames(xy)$x)
        matplot(x, y=t(xy))
        legend('topleft', legend=names(dimnames(xy)$id), fill=1:nrow(xy))

        title(main=names(dimnames(aa)$var[k]))

        NULL
      })

I was a little surprised that this threw an error indicating that the names associated with dimnames(aa)$id were nonexistent:

     Error in legend("topleft", legend = names(dimnames(xy)$id), fill = 1:nrow(xy)) :
      'legend' is of length 0

Upon inspection, it seems that R’s default behavior is to drop attributes on dimnames (here, the names on each dimnames vector) when an array is subsetted.

    > str(aa[,,1])
     num [1:2, 1:3] 0.0138 0.2942 0.7988 0.3465 0.8751 ...
     - attr(*, "dimnames")=List of 2
      ..$ id: chr [1:2] "id1" "id2"
      ..$ x : chr [1:3] "1" "2" "3"
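The drop can be reproduced without any plotting. The following is a minimal sketch, assuming a plain base R session; the small array `a` and its labels are made up for illustration and only mirror the structure of aa. It checks a plain slice, a slice taken with drop = FALSE, and, for contrast, the slices that apply() hands to its function.

```r
# Illustrative array (not the post's aa): named character vectors as dimnames
a <- array(1:24, dim = c(2, 3, 4),
           dimnames = list(id  = c(good = "id1", evil = "id2"),
                           x   = c("1", "2", "3"),
                           var = c(up = "a", dn = "b", lt = "c", rt = "d")))

# The full array still carries the names on its dimnames
names(dimnames(a)$id)

# A plain slice no longer has them
names(dimnames(a[, , 1])$id)

# Neither does a slice taken with drop = FALSE
names(dimnames(a[, , 1, drop = FALSE])$id)

# The slices seen inside apply() do have them: one column of names per slice
kept <- apply(a, 3, function(xy) names(dimnames(xy)$id))
kept
```

The contrast with apply() suggests the loss happens in the `[` indexing itself rather than in anything specific to sapply().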

Adding drop=FALSE to the indexing doesn’t help: the names on the dimnames are still dropped. The only fix I could come up with was to reassign the additional attributes after subsetting:

    > sapply(1:dim(aa)[3], function(k) {
        xy = aa[,,k]

        # !! recover additional dimname attributes
        #    dropped by subsetting !! #
        dimnames(xy) = dimnames(aa)[names(dimnames(aa)) %in% names(dimnames(xy))]

        x = as.numeric(dimnames(xy)$x)
        matplot(x, y=t(xy))
        legend('topleft', legend=names(dimnames(xy)$id), fill=1:nrow(xy))

        title(main=names(dimnames(aa)$var[k]))

        NULL
      })
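One way to keep this workaround out of the plotting code is to wrap it in a small helper. This is only a sketch: slice_keep_names() is a hypothetical name, not a base R function, and it assumes the slice is taken at a single index along a single dimension. It rebuilds the slice with array() and the untouched dimnames of the remaining dimensions, which appears to be how apply() ends up preserving the names. The array `a` is again illustrative data.

```r
# Hypothetical helper (not part of base R): take slice k along dimension
# `margin` and re-attach the original dimnames, including their names.
slice_keep_names <- function(arr, k, margin = length(dim(arr))) {
  # Index list: select everything (TRUE) except index k on `margin`
  idx <- rep(list(TRUE), length(dim(arr)))
  idx[[margin]] <- k
  slice <- do.call(`[`, c(list(arr), idx))

  # Rebuild the slice with the unmodified dimnames of the other dimensions
  array(slice, dim = dim(arr)[-margin], dimnames = dimnames(arr)[-margin])
}

# Illustrative data, mirroring the structure of aa in the post
a <- array(1:24, dim = c(2, 3, 4),
           dimnames = list(id  = c(good = "id1", evil = "id2"),
                           x   = c("1", "2", "3"),
                           var = c(up = "a", dn = "b", lt = "c", rt = "d")))

xy <- slice_keep_names(a, 2)
names(dimnames(xy)$id)
```

Inside the sapply() loop, xy = slice_keep_names(aa, k) would then stand in for both the subsetting and the dimnames reassignment.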

To the greater R community, I ask – is this behavior a flaw, or was it done on purpose? If the latter, I pleadingly ask WHYYYYyyyyyyyy!

