reticulate allows us to toggle between R and python in the same session, calling R objects when running python scripts and vice versa. When calling R data structures in python, the R structures are converted to the equivalent python structures where applicable. However, like translating English to Mandarin, translating R structures to python may not be straightforward, as we will see later.
vector (more specifically atomic vector)
matrix (special kind of array which is 2 dimensional)
In this post, we will look at translating R’s vector into python.
# load libraries
library(tidyverse)
library(reticulate)
R vector is a python …
Well, it depends on whether the R vector has a single element or multiple elements.
If the R vector has only 1 element, the python structure will be a scalar. A scalar is a structure which contains a single value. The value can be of any type, e.g. 69, 0.07, or ‘banana’.
Let’s verify with some code. Is
Rvec_1 an atomic vector?
Rvec_1<-1
is.vector(Rvec_1)
## [1] TRUE
is.atomic(Rvec_1)
## [1] TRUE
Indirectly, you can print the
class() of the object. If it prints the element type, you can infer the object is a vector.
class(Rvec_1)
## [1] "numeric"
Is a single element
R vector a
python scalar structure?
py_run_string("import numpy as np")
py_eval("np.isscalar(r.Rvec_1)")
## [1] TRUE
Likewise, you can print the
type of the object. If it prints the element type, you can infer the structure is a scalar.
py_eval("type(r.Rvec_1)")
## <class 'float'>
If you wish to run everything in R and achieve the above, you will have to convert the R object into a python object and store this converted object in R’s global environment. As covered in my previous introduction to the reticulate package, you can do this using r_to_py().
r_to_py(Rvec_1) %>% class()
## [1] "python.builtin.float"  "python.builtin.object"
There you have it. When you convert a single element R vector into python, it has the float element type, which indicates that it is a python scalar structure.
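To see what this looks like from the python side alone, here is a minimal sketch in plain python. The variable `value` is a hypothetical stand-in for what the converted `Rvec_1` would hold, and the `is_scalar` helper is an illustrative assumption built on the standard library rather than numpy's `np.isscalar`.

```python
from numbers import Number


def is_scalar(obj):
    """Rough scalar check: a single number, bool, or string (illustrative only)."""
    return isinstance(obj, (Number, str))


# Hypothetical stand-in for a converted single element R vector (Rvec_1 <- 1).
value = 1.0

print(type(value))        # <class 'float'>
print(is_scalar(value))   # True
print(is_scalar([1.0]))   # False: a list is a container, even with one element
```

The helper only covers the kinds of values mentioned in this post (numbers and strings), so treat it as a teaching aid rather than a general test.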
If the R vector has multiple elements, the python structure will be a list. Let’s verify this with some code. Is Rvec_multi an R atomic vector? Its class() is an element type, so it can be inferred to be a vector.
Rvec_multi<-c(66,99, 0.07)
class(Rvec_multi)
## [1] "numeric"
Is a multi element
R vector a
python list? Yes, it is.
r_to_py(Rvec_multi) %>% class()
## [1] "python.builtin.list"   "python.builtin.object"
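As a rough illustration in plain python, the converted object behaves like an ordinary list of floats. The name `values` is a hypothetical stand-in for the result of converting `Rvec_multi`; the numbers are the ones used above.

```python
# Hypothetical stand-in for the converted Rvec_multi <- c(66, 99, 0.07).
values = [66.0, 99.0, 0.07]

print(type(values))   # <class 'list'>
print(len(values))    # 3

# Every element is a float, mirroring the "numeric" class on the R side.
print(all(isinstance(v, float) for v in values))   # True
```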
Occasionally, you may work with named vectors in
R; for instance, when calculating quantiles.
(Rvec_name<-quantile(rnorm(100)))
##          0%         25%         50%         75%        100%
## -3.20985677 -0.60767955  0.03261286  0.77174145  2.58177907
Named vectors are still considered vectors.
Rvec_name %>% class()
## [1] "numeric"
Do note that the names in a named vector (e.g. 0%, 25%, ...) are stored as character strings and NOT numbers.
Rvec_name %>% str()
##  Named num [1:5] -3.2099 -0.6077 0.0326 0.7717 2.5818
##  - attr(*, "names")= chr [1:5] "0%" "25%" "50%" "75%" ...
The names are ignored when a multi element named vector is translated to python. python treats it like any other list.
r_to_py(Rvec_name)
## [-3.2098567735297747, -0.6076795490223229, 0.032612857512085064, 0.7717414498796162, 2.5817790740464406]
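If the names matter on the python side, one option is to carry them alongside the values yourself. The sketch below is plain python with hypothetical variable names; it pairs the quantile labels with the converted values in a dict. This is only one possible workaround under my own assumptions, not something that happens automatically for the named vector above.

```python
# Values copied from the converted list printed above.
values = [-3.2098567735297747, -0.6076795490223229, 0.032612857512085064,
          0.7717414498796162, 2.5817790740464406]

# Hypothetical: the names have to be supplied separately, as strings.
names = ["0%", "25%", "50%", "75%", "100%"]

# Pair each name with its value so the labels are not lost.
quantiles = dict(zip(names, values))

print(quantiles["50%"])   # 0.032612857512085064
print(list(quantiles))    # ['0%', '25%', '50%', '75%', '100%']
```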
Some differences between R and python
We have been using element types to infer whether an object is an R vector or a python scalar. Thus, it would be helpful to know some of the differences between R and python element types.
Element types (numbers)
R treats numbers as floats/numerics regardless of whether they are whole numbers or numbers with decimals.
class(1)
## [1] "numeric"
class(0.07)
## [1] "numeric"
On the other hand,
python treats whole numbers as integers.
py_eval("type(1)")
## <class 'int'>
python treats numbers with decimals just like R does, as floats/numerics.
py_eval("type(0.07)")
## <class 'float'>
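The same distinction can be checked directly in plain python. This is a small sketch with hypothetical variable names; it only relies on built-in behaviour.

```python
whole = 1
decimal = 0.07

print(type(whole).__name__)     # int
print(type(decimal).__name__)   # float

# A whole number written with a decimal point is a float, not an int.
print(type(1.0).__name__)       # float

# The two still compare equal by value even though their types differ.
print(whole == 1.0)             # True
```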
The trick to make whole numbers integers in the eyes of both R and python is to add the suffix L after the number.
Rvec_1int<-1L
class(Rvec_1int)
## [1] "integer"
r_to_py(Rvec_1int) %>% class()
## [1] "python.builtin.int"    "python.builtin.object"
Elements in multi element R vectors must all share a single type. In other words, different element types are coerced such that all elements have the same type.
Let’s look at an example. First, I will create 3 single element vectors of different element types.
Relement_int=2L
class(Relement_int)
## [1] "integer"
Relement_bool=TRUE
class(Relement_bool)
## [1] "logical"
Relement_char="banana"
class(Relement_char)
## [1] "character"
Next, I will combine these vectors into a multi element vector. Let’s reassess the element type for each element.
Rvec_mix<- c(Relement_int, Relement_bool, Relement_char)
class(Rvec_mix[1])
## [1] "character"
class(Rvec_mix[2])
## [1] "character"
class(Rvec_mix[3])
## [1] "character"
As you can see, all the different elements have been coerced into the same element type when they are combined in a multi element vector. Often, the individual elements are coerced into strings, as the string is the most accommodating element type.
python doesn’t coerce element types when lists are created. Each element keeps its original type.
py_run_string("Plist_mix=[r.Relement_int, r.Relement_bool, r.Relement_char]")
py_eval("type(Plist_mix[0])")
## <class 'int'>
py_eval("type(Plist_mix[1])")
## <class 'bool'>
py_eval("type(Plist_mix[2])")
## <class 'str'>
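Here is the same idea as a self-contained python sketch. The names `mixed` and `as_text` are hypothetical; the second half shows that string coercion in python has to be requested explicitly.

```python
# A python list can hold elements of different types side by side.
mixed = [2, True, "banana"]

for element in mixed:
    print(type(element).__name__)
# int
# bool
# str

# Coercion to strings only happens if you ask for it.
as_text = [str(element) for element in mixed]
print(as_text)   # ['2', 'True', 'banana']
```

Note that the explicit conversion gives 'True', whereas R's c() would give "TRUE", so the two languages do not even agree on how a coerced logical is spelled.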
Besides the differences in element types, there are differences in indexing for each language.
R uses non-zero indexing: the first element has index 1.
Rvec_multi[1]
## [1] 66
python uses zero indexing: the first element has index 0.
py_eval("r.Rvec_multi[0]")
## [1] 66
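A quick plain python sketch of zero indexing, using a hypothetical `values` list that stands in for the converted `Rvec_multi`:

```python
values = [66.0, 99.0, 0.07]

print(values[0])   # 66.0  (first element; R would use index 1)
print(values[2])   # 0.07  (last element; R would use index 3)

# Index 3 is out of range for a three element list.
try:
    values[3]
except IndexError as err:
    print("IndexError:", err)   # IndexError: list index out of range
```

The off-by-one shift is easy to forget when moving code between the two languages, so it is worth checking the first and last positions explicitly.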
Indexing (negative numbers)
In addition to non-zero and zero indexing, there are other differences in indexing. In R, a negative index number means that the element at that position is excluded.
Rvec_multi[-1]
## [1] 99.00  0.07
In python, a negative index number means that indexing counts from the end of the sequence.
py_eval("r.Rvec_multi[-1]")
## [1] 0.07
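To round this off, here is a plain python sketch of negative indexing, plus a rough counterpart of R's exclusion behaviour using slicing. The names `values` and `drop` are hypothetical, and slicing is just one way I would mimic the exclusion.

```python
values = [66.0, 99.0, 0.07]

print(values[-1])   # 0.07  (last element)
print(values[-2])   # 99.0  (second to last)

# Rough python counterpart of R's Rvec_multi[-1]: drop the first element.
print(values[1:])   # [99.0, 0.07]

# Dropping an element at an arbitrary position (here position 1, zero-based).
drop = 1
print(values[:drop] + values[drop + 1:])   # [66.0, 0.07]
```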