inegiR version 1.2

Posted on February 21, 2016 by En El Margen - R-English in R bloggers

[This article was first published on En El Margen - R-English, and kindly contributed to R-bloggers.]


Version 1.2 of inegiR is now on CRAN, so I thought I’d write a few words (a short vignette of sorts) about what’s new or different. By the way, I’m writing in English because more people seem to read R-bloggers than my blog (no surprise there); however, the PDF manual and most of the documentation are still in Spanish.

Bug fixes

Thanks to Diego Valle, who reported a slight bug: the less common date frequencies (“bienal” and “decenal”) were not being parsed correctly.

I also added warnings and error handling for cases where the data doesn’t exist for municipalities (issue is here).

New functions

Grids

Thanks to Arturo Cardenas, who unwittingly built a new function for the DENUE part of the package; it is incorporated in this version.

As he wrote in his blog, the DENUE API only allows us to download businesses within a radius of at most 5 kilometers. However, we can get around this limitation by querying the API with a series of coordinates whose circles we know overlap each other, so that together they cover a larger square. This is a picture, taken from that post, detailing what I mean:

Each circle is, of course, 5 km in radius, so the API gives us everything inside it.

The hacer_grid() function helps us in the process: if we supply it the latitude and longitude of two corners, it creates a data.frame with a series of coordinates that form a grid like the one in the image.
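To make the geometry concrete, here is a minimal base R sketch of how such a grid of circle centres could be built. It is an illustration under stated assumptions, not the code inside hacer_grid(): the function name grid_centers(), its arguments, and the flat approximation of about 111.32 km per degree of latitude are all assumptions made for this example.

```r
# Hypothetical helper, NOT part of inegiR: centres of overlapping circles
# that together cover the box defined by two opposite corners.
grid_centers <- function(lat1, lon1, lat2, lon2, radius_km = 5) {
  km_per_deg_lat <- 111.32  # approximate length of one degree of latitude
  mid_lat <- (lat1 + lat2) / 2
  km_per_deg_lon <- km_per_deg_lat * cos(mid_lat * pi / 180)

  # A square cell is fully inside the circle around its centre when the
  # cell side is at most radius * sqrt(2).
  step_km <- radius_km * sqrt(2)

  axis_points <- function(a, b, km_per_deg) {
    lo <- min(a, b)
    hi <- max(a, b)
    # Number of cells needed along this axis (at least one)
    n <- max(1, ceiling((hi - lo) * km_per_deg / step_km))
    # Centres sit in the middle of each cell
    lo + (seq_len(n) - 0.5) * (hi - lo) / n
  }

  expand.grid(
    lat = axis_points(lat1, lat2, km_per_deg_lat),
    lon = axis_points(lon1, lon2, km_per_deg_lon)
  )
}

# A box of roughly 8 km by 10 km needs a 2 x 2 grid of circles
centers <- grid_centers(25.686917, -100.429398, 25.612030, -100.333032)
nrow(centers)  # 4
```

Keeping each cell inside its own circle is what guarantees that no corner of the box is left uncovered; the price is that neighbouring circles overlap and return some businesses more than once.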

But the more powerful denue_grid() does the interesting part. Using hacer_grid(), it also downloads the DENUE data and returns a data.frame of the unique businesses in that grid (if you want the duplicates as well, you can turn off the de-duplication by setting the parameter unicos = FALSE).
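Because the circles overlap, a business that sits in the overlap shows up in more than one download. The following toy example shows the kind of de-duplication involved; the data and the column names (id, name) are made up for illustration and are not the columns the DENUE API returns, nor necessarily how denue_grid() does it internally.

```r
# Made-up results from two overlapping circles
circle_a <- data.frame(id = c(101, 102, 103),
                       name = c("Bakery", "Pharmacy", "Cafe"))
circle_b <- data.frame(id = c(103, 104),
                       name = c("Cafe", "Hardware store"))

# Stack both downloads, then keep the first occurrence of each id
all_rows <- rbind(circle_a, circle_b)
unique_rows <- all_rows[!duplicated(all_rows$id), ]

nrow(all_rows)     # 5
nrow(unique_rows)  # 4
```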

Example with Grids

Here is an example with the city of Monterrey. Let’s say I want all the businesses in San Pedro (a municipality that is part of the metropolitan area).

The total area is roughly 45 km², give or take (I know this is not geographically accurate):

I feed the coordinates of two opposite corners (here the upper left-hand and lower right-hand ones) to the function, and voilà:

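library(inegiR)

# Two opposite corners of the area to cover
upper_lat  <- 25.686917
upper_long <- -100.429398
lower_lat  <- 25.612030
lower_long <- -100.333032
token_denue <- "mytoken"  # replace with your own DENUE API token

# Download every unique business inside the grid
sanpedro <- denue_grid(upper_lat, lower_lat, 
                       upper_long, lower_long, 
                       token = token_denue)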

Simple as that!

Factor productivity

Using two fairly consistent surveys that INEGI carries out on a monthly basis, I added two functions to calculate productivity by state in two important industries: manufacturing and construction.

In both cases, productivity is defined as the total value produced in the state divided by the total number of people occupied in that industry in the state. Bear in mind that value produced is in thousands of pesos, so 100 would be equal to 100 thousand pesos “produced” by each person.
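As a quick check of the arithmetic, here is the same ratio computed on made-up numbers (these are hypothetical figures for illustration, not INEGI data, and the variable names are not taken from the package):

```r
# Hypothetical figures for one state over three months
value_produced  <- c(500000, 520000, 546000)  # thousands of pesos
occupied_people <- c(5000, 5000, 5200)        # people occupied in the industry

# Productivity: thousands of pesos produced per occupied person
productivity <- value_produced / occupied_people
productivity  # 100 104 105
```

A value of 100 in the first month means each occupied person "produced" 100 thousand pesos that month.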

We can get a time series simply by doing the following:

library(inegiR)
library(eem)      # provides theme_eem() and eem_colors
library(ggplot2)

token <- "mytoken"  # replace with your own INEGI API token

# ts for Manufacturing in state of Nuevo León:
pm <- series_productividad_man(token)
nl <- data.frame("Productivity" = pm$NL, "Date" = as.Date(pm$Fechas))
ggplot(nl, aes(x = Date, y = Productivity)) +
  geom_line(colour = eem_colors[1]) +
  theme_eem() +
  labs(title = "Productivity in Manufacturing \n State of Nuevo León", 
       y = "Thousands of pesos x person")

# ts for Construction in state of Nuevo León:
pc <- series_productividad_const(token)
nl <- data.frame("Productivity" = pc$NL, "Date" = as.Date(pc$Fechas))
ggplot(nl, aes(x = Date, y = Productivity)) +
  geom_line(colour = eem_colors[1]) +
  theme_eem() +
  labs(title = "Productivity in Construction \n State of Nuevo León", 
       y = "Thousands of pesos x person")

New geography

These last two examples lead me to another point: the state names used in the functions have changed. In the first version, Nuevo León state was “NuevoLeon”; it has been changed to “NL”. This is more concise, easier to read, and consistent with the new constitutional name change for Mexico City (it is now “CDMX”, as opposed to “DF”).

The other advantage is that these names will be consistent with Diego Valle’s mxmaps package, which makes it easy to draw choropleth maps (it’s available here). inegiR already includes a nifty function to make these maps, but now you can do it both ways!

To switch between the “old names” and the new ones, I’ve left the following catalog here:

Name of State	Previous Name	New Name
Aguascalientes	Aguascalientes	AGS
Baja California	BajaCalifornia	BC
Baja California Sur	BajaCaliforniaSur	BCS
Campeche	Campeche	CAMP
Coahuila	Coahuila	COAH
Colima	Colima	COL
Chiapas	Chiapas	CHPS
Chihuahua	Chihuahua	CHIH
Distrito Federal	DF	CDMX
Durango	Durango	DGO
Guanajuato	Guanajuato	GTO
Guerrero	Guerrero	GRO
Hidalgo	Hidalgo	HGO
Jalisco	Jalisco	JAL
Estado de México	EdoMexico	MEX
Michoacán	Michoacan	MICH
Morelos	Morelos	MOR
Nayarit	Nayarit	NAY
Nuevo León	NuevoLeon	NL
Oaxaca	Oaxaca	OAX
Puebla	Puebla	PUE
Querétaro	Queretaro	QRO
Quintana Roo	QuintanaRoo	QROO
San Luis Potosí	SanLuisPotosi	SLP
Sinaloa	Sinaloa	SIN
Sonora	Sonora	SON
Tabasco	Tabasco	TAB
Tamaulipas	Tamaulipas	TAM
Tlaxcala	Tlaxcala	TLAX
Veracruz	Veracruz	VER
Yucatán	Yucatan	YUC
Zacatecas	Zacatecas	ZAC
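If you have data frames saved with the old column names, the catalog above can be turned into a lookup vector. This is only a sketch: the helper rename_states() is hypothetical and not a function in inegiR, and the toy data frame is made up.

```r
# Old name -> new name, copied from the catalog above
old_to_new <- c(
  Aguascalientes = "AGS", BajaCalifornia = "BC", BajaCaliforniaSur = "BCS",
  Campeche = "CAMP", Coahuila = "COAH", Colima = "COL", Chiapas = "CHPS",
  Chihuahua = "CHIH", DF = "CDMX", Durango = "DGO", Guanajuato = "GTO",
  Guerrero = "GRO", Hidalgo = "HGO", Jalisco = "JAL", EdoMexico = "MEX",
  Michoacan = "MICH", Morelos = "MOR", Nayarit = "NAY", NuevoLeon = "NL",
  Oaxaca = "OAX", Puebla = "PUE", Queretaro = "QRO", QuintanaRoo = "QROO",
  SanLuisPotosi = "SLP", Sinaloa = "SIN", Sonora = "SON", Tabasco = "TAB",
  Tamaulipas = "TAM", Tlaxcala = "TLAX", Veracruz = "VER", Yucatan = "YUC",
  Zacatecas = "ZAC"
)

# Hypothetical helper: rename only the columns found in the lookup
rename_states <- function(df, lookup = old_to_new) {
  hit <- names(df) %in% names(lookup)
  names(df)[hit] <- unname(lookup[names(df)[hit]])
  df
}

# Toy data frame using the old names
old <- data.frame(Fechas = as.Date("2016-01-01"), NuevoLeon = 1, DF = 2)
names(rename_states(old))  # "Fechas" "NL" "CDMX"
```

Columns that are not in the catalog, such as Fechas, are left untouched.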

If you have any suggestions or find any bugs, you can find me on Twitter or GitHub.
