GetDFPData2 available on CRAN!

[This article was first published on R | msperlin, and kindly contributed to R-bloggers].

After extensive testing, GetDFPData2 is finally available on CRAN. GetDFPData2 is the second, backwards-incompatible version of GetDFPData, an R package for downloading annual financial reports from B3, the Brazilian financial exchange. Unlike the first version, GetDFPData2 imports data from a database of csv files published by CVM, the Brazilian securities commission, which makes it much faster than its predecessor. However, the output is slightly different.

A shiny app – web interface – is also available at https://www.msperlin.com/shiny/GetDFPData2/.

The previous version, GetDFPData, is deprecated and will not be developed any further. All efforts now go to GetDFPData2 and GetFREData (soon on CRAN).

Installation

# available in cran (stable)
install.packages('GetDFPData2')

# github (dev version)
devtools::install_github('msperlin/GetDFPData2')

Examples of Usage

Information about available companies

library(GetDFPData2)

# information about companies
df_info <- get_info_companies(tempdir())
## Fetching info on B3 companies
##  Dowloading file from CVM
##  File not found, downloading it..
##  Success
##  Reading file from CVM
##  Saving cache data
##  Got 2331 lines for 2290 companies [Actives = 648 Inactives = 1653]
print(df_info)
## # A tibble: 2,331 x 44
##    CD_CVM DENOM_SOCIAL    DENOM_COMERC   SETOR_ATIV  PF_PJ CNPJ  DT_REG DT_CONST
##     <dbl> <chr>           <chr>          <chr>       <chr> <chr> <chr>  <chr>   
##  1  25224 2W ENERGIA S.A. <NA>           Construção… PJ    8773… 29/10… 23/03/2…
##  2  21954 3A COMPANHIA S… TRIPLO A  COM… Securitiza… PJ    1139… 08/03… 03/11/2…
##  3  25291 3R PETROLEUM O… <NA>           Petróleo e… PJ    1209… 09/11… 08/06/2…
##  4  16330 521 PARTICIPAÇ… 521 PARTICIPA… Emp. Adm. … PJ    1547… 11/07… 30/07/1…
##  5  16284 524 PARTICIPAÇ… 524 PARTICIPA… Emp. Adm. … PJ    1851… 30/05… 02/04/1…
##  6  16349 525 PARTICIPAÇ… 525 PARTICIPA… Emp. Adm. … PJ    1919… 16/07… 02/04/1…
##  7     35 A J RENNER SA … A J RENNER     Emp. Adm. … PJ    9265… 24/06… <NA>    
##  8  16802 A.P. PARTICIPA… A.P. PARTICIP… Emp. Adm. … PJ    2288… 21/01… 14/12/1…
##  9  13307 ABC DADOS E IN… ABC COMPUTADO… Máquinas, … PJ    2164… 03/06… <NA>    
## 10  16934 ABC SUPERMERCA… ABC SUPERMERC… Comércio (… PJ    2258… 27/02… 30/09/1…
## # … with 2,321 more rows, and 36 more variables: DT_CANCEL <chr>,
## #   MOTIVO_CANCEL <chr>, SIT_REG <chr>, DT_INI_SIT <chr>, SIT_EMISSOR <chr>,
## #   DT_INI_SIT_EMISSOR <chr>, CATEG_REG <chr>, DT_INI_CATEG <chr>,
## #   AUDITOR <chr>, CNPJ_AUDITOR <dbl>, TP_ENDER <chr>, LOGRADOURO <chr>,
## #   COMPL <chr>, BAIRRO <chr>, CIDADE <chr>, UF <chr>, PAIS <chr>,
## #   CD_POSTAL <lgl>, TEL <chr>, FAX <chr>, EMAIL <chr>, TP_RESP <chr>,
## #   RESP <chr>, DT_INI_RESP <chr>, LOGRADOURO_RESP <chr>, COMPL_RESP <chr>,
## #   BAIRRO_RESP <chr>, CIDADE_RESP <chr>, UF_RESP <chr>, PAIS_RESP <chr>,
## #   CEP_RESP <dbl>, TEL_RESP <chr>, FAX_RESP <chr>, EMAIL_RESP <chr>,
## #   TP_MERC <chr>, cnpj_number <dbl>
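
Once the company table is loaded, a common next step is to narrow it to companies that are still registered. The sketch below works on a small stand-in data frame rather than the real download, so it runs offline. It assumes the registration status lives in the `SIT_REG` column and that active companies are marked `ATIVO` (the value printed by `search_company()`); the second row is a made-up placeholder, not real CVM data.

```r
# Stand-in for df_info with only the columns needed here.
# Row 1 mirrors the Grendene record shown in this post; row 2 is a
# hypothetical placeholder used only to have something to filter out.
df_toy <- data.frame(
  CD_CVM       = c(19615, 99999),
  DENOM_SOCIAL = c("GRENDENE SA", "PLACEHOLDER INACTIVE CO"),
  SIT_REG      = c("ATIVO", "CANCELADA"),
  stringsAsFactors = FALSE
)

# Keep only rows flagged as active (assumption: SIT_REG == "ATIVO").
df_active <- df_toy[df_toy$SIT_REG == "ATIVO", ]

# CVM codes of the active companies, ready to pass to get_dfp_data()
# through its companies_cvm_codes argument.
active_codes <- df_active$CD_CVM
print(active_codes)
```

With the real table, replacing `df_toy` by `df_info` should be the only change needed, provided the column assumption holds.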

Searching for companies

search_company('grendene', cache_folder = tempdir())
## Fetching info on B3 companies
##  Found cache file. Loading data..
##  Got 2331 lines for 2290 companies [Actives = 648 Inactives = 1653]
## Found 1 companies:
## GRENDENE SA | situation = ATIVO | sector = Têxtil e Vestuário | CD_CVM = 19615
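
Conceptually, a name search like this is a case-insensitive substring match against the company names. The following is a minimal base-R sketch of that idea on a stand-in table; it is not the package's implementation, and `find_company()` is a hypothetical helper name. It assumes names are stored in `DENOM_SOCIAL`, as in the tibble printed earlier.

```r
# Stand-in table: names and codes taken from the outputs shown in this post.
df_toy <- data.frame(
  CD_CVM       = c(25224, 19615),
  DENOM_SOCIAL = c("2W ENERGIA S.A.", "GRENDENE SA"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: case-insensitive, literal substring match on the name.
# Both sides are lower-cased so that fixed = TRUE can be used safely.
find_company <- function(df, char_to_search) {
  idx <- grepl(tolower(char_to_search), tolower(df$DENOM_SOCIAL), fixed = TRUE)
  df[idx, , drop = FALSE]
}

hits <- find_company(df_toy, "grendene")
print(hits$CD_CVM)
```

Matching literally (`fixed = TRUE`) avoids surprises when a company name contains regex metacharacters such as `.` or `(`.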

Downloading Annual Financial Reports

# downloading DFP data
l_dfp <- get_dfp_data(companies_cvm_codes = 19615, 
                      use_memoise = FALSE,
                      clean_data = TRUE,
                      cache_folder = tempdir(), # use local folder in live code
                      type_docs = c('DRE'), 
                      type_format = 'con',
                      first_year = 2019, 
                      last_year = 2020)
##  Dowloading dfp_cia_aberta_2019.zip
##  File not found, downloading it..
##  Success
##      Unzipping
##      Reading dfp_cia_aberta_DRE_con_2019.csv | Cleaning table
##      Got 30 rows | 1 companies
##  Dowloading dfp_cia_aberta_2020.zip
##  File not found, downloading it..
##  Success
##      Unzipping
##      Reading dfp_cia_aberta_DRE_con_2020.csv | Cleaning table
##      Got 32 rows | 1 companies
str(l_dfp)
## List of 1
##  $ DF Consolidado - Demonstração do Resultado: tibble[,16] [62 × 16] (S3: tbl_df/tbl/data.frame)
##   ..$ CNPJ_CIA    : chr [1:62] "89.850.341/0001-60" "89.850.341/0001-60" "89.850.341/0001-60" "89.850.341/0001-60" ...
##   ..$ CD_CVM      : num [1:62] 19615 19615 19615 19615 19615 ...
##   ..$ DT_REFER    : Date[1:62], format: "2019-12-31" "2019-12-31" ...
##   ..$ DT_INI_EXERC: Date[1:62], format: "2019-01-01" "2019-01-01" ...
##   ..$ DT_FIM_EXERC: Date[1:62], format: "2019-12-31" "2019-12-31" ...
##   ..$ DENOM_CIA   : chr [1:62] "GRENDENE S.A." "GRENDENE S.A." "GRENDENE S.A." "GRENDENE S.A." ...
##   ..$ VERSAO      : num [1:62] 2 2 2 2 2 2 2 2 2 2 ...
##   ..$ GRUPO_DFP   : chr [1:62] "DF Consolidado - Demonstração do Resultado" "DF Consolidado - Demonstração do Resultado" "DF Consolidado - Demonstração do Resultado" "DF Consolidado - Demonstração do Resultado" ...
##   ..$ MOEDA       : chr [1:62] "REAL" "REAL" "REAL" "REAL" ...
##   ..$ ESCALA_MOEDA: chr [1:62] "MIL" "MIL" "MIL" "MIL" ...
##   ..$ ORDEM_EXERC : chr [1:62] "ÚLTIMO" "ÚLTIMO" "ÚLTIMO" "ÚLTIMO" ...
##   ..$ CD_CONTA    : chr [1:62] "3.01" "3.02" "3.03" "3.04" ...
##   ..$ DS_CONTA    : chr [1:62] "Receita de Venda de Bens e/ou Serviços" "Custo dos Bens e/ou Serviços Vendidos" "Resultado Bruto" "Despesas/Receitas Operacionais" ...
##   ..$ VL_CONTA    : num [1:62] 2071034 -1126511 944523 -590995 -530825 ...
##   ..$ COLUNA_DF   : logi [1:62] NA NA NA NA NA NA ...
##   ..$ source_file : chr [1:62] "dfp_cia_aberta_DRE_con_2019.csv" "dfp_cia_aberta_DRE_con_2019.csv" "dfp_cia_aberta_DRE_con_2019.csv" "dfp_cia_aberta_DRE_con_2019.csv" ...
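
The returned list holds one long-format table per statement, with one row per account and period, so individual line items are picked out by `CD_CONTA`. As a small sketch, the code below rebuilds the first four 2019 rows from the `str()` output above in a plain data frame and derives the gross margin. With real data you would start from `l_dfp[[1]]` instead and also filter by `DT_REFER`, since the table spans several years; the helper name `get_account()` is hypothetical.

```r
# First four 2019 rows of the income statement, copied from the str() output
# above (values in thousands of BRL, per ESCALA_MOEDA = "MIL").
df_dre <- data.frame(
  DT_REFER = as.Date(rep("2019-12-31", 4)),
  CD_CONTA = c("3.01", "3.02", "3.03", "3.04"),
  DS_CONTA = c("Receita de Venda de Bens e/ou Serviços",
               "Custo dos Bens e/ou Serviços Vendidos",
               "Resultado Bruto",
               "Despesas/Receitas Operacionais"),
  VL_CONTA = c(2071034, -1126511, 944523, -590995),
  stringsAsFactors = FALSE
)

# Hypothetical helper: value(s) recorded for one account code.
get_account <- function(df, code) {
  df$VL_CONTA[df$CD_CONTA == code]
}

revenue      <- get_account(df_dre, "3.01")
cogs         <- get_account(df_dre, "3.02")
gross_profit <- get_account(df_dre, "3.03")

# Sanity check: revenue plus the (negative) cost equals reported gross profit.
stopifnot(revenue + cogs == gross_profit)

gross_margin <- gross_profit / revenue
print(round(gross_margin, 4))
```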
