Canadian Hansard Live Database

[This article was first published on pacha.dev/blog, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Motivation

The original LiPad site provides a SQL dump that requires you to configure PostreSQL on your own laptop or server.

Because some classmates asked me about it, I decided to host the database to simplify its access.

Description

Databases.pacha.dev provides a complete the PostgreSQL database of the Canadian Hansard dataset, including supplementary data. It covers the Canadian Hansard from 1901 to 2019 and doesn’t incldue modifications to LiPad contents, it only saves you the step of configuring PostgreSQL.

Usage

You can adapt this code for R. Please see my post about the census to get an idea of other functions you can use with SQL.

library(dplyr)
library(RPostgres)

con <- dbConnect(
  Postgres(),
  user = Sys.getenv("dbedu_usr"),
  password = Sys.getenv("dbedu_pwd"),
  dbname = "canadianhansard",
  host = "databases.pacha.dev"
)

dbListTables(con)
 [1] "andablog_entryimage"         "contactme_contact_msg"      
 [3] "dilipadsite_datenav"         "dilipadsite_constituency"   
 [5] "dilipadsite_basehansard"     "auth_group"                 
 [7] "auth_group_permissions"      "auth_user_user_permissions" 
 [9] "auth_user_groups"            "better500s_caughterror"     
[11] "dilipadsite_basehansard_bak" "dilipadsite_blogger"        
[13] "auth_user"                   "auth_permission"            
[15] "dilipadsite_datenav_bak"     "dilipadsite_datepickle"     
[17] "dilipadsite_parlsess"        "dilipadsite_member"         
[19] "django_admin_log"            "django_migrations"          
[21] "django_content_type"         "dilipadsite_position"       
[23] "django_session"              "downtime_period"            
[25] "taggit_templatetags_amodel"  "taggit_taggeditem"          
[27] "taggit_tag"                  "andablog_entry"             
[29] "django_site"                 "dilipadsite_party"          
tbl(con, "dilipadsite_basehansard") %>% glimpse()
Rows: ??
Columns: 15
Database: postgres  [[email protected]:5432/canadianhansard]
$ basepk          <int> 2691572, 2691576, 514587, 14, 21, 4043, 514588, 3, 4, …
$ hid             <chr> "ca.proc.d.1970-11-13.28.1.89", "ca.proc.d.1970-11-13.…
$ speechdate      <date> 1970-11-13, 1970-11-13, 1918-05-20, 1901-02-06, 1901-…
$ pid             <chr> "7f5ea10f-f4c5-4350-b707-6c6522149e5f", "intervention"…
$ opid            <chr> "ca.m.3511", "", "ca.m.4097", "", "", "", "ca.m.4942",…
$ speakeroldname  <chr> "Mr. Cafik:", "", "Sir GEORGE FOSTER:", "", "", "", "M…
$ speakerposition <chr> "", "", "", "topic", "stagedirection", "", "", "topic"…
$ maintopic       <chr> "GOVERNMENT ORDERS", "GOVERNMENT ORDERS", "SUPPLY.-", …
$ subtopic        <chr> "PUBLIC ORDER (TEMPORARY MEASURES) ACT, 1970 PROVISION…
$ speechtext      <chr> "Let me point out to the hon. member for Matane that I…
$ speakerparty    <chr> "Liberal", "", "Conservative (1867-1942)", "", "", "",…
$ speakerriding   <chr> "", "", "", "", "", "", "", "", "", "", "", "", "", ""…
$ speakername     <chr> "Norman Augustine Cafik", "An hon. Member:", "George G…
$ speakerurl      <chr> "http://www.parl.gc.ca/parlinfo/Files/Parliamentarian.…
$ subsubtopic     <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
dbDisconnect(con)

Citing

For citation purposes, a full description of the project is available in the following article:

Beelen, K., Thijm, T. A., Cochrane, C., Halvemaan, K., Hirst, G., Kimmins, M., Lijbrink, S., Marx, M., Naderi, N., Rheault, L., Polyanovsky, R., and Whyte, T. (2017). “Digitization of the Canadian Parliamentary Debates.” Canadian Journal of Political Science, 50(3), 849–864. https://doi.org/10.1017/S0008423916001165.

Contributing

If you find this database to be useful, please consider donating on Buy me a Coffee. Additionally, I provide R and Python training services in English and Spanish language.

To leave a comment for the author, please follow the link and comment on their blog: pacha.dev/blog.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)