Raw data for domains in the pharmaversesdtm package
Background
The
Why now?
With the recent release of the
What is in {pharmaverseraw}?
The
– AE: Adverse Events
– DS: Subject Disposition
– DM: Demographics
– EC/EX: Exposure
These raw datasets in
- EDC agnostic: They are not tied to any specific Electronic Data Capture (EDC) system like Rave or Veeva.
- Standards agnostic: Some variables follow CDASH (Clinical Data Acquisition Standards Harmonization), while others do not. This reflects real-world data standards variability across companies.
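To check which raw datasets an installed copy of the package actually ships, you can query R's dataset index. The following is a minimal sketch: `list_package_datasets()` is a hypothetical helper name introduced here for illustration (it is not part of {pharmaverseraw}), and it returns an empty vector when the package is not installed.

```r
# Hypothetical helper (not part of any pharmaverse package):
# return the names of the datasets shipped with an installed package.
list_package_datasets <- function(pkg) {
  # Return an empty character vector if the package is unavailable
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(character(0))
  }
  # data(package = ...) returns an index matrix; "Item" holds dataset names
  unname(data(package = pkg)$results[, "Item"])
}

raw_datasets <- list_package_datasets("pharmaverseraw")
print(raw_datasets)
```

Each listed name can then be accessed with the `pharmaverseraw::` prefix, for example `pharmaverseraw::ae_raw`.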
The annotated case report forms (aCRFs) corresponding to the raw datasets are also available in the inst/acrf folder. These PDF files illustrate how each raw variable aligns with SDTM expectations, offering insight into the mapping logic used in
How are these datasets created?
The datasets in
From raw to SDTM
Using
Below is an example snippet that shows how to use a raw AE dataset from
library(pharmaverseraw)
library(sdtm.oak)
library(dplyr)

# Read in raw data
ae_raw <- pharmaverseraw::ae_raw

# Derive oak_id_vars
ae_raw <- ae_raw %>%
  generate_oak_id_vars(
    pat_var = "PATNUM",
    raw_src = "ae_raw"
  )

# Map AETERM and AESDTH variables for AE domain
ae <-
  # Derive topic variable
  # Map AETERM using assign_no_ct, raw_var=IT.AETERM, tgt_var=AETERM
  assign_no_ct(
    raw_dat = ae_raw,
    raw_var = "IT.AETERM",
    tgt_var = "AETERM",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESDTH using hardcode_no_ct and condition_add, raw_var=IT.AESDTH, tgt_var=AESDTH
  # If Yes then AESDTH = Y else Not submitted
  hardcode_no_ct(
    raw_dat = condition_add(ae_raw, IT.AESDTH == "Yes"),
    raw_var = "IT.AESDTH",
    tgt_var = "AESDTH",
    tgt_val = "Y",
    id_vars = oak_id_vars()
  ) %>%
  hardcode_no_ct(
    raw_dat = condition_add(ae_raw, IT.AESDTH != "Yes"),
    raw_var = "IT.AESDTH",
    tgt_var = "AESDTH",
    tgt_val = "Not Submitted",
    id_vars = oak_id_vars()
  )
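To make the intent of the two conditional `hardcode_no_ct()` calls easier to follow, here is a base R sketch of the same rule on a toy data frame. It is an approximation for illustration only, not how {sdtm.oak} implements the mapping: the toy records, the `map_aesdth()` helper, and the choice to leave missing raw values unmapped are all assumptions made here.

```r
# Toy raw records (invented for illustration; not taken from pharmaverseraw)
toy_raw <- data.frame(
  PATNUM = c("001", "002", "003"),
  IT.AETERM = c("HEADACHE", "NAUSEA", "RASH"),
  IT.AESDTH = c("Yes", "No", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: "Yes" -> "Y", any other non-missing value ->
# "Not Submitted". Missing raw values stay NA (an assumption of this sketch).
map_aesdth <- function(x) {
  out <- rep(NA_character_, length(x))
  out[!is.na(x) & x == "Yes"] <- "Y"
  out[!is.na(x) & x != "Yes"] <- "Not Submitted"
  out
}

# Assemble a toy AE-like data frame with the two mapped variables
toy_ae <- data.frame(
  PATNUM = toy_raw$PATNUM,
  AETERM = toy_raw$IT.AETERM,
  AESDTH = map_aesdth(toy_raw$IT.AESDTH),
  stringsAsFactors = FALSE
)

print(toy_ae)
```

In the {sdtm.oak} version, the same rule is expressed declaratively: each `condition_add()` call restricts the raw records, and `hardcode_no_ct()` assigns a fixed target value to the records that pass the condition.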
Get involved
Similar to the other tools under the pharmaverse umbrella, the
If you’d like to get involved, here are some ways you can help:
– Add new raw datasets: Take other SDTM domains from the
– Create mock aCRFs for the raw datasets: Develop annotated case report forms (aCRFs) that illustrate how the raw variables are mapped to align with SDTM standards.
– Prepare documentation: For each raw dataset you create, include documentation that explains the data structure, variable definitions, and any relevant notes.
Whether you’re a programmer, CDISC expert, or clinical data manager, your contributions can make a meaningful impact. Visit the GitHub repository to open an issue or start a discussion.
Last updated
2025-07-07 17:06:21.537118
Citation
@online{chen2025,
  author = {Chen, Shiyu},
  title = {Raw Data for Domains in the Pharmaversesdtm Package},
  date = {2025-07-07},
  url = {https://pharmaverse.github.io/blog/posts/2025-07-07_pharmaverse.../pharmaverseraw__package.html},
  langid = {en}
}