Background
The {pharmaversesdtm} and {pharmaverseadam} packages have been available for some time, providing reusable examples for SDTM and ADaM datasets, respectively. However, one critical piece of the workflow was missing: the raw datasets that serve as the starting point for these examples.
Why now?
With the recent release of the {sdtm.oak} package, an open-source package that enables SDTM programming in R, we now have an opportunity to complete the picture. The new {pharmaverseraw} package fills this gap by providing example raw datasets that can be used as input for generating {pharmaversesdtm} datasets with {sdtm.oak}.
What is in {pharmaverseraw}?
The {pharmaverseraw} package v0.1.0 is out on CRAN. It is also available from the pharmaverse site at: https://pharmaverse.org/e2eclinical/developers/. It includes raw datasets for the following SDTM domains:
– AE: Adverse Events
– DS: Subject Disposition
– DM: Demographics
– EC/EX: Exposure
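To check which raw datasets your installed version actually ships, something like the following should work. It is a minimal sketch that assumes the package was installed from CRAN; the variable name `raw_datasets` is only illustrative.

```r
# Install once from CRAN (uncomment on first use)
# install.packages("pharmaverseraw")

# List the datasets shipped with the package, if it is installed
if (requireNamespace("pharmaverseraw", quietly = TRUE)) {
  # data(package = ...) returns a listing whose "Item" column holds dataset names
  raw_datasets <- data(package = "pharmaverseraw")$results[, "Item"]
  print(raw_datasets)
} else {
  raw_datasets <- character(0)
  message("pharmaverseraw is not installed")
}
```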
The raw datasets in the {pharmaverseraw} package are intentionally designed to be:
- EDC agnostic: They are not tied to any specific Electronic Data Capture (EDC) system like Rave or Veeva.
- Standards agnostic: Some variables follow CDASH (Clinical Data Acquisition Standards Harmonization), while others do not. This reflects real-world data standards variability across companies.
The annotated case report forms (aCRFs) corresponding to the raw datasets are also included, in the inst/acrf folder. These PDF files illustrate how each raw variable aligns with SDTM expectations, offering insight into the mapping logic used in {sdtm.oak}.
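If the aCRF folder is shipped with the installed package (an assumption, since files under inst/ are normally copied to the top level of the installed package but can be excluded at build time), the PDFs could be located roughly like this. The names `acrf_dir` and `acrf_files` are illustrative.

```r
# Locate the aCRF folder inside the installed package.
# system.file() returns "" when the package or folder is not found.
acrf_dir <- system.file("acrf", package = "pharmaverseraw")

if (nzchar(acrf_dir)) {
  # List the annotated CRF PDFs that were found
  acrf_files <- list.files(acrf_dir, pattern = "\\.pdf$", ignore.case = TRUE)
  print(acrf_files)
} else {
  acrf_files <- character(0)
  message("No acrf folder found; browse inst/acrf in the source repository instead")
}
```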
How are these datasets created?
The datasets in {pharmaverseraw} were created through reverse engineering: we started with the finalized SDTM datasets in {pharmaversesdtm} and worked backward to construct plausible raw datasets that could reasonably result in those SDTM outputs. This approach ensures data consistency while allowing us to demonstrate the flexibility of {sdtm.oak} in handling raw data in different formats.
From raw to SDTM
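One way to get a feel for this raw-to-SDTM relationship is to put a raw dataset next to its SDTM counterpart. The sketch below assumes both packages are installed and that the AE datasets are exposed as `pharmaverseraw::ae_raw` and `pharmaversesdtm::ae`; it only prints variable names and makes no claim about how closely the two line up.

```r
# Compare the raw AE dataset with the SDTM AE dataset it was derived from
raw_vars <- character(0)
sdtm_vars <- character(0)

if (requireNamespace("pharmaverseraw", quietly = TRUE) &&
    requireNamespace("pharmaversesdtm", quietly = TRUE)) {
  ae_raw <- pharmaverseraw::ae_raw
  ae_sdtm <- pharmaversesdtm::ae

  # Variable names as collected (raw) versus as submitted (SDTM)
  raw_vars <- names(ae_raw)
  sdtm_vars <- names(ae_sdtm)
  print(raw_vars)
  print(sdtm_vars)

  # Record counts on each side, for a rough sanity check
  print(c(raw = nrow(ae_raw), sdtm = nrow(ae_sdtm)))
} else {
  message("Install pharmaverseraw and pharmaversesdtm to run this comparison")
}
```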
Using {sdtm.oak}, you can take raw datasets from {pharmaverseraw} and apply its mapping functions to derive the target SDTM variables. New SDTM examples using this data will be published later at: https://pharmaverse.github.io/examples/sdtm/examples.html
Below is an example snippet that shows how to take the raw AE dataset from {pharmaverseraw} and generate SDTM AE variables with {sdtm.oak}:
library(pharmaverseraw)
library(sdtm.oak)
library(dplyr)

# Read in raw data
ae_raw <- pharmaverseraw::ae_raw

# Derive oak_id_vars
ae_raw <- ae_raw %>%
  generate_oak_id_vars(
    pat_var = "PATNUM",
    raw_src = "ae_raw"
  )

# Map AETERM and AESDTH variables for AE domain
ae <-
  # Derive topic variable
  # Map AETERM using assign_no_ct, raw_var=IT.AETERM, tgt_var=AETERM
  assign_no_ct(
    raw_dat = ae_raw,
    raw_var = "IT.AETERM",
    tgt_var = "AETERM",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESDTH using hardcode_no_ct and condition_add, raw_var=IT.AESDTH, tgt_var=AESDTH
  # If Yes then AESDTH = Y else Not submitted
  hardcode_no_ct(
    raw_dat = condition_add(ae_raw, IT.AESDTH == "Yes"),
    raw_var = "IT.AESDTH",
    tgt_var = "AESDTH",
    tgt_val = "Y",
    id_vars = oak_id_vars()
  ) %>%
  hardcode_no_ct(
    raw_dat = condition_add(ae_raw, IT.AESDTH != "Yes"),
    raw_var = "IT.AESDTH",
    tgt_var = "AESDTH",
    tgt_val = "Not Submitted",
    id_vars = oak_id_vars()
  )
Get involved
Like the other tools under the pharmaverse umbrella, the {pharmaverseraw} package is open-source and community-driven. We welcome volunteers who are interested in contributing to the continued development and improvement of this package.
If you’d like to get involved, here are some ways you can help:
– Add new raw datasets: Take other SDTM domains from the {pharmaversesdtm} package and create corresponding raw datasets using R. This will help expand the coverage of the package.
– Create mock aCRFs for the raw datasets: Develop annotated case report forms (aCRFs) that illustrate how the raw variables are mapped to align with SDTM standards.
– Prepare documentation: For each raw dataset you create, include documentation that explains the data structure, variable definitions, and any relevant notes.
Whether you’re a programmer, CDISC expert, or clinical data manager, your contributions can make a meaningful impact. Visit the GitHub repository to open an issue or start a discussion.
Last updated
2025-07-07 17:06:21.537118
Citation
@online{chen2025,
  author = {Chen, Shiyu},
  title = {Raw Data for Domains in the Pharmaversesdtm Package},
  date = {2025-07-07},
  url = {https://pharmaverse.github.io/blog/posts/2025-07-07_pharmaverse.../pharmaverseraw__package.html},
  langid = {en}
}