
Raw data for domains in the pharmaversesdtm package

[This article was first published on pharmaverse blog, and kindly contributed to R-bloggers.]

Background

The {pharmaversesdtm} and {pharmaverseadam} packages have been available for some time, providing reusable examples for SDTM and ADaM datasets, respectively. However, one critical piece of the workflow was missing: the raw datasets that serve as the starting point for these examples.


Why now?

With the recent release of {sdtm.oak}, an open-source package that enables SDTM programming in R, we now have an opportunity to complete the picture. The new {pharmaverseraw} package fills this gap by providing example raw datasets that can be used as input for generating {pharmaversesdtm} datasets with {sdtm.oak}.

What is in {pharmaverseraw}?

Version 0.1.0 of the {pharmaverseraw} package is out on CRAN. It is also available from the pharmaverse site at: https://pharmaverse.org/e2eclinical/developers/. It includes raw datasets for the following SDTM domains:

– AE: Adverse Events

– DS: Subject Disposition

– DM: Demographics

– EC/EX: Exposure

These raw datasets in the {pharmaverseraw} package are intentionally designed to be:

The annotated case report forms (aCRFs) corresponding to the raw datasets are also included in the inst/acrf folder. These PDF files illustrate how each raw variable aligns with SDTM expectations, offering insight into the mapping logic used in {sdtm.oak}.
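
As a quick orientation, the bundled datasets and aCRF files can be listed from an R session. The sketch below uses only base R; the helper name `list_raw_assets()` is made up for illustration, and it assumes the aCRFs end up in an `acrf` folder of the installed package (the usual result of placing files under `inst/acrf`).

```r
# Hypothetical helper (not part of pharmaverseraw): list the datasets and
# aCRF PDFs that an installed package ships with.
list_raw_assets <- function(pkg) {
  # data(package = ) returns a matrix with "Item" and "Title" columns
  datasets <- data(package = pkg)$results[, c("Item", "Title"), drop = FALSE]

  # Files under inst/acrf are installed to <package root>/acrf;
  # system.file() returns "" when the folder does not exist
  acrf_dir <- system.file("acrf", package = pkg)
  acrf_files <- if (nzchar(acrf_dir)) {
    list.files(acrf_dir, pattern = "\\.pdf$", ignore.case = TRUE)
  } else {
    character(0)
  }

  list(datasets = datasets, acrf_files = acrf_files)
}

# Only runs when pharmaverseraw is installed
if (requireNamespace("pharmaverseraw", quietly = TRUE)) {
  assets <- list_raw_assets("pharmaverseraw")
  print(assets$datasets)
  print(assets$acrf_files)
}
```

If the aCRF list comes back empty, the PDFs may only be present in the source repository rather than in the installed package.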


How are these datasets created?

The datasets in {pharmaverseraw} were created through reverse engineering: we started with the finalized SDTM datasets in {pharmaversesdtm} and worked backward to construct plausible raw datasets that could reasonably result in those SDTM outputs. This approach ensures data consistency while allowing us to demonstrate the flexibility of {sdtm.oak} in handling raw data in different formats.
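
To make the reverse-engineering idea concrete, here is a toy round trip in base R. It is only a sketch: the subject IDs and values are invented, the "01-" study prefix is an assumption, and the actual scripts used to build {pharmaverseraw} may look quite different. The raw column names (PATNUM, IT.AETERM, IT.AESDTH) follow the AE example later in this post.

```r
# Toy SDTM AE records (hypothetical values, not taken from pharmaversesdtm)
sdtm_ae <- data.frame(
  USUBJID = c("01-701-1015", "01-701-1023"),
  AETERM  = c("HEADACHE", "NAUSEA"),
  AESDTH  = c("N", "Y"),
  stringsAsFactors = FALSE
)

# Step 1: work backward to a plausible raw (collected) representation
raw_ae <- data.frame(
  PATNUM    = sub("^01-", "", sdtm_ae$USUBJID),   # drop the assumed study prefix
  IT.AETERM = sdtm_ae$AETERM,                     # verbatim term as collected
  IT.AESDTH = ifelse(sdtm_ae$AESDTH == "Y", "Yes", "No"),
  stringsAsFactors = FALSE
)

# Step 2: map forward again and confirm the SDTM values are reproduced
rebuilt <- data.frame(
  USUBJID = paste0("01-", raw_ae$PATNUM),
  AETERM  = raw_ae$IT.AETERM,
  AESDTH  = ifelse(raw_ae$IT.AESDTH == "Yes", "Y", "N"),
  stringsAsFactors = FALSE
)

stopifnot(identical(rebuilt, sdtm_ae))
```

The final check is the point of the exercise: a raw dataset is "plausible" in this sense when the forward mapping reproduces the SDTM values it was derived from.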


From raw to SDTM

Using {sdtm.oak}, you can take raw datasets from {pharmaverseraw} and apply its mapping functions to derive the target SDTM variables. New SDTM examples using this data will be published later at: https://pharmaverse.github.io/examples/sdtm/examples.html

Below is an example snippet that shows how to take the raw AE dataset from {pharmaverseraw} and generate SDTM AE variables with {sdtm.oak}:

library(pharmaverseraw)
library(sdtm.oak)
library(dplyr)

# Read in raw data
ae_raw <- pharmaverseraw::ae_raw

# Derive oak_id_vars
ae_raw <- ae_raw %>%
  generate_oak_id_vars(
    pat_var = "PATNUM",
    raw_src = "ae_raw"
  )

# Map AETERM and AESDTH variables for AE domain
ae <-
  # Derive topic variable
  # Map AETERM using assign_no_ct, raw_var=IT.AETERM, tgt_var=AETERM
  assign_no_ct(
    raw_dat = ae_raw,
    raw_var = "IT.AETERM",
    tgt_var = "AETERM",
    id_vars = oak_id_vars()
  ) %>%
  # Map AESDTH using hardcode_no_ct and condition_add, raw_var=IT.AESDTH, tgt_var=AESDTH
  # If Yes then AESDTH = Y else Not submitted
  hardcode_no_ct(
    raw_dat = condition_add(ae_raw, IT.AESDTH == "Yes"),
    raw_var = "IT.AESDTH",
    tgt_var = "AESDTH",
    tgt_val = "Y",
    id_vars = oak_id_vars()
  ) %>%
  hardcode_no_ct(
    raw_dat = condition_add(ae_raw, IT.AESDTH != "Yes"),
    raw_var = "IT.AESDTH",
    tgt_var = "AESDTH",
    tgt_val = "Not Submitted",
    id_vars = oak_id_vars()
  )
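
For readers new to {sdtm.oak}, the intent of the two conditional hardcodes, as described in the code comments above, can be sketched in base R. This is not how {sdtm.oak} is implemented and it does not call the package; `map_aesdth()` is a made-up helper, the input values are invented, and it assumes missing raw values stay missing.

```r
# Hypothetical raw responses for IT.AESDTH
raw_aesdth <- c("Yes", "No", NA, "Yes")

# Made-up helper mirroring the commented intent:
# "Yes" -> "Y", any other non-missing value -> "Not Submitted", NA stays NA
map_aesdth <- function(x) {
  out <- rep(NA_character_, length(x))
  out[!is.na(x) & x == "Yes"] <- "Y"
  out[!is.na(x) & x != "Yes"] <- "Not Submitted"
  out
}

map_aesdth(raw_aesdth)
```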

Get involved

Like the other tools under the pharmaverse umbrella, the {pharmaverseraw} package is open-source and community-driven. We welcome volunteers who are interested in contributing to the continued development and improvement of this package.

If you’d like to get involved, here are some ways you can help:

Add new raw datasets: Take other SDTM domains from the {pharmaversesdtm} package and create corresponding raw datasets using R. This will help expand the coverage of the package.

Create mock aCRFs for the raw datasets: Develop annotated case report forms (aCRFs) that illustrate how the raw variables are mapped to align with SDTM standards.

Prepare documentation: For each raw dataset you create, include documentation that explains the data structure, variable definitions, and any relevant notes.

Whether you’re a programmer, CDISC expert, or clinical data manager, your contributions can make a meaningful impact. Visit the GitHub repository to open an issue or start a discussion.


Last updated

2025-07-07 17:06:21.537118


Reuse

CC BY 4.0

Citation

BibTeX citation:
@online{chen2025,
  author = {Chen, Shiyu},
  title = {Raw Data for Domains in the Pharmaversesdtm Package},
  date = {2025-07-07},
  url = {https://pharmaverse.github.io/blog/posts/2025-07-07_pharmaverse.../pharmaverseraw__package.html},
  langid = {en}
}
For attribution, please cite this work as:
Chen, Shiyu. 2025. “Raw Data for Domains in the Pharmaversesdtm Package.” July 7, 2025. https://pharmaverse.github.io/blog/posts/2025-07-07_pharmaverse…/pharmaverseraw__package.html.