Monday, April 10, 2017

Exploring CDISC Analysis Data Model (ADaM)

The Clinical Data Interchange Standards Consortium (CDISC) defines and manages industry-level data standards that are widely used during the analysis, reporting and submission of clinical data. For instance, the Study Data Tabulation Model (SDTM) is the submission data standard into which raw study data are mapped and collated. ADaM is a companion standard for analysis data, and it is best practice to use SDTM data as the source for these datasets. Doing this allows any data processing to be documented easily with Define-XML, the CDISC standard for data definition files.

Being able to trace the flow from source values to derived ones is a clear intention of the ADaM standard, and that applies both to the structure of any datasets and to the required linkage to machine-readable metadata. It also is crucial that data are made analysis-ready, so that tables, listings and figures can be produced using currently available tools with little or no further data manipulation.

While SDTM domain classes are determined according to data type such as interventions, events or findings, their ADaM equivalents are classified by analysis approach. Of the main data structures, one is best suited to the needs of analysis of continuous data values while another supports categorical analyses. There also is a subject-level analysis dataset that needs to be created for every study where ADaM is used.

All ADaM datasets are named ADxxxx, where xxxx is sponsor-defined and often carries over the name of the source SDTM domain. For example, an ADaM domain called ADLB would use the LB SDTM domain as its data source. This one-to-one domain mapping is not mandatory though and the required number of ADaM domains depends on the needs of any study data analysis or data review. An ADaM domain may use more than one SDTM domain as its source and carry a unique name that reflects this.

For ADaM variables, the naming conventions should follow the standardized variable names defined in the ADaM Implementation Guide. Any variables copied directly from SDTM data into an ADaM domain shall be used unchanged, with no change made either to their attributes (name, label, type, length, etc.) or their contents. Sponsor-defined variable names can be given to any other analysis variable that is not defined within the ADaM or SDTM standards. Following these conventions will provide clarity for the reviewer.
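
The rule that copied SDTM variables keep their attributes can be checked mechanically. The following is only a minimal sketch: the metadata dictionaries and the function name changed_attributes are hypothetical, and in practice the attributes would be read from the datasets or from Define-XML rather than typed in by hand.

```python
# Illustrative (hand-typed) variable metadata for an SDTM domain and the
# ADaM dataset built from it. Real metadata would come from the datasets.
sdtm_lb = {
    "USUBJID": {"label": "Unique Subject Identifier", "type": "Char", "length": 20},
    "LBTESTCD": {"label": "Lab Test or Examination Short Name", "type": "Char", "length": 8},
}
adlb = {
    "USUBJID": {"label": "Unique Subject Identifier", "type": "Char", "length": 20},
    "LBTESTCD": {"label": "Lab Test Code", "type": "Char", "length": 8},  # label altered
    "AVAL": {"label": "Analysis Value", "type": "Num", "length": 8},
}


def changed_attributes(sdtm_meta, adam_meta):
    """Return {variable: [attributes that differ]} for variables present in both."""
    diffs = {}
    for name, sdtm_attrs in sdtm_meta.items():
        adam_attrs = adam_meta.get(name)
        if adam_attrs is None:
            continue  # variable was not copied into the ADaM dataset
        changed = sorted(k for k in sdtm_attrs if adam_attrs.get(k) != sdtm_attrs[k])
        if changed:
            diffs[name] = changed
    return diffs


problems = changed_attributes(sdtm_lb, adlb)
print(problems)
```

Here the altered label on LBTESTCD is reported, while AVAL is ignored because it does not come from SDTM.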

The ADaM subject-level analysis dataset is called ADSL and contains one record per subject. Its variables hold key information on subject disposition, demographics and baseline characteristics. Other variables within ADSL contain planned or actual treatment group information as well as key dates and times of the subject's participation in the study. Not all variables within ADSL may be used directly for analysis; some could be used in conjunction with other datasets for display or grouping purposes, or included simply as variables of interest for review. Given that the intention of ADSL is to contain variables that describe subjects, the analysis populations and treatment groups to which they belong, or prognostic factors, subject-level efficacy information should not be added here but should be placed in another domain. Variables from ADSL may be added to other ADaM domains where doing so aids output creation or data review.

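
As a rough illustration of these two points (one record per subject, and ADSL variables being copied onto other datasets), here is a minimal sketch in plain Python. The records and the helper names duplicate_subjects and add_adsl_variables are made up for the example; TRT01P and SAFFL are used as typical ADSL variable names.

```python
from collections import Counter

# Hypothetical ADSL records: one dictionary per subject.
adsl = [
    {"STUDYID": "XYZ123", "USUBJID": "XYZ123-001", "TRT01P": "Drug A", "SAFFL": "Y"},
    {"STUDYID": "XYZ123", "USUBJID": "XYZ123-002", "TRT01P": "Placebo", "SAFFL": "Y"},
]

# Hypothetical vital signs analysis records without treatment information.
advs = [
    {"USUBJID": "XYZ123-001", "PARAMCD": "SYSBP", "AVAL": 120.0},
    {"USUBJID": "XYZ123-001", "PARAMCD": "DIABP", "AVAL": 80.0},
    {"USUBJID": "XYZ123-002", "PARAMCD": "SYSBP", "AVAL": 135.0},
]


def duplicate_subjects(records):
    """Return the USUBJID values that appear on more than one record."""
    counts = Counter(rec["USUBJID"] for rec in records)
    return sorted(subj for subj, n in counts.items() if n > 1)


def add_adsl_variables(records, adsl_records, variables):
    """Copy the named ADSL variables onto each record, matching on USUBJID."""
    lookup = {rec["USUBJID"]: rec for rec in adsl_records}
    merged = []
    for rec in records:
        subject = lookup[rec["USUBJID"]]
        merged.append({**rec, **{var: subject[var] for var in variables}})
    return merged


# An empty list means ADSL has no more than one record per subject.
print(duplicate_subjects(adsl))
advs_with_trt = add_adsl_variables(advs, adsl, ["TRT01P", "SAFFL"])
```

The merged records can then be grouped by treatment or restricted to an analysis population without going back to ADSL.
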
Another main class of ADaM datasets is the Basic Data Structure (BDS), which contains one or more records per subject, per analysis parameter, per analysis timepoint. It is possible to add derived analysis parameters if required for an analysis. An example would be where a derivation uses results from a number of different parameters or where a mean is calculated at subject level from all the values collected for a subject. Derived records also may be added to support Last Observation Carried Forward (LOCF) or Worst Observation Carried Forward (WOCF) analyses.

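
To make the idea of derived LOCF records more concrete, here is a minimal sketch in plain Python. It assumes a simplified rule (carry the last observed value forward to any scheduled visit with no record) and marks the added rows with DTYPE set to "LOCF"; the data and the function name add_locf_records are hypothetical, and a real study would follow the rules in its analysis plan.

```python
# Hypothetical BDS records; subject 001 has no visit 3, subject 002 no visit 2.
advs = [
    {"USUBJID": "XYZ123-001", "PARAMCD": "SYSBP", "AVISITN": 1, "AVAL": 120.0},
    {"USUBJID": "XYZ123-001", "PARAMCD": "SYSBP", "AVISITN": 2, "AVAL": 118.0},
    {"USUBJID": "XYZ123-002", "PARAMCD": "SYSBP", "AVISITN": 1, "AVAL": 130.0},
    {"USUBJID": "XYZ123-002", "PARAMCD": "SYSBP", "AVISITN": 3, "AVAL": 128.0},
]


def add_locf_records(records, visits):
    """Return the records plus LOCF rows for scheduled visits with no value.

    `visits` is the ordered list of scheduled AVISITN values. Observed rows
    get a blank DTYPE; carried-forward rows get DTYPE = "LOCF".
    """
    observed = {(r["USUBJID"], r["PARAMCD"], r["AVISITN"]): r for r in records}
    keys = sorted({(r["USUBJID"], r["PARAMCD"]) for r in records})
    result = []
    for subj, paramcd in keys:
        last = None  # most recent observed record for this subject/parameter
        for visit in visits:
            rec = observed.get((subj, paramcd, visit))
            if rec is not None:
                result.append({**rec, "DTYPE": ""})
                last = rec
            elif last is not None:
                # Nothing observed at this visit: carry the last value forward.
                result.append({**last, "AVISITN": visit, "DTYPE": "LOCF"})
    return result


advs_locf = add_locf_records(advs, visits=[1, 2, 3])
locf_rows = [r for r in advs_locf if r["DTYPE"] == "LOCF"]
```

Keeping the derived rows alongside the observed ones, with a flag that tells them apart, lets the same dataset support both observed-case and LOCF analyses.
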
The BDS is especially useful for continuous value analyses such as presenting mean, median, standard deviation and so on. This may not be its only use, but for a domain to comply with the BDS standard, it must contain at the very least variables for study and subject identifiers, analysis parameter name and code, and analysis values. If any of these are absent, then the dataset does not fit the BDS description.
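
A minimal sketch of both points follows: a check for the minimum set of BDS variables and a simple continuous summary by parameter. It assumes the usual ADaM names STUDYID, USUBJID, PARAM, PARAMCD and AVAL or AVALC for the variables just described; the data and the function names are hypothetical.

```python
import statistics

REQUIRED = ("STUDYID", "USUBJID", "PARAMCD", "PARAM")

# Hypothetical BDS records for one parameter.
advs = [
    {"STUDYID": "XYZ123", "USUBJID": "XYZ123-001", "PARAMCD": "SYSBP",
     "PARAM": "Systolic Blood Pressure (mmHg)", "AVAL": 120.0},
    {"STUDYID": "XYZ123", "USUBJID": "XYZ123-002", "PARAMCD": "SYSBP",
     "PARAM": "Systolic Blood Pressure (mmHg)", "AVAL": 130.0},
    {"STUDYID": "XYZ123", "USUBJID": "XYZ123-003", "PARAMCD": "SYSBP",
     "PARAM": "Systolic Blood Pressure (mmHg)", "AVAL": 125.0},
]


def missing_bds_variables(records):
    """List the minimum BDS variables that are absent from the records."""
    present = set().union(*(r.keys() for r in records))
    missing = [v for v in REQUIRED if v not in present]
    if not ({"AVAL", "AVALC"} & present):
        missing.append("AVAL or AVALC")
    return missing


def summarise(records):
    """Return n, mean, median and standard deviation of AVAL per PARAMCD."""
    by_param = {}
    for r in records:
        by_param.setdefault(r["PARAMCD"], []).append(r["AVAL"])
    return {
        paramcd: {
            "n": len(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            # A sample standard deviation needs at least two values.
            "sd": statistics.stdev(values) if len(values) > 1 else None,
        }
        for paramcd, values in by_param.items()
    }


print(missing_bds_variables(advs))
summary = summarise(advs)
```

Because every parameter sits in the same AVAL column, one summary routine can serve all parameters in the dataset.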

A variant of the BDS is available for Time to Event (TTE) analyses, which are commonly used in therapeutic areas like oncology. This additionally contains variables for the origin date of risk, which is used as the start time in any TTE analysis, and for censoring of subjects in whom the event of interest is not observed.
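
As a sketch of how such a record might be derived, the function below builds one TTE row per subject. It assumes the commonly used variable names STARTDT (origin date), ADT (analysis date) and CNSR (0 for an event, 1 for censored), and counts days inclusively (ADT - STARTDT + 1), which is a convention rather than a requirement. The function name derive_tte, the parameter and the dates are hypothetical.

```python
from datetime import date


def derive_tte(usubjid, startdt, event_date, last_known_date):
    """Build one time-to-event record for a subject.

    If the event was observed, the analysis date is the event date and
    CNSR = 0; otherwise the subject is censored at the last known date
    and CNSR = 1.
    """
    if event_date is not None:
        adt, cnsr = event_date, 0
    else:
        adt, cnsr = last_known_date, 1
    return {
        "USUBJID": usubjid,
        "PARAMCD": "OS",
        "PARAM": "Overall Survival (days)",
        "STARTDT": startdt,
        "ADT": adt,
        "AVAL": (adt - startdt).days + 1,  # inclusive day count (assumed convention)
        "CNSR": cnsr,
    }


# Subject with an observed event.
event_rec = derive_tte("XYZ123-001", date(2017, 1, 1), date(2017, 1, 31), date(2017, 1, 31))
# Subject without an event, censored at the last known date.
censored_rec = derive_tte("XYZ123-002", date(2017, 1, 1), None, date(2017, 4, 10))
```

Keeping the origin date on the record means a reviewer can see how the time value was obtained without returning to the source data.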

In February 2016, CDISC published the Occurrence Data Structure (OccDS) for use in categorical analyses where summaries of frequencies and percentages of occurrence are planned. This is an extension of the previously published ADAE structure that contains extra variables for use with concomitant medication or medical history data. Data from other SDTM domains in the event or intervention classes may be mapped into OccDS if it meets their analysis needs. Some, such as exposure data, may be mapped to either BDS or OccDS depending on the analysis, and may even be split into two ADaM domains in studies where both categorical and continuous analyses are required.

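
The kind of summary that OccDS is designed to support can be sketched as follows: count the subjects with at least one occurrence of each term and express that as a percentage of the population. The records, the denominator of four subjects and the function name occurrence_summary are hypothetical; AEDECOD is used as the term variable because it is the dictionary-derived term carried over from the SDTM AE domain.

```python
# Hypothetical adverse event records in an occurrence structure:
# one record per subject per recorded event.
adae = [
    {"USUBJID": "XYZ123-001", "AEDECOD": "Headache"},
    {"USUBJID": "XYZ123-001", "AEDECOD": "Headache"},  # second episode, same subject
    {"USUBJID": "XYZ123-002", "AEDECOD": "Headache"},
    {"USUBJID": "XYZ123-003", "AEDECOD": "Nausea"},
]


def occurrence_summary(occurrences, n_subjects, term_var="AEDECOD"):
    """Return {term: (subjects with at least one occurrence, percentage)}."""
    subjects_by_term = {}
    for rec in occurrences:
        subjects_by_term.setdefault(rec[term_var], set()).add(rec["USUBJID"])
    return {
        term: (len(subjects), round(100 * len(subjects) / n_subjects, 1))
        for term, subjects in sorted(subjects_by_term.items())
    }


# The denominator would normally come from ADSL (for example, the number of
# subjects in the safety population); four is assumed here.
ae_table = occurrence_summary(adae, n_subjects=4)
print(ae_table)
```

Note that subjects are counted rather than events, so the subject with two headache episodes contributes once to the headache row.
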
Currently, ADaM supports the majority of analysis needs for clinical data. It may not be as prescriptive as SDTM, but it offers flexibility while at the same time ensuring that a set of analysis data standards can be put in place by a sponsor. ADaM datasets also can be submitted to a regulatory agency much like SDTM datasets. They have built-in traceability and are compatible with Define-XML, so that machine-readable data definitions can be supplied along with any detailed computational details.

