- Parallel method
- Retrospective method
- Linear method
- Hybrid method
Parallel method:
The advantages of this method are:
- The development of the SDTM and ADaM datasets is independent, and either can be completed at any time without input from the other.
- The creation of the SDTM can happen at the time of submission so there is no effort wasted if the clinical trial is either unsuccessful or not included in a submission.
- The independence of the SDTM and ADaM datasets allows for parallel project teams to perform the extraction, transformation, and load (ETL) processes. This may be important for outsourced projects.
- This method requires a minimum amount of re-engineering of existing processes within most pharmaceutical companies.
The disadvantages of this method are:
- Documentation for the two sets of data shares no similarity, and creating them in parallel decreases efficiency.
- Derivation of created variables in analysis datasets does not reference variables in the SDTM. This creates a significant disconnect between the two sets of submission data.
- The regulatory agency does not have the original DBMS extract with which to verify or explore derivations performed in the analysis datasets. Similarly, it would not have the DBMS-annotated CRFs to understand the original source of the derivations of the analysis variables.
- Any analysis programs submitted to the agency as analysis-level documentation have limited value since the source data are not available.
- Validation is necessary to ensure that similar records or variables in both SDTM and ADaM datasets are identical, such as the indication of which record is considered baseline.
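The validation described in the last point can be pictured as a record-level comparison of the two deliverables. The following Python sketch is only an illustration under stated assumptions: the in-memory record layout, the shared key (USUBJID, EGTESTCD, VISITNUM), the analysis flag name ABLFL, and the sample values are all invented for this example and are not taken from any particular study or tool.

```python
def baseline_mismatches(sdtm_rows, adam_rows,
                        key=("USUBJID", "EGTESTCD", "VISITNUM")):
    """Return the sorted keys whose baseline flag differs between the datasets."""
    # Index each dataset by the shared key; a missing flag is treated as "".
    sdtm = {tuple(r[k] for k in key): r.get("EGBLFL", "") for r in sdtm_rows}
    adam = {tuple(r[k] for k in key): r.get("ABLFL", "") for r in adam_rows}
    # Compare only records that are present in both datasets.
    return sorted(k for k in sdtm.keys() & adam.keys() if sdtm[k] != adam[k])


# Hypothetical data: the two teams chose different baseline records.
sdtm_eg = [
    {"USUBJID": "S1-001", "EGTESTCD": "QTC", "VISITNUM": 1, "EGBLFL": "Y"},
    {"USUBJID": "S1-001", "EGTESTCD": "QTC", "VISITNUM": 2, "EGBLFL": ""},
]
adam_eg = [
    {"USUBJID": "S1-001", "EGTESTCD": "QTC", "VISITNUM": 1, "ABLFL": ""},
    {"USUBJID": "S1-001", "EGTESTCD": "QTC", "VISITNUM": 2, "ABLFL": "Y"},
]
mismatches = baseline_mismatches(sdtm_eg, adam_eg)
```

In this made-up case both visits are reported as mismatches, which is the kind of discrepancy the parallel method has to detect after the fact because neither dataset was derived from the other.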
Retrospective method:
The advantages of this method are:
- The creation of the SDTM can happen at the time of submission so there is no effort wasted if the clinical trial is either unsuccessful or not included in a submission.
- As enhancements to the SDTM standards are released, the analysis datasets are not affected and any enhancement can be represented during the creation of the SDTM.
The disadvantages of this method are:
- The regulatory agency does not have the original DBMS extract with which to verify or explore derivations performed in the analysis datasets. Similarly, it would not have the DBMS-annotated CRFs to understand the original source of the derivations of the analysis variables.
- Any analysis programs submitted to the agency as analysis-level documentation have limited value since the source data are not available.
- Any date imputation or other type of hard coding performed during the creation of the analysis datasets would have to be undone, since the SDTM domains represent the data as collected.
- All CRF variables represented in the SDTM would need to be retained in the analysis datasets even if they are not used for analysis. This increases the complexity of documentation.
- Validation is necessary to ensure that the SDTM domains adequately represent the original source data. This step could be difficult and result in a loss of efficiency.
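To make the date imputation point concrete, the sketch below imputes a partial date for analysis and then reverses the imputation to recover the value as collected. It is a simplified Python illustration, not a description of any actual process: it assumes dates are collected as ISO 8601 strings with optional month and day, that missing parts are imputed to 01, and that the level of imputation is recorded in a flag ("M" for month and day imputed, "D" for day imputed). The function names are hypothetical.

```python
from datetime import date


def impute_partial_date(collected):
    """Impute a missing month and/or day as 01; return (date, flag)."""
    parts = collected.split("-")
    year = int(parts[0])
    month = int(parts[1]) if len(parts) > 1 else 1
    day = int(parts[2]) if len(parts) > 2 else 1
    # One part = year only, two parts = year-month, three parts = complete.
    flag = {1: "M", 2: "D", 3: ""}[len(parts)]
    return date(year, month, day), flag


def undo_imputation(imputed, flag):
    """Recover the as-collected partial date from an imputed date and its flag."""
    iso = imputed.isoformat()  # always YYYY-MM-DD
    if flag == "M":
        return iso[:4]   # only the year was collected
    if flag == "D":
        return iso[:7]   # year and month were collected
    return iso           # nothing was imputed


analysis_date, flag = impute_partial_date("2005-03")
as_collected = undo_imputation(analysis_date, flag)
```

The reversal works here only because the flag was kept. If the analysis datasets store imputed dates without recording what was imputed, the collected value cannot be reconstructed from them alone, which is part of what makes the retrospective method laborious.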
Linear method:
The advantages of this method are:
- Analysis programs submitted to the agency as analysis-level documentation utilize the SDTM domains as input and are thus useable and informative to the reviewer.
- Using SDTM domains as input to analysis datasets allows for the standardization of analysis dataset structures and programming methods to produce study report summaries.
- Traceability: each analysis value can be traced back through the SDTM domains to the collected data.
The disadvantages of this method are:
- The development of the analysis datasets relies on the completion of the SDTM domains.
- The SDTM domains are created for all clinical trials regardless of whether they will be part of a submission.
- This method is potentially more difficult to manage if data management and/or biostatistics is outsourced.
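The dependency and traceability that characterize the linear method can be illustrated with a small derivation that takes SDTM records as its only input. This is a minimal Python sketch under assumptions: the record layout and sample values are invented, the analysis variable names (PARAMCD, AVAL, BASE, CHG) follow common ADaM naming, and SRCSEQ is a hypothetical variable used here to carry the source record's sequence number.

```python
def derive_analysis_rows(sdtm_eg):
    """Build analysis rows from SDTM EG records, keeping a link to each source record."""
    # Baseline value per (subject, test), taken from the record flagged EGBLFL == "Y".
    base = {(r["USUBJID"], r["EGTESTCD"]): r["EGSTRESN"]
            for r in sdtm_eg if r.get("EGBLFL") == "Y"}
    rows = []
    for r in sdtm_eg:
        b = base.get((r["USUBJID"], r["EGTESTCD"]))
        rows.append({
            "USUBJID": r["USUBJID"],
            "PARAMCD": r["EGTESTCD"],
            "AVAL": r["EGSTRESN"],
            "BASE": b,
            # Change from baseline is left missing when no baseline exists.
            "CHG": None if b is None else r["EGSTRESN"] - b,
            "SRCSEQ": r["EGSEQ"],  # hypothetical traceability variable
        })
    return rows


# Hypothetical SDTM EG records for one subject.
sdtm_eg = [
    {"USUBJID": "S1-001", "EGSEQ": 1, "EGTESTCD": "QTC", "EGSTRESN": 410, "EGBLFL": "Y"},
    {"USUBJID": "S1-001", "EGSEQ": 2, "EGTESTCD": "QTC", "EGSTRESN": 425, "EGBLFL": ""},
]
analysis_rows = derive_analysis_rows(sdtm_eg)
```

Because the function reads nothing but SDTM variables, a reviewer holding the SDTM domains could rerun it, and each output row points back to one input record. The same property is why the analysis datasets cannot be started until the SDTM domains exist.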
Hybrid method:
With this method, the differences between SDTM Draft domains and SDTM Final domains are envisioned to be small.
The SDTM Final domains contain the subset of variables or records that are optimally created during the analysis or at the final stage of submission preparation.
An example is the creation of USUBJID. This variable is required in the SDTM and provides a unique key identifier for a given subject. In some situations, however, the creation of USUBJID cannot be defined until all studies are complete since a given subject may participate in multiple trials.
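The USUBJID case can be sketched as follows. This Python example is purely illustrative: it assumes that a person-level key (here the hypothetical field `person_key`) becomes available once all studies are complete, and that the first enrollment seen for a person supplies the identifier. Neither assumption is prescribed by the standard; each sponsor defines its own rule.

```python
def assign_usubjid(enrollments):
    """Map each (STUDYID, SUBJID) pair to one identifier per person across studies."""
    first_seen = {}
    result = {}
    for e in enrollments:
        # The first enrollment encountered for a person defines the identifier.
        usubjid = first_seen.setdefault(
            e["person_key"], f'{e["STUDYID"]}-{e["SUBJID"]}')
        result[(e["STUDYID"], e["SUBJID"])] = usubjid
    return result


# Hypothetical enrollments: person P1 took part in studies A and B.
enrollments = [
    {"STUDYID": "A", "SUBJID": "001", "person_key": "P1"},
    {"STUDYID": "B", "SUBJID": "017", "person_key": "P1"},
    {"STUDYID": "B", "SUBJID": "018", "person_key": "P2"},
]
usubjids = assign_usubjid(enrollments)
```

The identifier for subject 017 in study B depends on a record from study A, so it cannot be fixed while only study B's data are in hand. That is the reason for deferring the variable to the SDTM Final domains.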
Other examples include the expected variable, present in all Findings domains, that indicates which data record is considered the baseline value (e.g. ‘EGBLFL’). Since these indicator flags would likely be derived in the ADs, creating the Final SDTM domains retrospectively from the ADs prevents redundant derivation and eliminates the possibility of discord between the SDTM domain and the analysis dataset. Finally, population indicator variables, such as those for intent-to-treat or per-protocol status, can be optimally created in the AD and then placed in the supplemental qualifier domain.
The advantages of this method are:
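The flow from the analysis datasets back into the SDTM Final domains might look like the sketch below. It is a hedged Python illustration, not a definitive implementation: the record layouts, the analysis flag name ABLFL, the hypothetical link variable SRCSEQ, the population flag names ITT and PPROT, and the reduced set of supplemental qualifier columns are all assumptions made for this example.

```python
def finalize_eg(draft_eg, adam_eg):
    """Copy the baseline flag derived in the analysis dataset onto SDTM Draft records."""
    # Records the analysis dataset considers baseline, identified by subject and sequence.
    flagged = {(r["USUBJID"], r["SRCSEQ"]) for r in adam_eg if r.get("ABLFL") == "Y"}
    return [dict(r, EGBLFL="Y" if (r["USUBJID"], r["EGSEQ"]) in flagged else "")
            for r in draft_eg]


def population_suppqual(studyid, subject_rows, flags=("ITT", "PPROT")):
    """Turn subject-level population flags into supplemental qualifier rows for DM."""
    # Only populated flags produce a row.
    return [{"STUDYID": studyid, "RDOMAIN": "DM", "USUBJID": s["USUBJID"],
             "QNAM": f, "QVAL": s[f], "QORIG": "DERIVED"}
            for s in subject_rows for f in flags if s.get(f)]


# Hypothetical Draft domain, analysis dataset, and subject-level flags.
draft_eg = [{"USUBJID": "S1-001", "EGSEQ": 1}, {"USUBJID": "S1-001", "EGSEQ": 2}]
adam_eg = [{"USUBJID": "S1-001", "SRCSEQ": 1, "ABLFL": "Y"},
           {"USUBJID": "S1-001", "SRCSEQ": 2, "ABLFL": ""}]
final_eg = finalize_eg(draft_eg, adam_eg)
supp_dm = population_suppqual("S1", [{"USUBJID": "S1-001", "ITT": "Y", "PPROT": ""}])
```

Since the flag is derived once and copied rather than derived twice, the two deliverables cannot disagree about which record is baseline.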
- With a few possible exceptions, analysis programs submitted to the agency as analysis-level documentation utilize the SDTM domains as input and are thus useable and informative to the reviewer.
- Using SDTM domains as input to analysis datasets allows for standardization of analysis dataset structures and programming methods to produce study report summaries.
- The variables or records in the SDTM that need biostatistical input, such as the indication of baseline records or the creation of population flags, are created in harmony with the analysis datasets, so there is no possibility of discrepancy.
- If important, derived records can be added to the SDTM domains, thus providing the reviewer with both CRF and analysis records.
- Final completion of the SDTM domains can be done at the time of submission.
The disadvantages of this method are:
- The development of the analysis datasets relies on the completion of the SDTM domains.
- The SDTM domains are created for all clinical trials regardless of whether they will be part of a submission.
- This method is potentially more difficult to manage if data management and/or biostatistics is outsourced.
RECOMMENDATIONS:
Each organization will need to weigh the advantages and disadvantages of these methods when deciding on an implementation plan. For submissions that will be prepared in the near future, several of the above methods may need to be used in tandem to accommodate both legacy data and ongoing studies.
However, as CDISC standards are adopted within an organization, efficiencies should be gained if one method is used for all new studies going forward. Weighing the advantages and disadvantages of each method above, the linear and hybrid methods are the most parsimonious and are long-term solutions.