Team discussions regarding CDISC often bring in the mists of darkness, which obscure the landscape and prevent us from moving in a clear direction. Then, as if we weren’t confused enough, the discussion moves to SDTM, ADaM, and clinical databases, and we feel like we are spinning out of control. The land of clinical study data can be challenging and confusing, but I hope that a ray of light can help clear our minds and the path before us. Knowledge brings light to any discussion, and this post presents the basic details of the CDISC data standards.
Before we begin, a short list of abbreviations for your reference while reading this post:
ADaM = Analysis Data Model
CDASH = Clinical Data Acquisition Standards Harmonization
CDISC = Clinical Data Interchange Standards Consortium
SDTM = Study Data Tabulation Model
Clinical trials have been conducted since biblical times (Book of Daniel, chapter 1, verses 12-15) and likely even earlier. At the heart of every clinical trial is a scientific hypothesis, and to test each hypothesis, we collect data for analysis. Thus the data collected during a clinical trial is the tangible result of the hard work of the entire clinical team. That clinical trial data can then be analyzed multiple times and in multiple ways to address the study hypothesis, or to generate new hypotheses for future examination. Without effective data collection, clinical trials would be forgotten and we would rely on memories rather than data.
Throughout history, we have constantly improved the techniques, accuracy, and precision of our data collection to minimize errors in the data from clinical trials. These improvements were often the work of innovative scientists and entrepreneurs, and they include clinical monitoring, case report forms, electronic data capture, and clinical study databases. As data collection techniques improved and the speed at which we acquire new data increased, similar advances occurred in data analysis. Despite all these advances, improvements in data collection and analysis were not harmonized across pharmaceutical companies, therapeutic areas, or countries of the world. Thus clinical trial data was neither interchangeable nor accessible to researchers with new hypotheses.
The Clinical Data Interchange Standards Consortium (CDISC) was formed in 1997 to develop global standards and innovations to streamline medical research and ensure a link with healthcare. The CDISC mission is “to develop and support global, platform-independent data standards that enable information system interoperability to improve medical research and related areas of healthcare.” The consortium includes members from pharmaceutical companies, medical device manufacturers, regulatory authorities, and service providers. It publishes recommended standards for clinical trial data to further the goal of interoperability.
The Clinical Data Acquisition Standards Harmonization (CDASH) initiative was started to standardize the data collection process. When data is collected from a clinical trial, it is entered into an electronic database (like a large spreadsheet). Each item placed in the database normally includes unique identifying information. For example, if body weight is measured, the data is the body weight and the unique identifying information includes patient ID, date, time, study, study visit, etc. Each piece of information is put into a spot in the database called a “field” or “database field”. All values for a specific field, such as all body weights, can then be extracted from the database for analysis. Without a standard, one company might name the weight field “Weight” while another might use “WT”. The CDASH standards specify the names and types of the fields that can be used and how the data is organized. They are applied when developing case report forms (CRF) and electronic data capture (EDC) systems.
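To make the field-naming problem concrete, here is a minimal sketch of two companies recording the same weight under different field names and being mapped onto one shared vocabulary. The shared names "WEIGHT" and "SUBJECT_ID", the alias table, and the harmonize function are all hypothetical placeholders chosen for illustration; they are not taken from the actual CDASH specification.

```python
# Hypothetical alias table: sponsor-specific field name -> shared field name.
# The shared names are illustrative only, not real CDASH variable names.
FIELD_ALIASES = {
    "Weight": "WEIGHT",
    "WT": "WEIGHT",
    "PatientID": "SUBJECT_ID",
    "SUBJ": "SUBJECT_ID",
}


def harmonize(record):
    """Return a copy of the record with known field names replaced by shared names."""
    return {FIELD_ALIASES.get(name, name): value for name, value in record.items()}


# The same observation as two companies might store it.
company_a = {"PatientID": "001", "Weight": 72.5}
company_b = {"SUBJ": "001", "WT": 72.5}

harmonized_a = harmonize(company_a)
harmonized_b = harmonize(company_b)

# Once both use the same field names, the records are directly comparable.
print(harmonized_a == harmonized_b)
```

A collection standard removes the need for this kind of after-the-fact mapping, because every database uses the agreed field names from the start.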
After the data is collected into a clinical database, it must be converted into standard data tables to be used for analysis. The Study Data Tabulation Model (SDTM) defines the way in which individual observations from a clinical study are compiled. The basic concept is that each piece of data can be uniquely identified based on corresponding information (e.g., patient ID, date, time, study, study visit, procedure, measurement unit, etc.). Thus each row contains one piece of data and many columns of identifying information. While this method may lead to bloated files due to many blank columns, it is comprehensive and consistent across studies. The data in SDTM are divided into multiple “domains” such as demographics (DM), subject visits (SV), concomitant medications (CM), exposure (EX), adverse events (AE), ECG results (EG), laboratory results (LB), PK concentrations (PC), PK parameters (PP), and vital signs (VS). Each domain is usually constructed as a single file with the domain as the filename (e.g., CM.xpt). These SDTM data sets can be used directly for analysis if no further calculations are necessary.
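The "one row per piece of data" idea can be sketched as follows. The example takes one visit's measurements and writes each as its own row, with the identifying information repeated on every row. The column names are modeled on the vital signs (VS) domain but are simplified, and the study ID, subject ID, and values are made up; the to_long_rows function is a hypothetical helper, not part of any CDISC tool.

```python
def to_long_rows(study_id, subject_id, visit, date, measurements):
    """Turn one visit's measurements into one row per observation.

    measurements maps a test code to a (value, unit) pair.
    Column names are simplified and modeled on the VS domain.
    """
    rows = []
    for test_code, (value, unit) in measurements.items():
        rows.append({
            "STUDYID": study_id,    # which study
            "DOMAIN": "VS",         # which domain the row belongs to
            "USUBJID": subject_id,  # which subject
            "VSTESTCD": test_code,  # what was measured
            "VSORRES": value,       # the result as collected
            "VSORRESU": unit,       # the unit of the result
            "VISIT": visit,         # which study visit
            "VSDTC": date,          # when it was measured
        })
    return rows


# Made-up example: two measurements taken at one visit.
rows = to_long_rows(
    "STUDY01", "STUDY01-001", "BASELINE", "2020-01-15",
    {"WEIGHT": (72.5, "kg"), "HEIGHT": (175, "cm")},
)

for row in rows:
    print(row)
```

Each measurement becomes a separate row, so adding a new kind of measurement adds rows rather than new columns, which is what keeps the layout consistent across studies.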
Analysis data sets are created to enable the statistical and scientific analysis of the study results. The Analysis Data Model (ADaM) specifies the fundamental principles and standards to ensure that there is clear lineage from data collection to analysis. The ADaM data sets are the “authoritative source for all data derivations used in statistical analyses.” For example, if change from baseline in body weight were the primary efficacy variable, the SDTM data set would contain each body weight measurement. An ADaM data set would include the derived change from baseline in body weight for each time point to be included in the statistical analysis. The ADaM data sets are not required unless data derivations are performed based on SDTM data. In addition, ADaM data sets should be derived only from SDTM data sets.
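The body weight example can be sketched in a few lines. The code below assumes simplified tabulation rows (one body weight per subject per visit) and assumes that baseline is the visit labeled "BASELINE"; real studies define baseline in the analysis plan. The output column names (AVAL, BASE, CHG) are modeled on common ADaM conventions but simplified, and the function and sample values are hypothetical.

```python
def add_change_from_baseline(sdtm_rows, baseline_visit="BASELINE"):
    """Derive analysis rows with change from baseline for each measurement.

    Assumes one baseline row per subject, identified by the visit label.
    """
    # Look up each subject's baseline value.
    baselines = {
        row["USUBJID"]: row["VSORRES"]
        for row in sdtm_rows
        if row["VISIT"] == baseline_visit
    }
    analysis_rows = []
    for row in sdtm_rows:
        base = baselines.get(row["USUBJID"])
        analysis_rows.append({
            "USUBJID": row["USUBJID"],
            "AVISIT": row["VISIT"],
            "AVAL": row["VSORRES"],  # analysis value
            "BASE": base,            # baseline value, None if missing
            "CHG": None if base is None else row["VSORRES"] - base,
        })
    return analysis_rows


# Made-up body weights (kg) for one subject across three visits.
sdtm_rows = [
    {"USUBJID": "STUDY01-001", "VISIT": "BASELINE", "VSORRES": 72.5},
    {"USUBJID": "STUDY01-001", "VISIT": "WEEK 4", "VSORRES": 71.0},
    {"USUBJID": "STUDY01-001", "VISIT": "WEEK 8", "VSORRES": 70.25},
]

adam_rows = add_change_from_baseline(sdtm_rows)
for row in adam_rows:
    print(row["AVISIT"], row["CHG"])
```

The collected values stay untouched in the tabulation rows, and the derived value lives only in the analysis rows, which is the lineage from collection to analysis that the model is meant to preserve.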
While the detailed CDISC data standards contain additional details, the basic fact remains that data standardization allows for interoperability between software, analysts, and organizations. The CDASH standards have been applied to data collection to standardize the variable names in clinical databases. The SDTM data sets provide standards for organizing clinical trial data following database lock. And the ADaM data sets provide a connection between the SDTM data sets and final statistical analyses. In conclusion, these concepts are simply standards for clinical trial data.
Just as standards for electrical outlets permit developers to create fantastic electric equipment, the CDISC standards are intended to give drug and device developers the opportunity to analyze clinical trial data and make important healthcare discoveries. You can read more about the CDISC standards on the consortium website. Hopefully these basic details about the CDISC data standards will help you see more clearly during your next team meeting.
The FDA has mandated that sponsors use the CDISC SEND (Standard for Exchange of Nonclinical Data) format for electronic submissions of nonclinical study data. To learn how NCA data needs to be prepared, transformed, and formatted to be SEND-ready and how Phoenix tools can save time, reduce errors, and increase compliance, please watch this webinar.