What types of data are available?
Datasets undergo rigorous quality review upon submission to the Data Archive. Secondary researchers should be able to reproduce published results (or to approximate study findings, in cases where certain variables were modified during de-identification).
Datasets submitted to the NCTN/NCORP Data Archive include:
- Patient-level, de-identified clinical data
- Data dictionaries
- Limited metadata fields
- All variables used in the published analyses (with some exceptions, such as modifications required to de-identify data)
- Depending on the trial, correlative data (imaging, genomics, etc.) may also be available in dbGaP.
Which clinical trials have available data?
View the List of Available Datasets in the NCTN/NCORP Data Archive
As of May 7, 2026, data representing the experience of 26,224 patients from 14 trials submitted to the NCTN/NCORP Data Archive are available in dbGaP.
Disease sites for these trials include:
- Breast Neoplasm (3 Trials)
- Gastrointestinal Neoplasm (3 Trials)
- Lung, Mediastinal and Pleural Neoplasm (5 Trials)
- Reproductive System Neoplasm, Male (1 Trial)
- Hematopoietic Neoplasm/Leukemia (1 Trial)
With some exceptions, datasets submitted to the Data Archive represent clinical data from NCTN and NCORP clinical trials analyzed in primary publications or select non-primary publications from 2015 or later.
Data sharing is expected for the following trials and publications:
- Phase II/III and phase III trials: primary publications beginning January 1, 2015
- Phase II/III and phase III trials: select non-primary publications beginning April 1, 2018
- Phase II trials: select primary and non-primary publications beginning January 1, 2015
- Data are also available for select publications from before January 1, 2015.
Data are not available immediately upon publication. Prior to May 2, 2025, datasets submitted to the Data Archive were shared on a separate Data Archive webpage. These datasets are being released through dbGaP on an ongoing basis, along with newly submitted datasets from recent publications. Please check back frequently to see the most recent list of available datasets.