Centers align sequence reads to a reference genome to produce a Sequence Alignment/Map (SAM) format file. The SAM file is then converted into its binary equivalent, a Binary Alignment/Map (BAM) file, which can be indexed to allow efficient random access to the data it contains. BAM files are submitted to dbGaP. The DCC tracks the submitted BAM files and provides the relationship between BAM files and biospecimen IDs.
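The SAM-to-BAM conversion and indexing step can be sketched as follows. This is a minimal illustration, not a description of any center's actual pipeline: it assumes the widely used SAMtools command-line program is installed, and the function names, the `sample.sam` file name, and the `.sorted.bam` output naming are hypothetical. The sketch sorts by coordinate before indexing because `samtools index` requires coordinate-sorted input.

```python
import subprocess
from pathlib import Path


def sam_to_indexed_bam_commands(sam_path: str) -> list[list[str]]:
    """Build the samtools commands that turn a SAM file into a sorted, indexed BAM.

    The output naming (<name>.sorted.bam) is an illustrative assumption.
    """
    sam = Path(sam_path)
    bam = sam.with_suffix(".sorted.bam")
    return [
        # Sort by coordinate and write BAM (format inferred from the .bam extension).
        ["samtools", "sort", "-o", str(bam), str(sam)],
        # Build the index that allows random access by genomic region.
        ["samtools", "index", str(bam)],
    ]


def run(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling `run(sam_to_indexed_bam_commands("sample.sam"))` prints the two commands without executing them; passing `dry_run=False` runs them and requires both SAMtools and the input file to be present.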
The SAM and BAM specifications, along with tools for working with these files, are available for download.