The TCGA Client Side Validator is a Java application version of QCLive (the submission validator). Data submission centers run it locally to validate data archives before submitting them to the DCC.
For detailed information, including examples, see the TCGA Client Side Validator User's Guide.
11/19/2015 - Version 1.39.1
- XSD 2.7.0 support (Release Notes)
10/28/2015 - Version 1.39.0
- Exclude diagnostic images from UUID validation
07/09/2015 - Version 1.36.0
01/08/2015 - Version 1.35.0
11/19/2014 - Version 1.33.1
- Updated CSV to address BCR archives validation performance issue.
10/14/2014 - Version 1.33.0
07/10/2014 - Version 1.31.0
05/14/2014 - Version 1.30.1
05/08/2014 - Version 1.30.0
03/06/2014 - Version 1.29.0
01/09/2014 - Version 1.28.0
12/12/2013 - Version 1.27.2
11/15/2013 - Version 1.27.0
11/05/2013 - Version 1.26.0
08/28/2013 - Version 1.25.2
08/12/2013 - Version 1.25.0
- Release notes
- Beta release of the client-side validator. Previous releases below refer to the legacy standalone validator.
06/18/2013 - 'May2013-M2_patch'
- Reapplied fix for VCF header validator bug
06/13/2013 - 'May2013-M2'
- Added new disease TGCT (Testicular Germ Cell Tumors)
- Added new platforms Mixed_DNASeq and Mixed_DNASeq_Cont to support mutation calls from multiple sequencing platforms
06/07/2013 - 'April2013-M1-patch'
- Fixed bug in VCF header validator with trailing spaces in quoted comment strings
05/08/2013 - 'April2013-M1'
- Added support to Accept/Require/Validate SDRF and IDF files for GSCs in mage-tab archives
- XSD 2.6 support (Updated biotab generator, clinical loader, QcLive)
- Added support for MAF 2.3 and 2.4 specs
- Improved Standalone - decreased number of calls to web services
03/23/2013 - January2013-XSD2.6 patch
- Updated QcLive, Biotab files generator and clinical loader to support clinical XSD 2.6
02/25/2013 - January2013
- Running an archive with an empty RNASeq data file now generates an error
- MAF files will no longer be rejected if "germ" is present in the file name. Also, the Mutation_Status field will allow Somatic, Germline, LOH, None, or Unknown as per the previous specification. The DCC will manually ensure (with input from the GSCs) that germline MAFs and VCFs are in archives with controlled access.
- TCGA archive validator was updated so that an informative error is provided when values are missing from the SAMPLE header.
- TCGA archive validator was updated to provide a more informative error message when errors are detected in VCF submissions, and to no longer require an INDIVIDUAL declaration in the header of VCF files, in keeping with the specifications.
11/13/2012 - October2012
- Fixed bug where mismatch between a UUID's disease and the archive disease would fail the archive but was not reported to the submitter
- Fixed bug where uppercase letters in UUID in SDRF would incorrectly cause validation to fail
- Fixed bug where attempting to validate Extract Names with white space (such as non-TCGA controls) via webservices caused a 400 error
- Fixed bug incorrectly reporting missing assembly header in VCF files
- Fixed bug in VCF validation erroneously requiring samples listed in SAMPLE header to have data columns
- Fixed bug in VCF validation causing unexpected error when column header line is missing
- Updated allowed values for "Sequencer" column in MAF 2.3
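The uppercase-UUID fix above amounts to treating UUID letters case-insensitively. The following is only a minimal sketch of that idea, not the validator's actual code: the class name `UuidFormatCheck`, the method `isUuid`, and the choice of a regular expression are all assumptions made for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper; not part of the validator's real code base.
final class UuidFormatCheck {
    // Standard 8-4-4-4-12 hexadecimal layout, matched without regard to letter case.
    private static final Pattern UUID_PATTERN = Pattern.compile(
            "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
            Pattern.CASE_INSENSITIVE);

    /** Returns true when the value has the UUID layout, in upper, lower, or mixed case. */
    static boolean isUuid(String value) {
        return value != null && UUID_PATTERN.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(isUuid("123E4567-E89B-12D3-A456-426614174000"));
        System.out.println(isUuid("123e4567-e89b-12d3-a456-426614174000"));
    }
}
```

A format check like this says nothing about whether the UUID is registered with the DCC; that would still require the remote lookup.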
10/02/2012 - September2012-M1-patch
- Improved reporting of errors caused by rate-limiting response from DCC's UUID webservice
08/29/2012 - September2012-M1
- Assembly header in VCF file can contain any value
- Files in data archive MANIFEST.txt do not have to be in the SDRF
- Control aliquots do not have to match archive disease
08/22/2012 - June2012-M2-patch
- Fixed erroneous validation errors resulting from rejected webservice calls
- Note: validation using remote webservices (i.e. without the -noremote flag) and the -useuuid flag will take significantly more time now.
08/21/2012 - June2012-M2 - UUID Transition Release
- Removed validation of MetadataResource field in SAMPLE header of VCF files
- Fixed erroneous checking for UUIDs in RNASeq data file names
- Removed requirement for GCC submissions to use Data Matrix file format for all Level 2 data files
- Modified VCF validation to support UUID transition
- Modified MAF validation to support UUID transition
- Modified SDRF validation to support UUID transition
- Modified RNASeq validation to support an extra column for barcode, for UUID transition
- Modified Standalone validator to do UUID and barcode validation with the -noremote flag
07/10/2012 - May2012-M3-patch-2012-07-10
- Disable submission/deployment of MAF archives containing germline mutations
06/19/2012 - May2012-M3
- Dropped-barcode warnings are no longer falsely stated in emails to the user
- Fixed failure to clean up archive job after QCLive processes an archive
- Platform is now validated for CNTL archives
- Now validates that a CNTL archive contains only control files
- BCR archive with invalid data should fail with exception and it should not be in the distro dir
- Updated biotab generator to concatenate date fields to only show date_of elements
- QCLive now checks if SDRF references all data files across all batches
- Soundcheck no longer throws an exception when directory path contains a space
- Moved all deployed BCR XML, MAF and biotab files to the Open Access tier
- Fixed failure to add 'Do Not Use' annotation
- Removed 'Value for batch_number (....) should not include revision and series ' warning msgs from BiospecimenXMLValidator
05/18/2012 - Apr2012-patch-2012-05-18
- Fixed out of memory error for large VCF files
05/10/2012 - Apr2012
- Accept only MAF version 2.3 files
- MAF 2.3 now accepts multiple centers
- Update TCGA VCF validation to accept only version 1.1 files
- Fixed bug to not throw error when there is a missing GCC file in a submitted experiment
- RNASeqV2 archives now accepted
04/03/2012 - March2012
- QCLive now validates the XSD 2.5 changes for Biotab files
- QCLive now validates control aliquot barcodes
- QCLive and TCGA Archive Validator updated to accept GDAC submissions
- Updated biotab files with new protected/public elements
03/19/2012 - Dec2011
- Fixed batched calls to webservices to be under 4000 chars.
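Keeping a batched request under a character limit can be sketched as below. This is an illustrative assumption about the approach, not the validator's implementation: the class `BarcodeBatcher`, the method `batch`, and the comma separator are hypothetical, and only the 4000-character figure comes from the note above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; names and the comma separator are assumptions.
final class BarcodeBatcher {
    // Limit taken from the release note; the real service limit may be defined differently.
    static final int MAX_CHARS = 4000;

    /** Groups values into comma-separated batches, each strictly shorter than maxChars. */
    static List<String> batch(List<String> values, int maxChars) {
        List<String> batches = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String value : values) {
            if (value.isEmpty()) {
                continue; // nothing to send for an empty value
            }
            if (value.length() >= maxChars) {
                throw new IllegalArgumentException("Value alone exceeds the limit: " + value);
            }
            // Length of the batch if this value were appended (+1 for the comma).
            int needed = current.length() == 0
                    ? value.length()
                    : current.length() + 1 + value.length();
            if (needed >= maxChars) {
                batches.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(',');
            }
            current.append(value);
        }
        if (current.length() > 0) {
            batches.add(current.toString());
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("aaaa", "bbbb", "cccc");
        // A tiny limit is used here only to make the splitting visible.
        System.out.println(batch(ids, 10));
    }
}
```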
02/01/2012 - Dec2011
- No change since beta (see below)
01/26/2012 - Dec2011 PRE-RELEASE (BETA)
- More informative error/warning messages for submitted archives
- Files missing in SDRF give one warning message per file rather than multiple warnings per file
- Clinical xml dates are correctly validated
- NEW Allow and validate Level 2 RNASeq archives containing VCF files.
- Fixed bug where too many SDRF error emails were sent to user
- Fixed bugs with HumanMethylation450 and 27 data loading
- Improved Date handling for validating XML files
- Improved the error messages for hidden files within archives
- Shipped portions now have shipping dates included
- Blank values in SDRF now caught in QcLive and the standalone validator
- Fixed Java memory error when running a large MAF file
- Eliminated an erroneous empty barcode error when running a MAF archive through the standalone
- Improved barcode validation on the standalone validator
- Fixed an error when validating Affymetrix Human Exon 1.0 ST Array data
- Fixed bug with Barcode validation web service (Standalone validator)
12/07/2011 - 'Oct282011'
- Improved the validation process of SDRF files
- Improved the validation process of UUIDs
- Archives that contain hidden files or folders will now fail
- Fixed various small QCLive issues
- Fixed various Standalone Validator issues
- Fixed bug so that QCLive and TCGA Archive Validator are now able to accept TCGA VCF files for validation as per the specification
- Improved the validation process of auxiliary files from BCRs
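The hidden-file rule in this release can be illustrated with a short sketch. It assumes that "hidden" means a file or folder name starting with a dot (the Unix convention) and that archive entry names use "/" as the separator; the release notes do not define either point, and the class `HiddenEntryCheck` is hypothetical.

```java
// Hypothetical helper; the dot-prefix definition of "hidden" is an assumption.
final class HiddenEntryCheck {
    /** Returns true if any file or folder name in the entry path starts with a dot. */
    static boolean hasHiddenComponent(String entryName) {
        for (String part : entryName.split("/")) {
            // "." and ".." are path navigation, not hidden names.
            if (part.startsWith(".") && !part.equals(".") && !part.equals("..")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasHiddenComponent("archive/.hidden/file.txt"));
        System.out.println(hasHiddenComponent("archive/data.txt"));
    }
}
```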
10/27/2011 - 'Oct282011-M1'
- Fixed bugs so that QCLive and TCGA Archive Validator are now able to accept TCGA VCF files for validation as per the specification
- Fixed various small QCLive issues
- Improved the validation process of BCR archives
- Improved the validation process of MAF files
- Updated access locations for archives
- Fix made to disable barcode validation for everything in QCLive
- Improved the Level 3 Loader process of checking/creating new experiment records
10/24/2011 - 'Oct282011' PRE-RELEASE (BETA)
- This is a pre-release beta version of the validator
- Fixed validator hanging while validating MAF files (temporary fix)
- Fixed validator not displaying proper error messages
- Fixed validator error occurring while validating protein array SDRF
10/11/2011 - 'Sep302011'
- Allow standalone validator to warn about barcodes that have not been reported by the BCR
- Updated implementation of VCF validation
- Updated chromosome validation
- Removed validation of Wiggle (.wig) files
9/16/2011 - 'Sep022011-Patch'
- Fixed bug with validating shipped portion barcodes with alphanumeric plate IDs
- Fixed validation of VCF files to allow underscores in ID column
9/06/2011 - 'Sep022011'
- Bug fixes and improvements to VCF implementation for TCGA-specific rules
8/23/2011 - 'August052011'
- Updated VCF implementation for TCGA-specific rules
- Added support for Illumina HiSeq platform
07/21/2011 - 'July082011'
- BCR XSD version 2.4 Implementation
- Updated MAF validator to accommodate SOLiD or Illumina platforms. Also modified the Variant_Classification field and removed "Indel" from the list of acceptable values for Variant_Type.
- Implemented validator for Variant Call Format (VCF) version 4.1 files
- Now able to validate protein array level 2 data file names and tab delimited formats for text files
06/30/2011 - 3.9.1 RELEASE
- Fixed errors occurring during SDRF validation and IDF validation
- Fixed Clinical XML validator bug
06/15/2011 - 3.9 RELEASE
- Added validation of protein Level 3 archives
- Modified QCLive SDRF processor to handle UUIDs in Extract Name column
- Added Clinical XML file validation after dates obscuration
06/02/2011 - 3.8 RELEASE
- Latest release code included in this standalone validator.
05/04/2011 - 3.8 PRE-RELEASE (BETA)
- This is a pre-release beta version of the validator
- Added validation of miRNASeq archives
- Please do not submit miRNASeq archives until this version of the server-side software has been released
- Before submitting to the DCC, please run your archives against the stable release
03/23/2011 - 3.7.1 RELEASE
- Note: the validator released initially on 3/23 was missing a file; if you encounter a "no class def found error" please download the validator again
- Updated for latest server-side software release
- Updated RNASeq gene file validation
02/14/2011 - 3.6 RELEASE
- Added RNASeq data file and SDRF validation
- Also includes BCR XSD validators for 2.3 spec
11/24/2010 - 3.5.1 RELEASE
- Fixed build bug, all necessary modules now included
In BCR XML: the schemaLocation attribute in the top-level XML elements MUST have a value of the form
or the archive will fail.
11/08/2010 - 3.5 RELEASE
- Updated BCR XSD to version 2.3 spec
- Updated MAF file validator to 2.2 spec
10/20/2010 - 3.4 RELEASE
- Updated MAF validator to conform to 2.1 spec (make sure the line "#version 2.1" is added to the top of MAF files)
- Added WIG file validation
08/18/2010 - 3.3.1 RELEASE
- Patch fix for BCR biospecimen file validation -- validator will correctly validate aliquot barcodes now
08/16/2010 - 3.3 RELEASE
- Updated BCR XML validator to check new 2.x schema
- Enabled remote validation (using DCC Web Service) by default. To disable remote validation, run with -noremote flag.
06/23/2010 - 3.1.2 RELEASE
- Fixed MAF validation for finalized spec
06/02/2010 - 3.1.1 RELEASE
- The DCC has upgraded to Java version 1.6. Please update your environment to Java 1.6 before using the new validator.
- Fixed erroneous validation rules for new MAF files (INS and DEL variant types)
- Removed validation of sequencing phase for new MAF files
05/03/2010 - 3.1 RELEASE
- Added validation for new MAF files
- For version 2 and higher of the MAF specification a header line "#version 2.0" must be added or the validator will assume the MAF specification is version 1.x
- Made BCR clinical XML validation more flexible about formatting
- Improved error reporting for clinical validation errors
- Fixed incorrect barcode-validity error when processing clinical XML files
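The MAF version rule in this release (a "#version" header line, otherwise version 1.x is assumed) can be sketched as follows. This is a hedged illustration rather than the validator's code: the class `MafVersionSniffer` and method `detectVersion` are hypothetical, and the sketch assumes the header must be the very first line of the file.

```java
// Hypothetical helper; assumes the version header is the first line of the MAF file.
final class MafVersionSniffer {
    private static final String PREFIX = "#version ";

    /** Returns the declared version, or "1.x" when the first line carries no version header. */
    static String detectVersion(String firstLine) {
        if (firstLine != null && firstLine.startsWith(PREFIX)) {
            return firstLine.substring(PREFIX.length()).trim();
        }
        return "1.x";
    }

    public static void main(String[] args) {
        System.out.println(detectVersion("#version 2.0"));
        System.out.println(detectVersion("Hugo_Symbol\tEntrez_Gene_Id"));
    }
}
```

A file that omits the header is therefore checked against the older rules, which is why submitters of version 2 files need to add the line.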
04/02/2010 - 3.0.6 RELEASE
- Fixed error in barcode validation
03/30/2010 - 3.0.5 RELEASE
- Fixed barcode validation for barcodes with alphanumeric patient and plate IDs
- Enforced requirement in SDRF for Extract Name to be either a full aliquot barcode or a correctly represented internal-control
- Removed erroneous warning for repeated internal controls across batches
02/22/2010 - NOTE
- The zip file posted on 2/15 was missing several Java classes needed by the validator. If you downloaded it and have been getting errors about "class not found exceptions" please download the software again, as it has been updated to include these missing classes.
02/19/2010 - 3.0.4 RELEASE
- Require MAGE-TAB data matrix files to use "Composite Element REF" instead of "CompositeElement REF" in minor header, in compliance with MAGE-TAB specification (note: "Reporter REF" may also be used as appropriate).
- Fixed problems with data matrix validation when there are multiple types of level 2 files in the same column
- Add warning when samples are used in different batches of the same experiment
- Disallowed extra files in Level_1, Level_2, and Level_3 archives. Only files referred to in the SDRF, plus the MANIFEST.txt and DESCRIPTION.txt files may be in Level archives
- Improved date validation for BCR XML files
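The extra-files rule for Level archives in this release reduces to a set difference: whatever is in the archive, minus the files the SDRF refers to, minus MANIFEST.txt and DESCRIPTION.txt. The sketch below illustrates that under assumptions: the class `ExtraFileCheck` and method `findExtraFiles` are hypothetical, and file names are compared as exact strings.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; exact-string file name comparison is an assumption.
final class ExtraFileCheck {
    /** Returns archive files that are neither referenced by the SDRF nor one of the two allowed extras. */
    static Set<String> findExtraFiles(Set<String> archiveFiles, Set<String> sdrfFiles) {
        Set<String> extras = new TreeSet<>(archiveFiles); // sorted for stable reporting
        extras.removeAll(sdrfFiles);
        extras.remove("MANIFEST.txt");
        extras.remove("DESCRIPTION.txt");
        return extras;
    }

    public static void main(String[] args) {
        Set<String> archive = new HashSet<>(Arrays.asList(
                "data1.txt", "MANIFEST.txt", "DESCRIPTION.txt", "notes.doc"));
        Set<String> sdrf = new HashSet<>(Arrays.asList("data1.txt"));
        // Any name printed here would cause the archive to be rejected under this rule.
        System.out.println(findExtraFiles(archive, sdrf));
    }
}
```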
01/05/2010 - 3.0.2 RELEASE
- Bug fixes
11/19/2009 - 3.0.1 RELEASE
- For GSC archives, checks that MAF files are accompanied by VCF files
10/07/2009 - 3.0 RELEASE
- Checks new submission process type (3.x+) archives
- Performs all checks listed in the "MAF Checks" section of the TCGA GSC Standard Operating Procedures (PDF included in this release).
See notes in Description
Bug fix for null pointer exception during CGCC archive validation