NIH | National Cancer Institute | NCI Wiki  


...

divii_mlsmr.csv NSC to PubChem SID for the Diversity Set.

2D structures (

...

Old Versions)

All Open (June 2016 Release) 284176 compounds. 81 MB compressed, uncompresses to 710 MB

...

All Open (March 2012 Release) 273885 compounds. 64 MB compressed, uncompresses to 648 MB

Mechanistic Set

3D

...

structures (Old Versions)

Mechanistic Set

SMILES

...

strings (Old Versions)

SMILES strings: 237,771 structures in SMILES format. This database contains essentially all open structures in the NCI database up to about June 1995. It includes metal-containing compounds and other 'weird stuff', so it is up to the user to ascertain whether any given SMILES string is useful for the intended purpose. Because different conversion programs produce different output, two versions of the SMILES database are provided.

...

Converted using CACTVS: 4.4 MB compressed using standard Unix compress, uncompresses to ca. 15 MB (zip compressed). The program CACTVS v. 3.2 was used to convert the connection tables to SMILES strings. Thanks to Wolf-Dietrich Ihlenfeldt for providing us with the conversion scripts that handle the formal charge problem and other 'unusual stuff' in the NCI database.

Chemical Names (Old Versions)

chemnames_Aug2013.zip All chemical names available. The first field is the NSC number, the second field is the name, and the third field is the name type (mostly just the generic "Chemical Name"). The field separator is a "|". Note that we do not have chemical names for most of the compounds, and many of the names that are there are systematic names that might not be very useful for searching.
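
The three-field, "|"-separated layout described above can be read with a few lines of Python. This is only a minimal sketch: the function name parse_chemnames and the sample records are made up for illustration, and it assumes one record per line with no header row, which the description does not confirm.

```python
from collections import defaultdict


def parse_chemnames(lines):
    """Group (name, name_type) pairs by NSC number.

    Assumes one record per line in the form: NSC|name|name type
    (hypothetical helper; the layout is taken from the description above).
    """
    names_by_nsc = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\r\n")
        # Skip blank lines and lines that do not have at least two separators.
        if line.count("|") < 2:
            continue
        # Take the NSC number from the left and the name type from the right,
        # so that a "|" occurring inside the name itself would be preserved.
        # (Whether names can contain "|" is an assumption, not a known fact.)
        nsc, rest = line.split("|", 1)
        name, name_type = rest.rsplit("|", 1)
        names_by_nsc[nsc.strip()].append((name.strip(), name_type.strip()))
    return dict(names_by_nsc)


# Made-up placeholder records, not real entries from the file.
sample = [
    "1|placeholder name A|Chemical Name\n",
    "1|placeholder name B|Chemical Name\n",
    "2|placeholder name C|Chemical Name\n",
]

for nsc, entries in parse_chemnames(sample).items():
    print(nsc, entries)
```

To use it on the real data, the archive would first need to be unpacked (for example with Python's zipfile module) and the extracted text file opened with a suitable encoding; the name of the file inside the archive and its encoding are not stated here, so check them before relying on this sketch.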

...