Input:
  1. Thesaurus.owl (the inferred version)
  2. Value set definition xml files

Processing:

Step 1: Find all parent-child (subClassOf) and concept_in_subset (A8) relationships in NCI Thesaurus.

  parent-child (subClassOf):
    Potency Unit|C48470|Biscuit Dosing Unit|C111139
    Antiparasitic Agent|C276|Milbemycin A3 5-Oxime|C87654
    Application|C60755|Product Correspondence|C97024
    Country|C25464|Yemen|C17264
    Peptide Vaccine|C1752|HLA-A*0201-Restricted URLC10-VEGFR1-VEGFR2 Multipeptide Vaccine|C77900
    Pharmacologic Management|C21090|Pharmacotherapy Discontinuation|C128535
    Evaluator|C51824|Macroscopic Findings Evaluator|C119859
    CDISC Questionnaire Terminology|C100110|CDISC Questionnaire DLQI Test Name Terminology|C119069
    Malignant Osteoblast|C36901|Malignant Fusiform Osteoblast|C67521
    Receptor|C18106|Programmed Cell Death 1 Ligand 1|C96024
    WISP2 Gene|C21417|WISP2 wt Allele|C52966
    ... ...

  concept_in_subset:
    C61305|A8|C63923
    C61305|A8|C128784
    C2048|A8|C128784
    C123164|A8|C90259
    C123164|A8|C123272
    C123165|A8|C90259
    C123165|A8|C123272
    C120556|A8|C116978
    C120556|A8|C116977
    C120556|A8|C128784
    C2688|A8|C116978
    C2688|A8|C116977
    ... ...

Step 2: Parse value set definition xml files to get data required for resolving the corresponding value sets.

Step 3: Resolve all value sets based on the data found in Steps 1 and 2 above. This step will produce:
  - a set of concepts that belong to at least one value set
  - a one to many mapping from the above concept codes to the URIs of value sets to which they belong.
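
Step 3 can be sketched in Python roughly as follows. This is a minimal illustration, not the actual resolver. It assumes that each value set definition reduces to either "all concepts carrying an A8 association to a given subset concept" or "all descendants of a given root concept", and that in an A8 row the first field is the member and the third is the subset. The helper names, the (kind, code) definition format, and the example.org URI are hypothetical; the sample rows are taken from Step 1.

```python
from collections import defaultdict, deque

# Sample rows copied from the Step 1 listings.
PARENT_CHILD = [
    "Country|C25464|Yemen|C17264",
    "Evaluator|C51824|Macroscopic Findings Evaluator|C119859",
]
CONCEPT_IN_SUBSET = [
    "C61305|A8|C63923",
    "C61305|A8|C128784",
    "C2048|A8|C128784",
]


def build_children(rows):
    """Map each parent code to the set of its direct child codes."""
    children = defaultdict(set)
    for row in rows:
        _parent_name, parent_code, _child_name, child_code = row.split("|")
        children[parent_code].add(child_code)
    return children


def build_subset_members(rows):
    """Map each subset code to the member codes linked to it by A8.

    Assumption: rows read member|A8|subset.
    """
    members = defaultdict(set)
    for row in rows:
        member_code, association, subset_code = row.split("|")
        if association == "A8":
            members[subset_code].add(member_code)
    return members


def descendants(root, children):
    """Return every code strictly below root (breadth-first, cycle-safe)."""
    seen, queue = set(), deque([root])
    while queue:
        for child in children.get(queue.popleft(), ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def resolve(definitions, children, members):
    """Return a one-to-many mapping: concept code -> set of value set URIs.

    definitions maps URI -> (kind, code); kind is "subset" or "tree"
    (a hypothetical simplification of the value set definition files).
    """
    concept_to_uris = defaultdict(set)
    for uri, (kind, code) in definitions.items():
        if kind == "subset":
            codes = members.get(code, set())
        else:
            codes = descendants(code, children)
        for concept in codes:
            concept_to_uris[concept].add(uri)
    return concept_to_uris


if __name__ == "__main__":
    definitions = {
        "http://evs.nci.nih.gov/valueset/C63923": ("subset", "C63923"),
        "http://evs.nci.nih.gov/valueset/C128784": ("subset", "C128784"),
        "http://example.org/valueset/country-tree": ("tree", "C25464"),  # hypothetical
    }
    mapping = resolve(definitions,
                      build_children(PARENT_CHILD),
                      build_subset_members(CONCEPT_IN_SUBSET))
    for code in sorted(mapping):
        for uri in sorted(mapping[code]):
            print(f"{code}|{uri}")
```

The keys of the returned mapping are the set of concepts that belong to at least one value set; the values give the one-to-many mapping to value set URIs.
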

Resolve value sets:
  1|CDISC_Questionnaire_Terminology.txt|http://evs.nci.nih.gov/valueset/C100110|3346
  2|CDISC_Questionnaire_Category_Terminology.txt|http://evs.nci.nih.gov/valueset/C100129|402
  3|CDISC_SDTM_Relationship_to_Subject_Terminology.txt|http://evs.nci.nih.gov/valueset/C100130|58
  4|CDISC_Questionnaire_ADAS-Cog_CDISC_Version_Test_Name_Terminology.txt|http://evs.nci.nih.gov/valueset/C100131|110
  5|CDISC_Questionnaire_ADAS-Cog_CDISC_Version_Test_Code_Terminology.txt|http://evs.nci.nih.gov/valueset/C100132|110
  6|CDISC_Questionnaire_BPRS-A_Test_Name_Terminology.txt|http://evs.nci.nih.gov/valueset/C100133|18
  7|CDISC_Questionnaire_BPRS-A_Test_Code_Terminology.txt|http://evs.nci.nih.gov/valueset/C100134|18
  8|CDISC_Questionnaire_EQ-5D-3L_Test_Name_Terminology.txt|http://evs.nci.nih.gov/valueset/C100135|6
  9|CDISC_Questionnaire_EQ-5D-3L_Test_Code_Terminology.txt|http://evs.nci.nih.gov/valueset/C100136|6
  ... ...
  735|Product_Characteristic_ICSR_Terminology.txt|http://evs.nci.nih.gov/valueset/C99174|4
  736|Substance_Administration_ICSR_Terminology.txt|http://evs.nci.nih.gov/valueset/C99175|1
  737|SPL_Miscellaneous_Identifier_Types_Terminology.txt|http://evs.nci.nih.gov/valueset/C99288|3
  738|CDISC_SDTM_Laterality_Terminology.txt|http://evs.nci.nih.gov/valueset/NICHD/C99073|7
  739|CDISC_SDTM_Directionality_Terminology.txt|http://evs.nci.nih.gov/valueset/NICHD/C99074|32
  740|NCIt_Neoplasm_Tree.txt|http://ncit:Neoplasm|8479
  741|NDFRT_Mechanism_of_Action.txt|http://ndfrt:MoA|1
  742|NDFRT_Physiologic_Effects.txt|http://ndfrt:PE|1
  743|NDFRT_Structural_Class.txt|http://ndfrt:SC|1

Concepts that belong to at least one value set:
  (1) C105289
  (2) C71913
  (3) C71914
  (4) C71911
  (5) C71912
  (6) C71910
  (7) C105293
  (8) C105292
  (9) C105295
  (10) C105294
  (11) C71908
  (12) C18772
  (13) C105297
  (14) C71907
  (15) C105296
  ... ...
  (49288) C16950
  (49289) C16953
  (49290) C16952
  (49291) C52188
  (49292) C52187
  (49293) C52185
  (49294) C52186
  (49295) C52183
  (49296) C52184
  (49297) C52181
  (49298) C52182
  (49299) C16955

Mapping from the above concept codes to the URIs of value sets:
  C105289|http://evs.nci.nih.gov/valueset/C100110
  C105289|http://evs.nci.nih.gov/valueset/C105139
  C105289|http://evs.nci.nih.gov/valueset/C105140
  C105289|http://evs.nci.nih.gov/valueset/C61410
  C105289|http://evs.nci.nih.gov/valueset/C66830
  C71913|http://evs.nci.nih.gov/valueset/C63923
  C71914|http://evs.nci.nih.gov/valueset/C63923
  C71911|http://evs.nci.nih.gov/valueset/C63923
  C71912|http://evs.nci.nih.gov/valueset/C63923
  C71910|http://evs.nci.nih.gov/valueset/C63923
  C105293|http://evs.nci.nih.gov/valueset/C100110
  C105293|http://evs.nci.nih.gov/valueset/C105139
  C105293|http://evs.nci.nih.gov/valueset/C105140
  ... ...
  C52183|http://evs.nci.nih.gov/valueset/C63923
  C52184|http://evs.nci.nih.gov/valueset/C128784
  C52184|http://evs.nci.nih.gov/valueset/C63923
  C52181|http://evs.nci.nih.gov/valueset/C63923
  C52182|http://evs.nci.nih.gov/valueset/C128784
  C16955|http://evs.nci.nih.gov/valueset/C106478
  C16955|http://evs.nci.nih.gov/valueset/C106479
  C16955|http://evs.nci.nih.gov/valueset/C118466
  C16955|http://evs.nci.nih.gov/valueset/C61410
  C16955|http://evs.nci.nih.gov/valueset/C66830
  C16955|http://evs.nci.nih.gov/valueset/C90259
  ... ...

Step 4: Construct a new owl file that includes the concepts belonging to the set of concepts found above.
  - Remove all but presentational properties (owl:AnnotationProperty) from the content of the corresponding NCIt (or NDFRT) concept.
  - Remove all relationships (rdfs:subClassOf, owl:ObjectProperty, and owl anonymous classes), including equivalent classes, restrictions, associations, etc.
  - Add new properties where the property name could be Value Set (a new owl:AnnotationProperty needs to be defined) and the property value would be the URI of the value set.
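
As a rough illustration of Step 4, the sketch below filters a tiny RDF/XML fragment with Python's standard xml.etree.ElementTree. It is not the actual generator. The NCIt namespace URI, the Value_Set element name, the list of tags treated as relationships, and the sample fragment (including the placeholder codes C_PARENT and C_OTHER) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
# Assumed default namespace of Thesaurus.owl.
NCIT = "http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#"

# Hypothetical name for the new annotation property.
VALUE_SET_TAG = f"{{{NCIT}}}Value_Set"

# Child elements treated as relationships and dropped (assumed, not exhaustive).
RELATIONSHIP_TAGS = {
    f"{{{RDFS}}}subClassOf",
    f"{{{OWL}}}equivalentClass",
}

# Toy input; C_PARENT and C_OTHER are placeholders, not real NCIt codes.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}"
         xmlns:owl="{OWL}" xmlns="{NCIT}">
  <owl:Class rdf:about="#C16955">
    <rdfs:label>Parity</rdfs:label>
    <code>C16955</code>
    <rdfs:subClassOf rdf:resource="#C_PARENT"/>
  </owl:Class>
  <owl:Class rdf:about="#C_OTHER">
    <rdfs:label>Concept outside every value set</rdfs:label>
  </owl:Class>
</rdf:RDF>"""


def build_value_set_owl(root, concept_to_uris):
    """Filter root in place and return it.

    - classes whose code is not in concept_to_uris are removed
    - relationship children of the remaining classes are removed
    - one Value_Set element is added per value set URI
    """
    # Declare the new annotation property once at the top level.
    ET.SubElement(root, f"{{{OWL}}}AnnotationProperty",
                  {f"{{{RDF}}}about": "#Value_Set"})

    for cls in list(root.findall(f"{{{OWL}}}Class")):
        about = cls.get(f"{{{RDF}}}about", "")
        code = about.rsplit("#", 1)[-1]
        uris = concept_to_uris.get(code)
        if not uris:
            root.remove(cls)
            continue
        for child in list(cls):
            if child.tag in RELATIONSHIP_TAGS:
                cls.remove(child)
        for uri in sorted(uris):
            ET.SubElement(cls, VALUE_SET_TAG).text = uri
    return root


if __name__ == "__main__":
    mapping = {
        "C16955": {
            "http://evs.nci.nih.gov/valueset/C61410",
            "http://evs.nci.nih.gov/valueset/C66830",
        }
    }
    result = build_value_set_owl(ET.fromstring(SAMPLE), mapping)
    print(ET.tostring(result, encoding="unicode"))
```

A full implementation would also have to drop top-level owl:ObjectProperty declarations and anonymous restriction classes, which this sketch leaves out.
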

Example: C16955 (P8: VALUE SET annotation property)

  Parity
  A measurement of the total number of live-born offspring a female has delivered.|CDISC
  The number of pregnancies reaching 20 weeks and 0 days of gestation or beyond, regardless of the number of fetuses or outcomes.|NCI|NICHD
  The number of pregnancies reaching 20 weeks and 0 days of gestation or beyond, regardless of the number of fetuses or outcomes.|NICHD
  BRTHLVN|PT|CDISC|SDTM-RPTESTCD
  Live Birth Count|SY|NCI
  Number of Live Births|PT|CDISC|SDTM-RPTEST
  Number of Live Births|SY|CDISC
  Para|SY|NCI
  Parity|PT|NCI
  Parity|PT|NICHD
  C0030563
  C16955
  CDISC
  NICHD
  Organism Attribute
  Parity
  Parity
  Parity

Output: NCI_Thesaurus_Value_Set.owl generated above.

Usage: This owl file, once loaded, can be used for:
  - Code search: (CNS union?)
  - Name search: (CNS union?) restrictToMatchingDesignations, restrictToMatchingProperties
  - Resolve value set: Search concepts with a property matching a given URI.

All functions can be achieved through the native LexEVSAPI. Possible convenience method - restrictToMatchingProperties on multiple matching values.

Implementation Details:
  - concepts in NDFRT or other terminologies. (use the namespace prefix)
  - assignment of coding scheme name and version, and namespace at load time.
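
The "Resolve value set" usage amounts to a lookup in the opposite direction of the concept-to-URI mapping: given a value set URI, return the concepts that carry it as a property value. The sketch below shows that idea on a few rows from the mapping listing. It is plain Python over pipe-delimited rows and does not use or represent the LexEVSAPI; the function names are hypothetical.

```python
from collections import defaultdict

# Rows copied from the concept-to-URI mapping listing.
MAPPING_ROWS = [
    "C105289|http://evs.nci.nih.gov/valueset/C100110",
    "C71913|http://evs.nci.nih.gov/valueset/C63923",
    "C71914|http://evs.nci.nih.gov/valueset/C63923",
    "C52184|http://evs.nci.nih.gov/valueset/C128784",
    "C52184|http://evs.nci.nih.gov/valueset/C63923",
]


def index_by_uri(rows):
    """Build value set URI -> set of concept codes from code|uri rows."""
    by_uri = defaultdict(set)
    for row in rows:
        code, uri = row.split("|", 1)
        by_uri[uri].add(code)
    return by_uri


def resolve_value_set(by_uri, uri):
    """Return the sorted concept codes whose value set property equals uri."""
    return sorted(by_uri.get(uri, ()))


if __name__ == "__main__":
    index = index_by_uri(MAPPING_ROWS)
    print(resolve_value_set(index, "http://evs.nci.nih.gov/valueset/C63923"))
```
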