TC - JDI Train Set Test
I. Testing target:
Three tables are generated for the TC database in the JDI train set. They are:
These tables are tested (by comparing to the previous release) after they are generated. The source file details of each release are:
Release | JDs | MEDLINE & Year | MRCON |
---|---|---|---|
2007 | Lisp file | MEDLINE 2004, DA: 1999, 2000, 2001 | 2003AC |
2004.99.01 | lsi2006.xml | MEDLINE 2004, DA: 1999, 2000, 2001 | 2003AC |
2004.00.02 | lsi2006.xml | MEDLINE 2004, DA: 2000, 2001, 2002 | 2003AC |
2004.01.03 | lsi2006.xml | MEDLINE 2004, DA: 2001, 2002, 2003 | 2003AC |
2005.02.04 | lsi2006.xml | MEDLINE 2005, DA: 2002, 2003, 2004 | 2004AC |
2006.03.05 | lsi2006.xml | MEDLINE 2006, DA: 2003, 2004, 2005 | 2005AC |
2007.04.06 | lsi2006.xml | MEDLINE 2007, DA: 2004, 2005, 2006 | 2006AD |
2008 | lsi2007.xml | MEDLINE 2008, DA: 2005, 2006, 2007 | 2007AC |
Release | JDs | MEDLINE & Year | MRCON |
---|---|---|---|
2007 | Lisp file | MEDLINE 2004, DA: 1999, 2000, 2001 | 2003AC |
2008.99.01 | lsi2007.xml | MEDLINE 2008, DA: 1999, 2000, 2001 | 2007AC |
2008.00.02 | lsi2007.xml | MEDLINE 2008, DA: 2000, 2001, 2002 | 2007AC |
2008.01.03 | lsi2007.xml | MEDLINE 2008, DA: 2001, 2002, 2003 | 2007AC |
2008.02.04 | lsi2007.xml | MEDLINE 2008, DA: 2002, 2003, 2004 | 2007AC |
2008.03.05 | lsi2007.xml | MEDLINE 2008, DA: 2003, 2004, 2005 | 2007AC |
2008.04.06 | lsi2007.xml | MEDLINE 2008, DA: 2004, 2005, 2006 | 2007AC |
2008.05.07 | lsi2007.xml | MEDLINE 2008, DA: 2005, 2006, 2007 | 2007AC |
II. Testing Procedures:
entity name: word/Mh/Sh | JDID | count score |
---|---|---|
shell> flds 1,2,3 WordJdidWcDcTable.txt > WordJdidWcTable.txt
shell> flds 1,2,4 WordJdidWcDcTable.txt > WordJdidDcTable.txt
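For readers without the `flds` utility, the column selection performed by the two commands above can be sketched in Python. This is only an illustration under stated assumptions: the fields are taken to be separated by `|`, the four columns are taken to be entity, JDID, word count, and document count, and the helper name `select_fields` and the sample rows are hypothetical.

```python
def select_fields(lines, fields, sep="|"):
    """Keep only the given 1-based field positions from each line."""
    selected = []
    for line in lines:
        parts = line.rstrip("\n").split(sep)
        selected.append(sep.join(parts[i - 1] for i in fields))
    return selected


# Hypothetical rows (assumed layout: entity | JDID | word count | document count).
rows = ["heart|JD019|120|45", "liver|JD044|80|30"]

wc_rows = select_fields(rows, [1, 2, 3])  # mirrors "flds 1,2,3"
dc_rows = select_fields(rows, [1, 2, 4])  # mirrors "flds 1,2,4"

print(wc_rows)
print(dc_rows)
```

Under these assumptions, the first call keeps the word-count column and the second keeps the document-count column, matching the two output files named in the commands.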
entity name: word/Mh/Sh | Similarity distance |
---|---|
Please note that some entities have no JD score in common between the two releases being compared. In such cases, the similarity distance is NaN and should not be compared. For example, in the comparison of the 2007 and 2007+ releases, four MeSH main headings fall into this category:
Main Heading | Not common JDs, 2007 | Not common JDs, 2007+ |
---|---|---|
butirosin sulfate | JD007 | JD136 |
capreomycin sulfate | JD007 | JD136 |
certificate of need | JD027 | |
congenital, hereditary, and neonatal diseases and abnormalities | JD006 | |
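The NaN rule described above can be sketched as follows. The source does not state which similarity measure is used, so cosine similarity over the shared JDs stands in here purely as an assumption. The function names and the numeric scores are hypothetical. Only the JD identifiers in the first example come from the table above.

```python
import math


def similarity(scores_a, scores_b):
    """Cosine similarity over JDs present in both releases (assumed metric).

    Returns NaN when the two releases share no JD for this entity.
    """
    common = sorted(set(scores_a) & set(scores_b))
    if not common:
        return math.nan
    dot = sum(scores_a[jd] * scores_b[jd] for jd in common)
    norm_a = math.sqrt(sum(scores_a[jd] ** 2 for jd in common))
    norm_b = math.sqrt(sum(scores_b[jd] ** 2 for jd in common))
    if norm_a == 0 or norm_b == 0:
        return math.nan
    return dot / (norm_a * norm_b)


def mean_ignoring_nan(values):
    """Average only the comparable (non-NaN) values; NaN if none remain."""
    valid = [v for v in values if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else math.nan


# An entity with JD007 in one release and JD136 in the other: no common JD.
no_overlap = similarity({"JD007": 0.5}, {"JD136": 0.5})

# An entity with identical (hypothetical) scores in both releases.
identical = similarity({"JD001": 3.0, "JD002": 4.0},
                       {"JD001": 3.0, "JD002": 4.0})

print(math.isnan(no_overlap))
print(identical)
print(mean_ignoring_nan([identical, no_overlap, 0.5]))
```

Filtering NaN values before aggregating is one way to keep such entities from distorting a per-release comparison. Whether the reported figures are aggregated this way is not stated in the source.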
III. Testing Results:
Please refer to JDI similarity tests for the TC annual release.
2007 to 2008
Versions | Mh-Dc | Sh-Dc | Word-Dc | Word-Wc |
---|---|---|---|---|
2007-2008.99.01 | 0.9913 | 0.9949 | 0.9923 | 0.9911 |
2008.99.01-2008.00.02 | 0.9903 | 0.9970 | 0.9793 | 0.9737 |
2008.00.02-2008.01.03 | 0.9892 | 1.0000 | 0.9772 | 0.9723 |
2008.01.03-2008.02.04 | 0.9894 | 1.0000 | 0.9795 | 0.9739 |
2008.02.04-2008.03.05 | 0.9918 | 1.0000 | 0.9808 | 0.9753 |
2008.03.05-2008.04.06 | 0.9900 | 1.0000 | 0.9795 | 0.9742 |
2008.04.06-2008.05.07 | 0.9894 | 1.0000 | 0.9797 | 0.9742 |
Compare to 2007
Versions | Mh-Dc | Sh-Dc | Word-Dc | Word-Wc |
---|---|---|---|---|
2007-2008.99.07 | 0. | 0. | 0. | 0. |
2007-2008.00.07 | 0. | 0. | 0. | 0. |
2007-2008.01.07 | 0. | 0. | 0. | 0. |
2007-2008.02.07 | 0.9585 | 0.9929 | 0.8739 | 0.8618 |
2007-2008.03.07 | 0.9532 | 0.9929 | 0.8568 | 0.8443 |
2007-2008.04.07 | 0.9507 | 0.9931 | 0.8516 | 0.8387 |
2007-2008.05.07 | 0.9464 | 0.9932 | 0.8472 | 0.8337 |
2007-2008.06.07 | 0.9398 | 0.9930 | 0.8432 | 0.8293 |
2007-2008.07.07 | 0.9275 | 0.9923 | 0.8428 | 0.8283 |
Compare to 2008
Versions | Mh-Dc | Sh-Dc | Word-Dc | Word-Wc |
---|---|---|---|---|
2008-2008.99.07 | 0. | 0. | 0. | 0. |
2008-2008.00.07 | 0. | 0. | 0. | 0. |
2008-2008.01.07 | 0. | 0. | 0. | 0. |
2008-2008.02.07 | 0.9936 | 1.0000 | 0.9893 | 0.9857 |
2008-2008.03.07 | 0.9948 | 1.0000 | 0.9920 | 0.9891 |
2008-2008.04.07 | 0.9969 | 1.0000 | 0.9957 | 0.9939 |
2008-2008.05.07 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
2008-2008.06.07 | 0.9962 | 1.0000 | 0.9935 | 0.9906 |
2008-2008.07.07 | 0.9848 | 1.0000 | 0.9803 | 0.9736 |