Text Categorization
Pre-Process: Jid-Ta-Jds
Description:
This file contains the Journal ID (JID), journal title (TA), and associated Journal Descriptors (JDs) for each journal, taken from the List of Serials Indexed file lsi${YEAR}.xml. For the 2004 training set, the file was maintained manually by NLM and Susanne. It was static and was provided by Susanne as "jid-ta-jd.im.20031201.mod.fixed.l".
For the Java 2007 release, we derived this file from the List of Serials Indexed file lsi2006.xml. For the 2008 release, we use lsi2007.xml.
Input:
By NLM:
ftp://ftp.nlm.nih.gov/online/journals/lsi2007.xml
Java File & Algorithm:
GenerateJidTaJdsFromLsi.java
parse lsi.xml file
Find xml tag <NlmUniqueID> for Journal ID, JID
Find xml tag <MedlineTA> for Journal Title, TA
Find xml tag <BroadJournalHeading> for Journal Descriptors, JDs
Find xml tag <BroadJournalHeadingList> for the beginning of JDs
print out information in the new format to file: jidTaJds.out
perform unique sort on jidTaJds.out to get jidTaJds.txt
(sort -u jidTaJds.out > jidTaJds.txt)
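The steps above can be sketched in Java as follows. This is a minimal illustration and not the actual GenerateJidTaJdsFromLsi.java: the class name JidTaJdsSketch, the pipe-delimited output layout (JID|TA|JD1|JD2...), and the name of the per-journal record element (passed in as recordTag) are all assumptions, since the page does not specify them. A TreeSet stands in for the separate "sort -u" step, so the sketch writes jidTaJds.txt directly.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class JidTaJdsSketch {

    // Text of the first descendant element with the given tag, or "" if absent.
    static String firstText(Element record, String tag) {
        NodeList nodes = record.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    // Builds one line per journal record; the TreeSet sorts and de-duplicates.
    // recordTag is the element wrapping one journal (an assumption; check the real lsi file).
    static SortedSet<String> extract(InputStream xml, String recordTag) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
        SortedSet<String> lines = new TreeSet<>();
        NodeList records = doc.getElementsByTagName(recordTag);
        for (int i = 0; i < records.getLength(); i++) {
            Element record = (Element) records.item(i);
            List<String> fields = new ArrayList<>();
            fields.add(firstText(record, "NlmUniqueID"));   // JID
            fields.add(firstText(record, "MedlineTA"));     // TA
            // All <BroadJournalHeading> elements in the record are the JDs.
            NodeList jds = record.getElementsByTagName("BroadJournalHeading");
            for (int j = 0; j < jds.getLength(); j++) {
                fields.add(jds.item(j).getTextContent().trim());
            }
            // Hypothetical output layout: JID|TA|JD1|JD2...
            lines.add(String.join("|", fields));
        }
        return lines;
    }

    // Usage: java JidTaJdsSketch <lsi xml file> <record element name>
    public static void main(String[] args) throws Exception {
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            Files.write(Paths.get("jidTaJds.txt"), extract(in, args[1]));
        }
    }
}
```

The sketch loads the whole document with DOM for brevity; a streaming parser would suit a large lsi file better. Records without any <BroadJournalHeading> are kept here with only JID and TA, which may differ from the original program's behavior.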