PERSONNEL

Daniel Le, PhD

Former Employee

Applied Clinical Informatics Branch
Electronics Engineer

Expertise and Research Interests:

Daniel X. Le received the BS degree summa cum laude in Electrical and Computer Engineering from California State Polytechnic University, Pomona, in June 1986 and the MS and PhD degrees in Computer Science from George Mason University, Fairfax, Virginia, in January 1993 and February 1997, respectively.

From June 1986 to April 1989, he was a software engineer at the Jet Propulsion Laboratory, Pasadena, California. From April 1989 to September 1990, he was a system engineer at Science Applications International Corporation, McLean, Virginia. Since September 1990, he has been an electronics engineer at the Lister Hill National Center for Biomedical Communications, the research and development arm of the National Library of Medicine.

Dr. Le's research interests are in document analysis and understanding, neural networks, optical character recognition, image quality and image processing. Dr. Le holds one US patent on automated portrait/landscape orientation detection in binary document images.


Publications:

Le D, Mork J. Check Tags MeSH Terms Indexing Research Project. A report to the Applied Clinical Informatics Branch.

Le DX, Mork JG, Antani S. Hybrid Ensemble-Rule Algorithm for Improved MEDLINE® Sentence Boundary Detection. AMIA Annual Symposium Proceedings 2021;2021:677-686.

Rae A, Kim J, Le DX, Thoma GR. Main Content Detection in HTML Journal Articles. DocEng ’18: ACM Symposium on Document Engineering 2018, August 28–31, 2018, Halifax, NS, Canada. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3209280.3229115

Kim I, Le DX, Thoma GR. Automated method for extracting "citation sentences" from online biomedical articles using SVM-based text summarization technique. Proc. the 2014 IEEE Int'l Conf. on Systems, Man, and Cybernetics (SMC 2014), pp. 2006-2011, San Diego, October, 2014

Kim J, Le DX, Thoma GR. Identification of Investigator Name Zones Using SVM Classifiers and Heuristic Rules. 12th International Conference on Document Analysis and Recognition (ICDAR). Washington D.C., August 2013.

Zhang X, Zou J, Le DX, Thoma GR. Combining Discriminative SVM models for the Improved Recognition of Investigator Names in Medical Articles. Proceedings of SPIE Volume 8658, Document Recognition and Retrieval XX, IS&T/SPIE Electronic Imaging 2013.

Kim I, Le DX, Thoma GR. Identifying “comment-on” citation data in online biomedical articles using SVM-based text summarization technique. Proc. Int’l Conf. Artificial Intelligence (ICAI’12), vol. 1, pp. 431-437, Las Vegas, July 2012.

Kim J, Le DX, Thoma GR. Combining SVM Classifiers to Identify Investigator Name Zones in Biomedical Articles. IS&T/SPIE’s 22nd Annual Symposium on Electronic Imaging. San Francisco, CA, January 2012; 8297.

Zhang X, Zou J, Le DX, Thoma GR. A structural SVM approach for reference parsing. BMC Bioinformatics. 2011 Jun 9;12 Suppl 3:S7. doi: 10.1186/1471-2105-12-S3-S7.

Kim I, Le DX, Thoma GR. Automated identification of biomedical article type using support vector machines. Proc. 18th SPIE Document Recognition and Retrieval, 7874:787403 (1-9), San Francisco, January 2011.

Zhang X, Zou J, Le DX, Thoma GR. Investigator Name Recognition From Medical Journal Articles: A Comparative Study of SVM and Structural SVM. International Workshop on Document Analysis Systems. June 2010:121-8

Zou J, Le DX, Thoma GR. Locating and parsing bibliographic references in HTML medical articles. Int J Doc Anal Recognit. 2010 Jun 1;13(2):107-119.

Kim J, Le DX, Thoma GR. Naive Bayes and SVM Classifiers For Classifying Databank Accession Number Sentences From Online Biomedical Articles. IS&T/SPIE's 22nd Annual Symposium on Electronic Imaging. San Jose, CA. January 2010;7534:75340U-1 - 8

Zhang X, Zou J, Le DX, Thoma GR. A Stacked Sequential Learning Method For Investigator Name Recognition From Web-based Medical Articles. 17th Document Recognition and Retrieval Conference (SPIE-DR&R). San Jose, CA. January 2010;7534:753404-7

Kim J, Le DX, Thoma GR. Inferring Grant Support Types From Online Biomedical Articles. 22nd IEEE ISCBMS. Albuquerque, NM. August 2009

Zhang X, Zou J, Le DX, Thoma GR. A Semi-supervised Learning Method to Classify Grant Support Zone in Web-based Medical Articles. Proc SPIE Electronic Imaging Science and Technology, Document Recognition and Retrieval. January 2009;7247:7247 OW(1-8)

Kim J, Le DX, Thoma GR. Naive Bayes Classifier for Extracting Bibliographic Information From Biomedical Online Articles. Proc 2008 International Conference on Data Mining. Las Vegas, Nevada, USA. July 2008;II:373-8

Thoma GR, Le DX, Kim I, Kim JW, Moon C, Tran L, Zou J. Automation to Accelerate the Production of MEDLINE. April 2008 Technical Report to the LHNCBC Board of Scientific Counselors.

Kim IC, Le DX, Thoma GR. Hybrid approach combining contextual and statistical information for identifying MEDLINE citation terms. Proc. SPIE-IS/T Electronic Imaging. San Jose, CA. January 2008;6815:68150P(1-9)

Zou J, Le DX, Thoma GR. Extracting a Sparsely-Located Named Entity from Online HTML Medical Articles Using Support Vector Machine. Proc SPIE-IS/T Electronic Imaging. San Jose, CA. January 2008;6815:6815OP(1-10)

Zou J, Le DX, Thoma GR. Online Medical Journal Article Layout Analysis. Proc SPIE-IS&T Electronic Imaging 2007, SPIE Vol. 6500: 65000V (1-12)

Zou J, Le DX, Thoma GR. Structure and Content Analysis for HTML Medical Articles: A Hidden Markov Model Approach. Proc August 2007 ACM Symposium on Document Engineering. pp. 199-201

Kim IC, Le DX, Thoma GR. Identification of "comment-on sentences" in online biomedical documents using support vector machines. Proc. SPIE conference on Document Recognition and Retrieval, 6500:65000O (1-8), San Jose, January 2007.

Kim J, Le DX, Thoma GR. Automatic Extraction of Bibliographic Information from Biomedical Online Journal Articles Using a String Matching Algorithm. Proc IEEE CBMS, June 2006, Salt Lake City, Utah; 905-10

Zou J, Le DX, Thoma GR. Combining DOM Tree and Geometric Layout Analysis for Online Medical Journal Article Segmentation. Proc JCDL, June 2006, Chapel Hill, NC; 119-28

Kim J, Le DX, Thoma GR. Automated Labeling Of Biomedical Online Journal Articles. In: Callaos N, Lesso W, editors. SCI 2005. Proc 9th World Multiconference on Systemics, Cybernetics and Informatics; 2005 Jul 10-13; Vol. 4; Orlando (FL): International Institute of Informatics and Systemics; c2005. 406-11

Le DX, Thoma GR. Automatically Creating Biomedical Bibliographic Records from Printed Volumes of Old Indexes. In: Callaos N, Lesso W, editors. SCI 2005. Proc 9th World Multiconference on Systemics, Cybernetics and Informatics; 2005 Jul 10-13; Vol. 3, Computer Science and Engineering. Orlando (FL): International Institute of Informatics and Systemics; c2005. 267-74

Kim I, Le DX, Thoma GR. Automated Cleanup Processing for Extracting Bibliographic Data from Biomedical Online Journals. In: Callaos N, Lesso W, editors. SCI 2005. Proc. 9th World Multiconference on Systemics, Cybernetics and Informatics; 2005 Jul 10-13; Vol. 4; Orlando (FL): International Institute of Informatics and Systemics; c2005. 401-5

Kim J, Le DX, Thoma GR. Automated Labeling for Biomedical Journals Published in Foreign Languages. Proc. 8th World Multiconference on Systemics, Cybernetics and Informatics. 2004 Jul.;:444-9.

Le DX, Thoma GR. Automated Article Links Identification for Web-Based Online Medical Journals. Proc. 8th World Multiconference on Systemics, Cybernetics and Informatics. 2004 Jul.;5:462-6.

Le DX, Straughan SR, Thoma GR. Greek Alphabet Recognition Technique for Biomedical Documents. Proc. 6th World Multiconference on Systemics, Cybernetics and Informatics, eds: Callaos N, et al. 2002 July;III: 86-91.

Tran LQ, Moon CW, Le DX, Thoma GR. Web Page Downloading and Classification. Proc. 14th IEEE Symposium on Computer-Based Medical Systems: IEEE Computer Society. 2001 Jul;:321-6.

Mao S, Kim J, Le DX, Thoma GR. Generating Robust Features for Style-Independent Labeling of Bibliographic Fields in Medical Journal Articles. Proc. 7th World Multiconference on Systemics, Cybernetics and Informatics. 2003 July;III:53-6.

Kim J, Le DX, Thoma GR. Automated Labeling Algorithms for Biomedical Document Images. Proc. 7th World Multiconference on Systemics, Cybernetics and Informatics. 2003 July;V: 352-57.

Ford G, Hauser SE, Le DX, Thoma GR. Pattern Matching Techniques for Correcting Low Confidence OCR Words in a Known Context. Proc. SPIE., Document Recognition and Retrieval VIII. 2001 Jan;4307:241-9.

Kim J, Le DX, Thoma GR. Automated Labeling in Document Images. Proc. SPIE, Document Recognition and Retrieval VIII. 2001 Jan;4307:111-22.

Le DX, Thoma GR. Automated Document Labeling for Web-Based Online Medical Journals. Proc. 7th World Multiconference on Systemics, Cybernetics and Informatics. 2003 July;II: 411-15.

Hauser SE, Le DX, Thoma GR. Automated Zone Correction in Bitmapped Document Images. SPIE: Document Recognition and Retrieval VII. 2000 Jan;3976: 248-58.

Le DX, Tran LQ, Chow J, Kim J, Hauser SE, Moon CW, Thoma GR. Automated Medical Citation Records Creation for Web-Based Online Journals. Proc. 14th IEEE Symposium on Computer-Based Medical Systems: IEEE Computer Society. 2001.

Le DX, Thoma GR. Page Layout Classification Technique for Biomedical Documents. Proc. World Multiconference on Systems, Cybernetics and Informatics (SCI). 2000 Jul.;X: 348-52.

Kim J, Le DX, Thoma GR. Automated Labeling of Bibliographic Data Extracted from Biomedical Online Journals. Proc. SPIE Electronic Imaging. 2003 Jan;5010: 47-56.

Thoma GR, Ford G, Le DX, Li Z. Text Verification in an Automated System for the Extraction of Bibliographic Data. Proc. 5th International Workshop on Document Analysis Systems, Springer-Verlag: Berlin. 2002 Aug;: 423-32.

Burgun A, Bodenreider O, Le Duff F, Moussouni F, Loreal O. Representation of roles in biomedical ontologies: a case study in functional genomics. Proc AMIA Symp. 2002:86-90.