Summary
We are looking for a highly motivated Post-Doctoral Associate to join the Li Lab, with a focus on developing clinically grounded large language model (LLM) systems to advance patient safety, documentation automation, clinical trial matching, and interpretable phenotyping in oncology and thrombotic disorders. The ideal candidate will have expertise in machine learning (ML) and natural language processing (NLP), particularly in applying LLMs to complex, unstructured clinical data. The successful candidate will become part of our dynamic research team, focusing on the complex relationship between cancer and thrombosis. This role will utilize advanced computational techniques, including multi-modal clinical, imaging, and omics data, to uncover new insights and develop predictive models. Through interdisciplinary collaboration, the aim is to deepen our understanding of thrombotic complications in cancer patients and contribute to the development of innovative strategies for risk assessment, prevention, and treatment.
The Specialist Postdoctoral Associate will work in an apprenticeship capacity to advance their career as a scientific professional at the intersection of clinical informatics and artificial intelligence. This role is designed for a candidate with demonstrated expertise in advanced computational methods, particularly large language models (LLMs) and machine learning (ML), who will develop clinically grounded, scalable AI systems. These systems should support the development of innovative strategies for risk assessment, prevention, and treatment, and enable interpretable disease phenotyping for thrombosis and bleeding models in cancer patients.
The Specialist Postdoctoral Associate will work independently but will be supported by a dedicated team of research coordinators, data analysts, statisticians, ML engineers, and medical students. The primary focus will be on research projects involving data science, clinical informatics, machine learning modeling, and clinical hematology/oncology. We offer access to extensive local and national oncology electronic health record (EHR) databases, advanced computational resources, a strong track record of past trainee publications, and opportunities for faculty career advancement.
Job Duties
- Conducts Research:
  - Conducts in-depth literature reviews and stays abreast of the latest advancements in cancer-related thrombosis research, machine learning, and NLP.
  - Leads the development and application of large language model (LLM) and machine learning (ML) approaches to extract, structure, and interpret unstructured clinical data, such as free-text notes.
  - Collaborates with clinicians and researchers to integrate multi-omics data and clinical variables for comprehensive analysis and interpretation.
  - Analyzes large-scale biomedical data (e.g., EHRs, clinical trials, omics, and imaging) using advanced computational pipelines to uncover clinically meaningful insights into cancer and thrombosis.
- Conducts Model Development:
  - Implements predictive models to identify cancer patients at high risk of thrombosis, leveraging both traditional statistical approaches and cutting-edge machine learning algorithms.
  - Implements and refines NLP pipelines that transform free-text clinical documentation into structured, actionable data for use in electronic health records (EHRs) and clinical decision support systems.
  - Employs interpretable and robust modeling approaches (such as rule refinement, clustering, and causal inference) to support disease subtyping, progression prediction, and treatment effect estimation.
- Conducts Interdisciplinary Collaboration:
  - Collaborates closely with oncologists, hematologists, bioinformaticians, and computational biologists to contextualize findings and validate hypotheses.
  - Participates in research meetings, seminars, and workshops to exchange ideas and foster interdisciplinary collaboration.
  - Contributes to manuscript preparation and grant writing efforts to disseminate research findings and secure external funding.
Minimum Qualifications
- Ph.D. in Chemistry, Computational Sciences, Computational Biology, Structural Biology, Computer Science, Bioinformatics, Statistics, or related disciplines. May also include Ph.D. in Biology or Biomedical Sciences in combination with an M.S. or extensive multidisciplinary experience in one of the above quantitative fields.
Preferred Qualifications
- Ph.D. in Biomedical Informatics.
- Proficient in Python and R programming languages.
- Prior clinical or research experience in hematology and oncology.
- A strong record of first or senior-position peer-reviewed publications.
- Prior experience in algorithm development, especially with software packages and libraries commonly used in informatics, machine learning, and NLP.
Baylor College of Medicine is an Equal Opportunity/Affirmative Action/Equal Access Employer.