Resume

Basics

Education

  • Aug. 2020 - Dec. 2025
    University of Texas at Dallas
    Ph.D. in Computer Science
  • Sep. 2018 - May 2020
    Rutgers University
    M.S. in Computer Science
  • Sep. 2013 - June 2017
    South China Agricultural University
    B.Eng. in Software Engineering

Work

  • May 2024 - Aug. 2024
    Applied Scientist Intern
    Amazon
    • Developed an LLM-based workflow to build a contextual knowledge graph, enhancing the customer search experience by enabling context-aware search functionality, covering 38% of search queries for 4 product types.
  • May 2023 - Aug. 2023
    Applied Scientist Intern
    Amazon
    • Researched and developed an LLM-based workflow to auto-extract customer-centric product metadata for 5 types of products on amazon.com, poised to significantly automate labor-intensive tasks.
    • Augmented Anthropic’s Claude and other LLMs with retrieval from external knowledge sources (RAG) to ground results in evidence, reducing the risk of inaccuracies in model outputs (hallucinations).
  • May 2021 - Present
    Research Assistant
    Human Language Technology Research Institute at UT Dallas
    • Designed a model that automatically extracts and filters syntactic and semantic features from student essays, achieving a 1.6% improvement in cross-prompt essay scoring over previous state-of-the-art models.
    • Built a corpus of 1,006 essays for automated essay scoring, addressing the limitation that most available corpora contain only holistic scores or trait scores that are too coarse-grained; published at NAACL’24.
    • Built a pipelined model for entity coreference resolution, achieving the best performance in the CODI-CRAC 2022 shared task, at 1.6x the baseline and 1.2x the second-ranked team.
    • Published a survey and position paper on automated essay scoring in IJCAI’24 and EMNLP’24, providing a comprehensive analysis of existing approaches and proposing innovative directions for future research.
    • Developed a top-performing end-to-end model for the discourse deixis track in the CODI-CRAC 2021 shared task, utilizing resolution constraints to achieve 1.8x the performance of the second-ranked team.
    • Developed an end-to-end model for discourse deixis resolution that leveraged task-specific characteristics and outperformed the previous state of the art by 27%, resulting in a first-author publication in EMNLP 2022.
    • Collected and analyzed data for identifying propaganda content in Spanish magazines during World War II. Identified key challenges for developing a deep learning model. Published findings in AAAI 2023.
  • June 2019 - May 2020
    Research Assistant
    NLP Group at Rutgers University
    • Developed a multimodal classification model to predict the coherence relation between an image and its caption, aiding the development of controllable caption generation and leading to an ACL 2020 publication.

Service

  • Conference Reviewer: EMNLP 2023, ACL 2023, ACL ARR
  • Journal Reviewer: AIJ, TALLIP
  • Conference Volunteer: EMNLP 2024

Awards

Languages

  • Cantonese Chinese: Native
  • Mandarin Chinese: Native
  • English: Fluent
  • Japanese: Elementary