Hye Sun Yun

Hello! 👋

My name is Hye Sun Yun (윤혜선 in Korean).

I am a Computer Science PhD candidate at Northeastern University in the Khoury College of Computer Sciences. I work on Human-Computer Interaction (HCI) and Natural Language Processing (NLP), mainly on applications in health. I am co-advised by Timothy Bickmore and Byron Wallace. I am interested in responsible and human-centered AI and in applying HCI and NLP research methods to meet the needs, interests, preferences, and requirements of end users. Specifically, my focus is on building and evaluating language technologies to improve access to health and medical research information.

Before starting my PhD in the fall of 2020, I worked as a full-stack software engineer/manager for three years at Wayfair. I earned my bachelor’s in Computer Science and Africana Studies at Wellesley College.

I love playing Ultimate Frisbee, running, cooking, and baking. A couple of years ago, I completed Reddit’s 52 weeks of cooking challenge. You can read all about my cooking adventures on my cooking blog.

If you are interested in knowing more about my research or want to chat about the CS PhD application process, please feel free to email me!

news

Oct 07, 2025	I defended my thesis proposal! The title of my thesis is “Advancing Health Information Access with LLMs: Understanding Users and Models for Responsible Health AI”. I would like to thank my committee members: Tim Bickmore, Byron Wallace, Mai ElSherief (Northeastern), and Emma Pierson (UC Berkeley).
Aug 02, 2025	This fall semester, I will be co-teaching DS 2500 (Intermediate Programming with Data) at Northeastern University alongside Deahan Yu as an instructor of record.
Jun 25, 2025	I will be attending this year’s AHLI Conference on Health, Inference, and Learning (CHIL) in Berkeley, CA. During the poster session tomorrow (June 26), I will be presenting our paper Caught in the Web of Words: Do LLMs Fall for Spin in Medical Literature?
Jun 12, 2025	Today, I gave a talk on my two most recent research papers (CHI and CHIL) at Wellesley College’s Computer Science Department.
May 26, 2025	I will be attending the 2025 CSST Summer Research Institute in the Adirondacks.
Apr 16, 2025	Our paper Caught in the Web of Words: Do LLMs Fall for Spin in Medical Literature? has been accepted at the AHLI Conference on Health, Inference, and Learning (CHIL) 2025.
Apr 08, 2025	I will be giving a remote research talk titled “Beyond Hallucinations: Unveiling Hidden Dangers of LLMs in Health Information Access” at Clemson University’s School of Computing Seminar on April 18, 2025 (2:30-3:30 pm). The slides for the talk can be accessed here.
Mar 31, 2025	Our paper Online Health Information–Seeking in the Era of Large Language Models: Cross-Sectional Web-Based Survey Study is finally published in the Journal of Medical Internet Research (JMIR)!
Mar 29, 2025	Framing Health Information: The Impact of Search Methods and Source Types on User Trust and Satisfaction in the Age of LLMs has been accepted as a Late Breaking Work at CHI 2025! See you in Japan soon.
Jan 04, 2025	This spring I am the Instructor of Record for a special topics course — Research in Human-Centered NLP (CS4973/CS6983) — at Northeastern.
Jul 28, 2024	I am thrilled to announce that Automatically Extracting Numerical Results from Randomized Controlled Trials with Large Language Models has been accepted for publication at the Machine Learning for Healthcare 2024 conference.
Jul 04, 2024	Our paper Keeping Users Engaged During Repeated Interviews by a Virtual Agent: Using Large Language Models to Reliably Diversify Questions has been accepted for publication at the 24th ACM International Conference on Intelligent Virtual Agents.
May 20, 2024	I will be interning at Truveta this summer, working on clinical data extraction using LLMs with the ML/AI team!
Oct 18, 2023	I have been accepted to present at Cutting-Edge Connections in PhD Research: Healthcare Today, which will take place at Northeastern University on November 17th. I will be presenting my recent work on using large language models to generate diverse mental health questionnaire versions while retaining good psychometric properties.