Jisu Han

Hi, I'm Jisu! I am a first-year Ph.D. student in the Interdisciplinary Program in Artificial Intelligence (IPAI) at Seoul National University, and a member of the M.IN.D Lab, where I am fortunate to be advised by Taesup Moon.

I obtained my Master's degree from the Graduate School of AI at KAIST, in the Humanoid Generalization Lab directed by Beomjoon Kim. I received my Bachelor's degree in Computer Science from Ewha Womans University. I have also been fortunate to work with Joseph Lim and Jaeheung Park.

Mail / Github / LinkedIn / Google Scholar / X

The future I imagine looks like the coexistence portrayed in Detroit: Become Human, where robots don't just follow commands but live alongside us, reading a room and adjusting as a person would. The unit I study is the skill. By skill I mean an abstraction over how an agent gets something done: not a single fixed motion, but something that can be represented, grounded in a situation, and changed when it doesn't work. Cognitive science describes human skill this way too: under schema theory, a person reusing a "carry this" skill never repeats exactly the same motion. They hold a flexible schema, read the context, and adjust the motion, or, when that fails, adjust the schema itself. I want robots to hold skills the same way rather than as fixed scripts. Concretely, I study how agents represent, ground, and update skills, which I treat as a three-stage loop.

① Representation

What is a skill made of? — language, trajectory, policy, effect

② Grounding

Does the skill still hold beyond where it was learned — in a new context, under safety constraints, over a long horizon?

③ Update

When a skill fails or breaks a constraint, can the agent tell on its own — and refine it, add a missing one, or drop one that no longer earns its place?

I look at how this loop plays out unevenly across LLMs, VLMs, and VLAs, and ask how their strengths can cover one another's gaps. LLMs and VLMs each handle part of the loop (representing and updating in language, or grounding in what is seen), but neither has to act or live with the consequences of failing. A VLA is where the loop becomes one piece: a single system that must represent, ground, and update together, in real time, under real physical consequences. It is the loop, embodied.

🤖 Humanoid Cognition: where I see this heading. A body that thinks and adapts for itself, sharing the spaces we live and work in.

🏠 household 🏥 hospital 📦 warehouse 🚀 space

Detroit: Become Human — the coexistence I imagine

Each paper below is tagged by where it sits on that loop.

* denotes equal contribution

SafeHRI-Nav: A Context-Aware Low-Level Safe Navigation Benchmark for Human-Robot Interaction

Authors anonymized during review

In submission

Grounding / Vision / Robot

TL;DR

Benchmarks whether foundation-model robot policies adjust low-level navigation to the safety context. It uses matched scenes in which the obstacle semantics or the safety demand changes, exposing failures such as treating a baby much like a rigid object.

Option-aware Temporally Abstracted Value for Offline Goal-Conditioned Reinforcement Learning

Hongjoon Ahn*, Heewoong Choi*, Jisu Han*, Taesup Moon

Neural Information Processing Systems (NeurIPS), 2025 (Spotlight)

Representation / Robot

TL;DR

Introduces option-aware, temporally abstracted value learning for offline goal-conditioned RL. By reasoning over options instead of only primitive actions, the agent can make more reliable long-horizon goal-reaching decisions from static datasets.

arXiv

Hierarchical and Modular Network on Non-prehensile Manipulation in General Environments

Yoonyoung Cho*, Junhyek Han*, Jisu Han, Beomjoon Kim

Robotics: Science and Systems (RSS), 2025

Grounding / Representation / Robot

TL;DR

Proposes a hierarchical, modular network for non-prehensile manipulation such as pushing. The structure separates reusable manipulation components, helping policies generalize across broader object configurations, layouts, and environments.

Paper / Project

Adaptive visual abstraction via object token merging and pruning for efficient robot manipulation

Jisu Han

CVPR Workshop (Causal and Object-Centric Representations for Robotics), 2024 (Oral)

Grounding / Vision / Robot

TL;DR

Reduces visual computation for robot manipulation by adaptively merging and pruning object tokens. The policy keeps task-relevant object information while removing redundant regions, improving efficiency without losing key grounding cues.

Paper / Github

Preference learning for guiding the tree search in continuous POMDPs

Jiyong Ahn, Sanghyeon Son, Dongryung Lee, Jisu Han, Dongwon Son, and Beomjoon Kim

Conference on Robot Learning (CoRL), 2023

Grounding / Representation / Robot

TL;DR

Learns a preference model that scores candidate outcomes during online planning in continuous POMDPs. These preferences guide tree search toward actions that better match expert intent, improving decision-making under uncertainty.

Paper / Video / Project / Github

Real-world Chess Robot Teacher

Jisu Han, Chaehyun Song, Minjae Song, Hwancheol Kim, and Semin Ahn

2025 Fall Class Project

A real-world robotic teacher system that teaches chess in physical environments. It uses a Qwen3-VL-Thinking model for chess reasoning and Isaac-Gr00T for action reasoning.

Representation / Grounding / Language / Vision / Robot

Notion / Demo

WraspRobot: Bug-catching Robot

Jisu Han, Jaehoon Choi, Gunwoo Choi, and Dongwook Lee

Hugging Face LeRobot Worldwide Hackathon, 2025 (Top 10 Finalist among 3,000+ global participants)

Huggingface Winner Space X

Clarifying the task: Identifying task from human videos as a representation

Jisu Han and Doohyun Lee

AI611: Machine Learning for Robotics (Prof. Joseph Lim) project, 2023

To learn a generalized reward function that can be used in reinforcement learning, we devise a representation that disentangles environment information from task information.

Project

Cart MEME: Deep Learning Based Autonomous-Driving Cart

Jisu Han, Jiyoon Park, Chaewon Kim, and Sangsoo Park

Korea Information Processing Society (KIPS), 2021

Paper / Github

TheCodeEscape: VR room escape game based on Unity3D and Oculus

Jisu Han, Minyeong Hwang, and Seoungwoon Jung

KAIST MadCamp Final Project, 2019

Github / Demo

Teaching Assistant

Fall 2025

M2608.002100 Advanced Deep Learning

Seoul National University (Prof. Taesup Moon)

Selected as an Outstanding Teaching Assistant for contributions to course operation, student support, and project mentoring.

Research Experience

Sep 2024 – Nov 2024

Research Assistant, Cognitive Learning for Vision and Robotics Lab

Korea Advanced Institute of Science and Technology (KAIST)

Analyzed the impact of data scaling on the performance of robot policies

Jul 2021 – Aug 2022

Research Intern, Intelligent Mobile Manipulation Lab

Korea Advanced Institute of Science and Technology (KAIST)

Research on interactive perception

Dec 2020 – Aug 2021

Research Intern, Dynamic Robotic Systems (DYROS) Lab

Seoul National University

Developed a deep-learning-based grasping solution and published a paper in a domestic venue

May 2019 – Jul 2019

Research Intern, Information Coding and Processing Lab

Ewha Womans University

Developed a drowsy-driving prevention system based on tracking the driver's eyes with OpenCV