SNU Workshop on Perception and Intelligence

SNU GoGE/SDG Workshop

Wednesday, October 1st, 2025
10:00 AM - 4:00 PM, KST
INMC (Building 132) Room 103

Schedule


10:00 AM - 11:00 AM Invited speakers : Prince Gupta (Meta), Xiaqing Pan (Meta)

Prince Gupta (Meta) and Xiaqing Pan (Meta)
Title : Intro to Meta's Project Aria, and its use in Robotics and Contextual AI Research

Abstract : This talk introduces Project Aria and the new Aria Gen 2 research glasses—Meta’s most advanced platform for egocentric AI, machine perception, and robotics research. Aria Gen 2 offers a state-of-the-art sensor suite—including HDR cameras, eye tracking, spatial microphones, IMUs, and new sensors for heart rate and voice isolation—enabling rich, multimodal data capture from a human perspective. With ultra-low-power, on-device machine perception (SLAM, eye and hand tracking, speech recognition), all-day usability, and best-in-class open-ear audio, Aria Gen 2 unlocks new possibilities for prototyping, data collection, and real-world AI applications. We’ll discuss how universities can leverage the Aria Research Kit and open-source datasets to accelerate research in robotics, contextual AI, accessibility, and human-computer interaction, and highlight real-world use cases from leading academic and industry partners.

Bio : Prince Gupta is the Director of Product Management at Meta’s Reality Labs Research division, where he leads the development of Project Aria and has overseen the launch of its first- and second-generation devices. With over 15 years of experience in augmented reality (AR), virtual reality (VR), and artificial intelligence (AI), he has been instrumental in advancing breakthrough hardware and research platforms that are defining the next era of computing.
Xiaqing Pan is a Senior Research Engineering Manager at Meta Reality Labs Research. He leads a team developing open-source software and applications for Project Aria, an egocentric research data capture device. His research focuses on egocentric machine perception, including 3D localization, 3D object and scene reconstruction, and 3D object detection. Before joining Meta, Xiaqing obtained his PhD from the University of Southern California in 2017, where his research covered 2D and 3D shape analysis, retrieval, and classification.


11:00 AM - 11:30 AM Invited speaker : Yonghyeon Lee (MIT)

Yonghyeon Lee (MIT)
Title : Reactive and Reflexive Robot Control: The Foundation of Physical Intelligence

Abstract : While vision and language models for high-level perception, reasoning, and task planning have advanced at an extraordinary pace through data-driven approaches, low-level movement planning and motor control -- the essence, and perhaps the final frontier, of true physical intelligence -- still have a long way to go. This talk will highlight two key elements that are essential for advancing physical intelligence: reactive motion planning and reflexive, tactile-reactive control. Reactive planning refers to real-time motion planning that responds robustly and safely to external disturbances affecting the robot, environment, or objects, while reflexive control refers to feedback control for contact-rich manipulation that leverages high-bandwidth tactile sensing, relies minimally on vision, and maintains robustness against visual perturbations. The talk will present approaches ranging from learning-based to hierarchical model-based control as initial steps toward true physical intelligence, and conclude with a discussion of challenges and future directions.

Bio : Yonghyeon Lee is a Postdoctoral Associate in the Biomimetic Robotics Lab at MIT. He received his B.S. and Ph.D. in Mechanical Engineering from Seoul National University and was previously an AI Research Fellow at the Korea Institute for Advanced Study. His research explores geometric methods for learning and control, 3D perception, reactive motion planning, and tactile-feedback–driven manipulation in dexterous robotic systems.


11:30 AM - 1:00 PM Break



1:00 PM - 1:50 PM Student talks

Junoh Kang (SNU) | FIFO-Diffusion: Generating Infinite Videos from Text without Training
Heewoong Choi (SNU) | Option-aware Temporally Abstracted Value for Offline Goal-Conditioned Reinforcement Learning
Hojun Jang (SNU) | Audio-aided Character Control for Inertial Measurement Tracking
Hyeonbum Choi (SNU) | Society of Mind Meets Real-Time Strategy: A Hierarchical Multi-Agent Framework for Strategic Reasoning
Byeonghui Kim (SNU) | Multi-Modal Grounded Planning and Efficient Replanning For Learning Embodied Agents with A Few Examples

1:50 PM - 2:30 PM Break



2:30 PM - 3:00 PM Invited speaker : Harold Soh (NUS)

Harold Soh (NUS)
Title : VLMs with Tactile Intelligence

Abstract : This talk will present recent work on generative models for embodied intelligence, with a focus on multimodal large models that integrate touch with vision and language. I will share some of our early efforts on tactile integration for manipulation and discuss what these suggest about the role of touch in foundation models. I will also briefly highlight opportunities in grounding generative AI within embodied settings, with examples in traversability prediction for navigation and in constraining generative models to follow user-specified objectives.

Bio : Harold Soh is an Associate Professor of Computer Science at the National University of Singapore, where he leads the Collaborative Learning and Adaptive Robots (CLeAR) lab. His research focuses on AI for trustworthy collaborative robots, with interests in generative modeling and decision-making. Harold’s work has earned awards at IROS and T-AFFC, as well as an R:SS Early Career Spotlight. He has served the HRI community in leadership roles, including as co-Program Chair of ACM/IEEE HRI 2024, and is an Associate Editor at IJRR and ACM THRI. He is also a PI at the Smart Systems Institute and co-founder of TacnIQ, a startup working on touch-enabled intelligence.


3:00 PM - 3:30 PM Invited speaker : Antoine ANDRÉ (AIST)

Antoine ANDRÉ (AIST)
Title : Omnidirectional vision for robotic applications

Abstract : Omnidirectional cameras and their ability to capture a whole scene from a given viewpoint have become compact and affordable enough to be easily embedded on robotic platforms, opening the door to new applications. Thanks to their wide field of view (up to 360 degrees), they often outperform conventional cameras on tasks ranging from visual SLAM, odometry, and visual place recognition to visual servoing. However, these benefits come with challenges, such as a high level of distortion at the image borders and the need for accurate projection models and image representations. This talk will discuss how to tackle these challenges and how addressing them can unlock new capabilities in vision for robotics.

Bio : Antoine ANDRÉ has been a permanent researcher at the CNRS-AIST Joint Robotics Laboratory in Tsukuba, Japan, since 2023, where he investigates computer vision with wide-field-of-view cameras for robotic applications. He was a JSPS postdoctoral fellow at CNRS-AIST JRL from 2022 to 2023. Prior to that, he obtained his PhD in 2021 from the University of Burgundy Franche-Comté in Besançon, France, where he studied computer vision for microrobotic applications.


3:30 PM - 4:00 PM Invited speaker : Karen Liu (Stanford)

Karen Liu (Stanford)
Title : From Human Character to Humanoids

Abstract : For the past three decades, research in computer animation has largely revolved around a central question: How can we make character animation more realistic? With recent advances in humanoid robotics, we are now able to pose a deeper question: How can we make our character animation real? Humanoids hold tremendous promise as embodied agents, but they also bring new challenges. Unlike virtual characters, they must be capable of locomotion, manipulation, and navigation in the physical world—often without access to abundant training data. Here, character animation can play a crucial role in bridging the technological gap between simulated motion and physical embodiment. In this talk, I will present recent work that supports this perspective and demonstrate how animation-inspired approaches contribute to developing robust humanoid skills. I will conclude by discussing the key bets we are making toward achieving truly autonomous humanoids.

Bio : C. Karen Liu is a professor in the Computer Science Department at Stanford University. Liu's research interests are in computer graphics and robotics, including physics-based animation, robot learning, and computational biomechanics. She developed computational approaches to modeling realistic and natural human movements, learning complex control policies for humanoids and assistive robots, and advancing fundamental numerical simulation and optimal control algorithms. The algorithms and software developed in her lab have fostered interdisciplinary collaboration with researchers in robotics, computer graphics, mechanical engineering, biomechanics, neuroscience, and biology. Liu received a National Science Foundation CAREER Award and an Alfred P. Sloan Fellowship, and was named one of the Young Innovators Under 35 by Technology Review. Liu also received the ACM SIGGRAPH Significant New Researcher Award for her contributions to the field of computer graphics. In 2021, Liu was inducted into the ACM SIGGRAPH Academy.

The program is generously supported by the Brain Korea program and the Interdisciplinary Program in Artificial Intelligence.
Contact : Young Min Kim (youngmin.kim@snu.ac.kr)

