About Me

I am a final-year undergraduate student majoring in CS (Elite class) at Guohao College, Tongji University, ranked 2nd of 52. I will be pursuing a Master of Science in Machine Learning at Carnegie Mellon University starting in Fall 2026.

My research focuses on generative models and multimodal learning, particularly on using vision as a foundation for reasoning and knowledge acquisition. I am passionate about developing unified models that integrate vision, language, and other modalities to tackle challenges in the complex physical world.

You can find my full CV here. Feel free to contact me via email if you are interested in my work.


Projects

Framework

💃  Text-Driven 2D Human Motion Generation

🛠️ Tech: Python, PyTorch, Transformers 🗓️ Date: Jul 2024 - Jul 2025
  • Developed a model to generate 2D human motion from textual descriptions, focusing on realism and adaptability.
  • Proposed a novel dataset for two-person interactions, addressing a gap in existing research.
  • Designed and implemented a Transformer-based model with a custom attention mechanism and integrated deep reinforcement learning strategies to optimize performance.
🔗  Code is coming soon

🤖  Real-Time Vision System for Robotics

🛠️ Tech: C++, OpenVINO, YOLO, MPC 🗓️ Date: Oct 2024 - Present
  • Designed and implemented automated targeting and shooting mechanisms for robotic vehicles, focusing on real-time object detection, tracking and fire control.
  • 🏆 RoboMaster 2025 Super Confrontation: Champion (Eastern Region) & National Top 8 (National Final), Ranked #1 in overall auto-aim accuracy (Eastern Region).
  • 🏆 RoboMaster 2025 University League: 2nd Runner-up (3rd Place) (Shanghai Regional).
🔗  View Project on GitHub

📍  Real-Time 3D LiDAR SLAM and Relocalization System

🛠️ Tech: C++, ROS 2, Point-LIO, GICP 🗓️ Date: Oct 2025 - Present
  • Developed a robust 3D localization framework in ROS 2 integrating Point-LIO with wheel odometry.
  • Engineered a global relocalization pipeline based on Generalized ICP for fast and precise point cloud registration.
  • Implemented dynamic point cloud processing modules, including moving object detection and ROI optimization using PCL.

🐧  Discrete Unix V6++ Operating System

🛠️ Tech: C++, Assembly, Bochs 🗓️ Date: Mar 2025
  • Addressed inefficiencies in the original Unix V6++ kernel, such as high external fragmentation from contiguous memory requirements and slow `fork()` performance.
  • Implemented a comprehensive discretization of the process image, enabling non-contiguous memory allocation for processes.
  • Introduced a copy-on-write (COW) mechanism that defers page copying until a process first writes to a shared page, reducing the overhead of process creation.
🔗  View Project on GitHub

💻  MockMaster - An AI & HCI-Powered Mock Interview System

🛠️ Tech: Vue 3, Node.js, MediaPipe, DeepSeek API 🗓️ Date: May 2025
  • Built an AI-powered mock interview system that uses computer vision and NLP to provide an immersive experience.
  • Core features include real-time analysis of facial expressions and gestures, plus intelligent resume feedback.
  • Supports both 1v1 interviews with an AI agent and simulations of multi-person group interview scenarios.
🔗  View Project on GitHub