Diandian Guo
I am a Ph.D. student at The Chinese University of Hong Kong, advised by Prof. Pheng-Ann Heng. My research focuses on computer vision for surgical intelligence, medical AI, and AI-assisted AR/XR systems.
Previously, I received my M.S. in Elektromobilität from the University of Stuttgart, where I worked with Prof. Bin Yang on deep learning. Before that, I received my B.S. from Jilin University.
I work on surgical scene understanding, long-horizon surgical video reasoning, operating-room scene graph generation, surgical image restoration, and robust visual perception.
Selected publications are listed below. For a full list, please see Google Scholar. * indicates equal contribution.
MICCAI, 2026 Early Accept
Long-horizon surgical VideoQA with temporally faithful representation and adaptive inference-time reasoning.
CVPR, 2026
A real-world SurgClean benchmark for endoscopic desmoking, defogging, and desplashing.
CVPR, 2026
BlooDet jointly detects bleeding regions and bleeding points in laparoscopic surgical videos.
AAAI, 2025
PmNet models short- and long-range surgical temporal cues for liver resection workflow monitoring.
IEEE Transactions on Medical Imaging, 2025
Single-stage bi-modal transformer for 2D-3D operating-room scene graph generation.
MICCAI, 2024
Tri-modal temporal modeling and medical-LLM knowledge transfer for OR scene graph generation.
CVPR, 2024 Highlight
Video semantic segmentation for driving scenes using vanishing-point-guided temporal correspondence.