Scene Graph Generation
in Operating Rooms

Demo

4D-OR Dataset (Özsoy et al.)

4D-OR Dataset

We gratefully acknowledge the contributions of the 4D-OR benchmark (Özsoy et al.), which has significantly facilitated our work.

Based on 4D-OR, we have reformatted the dataset to be more user-friendly.

Reformatting of SGG Annotations

Unlike the original 4D-OR dataset, we standardized the SGG annotations to align with popular open-world SGG datasets such as Visual Genome (Krishna et al., 2017) and Open Images (Kuznetsova et al., 2020). This makes the data easier to process and to integrate with existing SGG pipelines.
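As a rough illustration, a Visual Genome-style SGG record stores per-frame object boxes plus (subject, predicate, object) triplets that index into the object list. The field names and values below are hypothetical, chosen to mirror common VG-format conventions rather than the exact schema of the released JSON files.

```python
import json

# Hypothetical VG-style record: field names are assumptions, not the verified
# schema of the reformatted 4D-OR annotations.
sample = {
    "image_id": "001_00000",
    "objects": [
        {"bbox": [120, 80, 340, 400], "category": "head_surgeon"},
        {"bbox": [300, 150, 520, 420], "category": "patient"},
    ],
    # Triplets reference objects by their index in the "objects" list.
    "relationships": [
        {"subject_id": 0, "predicate": "operating", "object_id": 1},
    ],
}

# Round-trip through JSON to confirm the structure is plain-JSON serializable.
restored = json.loads(json.dumps(sample))
print(restored["relationships"][0]["predicate"])  # → operating
```

Keeping relations as index pairs into a flat object list is what lets VG-style loaders and evaluation code consume the annotations without OR-specific parsing.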

Image 1

Reformatted annotations for object detections in ORs

Image 2

Reformatted annotations for triplet relations in ORs

Reformatting of Overall Structure

The original 4D-OR dataset was organized by surgical segment, mixing data from different modalities within each segment folder. To simplify processing, we reorganized the dataset by modality, separating the input data into distinct modality folders (e.g., 2D multi-view images, 3D point clouds, textual annotations). This reorganization enables more efficient multimodal processing.

Image 3

Reformatted multi-view 2D image inputs

Image 4

Reformatted 3D point cloud inputs

Training and Inference

Download the reformatted 4D-OR data provided below. The data folder should be organized as follows:

S2Former-OR/data/: 
    /images/: unzip 4d_or_images_multiview_reltrformat.zip
    /points/: unzip points.zip
    /infer/: unzip infer.zip
    /train.json: from reltr_annotations_8.3.zip
    /val.json: from reltr_annotations_8.3.zip
    /test.json: from reltr_annotations_8.3.zip
    /rel.json: from reltr_annotations_8.3.zip
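The layout above can be assembled with a short script. This is a hedged sketch: the archive names are taken from the listing, but the assumption that the archives sit in the current directory (and that the annotation zip contains the four JSON files at its top level) is mine.

```python
import zipfile
from pathlib import Path

data_root = Path("S2Former-OR/data")

# Archive -> target subdirectory; "." means the JSON files land in data/ itself.
layout = {
    "4d_or_images_multiview_reltrformat.zip": "images",
    "points.zip": "points",
    "infer.zip": "infer",
    "reltr_annotations_8.3.zip": ".",  # train/val/test/rel.json (assumed top-level)
}

for archive, subdir in layout.items():
    target = data_root / subdir
    target.mkdir(parents=True, exist_ok=True)
    # Skip archives that have not been downloaded yet.
    if Path(archive).exists():
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
```

After extraction, the loaders expect e.g. `S2Former-OR/data/train.json` and `S2Former-OR/data/images/` to exist.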
    

After that, you can use the scripts in S2Former-OR and TriTemp-OR for training and inference. Note that for evaluation on the test set, you need to first run the provided inference script and then upload your predictions here.

Methods

...

Version 1: S2Former-OR (TMI 2024)

Overview of the proposed single-stage multi-view bi-modal S2Former-OR for scene graph generation in operating rooms.

...

Version 2: TriTemp-OR (MICCAI 2024)

Overview of the proposed TriTemp-OR for scene graph generation in ORs.

Comparison of OR-SGG Models

Qualitative results of S2Former-OR and existing OR-SGG models on 4D-OR test set.

Comparison of OR-SGG Models

Qualitative results of TriTemp-OR and existing OR-SGG models on 4D-OR test set.

Results

Comparison of OR-SGG Models

Detailed comparisons of S2Former-OR with existing OR-SGG models on 4D-OR test set.

Comparison of OR-SGG Models

Detailed comparisons of TriTemp-OR with existing OR-SGG models on 4D-OR test set.

Citation

Jialun Pei, Diandian Guo, Jingyang Zhang, Manxi Lin, Yueming Jin and Pheng Ann Heng. S2Former-OR: Single-Stage Bimodal Transformer for Scene Graph Generation in OR. TMI, 2024. [arXiv] [GitHub] [Reformatted 4D-OR Dataset]

Diandian Guo, Manxi Lin, Jialun Pei, He Tang, Yueming Jin and Pheng Ann Heng. Tri-modal Confluence with Temporal Dynamics for Scene Graph Generation in Operating Rooms. MICCAI, 2024. [arXiv] [GitHub]

Bibtex

@article{s2former2024,
               author={Jialun Pei and Diandian Guo and Jingyang Zhang and Manxi Lin and Yueming Jin and Pheng Ann Heng},
               title={S2Former-OR: Single-Stage Bimodal Transformer for Scene Graph Generation in OR},
               journal={IEEE Transactions on Medical Imaging},
               year={2024}
}

@inproceedings{tritemp2024,
               author={Diandian Guo and Manxi Lin and Jialun Pei and He Tang and Yueming Jin and Pheng Ann Heng},
               title={Tri-modal Confluence with Temporal Dynamics for Scene Graph Generation in Operating Rooms},
               booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI)},
               year={2024}
}
