We describe a detector of robotic instrument parts in image-guided surgery. The detector consists of a large ensemble of scale-variant, pose-dedicated rigid appearance templates. The templates, which are equipped with pose-related keypoints and segmentation masks, allow for explicit pose estimation and segmentation of multiple end-effectors, as well as fine-grained non-maximum suppression. We train the templates by grouping examples of end-effector articulations, imaged at various viewpoints, in the resulting space of instrument shapes. The proposed shape-based grouping forms tight clusters of pose-specific end-effector appearance. Experimental results show that the proposed method can effectively estimate the end-effector pose and delineate its boundary while being trained on moderately sized data clusters. We then show that matching such a large ensemble of templates takes less than one second on commodity hardware.
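To make the pipeline concrete, here is a minimal sketch of matching a template ensemble and running non-maximum suppression over the detections. It is an illustration under assumptions, not the paper's implementation: the template record fields (`patch`, `keypoints`, `mask`), the thresholds, and all helper names are hypothetical, and normalized cross-correlation stands in for whatever matching score the method actually uses.

```python
# Hypothetical sketch of ensemble template matching with NMS and
# keypoint/mask transfer; not the authors' implementation.
import numpy as np
import cv2

def match_ensemble(image, templates, score_thresh=0.8):
    """Slide every rigid template over the grayscale image with normalized
    cross-correlation; keep locations scoring above the threshold."""
    detections = []
    for idx, t in enumerate(templates):
        result = cv2.matchTemplate(image, t["patch"], cv2.TM_CCOEFF_NORMED)
        ys, xs = np.where(result >= score_thresh)
        for x, y in zip(xs, ys):
            detections.append((float(result[y, x]), int(x), int(y), idx))
    return detections

def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def nms(detections, templates, iou_thresh=0.3):
    """Greedy non-maximum suppression: keep the best-scoring detection,
    drop weaker detections whose boxes overlap it too much."""
    kept = []
    for score, x, y, idx in sorted(detections, reverse=True):
        h, w = templates[idx]["patch"].shape
        box = (x, y, x + w, y + h)
        if all(iou(box, b) < iou_thresh for _, b, _ in kept):
            kept.append((score, box, idx))
    return kept

def pose_and_mask(detection, templates):
    """Transfer the winning template's keypoints and mask into image
    coordinates, yielding a pose estimate and a segmentation."""
    _, (x0, y0, _, _), idx = detection
    t = templates[idx]
    keypoints = [(x0 + kx, y0 + ky) for kx, ky in t["keypoints"]]
    return keypoints, t["mask"], (x0, y0)
```

In this sketch the pose estimate and segmentation come almost for free: because each template is pose-dedicated and carries its own keypoints and mask, the winning template's annotations are simply shifted to the detection location, which is what the abstract means by explicit pose estimation and segmentation.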
Additional information
- DOI
- 10.1007/978-3-319-67543-5_1
- Category
- Conference activity
- Type
- Conference proceedings indexed in Web of Science
- Language
- English
- Publication year
- 2017