Journal of University of South China (Science and Technology)

CAO Zhengtao, HUANG Wenfeng, NING Zhigang, LIAO Xiangyun, XIONG Xueying, WANG Qiong. Self-supervised Monocular Endoscope Depth Estimation Based on Semi-dense COLMAP[J]. Journal of University of South China (Science and Technology), 2021, (5): 52~62.

Self-supervised Monocular Endoscope Depth Estimation Based on Semi-dense COLMAP

Submitted: 2021-02-03

DOI：

Keywords: monocular endoscope; COLMAP; attention model; depth estimation; self-supervision

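Funding: National Natural Science Foundation of China, Youth Fund (61902386; 62072452); Guangdong Province Key-Area Research and Development Program (2020B010165004); Natural Science Foundation of Guangdong Province (2018A030313100); Shenzhen Key Basic Research Project (JCYJ20180507182415428)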

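Author	Affiliation	E-mail
CAO Zhengtao	School of Electrical Engineering, University of South China, Hengyang, Hunan 421001; CAS Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055	1187816147@qq.com,xiongxueying.snow@163.com
HUANG Wenfeng	CAS Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055
NING Zhigang	School of Electrical Engineering, University of South China, Hengyang, Hunan 421001
LIAO Xiangyun	CAS Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055
XIONG Xueying	Department of Medical Imaging, Zhongnan Hospital of Wuhan University, Wuhan, Hubei 473001	xiongxueying.snow@163.com
WANG Qiong	CAS Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055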

Abstract:

In monocular endoscopic scenes, the sparse surface texture of human tissue and the limited field of view make image depth estimation difficult. To address these problems, a self-supervised monocular depth estimation method is proposed that combines semi-dense COLMAP (a structure-from-motion and multi-view stereo pipeline) with a dynamic-convolution attention mechanism. An improved COLMAP preprocesses the image sequence to produce weighted semi-dense depth maps that serve as the supervisory signal; in this step, a weighted reliability is introduced to discard or suppress interference points and inaccurate points in the semi-dense depth maps. An attention model with dynamic convolution (Selective Kernel Networks, SKNet) is added to the training network; it applies dynamic convolution to the input feature maps to gather information from more receptive fields, which strengthens the network's feature extraction. In experiments on a liver dataset, the absolute relative difference is 0.135 and the accuracy is 0.985 at the threshold T < 1.25³. Ablation experiments on the supervision data and the SKNet model demonstrate the effectiveness of the semi-dense reconstruction, the SKNet model, and the weighted semi-dense depth maps.
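
The two reported figures follow the usual depth-estimation definitions: the absolute relative difference is the mean of |d - d*| / d* over valid pixels, and the threshold accuracy is the fraction of pixels with max(d / d*, d* / d) below 1.25³. The sketch below is a minimal pure-Python illustration of these metrics evaluated against a semi-dense target, together with one possible reliability-weighted L1 supervision term. The function names, the "target <= 0 means no depth" convention, and the exact form of the weighted loss are assumptions made for illustration; the abstract does not specify the authors' formulation.

```python
def masked_depth_metrics(pred, target, weights=None, power=3):
    """Return (absolute relative difference, threshold accuracy).

    pred, target: flat sequences of depth values.
    Assumed convention: target <= 0 marks a pixel with no semi-dense depth,
    and a reliability weight of 0 marks a discarded pixel.
    """
    if weights is None:
        weights = [1.0] * len(target)
    threshold = 1.25 ** power
    abs_rel_sum = 0.0
    hits = 0
    count = 0
    for p, t, w in zip(pred, target, weights):
        if t <= 0 or w <= 0:
            continue  # skip pixels without usable supervision
        count += 1
        abs_rel_sum += abs(p - t) / t
        # A non-positive prediction can never satisfy the ratio test.
        ratio = max(p / t, t / p) if p > 0 else float("inf")
        if ratio < threshold:
            hits += 1
    if count == 0:
        raise ValueError("no valid pixels to evaluate")
    return abs_rel_sum / count, hits / count


def weighted_l1_loss(pred, target, weights):
    """Reliability-weighted L1 loss over pixels that have a target depth.

    Hypothetical form: low-weight pixels are suppressed, zero-weight pixels
    are effectively discarded.
    """
    numerator = 0.0
    denominator = 0.0
    for p, t, w in zip(pred, target, weights):
        if t <= 0:
            continue
        numerator += w * abs(p - t)
        denominator += w
    return numerator / denominator if denominator > 0 else 0.0


if __name__ == "__main__":
    pred = [1.0, 2.0, 4.0, 5.0]
    target = [1.0, 4.0, 0.0, 5.0]    # third pixel has no semi-dense depth
    weights = [1.0, 0.5, 1.0, 0.0]   # fourth pixel is discarded
    print(masked_depth_metrics(pred, target, weights))
    print(weighted_l1_loss(pred, target, weights))
```

Normalizing the loss by the sum of weights keeps its scale comparable between frames with many and few reliable points, which matters when the supervision is only semi-dense.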
