CAO Zhengtao, HUANG Wenfeng, NING Zhigang, LIAO Xiangyun, XIONG Xueying, WANG Qiong. Self-supervised Monocular Endoscope Depth Estimation Based on Semi-dense COLMAP[J]. Journal of University of South China (Science and Technology), 2021, (5): 52-62.
Self-supervised Monocular Endoscope Depth Estimation Based on Semi-dense COLMAP
Submitted: 2021-02-03
DOI:
Keywords: monocular endoscope; COLMAP; attention model; depth estimation; self-supervision
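Funding: National Natural Science Foundation of China, Young Scientists Fund (61902386; 62072452); Key-Area Research and Development Program of Guangdong Province (2020B010165004); Natural Science Foundation of Guangdong Province (2018A030313100); Shenzhen Key Basic Research Project (JCYJ20180507182415428)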
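Authors, affiliations, and e-mail:
CAO Zhengtao: School of Electrical Engineering, University of South China, Hengyang, Hunan 421001; CAS Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055. E-mail: 1187816147@qq.com, xiongxueying.snow@163.com
HUANG Wenfeng: CAS Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055
NING Zhigang: School of Electrical Engineering, University of South China, Hengyang, Hunan 421001
LIAO Xiangyun: CAS Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055
XIONG Xueying: Department of Medical Imaging, Zhongnan Hospital of Wuhan University, Wuhan, Hubei 473001. E-mail: xiongxueying.snow@163.com
WANG Qiong: CAS Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055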
Abstract:
      In monocular endoscopy, the surfaces of tissues inside the human body are sparsely textured and the field of view is limited, which makes image depth estimation difficult. To address these problems, a self-supervised monocular depth estimation method is proposed that combines semi-dense COLMAP (a structure-from-motion and multi-view stereo pipeline) with a dynamic-convolution attention mechanism. An improved COLMAP preprocesses the image sequence to generate weighted semi-dense depth maps that serve as the supervisory signal; in this step, a weighted reliability measure is introduced to discard or suppress interfering and inaccurate points in the semi-dense depth maps. An attention model with dynamic convolution (Selective Kernel Networks, SKNet) is added to the training network; it dynamically convolves the input feature maps to gather information from more receptive fields, which strengthens the network's feature extraction. Experiments on a liver dataset give an absolute relative difference of 0.135 and an accuracy of 0.985 at threshold T < 1.25^3. Ablation experiments on the supervision data and the SKNet model demonstrate the effectiveness of semi-dense reconstruction, the SKNet model, and the weighted semi-dense depth maps.
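The two figures reported in the abstract can be illustrated with a minimal sketch. It assumes the definitions commonly used for monocular depth estimation (mean of |pred - gt| / gt, and the fraction of pixels whose ratio max(pred/gt, gt/pred) falls below a threshold); the abstract itself does not spell out the formulas. Reading the threshold as 1.25^3 is also an assumption, and the function name `depth_metrics` is hypothetical.

```python
from typing import Sequence, Tuple


def depth_metrics(pred: Sequence[float], gt: Sequence[float],
                  threshold: float = 1.25 ** 3) -> Tuple[float, float]:
    """Return (absolute relative difference, threshold accuracy).

    Assumed conventions, not taken from the paper: only pixels where both
    the predicted and the reference depth are positive are scored.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if p > 0 and g > 0]
    if not pairs:
        raise ValueError("no valid depth pairs to evaluate")

    # Mean of |pred - gt| / gt over the valid pixels.
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / len(pairs)

    # A pixel counts as accurate when max(pred/gt, gt/pred) < threshold.
    accurate = sum(1 for p, g in pairs if max(p / g, g / p) < threshold)
    return abs_rel, accurate / len(pairs)


if __name__ == "__main__":
    # Toy values for illustration only, not data from the paper.
    abs_rel, accuracy = depth_metrics([1.0, 2.0, 4.0], [1.0, 1.0, 2.0])
    print(f"abs_rel={abs_rel:.3f}, accuracy={accuracy:.3f}")
```

With semi-dense supervision, the reference depth is defined only at a subset of pixels, so the positivity filter in this sketch doubles as a validity mask for missing points.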