Top computer vision conferences (ICCV, CVPR, ECCV)
Paper | Code | Venue/Year | Author/Affiliation | Notes |
---|---|---|---|---|
R-CNN | | CVPR 2014 | Ross Girshick/UC Berkeley | Two-stage |
Fast R-CNN | | ICCV 2015 | Ross Girshick/Microsoft Research | Two-stage |
Faster R-CNN | | NIPS 2015 | Shaoqing Ren/Microsoft Research | Two-stage |
YOLO | | CVPR 2016 | Joseph Redmon/University of Washington | One-stage |
SSD | | ECCV 2016 | Wei Liu/UNC Chapel Hill | One-stage |
FPN | | CVPR 2017 | Tsung-Yi Lin/Facebook AI Research | |
YOLO v2 | | CVPR 2017 | Joseph Redmon/University of Washington | One-stage |
RetinaNet | | ICCV 2017 | Tsung-Yi Lin/Facebook AI Research | One-stage |
Mask R-CNN | | ICCV 2017 | Kaiming He/Facebook AI Research | Two-stage |
YOLO v3 | | arXiv 2018 | Joseph Redmon/University of Washington | One-stage |
RefineDet | | CVPR 2018 | | |
Diagram comparison
R-CNN family
YOLO
SSD
SSD
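Core design ideas
- Detect on multi-scale feature maps (improves accuracy)
- Use prior (default) boxes (improves speed)
- Detect with convolutions (no pooling over feature blocks)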
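A minimal sketch of how prior (default) boxes might be laid out across multi-scale feature maps. The linear scale rule and the values `s_min = 0.2`, `s_max = 0.9` follow the SSD paper; the function names, the three aspect ratios, and the feature-map sizes (taken from the layer-size code in these notes, so without the 38x38 map) are assumptions for illustration. The paper's extra ratio-1 box per cell is omitted.

```python
import math

def prior_box_scales(num_maps, s_min=0.2, s_max=0.9):
    """Box scale (relative to image size) for each feature map, spaced linearly."""
    if num_maps == 1:
        return [s_min]
    step = (s_max - s_min) / (num_maps - 1)
    return [s_min + step * k for k in range(num_maps)]

def prior_boxes(map_size, scale, ratios=(1.0, 2.0, 0.5)):
    """Return (cx, cy, w, h) boxes, in relative coordinates, for a square feature map."""
    boxes = []
    for i in range(map_size):
        for j in range(map_size):
            # Centre of cell (i, j), normalised to [0, 1]
            cx = (j + 0.5) / map_size
            cy = (i + 0.5) / map_size
            for r in ratios:
                boxes.append((cx, cy, scale * math.sqrt(r), scale / math.sqrt(r)))
    return boxes

# Assumed feature-map sizes, matching the layer-size code in these notes
map_sizes = [19, 10, 5, 3, 1]
scales = prior_box_scales(len(map_sizes))
counts = [len(prior_boxes(n, s)) for n, s in zip(map_sizes, scales)]
print("scales:", [round(s, 3) for s in scales])
print("boxes per map:", counts, "total:", sum(counts))
```

Small boxes sit on the large, early feature maps and large boxes on the small, late ones, which is how a single forward pass covers objects of different sizes.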
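Difficulties
- Obtaining positive and negative samples (hard negative mining)
- Predicting bbox and cls from feature blocks (by concatenation rather than pooling)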
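Hard negative mining can be sketched as follows: keep every positive prior box, rank the negatives by confidence loss, and keep only the hardest ones, up to a fixed negative-to-positive ratio. The 3:1 ratio is the one reported in the SSD paper; the function name, arguments, and sample values are hypothetical.

```python
def hard_negative_mining(conf_losses, is_positive, neg_pos_ratio=3):
    """Return sorted indices of the prior boxes kept for the confidence loss."""
    positives = [i for i, pos in enumerate(is_positive) if pos]
    negatives = [i for i, pos in enumerate(is_positive) if not pos]
    # Hardest negatives first: those the classifier currently gets most wrong
    negatives.sort(key=lambda i: conf_losses[i], reverse=True)
    num_neg = min(len(negatives), neg_pos_ratio * len(positives))
    return sorted(positives + negatives[:num_neg])

# Made-up per-box confidence losses; only box 1 is matched to an object
losses = [0.1, 2.3, 0.05, 1.7, 0.9, 0.4, 0.02, 1.1]
is_pos = [False, True, False, False, False, False, False, False]
kept = hard_negative_mining(losses, is_pos)
print(kept)
```

Without this step the background boxes would outnumber the positives by a wide margin and dominate the confidence loss.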
Data
PASCAL VOC2007
PASCAL VOC2012
COCO
Model
Code for inferring the size of each layer:

```python
from mxnet import nd
from mxnet.gluon import nn

# data_Conv6 is the output of layer Conv6; op_Conv6 is the convolution applied to it.
# Data layout is batch x channel x height x width.
data_Conv6 = nd.random.randn(1, 1024, 19, 19)
print("data_Conv6.shape =", data_Conv6.shape)

op_Conv6 = nn.Conv2D(1024, kernel_size=1)  # Conv: 1*1*1024
op_Conv6.initialize(force_reinit=True)
data_Conv7 = op_Conv6(data_Conv6)  # 19*19*1024
print("data_Conv7.shape =", data_Conv7.shape)

op_Conv7 = nn.Sequential()
op_Conv7.add(nn.Conv2D(256, kernel_size=1))  # Conv: 1*1*256
op_Conv7.add(nn.Conv2D(512, kernel_size=3, padding=1, strides=2))  # Conv: 3*3*512-s2
op_Conv7.initialize(force_reinit=True)
data_Conv8_2 = op_Conv7(data_Conv7)  # 10*10*512
print("data_Conv8_2.shape =", data_Conv8_2.shape)

op_Conv8_2 = nn.Sequential()
op_Conv8_2.add(nn.Conv2D(128, kernel_size=1))  # Conv: 1*1*128
op_Conv8_2.add(nn.Conv2D(256, kernel_size=3, padding=1, strides=2))  # Conv: 3*3*256-s2
op_Conv8_2.initialize(force_reinit=True)
data_Conv9_2 = op_Conv8_2(data_Conv8_2)  # 5*5*256
print("data_Conv9_2.shape =", data_Conv9_2.shape)

op_Conv9_2 = nn.Sequential()
op_Conv9_2.add(nn.Conv2D(128, kernel_size=1))  # Conv: 1*1*128
op_Conv9_2.add(nn.Conv2D(256, kernel_size=3, strides=1))  # Conv: 3*3*256-s1, no padding
op_Conv9_2.initialize(force_reinit=True)
data_Conv10_2 = op_Conv9_2(data_Conv9_2)  # 3*3*256
print("data_Conv10_2.shape =", data_Conv10_2.shape)

op_Conv10_2 = nn.Sequential()
op_Conv10_2.add(nn.Conv2D(128, kernel_size=1))  # Conv: 1*1*128
op_Conv10_2.add(nn.Conv2D(256, kernel_size=3, strides=1))  # Conv: 3*3*256-s1, no padding
op_Conv10_2.initialize(force_reinit=True)
data_Conv11_2 = op_Conv10_2(data_Conv10_2)  # 1*1*256
print("data_Conv11_2.shape =", data_Conv11_2.shape)
```
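The same spatial sizes can be checked by hand with the standard convolution output-size rule, `floor((n + 2p - k) / s) + 1`, without running MXNet. The sketch below is framework-free; the helper names and the `(kernel, stride, padding)` tuples are introduced here for illustration and mirror the layer settings in the code above.

```python
def conv_out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1

def stack_out_size(size, layers):
    """Apply a sequence of (kernel, stride, padding) convolutions to a spatial size."""
    for kernel, stride, padding in layers:
        size = conv_out_size(size, kernel, stride, padding)
    return size

# (kernel, stride, padding) for each block, in the order the blocks are applied
blocks = [
    ("Conv7", [(1, 1, 0)]),
    ("Conv8_2", [(1, 1, 0), (3, 2, 1)]),
    ("Conv9_2", [(1, 1, 0), (3, 2, 1)]),
    ("Conv10_2", [(1, 1, 0), (3, 1, 0)]),
    ("Conv11_2", [(1, 1, 0), (3, 1, 0)]),
]

size = 19  # spatial size of the Conv6 output
sizes = {}
for name, layers in blocks:
    size = stack_out_size(size, layers)
    sizes[name] = size
    print(name, "->", size, "x", size)
```

The stride-2 blocks roughly halve the map (19 to 10 to 5), while the unpadded 3x3 convolutions each shave off two pixels (5 to 3 to 1).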
Loss function
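The SSD paper defines the training objective as a confidence loss (softmax cross-entropy) plus a weighted localization loss (Smooth L1 on the matched boxes), divided by the number of matched default boxes, with the loss set to 0 when nothing matches. A minimal sketch under those assumptions follows. The function names and data layout are hypothetical, and the inputs are assumed to be the boxes already selected for training (for example, after hard negative mining).

```python
import math

def smooth_l1(x):
    """Smooth L1: quadratic near zero, linear for |x| >= 1."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5

def softmax_cross_entropy(logits, label):
    """Numerically stable -log(softmax(logits)[label])."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def ssd_loss(loc_preds, loc_targets, cls_logits, cls_labels, alpha=1.0):
    """loc_*: per-box lists of 4 offsets; cls_labels: class index, 0 = background."""
    num_pos = sum(1 for c in cls_labels if c > 0)
    if num_pos == 0:
        return 0.0
    # The localization loss only counts boxes matched to an object
    loc = sum(
        smooth_l1(p - t)
        for preds, targets, c in zip(loc_preds, loc_targets, cls_labels)
        if c > 0
        for p, t in zip(preds, targets)
    )
    # The confidence loss counts every box passed in
    conf = sum(softmax_cross_entropy(z, c) for z, c in zip(cls_logits, cls_labels))
    return (conf + alpha * loc) / num_pos

# One positive box (class 1) and one background box, with made-up predictions
loss = ssd_loss(
    loc_preds=[[0.5, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]],
    loc_targets=[[0.0] * 4, [0.0] * 4],
    cls_logits=[[0.0, 0.0], [0.0, 0.0]],
    cls_labels=[1, 0],
)
print(round(loss, 4))
```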
Optimization algorithm
SGD
- initial learning rate 10^-3
- 0.9 momentum
- 0.0005 weight decay
- batch size 32
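To make the role of each hyperparameter concrete, here is a plain-Python sketch of one SGD step with momentum and L2 weight decay, using the values listed above as defaults. This is one common formulation, and frameworks differ in the details (for example, where the learning rate enters the velocity update). The function name is hypothetical.

```python
def sgd_momentum_step(weights, grads, velocity,
                      lr=1e-3, momentum=0.9, weight_decay=5e-4):
    """Return updated (weights, velocity) after one SGD step."""
    new_weights, new_velocity = [], []
    for w, g, v in zip(weights, grads, velocity):
        g = g + weight_decay * w      # weight decay acts as an L2 penalty gradient
        v = momentum * v - lr * g     # velocity accumulates past gradients
        new_velocity.append(v)
        new_weights.append(w + v)
    return new_weights, new_velocity

# One parameter, starting from zero velocity
w, v = sgd_momentum_step([1.0], [0.5], [0.0])
print(w, v)
```

The batch size of 32 does not appear in the update itself; it determines how many samples are averaged to produce each gradient.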
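References
Link | Description |
---|---|
SSD Paper, SSD Slide, SSD Code-Caffe | Official |
Object Detection: SSD Principles and Implementation | Detailed explanation, with a TensorFlow implementation |
Deep Learning Notes (7): SSD Paper Reading Notes, Simplified | Detailed explanation, especially of how positive and negative samples are obtained; the same author's Deep Learning Notes (7): SSD Paper Reading Notes is a good follow-up |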
YOLO
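References
Link | Description |
---|---|
YOLOv1 Paper, YOLOv2 Paper, YOLOv3 Paper | Official |
Object Detection: YOLO Principles and Implementation | Detailed explanation, with a TensorFlow implementation |
Object Detection: YOLOv2 Principles and Implementation (with YOLOv3) | Detailed explanation, with a TensorFlow implementation |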
Faster R-CNN
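References
GluonCV-Detection
(Synced) From R-CNN to RFBNet: A Full Review of Five Years of Object Detection Architectures
A Survey of Deep-Learning-Based Object Detection Algorithms
mAP in Plain Language
How NMS (Non-Maximum Suppression) Works, with a Python Implementation
Soft-NMS
Summary of R-CNN, SPP-Net, Fast R-CNN, and Faster R-CNN
Spatial Transformer Networks
Understanding Spatial Transformer Networks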