Datawhale Task02: 練死勁兒 network design

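Prior boxes and anchor boxes are the same concept.

To reduce the number of prior boxes, the image is first downsampled with VGG16, and the boxes are placed on the resulting 7*7 feature map.

Generating the prior boxes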

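from math import sqrt

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def create_prior_boxes():
    """Create the 441 prior (default) boxes for the network, as described in the tutorial.

    The final VGG16 feature map is 7*7. Each cell of the feature map gets 9 candidate
    boxes of different sizes and shapes (3 scales * 3 aspect ratios = 9), so the total
    number of candidate boxes is 7 * 7 * 9 = 441.

    :return: prior boxes in center-size coordinates, a tensor of dimensions (441, 4)
    """
    fmap_dims = 7  # the final VGG16 feature map is 7*7
    # 9 candidate boxes per cell (3 scales * 3 aspect ratios = 9)
    obj_scales = [0.2, 0.4, 0.6]
    aspect_ratios = [1., 2., 0.5]
    prior_boxes = []
    for i in range(fmap_dims):
        for j in range(fmap_dims):
            # +0.5 moves from the cell's corner to its center;
            # dividing by fmap_dims normalizes the coordinate to the feature map
            cx = (j + 0.5) / fmap_dims
            cy = (i + 0.5) / fmap_dims
            for obj_scale in obj_scales:
                for ratio in aspect_ratios:
                    # The loop body is missing from the source. Reconstructed on the assumption
                    # that width = scale * sqrt(ratio) and height = scale / sqrt(ratio).
                    prior_boxes.append([cx, cy, obj_scale * sqrt(ratio), obj_scale / sqrt(ratio)])
    prior_boxes = torch.FloatTensor(prior_boxes).to(device)  # (441, 4)
    prior_boxes.clamp_(0, 1)  # (441, 4); values are normalized, so clamp to [0, 1] to stay in bounds
    return prior_boxes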

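The prior boxes obtained at this point are normalized relative to the feature map. To compute IoU against boxes on the original image, or to display them, they need to be mapped back to image coordinates:

img_prior_boxes = prior_boxes * image_size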
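As a rough illustration of the mapping above, the sketch below scales one normalized center-size box to pixels and converts it to corner form, which is the form usually needed for IoU or drawing. The helper name `prior_to_pixel_xyxy` and the 224x224 image size are assumptions made for this example and are not part of the original code.

```python
def prior_to_pixel_xyxy(box, img_w, img_h):
    """Scale a normalized (cx, cy, w, h) box to pixels and return (x_min, y_min, x_max, y_max)."""
    cx, cy, w, h = box
    cx, w = cx * img_w, w * img_w
    cy, h = cy * img_h, h * img_h
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


# Prior at the center cell of a 7x7 map with scale 0.2 and aspect ratio 1,
# mapped onto a hypothetical 224x224 image.
center_prior = (3.5 / 7, 3.5 / 7, 0.2, 0.2)
print(prior_to_pixel_xyxy(center_prior, 224, 224))
```

The box comes out as a square of about 45 pixels centered on the image, which matches the intuition that a scale of 0.2 covers one fifth of the image side.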

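In the spirit of residual networks, the model does not predict the target box directly; it predicts the offset between the prior box and the ground-truth box.

The factors 10 and 5 used in the offset encoding are empirical values, chosen for "scaling the localization gradient".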
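To make the role of the two factors concrete, here is a minimal sketch of an SSD-style offset encoding. It assumes that center offsets are divided by the prior's size and multiplied by 10, and that log size ratios are multiplied by 5. The function names `encode_offsets` and `decode_offsets` are hypothetical and chosen for this example only.

```python
import math


def encode_offsets(gt, prior):
    """Encode a ground-truth box relative to a prior box; both are (cx, cy, w, h)."""
    g_cx = (gt[0] - prior[0]) / (prior[2] / 10)  # center shift in units of prior width, times 10
    g_cy = (gt[1] - prior[1]) / (prior[3] / 10)  # center shift in units of prior height, times 10
    g_w = math.log(gt[2] / prior[2]) * 5         # log width ratio, times 5
    g_h = math.log(gt[3] / prior[3]) * 5         # log height ratio, times 5
    return (g_cx, g_cy, g_w, g_h)


def decode_offsets(offsets, prior):
    """Invert encode_offsets: recover (cx, cy, w, h) from offsets and the prior box."""
    cx = offsets[0] * prior[2] / 10 + prior[0]
    cy = offsets[1] * prior[3] / 10 + prior[1]
    w = math.exp(offsets[2] / 5) * prior[2]
    h = math.exp(offsets[3] / 5) * prior[3]
    return (cx, cy, w, h)


prior = (0.5, 0.5, 0.2, 0.2)
gt = (0.55, 0.48, 0.3, 0.25)
offsets = encode_offsets(gt, prior)
print(offsets)
print(decode_offsets(offsets, prior))  # recovers gt up to floating-point error
```

A ground-truth box identical to the prior encodes to all zeros, so the network only has to learn small corrections around each prior.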

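For every prior box on the 7x7 feature map produced by VGG16, the model has to predict:

1) A set of 21 class scores for the bounding box, covering the 20 VOC classes plus one background class.

2) The encoded offsets of the bounding box.

This is implemented by attaching two convolutional layers to the feature map:

1) A class-prediction convolutional layer with 3x3 kernels, padding 1 and stride 1. Each anchor needs 21 filters and each position has 9 anchors, so 21x9 filters are needed.

2) A localization-prediction convolutional layer, also with 3x3 kernels, padding 1 and stride 1. Each anchor needs 4 filters, so 4x9 filters are needed.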
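The filter counts above can be checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid; the 512 input channels are taken from the core code of this post, and the helper name `conv_params` is hypothetical.

```python
n_classes = 21    # 20 VOC classes + background
n_boxes = 9       # anchors per feature-map position
fmap_dims = 7     # the feature map is 7x7
in_channels = 512 # input channels of both prediction layers, as in the core code


def conv_params(in_ch, out_ch, kernel=3):
    """Number of weights plus biases in a kernel x kernel convolution."""
    return out_ch * (in_ch * kernel * kernel + 1)


cls_filters = n_classes * n_boxes              # 21 x 9 = 189 output channels
loc_filters = 4 * n_boxes                      # 4 x 9 = 36 output channels
total_boxes = fmap_dims * fmap_dims * n_boxes  # 7 x 7 x 9 = 441 prior boxes

print(cls_filters, loc_filters, total_boxes)
print(conv_params(in_channels, cls_filters), conv_params(in_channels, loc_filters))
```

With padding 1 and stride 1 the 3x3 convolutions keep the 7x7 spatial size, so each position yields exactly 9 sets of 21 scores and 9 sets of 4 offsets.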

Core code:

        ...
        self.n_classes = n_classes  # 21
        # number of prior-boxes we are considering per position in the feature map
        # 9 prior-boxes per position: 3 scales x 3 aspect ratios
        n_boxes = 9
        # localization prediction convolutions (predict offsets w.r.t. prior-boxes)
        self.loc_conv = nn.Conv2d(512, n_boxes * 4, kernel_size=3, padding=1)
        # class prediction convolutions (predict classes in localization boxes)
        self.cl_conv = nn.Conv2d(512, n_boxes * n_classes, kernel_size=3, padding=1)
        ...

    def forward(self, pool5_feats):
        ...
        # (batch_size is not defined in this excerpt; presumably it is set in the elided lines)
        # predict localization boxes' bounds (as offsets w.r.t. prior-boxes)
        l_conv = self.loc_conv(pool5_feats)  # (n, n_boxes * 4, 7, 7)
        # (n, 7, 7, n_boxes * 4), to match prior-box order (after .view())
        # (.contiguous() ensures it is stored in a contiguous chunk of memory, needed for .view() below)
        l_conv = l_conv.permute(0, 2, 3, 1).contiguous()
        locs = l_conv.view(batch_size, -1, 4)  # (n, 441, 4), there are a total of 441 boxes on this feature map

        # predict classes in localization boxes
        c_conv = self.cl_conv(pool5_feats)  # (n, n_boxes * n_classes, 7, 7)
        # (n, 7, 7, n_boxes * n_classes), to match prior-box order (after .view())
        c_conv = c_conv.permute(0, 2, 3, 1).contiguous()
        classes_scores = c_conv.view(batch_size, -1, self.n_classes)  # (n, 441, n_classes)
        return locs, classes_scores
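The comments in the code say the permute and view are needed "to match prior-box order". The pure-Python sketch below illustrates why, under the assumption that the prior boxes are generated row by row, then column by column, then box by box within a cell, as in create_prior_boxes. The helper name `box_index_to_cell` is hypothetical.

```python
fmap_dims, n_boxes = 7, 9


def box_index_to_cell(k):
    """Map flat box index k (after permute(0, 2, 3, 1) and view(-1, 4)) to (row, column, box-in-cell)."""
    i, rem = divmod(k, fmap_dims * n_boxes)
    j, b = divmod(rem, n_boxes)
    return i, j, b


# Enumerate prior boxes in the same nested-loop order as create_prior_boxes.
prior_order = [
    (i, j, b)
    for i in range(fmap_dims)
    for j in range(fmap_dims)
    for b in range(n_boxes)
]

# Every flat index points at the same cell and box in both orderings.
assert all(box_index_to_cell(k) == prior_order[k] for k in range(len(prior_order)))
print(len(prior_order), box_index_to_cell(9), box_index_to_cell(440))
```

Box b of a cell reads channels 4*b to 4*b+3 of the localization output. Without the permute, a view on the (n, n_boxes * 4, 7, 7) layout would group values from neighbouring spatial positions instead of the four offsets of one box.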

How to compute the feature map size after convolution and pooling
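The usual rule for a convolution or pooling layer with input size n, kernel size k, stride s and padding p is an output size of floor((n + 2p - k) / s) + 1. The sketch below applies it to a simplified view of VGG16, assuming a 224x224 input and treating each of the five stages as one 3x3 convolution (padding 1) followed by a 2x2 max-pool with stride 2. The function name `out_size` is hypothetical.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1


size = 224  # assumed input resolution
for stage in range(5):  # VGG16 has five max-pool stages
    size = out_size(size, kernel=3, stride=1, padding=1)  # 3x3 conv with padding 1 keeps the size
    size = out_size(size, kernel=2, stride=2)             # 2x2 max-pool with stride 2 halves it
    print(stage + 1, size)

print(size)  # 7
```

Repeated 3x3 convolutions within a stage do not change the size, so only the five pooling layers matter: 224 / 2^5 = 7, which is the 7x7 feature map used for the prior boxes.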

3.4 Model structure
