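SSD source code analysis

1. preprocess_for_eval()

Image preprocessing.

1) tf_image_whitened()

Subtracts from each of the R, G and B channels the mean pixel value computed over the image dataset.

2) tf_image.resize_image()

Resizes the image to (300, 300, 3).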
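The two preprocessing steps can be sketched in NumPy as follows. This is only an illustration: the channel means are assumed placeholder values (the real ones come from the dataset statistics), and the nearest-neighbour resize is a simple stand-in for the library's resize op, not the interpolation the original code uses.

```python
import numpy as np

# Assumed per-channel means (R, G, B); placeholders, not taken from the article.
ASSUMED_RGB_MEANS = np.array([123.0, 117.0, 104.0], dtype=np.float32)


def whiten(image, means=ASSUMED_RGB_MEANS):
    """Subtract the per-channel mean from an (H, W, 3) RGB image."""
    return image.astype(np.float32) - means


def resize_nearest(image, out_shape=(300, 300)):
    """Nearest-neighbour resize of an (H, W, C) image to out_shape."""
    h, w = image.shape[:2]
    # For every output row/column, pick the source row/column it maps to.
    rows = np.arange(out_shape[0]) * h // out_shape[0]
    cols = np.arange(out_shape[1]) * w // out_shape[1]
    return image[rows[:, None], cols[None, :]]


def preprocess_for_eval_sketch(image):
    """Whiten, then resize to (300, 300, 3)."""
    return resize_nearest(whiten(image))
```

For example, `preprocess_for_eval_sketch(np.zeros((480, 640, 3), dtype=np.uint8))` returns a float32 array of shape (300, 300, 3).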

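2. ssd_net

1) ssdnet()

Initialization.

Parameters defined:

feat_layers: the feature layers from which anchors are extracted.

feat_shape: the size of each layer in feat_layers. It starts out as a two-element list, [feat_w, feat_h]; once the SSD network has been built it is updated to three elements, [feat_w, feat_h, num_anchors];

anchor_size: the predefined anchor box sizes for each layer in feat_layers;

anchor_ratio: the sequence of anchor aspect-ratio factors for each layer in feat_layers;

anchor_steps: the sliding stride of the anchors for each layer in feat_layers.

Note: len(feat_layers) == len(feat_shape) == len(anchor_size) == len(anchor_ratio) == len(anchor_steps).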
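A minimal sketch of such a parameter set, with a check of the length constraint above. The container name `SSDParamsSketch` and the helper `check_lengths` are hypothetical, and all numeric values are illustrative assumptions for a 300x300 input; only the feature layer names come from this article.

```python
from collections import namedtuple

# Hypothetical container; the field names follow the parameters listed above.
SSDParamsSketch = namedtuple(
    "SSDParamsSketch",
    ["feat_layers", "feat_shapes", "anchor_sizes", "anchor_ratios", "anchor_steps"],
)

# Illustrative values for a 300x300 input (assumptions, not read from the article).
params = SSDParamsSketch(
    feat_layers=["block4", "block7", "block8", "block9", "block10", "block11"],
    feat_shapes=[(38, 38), (19, 19), (10, 10), (5, 5), (3, 3), (1, 1)],
    anchor_sizes=[(21.0, 45.0), (45.0, 99.0), (99.0, 153.0),
                  (153.0, 207.0), (207.0, 261.0), (261.0, 315.0)],
    anchor_ratios=[[2, 0.5], [2, 0.5, 3, 1 / 3], [2, 0.5, 3, 1 / 3],
                   [2, 0.5, 3, 1 / 3], [2, 0.5], [2, 0.5]],
    anchor_steps=[8, 16, 32, 64, 100, 300],
)


def check_lengths(p):
    """Return True when every per-layer list has one entry per feature layer."""
    return len({len(field) for field in p}) == 1
```

Running `check_lengths(params)` before building the network catches a configuration in which one of the per-layer lists is too short or too long.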

2) ssd_net.net()

Defines the SSD network, ssd_net().

Intermediate values:

$end_points: stores the output of every layer of the network; the values are as follows:

'block1': shape = (1, 300, 300, 64),

'block2': shape = (1, 150, 150, 128),

'block3': shape = (1, 75, 75, 256),

'block4': shape = (1, 38, 38, 512),

'block5': shape = (1, 19, 19, 512),

'block6': shape = (1, 19, 19, 1024),

'block7': shape = (1, 19, 19, 1024),

'block8': shape = (1, 10, 10, 512),

'block9': shape = (1, 5, 5, 256),

'block10': shape = (1, 3, 3, 256),

'block11': shape = (1, 1, 1, 256)
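The spatial sizes of the first five blocks (300, 150, 75, 38, 19) follow a simple pattern. The sketch below reproduces them under the assumption that each of these blocks is followed by a stride-2 pooling layer with 'SAME' padding, so the output size is ceil(input / 2), and that the end point is recorded before the pooling. The function names are hypothetical.

```python
import math


def same_pool(size, stride=2):
    """Output size of a strided pooling layer with 'SAME' padding."""
    return math.ceil(size / stride)


def vgg_block_sizes(input_size=300, num_blocks=5):
    """Spatial size recorded for block1..blockN, before each block's pooling."""
    sizes = {}
    size = input_size
    for i in range(1, num_blocks + 1):
        sizes["block%d" % i] = size
        size = same_pool(size)  # pooling that follows this block (assumed)
    return sizes
```

The step from 75 to 38 is where the rounding matters: with padding that rounds down instead, the fourth block would be 37x37.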

$feat_layer: the feat_layers used for anchor extraction.

['block4', 'block7', 'block8', 'block9', 'block10', 'block11']

$Function ssd_multibox_layer()

This function builds a multibox layer and returns the classification and localization predictions. The returned layers have shapes such as location -> [1, 38, 38, num_anchors, 4 (the box coordinates)] and prediction -> [1, 38, 38, num_anchors, 21 (the number of object classes)].

Construct a multibox layer, return a class and localization predictions.

Input parameters:

*inputs: the input feat_layer;

*num_class: the number of object classes; Pascal VOC 2007 has 21.

*sizes: anchor_size[i] for this layer.

*ratios: anchor_ratio[i] for this layer.

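Intermediate values:

*num_anchors = len(sizes) + len(ratios), i.e. the number of anchor sizes plus the number of aspect-ratio factors.

Number of location values per pixel (num_loc_pred) = num_anchors x 4 (the four box coordinates);

Number of object-class values per pixel (num_cls_pred) = num_anchors x num_classes (the total number of object classes).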
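The per-pixel counts above are plain arithmetic. The sketch below assumes the anchor count is `len(sizes) + len(ratios)`, which is how commonly used TensorFlow SSD code computes it; the function name and the example sizes and ratios are illustrative assumptions.

```python
def multibox_channel_counts(sizes, ratios, num_classes):
    """Return (num_anchors, num_loc_pred, num_cls_pred) for one feature layer."""
    # Assumption: one anchor per entry in sizes plus one per aspect ratio.
    num_anchors = len(sizes) + len(ratios)
    num_loc_pred = num_anchors * 4            # four box coordinates per anchor
    num_cls_pred = num_anchors * num_classes  # one score per class per anchor
    return num_anchors, num_loc_pred, num_cls_pred
```

With two sizes, two ratios and 21 classes this gives 4 anchors, 16 location channels and 84 class channels per pixel.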

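Return values:

*logits: class predictions [,]

*locations: location predictions [,]

Finally, the self.param.feat_shapes parameter is updated so that each entry goes from two elements to three.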
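To make the output shapes concrete, here is a small NumPy sketch of how a convolution output with `num_anchors * k` channels can be split into a separate anchor axis, and how a two-element feature shape can be extended with the anchor count. Both helper names are hypothetical, and the sketch shows only the shape bookkeeping, not the layer itself.

```python
import numpy as np


def split_anchor_axis(conv_out, num_anchors):
    """Reshape (N, H, W, num_anchors * k) into (N, H, W, num_anchors, k)."""
    n, h, w, c = conv_out.shape
    if c % num_anchors != 0:
        raise ValueError("channel count must be a multiple of num_anchors")
    return conv_out.reshape(n, h, w, num_anchors, c // num_anchors)


def extend_feat_shape(feat_shape, num_anchors):
    """Turn a two-element feature shape into a three-element one."""
    return tuple(feat_shape) + (num_anchors,)
```

For a 38x38 layer with 4 anchors, a (1, 38, 38, 16) location tensor becomes (1, 38, 38, 4, 4) and a (1, 38, 38, 84) class tensor becomes (1, 38, 38, 4, 21).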

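3. ssd_anchor_one_layer()

Purpose:

For each layer in feat_layers, computes the SSD default anchor boxes.

Input parameters:

*img_shape: the size of the original image (e.g. [300, 300]);

*feat_shape: the size of the feat_layer;

*sizes: the predefined sequence of anchor box widths;

*ratios: the sequence of anchor box aspect-ratio factors;

*step: the sliding stride.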

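Outputs:

*y: the y coordinate of each anchor centre in the full image (value: y coordinate on the feature map * step / height of the original image);

*x: the x coordinate of each anchor centre in the full image (value: x coordinate on the feature map * step / width of the original image);

*h: the height of each anchor; the total number of anchors is len(sizes) + len(ratios);

*w: the width of each anchor; the total number of anchors is len(sizes) + len(ratios).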
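The inputs and outputs listed above can be tied together in a short NumPy sketch. It is modelled on how common TensorFlow SSD code computes default boxes and is not a copy of any particular source: the 0.5 cell-centre offset, the extra box of size sqrt(sizes[0] * sizes[1]), the anchor count `len(sizes) + len(ratios)`, and the name `anchor_one_layer_sketch` are all assumptions.

```python
import math

import numpy as np


def anchor_one_layer_sketch(img_shape, feat_shape, sizes, ratios, step, offset=0.5):
    """Return (y, x, h, w) of the default anchors for one feature layer."""
    # Integer cell indices of the feature map.
    y, x = np.mgrid[0:feat_shape[0], 0:feat_shape[1]]
    # Cell centres, expressed as fractions of the image height / width.
    y = (y.astype(np.float32) + offset) * step / img_shape[0]
    x = (x.astype(np.float32) + offset) * step / img_shape[1]
    # Trailing axis so that y and x broadcast against h and w.
    y = np.expand_dims(y, axis=-1)
    x = np.expand_dims(x, axis=-1)

    num_anchors = len(sizes) + len(ratios)  # assumption, see lead-in
    h = np.zeros((num_anchors,), dtype=np.float32)
    w = np.zeros((num_anchors,), dtype=np.float32)
    # First anchor: square box of size sizes[0].
    h[0] = sizes[0] / img_shape[0]
    w[0] = sizes[0] / img_shape[1]
    di = 1
    if len(sizes) > 1:
        # Second anchor: square box at the geometric mean of the two sizes.
        h[1] = math.sqrt(sizes[0] * sizes[1]) / img_shape[0]
        w[1] = math.sqrt(sizes[0] * sizes[1]) / img_shape[1]
        di += 1
    # Remaining anchors: sizes[0] stretched by each aspect ratio.
    for i, r in enumerate(ratios):
        h[i + di] = sizes[0] / img_shape[0] / math.sqrt(r)
        w[i + di] = sizes[0] / img_shape[1] * math.sqrt(r)
    return y, x, h, w
```

Calling it with `img_shape=(300, 300)`, `feat_shape=(38, 38)`, `sizes=(21.0, 45.0)`, `ratios=[2, 0.5]` and `step=8` yields centre grids of shape (38, 38, 1) and four anchor heights and widths, all normalised to the image size.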

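4. Loss definition for SSD training:

Positive-sample loss + negative-sample loss + anchor loss.

Where:

1) Positive/negative sample loss

Uses tf.nn.sparse_softmax_cross_entropy_with_logits.

2) Anchor loss

Takes the difference between the ground-truth and predicted localisations, then averages it.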
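tf.nn.sparse_softmax_cross_entropy_with_logits takes unnormalised logits and integer class labels and returns one cross-entropy value per example. The NumPy sketch below computes the same quantity for a 2-D batch, purely to show what the op does; it is not the TensorFlow implementation, and the function name is hypothetical.

```python
import numpy as np


def sparse_softmax_cross_entropy(logits, labels):
    """Per-example cross-entropy for (N, C) logits and (N,) integer labels."""
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels)
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Pick the log-probability of the true class for every example.
    return -log_probs[np.arange(len(labels)), labels]
```

With uniform logits over 21 classes every example has a loss of ln(21), the value expected from an untrained classifier.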
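A minimal sketch of a localization loss of this kind. The text only says that the difference is averaged; the smooth L1 penalty and the averaging over positive anchors used here are assumptions borrowed from common SSD practice, and the function names are hypothetical.

```python
import numpy as np


def smooth_l1(x):
    """Smooth L1: 0.5 * x^2 where |x| < 1, |x| - 0.5 elsewhere."""
    abs_x = np.abs(x)
    return np.where(abs_x < 1.0, 0.5 * x ** 2, abs_x - 0.5)


def localization_loss(pred, gt, positive_mask):
    """Average smooth L1 loss over the anchors marked as positive.

    pred, gt: (num_anchors, 4) arrays of predicted / ground-truth offsets.
    positive_mask: (num_anchors,) array of 0/1 flags.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    mask = np.asarray(positive_mask, dtype=np.float64)
    per_anchor = smooth_l1(pred - gt).sum(axis=-1)
    num_pos = max(mask.sum(), 1.0)  # avoid dividing by zero
    return float((per_anchor * mask).sum() / num_pos)
```

Masking with `positive_mask` restricts the loss to anchors that were matched to a ground-truth box, since unmatched anchors have no target coordinates.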
