conv2d function parameters explained, and understanding padding

Source function

CNNs play a pivotal role in deep learning, where they are mainly used for feature extraction. In TensorFlow, the relevant function is tf.nn.conv2d.

tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=True, data_format="NHWC", dilations=[1, 1, 1, 1], name=None)

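input: the input image tensor to convolve. Its shape must be [batch, in_height, in_width, in_channels], that is, [number of images in one training batch, height, width, number of image channels]. Its data type is a floating-point type such as float32 or float64;

filter: the convolution kernel of the CNN. Its shape must be [filter_height, filter_width, in_channels, out_channels], that is, [kernel height, kernel width, number of input channels, number of kernels]. Its type must match that of input, and its third dimension, in_channels, must equal the fourth dimension of input;

strides: [1, stride_h, stride_w, 1], the step by which the kernel moves along the height and width of input on each move;

padding: the padding mode, which must be either "SAME" or "VALID";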

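The output has shape [batch, out_height, out_width, out_channels]. batch is determined by input and out_channels by filter, while out_height and out_width depend on input, filter, strides, and padding together, as the formulas below show.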

SAME mode (pad): the input is zero-padded so that no rows or columns are lost.

out_height = ceil(float(in_height) / float(stride_h))

out_width = ceil(float(in_width) / float(stride_w))

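VALID mode (drop): no padding is added, and trailing rows and columns that cannot fill a complete window are dropped.

out_height = ceil(float(in_height - filter_height + 1) / float(stride_h))

out_width = ceil(float(in_width - filter_width + 1) / float(stride_w))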
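The two sets of formulas can be checked with a few lines of plain Python. This is only a sketch: conv2d_output_shape is a hypothetical helper written for illustration, not a TensorFlow function, and it assumes the NHWC layout and no dilation.

```python
import math


def conv2d_output_shape(input_shape, filter_shape, strides, padding):
    """Hypothetical helper: output shape of a conv2d call, NHWC layout, no dilation."""
    batch, in_height, in_width, in_channels = input_shape
    filter_height, filter_width, filter_in_channels, out_channels = filter_shape
    _, stride_h, stride_w, _ = strides

    # The third dimension of filter must equal the fourth dimension of input.
    if filter_in_channels != in_channels:
        raise ValueError("filter in_channels must match input in_channels")

    if padding == "SAME":
        out_height = math.ceil(in_height / stride_h)
        out_width = math.ceil(in_width / stride_w)
    elif padding == "VALID":
        out_height = math.ceil((in_height - filter_height + 1) / stride_h)
        out_width = math.ceil((in_width - filter_width + 1) / stride_w)
    else:
        raise ValueError("padding must be 'SAME' or 'VALID'")

    return [batch, out_height, out_width, out_channels]


# A 16x64 image with 3 channels, 32 kernels of size 3x5, stride 2 in both directions.
print(conv2d_output_shape([1, 16, 64, 3], [3, 5, 3, 32], [1, 2, 2, 1], "VALID"))  # [1, 7, 30, 32]
print(conv2d_output_shape([1, 16, 64, 3], [3, 5, 3, 32], [1, 2, 2, 1], "SAME"))   # [1, 8, 32, 32]
```

Note that in SAME mode the kernel size does not affect the output size at all; only the stride does.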

In SAME mode, the padding is distributed as follows:

Total rows of padding: pad_along_height = max((out_height - 1) * strides[1] + filter_height - in_height, 0)

Total columns of padding: pad_along_width = max((out_width - 1) * strides[2] + filter_width - in_width, 0)

pad_top = pad_along_height // 2

pad_bottom = pad_along_height - pad_top

pad_left = pad_along_width // 2

pad_right = pad_along_width - pad_left
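The padding rules above are the same along height and width, so they can be sketched as a single function applied to each dimension in turn. same_padding is a hypothetical name used only for this illustration, and the example sizes (16x64 input, 3x5 kernel, stride 2) are assumptions chosen to give an odd amount of padding.

```python
import math


def same_padding(in_size, filter_size, stride):
    """Hypothetical helper: (pad_before, pad_after) along one dimension in SAME mode."""
    out_size = math.ceil(in_size / stride)
    pad_total = max((out_size - 1) * stride + filter_size - in_size, 0)
    pad_before = pad_total // 2           # pad_top or pad_left
    pad_after = pad_total - pad_before    # pad_bottom or pad_right
    return pad_before, pad_after


pad_top, pad_bottom = same_padding(in_size=16, filter_size=3, stride=2)
pad_left, pad_right = same_padding(in_size=64, filter_size=5, stride=2)
print(pad_top, pad_bottom, pad_left, pad_right)  # 0 1 1 2
```

When the total padding is odd, the integer division puts the extra row at the bottom and the extra column on the right.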

Test example

import tensorflow as tf  # TensorFlow 1.x graph-mode API

# One 16x64 image with 3 channels, in NHWC layout
input = tf.Variable(tf.random_normal([1, 16, 64, 3]))
# 32 kernels of size 3x5 over 3 input channels
filter = tf.Variable(tf.random_normal([3, 5, 3, 32]))

op = tf.nn.conv2d(input, filter, strides=[1, 2, 2, 1], padding='VALID')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    res = sess.run(op)
    print(res.shape)  # (1, 7, 30, 32)
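To see where the VALID output size comes from without running TensorFlow, the same shapes can be pushed through a naive pure-Python version of the operation. This is a rough sketch on nested lists, not how TensorFlow implements it. conv2d_valid, random_tensor, and shape_of are hypothetical helpers, and the sketch assumes NHWC layout, no dilation, and that the kernel is slid over the input without being flipped.

```python
import random


def conv2d_valid(inp, filt, stride_h, stride_w):
    """Hypothetical naive VALID convolution on nested lists (NHWC input, HWIO filter)."""
    batch, in_h, in_w = len(inp), len(inp[0]), len(inp[0][0])
    f_h, f_w = len(filt), len(filt[0])
    in_c, out_c = len(filt[0][0]), len(filt[0][0][0])
    out = []
    for b in range(batch):
        rows = []
        # Only window positions that fit entirely inside the input are visited.
        for top in range(0, in_h - f_h + 1, stride_h):
            cols = []
            for left in range(0, in_w - f_w + 1, stride_w):
                pixel = [0.0] * out_c
                for i in range(f_h):
                    for j in range(f_w):
                        for c in range(in_c):
                            x = inp[b][top + i][left + j][c]
                            weights = filt[i][j][c]
                            for k in range(out_c):
                                pixel[k] += x * weights[k]
                cols.append(pixel)
            rows.append(cols)
        out.append(rows)
    return out


def random_tensor(shape):
    """Nested list of the given shape, filled with standard normal samples."""
    if len(shape) == 1:
        return [random.gauss(0.0, 1.0) for _ in range(shape[0])]
    return [random_tensor(shape[1:]) for _ in range(shape[0])]


def shape_of(tensor):
    """Shape of a rectangular nested list, as a tuple."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        tensor = tensor[0]
    return tuple(dims)


inp = random_tensor([1, 16, 64, 3])
filt = random_tensor([3, 5, 3, 32])
res = conv2d_valid(inp, filt, stride_h=2, stride_w=2)
print(shape_of(res))  # (1, 7, 30, 32)
```

The window's top edge takes the values 0, 2, ..., 12 (7 positions) and its left edge the values 0, 2, ..., 58 (30 positions), which matches ceil(14 / 2) and ceil(60 / 2) from the VALID formulas.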
