TensorFlow Data Import Methods


- Build tensors that read a random mini-batch from a binary (TFRecord) file.

- Feed the mini-batch tensors into the network as its input nodes.

writer = tf.python_io.TFRecordWriter('/tmp/data.tfrecord')
for i in range(0, 10):
    # build one sample Example (expanded below)
    # ...
    serialized = example.SerializeToString()  # serialize to a byte string
    writer.write(serialized)                  # append one record to the file
writer.close()

# Build one sample Example
a_data = 0.618 + i                                # float
b_data = [2016 + i, 2017 + i]                     # int64
c_data = numpy.array([[0, 1, 2], [3, 4, 5]]) + i  # bytes
c_data = c_data.astype(numpy.uint8)
c_raw = c_data.tostring()                         # serialize the array to a byte string

example = tf.train.Example(
    features=tf.train.Features(
        feature={
            'a': tf.train.Feature(float_list=tf.train.FloatList(value=[a_data])),
            'b': tf.train.Feature(int64_list=tf.train.Int64List(value=b_data)),
            'c': tf.train.Feature(bytes_list=tf.train.BytesList(value=[c_raw])),
        }))
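To sanity-check what the writer produced, the records can be read back directly in Python without building a graph. A minimal sketch using the TF 1.x helpers tf.python_io.tf_record_iterator and the Example protobuf's FromString; the file path and feature names are taken from the example above:

import tensorflow as tf

# Iterate over the raw records in the TFRecord file and decode the first one.
for serialized in tf.python_io.tf_record_iterator('/tmp/data.tfrecord'):
    ex = tf.train.Example.FromString(serialized)  # parse the Example protobuf
    print(ex.features.feature['a'].float_list.value)
    print(ex.features.feature['b'].int64_list.value)
    break  # inspect only the first record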

def read_single_sample(filename):
    # read members a, b, c of each sample Example (expanded below)
    # ...
    return a, b, c

# Read members a, b, c of each sample Example
filename_queue = tf.train.string_input_producer([filename], num_epochs=None)  # no limit on the number of reads

reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)

# get features from the serialized Example
features = tf.parse_single_example(
    serialized_example,
    features={
        'a': tf.FixedLenFeature([], tf.float32),
        'b': tf.FixedLenFeature([2], tf.int64),
        'c': tf.FixedLenFeature([], tf.string),
    })
a = features['a']
b = features['b']
c_raw = features['c']
c = tf.decode_raw(c_raw, tf.uint8)  # decode the byte string back to uint8
c = tf.reshape(c, [2, 3])

a_batch, b_batch, c_batch = tf.train.shuffle_batch(
    [a, b, c], batch_size=2, capacity=200, min_after_dequeue=100, num_threads=2)
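shuffle_batch keeps an internal queue of up to capacity elements and only starts dequeuing once min_after_dequeue elements are buffered; a larger floor gives better shuffling at the cost of memory and startup delay. When deterministic order is preferred, tf.train.batch from the same TF 1.x queue API can be swapped in; a minimal sketch reusing the a, b, c tensors defined above:

# Deterministic (non-shuffled) batching over the same a, b, c tensors.
a_batch, b_batch, c_batch = tf.train.batch(
    [a, b, c], batch_size=2, capacity=200, num_threads=1)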
# sess
sess = tf.Session()
init = tf.initialize_all_variables()  # deprecated alias of tf.global_variables_initializer() in later 1.x releases
sess.run(init)

tf.train.start_queue_runners(sess=sess)

a_val, b_val, c_val = sess.run([a_batch, b_batch, c_batch])  # each run dequeues a fresh mini-batch
a_val, b_val, c_val = sess.run([a_batch, b_batch, c_batch])
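One detail the snippet glosses over: start_queue_runners launches background reader threads that are never stopped. The TF 1.x queue API pairs it with tf.train.Coordinator for a clean shutdown; a minimal sketch reusing the session and batch tensors above:

coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

a_val, b_val, c_val = sess.run([a_batch, b_batch, c_batch])

# Ask the reader threads to stop, then wait for them to exit.
coord.request_stop()
coord.join(threads)
sess.close()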

import tensorflow as tf
import numpy

def write_binary():
    writer = tf.python_io.TFRecordWriter('/tmp/data.tfrecord')
    for i in range(0, 2):
        a = 0.618 + i
        b = [2016 + i, 2017 + i]
        c = numpy.array([[0, 1, 2], [3, 4, 5]]) + i
        c = c.astype(numpy.uint8)
        c_raw = c.tostring()
        example = tf.train.Example(
            features=tf.train.Features(
                feature={
                    'a': tf.train.Feature(float_list=tf.train.FloatList(value=[a])),
                    'b': tf.train.Feature(int64_list=tf.train.Int64List(value=b)),
                    'c': tf.train.Feature(bytes_list=tf.train.BytesList(value=[c_raw])),
                }))
        serialized = example.SerializeToString()
        writer.write(serialized)
    writer.close()

def read_single_sample(filename):
    # output the file name string to a queue
    filename_queue = tf.train.string_input_producer([filename], num_epochs=None)
    # create a reader from the file queue
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)
    # get features from the serialized Example
    features = tf.parse_single_example(
        serialized_example,
        features={
            'a': tf.FixedLenFeature([], tf.float32),
            'b': tf.FixedLenFeature([2], tf.int64),
            'c': tf.FixedLenFeature([], tf.string),
        })
    a = features['a']
    b = features['b']
    c_raw = features['c']
    c = tf.decode_raw(c_raw, tf.uint8)
    c = tf.reshape(c, [2, 3])
    return a, b, c

#-----main function-----
if 1:
    write_binary()
else:
    # create tensors
    a, b, c = read_single_sample('/tmp/data.tfrecord')
    a_batch, b_batch, c_batch = tf.train.shuffle_batch(
        [a, b, c], batch_size=3, capacity=200, min_after_dequeue=100, num_threads=2)

    queues = tf.get_collection(tf.GraphKeys.QUEUE_RUNNERS)

    # sess
    sess = tf.Session()
    init = tf.initialize_all_variables()
    sess.run(init)

    tf.train.start_queue_runners(sess=sess)

    a_val, b_val, c_val = sess.run([a_batch, b_batch, c_batch])
    print(a_val, b_val, c_val)
    a_val, b_val, c_val = sess.run([a_batch, b_batch, c_batch])
    print(a_val, b_val, c_val)
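Usage note: run the script once as-is so the if 1: branch writes /tmp/data.tfrecord, then change the condition to if 0: and run again; the else branch builds the reading graph and prints two shuffled mini-batches.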
