Learning Notes: Synthetic Features and Outliers (Postscript)

2021-08-28 05:17:11

# Note: this script uses the TensorFlow 1.x API (tf.Session, make_one_shot_iterator).
import numpy as np
import tensorflow as tf
from tensorflow.data import Dataset
import pandas as pd

def my_input_fn(features, targets, batch_size=1, shuffle=True, num_epochs=None):
    """Trains a linear regression model of one feature.

    Args:
      features: pandas DataFrame of features
      targets: pandas DataFrame of targets
      batch_size: Size of batches to be passed to the model
      shuffle: True or False. Whether to shuffle the data.
      num_epochs: Number of epochs for which data should be repeated. None = repeat indefinitely
    Returns:
      Tuple of (features, labels) for next data batch
    """
    # Convert pandas data into a dict of np arrays.
    features = {key: np.array(value) for key, value in dict(features).items()}

    # Construct a dataset, and configure batching/repeating.
    ds = Dataset.from_tensor_slices((features, targets))  # warning: 2GB limit
    ds = ds.batch(batch_size).repeat(num_epochs)

    # Shuffle the data, if specified.
    if shuffle:
        ds = ds.shuffle(buffer_size=10000)

    # Return the next batch of data.
    features, labels = ds.make_one_shot_iterator().get_next()
    return features, labels

def add_layer(inputs, input_size, output_size, activation_function=None):
    """Adds a fully connected layer: activation_function(inputs @ weights + biases)."""
    weights = tf.Variable(tf.random_normal([input_size, output_size]))
    biases = tf.Variable(tf.zeros([output_size]))
    wx_b = tf.matmul(inputs, weights) + biases
    # With no activation function the layer is purely linear.
    if activation_function is None:
        output = wx_b
    else:
        output = activation_function(wx_b)
    return output

df = pd.read_csv('california_housing_train.csv')
# Express the target in thousands of dollars.
df['median_house_value'] /= 1000
# Randomize the row order.
df = df.reindex(np.random.permutation(df.index))
# Synthetic feature: ratio of total rooms to population.
df['rooms_per_person'] = df['total_rooms'] / df['population']
x1 = df[['rooms_per_person']].astype('float32')
y1 = df['median_house_value'].astype('float32')

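xs, ys = my_input_fn(x1, y1, batch_size=2000)
# xs is a dict of tensors; take the single feature and reshape it to (batch_size, 1).
xs = tf.expand_dims(xs['rooms_per_person'], -1)
# Give the labels the same (batch_size, 1) shape as pred; otherwise
# (batch_size, 1) - (batch_size,) broadcasts to (batch_size, batch_size).
ys = tf.expand_dims(ys, -1)

# One hidden layer of 10 tanh units, then a linear output layer.
l1 = add_layer(xs, 1, 10, activation_function=tf.nn.tanh)
pred = add_layer(l1, 10, 1)

# Root mean squared error.
loss = tf.sqrt(tf.reduce_mean(tf.square(pred - ys)))
train_step = tf.train.AdamOptimizer(0.1).minimize(loss)

sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)

for i in range(1000):
    sess.run(train_step)
    # Print the loss every 50 steps.
    if i % 50 == 0:
        print(sess.run(loss))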

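After some effort, I finally managed to feed the output of the original article's Dataset pipeline into our own framework. Here xs and ys are returned as tensors, and xs is a dict that maps each feature name to a tensor. We extract the value stored in xs, reshape it to (batch_size, 1), and pass it into our matrix operations. The my_input_fn function is copied verbatim from the original article.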
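The reshaping described above can be checked without TensorFlow. The following is a minimal NumPy sketch, not the post's own code: the four sample values, the batch size of 4, and the random weights are made up for illustration, and NumPy arrays stand in for the tensors that my_input_fn returns. It also shows why the labels need the same trailing axis as the predictions when the loss is computed.

```python
import numpy as np

# Stand-in for one batch from the input function (illustrative values only):
# a dict of feature arrays plus a 1-D label vector.
xs = {"rooms_per_person": np.array([1.5, 2.0, 2.5, 3.0], dtype=np.float32)}
ys = np.array([100.0, 150.0, 200.0, 250.0], dtype=np.float32)

# Take the single feature out of the dict and add a trailing axis: (4,) -> (4, 1).
x = np.expand_dims(xs["rooms_per_person"], -1)

# One hidden layer of 10 tanh units and a 1-unit linear output, mirroring add_layer.
rng = np.random.default_rng(0)
w1 = rng.normal(size=(1, 10)).astype(np.float32)
b1 = np.zeros(10, dtype=np.float32)
w2 = rng.normal(size=(10, 1)).astype(np.float32)
b2 = np.zeros(1, dtype=np.float32)

hidden = np.tanh(x @ w1 + b1)  # shape (4, 10)
pred = hidden @ w2 + b2        # shape (4, 1)

# Subtracting 1-D labels from (4, 1) predictions broadcasts to (4, 4),
# so the labels get a trailing axis as well before computing the RMSE.
mismatched = pred - ys         # shape (4, 4): every prediction minus every label
y = np.expand_dims(ys, -1)
matched = pred - y             # shape (4, 1): element-wise residuals
rmse = np.sqrt(np.mean(np.square(matched)))

print(x.shape, hidden.shape, pred.shape, mismatched.shape, matched.shape)
```

The same shape rules apply to TensorFlow tensors, so printing the static shapes of xs, ys, and pred before building the loss is a cheap way to catch a silent broadcast.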
