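Getting Started with Weka in Java

This article shows how to use Weka from Java to build a feature vector, train a classifier, test the classifier, and use the classifier.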

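Step 1: Express the problem with features (attributes)

This step is equivalent to writing an ARFF file.

We first collect the features in a weka.core.FastVector.

Each feature is represented by an instance of the weka.core.Attribute class.

In this example we have two numeric features, one nominal feature (blue, gray, black), and one nominal class (positive, negative).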

import weka.core.Attribute;
import weka.core.FastVector;

// Declare two numeric attributes
Attribute attribute1 = new Attribute("firstnumeric");
Attribute attribute2 = new Attribute("secondnumeric");
// Declare a nominal attribute along with its values
FastVector fvnominalval = new FastVector(3);
fvnominalval.addElement("blue");
fvnominalval.addElement("gray");
fvnominalval.addElement("black");
Attribute attribute3 = new Attribute("anominal", fvnominalval);
// Declare the class attribute along with its values
FastVector fvclassval = new FastVector(2);
fvclassval.addElement("positive");
fvclassval.addElement("negative");
Attribute classattribute = new Attribute("theclass", fvclassval);
// Declare the feature vector (the class attribute goes last)
FastVector fvwekaattributes = new FastVector(4);
fvwekaattributes.addElement(attribute1);
fvwekaattributes.addElement(attribute2);
fvwekaattributes.addElement(attribute3);
fvwekaattributes.addElement(classattribute);
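Since this step is described as equivalent to writing an ARFF file, it may help to see roughly what that file's header would look like. The sketch below uses plain Java only, with no Weka calls, and the class ArffHeaderSketch and its methods are hypothetical names made up for illustration. It assumes the relation is named "rel", as in step 2.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Weka): renders an ARFF header that
// mirrors the four attributes declared in step 1.
final class ArffHeaderSketch {

    // Formats a numeric attribute declaration.
    static String numeric(String name) {
        return "@attribute " + name + " numeric";
    }

    // Formats a nominal attribute declaration with its allowed values.
    static String nominal(String name, List<String> values) {
        return "@attribute " + name + " {" + String.join(",", values) + "}";
    }

    // Assembles the header; the relation name "rel" is the one used in step 2.
    static String header() {
        return String.join("\n",
            "@relation rel",
            "",
            numeric("firstnumeric"),
            numeric("secondnumeric"),
            nominal("anominal", Arrays.asList("blue", "gray", "black")),
            nominal("theclass", Arrays.asList("positive", "negative")),
            "",
            "@data");
    }

    public static void main(String[] args) {
        System.out.println(header());
    }
}
```

Each instance added to the data set later corresponds to one comma-separated row under the @data line.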

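Step 2: Train a classifier (this requires a training set of instances and a classifier)

We first create an empty training set (weka.core.Instances).

We name the relation "rel" (the equivalent of the file name).

The attribute model is the vector defined in step 1.

The initial capacity of the training set is 10.

The class attribute is the fourth element of the step 1 vector, so the class index is 3 (indices are zero-based).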

import weka.core.Instances;

// Create an empty training set with an initial capacity of 10
Instances istrainingset = new Instances("rel", fvwekaattributes, 10);
// Set the class index (zero-based: the fourth attribute)
istrainingset.setClassIndex(3);

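Now we fill the training set with one instance. (new Instance(4) is the Weka 3.6 API; from Weka 3.7 on, Instance is an interface and DenseInstance is the concrete class.)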

import weka.core.Instance;

// Create the instance with room for four attribute values
Instance iexample = new Instance(4);
iexample.setValue((Attribute) fvwekaattributes.elementAt(0), 1.0);
iexample.setValue((Attribute) fvwekaattributes.elementAt(1), 0.5);
iexample.setValue((Attribute) fvwekaattributes.elementAt(2), "gray");
iexample.setValue((Attribute) fvwekaattributes.elementAt(3), "positive");
// Add the instance to the training set
istrainingset.add(iexample);

Finally, we choose a classifier (weka.classifiers.Classifier) and build the model. Here we use the naive Bayes classifier (weka.classifiers.bayes.NaiveBayes).

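import weka.classifiers.Classifier;
import weka.classifiers.bayes.NaiveBayes;

// Create a naive Bayes classifier and train it on the training set
Classifier cmodel = (Classifier) new NaiveBayes();
cmodel.buildClassifier(istrainingset);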

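Step 3: Test the classifier

We have created and trained a classifier; now we test it. For that we need an evaluation module (weka.classifiers.Evaluation), which we feed a test set to see how well the classifier performs. The test set (istestingset below) is not built in this article; it is an Instances object constructed the same way as the training set in step 2.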

import weka.classifiers.Evaluation;

// Test the model on the test set
Evaluation etest = new Evaluation(istrainingset);
etest.evaluateModel(cmodel, istestingset);

The evaluation module can output a range of statistics.

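// Print the result in the style of the Weka Explorer
String strsummary = etest.toSummaryString();
System.out.println(strsummary);
// Get the confusion matrix (a two-dimensional array, one row per actual class)
double[][] cmmatrix = etest.confusionMatrix();
// Print the confusion matrix as formatted text
System.out.println(etest.toMatrixString());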
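To make the confusion matrix more concrete, here is a minimal plain-Java sketch of how an accuracy figure can be derived from such a two-dimensional array. It assumes rows stand for actual classes and columns for predicted classes, which is how Weka lays out its matrix output. The class ConfusionMatrixSketch is a hypothetical helper, not a Weka class, and the counts in main are made up for illustration.

```java
// Hypothetical helper (not part of Weka): derives accuracy from a
// confusion matrix given as a two-dimensional array of counts.
final class ConfusionMatrixSketch {

    // The diagonal holds the correctly classified counts, so
    // accuracy = sum of the diagonal / sum of all cells.
    static double accuracy(double[][] matrix) {
        double correct = 0.0;
        double total = 0.0;
        for (int actual = 0; actual < matrix.length; actual++) {
            for (int predicted = 0; predicted < matrix[actual].length; predicted++) {
                total += matrix[actual][predicted];
                if (actual == predicted) {
                    correct += matrix[actual][predicted];
                }
            }
        }
        // An empty matrix has no defined accuracy.
        return total == 0.0 ? Double.NaN : correct / total;
    }

    public static void main(String[] args) {
        // Made-up counts: 8 + 5 correct predictions out of 16 instances.
        double[][] example = { {8, 2}, {1, 5} };
        System.out.println(accuracy(example)); // prints 0.8125
    }
}
```

The same value would be expected to show up as the share of correctly classified instances in the summary string.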

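Step 4: Use the classifier

In real applications, using the classifier is the ultimate goal. Below is the simplest possible example, which assumes an instance named iuse has been built as described in step 2.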

// Specify that the instance belongs to the training set
// so that it inherits the set's attribute description
iuse.setDataset(istrainingset);
// Get the likelihood of each class
// fdistribution[0] is the probability of being "positive"
// fdistribution[1] is the probability of being "negative"
double[] fdistribution = cmodel.distributionForInstance(iuse);
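Turning the distribution into a single prediction usually means picking the class with the highest probability. The sketch below shows that step in plain Java; MostLikelyClassSketch and argMax are hypothetical names, and the probabilities in main are made up rather than produced by a real model. The label order follows the class values declared in step 1.

```java
// Hypothetical helper (not part of Weka): picks the most probable class
// from a probability distribution over the class values.
final class MostLikelyClassSketch {

    // Returns the index of the largest entry; ties go to the lowest index.
    static int argMax(double[] distribution) {
        if (distribution.length == 0) {
            throw new IllegalArgumentException("distribution must not be empty");
        }
        int best = 0;
        for (int i = 1; i < distribution.length; i++) {
            if (distribution[i] > distribution[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Same order as the class values declared in step 1.
        String[] labels = { "positive", "negative" };
        // Made-up probabilities for illustration only.
        double[] distribution = { 0.3, 0.7 };
        System.out.println(labels[argMax(distribution)]); // prints negative
    }
}
```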
