A simple use of combine in Hadoop (repost)

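The combine function merges the (key, value) pairs produced by one map function into new pairs, which are then passed as input to the reduce function. It has the same format as the reduce function.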

Example: add up the numbers stored in three files.

file1: 1 2 3

file2: 4 5 6

file3: 7 8 9

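public class mymapre06 {

    // ... (the map class is missing from the source)

    public static class reduce extends MapReduceBase implements Reducer {
            // ... (the reduce method signature and the loop that accumulates num are missing from the source)
            String res = num.toString();
            v.set(res);
            output.collect(key, v); // collect the reduce output
        }
    }

    public static class combiner extends MapReduceBase implements Reducer {
            // ... (the reduce method signature and the loop that accumulates num are missing from the source)
            v.set(num.toString());
            output.collect(key, v); // collect the combiner output
        }
    }

    public static void main(String[] args) throws Exception {
        // ... (the job configuration is missing from the source)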

}

After the combiner function, file1 yields 6, file2 yields 15, and file3 yields 24.

After the reduce function, the output has key 1 and value 45 (6 + 15 + 24).
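The arithmetic of the example can be checked without a cluster. The following is a minimal plain-Java sketch, not the original job: it uses no Hadoop classes, and the class name `CombineSketch` and the helper methods `map`, `combine`, and `reduce` are hypothetical names introduced only for illustration. It assumes each file is handled by its own map task and that every value is emitted under the same key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java simulation of map -> combine -> reduce; no Hadoop classes are used.
class CombineSketch {

    // Map step (assumed): emit every number of one file under the shared key 1.
    static List<Integer> map(String fileContents) {
        List<Integer> values = new ArrayList<>();
        for (String token : fileContents.trim().split("\\s+")) {
            values.add(Integer.parseInt(token));
        }
        return values;
    }

    // Combine step: sum the values produced by a single map task.
    static int combine(List<Integer> mapOutput) {
        int sum = 0;
        for (int value : mapOutput) {
            sum += value;
        }
        return sum;
    }

    // Reduce step: sum the partial sums received from all combiners.
    static int reduce(List<Integer> partialSums) {
        int total = 0;
        for (int partial : partialSums) {
            total += partial;
        }
        return total;
    }

    public static void main(String[] args) {
        // Contents of file1, file2, and file3 from the example.
        List<String> files = Arrays.asList("1 2 3", "4 5 6", "7 8 9");
        List<Integer> partialSums = new ArrayList<>();
        for (String file : files) {
            int partial = combine(map(file));
            System.out.println("combiner output: " + partial);
            partialSums.add(partial);
        }
        System.out.println("key 1, value " + reduce(partialSums));
    }
}
```

Because addition is associative and commutative, summing the per-file partial results gives the same total as summing all nine numbers directly, which is why the same summing logic can serve as both combiner and reducer in this kind of job.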
