Spark Common Operator Exercises

2021-08-15 01:12:49

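package cn.allengao.exercise

import org.apache.spark.

/**
 * class_name:
 * package:
 * describe: Spark RDD operator exercises
 * creat_user: Allen Gao
 * creat_date: 2018/1/25
 * creat_time: 10:04
 **/
object sparkrddtest

/*
 * Cartesian product: for a set A and a set B, the Cartesian product is the set of
 * all ordered pairs (a, b) with a taken from A and b taken from B.
 */
val res19 = rdd11.cartesian(rdd12)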

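// Results only appear once an action operator runs. Converting the collected
// result to a mutable buffer makes the elements visible in the output;
// without toBuffer, println shows only an object reference.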

// Result: ArrayBuffer(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)
// println(res1.collect().toBuffer)

// Result: ArrayBuffer(10, 12, 14, 16, 18, 20)
// println(res2.collect().toBuffer) // print the elements in array form

// Result: ArrayBuffer(a, b, c, d, e, f, h, i, j)
// println(res3.collect().toBuffer)

// Result: ArrayBuffer(a, b, c, a, b, b, e, f, g, a, f, g, h, i, j, a, a, b)
// println(res4.collect().toBuffer)

// Result: ArrayBuffer(5, 6, 4, 3, 1, 2, 3, 4)
// println(res5.collect().toBuffer)

// Result: ArrayBuffer(4, 3)
// println(res6.collect().toBuffer)

// Result: ArrayBuffer(4, 6, 2, 1, 3, 5)
// println(res7.collect().toBuffer)

// Result: ArrayBuffer((tom,(1,1)), (jerry,(3,2)))
// println(res8.collect().toBuffer)

// Result: ArrayBuffer((tom,(1,Some(1))), (jerry,(3,Some(2))), (kitty,(2,None)))
// println(res9.collect().toBuffer)

// Result: ArrayBuffer((tom,(Some(1),1)), (jerry,(Some(3),2)), (shuke,(None,2)))
// println(res10.collect().toBuffer)

// Result: ArrayBuffer((tom,1), (jerry,3), (kitty,2), (jerry,2), (tom,1), (shuke,2))
// println(res11.collect().toBuffer)

// Result: ArrayBuffer((tom,CompactBuffer(1, 1)), (jerry,CompactBuffer(3, 2)), (shuke,CompactBuffer(2)), (kitty,CompactBuffer(2)))
// println(res12.collect().toBuffer)

// Result: ArrayBuffer((tom,2), (jerry,5), (shuke,2), (kitty,2))
// println(res13.collect().toBuffer)

// Result: ArrayBuffer((tom,2), (jerry,5), (shuke,2), (kitty,2))
// println(res14.collect().toBuffer)

// Result: ArrayBuffer((tom,(CompactBuffer(1, 2),CompactBuffer(1))), (jerry,(CompactBuffer(3),CompactBuffer(2))), (shuke,(CompactBuffer(),CompactBuffer(2))), (kitty,(CompactBuffer(2),CompactBuffer())))
// println(res15.collect().toBuffer)

// Result: 15
// println(res16)

// Result: ArrayBuffer((tom,4), (jerry,5), (shuke,3), (kitty,7))
// println(res17.collect().toBuffer)

// Result: ArrayBuffer((kitty,7), (jerry,5), (tom,4), (shuke,3))
// println(res18.collect().toBuffer)

/* Result: ArrayBuffer(((tom,1),(jerry,2)), ((tom,1),(tom,3)), ((tom,1),(shuke,2)),
((tom,1),(kitty,5)), ((jerry,3),(jerry,2)), ((jerry,3),(tom,3)), ((jerry,3),(shuke,2)),
((jerry,3),(kitty,5)), ((kitty,2),(jerry,2)), ((kitty,2),(tom,3)), ((kitty,2),(shuke,2)),
((kitty,2),(kitty,5)), ((shuke,1),(jerry,2)), ((shuke,1),(tom,3)), ((shuke,1),(shuke,2)),
((shuke,1),(kitty,5)))
*/
println(res19.collect().toBuffer)

}}
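
Two points made in the listing's comments can be checked without a Spark cluster: a collected Array prints as a reference unless it is converted with toBuffer, and a Cartesian product pairs every element of one dataset with every element of the other. The following is only a minimal sketch that uses plain Scala collections as stand-ins for RDDs. The object name `CartesianSketch` and the helper `cartesian` are hypothetical, and the two input sequences are inferred from the pairs in the printed res19 result, since the original definitions of rdd11 and rdd12 are not shown.

```scala
object CartesianSketch {
  // Hypothetical helper: pairs every element of `left` with every element of `right`.
  // This mirrors the pairing that RDD.cartesian performs; on a real RDD the order of
  // the collected pairs also depends on how the data is partitioned.
  def cartesian[A, B](left: Seq[A], right: Seq[B]): Seq[(A, B)] =
    for (a <- left; b <- right) yield (a, b)

  // Assumed inputs, inferred from the printed res19 result above.
  val left: Seq[(String, Int)] = Seq(("tom", 1), ("jerry", 3), ("kitty", 2), ("shuke", 1))
  val right: Seq[(String, Int)] = Seq(("jerry", 2), ("tom", 3), ("shuke", 2), ("kitty", 5))

  def main(args: Array[String]): Unit = {
    // Convert to an Array to imitate what collect() hands back.
    val pairs = cartesian(left, right).toArray
    println(pairs)          // an Array prints as a JVM reference of the form [Lscala.Tuple2;@<hash>
    println(pairs.toBuffer) // an ArrayBuffer prints its elements
  }
}
```

With four elements on each side the sketch produces 16 pairs, which matches the number of pairs in the res19 output.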
