Project 3: Big Data Offline Analysis Platform

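For the exact import procedure, use the link below.

Note: after making changes, you need to redeploy to Tomcat.

Click around the site at random.

These are used to write the two kinds of back-end data tracking.

Then test **hbase-test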

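Once the steps above are done, the flow so far is: data produced by the tracking code -> lands on the nginx server.

What we need to do now is use Flume to read the data on nginx and store it in HDFS.

source: exec

channel: memory

sink: HDFS sink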

    # Name the components on this agent
    a1.sources = r1
    a1.sinks = k1
    a1.channels = c1

    # Describe/configure the source
    a1.sources.r1.type = exec
    a1.sources.r1.command = tail -F /var/log/nginx/access.log

    # Use a channel which buffers events in memory
    a1.channels.c1.type = memory

    # Describe the sink
    a1.sinks.k1.type = hdfs
    a1.sinks.k1.hdfs.path = hdfs://hadoop1:9000/flume/%y%m%d
    # This parameter must be set so the year/month/day escapes in the path above resolve
    a1.sinks.k1.hdfs.useLocalTimeStamp = true
    a1.sinks.k1.hdfs.fileType = DataStream
    # Roll files by size only (10240 bytes); 0 disables time-based and count-based rolling
    a1.sinks.k1.hdfs.rollInterval = 0
    a1.sinks.k1.hdfs.rollSize = 10240
    a1.sinks.k1.hdfs.rollCount = 0

    # Bind the source and sink to the channel
    a1.sources.r1.channels = c1
    a1.sinks.k1.channel = c1

    [hadoop@hadoop04 ~]$ flume-ng agent --conf conf --conf-file file2hdfs.properties --name a1 -Dflume.root.logger=INFO,console

Note on permissions: the nginx directory can only be accessed as the root user.

Click around the site a few times and data will show up in HDFS.
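To know which HDFS directory to look in, it helps to see how the `%y%m%d` escape in `hdfs.path` expands: two-digit year, month, day, taken from the agent's local time when `useLocalTimeStamp` is true. The following is a minimal Python sketch of that expansion, not Flume code; the function name `flume_day_path` is hypothetical and only illustrates the naming scheme.

```python
from datetime import datetime


def flume_day_path(base: str, when: datetime) -> str:
    """Build the per-day directory the way the %y%m%d escape would name it.

    `base` is the part of hdfs.path before the escape sequence.
    """
    # strftime uses the same meaning for %y, %m and %d:
    # two-digit year, zero-padded month, zero-padded day.
    return f"{base.rstrip('/')}/{when.strftime('%y%m%d')}"


# Events written on 7 March 2019 would end up under .../flume/190307
print(flume_day_path("hdfs://hadoop1:9000/flume", datetime(2019, 3, 7)))
```

For today's directory, pass `datetime.now()` instead of a fixed date.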

To parse the browser information, just use the ready-made ** that someone else has already written.
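For a rough idea of what such a parser does, here is a small Python sketch that pulls a browser name and version out of a User-Agent string with regular expressions. It is not the library referred to above, and a real project should prefer a maintained parser; the function name `parse_browser` and the pattern list are assumptions made for illustration, and they cover only a handful of common browsers.

```python
import re

# Order matters: an Edge UA also contains "Chrome", and a Chrome UA also
# contains "Safari", so the more specific tokens are checked first.
_BROWSER_PATTERNS = [
    ("Edge", re.compile(r"Edge?/([\d.]+)")),
    ("Chrome", re.compile(r"Chrome/([\d.]+)")),
    ("Firefox", re.compile(r"Firefox/([\d.]+)")),
    ("Safari", re.compile(r"Version/([\d.]+).*Safari/")),
]


def parse_browser(user_agent: str) -> tuple:
    """Return (browser_name, version), or ("unknown", "") if nothing matches."""
    for name, pattern in _BROWSER_PATTERNS:
        match = pattern.search(user_agent)
        if match:
            return name, match.group(1)
    return "unknown", ""


ua = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36")
print(parse_browser(ua))  # ('Chrome', '70.0.3538.77')
```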
