Flume: reading a file into Kafka in real time

2022-07-12 18:36:09 · Words: 1605 · Views: 5789

Background: records from a log file need to be streamed into Kafka in real time.

1. The ZooKeeper service must be running; check its status. (ZooKeeper installation and startup are covered in a separate post.)

[root@master kafka_2.11-0.11]# /opt/soft/zookeeper-3.4.13/bin/zkServer.sh status

ZooKeeper JMX enabled by default

Using config: /opt/soft/zookeeper-3.4.13/bin/../conf/zoo.cfg

Mode: follower

2. The Kafka service must be running:

/opt/soft/kafka_2.11-0.11/bin/kafka-server-start.sh /opt/soft/kafka_2.11-0.11/config/server.properties
3. Write the Flume configuration file and start the agent.

Start Flume: bin/flume-ng agent --conf conf --conf-file ./conf/job/file_to_hdfs.conf --name a1

a1.sources = r1

a1.sinks = k1

a1.channels = c1

# describe/configure the source

a1.sources.r1.type = exec

# the file to monitor (keep the comment on its own line: the exec source would otherwise pass "# ..." to tail as extra arguments)
a1.sources.r1.command = tail -f /opt/data/mall/16/mall.log

# describe the sink

#a1.sinks.k1.type = logger

a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink

a1.sinks.k1.topic = test

a1.sinks.k1.brokerList = master:9092

a1.sinks.k1.requiredAcks = 1

a1.sinks.k1.batchSize = 20

# use a channel which buffers events in memory

a1.channels.c1.type = memory

a1.channels.c1.capacity = 1000

a1.channels.c1.transactionCapacity = 100

# bind the source and sink to the channel

a1.sources.r1.channels = c1

a1.sinks.k1.channel = c1
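Note that the sink properties above (`topic`, `brokerList`, `requiredAcks`, `batchSize`) are the older KafkaSink names, which Flume 1.7 deprecated in favor of `kafka.`-prefixed keys. On Flume 1.7+, an equivalent sink section would look like this (a sketch assuming the same broker and topic as above):

```properties
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.bootstrap.servers = master:9092
a1.sinks.k1.kafka.topic = test
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.flumeBatchSize = 20
```

Separately, be aware that the exec source with `tail -f` cannot resume from where it left off if the agent restarts; Flume 1.7+ also provides the TAILDIR source, which checkpoints its read position and is generally preferred for tailing log files.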

4. Check whether the data has been synced into the corresponding Kafka topic.
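A quick way to exercise the pipeline is to append a few test records to the monitored log file; with the agent running, they should show up in the topic within seconds. The path below is a local stand-in for illustration — substitute the real log path from the config:

```shell
# Append timestamped test records to the (stand-in) monitored log file.
LOG=/tmp/mall.log
: > "$LOG"                                  # start from an empty file for the demo
for i in 1 2 3; do
  echo "$(date '+%F %T') test-event-$i" >> "$LOG"
done
cat "$LOG"                                  # show what tail -f would pick up

# Then watch the topic with the console consumer (requires the running cluster):
# /opt/soft/kafka_2.11-0.11/bin/kafka-console-consumer.sh \
#     --bootstrap-server master:9092 --topic test --from-beginning
```

If the events appear in the consumer output, the whole source → channel → sink path is working.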
