Sqoop Data Processing Methods

2. View the MySQL data

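--connect ## JDBC connection URL
--connection-manager ## connection manager class to use
--driver ## JDBC driver class to use
--help ## print usage help
-P ## read the password from the console
-m ## number of map tasks used for the copy (for example, -m 1 uses a single map task); if omitted, 4 map tasks import in parallel by default
--password ## password
--username ## account name
--table ## MySQL table name
--fields-terminated-by ## field delimiter used within each line of the output files
--target-dir ## target HDFS path
--where ## query condition (WHERE clause) applied during the import
--verbose ## print detailed execution information to the console
--connection-param-file ## optional; a file that records the database connection parameters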

Command:
sqoop import \
--connect jdbc:mysql://hadoop01:3306/mysql \
--username root \
--password shiny \
--table help_keyword \
-m 1

Verify the result:

By default, the data is saved in the /user/shiny/help_keyword directory on HDFS, with fields separated by commas.
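One way to check the import is to list the target directory and preview its part files. The following is a minimal sketch, assuming the `hdfs` client is on the PATH; `preview_import` is a hypothetical helper name, not part of Sqoop or Hadoop.

```shell
# Sketch: list an import directory and preview the first lines of its part files.
# Assumes the `hdfs` client is available; preview_import is a hypothetical helper.
preview_import() {
    dir=$1
    lines=${2:-5}
    # Show the files that the import job wrote.
    hdfs dfs -ls "$dir"
    # The glob stays quoted so that HDFS, not the local shell, expands it.
    hdfs dfs -cat "$dir/part-*" | head -n "$lines"
}

# Usage, with the default location mentioned above:
# preview_import /user/shiny/help_keyword
```

The same helper can be pointed at any directory passed to --target-dir.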

Command:
sqoop import \
--connect jdbc:mysql://hadoop01:3306/mysql \
--username root \
--password shiny \
--table help_keyword \
--target-dir /mysqltohdfs/help_keyword \
--fields-terminated-by '\t' \
-m 1

Verify the result:

The --target-dir option sets the directory for the imported files to /mysqltohdfs/help_keyword, and fields are separated by "\t" (tab).
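To confirm that the chosen delimiter was applied, one option is to count the fields on each line of the output. This is a rough sketch; `count_fields` is a hypothetical helper that reads lines from standard input, so it can be fed from `hdfs dfs -cat`.

```shell
# Sketch: print the distinct field counts seen on standard input
# when each line is split on the given delimiter.
# count_fields is a hypothetical helper name.
count_fields() {
    awk -F "$1" '{ print NF }' | sort -u
}

# Usage, assuming the `hdfs` client is available:
# hdfs dfs -cat '/mysqltohdfs/help_keyword/part-*' | count_fields '\t'
```

A single number in the output suggests every line was split the same way; several different numbers usually point to a wrong delimiter or to delimiter characters inside the data.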

Command:
sqoop import \
--connect jdbc:mysql://hadoop01:3306/mysql \
--username root \
--password shiny \
--table help_keyword \
--where 'name = "only"' \
--target-dir /mysqltohdfs/where/help_keyword \
--fields-terminated-by '\t' \
-m 1

Verify the result:

Records with name = "only" are written to the /mysqltohdfs/where/help_keyword directory, with fields separated by "\t".

Command:
sqoop import \
--connect jdbc:mysql://hadoop01:3306/mysql \
--username root \
--password shiny \
--query 'select * from help_keyword where help_keyword_id < 10 and $CONDITIONS' \
--split-by help_keyword_id \
--target-dir /mysqltohdfs/query/help_keyword \
--fields-terminated-by '\t' \
-m 2

Verify the result:

The result of the SQL query is written to the /mysqltohdfs/query/help_keyword directory, with fields separated by "\t".
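With --query, Sqoop expects the literal, uppercase token $CONDITIONS in the statement and replaces it with each map task's split condition, so the shell must not expand it first. The sketch below compares the two quoting forms; the variable names are illustrative only.

```shell
# Single quotes: the shell leaves $CONDITIONS untouched.
query_single='select * from help_keyword where help_keyword_id < 10 and $CONDITIONS'

# Double quotes: the dollar sign must be escaped, otherwise the shell
# substitutes its own (normally unset) variable and Sqoop never sees the token.
query_double="select * from help_keyword where help_keyword_id < 10 and \$CONDITIONS"

# Both forms pass the same string to --query.
[ "$query_single" = "$query_double" ] && echo "identical"
```

Single quotes are the simpler choice unless the query itself needs shell variables interpolated into it.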

1. Plain import

2. Specifying the delimiter and the database

3. Overwriting the data in the table

Plain import syntax

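Data export: detailed steps

1. Syntax

2. Detailed data export steps

There is no direct command for exporting HBase data to MySQL. Instead:

(1) first export the HBase data to HDFS;

(2) then export that data to MySQL.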
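Step (2) could look roughly like the sketch below. It assumes that step (1) has already left tab-delimited text files in an HDFS directory (how that is done depends on the setup, for example through a Hive table mapped onto the HBase table, and is not shown here) and that the target table already exists in MySQL. The database name `mydb`, the helper name `export_to_mysql`, and the directory and table in the usage line are all hypothetical.

```shell
# Sketch of step (2): push delimited files from HDFS into an existing MySQL table.
# export_to_mysql, the database "mydb", and the arguments are hypothetical.
export_to_mysql() {
    export_dir=$1   # HDFS directory produced by step (1)
    table=$2        # existing MySQL table to load
    sqoop export \
        --connect "jdbc:mysql://hadoop01:3306/mydb" \
        --username root \
        -P \
        --table "$table" \
        --export-dir "$export_dir" \
        --input-fields-terminated-by '\t' \
        -m 1
}

# Usage (hypothetical directory and table):
# export_to_mysql /hbasetohdfs/user_info user_info
```

Here -P prompts for the password on the console instead of placing it on the command line, and --input-fields-terminated-by should match the delimiter used when the files were written.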
