Some Notes on Using Sqoop

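Sqoop is a tool for importing data from relational databases into HDFS. When pulling data out of a relational database, we can filter it first and select only the rows we actually need, which greatly reduces the data volume. In practice this means adding a WHERE condition to the SQL statement.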

The following Sqoop command was tested and runs successfully:

sqoop import --connect jdbc:oracle:thin:@ip:port:schema --username username -password=password --table *** --columns "columns" --where " c1>=to_date('2015-01-01 00:00:00','yyyy-mm-dd hh24:mi:ss') and c1<=to_date('2015-02-01 00:00:00','yyyy-mm-dd hh24:mi:ss') " -m 8 --split-by id --fields-terminated-by '^' --target-dir /importdata/***/

The original test command was:

sqoop import --connect jdbc:oracle:thin:@ip:port:schema --username username -password=password --query 'select columns from tablename where 1=1 and $CONDITIONS' --where " c1>=to_date('2015-01-01 00:00:00','yyyy-mm-dd hh24:mi:ss') and c1<=to_date('2015-02-01 00:00:00','yyyy-mm-dd hh24:mi:ss') " -m 8 --split-by id --fields-terminated-by '^' --target-dir /importdata/***/

Another test command:

sqoop import --connect jdbc:oracle:thin:@ip:port:schema --username username -password=password --query 'select columns from tablename where 1=1 and c1>=to_date('2015-01-01 00:00:00','yyyy-mm-dd hh24:mi:ss') and c1<=to_date('2015-02-01 00:00:00','yyyy-mm-dd hh24:mi:ss') and $CONDITIONS' -m 8 --split-by id --fields-terminated-by '^' --target-dir /importdata/***/

And another:

sqoop import --connect jdbc:oracle:thin:@ip:port:schema --username username -password=password --query 'select columns from tablename where 1=1 and $CONDITIONS and c1>=to_date('2015-01-01 00:00:00','yyyy-mm-dd hh24:mi:ss') and c1<=to_date('2015-02-01 00:00:00','yyyy-mm-dd hh24:mi:ss') ' -m 8 --split-by id --fields-terminated-by '^' --target-dir /importdata/***/

I tried these and a few other test commands, and came away with the following observations:

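In my tests, --query did not seem to support complex WHERE filtering, yet the query must still contain a WHERE clause with the $CONDITIONS placeholder (spelled exactly like that, in upper case). Adding a --where argument after --query had no effect either. So far, the only combination that passed my tests and met the requirement is --columns used together with --where.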
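One thing worth ruling out when a --query import misbehaves is shell quoting, since the shell may expand $CONDITIONS before Sqoop ever sees it. The sketch below only builds strings and does not call Sqoop. The table name, column names and the filter c1 >= 10 are hypothetical placeholders, and CONDITIONS is set to empty here to stand in for the usual case where the shell has no such variable.

```shell
# Stand-in for a shell where CONDITIONS is not defined (assumption for the demo).
CONDITIONS=''

# Single quotes: the shell passes $CONDITIONS through literally.
single='select columns from tablename where c1 >= 10 and $CONDITIONS'

# Double quotes: the dollar sign has to be escaped to stay literal.
escaped="select columns from tablename where c1 >= 10 and \$CONDITIONS"

# Double quotes without the escape: the shell substitutes the (empty) variable,
# so the placeholder is gone before the query reaches Sqoop.
broken="select columns from tablename where c1 >= 10 and $CONDITIONS"

printf '%s\n' "$single" "$escaped" "$broken"
```

The first two strings still end with the literal placeholder. The third ends with a dangling "and", which is the kind of query that would be rejected.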

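In a Sqoop --query argument, the WHERE clause and the $CONDITIONS placeholder work together: Sqoop uses the ranges derived from the --split-by argument to complete the SQL that each map task runs. For example: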

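a. --split-by id specifies the column used to divide the data among the map tasks

b. Sqoop runs select max(id), min(id) from *** to get the data range, for example max(id) = 100 and min(id) = 1

c. If Sqoop is told to use 4 map tasks for the import, the id range in this case can be divided into 4 segments, each imported by a different map task: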

1 <= id and id <= 25

26 <= id and id <= 50

51 <= id and id <= 75

76 <= id and id <= 100

d. Each map task completes the SQL, for example select * from *** where (1 <= id and id <= 25). Here (1 <= id and id <= 25) replaces the $CONDITIONS placeholder in --query.
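Steps a to d can be sketched as a small shell script. This is a simplified illustration of the idea, not Sqoop's actual splitter, which may choose different boundaries. The function names split_conditions and fill_query and the table name tablename are made up for this example.

```shell
# Print one range condition per map task for the inclusive id range [min, max].
# Variables are global because plain POSIX sh has no 'local'.
split_conditions() {
  min=$1; max=$2; maps=$3
  size=$(( (max - min + maps) / maps ))   # ceiling of (max - min + 1) / maps
  lo=$min
  while [ "$lo" -le "$max" ]; do
    hi=$(( lo + size - 1 ))
    if [ "$hi" -gt "$max" ]; then hi=$max; fi
    echo "$lo <= id and id <= $hi"
    lo=$(( lo + size ))
  done
}

# Replace the first literal $CONDITIONS in a query ($1) with a condition ($2).
fill_query() {
  prefix=${1%%\$CONDITIONS*}   # text before the placeholder
  suffix=${1#*\$CONDITIONS}    # text after the placeholder
  printf '%s(%s)%s\n' "$prefix" "$2" "$suffix"
}

# Example from the text: min(id) = 1, max(id) = 100, 4 map tasks.
query='select * from tablename where $CONDITIONS'
split_conditions 1 100 4 | while IFS= read -r cond; do
  fill_query "$query" "$cond"
done
```

With these inputs the loop prints four queries, one per segment listed in step c.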

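These are my notes on using Sqoop to import data from a relational database into HDFS. I am new to the tool and may have misunderstood some things, so corrections and discussion are welcome!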
