2 Hive import, export, and delete

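I. Common ways to import data into Hive

(Hive does not validate source data at load time, so any data can be loaded; fields that do not match the table schema simply come back as NULL when queried.)

Four methods are covered here:

(1) Import data from the local file system into a Hive table;

(2) Import data from HDFS into a Hive table;

(3) Query data from another table and insert it into a Hive table;

(4) Create a table and populate it from a query on another table in a single statement.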

[hadoop@h91 hive-0.9.0-bin]$ bin/hive (start the Hive CLI)

1. Import data from the local file system into a Hive table

1.1: Create table ha

hive> create table ha(id int,name string)

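> row format delimited -- keyword: rows are stored as delimited text

> fields terminated by '\t' -- keyword: sets the column delimiter, here a tab

> stored as textfile; -- keyword: stores the table data as plain text files

The [row format delimited] clause declares that rows are delimited text; together with [fields terminated by] it sets the column delimiter Hive expects in the loaded data.

The [stored as file_format] clause sets the storage format of the table's data files; the default is textfile. If the data is plain text, use [stored as textfile]; the file can then be copied from the local file system straight to HDFS and Hive can read it as is.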

1.2: A text file on the local operating system (columns separated by a tab, matching the table definition)

[hadoop@h91 ~]$ vim haha.txt

101 zs

102 ls

103 ww
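Because the table expects a tab between columns, it can be safer to generate the file than to type it in an editor that may expand tabs to spaces. The following is a minimal sketch, assuming a temporary directory (the variable name datadir is hypothetical; the walkthrough itself keeps the file at /home/hadoop/haha.txt):

```shell
# Write the three sample rows with a literal tab between id and name.
# datadir is a throwaway location chosen for illustration only.
datadir=$(mktemp -d)
printf '101\tzs\n102\tls\n103\tww\n' > "$datadir/haha.txt"

# Check that every row splits into exactly two tab-separated fields.
if awk -F'\t' 'NF != 2 { bad = 1 } END { exit bad }' "$datadir/haha.txt"; then
  echo "all rows have 2 tab-separated fields"
fi
```

A row whose separator is a space instead of a tab would still load, but Hive would not split it into two columns.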

1.3: Load the data

hive> load data local inpath '/home/hadoop/haha.txt' into table ha;

hive> select * from ha;

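Hive is a data warehouse layer on top of Hadoop rather than a conventional relational database. The version used here (0.9.0) cannot take literal rows in an insert statement, that is, it does not support insert into ... values; that form was only added in Hive 0.14.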

2. Import data from HDFS into a Hive table

2.1: Create a file on the local file system and upload it to the HDFS cluster

[hadoop@h91 ~]$ hadoop fs -mkdir abc

[hadoop@h91 ~]$ vim hehe.txt

1001 aa

1002 bb

1003 cc

[hadoop@h91 ~]$ hadoop fs -put hehe.txt /user/hadoop/abc (upload to HDFS)

2.2: Create the table

hive> create table he(id int,name string)

> row format delimited

> fields terminated by '\t'

> stored as textfile;

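2.3: Load the data (without the local keyword the path refers to HDFS, and the file is moved into the table's warehouse directory)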

hive> load data inpath 'hdfs://h101:9000/user/hadoop/abc/hehe.txt' into table he;

3. Query data from another table and insert it into a Hive table

3.1: Query the source table and create a new table

hive> select * from he;

OK

1001 aa

1002 bb

1003 cc

hive> create table heihei(id int,name string)

> row format delimited

> fields terminated by '\t'

> stored as textfile;

3.2: Query the data from the source table and insert it into the new table

hive> insert into table heihei select * from he; -- insert into appends; available in Hive 0.8 and later

or, on older versions:

hive> insert overwrite table heihei select * from he; -- versions without insert into must use insert overwrite, which replaces the existing data

4. Create a table and populate it from a query on another table

hive> create table gaga as select * from he;

******************************

II. Exporting data

(1) Export to the local file system;

(2) Export to HDFS;

(3) Export to another Hive table.

1. Export to the local file system:

hive> insert overwrite local directory '/home/hadoop/he1' select * from he;

[hadoop@h91 ~]$ cd he1 (he1 is a directory; it contains the file 000000_0)

[hadoop@h91 he1]$ cat 000000_0

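(The columns appear to run together: Hive writes its default field delimiter, Ctrl-A (\001), which is invisible in cat output.)

A tab can be added between the columns like this: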

hive> insert overwrite local directory '/home/hadoop/he1' select id,concat('\t',name) from he;
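As an alternative to changing the query, the exported file can be post-processed on the local side. The sketch below assumes the export uses Hive's default Ctrl-A (octal 001) field delimiter; the file contents are simulated locally rather than produced by Hive, and the names workdir and he1.tsv are hypothetical:

```shell
# Simulate a Hive export file: fields separated by Ctrl-A (octal 001).
# The name 000000_0 mirrors the export above; the rows are generated
# here for illustration only.
workdir=$(mktemp -d)
printf '1001\001aa\n1002\001bb\n1003\001cc\n' > "$workdir/000000_0"

# Make the hidden delimiter visible (cat -v shows Ctrl-A as ^A).
cat -v "$workdir/000000_0"

# Convert every Ctrl-A to a tab to get a readable, tab-separated copy.
tr '\001' '\t' < "$workdir/000000_0" > "$workdir/he1.tsv"
cat "$workdir/he1.tsv"
```

On a real export the same tr command would be pointed at /home/hadoop/he1/000000_0.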

Unlike importing into a table, exporting to a directory cannot use insert into; only insert overwrite is accepted.

2. Export to HDFS.

hive> insert overwrite directory '/user/hadoop/abc' select * from he;

(/user/hadoop/abc is a directory on HDFS)

[hadoop@h91 hadoop-0.20.2-cdh3u5]$ bin/hadoop fs -ls abc

[hadoop@h91 hadoop-0.20.2-cdh3u5]$ bin/hadoop fs -cat abc/000000_0

3. Export to another Hive table

hive> insert into table he12 select * from he;

******************************

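III. Deleting data

1. Deleting files in the Hadoop cluster deletes the table's data

Data reaches a table either by loading files or by inserting from another table, and either way it ends up as files under the table's directory. Removing the corresponding data files in the Hadoop cluster therefore removes that data from the table.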

[hadoop@h101 ~]$ hadoop fs -lsr /user/hive/warehouse

drwxr-xr-x - hadoop supergroup 0 2017-08-23 18:42 /user/hive/warehouse/two

-rw-r--r-- 2 hadoop supergroup 27 2017-08-23 18:39 /user/hive/warehouse/two/000000_0

-rw-r--r-- 2 hadoop supergroup 27 2017-08-23 18:40 /user/hive/warehouse/two/a.txt

-rw-r--r-- 2 hadoop supergroup 27 2017-08-23 18:42 /user/hive/warehouse/two/b.txt
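The removal itself might look like the sketch below, which targets one of the files from the listing above. The DRY_RUN switch is a hypothetical safeguard added for illustration: by default the script only prints the command, and it calls the hadoop client only when DRY_RUN=0 is set.

```shell
# One data file of table "two", taken from the listing above.
target=/user/hive/warehouse/two/a.txt

# DRY_RUN defaults to 1, so the sketch just prints the command.
# Set DRY_RUN=0 on a machine with the hadoop client to really delete.
if [ "${DRY_RUN:-1}" = "1" ]; then
  echo "hadoop fs -rm $target"
else
  hadoop fs -rm "$target"
fi
```

After the file is removed, a select on the table no longer returns the rows that file held; the other files in the directory are unaffected.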
