Hive Summary (2): Three Ways to Import Data into Hive

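Three methods are commonly used:

1. Import data from the local file system into a Hive table;

2. Import data from HDFS into a Hive table;

3. When creating a table, query the relevant records from another table and insert them into the new table.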

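Hive configuration:

Directory in HDFS where Hive stores its data files (created automatically in HDFS after Hive starts):

hdfs: /usr/hive/warehouse

If the directory does not exist, it can be created with hadoop fs -mkdir /usr/hive/warehouse.

Local data directory:

local: /home/santiago/data/hive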

1. Create a table in Hive

hive> show databases;
OK
default
Time taken: 1.706 seconds, Fetched: 1 row(s)
hive> create table guo_test(name string, string string)
    > row format delimited
    > fields terminated by ','
    > stored as textfile;
hive> show tables;
OK
guo_test
Time taken: 0.024 seconds, Fetched: 1 row(s)
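
Before loading anything, it can help to confirm where the table lives in HDFS and which field delimiter it uses. The following is a minimal sketch, assuming the guo_test table created above and a running Hive CLI; the exact layout of the output varies by Hive version, so no output is shown.

```sql
-- Show the table's metadata, including its storage details.
describe formatted guo_test;
-- Things worth checking in the output:
--   Location     the table directory under the configured warehouse path
--   field.delim  should be ',' for this table

-- Print the DDL Hive recorded for the table, as a second check.
show create table guo_test;
```

If the delimiter shown does not match the data file, every row typically ends up in the first column and the remaining columns come back as NULL.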

2. Create a data file with a matching format on the local file system

santi@hdp:~/data/hive$ ls
hive_test.txt
santi@hdp:~/data/hive$ cat hive_test.txt

santi,you are a zhazha.

3. Import the data and test

hive> load data local inpath '/home/santi/data/hive/hive_test.txt' into table guo_test;
hive> select * from guo_test;
hive> dfs -ls /usr/hive/warehouse/guo_test;
# hadoop fs -ls /usr/hive/warehouse
Found 1 items
drwxrwxr-x - santiago supergroup 0 2017-01-14 21:13 /usr/hive/warehouse/guo_test

The HDFS storage location configured in hive-site.xml now contains a new guo_test directory.

# hadoop fs -ls /usr/hive/warehouse/guo_test
Found 1 items
-rwxrwxr-x 1 santiago supergroup 24 2017-01-14 21:13 /usr/hive/warehouse/guo_test/hive_test.txt
hive> select * from guo_test;
OK
santi you are a zhazha.

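The file that was written to the Hive data warehouse is found in that directory.

[Note] The local data was written successfully. When data is imported from the local file system into a Hive table, however, it is first copied temporarily to a directory in HDFS (typically the uploading user's HDFS home directory, for example /home/santi/), and then moved from that temporary directory into the data directory of the corresponding Hive table (the temporary directory does not keep the data).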
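
The load statement also accepts an overwrite keyword, which changes what happens to data already in the table. A minimal sketch, assuming the same file and table as above; the row counts depend on how many times the file has been loaded, so none are shown.

```sql
-- Append: files already in the table directory are kept,
-- so repeating this statement adds the same rows again.
load data local inpath '/home/santi/data/hive/hive_test.txt' into table guo_test;

-- Replace: the existing contents of the table are removed
-- before the new file is put in place.
load data local inpath '/home/santi/data/hive/hive_test.txt' overwrite into table guo_test;

-- Check how many rows the table holds afterwards.
select count(*) from guo_test;
```

With local, the source file on the local disk is left in place in both cases.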

1. Create a data file on the HDFS file system

HDFS has no vim command, so the data file has to be created locally and uploaded to HDFS manually.

/data/hive# vim data_hdtohive
/data/hive# cat data_hdtohive
data from, hdfs to hive
# hadoop fs -put /home/santi/data/hive/data_hdtohive /usr/data/input    # upload the data
# hadoop fs -ls /usr/data/input

2. Import the data

hive> load data inpath '/usr/data/input/data_hdtohive' into table guo_test;
hive> select * from guo_test;
OK
data from hdfs to hive
santi you are a zhazha.
Time taken: 0.172 seconds, Fetched: 2 row(s)

The data was written successfully.

The data is stored in the storage location configured for Hive.

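[Note]

The statement for importing from the local file system is

hive> load data local inpath '/home/santi/data/hive/hive_test.txt' into table guo_test;

The statement for importing from HDFS is

hive> load data inpath '/usr/data/input/data_hdtohive' into table guo_test;

The only difference is the local keyword.

Also, when data is imported from HDFS into a Hive table, the file is moved rather than copied: it can no longer be found at its original HDFS location.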
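
The move can be observed from inside the Hive CLI by listing the source directory before and after the load. A minimal sketch, assuming the paths used above and that the file has just been uploaded to /usr/data/input; the comments are for the reader and can be left out when typing the commands interactively.

```sql
-- Before the load: the uploaded file should be listed here.
dfs -ls /usr/data/input;

-- Load from HDFS (no local keyword).
load data inpath '/usr/data/input/data_hdtohive' into table guo_test;

-- After the load: the file should no longer appear in the source directory...
dfs -ls /usr/data/input;

-- ...and should now appear in the table's directory.
dfs -ls /usr/hive/warehouse/guo_test;
```

If the source file must stay where it is, keep a copy elsewhere in HDFS before running the load.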

The third method creates a table from a query. The statement is create table <new table> as select *** from <source table>.

hive> create table hivedata_test1
    > as
    > select name
    > from guo_test;
hive> select * from hivedata_test1;
OK
data from
santi
Time taken: 0.116 seconds, Fetched: 2 row(s)
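
A table created this way takes its column names and types from the query, and its storage format from Hive's defaults unless one is given. The following is a minimal sketch of a variant that sets the format explicitly and filters the source rows; the table name hivedata_test1_csv and the where condition are hypothetical and not part of the original walkthrough.

```sql
-- Create a comma-delimited text table from a filtered query.
-- The row format and storage clauses go before "as select".
create table hivedata_test1_csv
row format delimited
fields terminated by ','
stored as textfile
as
select name
from guo_test
where name is not null;

-- Inspect the result.
select * from hivedata_test1_csv;
```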

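[Note] Things differ slightly when the Hive table is partitioned.

In Hive, each partition of a table corresponds to a directory under the table's directory, and all data belonging to a partition is stored in that directory. For example, if a table has two partitions a=*** and b=xx, the corresponding directories are /user/hive/warehouse/<table name>/a=*** and /user/hive/warehouse/<table name>/b=xx.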

hive> create table hivedata_test2(
    > name string)
    > partitioned by
    > (string string)
    > row format delimited
    > fields terminated by ','
    > stored as textfile;
hive> insert into table hivedata_test2
    > partition(string='best')
    > select name
    > from guo_test;
hive> select * from hivedata_test2;
OK
data from best
santi best
Time taken: 1.549 seconds, Fetched: 2 row(s)
# hadoop fs -ls /usr/hive/warehouse/hivedata_test2
Found 1 items
drwxrwxr-x - santiago supergroup 0 2017-02-14 17:40 /usr/hive/warehouse/hivedata_test2/string=best
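
Each further partition value gets its own directory next to string=best. A minimal sketch, assuming the hivedata_test2 table above; the partition value 'good' is hypothetical, and the column name is quoted with backticks in the where clause because string is also a type name.

```sql
-- Write the same rows into a second partition.
insert into table hivedata_test2
partition(string='good')
select name
from guo_test;

-- List the partitions Hive knows about for this table.
show partitions hivedata_test2;

-- Read only one partition; Hive can limit the scan to its directory.
select name from hivedata_test2 where `string` = 'best';
```

Listing /usr/hive/warehouse/hivedata_test2 afterwards should show one directory per partition value.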
