Big Data: The Hive Command Line

hive -H

Help output:

usage: hive
 -d,--define <key=value>          Variable substitution to apply to Hive
                                  commands. e.g. -d A=B or --define A=B
    --database <databasename>     Specify the database to use
 -e <quoted-query-string>         SQL from command line
 -f <filename>                    SQL from files
 -H,--help                        Print help information
 -h <hostname>                    Connecting to Hive Server on remote host
    --hiveconf <property=value>   Use value for given property
    --hivevar <key=value>         Variable substitution to apply to Hive
                                  commands. e.g. --hivevar A=B
 -i <filename>                    Initialization SQL file
 -p <port>                        Connecting to Hive Server on port number
 -S,--silent                      Silent mode in interactive shell
 -v,--verbose                     Verbose mode (echo executed SQL to the
                                  console)

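hive -e

Runs a single query. While the query runs, MapReduce progress is shown on the terminal; when it finishes, the result is printed to the terminal and the hive process exits without entering interactive mode.

$HIVE_HOME/bin/hive -e 'select a.col from tab1 a'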

Hive configuration properties can also be set from bash on the same command line with --hiveconf, for example:

$HIVE_HOME/bin/hive -e 'select a.col from tab1 a' --hiveconf hive.exec.scratchdir=/home/my/hive_scratch --hiveconf mapred.reduce.tasks=32
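Each property needs its own --hiveconf flag, which gets repetitive in scripts. The following is a minimal sketch of a wrapper that adds the flags automatically. The function name run_query and the HIVE_BIN variable are our own inventions for illustration, not part of Hive.

```shell
# Hypothetical helper: run one query with any number of key=value settings.
# $1 = query; remaining arguments = Hive properties such as mapred.reduce.tasks=32
run_query() {
  query=$1
  shift
  n=$#
  # Rotate through the properties, re-appending each one behind --hiveconf.
  while [ "$n" -gt 0 ]; do
    prop=$1
    shift
    set -- "$@" --hiveconf "$prop"
    n=$((n - 1))
  done
  # HIVE_BIN is an assumed override; by default the hive found on PATH is used.
  "${HIVE_BIN:-hive}" -e "$query" "$@"
}

# Example call (requires a working Hive installation):
# run_query 'select a.col from tab1 a' mapred.reduce.tasks=32
```

The wrapper passes the arguments in the same order as the hand-written command above: -e, the query, then one --hiveconf pair per property.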

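hive -S

Runs a query in silent mode. With -S, MapReduce progress is not shown on the terminal; only the query result is printed when the query finishes. Silent mode is practical when Hive is called from a third-party program, which can read the result set from Hive's standard output. The example below runs a query in silent mode and exports the result set to a file.

$HIVE_HOME/bin/hive -S -e 'select a.col from tab1 a' > tab1.csv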
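One caveat: the Hive CLI separates output columns with tabs, so a redirected result with more than one column is tab-separated regardless of the file extension. A minimal sketch of a conversion step follows. The function names and the HIVE_BIN variable are hypothetical, and the filter is deliberately naive: it assumes no field contains a comma, a quote, or a tab.

```shell
# Naive TSV-to-CSV filter: replaces every tab with a comma.
# Assumes no field contains a comma, quote, or tab.
tsv_to_csv() {
  tr '\t' ','
}

# Hypothetical wrapper: run a query silently and save the result as CSV.
# $1 = HiveQL query, $2 = output file
export_query_csv() {
  "${HIVE_BIN:-hive}" -S -e "$1" | tsv_to_csv > "$2"
}

# Example call (requires a working Hive installation):
# export_query_csv 'select a.col from tab1 a' tab1.csv
```

For data that may contain commas or quotes, a real CSV writer in the calling program is a safer choice than this filter.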

hive -f

Runs a Hive script without entering interactive mode.

$HIVE_HOME/bin/hive -f /home/my/hive-script.sql

Starting with Hive 0.14, the script file can also be read from HDFS:

$HIVE_HOME/bin/hive -f hdfs://<namenode>:<port>/hive-script.sql

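hive -i

Specifies an initialization file containing Hive environment settings. Without -i, the CLI tries to load $HIVE_HOME/bin/.hiverc and $HOME/.hiverc by default. The file contents can follow the settings below.

-- Show the current database name in the prompt
set hive.cli.print.current.db=true;
-- Print column names with query results
set hive.cli.print.header=true;
-- Enforce bucketing for bucketed tables
set hive.enforce.bucketing=true;
-- Compress Hive's intermediate results
set hive.exec.compress.intermediate=true;
-- Use the bzip2 codec for map output
set mapred.map.output.compression.codec=org.apache.hadoop.io.compress.BZip2Codec;
-- Compress Hive's final output
set hive.exec.compress.output=true;
-- Use the bzip2 codec for MapReduce output in Hive
set mapred.output.compression.codec=org.apache.hadoop.io.compress.BZip2Codec;
-- Let Hive try local mode for queries before submitting a MapReduce job
set hive.exec.mode.local.auto=true;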

For more, see LanguageManual VariableSubstitution.

To disable variable substitution:

set hive.variable.substitute=false;

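-d (--define), --database

Defines a variable that can be referenced inside the Hive interactive shell, for example -d a=b. In the example below, the variable k1 is defined with the value v1 and the database is selected with --database; once inside the interactive shell, the variable can be referenced as ${k1}.

--database specifies the database to use.

$HIVE_HOME/bin/hive -d k1=v1 --database default

The query below prints v1, the value of k1.

hive> select '${k1}' from t_lxw1234 limit 1;
OK
v1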
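The same substitution works in script files run with -f, but the shell must not be allowed to expand the placeholder first. A minimal sketch follows, assuming the table t_lxw1234 from the example exists. Here the variable is written with its explicit namespace, ${hivevar:k1}, and the temporary file is only for illustration.

```shell
# Write a small HiveQL script. The quoted 'EOF' delimiter stops the shell
# from expanding ${hivevar:k1}, so Hive receives the placeholder unchanged.
script=$(mktemp)
cat > "$script" <<'EOF'
select '${hivevar:k1}' from t_lxw1234 limit 1;
EOF

# Run the script only when a hive binary is on PATH.
# Assumption: the table t_lxw1234 exists in the current database.
if command -v hive >/dev/null 2>&1; then
  hive -S --hivevar k1=v1 -f "$script"
fi
```

With an unquoted heredoc delimiter, or with a double-quoted -e string, the shell would try to interpret ${...} itself before Hive ever sees it.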

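--hiveconf <property=value>

Sets a Hive configuration property; it is equivalent to running the set command inside the interactive shell. For example, after starting the shell with the command below, every query runs with 20 reduce tasks unless the value is changed again with set mapred.reduce.tasks=n;.

$HIVE_HOME/bin/hive --hiveconf mapred.reduce.tasks=20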

--hivevar

Same usage as -d and --define.

quit, exit

Exits the interactive shell.

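reset

Resets all Hive runtime configuration parameters. For example, if the number of reducers was changed earlier with set, reset restores the value configured in hive-site.xml.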

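set <key>=<value>

Sets a Hive runtime configuration parameter. This has the highest priority; for the same key, a later setting overrides an earlier one.

set -v

Prints all Hive and Hadoop configuration parameters.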
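The set -v listing is long, so it helps to filter it when only one parameter is of interest. A minimal sketch follows, assuming the listing consists of key=value lines. The function name conf_value is hypothetical.

```shell
# Print the value of one key from "key=value" lines read on stdin.
# $1 = the exact key to look up, e.g. mapred.reduce.tasks
conf_value() {
  awk -F= -v key="$1" '$1 == key { sub(/^[^=]*=/, ""); print; exit }'
}

# Example call (requires a working Hive installation):
# hive -S -e 'set -v' | conf_value mapred.reduce.tasks
```

Only the text up to the first = is treated as the key, so values that themselves contain = are printed intact.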

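add

Covers add FILE[S] <filepath>*, add JAR[S] <filepath>*, and add ARCHIVE[S] <filepath>*. Adds one or more files, jars, or archives to the distributed cache; once added, they can be used in map and reduce tasks.

For example, after packaging a custom UDF into a jar, the jar must be added with add jar before the function is created; otherwise Hive reports that the class cannot be found.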
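The order of the two statements matters, so one option is to generate them together. The sketch below prints the add jar and create temporary function statements for a UDF. The helper name udf_setup_sql and the jar path, function name, and class name are all hypothetical placeholders.

```shell
# Print the HiveQL that registers a UDF: add the jar first, then create the function.
# $1 = jar path, $2 = function name, $3 = fully qualified class name
udf_setup_sql() {
  printf "add jar %s;\n" "$1"
  printf "create temporary function %s as '%s';\n" "$2" "$3"
}

# Hypothetical jar, function, and class names, for illustration only.
udf_setup_sql /home/my/my_udf.jar my_upper com.example.hive.MyUpper
```

Redirecting this output to a file and passing that file with -i would make the function available at the start of the session.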

list

Covers list FILE[S], list JAR[S], and list ARCHIVE[S].

Lists the files, jars, or archives currently in the distributed cache.

delete

Covers delete FILE[S] <filepath>*, delete JAR[S] <filepath>*, and delete ARCHIVE[S] <filepath>*.

Removes files, jars, or archives from the distributed cache.

!

Runs a Linux shell command from the interactive shell and prints its output. Rarely used.

For example:

hive> !pwd;
/home/lxw1234

dfs

Runs a hadoop fs command from the interactive shell. Rarely used.

For example, to get the total size of the /tmp/ directory in HDFS:

hive> dfs -du -s /tmp/;
54656194751 /tmp

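The most common use of the interactive shell is to run an HQL statement, terminated by a semicolon.

source file

Runs a script file from the interactive shell. Rarely used.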
