Hive Study Notes: Basics

1. Common DDL

Create a table: create [external] table [if not exists] table_name

[(col_name data_type [comment col_comment], ...)]

[comment table_comment]

[partitioned by (col_name data_type [comment col_comment], col_name data_type [comment col_comment], ...)]

[clustered by (col_name, col_name, ...) [sorted by (col_name, ...)] into num_buckets buckets]

[[row format row_format] [stored as file_format]

| stored by 'storage.handler.class.name' [with serdeproperties (...)]

][location hdfs_path]

[as select_statement]
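
As a concrete illustration of the template above, the following sketch writes a small DDL script and runs it only if the hive CLI is on the PATH. The table name student_ext, its columns, and the location /hivedata/student_ext are hypothetical and chosen only for illustration.

```shell
# Write a sample DDL script; table, columns, and location are hypothetical.
ddl_file="$(mktemp)"
cat > "$ddl_file" <<'EOF'
create external table if not exists student_ext (
  id   int    comment 'student id',
  name string comment 'student name'
)
comment 'example external table'
partitioned by (part string comment 'partition key')
row format delimited fields terminated by ','
stored as textfile
location '/hivedata/student_ext';
EOF

# Run the script only if the hive CLI is available on this machine.
if command -v hive >/dev/null 2>&1; then
  hive -f "$ddl_file"
else
  echo "hive not found; DDL written to $ddl_file"
fi
```

The optional clauses appear in the same order as in the template: column list, table comment, partitioned by, row format, stored as, location.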

Rename a table:

alter table student rename to student1;

Change a column's type:

alter table student change name1 name1 string;

Add partitions:

alter table student_p add partition(part='a') partition(part='b');

Add or replace columns:

alter table table_name add|replace columns (col_name data_type [comment col_comment], ...)

(Note: add appends the new columns after all existing columns and before the partition columns; replace replaces all existing columns of the table.)

alter table table_name change [column] col_old_name col_new_name column_type [comment col_comment] [first|after column_name]

Convert an external table to a managed (internal) table:

alter table tablea set tblproperties('EXTERNAL'='FALSE');

2. Loading / exporting data

Load data:

hive> load data local inpath '/home/hadoop/hivedata/students.txt' overwrite into table student;

hive> load data inpath 'hdfs://mini1:9000/hivedata/course.txt' overwrite into table course;

Export data to files:

hive> insert overwrite local directory '/home/hadoop/hivedata/outdata' select * from student;

hive> insert overwrite directory 'hdfs://mini1:9000/hivedata/outdatasc'

row format delimited fields terminated by ','

select * from student;

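When data is written to the file system it is serialized as text. By default (when no row format is given) columns are separated by ^A (Ctrl-A, \x01) and rows end with \n. The separator is hard to see with the more command; to make it visible, use sed -e 's/\x01/|/g' filename, for example: sed -e 's/\x01/,/g' 000000_0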
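
A minimal sketch of the sed trick, using a made-up one-row sample file in place of a real Hive output file such as 000000_0 (the row contents are an assumption for illustration):

```shell
# Build a one-line sample file whose fields are separated by ^A (byte 0x01).
sample="$(mktemp)"
printf '1\001Tom\00120\n' > "$sample"

# Replace every ^A with '|' so the column boundaries become visible.
# Note: the \x01 escape inside the sed pattern is a GNU sed extension.
readable="$(sed -e 's/\x01/|/g' "$sample")"
echo "$readable"   # prints 1|Tom|20 with GNU sed
```

On systems without GNU sed, tr '\001' '|' < filename should give the same result.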

3. Hive shell:

hive [-hiveconf x=y]* [<-i filename>]* [<-f filename>|<-e query-string>] [-S]

eg: hive -e 'select count(*) from student'

hive -f query.sql

hive --hiveconf hive.log.dir=/tmp/ --hiveconf hive.log.file=tmp.log -f query.sql

hive --hiveconf hive.root.logger=INFO,console -f query.sql   # set a configuration parameter

hive --hivevar queue_name=test_queue -f query.sql   # define a custom variable
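
A value passed with --hivevar can be referenced inside the script as ${hivevar:name} (and one passed with --hiveconf as ${hiveconf:name}). The sketch below writes a hypothetical script that uses the queue_name variable from the command above, then runs it only if the hive CLI is installed. The script contents are an assumption for illustration, not the original query.sql.

```shell
# Write a sample script that reads the variable passed via --hivevar.
# The quoted 'EOF' keeps the shell from expanding ${hivevar:...} itself.
query_file="$(mktemp)"
cat > "$query_file" <<'EOF'
set mapreduce.job.queuename=${hivevar:queue_name};
select count(*) from student;
EOF

# Run the script only if the hive CLI is available on this machine.
if command -v hive >/dev/null 2>&1; then
  hive --hivevar queue_name=test_queue -f "$query_file"
else
  echo "hive not found; script written to $query_file"
fi
```

Keeping the queue name out of the script lets the same file be submitted to different queues without editing it.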

set hive.cli.print.header=true;

set mapreduce.job.queuename=test_queue;

set -v; -- show all defined parameters

set hive.cli.print.header; -- show the value of a single parameter

4. UDF

hive>add jar /home/hadoop/udf.jar; -- or place the jar under $HIVE_HOME/lib/

hive>create temporary function helloudf as 'org.apache.hadoop.hive.ql.udf.helloudf';

hive>create function helloudf as 'org.apache.hadoop.hive.ql.udf.helloudf' using jar 'hdfs://hadoop001:9000/lib/g6-hadoop-1.0.jar';

-- drop an existing UDF

drop function test_udf;

-- create a permanent UDF from a jar on HDFS

create function test_udf as 'com.xys.bigdata.testudf' using jar 'hdfs://nameservice1/user/hadoop/jar/hive-test-1.4.jar';

-- test the UDF

select test_udf('20200101');
