A Python template for connecting to Hive

A demo of connecting to Hive from Python

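After deploying two Hive instances, one on the namenode and one on a datanode, start the Hive server on each with nohup hive --service hiveserver &.

The Hive server runs Hive as a server that exposes a Thrift service, so clients written in different languages can access it. Applications that use the Thrift, JDBC, or ODBC connectors need a running Hive server to communicate with Hive. Set the HIVE_PORT environment variable to choose the port the server listens on (the default is 10000).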
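Before running a client it can help to confirm that something is actually listening on the expected port. The following is a minimal sketch, not part of Hive itself: the helper names `hive_port` and `is_listening` are hypothetical, and the only facts taken from the text are the `HIVE_PORT` variable and the default of 10000.

```python
import os
import socket

DEFAULT_HIVE_PORT = 10000  # default port stated in the text above


def hive_port(environ=None):
    """Return the port from HIVE_PORT, falling back to the default."""
    env = os.environ if environ is None else environ
    return int(env.get("HIVE_PORT", DEFAULT_HIVE_PORT))


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except socket.error:
        # Connection refused, unreachable host, or timeout.
        return False
    conn.close()
    return True
```

A successful TCP connection only shows that the port is open; it does not prove that the process behind it is a Hive server.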

alias python='python2.6'
PYTHONPATH=$PYTHONPATH:~/pyhive
export PATH HADOOP_HOME HIVE_HOME PYTHONPATH

pyhive is the directory that holds Hive's Python Thrift files; it has to be added to PYTHONPATH.
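If changing the shell environment is inconvenient, the same directory can be added to the module search path from inside the script. This is only a sketch of that alternative: `add_to_path` is a hypothetical helper name, and `~/pyhive` is the directory used in this post.

```python
import os
import sys


def add_to_path(directory):
    """Append a directory to sys.path once and return its absolute path."""
    path = os.path.abspath(os.path.expanduser(directory))
    if path not in sys.path:
        sys.path.append(path)
    return path


# Make the Thrift-generated modules importable before importing them.
pyhive_dir = add_to_path("~/pyhive")
```

The call has to run before any `from hive_service import ...` line, since imports are resolved against `sys.path` at the moment they execute.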

import sys
from hive_service import ThriftHive
from hive_service.ttypes import HiveServerException
from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol

try:
    # Open a buffered Thrift connection to the Hive server.
    transport = TSocket.TSocket('192.168.30.201', 10000)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = ThriftHive.Client(protocol)
    transport.open()

    # Create a table backed by comma-delimited text and load a local file into it.
    hql = '''create table people(a string, b int, c double) row format delimited fields terminated by ',' '''
    print hql
    client.execute(hql)
    client.execute("load data local inpath '/home/diver/data.txt' into table people")

    # Alternative: fetch the rows one at a time.
    #client.execute("select * from people")
    #while (1):
    #    row = client.fetchOne()
    #    if (row == None):
    #        break
    #    print row

    # Count the loaded rows and print the result.
    client.execute("select count(*) from people")
    print client.fetchAll()
    transport.close()
except Thrift.TException, tx:
    print '%s' % (tx.message)
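The script prints whatever the server returns for the final query. For a multi-column query such as the commented-out `select * from people`, the rows usually need to be split into typed fields. The sketch below assumes each row arrives as a single tab-separated string; that is an assumption about the server's output format and should be checked against your Hive version. `parse_rows` is a hypothetical helper, and the converters mirror the `people` table's columns (string, int, double).

```python
def parse_rows(rows, converters, sep="\t"):
    """Split delimited row strings and convert each field.

    Assumes one string per row with `sep` between columns (an assumption,
    not something confirmed by the original post).
    """
    parsed = []
    for row in rows:
        fields = row.split(sep)
        if len(fields) != len(converters):
            raise ValueError("expected %d fields, got %d: %r"
                             % (len(converters), len(fields), row))
        parsed.append(tuple(conv(f) for conv, f in zip(converters, fields)))
    return parsed


# Example with a made-up row shaped like the people table (a, b, c).
sample = parse_rows(["tom\t3\t1.5"], (str, int, float))
```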

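This gives a basic template for data-warehouse scripts. The environment setup can be wrapped up separately so that each script contains only its processing logic.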
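One way to do that wrapping is sketched below. It is an illustration rather than the author's method: `opened` and `run_statements` are hypothetical names, and the code only relies on the transport having `open()`/`close()` and the client having `execute()`/`fetchAll()`, as in the script above. Unlike the original script, the transport is closed even when a statement fails.

```python
from contextlib import contextmanager


@contextmanager
def opened(transport):
    """Open a transport and guarantee that it is closed afterwards."""
    transport.open()
    try:
        yield transport
    finally:
        transport.close()


def run_statements(client, statements):
    """Execute HQL statements in order and return the last one's rows."""
    for hql in statements:
        client.execute(hql)
    return client.fetchAll()
```

With the connection objects built as in the script, the logic shrinks to `with opened(transport): rows = run_statements(client, [...])`, and the host, port, and paths can live in a separate configuration module.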

