參考官方文檔: DML文檔
10多年建站經(jīng)驗(yàn), 網(wǎng)站制作、成都網(wǎng)站建設(shè)客戶的見(jiàn)證與正確選擇。成都創(chuàng)新互聯(lián)提供完善的營(yíng)銷型網(wǎng)頁(yè)建站明細(xì)報(bào)價(jià)表。后期開(kāi)發(fā)更加便捷高效,我們致力于追求更美、更快、更規(guī)范。
LOAD作用是加載文件到表中(Loading files into tables)
下面是官網(wǎng)上為我們列出的語(yǔ)法:
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)] [INPUTFORMAT 'inputformat' SERDE 'serde'] (3.0 or later)
1.加載數(shù)據(jù)到表中時(shí),Hive不做任何轉(zhuǎn)換。加載操作只是把數(shù)據(jù)拷貝或移動(dòng)操作,即移動(dòng)數(shù)據(jù)文件到Hive表相應(yīng)的位置。
2.加載的目標(biāo)可以是一個(gè)表,也可以是一個(gè)分區(qū)。如果表是分區(qū)的,則必須通過(guò)指定所有分區(qū)列的值來(lái)指定一個(gè)表的分區(qū)。
LOCAL:表示輸入文件在本地文件系統(tǒng)(Linux),如果沒(méi)有加LOCAL,hive則會(huì)去HDFS上查找該文件。
OVERWRITE:表示如果表中有數(shù)據(jù),則先刪除數(shù)據(jù),再插入新數(shù)據(jù),如果沒(méi)有這個(gè)關(guān)鍵詞,則直接附加數(shù)據(jù)到表中。
PARTITION:如果表中存在分區(qū),可以按照分區(qū)進(jìn)行導(dǎo)入。
# 創(chuàng)建一張員工表
hive> create table emp
> (empno int, ename string, job string, mgr int, hiredate string, salary double, comm double, deptno int)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\t' ;
OK
Time taken: 0.651 seconds
# 把本地文件系統(tǒng)中emp.txt導(dǎo)入表
hive> LOAD DATA LOCAL INPATH '/home/hadoop/emp.txt' OVERWRITE INTO TABLE emp;
Loading data to table default.emp
Table default.emp stats: [numFiles=1, numRows=0, totalSize=886, rawDataSize=0]
OK
Time taken: 1.848 seconds
# 查看表數(shù)據(jù)
hive> select * from emp;
OK
7369 SMITH CLERK 7902 1980-12-17 800.0 NULL 20
7499 ALLEN SALESMAN 7698 1981-2-20 1600.0 300.0 30
7521 WARD SALESMAN 7698 1981-2-22 1250.0 500.0 30
7566 JONES MANAGER 7839 1981-4-2 2975.0 NULL 20
7654 MARTIN SALESMAN 7698 1981-9-28 1250.0 1400.0 30
7698 BLAKE MANAGER 7839 1981-5-1 2850.0 NULL 30
7782 CLARK MANAGER 7839 1981-6-9 2450.0 NULL 10
7788 SCOTT ANALYST 7566 1987-4-19 3000.0 NULL 20
7839 KING PRESIDENT NULL 1981-11-17 5000.0 NULL 10
7844 TURNER SALESMAN 7698 1981-9-8 1500.0 0.0 30
7876 ADAMS CLERK 7788 1987-5-23 1100.0 NULL 20
7900 JAMES CLERK 7698 1981-12-3 950.0 NULL 30
7902 FORD ANALYST 7566 1981-12-3 3000.0 NULL 20
7934 MILLER CLERK 7782 1982-1-23 1300.0 NULL 10
# 不用OVERWRITE關(guān)鍵字
hive> LOAD DATA LOCAL INPATH '/home/hadoop/emp.txt' INTO TABLE emp;
# 再次查看
hive> select * from emp;
OK
7369 SMITH CLERK 7902 1980-12-17 800.0 NULL 20
7499 ALLEN SALESMAN 7698 1981-2-20 1600.0 300.0 30
7521 WARD SALESMAN 7698 1981-2-22 1250.0 500.0 30
7566 JONES MANAGER 7839 1981-4-2 2975.0 NULL 20
7654 MARTIN SALESMAN 7698 1981-9-28 1250.0 1400.0 30
7698 BLAKE MANAGER 7839 1981-5-1 2850.0 NULL 30
7782 CLARK MANAGER 7839 1981-6-9 2450.0 NULL 10
7788 SCOTT ANALYST 7566 1987-4-19 3000.0 NULL 20
7839 KING PRESIDENT NULL 1981-11-17 5000.0 NULL 10
7844 TURNER SALESMAN 7698 1981-9-8 1500.0 0.0 30
7876 ADAMS CLERK 7788 1987-5-23 1100.0 NULL 20
7900 JAMES CLERK 7698 1981-12-3 950.0 NULL 30
7902 FORD ANALYST 7566 1981-12-3 3000.0 NULL 20
7934 MILLER CLERK 7782 1982-1-23 1300.0 NULL 10
7369 SMITH CLERK 7902 1980-12-17 800.0 NULL 20
7499 ALLEN SALESMAN 7698 1981-2-20 1600.0 300.0 30
7521 WARD SALESMAN 7698 1981-2-22 1250.0 500.0 30
7566 JONES MANAGER 7839 1981-4-2 2975.0 NULL 20
7654 MARTIN SALESMAN 7698 1981-9-28 1250.0 1400.0 30
7698 BLAKE MANAGER 7839 1981-5-1 2850.0 NULL 30
7782 CLARK MANAGER 7839 1981-6-9 2450.0 NULL 10
7788 SCOTT ANALYST 7566 1987-4-19 3000.0 NULL 20
7839 KING PRESIDENT NULL 1981-11-17 5000.0 NULL 10
7844 TURNER SALESMAN 7698 1981-9-8 1500.0 0.0 30
7876 ADAMS CLERK 7788 1987-5-23 1100.0 NULL 20
7900 JAMES CLERK 7698 1981-12-3 950.0 NULL 30
7902 FORD ANALYST 7566 1981-12-3 3000.0 NULL 20
7934 MILLER CLERK 7782 1982-1-23 1300.0 NULL 10
Time taken: 0.137 seconds, Fetched: 28 row(s)
# 再次OVERWRITE覆蓋導(dǎo)入
hive> LOAD DATA LOCAL INPATH '/home/hadoop/emp.txt' OVERWRITE INTO TABLE emp;
# 發(fā)現(xiàn)數(shù)據(jù)被覆蓋了
hive> select * from emp;
OK
7369 SMITH CLERK 7902 1980-12-17 800.0 NULL 20
7499 ALLEN SALESMAN 7698 1981-2-20 1600.0 300.0 30
7521 WARD SALESMAN 7698 1981-2-22 1250.0 500.0 30
7566 JONES MANAGER 7839 1981-4-2 2975.0 NULL 20
7654 MARTIN SALESMAN 7698 1981-9-28 1250.0 1400.0 30
7698 BLAKE MANAGER 7839 1981-5-1 2850.0 NULL 30
7782 CLARK MANAGER 7839 1981-6-9 2450.0 NULL 10
7788 SCOTT ANALYST 7566 1987-4-19 3000.0 NULL 20
7839 KING PRESIDENT NULL 1981-11-17 5000.0 NULL 10
7844 TURNER SALESMAN 7698 1981-9-8 1500.0 0.0 30
7876 ADAMS CLERK 7788 1987-5-23 1100.0 NULL 20
7900 JAMES CLERK 7698 1981-12-3 950.0 NULL 30
7902 FORD ANALYST 7566 1981-12-3 3000.0 NULL 20
7934 MILLER CLERK 7782 1982-1-23 1300.0 NULL 10
Time taken: 0.164 seconds, Fetched: 14 row(s)
Standard syntax:
INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...) [IF NOT EXISTS]] select_statement1 FROM from_statement;
INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1 FROM from_statement;
Hive extension (multiple inserts):
FROM from_statement
INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...) [IF NOT EXISTS]] select_statement1
[INSERT OVERWRITE TABLE tablename2 [PARTITION ... [IF NOT EXISTS]] select_statement2]
[INSERT INTO TABLE tablename2 [PARTITION ...] select_statement2] ...;
FROM from_statement
INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1
[INSERT INTO TABLE tablename2 [PARTITION ...] select_statement2]
[INSERT OVERWRITE TABLE tablename2 [PARTITION ... [IF NOT EXISTS]] select_statement2] ...;
Hive extension (dynamic partition inserts):
INSERT OVERWRITE TABLE tablename PARTITION (partcol1[=val1], partcol2[=val2] ...) select_statement FROM from_statement;
INSERT INTO TABLE tablename PARTITION (partcol1[=val1], partcol2[=val2] ...) select_statement FROM from_statement;
官網(wǎng)又給我們列出一大堆語(yǔ)法,看著就很可怕,但是仔細(xì)整理后再來(lái)看看你會(huì)發(fā)現(xiàn)并沒(méi)有什么,下面對(duì)其進(jìn)行分析:
4 使用插入語(yǔ)法會(huì)跑mr作業(yè)。
5 multiple inserts:代表多行插入。
注:這里有兩種插語(yǔ)法,也就是加上OVERWRITE關(guān)鍵字和不加的區(qū)別。
# insert overwrite
hive> insert overwrite table emp2 select * from emp;
Query ID = hadoop_20180624141010_3063d504-ff2f-4003-843f-7dca60f1dd7e
Total jobs = 3
Launching Job 1 out of 3
...
OK
Time taken: 19.554 seconds
hive> select * from emp2;
OK
7369 SMITH CLERK 7902 1980-12-17 800.0 NULL 20
7499 ALLEN SALESMAN 7698 1981-2-20 1600.0 300.0 30
7521 WARD SALESMAN 7698 1981-2-22 1250.0 500.0 30
7566 JONES MANAGER 7839 1981-4-2 2975.0 NULL 20
7654 MARTIN SALESMAN 7698 1981-9-28 1250.0 1400.0 30
7698 BLAKE MANAGER 7839 1981-5-1 2850.0 NULL 30
7782 CLARK MANAGER 7839 1981-6-9 2450.0 NULL 10
7788 SCOTT ANALYST 7566 1987-4-19 3000.0 NULL 20
7839 KING PRESIDENT NULL 1981-11-17 5000.0 NULL 10
7844 TURNER SALESMAN 7698 1981-9-8 1500.0 0.0 30
7876 ADAMS CLERK 7788 1987-5-23 1100.0 NULL 20
7900 JAMES CLERK 7698 1981-12-3 950.0 NULL 30
7902 FORD ANALYST 7566 1981-12-3 3000.0 NULL 20
7934 MILLER CLERK 7782 1982-1-23 1300.0 NULL 10
Time taken: 0.143 seconds, Fetched: 14 row(s)
# insert追加
hive> insert into table emp2 select * from emp;
Query ID = hadoop_20180624141010_3063d504-ff2f-4003-843f-7dca60f1dd7e
Total jobs = 3
...
OK
Time taken: 18.539 seconds
hive> select * from emp2;
OK
7369 SMITH CLERK 7902 1980-12-17 800.0 NULL 20
7499 ALLEN SALESMAN 7698 1981-2-20 1600.0 300.0 30
7521 WARD SALESMAN 7698 1981-2-22 1250.0 500.0 30
7566 JONES MANAGER 7839 1981-4-2 2975.0 NULL 20
7654 MARTIN SALESMAN 7698 1981-9-28 1250.0 1400.0 30
7698 BLAKE MANAGER 7839 1981-5-1 2850.0 NULL 30
7782 CLARK MANAGER 7839 1981-6-9 2450.0 NULL 10
7788 SCOTT ANALYST 7566 1987-4-19 3000.0 NULL 20
7839 KING PRESIDENT NULL 1981-11-17 5000.0 NULL 10
7844 TURNER SALESMAN 7698 1981-9-8 1500.0 0.0 30
7876 ADAMS CLERK 7788 1987-5-23 1100.0 NULL 20
7900 JAMES CLERK 7698 1981-12-3 950.0 NULL 30
7902 FORD ANALYST 7566 1981-12-3 3000.0 NULL 20
7934 MILLER CLERK 7782 1982-1-23 1300.0 NULL 10
7369 SMITH CLERK 7902 1980-12-17 800.0 NULL 20
7499 ALLEN SALESMAN 7698 1981-2-20 1600.0 300.0 30
7521 WARD SALESMAN 7698 1981-2-22 1250.0 500.0 30
7566 JONES MANAGER 7839 1981-4-2 2975.0 NULL 20
7654 MARTIN SALESMAN 7698 1981-9-28 1250.0 1400.0 30
7698 BLAKE MANAGER 7839 1981-5-1 2850.0 NULL 30
7782 CLARK MANAGER 7839 1981-6-9 2450.0 NULL 10
7788 SCOTT ANALYST 7566 1987-4-19 3000.0 NULL 20
7839 KING PRESIDENT NULL 1981-11-17 5000.0 NULL 10
7844 TURNER SALESMAN 7698 1981-9-8 1500.0 0.0 30
7876 ADAMS CLERK 7788 1987-5-23 1100.0 NULL 20
7900 JAMES CLERK 7698 1981-12-3 950.0 NULL 30
7902 FORD ANALYST 7566 1981-12-3 3000.0 NULL 20
7934 MILLER CLERK 7782 1982-1-23 1300.0 NULL 10
Time taken: 0.132 seconds, Fetched: 28 row(s)
INSERT INTO TABLE tablename [PARTITION (partcol1[=val1], partcol2[=val2] ...)] VALUES values_row [, values_row ...]
hive> create table stu(
> id int,
> name string
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
OK
Time taken: 0.253 seconds
hive> insert into table stu values (1,"zhangsan");
Query ID = hadoop_20180624141010_3063d504-ff2f-4003-843f-7dca60f1dd7e
Total jobs = 3
...
OK
Time taken: 16.589 seconds
hive> select * from stu;
OK
1 zhangsan
Time taken: 0.123 seconds, Fetched: 1 row(s)
查詢結(jié)果可以通過(guò)語(yǔ)句插入文件系統(tǒng)中:
Standard syntax:(標(biāo)準(zhǔn)語(yǔ)法)
INSERT OVERWRITE [LOCAL] DIRECTORY directory1
[ROW FORMAT row_format] [STORED AS file_format] (Note: Only available starting with Hive 0.11.0)
SELECT ... FROM ...
Hive extension (multiple inserts):(導(dǎo)出多條記錄)
FROM from_statement
INSERT OVERWRITE [LOCAL] DIRECTORY directory1 select_statement1
[INSERT OVERWRITE [LOCAL] DIRECTORY directory2 select_statement2] ...
row_format
: DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]] [COLLECTION ITEMS TERMINATED BY char]
[MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char]
[NULL DEFINED AS char] (Note: Only available starting with Hive 0.13)
**LOCAL**:加上LOCAL關(guān)鍵字代表導(dǎo)入本地系統(tǒng),不加默認(rèn)導(dǎo)入HDFS;
**STORED AS**:可以指定存儲(chǔ)格式。
```shell
hive> insert overwrite local directory '/home/hadoop/stu' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' select * from stu;
Query ID = hadoop_20180624153131_97271014-abcf-4e70-b318-7de85d27c97f
Total jobs = 1
...
OK
Time taken: 15.09 seconds
# 結(jié)果
[hadoop@hadoop000 stu]$ pwd
/home/hadoop/stu
[hadoop@hadoop000 stu]$ cat 000000_0
1 zhangsan
# 導(dǎo)出多條記錄
hive> from emp
> INSERT OVERWRITE LOCAL DIRECTORY '/home/hadoop/tmp/hivetmp1'
> ROW FORMAT DELIMITED FIELDS TERMINATED BY "\t"
> select empno, ename
> INSERT OVERWRITE LOCAL DIRECTORY '/home/hadoop/tmp/hivetmp2'
> ROW FORMAT DELIMITED FIELDS TERMINATED BY "\t"
> select ename;
Query ID = hadoop_20180624153131_97271014-abcf-4e70-b318-7de85d27c97f
Total jobs = 1
...
OK
Time taken: 16.261 seconds
# 結(jié)果
[hadoop@hadoop000 tmp]$ cd hivetmp1
[hadoop@hadoop000 hivetmp1]$ ll
total 4
-rw-r--r-- 1 hadoop hadoop 154 Jun 24 15:39 000000_0
[hadoop@hadoop000 hivetmp1]$ cat 000000_0
7369 SMITH
7499 ALLEN
7521 WARD
7566 JONES
7654 MARTIN
7698 BLAKE
7782 CLARK
7788 SCOTT
7839 KING
7844 TURNER
7876 ADAMS
7900 JAMES
7902 FORD
7934 MILLER
[hadoop@hadoop000 hivetmp1]$ cd ..
[hadoop@hadoop000 tmp]$ cd hivetmp2
[hadoop@hadoop000 hivetmp2]$ cat 000000_0
SMITH
ALLEN
WARD
JONES
MARTIN
BLAKE
CLARK
SCOTT
KING
TURNER
ADAMS
JAMES
FORD
MILLER
參考: https://blog.csdn.net/yu0_zhang0/article/details/79007784
分享題目:Hive基礎(chǔ)sql語(yǔ)法(DML)
新聞來(lái)源:http://www.rwnh.cn/article14/jepgde.html
成都網(wǎng)站建設(shè)公司_創(chuàng)新互聯(lián),為您提供服務(wù)器托管、網(wǎng)站策劃、網(wǎng)站設(shè)計(jì)、營(yíng)銷型網(wǎng)站建設(shè)、建站公司、做網(wǎng)站
聲明:本網(wǎng)站發(fā)布的內(nèi)容(圖片、視頻和文字)以用戶投稿、用戶轉(zhuǎn)載內(nèi)容為主,如果涉及侵權(quán)請(qǐng)盡快告知,我們將會(huì)在第一時(shí)間刪除。文章觀點(diǎn)不代表本網(wǎng)站立場(chǎng),如需處理請(qǐng)聯(lián)系客服。電話:028-86922220;郵箱:631063699@qq.com。內(nèi)容未經(jīng)允許不得轉(zhuǎn)載,或轉(zhuǎn)載時(shí)需注明來(lái)源: 創(chuàng)新互聯(lián)