1. Preparation
1.1 Install a CentOS 7 virtual machine on VMware
1.2 System configuration
Configure the network
# vi /etc/sysconfig/network-scripts/ifcfg-ens33
BOOTPROTO=static
ONBOOT=yes
IPADDR=192.168.120.131
GATEWAY=192.168.120.2
NETMASK=255.255.255.0
DNS1=8.8.8.8
DNS2=8.8.4.4
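Before restarting the network it can help to confirm that the file really describes a static, boot-time interface. The sketch below is only an illustration: `ifcfg_get` and `check_static_ifcfg` are hypothetical helper names, not part of CentOS.

```shell
# Print the value of KEY from a KEY=value file such as an ifcfg file.
ifcfg_get() {
    local file=$1 key=$2
    # Split on "=", match the key exactly, and drop optional double quotes.
    awk -F= -v k="$key" '$1 == k { v = $2; gsub(/"/, "", v); print v }' "$file"
}

# Succeed only if the file configures a static address that comes up at boot.
check_static_ifcfg() {
    local file=$1
    [ "$(ifcfg_get "$file" BOOTPROTO)" = "static" ] &&
        [ "$(ifcfg_get "$file" ONBOOT)" = "yes" ]
}

# Example (run as root on the VM):
#   check_static_ifcfg /etc/sysconfig/network-scripts/ifcfg-ens33 \
#       && systemctl restart network
```

After the restart, `ip addr show ens33` should list the address set in IPADDR.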
1.3 Set the hostname
# hostnamectl set-hostname master1
# hostname master1
1.4 Set the time zone (if it is not already Asia/Shanghai)
# ll /etc/localtime
lrwxrwxrwx. 1 root root 35 Jun  4 19:25 /etc/localtime -> ../usr/share/zoneinfo/Asia/Shanghai
If the time zone is wrong, change it as follows:
# ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
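The check above can also be scripted. This is a minimal sketch, assuming /etc/localtime is a symlink into a zoneinfo directory as shown; `zone_of` is a hypothetical helper name. On CentOS 7, `timedatectl set-timezone Asia/Shanghai` is an alternative to creating the link by hand.

```shell
# Print the zone name a localtime-style symlink points to,
# e.g. "Asia/Shanghai" for ../usr/share/zoneinfo/Asia/Shanghai.
zone_of() {
    local target
    target=$(readlink "$1") || return 1
    # Drop everything up to and including "zoneinfo/".
    printf '%s\n' "${target#*zoneinfo/}"
}

# Example:
#   [ "$(zone_of /etc/localtime)" = "Asia/Shanghai" ] || echo "time zone needs fixing"
```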
1.5 Upload the packages
hadoop-2.9.1.tar
jdk-8u171-linux-x64.tar
2. Build the environment
2.1 Create the user and group
[root@master1 ~]# groupadd hadoop
[root@master1 ~]# useradd -g hadoop hadoop
[root@master1 ~]# passwd hadoop
2.2 Unpack the packages
Switch to the hadoop user
[root@master1 ~]# su hadoop
Create a directory to hold the packages
[hadoop@master1 root]$ cd
[hadoop@master1 ~]$ mkdir src
[hadoop@master1 ~]$ mv *.tar src
Unpack the archives
[hadoop@master1 ~]$ cd src
[hadoop@master1 src]$ tar -xf jdk-8u171-linux-x64.tar -C ../
[hadoop@master1 src]$ tar xf hadoop-2.9.1.tar -C ../
[hadoop@master1 src]$ cd
[hadoop@master1 ~]$ mv jdk1.8.0_171 jdk
[hadoop@master1 ~]$ mv hadoop-2.9.1 hadoop
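The rename steps depend on knowing the top-level directory inside each archive (jdk1.8.0_171 and hadoop-2.9.1 here). The following sketch reads that name from the archive instead of hard-coding it; `top_dir` and `unpack_as` are hypothetical helper names.

```shell
# Print the top-level directory name inside a tar archive.
top_dir() {
    tar -tf "$1" | head -n 1 | cut -d/ -f1
}

# Extract ARCHIVE into PARENT and rename its top-level directory to NAME.
unpack_as() {
    local archive=$1 parent=$2 name=$3 top
    top=$(top_dir "$archive") || return 1
    tar -xf "$archive" -C "$parent" || return 1
    # Nothing to rename if the archive already uses the wanted name.
    [ "$top" = "$name" ] || mv "$parent/$top" "$parent/$name"
}

# Example:
#   unpack_as ~/src/jdk-8u171-linux-x64.tar ~ jdk
#   unpack_as ~/src/hadoop-2.9.1.tar ~ hadoop
```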
2.3 Configure environment variables
[hadoop@master1 ~]$ vi .bashrc
export JAVA_HOME=/home/hadoop/jdk
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_HOME=/home/hadoop/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
Apply the configuration
[hadoop@master1 ~]$ source .bashrc
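A quick way to confirm that the exports took effect is to list any variable that is still empty. This is just a sketch; `missing_vars` is a hypothetical helper name.

```shell
# Print the name of every listed variable that is unset or empty.
missing_vars() {
    for name in "$@"; do
        # eval copies the value of the variable whose name is in $name.
        eval "value=\${$name:-}"
        [ -n "$value" ] || printf '%s\n' "$name"
    done
}

# Example:
#   missing_vars JAVA_HOME HADOOP_HOME HADOOP_COMMON_LIB_NATIVE_DIR
```

No output means every variable passed in is set.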
Verify
[hadoop@master1 ~]$ java -version
java version "1.8.0_171"
Java(TM) SE Runtime Environment (build 1.8.0_171-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.171-b11, mixed mode)
[hadoop@master1 ~]$ hadoop version
Hadoop 2.9.1
Subversion https://github.com/apache/hadoop.git -r e30710aea4e6e55e69372929106cf119af06fd0e
Compiled by root on 2018-04-16T09:33Z
Compiled with protoc 2.5.0
From source with checksum 7d6d2b655115c6cc336d662cc2b919bd
This command was run using /home/hadoop/hadoop/share/hadoop/common/hadoop-common-2.9.1.jar
2.4 Edit the Hadoop configuration files
[hadoop@master1 ~]$ cd hadoop/etc/hadoop/
[hadoop@master1 hadoop]$ vi hadoop-env.sh
export JAVA_HOME=/home/hadoop/jdk
[hadoop@master1 hadoop]$ vi core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://192.168.120.131:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/data/hadoop/hadoop_tmp_dir</value>
</property>
</configuration>
Notes:
fs.defaultFS: the address clients use to reach the NameNode over the hdfs protocol. It can be a host plus port, or the name of a NameNode service (which can consist of several NameNodes for HA).
hadoop.tmp.dir: the directory where Hadoop stores temporary files while it is running.
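To double-check what was written, a value can be pulled back out of the file. The sketch below assumes the flat layout used above, with each <name> and <value> element on a single line; `get_property` is a hypothetical helper, not a Hadoop tool. On a working installation, `hdfs getconf -confKey fs.defaultFS` reports the value Hadoop actually resolves.

```shell
# Print the <value> that follows <name>NAME</name> in a Hadoop *-site.xml file.
get_property() {
    local file=$1 name=$2
    awk -v want="$name" '
        # Remember the most recent property name seen.
        /<name>/  { n = $0; sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n) }
        # Print the value only when it belongs to the wanted name.
        /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                    if (n == want) { print v; exit } }
    ' "$file"
}

# Example:
#   get_property ~/hadoop/etc/hadoop/core-site.xml fs.defaultFS
```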
[hadoop@master1 hadoop]$ vi hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
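Notes:
dfs.replication: the number of HDFS replicas. An uploaded file is split into blocks, and this is how many copies of each block are stored. The default is 3.
Setting the two parameters below made the DataNode fail to start, so I left them unset (they then default to directories under hadoop.tmp.dir). I have not fully worked out the cause, although the log further down reports mismatched clusterIDs, which usually means the NameNode was reformatted while the DataNode directory still held data from an earlier format.
dfs.namenode.name.dir: where the NameNode stores its data, that is, the metadata describing the files in HDFS.
dfs.datanode.data.dir: where the DataNode stores its data, that is, the blocks themselves.
The exception is shown below: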
[hadoop@master1 logs]$ pwd
/home/hadoop/hadoop/logs
[hadoop@master1 logs]$ tail -f hadoop-hadoop-datanode-master1.log
2018-06-12 22:30:14,749 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage directory [DISK]file:/data/hadoop/hdfs/dn/
java.io.IOException: Incompatible clusterIDs in /data/hadoop/hdfs/dn: namenode clusterID = CID-5bbc555b-4622-4781-9a7f-c2e5131e4869; datanode clusterID = CID-29ec402d-95f8-4148-8d18-f7e4b965be4f
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:760)
2018-06-12 22:30:14,752 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid f39576ae-b7af-44aa-841a-48ba03b956f4) service to master1/192.168.120.131:9000. Exiting.
java.io.IOException: All specified directories have failed to load.
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:557)
2018-06-12 22:30:14,753 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid f39576ae-b7af-44aa-841a-48ba03b956f4) service to master1/192.168.120.131:9000
2018-06-12 22:30:14,854 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid f39576ae-b7af-44aa-841a-48ba03b956f4)
2018-06-12 22:30:16,855 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2018-06-12 22:30:16,916 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at master1/192.168.120.131
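The "Incompatible clusterIDs" message in this log can be checked directly: each storage directory keeps a current/VERSION file containing a clusterID line. The sketch below compares two such files; `cluster_id` and `same_cluster` are hypothetical helper names, and the paths in the example assume the nn and dn directories used in this article.

```shell
# Print the clusterID recorded in a Hadoop storage VERSION file.
cluster_id() {
    sed -n 's/^clusterID=//p' "$1"
}

# Succeed only if both VERSION files record the same clusterID.
same_cluster() {
    [ "$(cluster_id "$1")" = "$(cluster_id "$2")" ]
}

# Example:
#   same_cluster /data/hadoop/hdfs/nn/current/VERSION \
#                /data/hadoop/hdfs/dn/current/VERSION \
#       || echo "DataNode directory belongs to an older format"
```

On a throwaway test setup, a common remedy is to stop HDFS, empty the DataNode directory, and start again; note that this discards any blocks stored there.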
[hadoop@master1 hadoop]$ cp mapred-site.xml.template mapred-site.xml
[hadoop@master1 hadoop]$ vi mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Notes:
mapreduce.framework.name: runs MapReduce on YARN. In Hadoop 2, MapReduce (MR) also runs on top of YARN.
[hadoop@master1 hadoop]$ vi yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<!-- Address of the ResourceManager -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>192.168.120.131</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
Notes:
yarn.resourcemanager.hostname: the address used for IPC with the YARN ResourceManager; it can be an IP address or a hostname.
yarn.nodemanager.aux-services: the shuffle service the cluster provides to MapReduce programs.
2.5 Create the directories and set ownership
[hadoop@master1 hadoop]$ exit
[root@master1 ~]# mkdir -p /data/hadoop/hadoop_tmp_dir
[root@master1 ~]# mkdir -p /data/hadoop/hdfs/{nn,dn}
[root@master1 ~]# chown -R hadoop:hadoop /data
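Since wrong directory permissions are a likely cause of startup failures, it is worth confirming as the hadoop user that each directory exists and is writable. A minimal sketch; `unusable_dirs` is a hypothetical helper name.

```shell
# Print every listed path that is not an existing directory writable by the current user.
unusable_dirs() {
    for dir in "$@"; do
        if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
            printf '%s\n' "$dir"
        fi
    done
}

# Example (run as the hadoop user, not as root, since root can write anywhere):
#   unusable_dirs /data/hadoop/hadoop_tmp_dir /data/hadoop/hdfs/nn /data/hadoop/hdfs/dn
```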
3. Format the file system and start the services
3.1 Format the file system
[root@master1 ~]# su hadoop
[hadoop@master1 ~]$ cd hadoop/bin
[hadoop@master1 bin]$ ./hdfs namenode -format
Note:
In a cluster, format HDFS only on the master node.
3.2 Start HDFS
[hadoop@master1 bin]$ cd ../sbin
[hadoop@master1 sbin]$ ./start-dfs.sh
Note:
In a cluster, this can be run from any node.
If a service fails to start even though the configuration is correct, the likely cause is the permissions on the directories created earlier.
3.3 Start YARN
[hadoop@master1 sbin]$ ./start-yarn.sh
Note:
In a cluster, run this only on the master node.
Check the service status
[hadoop@master1 sbin]$ jps
6708 NameNode
6966 SecondaryNameNode
6808 DataNode
7116 Jps
5791 ResourceManager
5903 NodeManager
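Rather than reading the jps listing by eye, the check can be scripted. The sketch below takes jps output on standard input and prints whichever of the five daemons shown above are absent; `missing_daemons` is a hypothetical helper name.

```shell
# Read `jps` output on stdin and print each expected daemon that is not running.
missing_daemons() {
    local seen daemon
    # Keep only the process-name column.
    seen=$(awk '{ print $2 }')
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # -x matches whole lines, so NameNode does not match SecondaryNameNode.
        printf '%s\n' "$seen" | grep -qx "$daemon" || printf '%s\n' "$daemon"
    done
}

# Example:
#   jps | missing_daemons
```

An empty result means all five daemons are up.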
3.4 Check the service status in a browser
View the HDFS status in the web UI
Open the following address in a browser:
http://192.168.120.131:50070
View the YARN status in the web UI
Open the following address in a browser:
http://192.168.120.131:8088
4. Enable passwordless SSH
Starting the services above still prompts for the user's login password, as shown below:
[hadoop@master1 sbin]$ ./start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/hadoop/logs/yarn-hadoop-resourcemanager-master1.out
hadoop@localhost's password:
To start the services without being prompted for a password, set up SSH keys:
[hadoop@master1 sbin]$ cd ~/.ssh/
[hadoop@master1 .ssh]$ ll
total 4
-rw-r--r--. 1 hadoop hadoop 372 Jun 12 18:36 known_hosts
[hadoop@master1 .ssh]$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:D14LpPKZbih0K+kVoTl23zGsKK1xOVlNuSugDvrkjJA hadoop@master1
The key's randomart image is:
+---[RSA 2048]----+
| |
| . |
| . + |
| o . * . |
| = = o S . |
| o.=.@ * O . |
|E.=oOoB + o |
|oB+*oo.. |
|ooBo .. |
+----[SHA256]-----+
Just press Enter at every prompt.
[hadoop@master1 .ssh]$ ll
total 12
-rw-------. 1 hadoop hadoop 1675 Jun 12 18:46 id_rsa
-rw-r--r--. 1 hadoop hadoop 396 Jun 12 18:46 id_rsa.pub
-rw-r--r--. 1 hadoop hadoop 372 Jun 12 18:36 known_hosts
[hadoop@master1 .ssh]$ cat id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@master1 .ssh]$ ll
total 16
-rw-rw-r--. 1 hadoop hadoop 396 Jun 12 18:47 authorized_keys
-rw-------. 1 hadoop hadoop 1675 Jun 12 18:46 id_rsa
-rw-r--r--. 1 hadoop hadoop 396 Jun 12 18:46 id_rsa.pub
-rw-r--r--. 1 hadoop hadoop 372 Jun 12 18:36 known_hosts
If you are still prompted for a password, the cause is the file permissions (authorized_keys above is group-writable, which sshd rejects by default); tighten them:
[hadoop@master1 .ssh]$ chmod 600 authorized_keys
Passwordless login now works:
[hadoop@master1 .ssh]$ ssh localhost
Last login: Tue Jun 12 18:48:38 2018 from fe80::e961:7d5b:6a72:a2a9%ens33
[hadoop@master1 ~]$
Passwordless login can also be set up another way.
After running ssh-keygen,
run the following command:
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@master1
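The permission fix can be wrapped up so it is not forgotten. With its default StrictModes setting, sshd ignores authorized_keys when the file or the .ssh directory is writable by group or others. This is a sketch only; `fix_ssh_perms` is a hypothetical helper name.

```shell
# Restrict an .ssh directory and its authorized_keys file to the owner.
fix_ssh_perms() {
    local dir=$1
    chmod 700 "$dir" && chmod 600 "$dir/authorized_keys"
}

# Example, including a non-interactive key generation with an empty passphrase:
#   ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
#   cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
#   fix_ssh_perms ~/.ssh
```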
5. Basic file system usage and some problems encountered
5.1 Create a directory
Create a directory in the file system
[hadoop@master1 bin]$ hdfs dfs -mkdir -p /user/hadoop
18/06/12 21:25:31 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
List the directory that was created
[hadoop@master1 bin]$ hdfs dfs -ls /
18/06/12 21:29:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2018-06-12 21:25 /user
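5.2 Resolve the warning
The WARN message does not affect normal use of Hadoop.
There are two ways to get rid of it: rebuild the native library from source, or suppress the message in the logging configuration. I used the second.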
[hadoop@master1 ]$ cd /home/hadoop/hadoop/etc/hadoop/
[hadoop@master1 hadoop]$ vi log4j.properties
Add the following:
#native WARN
log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR
The warning is gone:
[hadoop@master1 hadoop]$ hdfs dfs -ls /
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2018-06-12 21:25 /user
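If the edit is scripted instead of made in vi, it helps to make it idempotent so that repeated runs do not pile up duplicate lines. A minimal sketch; `ensure_line` is a hypothetical helper name.

```shell
# Append LINE to FILE unless an identical line is already present.
ensure_line() {
    local file=$1 line=$2
    # -x matches whole lines, -F treats the text literally rather than as a regex.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example:
#   ensure_line ~/hadoop/etc/hadoop/log4j.properties \
#       'log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR'
```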
5.3 Upload files to HDFS
[hadoop@master1 bin]$ hdfs dfs -mkdir -p input
[hadoop@master1 hadoop]$ hdfs dfs -put /home/hadoop/hadoop/etc/hadoop input
Hadoop ships with a rich set of examples, including wordcount, terasort, join, and grep. Run the following command to list them:
[hadoop@master1 bin]$ hadoop jar /home/hadoop/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.1.jar
An example program must be given as the first argument.
Valid program names are:
aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
dbcount: An example job that count the pageview counts from a database.
distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
grep: A map/reduce program that counts the matches of a regex in the input.
join: A job that effects a join over sorted, equally partitioned datasets
multifilewc: A job that counts words from several files.
pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
randomwriter: A map/reduce program that writes 10GB of random data per node.
secondarysort: An example defining a secondary sort to the reduce.
sort: A map/reduce program that sorts the data written by the random writer.
sudoku: A sudoku solver.
teragen: Generate data for the terasort
terasort: Run the terasort
teravalidate: Checking results of terasort
wordcount: A map/reduce program that counts the words in the input files.
wordmean: A map/reduce program that counts the average length of the words in the input files.
wordmedian: A map/reduce program that counts the median length of the words in the input files.
wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.
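A MapReduce job runs the same way in pseudo-distributed mode as in standalone mode; the difference is that pseudo-distributed mode reads its files from HDFS (to verify this, delete the local input directory and the output directory created in the standalone steps).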
[hadoop@master1 sbin]$ hadoop jar /home/hadoop/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.1.jar grep input output 'dfs[a-z]+'
18/06/12 22:57:05 INFO client.RMProxy: Connecting to ResourceManager at /192.168.120.131:8032
18/06/12 22:57:07 INFO input.FileInputFormat: Total input files to process : 30
(output omitted)
18/06/12 22:57:08 INFO mapreduce.Job: Running job: job_1528815135795_0001
18/06/12 22:57:23 INFO mapreduce.Job: Job job_1528815135795_0001 running in uber mode : false
18/06/12 22:57:23 INFO mapreduce.Job: map 0% reduce 0%
18/06/12 22:58:02 INFO mapreduce.Job: map 13% reduce 0%
(output omitted)
18/06/12 23:00:17 INFO mapreduce.Job: map 97% reduce 32%
18/06/12 23:00:18 INFO mapreduce.Job: map 100% reduce 32%
18/06/12 23:00:19 INFO mapreduce.Job: map 100% reduce 100%
18/06/12 23:00:20 INFO mapreduce.Job: Job job_1528815135795_0001 completed successfully
18/06/12 23:00:20 INFO mapreduce.Job: Counters: 50
File System Counters
FILE: Number of bytes read=46
FILE: Number of bytes written=6136681
FILE: Number of read operations=0
(output omitted)
File Input Format Counters
Bytes Read=138
File Output Format Counters
Bytes Written=24
View the results
[hadoop@master1 sbin]$ hdfs dfs -cat output/*
1 dfsmetrics
1 dfsadmin
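To get a feel for what the grep example computes, roughly the same count can be produced on local files with standard tools. This is only a local approximation for comparison, not how the MapReduce job is implemented; `count_matches` is a hypothetical helper name.

```shell
# Count how often each string matching PATTERN occurs in the given files,
# most frequent first, printed as "count<TAB>match".
count_matches() {
    local pattern=$1
    shift
    # -o prints each match on its own line, -h drops file names, -E enables "+".
    grep -ohE -- "$pattern" "$@" | sort | uniq -c | sort -rn |
        awk '{ print $1 "\t" $2 }'
}

# Example:
#   count_matches 'dfs[a-z]+' /home/hadoop/hadoop/etc/hadoop/*.xml
```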
Copy the results to the local file system
[hadoop@master1 sbin]$ hdfs dfs -get output /data
[hadoop@master1 sbin]$ ll /data
total 0
drwxrwxrwx. 5 hadoop hadoop 52 Jun 12 19:20 hadoop
drwxrwxr-x. 2 hadoop hadoop 42 Jun 12 23:03 output
[hadoop@master1 sbin]$ cat /data/output/*
1 dfsmetrics
1 dfsadmin
6. Start the history server
The history server lets you review how jobs ran from the web UI.
[hadoop@master1 sbin]$ mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /home/hadoop/hadoop/logs/mapred-hadoop-historyserver-master1.out
[hadoop@master1 sbin]$ jps
19985 Jps
15778 ResourceManager
15890 NodeManager
14516 NameNode
14827 SecondaryNameNode
19948 JobHistoryServer
14653 DataNode
When you are starting out, keep the configuration as simple as possible; it makes troubleshooting much easier.
References:
https://www.cnblogs.com/wangxin37/p/6501484.html
https://www.cnblogs.com/xing901022/p/5713585.html
Article title: Setting up a Hadoop 2.9.1 pseudo-distributed environment and basic file system operations