Environment: UCloud cloud hosts, kernel 2.6.32-431.11.15.el6.ucloud.x86_64. Assume three hosts with private IPs 10.10.1.10 (the master), 10.10.1.11, and 10.10.1.12, whose hostnames are 10-10-1-10, 10-10-1-11, and 10-10-1-12 respectively.
Configure the JDK
JDK 8 is used for this setup; the matching version can be downloaded from the Oracle website.
Assume the JDK has been extracted to /usr/local/jdk8. Run sudo vi /etc/profile
and add the following:
export JAVA_HOME=/usr/local/jdk8
export CLASSPATH=.:$JAVA_HOME/jre/lib
export PATH=$PATH:$JAVA_HOME/bin
Then run source /etc/profile
to apply the environment variables, and run java -version.
If Java version information is printed, the configuration succeeded. Do this on all three hosts.
Configure the hadoop user
In the console, run sudo useradd -m -U hadoop
to create the hadoop user, then run sudo passwd hadoop
to set its password. Run su -l hadoop
and enter the password you just set to switch to the hadoop user.
Configure SSH
This assumes you have already switched to the hadoop user. On each host, run ssh-keygen
and keep pressing Enter until it finishes. Finally, on the master host, use ssh-copy-id
to copy the key to this host and to the other two hosts, which enables passwordless login: ssh-copy-id hadoop@<host address>. A sketch of the whole sequence follows.
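A minimal sketch of the key distribution, run as the hadoop user on the master (the IPs are the ones assumed above; hostnames also work once /etc/hosts is configured):

# Generate a key pair on every host (accept the defaults)
ssh-keygen

# From the master, push the public key to itself and to the two slaves
ssh-copy-id hadoop@10.10.1.10
ssh-copy-id hadoop@10.10.1.11
ssh-copy-id hadoop@10.10.1.12

# Quick check: this should log in without asking for a password
ssh hadoop@10.10.1.11 hostname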
Configure the network
On all three hosts, add the following mappings to /etc/hosts:
10.10.1.10 master
10.10.1.11 slave1
10.10.1.12 slave2
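As a quick check that the names resolve, from any of the hosts:

ping -c 1 master
ping -c 1 slave1
ping -c 1 slave2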
Configure Hadoop
Download Hadoop; version 2.7.1 is used here. In the console, run wget http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz
and wait for the download to finish.
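Note that closer.cgi serves a mirror-selection page rather than the tarball itself, so the wget above may save an HTML file instead of the archive. If that happens, pulling the release directly from the Apache archive should work (URL assumed from the standard archive layout):

wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz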
In the console, run tar -xvf hadoop-2.7.1.tar.gz
to extract it; this creates a hadoop-2.7.1
directory.
Run cd hadoop-2.7.1/etc/hadoop
to enter the configuration directory.
In hadoop-env.sh, find the export JAVA_HOME=
line and change everything after the equals sign to the absolute JDK path configured above, which here is /usr/local/jdk8.
After the change it should read export JAVA_HOME=/usr/local/jdk8.
Save and exit.
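The same edit as a one-liner, for convenience (a sketch; it assumes you are still in the configuration directory and that the line starts with export JAVA_HOME=):

sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/local/jdk8|' hadoop-env.sh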
Edit core-site.xml
and set the configuration to:
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>fs.default.namenode</name>
        <value>hdfs://master:8082</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>4096</value>
    </property>
    <property>
        <name>hadoop.native.lib</name>
        <value>true</value>
        <description>Should native hadoop libraries, if present, be used.</description>
    </property>
</configuration>
Edit hdfs-site.xml
and change the configuration to:
<configuration>
    <property>
        <name>dfs.nameservices</name>
        <value>cluster</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:50090</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/hadoop/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/hadoop/dfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
</configuration>
Edit yarn-site.xml:
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>
</configuration>
Edit mapred-site.xml:
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobtracker.http.address</name>
        <value>master:50030</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>
</configuration>
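Note: in the Hadoop 2.7.x distribution this file normally ships only as a template, so if mapred-site.xml does not exist yet, copy it first:

cp mapred-site.xml.template mapred-site.xml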
Edit the slaves
file and add the other two hosts' IPs (see the example below).
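With the /etc/hosts mapping above, the slaves file can list either the IPs or the hostnames; a minimal sketch:

10.10.1.11
10.10.1.12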
Copy the hadoop directory over to the corresponding location on the other machines (a copy sketch follows). The steps below start running hadoop commands; if you hit a hadoop native error, see the Hadoop Native Configuration
section at the end of this article.
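A minimal sketch of the copy using scp, assuming the directory was extracted under the hadoop user's home directory on every host:

scp -r ~/hadoop-2.7.1 hadoop@slave1:~/
scp -r ~/hadoop-2.7.1 hadoop@slave2:~/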
10. Format the file system
Note: this is not a disk format; it only cleans up the dfs.namenode.name.dir and dfs.datanode.data.dir directories configured in the master's hdfs-site.xml. Switch to the Hadoop home directory and run bin/hdfs namenode -format.
11. Starting and stopping services
Running sbin/start-dfs.sh
starts the services on the master and slave nodes in one go, and sbin/start-yarn.sh
starts the YARN resource management services. To stop them, run the corresponding sbin/stop-dfs.sh
and sbin/stop-yarn.sh.
12. Starting a single DataNode
When adding a node or restarting one, you may need to start it on its own; use: sbin/hadoop-daemon.sh start datanode
To start a NodeManager: sbin/yarn-daemon.sh start nodemanager.
You can do the same for the NameNode and ResourceManager: sbin/hadoop-daemon.sh start namenode
and sbin/yarn-daemon.sh start resourcemanager.
Note: the original article used sbin/yarn-daemons.sh
and sbin/hadoop-daemons.sh,
but those did not start the daemons successfully here; dropping the trailing "s" made them work.
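As a rough sanity check after starting the services (a sketch; the exact list depends on which daemons are running and where the SecondaryNameNode is placed), jps should show the Java daemons on each host:

# On the master
jps    # expect roughly: NameNode, SecondaryNameNode, ResourceManager, Jps

# On a slave
jps    # expect roughly: DataNode, NodeManager, Jps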
Hadoop Native Configuration
Run hadoop checknative
to check the Hadoop native library version and related dependency information:
16/03/10 12:17:56 DEBUG util.NativeCodeLoader: Trying to load the custom-built native-hadoop library...
16/03/10 12:17:56 DEBUG util.NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: /home/hadoop/hadoop-2.6.3/lib/native/libhadoop.so.1.0.0: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /home/hadoop/hadoop-2.6.3/lib/native/libhadoop.so.1.0.0)
16/03/10 12:17:56 DEBUG util.NativeCodeLoader: java.library.path=/home/hadoop/hadoop-2.6.3/lib/native
16/03/10 12:17:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/10 12:17:56 DEBUG util.Shell: setsid exited with exit code 0
Native library checking:
hadoop: false
zlib: false
snappy: false
lz4: false
bzip2: false
openssl: false
16/03/10 12:17:56 INFO util.ExitUtil: Exiting with status 1
The message /lib64/libc.so.6: version `GLIBC_2.14' not found shows that this Hadoop build requires glibc 2.14. Install the required version as follows:
mkdir glib_build && cd glib_build
wget http://ftp.gnu.org/gnu/glibc/glibc-2.14.tar.gz && wget http://ftp.gnu.org/gnu/glibc/glibc-linuxthreads-2.5.tar.bz2
tar zxf glibc-2.14.tar.gz && cd glibc-2.14 && tar jxf ../glibc-linuxthreads-2.5.tar.bz2
cd ../ && export CFLAGS="-g -O2" && ./glibc-2.14/configure --prefix=/usr --disable-profile --enable-add-ons --with-headers=/usr/include --with-binutils=/usr/bin
make
make install
make install ends with the following error:
CC="gcc -B/usr/bin/" /usr/bin/perl scripts/test-installation.pl /root/
/usr/bin/ld: cannot find -lnss_test1
collect2: ld returned 1 exit status
Execution of gcc -B/usr/bin/ failed!
The script has found some problems with your installation!
Please read the FAQ and the README file and check the following:
- Did you change the gcc specs file (necessary after upgrading from Linux libc5)?
- Are there any symbolic links of the form libXXX.so to old libraries?
  Links like libm.so -> libm.so.5 (where libm.so.5 is an old library) are wrong,
  libm.so should point to the newly installed glibc file - and there should be
  only one such link (check e.g. /lib and /usr/lib)
You should restart this script from your build directory after you've fixed all problems!
Btw. the script doesn't work if you're installing GNU libc not as your primary library!
make[1]: *** [install] Error 1
make[1]: Leaving directory `/root/glibc-2.14'
make: *** [install] Error 2
This can be ignored. To verify the result, run ls -l /lib64/libc.so.6:
lrwxrwxrwx 1 root root 12 Mar 10 12:12 /lib64/libc.so.6 -> libc-2.14.so
Output containing /lib64/libc.so.6 -> libc-2.14.so means the upgrade succeeded.
Install openssl: yum install openssl-static.x86_64
How to change the hostname
Edit /etc/sysconfig/network,
then run /etc/rc.d/init.d/network restart
to restart the network service.
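On this CentOS 6 based image the file looks roughly like the sketch below; HOSTNAME is the value to change (the hostname shown is just this setup's master, as an example):

NETWORKING=yes
HOSTNAME=10-10-1-10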
SecondaryNameNode configuration
Edit the masters file (create it if it does not exist) and add one hostname; that host will act as the SecondaryNameNode.
In hdfs-site.xml, remove the existing dfs.namenode.secondary.http-address
entry and add the following new configuration (change the IP to your own):
<property>
    <name>dfs.http.address</name>
    <value>master:50070</value>
    <description>
        The address and the base port where the dfs namenode web ui will listen on.
        If the port is 0 then the server will start on a free port.
    </description>
</property>
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>10.10.1.11</value>
</property>
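For this layout, a minimal sketch of the masters file (in etc/hadoop on the master), assuming 10.10.1.11 (slave1) is the host that will run the SecondaryNameNode:

10.10.1.11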
References:
Hadoop-2.5.2 cluster installation and configuration in detail
Configuring the NameNode and SecondaryNameNode on separate machines with Hadoop 2.2