Hadoop High-Availability Cluster Setup (Hadoop HA)

Author: Leo


Edited: Wed Jan 1, 2020 at 10:13


If Hadoop was installed on these machines before, clean up first:

  1. Delete the masters file

  2. Clear everything under the temp dir /opt/hadoop-2.6/*

    [root@secondnamenode ~]# rm -rf /usr/local/hadoop-2.6.0/etc/hadoop/masters
    [root@secondnamenode ~]# rm -rf /opt/hadoop-2.6/*
    

HA node layout

namenode        NameNode (Active), ZooKeeper

secondnamenode  NameNode (Standby), ZooKeeper

datanode1       DataNode, JournalNode, ZooKeeper

datanode2       DataNode, JournalNode

datanode3       DataNode, JournalNode

ZooKeeper runs on namenode, secondnamenode and datanode1 only, matching the ha.zookeeper.quorum setting and the jps output shown later.

Preparation:

  1. Configure hadoop-env.sh and synchronize the clocks on every machine in the cluster

  2. Set up passwordless SSH from the master node (namenode) to the other nodes (the master starts the daemons on the other nodes; without passwordless login you will be prompted for the password over and over)

  3. Install Hadoop on every machine

  4. Set up passwordless SSH between the two namenodes in both directions (needed for Active/Standby hot failover)

  5. Install ZooKeeper and start it
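The passwordless-login steps above (2 and 4) can be sketched as follows; the hostnames are this cluster's, and the commands are run as root on the node that needs outbound access:

```shell
# Generate a key pair without a passphrase (skip if ~/.ssh/id_rsa exists).
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa

# Push the public key to every node the master must log into.
# For step 4, repeat on secondnamenode with namenode in the list, so the
# two namenodes can reach each other for failover fencing.
for host in namenode secondnamenode datanode1 datanode2 datanode3; do
    ssh-copy-id root@"$host"
done

# Verify: this should log in without a password prompt.
ssh root@datanode1 hostname
```

This sketch assumes the hosts resolve via /etc/hosts or DNS as named in the layout above.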

Configuration

  1. Edit hdfs-site.xml

  2. Edit core-site.xml

  3. Configure the slaves file

  4. Copy the configuration files to every machine in the cluster (JournalNodes, DataNodes, NameNodes), then start the JournalNodes one node at a time: /usr/local/hadoop-2.6.0/sbin/hadoop-daemon.sh start journalnode

  5. Confirm each JournalNode started successfully by checking its log, e.g. tail -n 200 /usr/local/hadoop-2.6.0/logs/hadoop-root-journalnode-datanode2.log; no Java exceptions in the output means it is up

  6. On either namenode, format HDFS: hdfs namenode -format (the JournalNodes must be running before you format)

  7. Initialize the HA state in ZooKeeper: hdfs zkfc -formatZK

  8. Start HDFS: /usr/local/hadoop-2.6.0/sbin/start-dfs.sh

  9. Sync the fsimage to the Standby: on the standby (still unformatted) namenode, run hdfs namenode -bootstrapStandby

  10. Start the Standby namenode: /usr/local/hadoop-2.6.0/sbin/hadoop-daemon.sh start namenode

  11. Stop HDFS: stop-dfs.sh

  12. Start HDFS: start-dfs.sh
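The first-time startup order from steps 4–10 above, condensed into one sequence (paths as used throughout this guide; run each command on the host named in the comment):

```shell
HADOOP_HOME=/usr/local/hadoop-2.6.0

# On datanode1..3: start the JournalNodes first.
$HADOOP_HOME/sbin/hadoop-daemon.sh start journalnode

# On one namenode (e.g. namenode): format HDFS -- only after the
# JournalNodes are up, otherwise the shared edits dir cannot be written.
hdfs namenode -format

# On the same namenode: initialize the HA state in ZooKeeper.
hdfs zkfc -formatZK

# Start HDFS.
$HADOOP_HOME/sbin/start-dfs.sh

# On the other namenode (secondnamenode): copy the fsimage from the
# formatted namenode, then start it as the Standby.
hdfs namenode -bootstrapStandby
$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
```

The ordering matters: format after the JournalNodes are up, and bootstrapStandby only on the namenode that was not formatted.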

jps output on each node

namenode

[root@namenode hadoop]# jps
2067 DFSZKFailoverController
2131 Jps
1212 QuorumPeerMain
1789 NameNode

secondnamenode

[root@secondnamenode .ssh]# /usr/local/hadoop-2.6.0/sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /usr/local/hadoop-2.6.0/logs/hadoop-root-namenode-secondnamenode.out
[root@secondnamenode .ssh]# jps
1782 Jps
1577 DFSZKFailoverController
1211 QuorumPeerMain
1688 NameNode

datanode1

[root@datanode1 logs]# jps
1669 DataNode
1760 Jps
1596 JournalNode
1189 QuorumPeerMain

datanode2

[root@datanode2 ~]# jps
1545 JournalNode
1709 Jps
1618 DataNode

datanode3

[root@datanode3 hadoop]# jps
1616 DataNode
1543 JournalNode
1707 Jps

Testing automatic failover

  1. Kill the Active namenode's process

  2. Watch the Standby namenode: its state should change to Active. If it does not, something is wrong; check the logs to find out why

  3. Restart the namenode you just killed ( /usr/local/hadoop-2.6.0/sbin/hadoop-daemon.sh start namenode ). Once it is up you will see it come back as Standby; ZooKeeper manages all of this
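Instead of watching the web UI, the Active/Standby state can be queried from the command line; `hdfs haadmin -getServiceState` takes the namenode IDs defined in dfs.ha.namenodes.lixin (nn1, nn2):

```shell
# Ask each namenode for its current HA state.
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# Simulate a failure: on the Active namenode, kill its process, then
# re-run the commands above -- the other namenode should report "active".
kill -9 "$(jps | awk '/NameNode/ {print $1}')"
```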

Stopping the HDFS cluster

[root@namenode hadoop]# /usr/local/hadoop-2.6.0/sbin/stop-dfs.sh 
17/02/23 17:50:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [namenode secondnamenode]
namenode: stopping namenode
secondnamenode: stopping namenode
datanode2: stopping datanode
datanode1: stopping datanode
datanode3: stopping datanode
Stopping journal nodes [datanode1 datanode2 datanode3]
datanode3: stopping journalnode
datanode2: stopping journalnode
datanode1: stopping journalnode
17/02/23 17:50:28 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping ZK Failover Controllers on NN hosts [namenode secondnamenode]
namenode: stopping zkfc
secondnamenode: stopping zkfc

Starting the HDFS cluster

[root@namenode hadoop]# /usr/local/hadoop-2.6.0/sbin/start-dfs.sh 
17/02/23 17:51:15 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [namenode secondnamenode]
secondnamenode: starting namenode, logging to /usr/local/hadoop-2.6.0/logs/hadoop-root-namenode-secondnamenode.out
namenode: starting namenode, logging to /usr/local/hadoop-2.6.0/logs/hadoop-root-namenode-namenode.out
datanode2: starting datanode, logging to /usr/local/hadoop-2.6.0/logs/hadoop-root-datanode-datanode2.out
datanode3: starting datanode, logging to /usr/local/hadoop-2.6.0/logs/hadoop-root-datanode-datanode3.out
datanode1: starting datanode, logging to /usr/local/hadoop-2.6.0/logs/hadoop-root-datanode-datanode1.out
Starting journal nodes [datanode1 datanode2 datanode3]
datanode3: starting journalnode, logging to /usr/local/hadoop-2.6.0/logs/hadoop-root-journalnode-datanode3.out
datanode1: starting journalnode, logging to /usr/local/hadoop-2.6.0/logs/hadoop-root-journalnode-datanode1.out
datanode2: starting journalnode, logging to /usr/local/hadoop-2.6.0/logs/hadoop-root-journalnode-datanode2.out
17/02/23 17:51:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting ZK Failover Controllers on NN hosts [namenode secondnamenode]
secondnamenode: starting zkfc, logging to /usr/local/hadoop-2.6.0/logs/hadoop-root-zkfc-secondnamenode.out
namenode: starting zkfc, logging to /usr/local/hadoop-2.6.0/logs/hadoop-root-zkfc-namenode.out
[root@namenode hadoop]#

Note: the ZooKeeper ensemble must be running before you start HDFS.
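A minimal sketch of that startup order, assuming a standard ZooKeeper installation with zkServer.sh on the PATH:

```shell
# On each ZooKeeper node (namenode, secondnamenode, datanode1,
# per ha.zookeeper.quorum):
zkServer.sh start

# Confirm the ensemble is healthy (one "leader", the rest "follower")
# before starting HDFS.
zkServer.sh status

# Then, from the master:
/usr/local/hadoop-2.6.0/sbin/start-dfs.sh
```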

This concludes the HA HDFS cluster setup guide.

References:

http://hadoop.apache.org/docs/r2.6.5/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html

Configuration files:

hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at
    http://www.apache.org/licenses/LICENSE-2.0
  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
  <name>dfs.nameservices</name>
  <value>lixin</value>
</property>
<property>
  <name>dfs.ha.namenodes.lixin</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.lixin.nn1</name>
  <value>namenode:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.lixin.nn2</name>
  <value>secondnamenode:8020</value>
</property>
<property>
  <name>dfs.namenode.http-address.lixin.nn1</name>
  <value>namenode:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.lixin.nn2</name>
  <value>secondnamenode:50070</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://datanode1:8485;datanode2:8485;datanode3:8485/mycluster-edits</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.lixin</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/root/.ssh/id_rsa</value>
</property>
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/opt/journalnode</value>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
</configuration>

dfs.nameservices: the logical name of the HDFS nameservice (the namenode cluster)

dfs.ha.namenodes.lixin: a unique ID for each namenode within the nameservice

dfs.namenode.rpc-address.lixin.nn1: the RPC host and port of each namenode

dfs.namenode.http-address.lixin.nn1: the HTTP host and port of each namenode

dfs.namenode.shared.edits.dir: the JournalNode group URI where the NameNodes write and read the shared edits log

dfs.client.failover.proxy.provider.lixin: the class HDFS clients use to determine which namenode is currently Active

dfs.ha.fencing.methods: how the previously Active namenode is fenced during a failover; sshfence logs into that node over SSH and kills the namenode process

dfs.ha.fencing.ssh.private-key-files: the private key sshfence uses to log in

dfs.journalnode.edits.dir: the local directory where each JournalNode stores its edits files

dfs.ha.automatic-failover.enabled: whether automatic failover is enabled; it must be true so that when the Active namenode dies the Standby takes over automatically

core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at
    http://www.apache.org/licenses/LICENSE-2.0
  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://lixin</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/hadoop-2.6</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>namenode:2181,secondnamenode:2181,datanode1:2181</value>
    </property>
</configuration>

fs.defaultFS: the default filesystem URI; it points at the nameservice name (hdfs://lixin) rather than a single host, so clients always reach the current Active namenode

hadoop.tmp.dir: the base directory for Hadoop data (default /tmp); it must be set, since the namenode fsimage is stored under it

ha.zookeeper.quorum: the ZooKeeper ensemble hosts and their client ports
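For completeness, the slaves file from step 3 of the configuration section (etc/hadoop/slaves) simply lists the three datanodes, one hostname per line:

```
datanode1
datanode2
datanode3
```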

Admin web UI:

http://namenode:50070
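The Standby's UI is available at http://secondnamenode:50070 as well, and each namenode reports its HA state over HTTP; the /jmx endpoint and the NameNodeStatus bean used below are assumed to be present in this Hadoop version:

```shell
# Check that each namenode UI is reachable.
curl -s -o /dev/null http://namenode:50070/ && echo "nn1 UI up"
curl -s -o /dev/null http://secondnamenode:50070/ && echo "nn2 UI up"

# HA state as JSON, e.g. "State" : "active".
curl -s "http://namenode:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus"
```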

