Data Warehouse Setup Based on Hadoop 2.10.1

### Cluster Planning

* Server list

### Server Configuration

1. Set the hostnames

```bash
hostnamectl set-hostname hadoop01    # run on .21.15.2
hostnamectl set-hostname hadoop02    # run on .21.15.4
hostnamectl set-hostname hadoop03    # run on .21.15.7
hostnamectl set-hostname hadoop04    # run on .21.15.8
hostnamectl set-hostname hadoop05    # run on .21.15.12
```

2. Configure hosts

Add the following entries to /etc/hosts:

```
.21.15.2    hadoop01
.21.15.4    hadoop02
.21.15.7    hadoop03
.21.15.8    hadoop04
.21.15.12   hadoop05
```
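The file can be edited once and pushed to the remaining nodes. A minimal sketch, assuming /etc/hosts was edited on hadoop01 (passwordless login is configured in the next step, so each scp here still prompts for the root password):

```bash
# Push the edited hosts file from hadoop01 to the other nodes.
# The leading octet of each IP is elided here exactly as in the list above;
# substitute the full addresses of your servers.
for ip in .21.15.4 .21.15.7 .21.15.8 .21.15.12; do
    scp /etc/hosts root@"$ip":/etc/hosts
done
```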

3. Set up passwordless login

Create an SSH key on hadoop01:

```bash
ssh-keygen    # press Enter at every prompt
ssh-copy-id hadoop01
ssh-copy-id hadoop02
ssh-copy-id hadoop03
ssh-copy-id hadoop04
ssh-copy-id hadoop05
```

ssh-copy-id prompts for the root password of the target server.
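To confirm the keys work, each node can be probed from hadoop01; a quick sketch (every command should print the remote hostname without asking for a password):

```bash
for h in hadoop01 hadoop02 hadoop03 hadoop04 hadoop05; do
    ssh "$h" hostname    # no password prompt expected
done
```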

4. Disk initialization

Create an initialization script fdisk.sh:

```bash
#!/bin/bash
set -e

# 1. Create the partition: n=new, p=primary, t=change type, 8e=Linux LVM, w=write.
#    The blank lines accept fdisk's defaults (partition number, first/last sector).
echo "n
p



t
8e
w
" | fdisk /dev/vdb

# 2. Create the volume group
vgcreate centos-data /dev/vdb1

# 3. Create the LV
lvcreate -l +100%FREE -n data1 centos-data

# 4. Format the LV
mkfs.ext4 /dev/centos-data/data1

# 5. Mount it
mkdir -p /data/
uuid=`blkid /dev/centos-data/data1 | grep -P '[a-f0-9]{8}(-[a-f0-9]{4}){4}[a-f0-9]{8}' -o`
echo "UUID=${uuid} /data ext4 defaults 0 2" >> /etc/fstab
mount -a
```

Copy fdisk.sh to every node, then run it from hadoop01:

```bash
ssh hadoop01 bash fdisk.sh
ssh hadoop02 bash fdisk.sh
ssh hadoop03 bash fdisk.sh
ssh hadoop04 bash fdisk.sh
ssh hadoop05 bash fdisk.sh
```

When the runs finish, use `df -h` to check that the partition is mounted at /data/.
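The check can also be run across all five nodes from hadoop01 in one loop, for example:

```bash
for h in hadoop01 hadoop02 hadoop03 hadoop04 hadoop05; do
    echo "== $h =="
    ssh "$h" df -h /data    # the new LV should show up mounted on /data
done
```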

### Install the JDK

Download the appropriate version yourself; JDK 8 is used here.

1. Download

jdk-8u-linux-x64.rpm

After downloading, upload it to the /data directory on the server.

2. Install

```bash
cd /data
yum install -y ./jdk-8u-linux-x64.rpm
```

Run this on each of the five nodes.
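The installs can also be driven from hadoop01 instead of logging in to each node; a sketch, assuming the RPM sits in /data on hadoop01 and passwordless root login is in place:

```bash
# Install locally on hadoop01, then copy the RPM out and install remotely.
cd /data
yum install -y ./jdk-8u-linux-x64.rpm
for h in hadoop02 hadoop03 hadoop04 hadoop05; do
    scp /data/jdk-8u-linux-x64.rpm "$h":/data/
    ssh "$h" yum install -y /data/jdk-8u-linux-x64.rpm
done
```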

### Install ZooKeeper

1. Download the package

apache-zookeeper-3.7.1-bin.tar.gz

After downloading, upload it to the /data directory on hadoop01.

2. Install

```bash
cd /data
tar zxf apache-zookeeper-3.7.1-bin.tar.gz
mv apache-zookeeper-3.7.1-bin apache-zookeeper
```

3. Create the configuration file /data/apache-zookeeper/conf/zoo.cfg with the following content:

```
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/data/apache-zookeeper/data
dataLogDir=/data/apache-zookeeper/logs
clientPort=2181
server.1=.21.15.2:2888:3888
server.2=.21.15.4:2888:3888
server.3=.21.15.7:2888:3888
server.4=.21.15.8:2888:3888
server.5=.21.15.12:2888:3888
```

4. Create the data and log directories

```bash
mkdir -p /data/apache-zookeeper/{data,logs}
echo 1 > /data/apache-zookeeper/data/myid
```

5. Copy to the other nodes

```bash
scp -r /data/apache-zookeeper hadoop02:/data/
scp -r /data/apache-zookeeper hadoop03:/data/
scp -r /data/apache-zookeeper hadoop04:/data/
scp -r /data/apache-zookeeper hadoop05:/data/
ssh hadoop02 mkdir -p /data/apache-zookeeper/{data,logs}
ssh hadoop02 "echo 2 > /data/apache-zookeeper/data/myid"
ssh hadoop03 mkdir -p /data/apache-zookeeper/{data,logs}
ssh hadoop03 "echo 3 > /data/apache-zookeeper/data/myid"
ssh hadoop04 mkdir -p /data/apache-zookeeper/{data,logs}
ssh hadoop04 "echo 4 > /data/apache-zookeeper/data/myid"
ssh hadoop05 mkdir -p /data/apache-zookeeper/{data,logs}
ssh hadoop05 "echo 5 > /data/apache-zookeeper/data/myid"
```

Each node's myid must be unique and must match its server.N entry in zoo.cfg.
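Since the commands repeat one pattern, the same distribution can be written as a loop; a sketch:

```bash
# myid starts at 2 for hadoop02 and increments with each host.
id=2
for h in hadoop02 hadoop03 hadoop04 hadoop05; do
    scp -r /data/apache-zookeeper "$h":/data/
    ssh "$h" mkdir -p /data/apache-zookeeper/{data,logs}
    ssh "$h" "echo $id > /data/apache-zookeeper/data/myid"
    id=$((id + 1))
done
```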

6. Start the service

```bash
ssh hadoop01 /data/apache-zookeeper/bin/zkServer.sh start
ssh hadoop02 /data/apache-zookeeper/bin/zkServer.sh start
ssh hadoop03 /data/apache-zookeeper/bin/zkServer.sh start
ssh hadoop04 /data/apache-zookeeper/bin/zkServer.sh start
ssh hadoop05 /data/apache-zookeeper/bin/zkServer.sh start
```
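Once all five are up, `zkServer.sh status` reports each node's role; one node should report leader and the other four follower. For example:

```bash
for h in hadoop01 hadoop02 hadoop03 hadoop04 hadoop05; do
    echo "== $h =="
    ssh "$h" /data/apache-zookeeper/bin/zkServer.sh status
done
```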

### Install Hadoop

1. Download the package

hadoop-2.10.1.tar.gz

After downloading, upload it to the /data directory on hadoop01.

2. Install

```bash
cd /data
tar zxf hadoop-2.10.1.tar.gz
mv hadoop-2.10.1 hadoop
```

3. Configure

Edit /etc/profile and append the following lines (the JDK directory name should match the version actually installed):

```bash
export JAVA_HOME=/usr/java/jdk1.8.0_-amd64
export JAVA_BIN=/usr/java/jdk1.8.0_-amd64/bin
export JRE_HOME=/usr/java/jdk1.8.0_-amd64/jre
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar
export HADOOP_HOME=/data/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
```

Run source so the settings take effect immediately:

```bash
source /etc/profile
```

Both steps above must be performed on all five nodes.
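Rather than editing the file five times, it can be edited once on hadoop01 and copied out; a sketch, assuming the same /etc/profile content fits every node:

```bash
for h in hadoop02 hadoop03 hadoop04 hadoop05; do
    scp /etc/profile "$h":/etc/profile
done
# Each node still needs `source /etc/profile` (or a fresh login) to pick it up.
```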

Go to the `${HADOOP_HOME}/etc/hadoop` directory and edit the configuration files. The content of each file follows.

* hadoop-env.sh

Add an explicit JAVA_HOME before the existing `export JAVA_HOME=${JAVA_HOME}` line:

```bash
export JAVA_HOME=/usr/java/jdk1.8.0_-amd64
```

* core-site.xml

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/data/hadoop-data/tmp</value>
    </property>
    <property>
        <!-- addresses of the ZooKeeper ensemble -->
        <name>ha.zookeeper.quorum</name>
        <value>hadoop01:2181,hadoop02:2181,hadoop03:2181,hadoop04:2181,hadoop05:2181</value>
    </property>
    <property>
        <!-- timeout for ZKFC connections to ZooKeeper -->
        <name>ha.zookeeper.session-timeout.ms</name>
        <value>00</value>
    </property>
    <property>
        <name>hadoop.proxyuser.root.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.root.groups</name>
        <value>*</value>
    </property>
</configuration>
```
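After saving the file, the values Hadoop actually parses can be spot-checked with `hdfs getconf`, which only reads configuration and so works before the cluster is formatted or started. For example:

```bash
hdfs getconf -confKey fs.defaultFS           # expected: hdfs://mycluster
hdfs getconf -confKey ha.zookeeper.quorum
```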

* hdfs-site.xml

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/data/hadoop-data/nn</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/data/hadoop-data/dn</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <!-- logical name of the cluster service -->
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>
    <property>
        <!-- list of NameNode IDs -->
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <!-- RPC address of nn1 -->
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>hadoop01:</value>
    </property>
    <property>
        <!-- RPC address of nn2 -->
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>hadoop02:</value>
    </property>
    <property>
        <!-- nn1's
```

