
Hadoop Quick Start for Big Data, Part 3: Installation

Source: 要发发知识网

Prerequisites

Ubuntu, with openssh-server and Java already installed.

Account setup

sudo addgroup hadoop_                                    # dedicated group for Hadoop
sudo adduser --ingroup hadoop_ hduser_                   # dedicated user in that group
su - hduser_                                             # switch to the Hadoop user
ssh-keygen -t rsa -P ""                                  # generate a passphrase-less key
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys  # authorize key-based login
ssh localhost                                            # should log in without a password prompt
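If `ssh localhost` still prompts for a password, a common cause is that sshd refuses key authentication when `~/.ssh` or `authorized_keys` are too permissive. A hedged fix, based on standard sshd behaviour rather than anything stated above, demonstrated on a scratch directory instead of the real `~/.ssh`:

```shell
# Tighten the permissions sshd expects: 700 on the directory, 600 on the key file.
fix_ssh_perms() {
  chmod 700 "$1"
  chmod 600 "$1/authorized_keys"
}

# Demo on a temporary directory (run against $HOME/.ssh on the real machine):
d=$(mktemp -d)
touch "$d/authorized_keys"
fix_ssh_perms "$d"
stat -c '%a' "$d" "$d/authorized_keys"   # prints 700 then 600
rm -rf "$d"
```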

Download and install

# First download hadoop-2.9.2.tar.gz (the original shows this step only as a screenshot)
sudo mv hadoop-2.9.2.tar.gz /opt   # /opt is root-owned, so sudo is needed
cd /opt
sudo tar xzf hadoop-2.9.2.tar.gz
sudo mv hadoop-2.9.2 hadoop


Configuration

Note that Hadoop does not pick these variables up from the system environment on its own; they must be set in the files below.

Add to ~/.bashrc:

export HADOOP_HOME=/opt/hadoop
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export PATH=$PATH:$HADOOP_HOME/bin
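A quick sanity check that the variables expand as intended after reloading the shell (a small sketch, not part of the original; paths as configured above):

```shell
# Same exports as in ~/.bashrc above; in practice run `source ~/.bashrc` instead.
export HADOOP_HOME=/opt/hadoop
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export PATH="$PATH:$HADOOP_HOME/bin"
echo "$HADOOP_HOME/bin"   # prints /opt/hadoop/bin
```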

HDFS configuration

In $HADOOP_HOME/etc/hadoop/hadoop-env.sh, change the following line:

export JAVA_HOME=/usr/lib/jvm/java-8-oracle

In $HADOOP_HOME/etc/hadoop/core-site.xml, add the following inside the <configuration></configuration> element:

<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>Parent directory for other temporary directories.</description>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system.</description>
</property>

Create the directory and hand it to the Hadoop user:

sudo mkdir -p /app/hadoop/tmp
sudo chown -R hduser_:hadoop_  /app/hadoop/tmp
sudo chmod 750  /app/hadoop/tmp

MapReduce configuration

$ sudo vi /etc/profile.d/hadoop.sh
export HADOOP_HOME=/opt/hadoop
$ sudo chmod +x /etc/profile.d/hadoop.sh
$ sudo cp $HADOOP_HOME/etc/hadoop/mapred-site.xml.template $HADOOP_HOME/etc/hadoop/mapred-site.xml
$ vi $HADOOP_HOME/etc/hadoop/mapred-site.xml
# Add inside the <configuration></configuration> section:
<property>
<name>mapreduce.jobtracker.address</name>
<value>localhost:54311</value>
<description>MapReduce job tracker runs at this host and port.
</description>
</property>
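Note that mapreduce.jobtracker.address is a holdover from the Hadoop 1.x JobTracker. Since this guide starts the YARN daemons below, MapReduce jobs typically also need the framework set to YARN; a commonly added property (not shown in the original) is:

```xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<description>Run MapReduce jobs on the YARN framework.</description>
</property>
```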
$ sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
$ sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
$ sudo chown -R hduser_:hadoop_ /usr/local/hadoop_store
$ vi $HADOOP_HOME/etc/hadoop/hdfs-site.xml
<configuration>
 <property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
 </property>
 <property>
   <name>dfs.namenode.name.dir</name>
   <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
 </property>
 <property>
   <name>dfs.datanode.data.dir</name>
   <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
 </property>
</configuration>
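Property values can be spot-checked without starting Hadoop. A small sketch that pulls the <value> following a given <name>; it assumes the one-element-per-line layout used above, so xmllint is not required:

```shell
# get_prop FILE PROPERTY: print the <value> on the line after <name>PROPERTY</name>.
get_prop() {
  grep -A1 "<name>$2</name>" "$1" | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Demo against a minimal file shaped like the hdfs-site.xml above:
f=$(mktemp)
cat > "$f" <<'EOF'
<configuration>
 <property>
  <name>dfs.replication</name>
  <value>1</value>
 </property>
</configuration>
EOF
get_prop "$f" dfs.replication   # prints 1
rm -f "$f"
```

On a real installation, point it at $HADOOP_HOME/etc/hadoop/hdfs-site.xml instead of the temp file.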

$ $HADOOP_HOME/bin/hdfs namenode -format # format the namenode (first run only)
$ $HADOOP_HOME/sbin/start-dfs.sh # start HDFS
$ $HADOOP_HOME/sbin/start-yarn.sh # start YARN
$ jps # list the running Java processes
$ $HADOOP_HOME/sbin/stop-dfs.sh # stop HDFS
$ $HADOOP_HOME/sbin/stop-yarn.sh # stop YARN
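After start-dfs.sh and start-yarn.sh, `jps` is expected to list the five daemons of a stock pseudo-distributed setup. A small helper that reads `jps`-style output and reports which are present (daemon names per standard Hadoop 2.x; the helper itself is a sketch, not from the original):

```shell
# check_daemons: read `jps` output on stdin, report each expected daemon.
check_daemons() {
  out=$(cat)
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    if printf '%s\n' "$out" | grep -q "$d"; then
      echo "$d up"
    else
      echo "$d MISSING"
    fi
  done
}

# Usage on a live cluster: jps | check_daemons
# Demo with sample output (first two report up, the rest MISSING):
printf '1234 NameNode\n2345 DataNode\n' | check_daemons
```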