Prerequisites:
1. Java must be installed
2. ssh must be installed and sshd must be running to use the Hadoop scripts that manage remote Hadoop daemons
Download and stage the software:
Installing Java:
Make a directory named java under /u01/
Move the downloaded java software from /mnt to /u01/java
Extract the tar.gz file as below
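The three steps above can be sketched as a small shell function. The post does not show the archive's file name, so it is passed in as an argument rather than guessed; the function name install_java is made up for this illustration.

```shell
# install_java ARCHIVE [TARGET_DIR]
# ARCHIVE:    path to the downloaded Java tar.gz (file name not given in the text)
# TARGET_DIR: where to stage it; defaults to /u01/java as described above
install_java() {
  local archive="$1"
  local target="${2:-/u01/java}"
  mkdir -p "$target" || return 1          # create the java directory if missing
  mv "$archive" "$target"/ || return 1    # move the download out of /mnt
  # -x extract, -z gunzip, -f archive file, -C directory to unpack into
  tar -xzf "$target/$(basename "$archive")" -C "$target"
}
```

A call would look like install_java /mnt/<java-archive>.tar.gz with the real file name substituted. Writing under /u01 may require root privileges.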
Installing Hadoop:
Make a directory named hadoop under /u01/
Move the downloaded Hadoop software from /mnt to /u01/hadoop
Extract the tar.gz file as below
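A minimal sketch of the Hadoop staging steps follows. The archive name hadoop-2.7.2.tar.gz is an assumption inferred from the hadoop-2.7.2 directory used later in this post, and install_hadoop is an illustrative name.

```shell
# install_hadoop [ARCHIVE] [TARGET_DIR]
# ARCHIVE defaults to /mnt/hadoop-2.7.2.tar.gz (assumed file name).
install_hadoop() {
  local archive="${1:-/mnt/hadoop-2.7.2.tar.gz}"
  local target="${2:-/u01/hadoop}"
  mkdir -p "$target" || return 1
  mv "$archive" "$target"/ || return 1
  tar -xzf "$target/$(basename "$archive")" -C "$target" || return 1
  # List what was unpacked; the post expects a hadoop-2.7.2 directory here
  ls "$target"
}
```

If the listing shows a differently named directory, the paths in the later steps need to be adjusted to match.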
Edit the file /u01/hadoop/hadoop-2.7.2/etc/hadoop/hadoop-env.sh to define some parameters as below
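The post does not show which parameters were set. The Apache single-node guide for this release requires at least JAVA_HOME, so the sketch below covers only that one. The helper name set_java_home is hypothetical, and the JDK directory is passed in because its name is not given in the text.

```shell
# set_java_home ENV_FILE JAVA_HOME_DIR
# Appends an explicit JAVA_HOME export to hadoop-env.sh. The file is sourced
# top to bottom, so the appended line overrides any earlier JAVA_HOME line.
set_java_home() {
  local env_file="$1"
  local java_home="$2"
  printf '\n# Root of the Java installation used by Hadoop\nexport JAVA_HOME="%s"\n' \
    "$java_home" >> "$env_file"
}
```

For example: set_java_home /u01/hadoop/hadoop-2.7.2/etc/hadoop/hadoop-env.sh /u01/java/<jdk-directory>. Editing the existing JAVA_HOME line by hand in an editor achieves the same result.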
Save and exit the file
Test the hadoop command as below
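One way to test the launcher is its version subcommand, sketched here. The default path is the install location used in this post, and check_hadoop is an illustrative name.

```shell
# check_hadoop: run the hadoop launcher's "version" subcommand as a smoke test.
# HADOOP_HOME defaults to the install path used in this post.
check_hadoop() {
  local home="${HADOOP_HOME:-/u01/hadoop/hadoop-2.7.2}"
  "$home/bin/hadoop" version
}
```

Running bin/hadoop with no arguments prints the usage text instead, which works as a check too. An error about JAVA_HOME at this point usually means the hadoop-env.sh edit did not take effect.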
Configuring Hadoop:
Modify etc/hadoop/core-site.xml as below:
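The original file contents are not shown. For a single-node setup the Apache guide sets fs.defaultFS to a NameNode on localhost. The sketch below writes that setting; port 9000 and the helper name write_core_site are assumptions.

```shell
# write_core_site [CONF_DIR]
# Overwrites core-site.xml so that clients talk to a NameNode on localhost.
# Port 9000 follows the Apache single-node guide and is an assumption here.
write_core_site() {
  local conf_dir="${1:-/u01/hadoop/hadoop-2.7.2/etc/hadoop}"
  cat > "$conf_dir/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
}
```

The function replaces the whole file, so any existing properties should be merged in by hand first.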
Save and exit the file
Modify etc/hadoop/hdfs-site.xml as below:
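With only one DataNode, blocks cannot be replicated to other machines, so the Apache single-node guide sets dfs.replication to 1. The sketch below assumes that is the setting intended here; write_hdfs_site is an illustrative name.

```shell
# write_hdfs_site [CONF_DIR]
# Overwrites hdfs-site.xml with a replication factor of 1, the only value a
# single DataNode can satisfy.
write_hdfs_site() {
  local conf_dir="${1:-/u01/hadoop/hadoop-2.7.2/etc/hadoop}"
  cat > "$conf_dir/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
}
```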
Save and exit the file
Set up passphraseless ssh:
Check that you can ssh to the localhost without a passphrase:
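The plain check is simply running ssh localhost and seeing whether it prompts. A scripted variant is sketched below; it swaps in ssh's BatchMode option so that the command fails rather than asking for input. The function name can_ssh_localhost is made up for this example.

```shell
# can_ssh_localhost: succeeds only if ssh can log in without prompting.
# BatchMode=yes makes ssh fail instead of asking for a password or passphrase;
# ConnectTimeout=5 gives up after five seconds if sshd is not reachable.
can_ssh_localhost() {
  ssh -o BatchMode=yes -o ConnectTimeout=5 localhost true
}
```

Typical use: can_ssh_localhost && echo "passphraseless ssh works". On a first connection the host key may not be known yet, in which case this check can fail and one interactive ssh localhost is needed to accept the key.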
If you cannot ssh to localhost without a passphrase, execute the following commands:
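The usual fix is to generate a key with an empty passphrase and add it to authorized_keys. The sketch below uses an RSA key, which is a choice made for this example (the post does not say which key type it used), and setup_passphraseless_ssh is an illustrative name.

```shell
# setup_passphraseless_ssh [SSH_DIR]
# Creates a key pair with an empty passphrase (if none exists yet) and
# authorizes it for logins to this machine.
setup_passphraseless_ssh() {
  local ssh_dir="${1:-$HOME/.ssh}"
  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
  if [ ! -f "$ssh_dir/id_rsa" ]; then
    # -P '' sets an empty passphrase, -f names the key file, -q silences output
    ssh-keygen -q -t rsa -P '' -f "$ssh_dir/id_rsa" || return 1
  fi
  cat "$ssh_dir/id_rsa.pub" >> "$ssh_dir/authorized_keys"
  chmod 0600 "$ssh_dir/authorized_keys"
}
```

sshd only consults the real ~/.ssh directory, so the optional argument is mainly useful for trying the function in a scratch directory.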
Executing a MapReduce job locally:
Format the file system:
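Formatting is done with the hdfs launcher's namenode -format subcommand. A sketch, assuming the install path used in this post (format_namenode is an illustrative name):

```shell
# format_namenode: initialise a fresh HDFS namespace.
# Formatting erases existing HDFS metadata, so it is meant for first-time setup.
format_namenode() {
  local home="${HADOOP_HOME:-/u01/hadoop/hadoop-2.7.2}"
  "$home/bin/hdfs" namenode -format
}
```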
Start NameNode daemon and DataNode daemon:
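Both daemons are started by the sbin/start-dfs.sh script that ships with Hadoop. The sketch below also runs jps, a JDK tool that lists Java processes, when it is on the PATH; that extra check is an addition for this example, as is the name start_hdfs.

```shell
# start_hdfs: start the HDFS daemons, then list running Java processes.
start_hdfs() {
  local home="${HADOOP_HOME:-/u01/hadoop/hadoop-2.7.2}"
  "$home/sbin/start-dfs.sh" || return 1
  # NameNode and DataNode should appear in the jps listing after a clean start
  if command -v jps >/dev/null 2>&1; then
    jps
  fi
}
```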
The Hadoop daemon log output is written to the $HADOOP_LOG_DIR directory (defaults to $HADOOP_HOME/logs).
Browse the web interface for the NameNode; by default it is available at:
- NameNode: http://localhost:50070/
Create the HDFS directories required to execute MapReduce jobs:
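MapReduce jobs resolve relative HDFS paths against /user/<username>, so that directory has to exist. A sketch using hdfs dfs -mkdir follows; the -p flag and the name make_hdfs_home are choices made for this example.

```shell
# make_hdfs_home [USERNAME]
# Creates the HDFS home directory for USERNAME (default: the current user).
make_hdfs_home() {
  local home="${HADOOP_HOME:-/u01/hadoop/hadoop-2.7.2}"
  local user="${1:-$(whoami)}"
  # -p creates /user and /user/<name> in one call and tolerates existing dirs
  "$home/bin/hdfs" dfs -mkdir -p "/user/$user"
}
```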
Copy the input files into the distributed file system:
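The post does not say which files serve as input. The Apache guide uses Hadoop's own configuration directory, and the sketch below assumes the same; put_input is an illustrative name.

```shell
# put_input: upload Hadoop's configuration files to HDFS as sample input.
put_input() {
  local home="${HADOOP_HOME:-/u01/hadoop/hadoop-2.7.2}"
  # Relative HDFS paths resolve against /user/<username>, so "input" lands there
  "$home/bin/hdfs" dfs -put "$home/etc/hadoop" input
}
```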
Run some of the examples provided:
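As an illustration, the grep example bundled with Hadoop can be run over the uploaded input. The jar name below assumes the 2.7.2 release layout, the regular expression is the one used in the Apache guide, and run_grep_example is a made-up name.

```shell
# run_grep_example: count matches of a regular expression in the "input"
# directory on HDFS and write the result to "output".
run_grep_example() {
  local home="${HADOOP_HOME:-/u01/hadoop/hadoop-2.7.2}"
  "$home/bin/hadoop" jar \
    "$home/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar" \
    grep input output 'dfs[a-z.]+'
}
```

The job refuses to start if the output directory already exists in HDFS, so it has to be removed or renamed before a second run.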
Examine the output files:
Copy the output files from the distributed file system to the local file system and examine them:
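A sketch of fetching and printing the results, assuming the job wrote to a directory named output in HDFS (fetch_output is an illustrative name):

```shell
# fetch_output [LOCAL_DIR]
# Copies the HDFS "output" directory to LOCAL_DIR (default: ./output) and
# prints the files it contains.
fetch_output() {
  local home="${HADOOP_HOME:-/u01/hadoop/hadoop-2.7.2}"
  local dest="${1:-output}"
  "$home/bin/hdfs" dfs -get output "$dest" || return 1
  cat "$dest"/*
}
```

The files can also be viewed without copying them, using bin/hdfs dfs -cat 'output/*'.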
When you're done, stop the daemons with:
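The counterpart of start-dfs.sh is sbin/stop-dfs.sh. A sketch under the same path assumption (stop_hdfs is an illustrative name):

```shell
# stop_hdfs: stop the HDFS daemons started by start-dfs.sh.
stop_hdfs() {
  local home="${HADOOP_HOME:-/u01/hadoop/hadoop-2.7.2}"
  "$home/sbin/stop-dfs.sh"
}
```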
YARN on a Single Node:
We can run a MapReduce job on YARN in pseudo-distributed mode by setting a few parameters and additionally running the Resource Manager and Node Manager daemons.
Configure parameters as below:
Modify etc/hadoop/mapred-site.xml as below:
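To hand job execution to YARN, the Apache guide sets mapreduce.framework.name to yarn. The sketch below assumes that is the setting meant here. In the 2.7.x layout mapred-site.xml typically does not exist yet and only mapred-site.xml.template is shipped, so the function creates the file. The name write_mapred_site is illustrative.

```shell
# write_mapred_site [CONF_DIR]
# Creates or overwrites mapred-site.xml so that MapReduce jobs run on YARN
# instead of the local job runner.
write_mapred_site() {
  local conf_dir="${1:-/u01/hadoop/hadoop-2.7.2/etc/hadoop}"
  cat > "$conf_dir/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
}
```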
Save and exit the file
Modify etc/hadoop/yarn-site.xml as below:
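MapReduce on YARN needs the shuffle auxiliary service enabled on the Node Manager. The sketch below writes the setting from the Apache guide, which is assumed to match what the post intended; write_yarn_site is an illustrative name.

```shell
# write_yarn_site [CONF_DIR]
# Overwrites yarn-site.xml to enable the MapReduce shuffle service, which
# moves map output to the reducers.
write_yarn_site() {
  local conf_dir="${1:-/u01/hadoop/hadoop-2.7.2/etc/hadoop}"
  cat > "$conf_dir/yarn-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF
}
```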
Save and exit the file
Start Resource Manager daemon and Node Manager daemon:
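Both YARN daemons are started by sbin/start-yarn.sh. A sketch, assuming the install path used in this post (start_yarn is an illustrative name):

```shell
# start_yarn: start the Resource Manager and Node Manager daemons.
start_yarn() {
  local home="${HADOOP_HOME:-/u01/hadoop/hadoop-2.7.2}"
  "$home/sbin/start-yarn.sh"
}
```

The matching sbin/stop-yarn.sh script stops them again once the jobs are finished.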
Browse the web interface for the Resource Manager; by default it is available at:
- Resource Manager: http://localhost:8088/