Prerequisites:
1. Create the number of nodes required for the cluster
2. Set the IP address and hostname correctly on each node
3. Set up passwordless SSH connectivity between all the nodes and test it
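For step 2, each node's /etc/hosts file should map every cluster member's IP address to its hostname. The addresses and hostnames below (master, slave1, slave2) are illustrative placeholders; substitute your own:

```
192.168.1.10  master
192.168.1.11  slave1
192.168.1.12  slave2
```

For step 3, a typical approach is to generate an SSH key pair on the master (ssh-keygen -t rsa) and copy the public key to every node (ssh-copy-id user@slave1), then confirm that ssh slave1 logs in without a password prompt.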
COMMON CONFIGURATION FOR ALL NODES
UPDATE CORE-SITE.XML FILE (on all nodes):
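A minimal core-site.xml for a multi-node cluster points every node at the NameNode. The hostname "master" and port 9000 are assumptions; use your master node's hostname and preferred port:

```xml
<configuration>
  <!-- URI of the NameNode; identical on every node in the cluster -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>
```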
UPDATE HDFS-SITE.XML
On Master node:
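A sketch of the master's hdfs-site.xml, assuming Hadoop 2.x property names and a hypothetical storage path under /usr/local/hadoop; adjust the replication factor to the number of DataNodes you have:

```xml
<configuration>
  <!-- Number of copies HDFS keeps of each block -->
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <!-- Local directory where the NameNode stores filesystem metadata -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///usr/local/hadoop/hdfs/namenode</value>
  </property>
</configuration>
```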
On slave nodes:
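On the slaves, hdfs-site.xml instead declares the DataNode storage directory. The path below is a hypothetical example; match it to your disk layout:

```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <!-- Local directory where the DataNode stores HDFS blocks -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///usr/local/hadoop/hdfs/datanode</value>
  </property>
</configuration>
```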
UPDATE YARN-SITE.XML (all nodes):
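A minimal yarn-site.xml sketch: every node must know where the ResourceManager runs, and the NodeManagers need the shuffle auxiliary service for MapReduce. The hostname "master" is an assumption:

```xml
<configuration>
  <!-- Hostname of the node running the ResourceManager -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <!-- Enables the shuffle service MapReduce jobs depend on -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```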
UPDATE MAPRED-SITE.XML (on all nodes):
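The key entry in mapred-site.xml tells MapReduce to run on YARN rather than the classic (local or Hadoop 1.x) framework:

```xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```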
Add or modify the masters and slaves files on the NameNode only, as below:
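Both files live in the Hadoop configuration directory and contain one hostname per line. The hostnames below are the same illustrative placeholders used earlier; the masters file lists the node that runs the SecondaryNameNode, and the slaves file lists every DataNode:

```
# masters
master

# slaves
slave1
slave2
```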
FORMAT THE NAME NODE IN MASTER NODE ONLY:
Note: Do not format a running cluster; doing so will erase all existing data in the HDFS file system.
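Assuming the Hadoop binaries are on the PATH, the format command (run once, on the master only) is:

```shell
hdfs namenode -format
```

On Hadoop 1.x installations the equivalent is hadoop namenode -format.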
START THE CLUSTER:
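From the master node, the standard start scripts bring up HDFS and YARN across the cluster (they use the slaves file and SSH to launch daemons on the other nodes):

```shell
start-dfs.sh    # starts NameNode, SecondaryNameNode, and all DataNodes
start-yarn.sh   # starts ResourceManager and all NodeManagers
```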
Now, to check whether all daemons are running, type the following command on the master node:
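Running jps on the master should show output resembling the following (the process IDs will differ on your system):

```shell
$ jps
2354 NameNode
2567 SecondaryNameNode
2789 ResourceManager
2901 Jps
```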
Now, to check whether all daemons are running, type the following command on each slave node:
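On a slave, jps should list the worker daemons, typically resembling (process IDs will vary):

```shell
$ jps
2123 DataNode
2345 NodeManager
2456 Jps
```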
REVIEW IN CONSOLE
If all the daemons started successfully on all nodes, you can see the nodes listed in the console. You can verify this by typing the following command in a terminal:
Command > hadoop dfsadmin -report
You can also verify by opening the NameNode web UI in a browser (typically http://master:50070 in Hadoop 2.x, where "master" is your NameNode's hostname):