INSTALLATION DOCUMENTS BY RAVI

Saturday, November 11, 2017

Step-by-step Apache Hadoop cluster configuration

Prerequisites:
1. Create as many nodes as the cluster needs.
2. Set the IP address and hostname correctly on each node.
3. Set up SSH connectivity between all the nodes and test it.
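The prerequisites above can be sketched in shell. This is a minimal sketch under assumptions: the three hostnames and IP addresses are placeholders, not values from this post, and the SSH function is only defined, not run, because it needs the real machines.

```shell
# Hypothetical cluster layout (placeholders): "ip:hostname" pairs.
NODES="192.168.1.10:master 192.168.1.11:slave1 192.168.1.12:slave2"

# Print the /etc/hosts lines that every node should carry.
hosts_entries() {
  for node in $NODES; do
    printf '%s\t%s\n' "${node%%:*}" "${node##*:}"
  done
}

# Create a key pair if needed, copy the public key to every node,
# then confirm that a passwordless login works by printing the remote hostname.
setup_ssh() {
  [ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
  for node in $NODES; do
    ssh-copy-id "${node##*:}"
    ssh "${node##*:}" hostname
  done
}

# Review the lines, then append them to /etc/hosts on each node as root.
hosts_entries
```

Once the hosts entries are in place on every node, calling setup_ssh as the Hadoop user on the master covers the third prerequisite.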

COMMON CONFIGURATION FOR ALL NODES

UPDATE CORE-SITE.XML FILE (on all nodes):
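A minimal sketch of what core-site.xml might contain, assuming Hadoop 2.x, a NameNode host named master and port 9000 (all three are assumptions, not values from this post). The script writes into a scratch directory so the file can be reviewed before it is copied into the Hadoop configuration directory.

```shell
OUT_DIR="${OUT_DIR:-./hadoop-conf-sketch}"   # scratch directory (assumption)
NAMENODE_HOST="${NAMENODE_HOST:-master}"     # assumed NameNode hostname
mkdir -p "$OUT_DIR"

# fs.defaultFS tells every node where the HDFS NameNode listens.
cat > "$OUT_DIR/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://${NAMENODE_HOST}:9000</value>
  </property>
</configuration>
EOF

echo "Wrote $OUT_DIR/core-site.xml"
```

The same file goes on every node, which is why the value points at the master by hostname rather than localhost.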









































UPDATE HDFS-SITE.XML
On the master node:
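A minimal sketch of a master-side hdfs-site.xml, assuming Hadoop 2.x property names. The replication factor of 2 and the storage path are placeholders chosen for illustration, not values from this post.

```shell
OUT_DIR="${OUT_DIR:-./hadoop-conf-sketch}"               # scratch directory (assumption)
NAME_DIR="${NAME_DIR:-/home/hadoop/hdfs/namenode}"       # assumed metadata path
mkdir -p "$OUT_DIR/master"

# dfs.namenode.name.dir: where the NameNode keeps its metadata.
# dfs.replication: how many copies of each block HDFS keeps.
cat > "$OUT_DIR/master/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file://${NAME_DIR}</value>
  </property>
</configuration>
EOF

echo "Wrote $OUT_DIR/master/hdfs-site.xml"
```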

























On the slave nodes:
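A minimal sketch of a slave-side hdfs-site.xml, again assuming Hadoop 2.x property names. The data directory path and the replication factor are placeholders. The slaves set the DataNode storage directory instead of the NameNode one.

```shell
OUT_DIR="${OUT_DIR:-./hadoop-conf-sketch}"             # scratch directory (assumption)
DATA_DIR="${DATA_DIR:-/home/hadoop/hdfs/datanode}"     # assumed block storage path
mkdir -p "$OUT_DIR/slave"

# dfs.datanode.data.dir: where each DataNode stores its HDFS blocks.
cat > "$OUT_DIR/slave/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file://${DATA_DIR}</value>
  </property>
</configuration>
EOF

echo "Wrote $OUT_DIR/slave/hdfs-site.xml"
```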
























UPDATE YARN-SITE.XML (on all nodes):
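A minimal sketch of yarn-site.xml, assuming Hadoop 2.x and a ResourceManager running on a host named master (the hostname is an assumption). The two properties shown are the usual minimum for a small cluster; a real setup may need more.

```shell
OUT_DIR="${OUT_DIR:-./hadoop-conf-sketch}"   # scratch directory (assumption)
RM_HOST="${RM_HOST:-master}"                 # assumed ResourceManager hostname
mkdir -p "$OUT_DIR"

# yarn.resourcemanager.hostname: where NodeManagers find the ResourceManager.
# yarn.nodemanager.aux-services: enables the shuffle service MapReduce needs.
cat > "$OUT_DIR/yarn-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>${RM_HOST}</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF

echo "Wrote $OUT_DIR/yarn-site.xml"
```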
















































UPDATE MAPRED-SITE.XML (on all nodes):
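A minimal sketch of mapred-site.xml, assuming Hadoop 2.x, where MapReduce jobs run on YARN. On Hadoop 2.x this file usually has to be created first, typically by copying mapred-site.xml.template in the configuration directory.

```shell
OUT_DIR="${OUT_DIR:-./hadoop-conf-sketch}"   # scratch directory (assumption)
mkdir -p "$OUT_DIR"

# mapreduce.framework.name=yarn makes MapReduce submit jobs to YARN
# instead of running them locally.
cat > "$OUT_DIR/mapred-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF

echo "Wrote $OUT_DIR/mapred-site.xml"
```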














































Add or modify the masters and slaves files on the NameNode only, as below:
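A minimal sketch of the two files, assuming one master named master and two workers named slave1 and slave2 (hypothetical hostnames). Each file lists one hostname per line. On Hadoop 2.x the start scripts read the slaves file; the masters file is a convention carried over from Hadoop 1.x for the SecondaryNameNode host, so treat its use here as an assumption.

```shell
OUT_DIR="${OUT_DIR:-./hadoop-conf-sketch}"   # scratch directory (assumption)
mkdir -p "$OUT_DIR/master"

# masters: host that runs the SecondaryNameNode (assumed to be the master).
printf '%s\n' master > "$OUT_DIR/master/masters"

# slaves: every host that should run a DataNode and a NodeManager.
printf '%s\n' slave1 slave2 > "$OUT_DIR/master/slaves"

echo "Wrote $OUT_DIR/master/masters and $OUT_DIR/master/slaves"
```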















































FORMAT THE NAMENODE ON THE MASTER NODE ONLY:
Note: Do not format a running cluster, because this erases all existing data in the HDFS file system.
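The warning above can be turned into a small guard. This is a sketch, assuming Hadoop 2.x (hdfs namenode -format) and a NameNode metadata directory at the placeholder path below. It refuses to format when the directory already holds a current/VERSION file, which a formatted NameNode directory contains.

```shell
NAME_DIR="${NAME_DIR:-/home/hadoop/hdfs/namenode}"   # assumed metadata path

# Succeed when the given directory already holds formatted NameNode metadata.
already_formatted() {
  [ -f "$1/current/VERSION" ]
}

# Format only when no existing metadata is found.
format_namenode() {
  if already_formatted "$NAME_DIR"; then
    echo "Refusing to format: $NAME_DIR already contains HDFS metadata." >&2
    return 1
  fi
  hdfs namenode -format
}

# Run once, on the master node only, before the first start:
# format_namenode
```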





















START THE CLUSTER:
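A minimal sketch of the start sequence, assuming Hadoop 2.x and an installation under /usr/local/hadoop (the path is an assumption). start-dfs.sh brings up the HDFS daemons and start-yarn.sh brings up the YARN daemons. Both use the slaves file and SSH to reach the worker nodes.

```shell
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"   # assumed install location

start_cluster() {
  # HDFS first: NameNode, SecondaryNameNode and the DataNodes.
  "$HADOOP_HOME/sbin/start-dfs.sh" || return 1
  # Then YARN: ResourceManager and the NodeManagers.
  "$HADOOP_HOME/sbin/start-yarn.sh" || return 1
}

# Run on the master node after the NameNode has been formatted:
# start_cluster
```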



















To check whether all daemons are running, type the following command on the master node:







To check whether all daemons are running, type the following command on a slave node:
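The daemon check is usually done with jps, which lists the running Java processes. Below is a minimal sketch that compares the jps output with the daemons expected for each role. The expected lists assume a default Hadoop 2.x layout (SecondaryNameNode on the master); the function names are hypothetical.

```shell
# Read jps output on stdin and print every expected daemon that is missing.
missing_daemons() {
  jps_output="$(cat)"
  for daemon in "$@"; do
    # -w matches whole words, so NameNode does not match SecondaryNameNode.
    printf '%s\n' "$jps_output" | grep -qw "$daemon" || echo "$daemon"
  done
}

# Check the local node; pass "master" or "slave".
check_node() {
  case "$1" in
    master) expected="NameNode SecondaryNameNode ResourceManager" ;;
    slave)  expected="DataNode NodeManager" ;;
    *) echo "usage: check_node master|slave" >&2; return 2 ;;
  esac
  # $expected is left unquoted on purpose so it splits into separate names.
  jps | missing_daemons $expected
}

# On the master: check_node master
# On a slave:    check_node slave
```

An empty result means every expected daemon is running on that node.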







REVIEW IN CONSOLE

If all the daemons started successfully on all nodes, the nodes are listed in the console.
You can type the following command in a terminal to verify that:

Command > hdfs dfsadmin -report

(On Hadoop 2.x the older form, hadoop dfsadmin -report, still works but prints a deprecation warning.)
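The report can also be checked from a script. This is a sketch that assumes the Hadoop 2.x report format, which contains a line of the form "Live datanodes (N):", and a cluster with two slaves (a placeholder). It uses hdfs dfsadmin -report, the current spelling of the same command.

```shell
EXPECTED_SLAVES="${EXPECTED_SLAVES:-2}"   # assumed number of DataNodes

# Extract N from a "Live datanodes (N):" line read on stdin.
live_datanodes() {
  sed -n 's/^Live datanodes (\([0-9][0-9]*\)).*/\1/p'
}

# Compare the live DataNode count with the expected number of slaves.
verify_report() {
  live="$(hdfs dfsadmin -report | live_datanodes)"
  if [ "${live:-0}" -eq "$EXPECTED_SLAVES" ]; then
    echo "OK: $live DataNodes are live"
  else
    echo "Expected $EXPECTED_SLAVES live DataNodes, found ${live:-0}" >&2
    return 1
  fi
}

# Run on the master node once the cluster is up:
# verify_report
```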


























You can also open the following URL in a browser to verify that:
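A minimal sketch for checking the web consoles from a terminal. It assumes the Hadoop 2.x default ports (50070 for the NameNode UI, 8088 for the ResourceManager UI) and a master host named master. Both the hostname and the choice of pages are assumptions, not values from this post.

```shell
MASTER_HOST="${MASTER_HOST:-master}"   # assumed master hostname

# Print the web UI addresses (Hadoop 2.x default ports).
ui_urls() {
  echo "http://$MASTER_HOST:50070"   # NameNode UI, lists the live DataNodes
  echo "http://$MASTER_HOST:8088"    # ResourceManager UI, lists the NodeManagers
}

# Probe each address with curl and report whether it answers.
check_uis() {
  for url in $(ui_urls); do
    if curl -sf -o /dev/null "$url"; then
      echo "UP   $url"
    else
      echo "DOWN $url"
    fi
  done
}

# Run from any machine that can resolve the master hostname:
# check_uis
```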









































