Install Hadoop 0.23

The Hadoop 0.23 release changed some directories and bash scripts. The /conf/ dir and some of the scripts are no longer easy to find.

The instructions I followed for a successful install come almost entirely from the


I downloaded it from this address:

Install prerequisite packages

$ sudo apt-get install ssh 
$ sudo apt-get install rsync

Set up passphraseless ssh

Now check that you can ssh to the localhost without a passphrase:

$ ssh localhost

If you cannot ssh to localhost without a passphrase, execute the following commands:

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa 
$ cat ~/.ssh/ >> ~/.ssh/authorized_keys
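
A quick way to confirm the result is to try a login that is not allowed to prompt. This is only a sketch: the function name check_local_ssh and the 5 second timeout are my own choices, not part of the Hadoop instructions.

```shell
#!/bin/sh
# Report whether a non-interactive ssh login to localhost works.
# BatchMode=yes makes ssh fail instead of prompting for a password or
# passphrase. It also fails if the host key for localhost has not been
# accepted yet, so run a plain "ssh localhost" once beforehand.
check_local_ssh() {
    if ssh -o BatchMode=yes -o ConnectTimeout=5 localhost true \
        </dev/null >/dev/null 2>&1; then
        echo "passphraseless ssh: ok"
    else
        echo "passphraseless ssh: needs setup"
    fi
}

check_local_ssh
```

If it reports "needs setup", go back to the key generation step above.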

Configure HDFS

The new 0.23 release no longer uses the /conf/ folder. All the configuration now lives in the etc/hadoop/ folder.
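
Since the location differs between releases, a small helper can tell which layout an unpacked tarball uses. A minimal sketch; find_conf_dir is a hypothetical name of mine, and it only looks at the two directories mentioned here.

```shell
# Print the configuration directory for a Hadoop install rooted at $1.
# Prefers the 0.23 layout (etc/hadoop) and falls back to the older conf/.
find_conf_dir() {
    prefix=$1
    if [ -d "$prefix/etc/hadoop" ]; then
        echo "$prefix/etc/hadoop"
    elif [ -d "$prefix/conf" ]; then
        echo "$prefix/conf"
    else
        echo "no configuration directory under $prefix" >&2
        return 1
    fi
}
```

Call it with the directory where the tarball was unpacked, for example find_conf_dir . from inside that directory.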


<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
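
As an illustration of what a single-node file in etc/hadoop/ might contain, the sketch below writes a minimal core-site.xml into a scratch directory. The function name write_core_site and the value hdfs://localhost:9000 are assumptions for the example, not settings taken from this post; adjust them to your own setup.

```shell
# Write a minimal core-site.xml into the directory given as $1.
# Assumption: hdfs://localhost:9000 is an illustrative NameNode URI only.
write_core_site() {
    conf_dir=$1
    mkdir -p "$conf_dir" || return 1
    cat > "$conf_dir/core-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
}

# Try it in a scratch directory first, not in the real etc/hadoop/.
scratch=$(mktemp -d)
write_core_site "$scratch"
cat "$scratch/core-site.xml"
```

Writing to a scratch directory keeps the example from overwriting a working configuration.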

NameNode and DataNode


In 0.23 this file is not in /conf/, since that directory no longer exists, but it does not appear in the new configuration dir etc/hadoop either.

Following the reference blog, I copied it from the dir below:

$ cp ./share/hadoop/common/templates/conf/ etc/hadoop

Set several environment variables

Add JAVA_HOME to bashrc

$ export JAVA_HOME="$(readlink -f /usr/bin/javac | sed "s:/bin/javac::")"
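
To make the export permanent without adding it again on every run, one option is to append the line only when it is missing. A small sketch; append_once is a helper name I made up.

```shell
# Append a line to a file only if that exact line is not already present.
# grep -x matches whole lines, -F treats the pattern as a fixed string.
append_once() {
    file=$1
    line=$2
    touch "$file" || return 1
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

For example, pass ~/.bashrc as the first argument and the export line above, in single quotes, as the second. Single quotes keep the $(readlink ...) part from being expanded before it is written to the file.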

Change some bogus values

$ export HADOOP_PREFIX="/home/jianfeng/software/hadoop/hadoop-0.23.5"
$ #export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}
$ export HADOOP_CONF_DIR="${HADOOP_PREFIX}/etc/hadoop" 
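
Before going further it can help to check that these variables point at directories that really exist. This is a rough sketch; check_hadoop_env is a hypothetical helper, and it only tests the three variables used in this post.

```shell
# Check that JAVA_HOME, HADOOP_PREFIX and HADOOP_CONF_DIR are set and
# that each one names an existing directory. Returns non-zero otherwise.
check_hadoop_env() {
    status=0
    for var in JAVA_HOME HADOOP_PREFIX HADOOP_CONF_DIR; do
        # Indirect lookup: read the value of the variable named by $var.
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "$var is not set"
            status=1
        elif [ ! -d "$value" ]; then
            echo "$var points to a missing directory: $value"
            status=1
        else
            echo "$var ok: $value"
        fi
    done
    return $status
}
```

Run check_hadoop_env in the same shell where the exports were made.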


Start HDFS

Format the namenode

$ bin/hadoop namenode -format

Run the scripts as below:

sbin/ start namenode
sbin/ start datanode
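
To see whether the two daemons actually came up, the JDK's jps tool lists running Java processes by main class name. The sketch below filters that list; has_daemon is a helper name of my own, and it assumes jps is on the PATH.

```shell
# Read jps output on stdin and succeed if a JVM with the given name is
# listed. jps prints one "<pid> <MainClass>" line per Java process.
has_daemon() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

for d in NameNode DataNode; do
    if jps 2>/dev/null | has_daemon "$d"; then
        echo "$d: running"
    else
        echo "$d: not running"
    fi
done
```

If a daemon shows as not running, its log file is the first place to look.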

Check localhost

Browse the web interface for the NameNode and the JobTracker; by default they are available at:

NameNode - http://localhost:50070/
JobTracker - http://localhost:50030/

At this point the NameNode page should load successfully, but not yet the JobTracker.
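
The same check can be done from the command line with curl. A minimal sketch, assuming curl is installed; ui_status is a hypothetical helper and the 5 second timeout is arbitrary.

```shell
# Print "up" if the URL answers with a successful HTTP status, else "down".
# -s silences progress output, -f makes curl fail on HTTP error codes.
ui_status() {
    if curl -sf --max-time 5 -o /dev/null "$1" 2>/dev/null; then
        echo up
    else
        echo down
    fi
}

echo "NameNode UI: $(ui_status http://localhost:50070/)"
```

With only the NameNode and DataNode started, I would expect the NameNode line to say "up".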

Published: November 29 2012

