Service as a Job: HDFS NameNode

Being able to schedule an HDFS DataNode is powerful on its own, but an operational HDFS instance also requires a NameNode. Here is an example of how a NameNode can be scheduled and, together with scheduled DataNodes, used to create an HDFS instance.

From here, HDFS instances can be created dynamically on shared resources, workflows can be built to manage, grow, and shrink them, and multiple instances can be deployed on a single set of resources.

The control script is based on hdfs_datanode.sh. It starts the NameNode, discovers its IPC and HTTP endpoints, and chirps them back into the job's ad.

hdfs_namenode.sh

#!/bin/sh -x

# condor_chirp in /usr/libexec/condor
export PATH=$PATH:/usr/libexec/condor

HADOOP_TARBALL=$1

# Note: bin/hadoop uses JAVA_HOME to find the runtime and tools.jar,
#       except tools.jar does not seem necessary therefore /usr works
#       (there's no /usr/lib/tools.jar, but there is /usr/bin/java)
export JAVA_HOME=/usr

# When we get SIGTERM, which Condor will send when
# we are kicked, kill off the namenode
term() {
   ./bin/hadoop-daemon.sh stop namenode
}

# Unpack
tar xzfv $HADOOP_TARBALL

# Move into tarball, inefficiently
cd $(tar tzf $HADOOP_TARBALL | head -n1)

# Configure,
#  . fs.default.name, dfs.http.address must be set to port 0 (ephemeral)
#  . dfs.name.dir must be in _CONDOR_SCRATCH_DIR for cleanup
# FIX: Figure out why a hostname is required, instead of 0.0.0.0:0 for
#      fs.default.name
cat > conf/hdfs-site.xml <<EOF
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://$HOSTNAME:0</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>$_CONDOR_SCRATCH_DIR/name</value>
  </property>
  <property>
    <name>dfs.http.address</name>
    <value>0.0.0.0:0</value>
  </property>
</configuration>
EOF

# Try to shut down cleanly
trap term SIGTERM

export HADOOP_CONF_DIR=$PWD/conf
export HADOOP_LOG_DIR=$_CONDOR_SCRATCH_DIR/logs
export HADOOP_PID_DIR=$PWD

./bin/hadoop namenode -format
./bin/hadoop-daemon.sh start namenode

# Wait for pid file
PID_FILE=$(echo hadoop-*-namenode.pid)
while [ ! -s $PID_FILE ]; do sleep 1; done
PID=$(cat $PID_FILE)

# Wait for the log
LOG_FILE=$(echo $HADOOP_LOG_DIR/hadoop-*-namenode-*.log)
while [ ! -s $LOG_FILE ]; do sleep 1; done

# It would be nice if there were a way to get these without grepping logs
while [ ! $(grep "IPC Server listener on" $LOG_FILE) ]; do sleep 1; done
IPC_PORT=$(grep "IPC Server listener on" $LOG_FILE | sed 's/.* on \(.*\):.*/\1/')
while [ ! $(grep "Jetty bound to port" $LOG_FILE) ]; do sleep 1; done
HTTP_PORT=$(grep "Jetty bound to port" $LOG_FILE | sed 's/.* to port \(.*\)$/\1/')

# Record the port number where everyone can see it
condor_chirp set_job_attr NameNodeIPCAddress \"$HOSTNAME:$IPC_PORT\"
condor_chirp set_job_attr NameNodeHTTPAddress \"$HOSTNAME:$HTTP_PORT\"

# While namenode is running, collect and report back stats
while kill -0 $PID; do
   # Stats could be collected and chirped back into the job ad here,
   # but there is nothing to report for the namenode yet.
   sleep 30
done
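
As an aside on the log grepping above: one way to avoid it would be to ask the kernel which ports the namenode process is listening on, e.g. with netstat. A rough sketch, not part of the script above, and it still leaves telling the IPC port apart from the HTTP port (say, by probing which one answers HTTP):

# list the TCP ports the process $PID is listening on (sketch, not used above)
netstat -lntp 2>/dev/null | \
   awk -v pid="$PID" '$7 ~ ("^" pid "/") { sub(/.*:/, "", $4); print $4 }'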

By this point in the series, the job description file is standard.

hdfs_namenode.job

cmd = hdfs_namenode.sh
args = hadoop-1.0.1-bin.tar.gz

transfer_input_files = hadoop-1.0.1-bin.tar.gz

#output = namenode.$(cluster).out
#error = namenode.$(cluster).err

log = namenode.$(cluster).log

kill_sig = SIGTERM

# Want chirp functionality
+WantIOProxy = TRUE

should_transfer_files = yes
when_to_transfer_output = on_exit

requirements = HasJava =?= TRUE

queue
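
The requirements expression assumes the execute machines advertise a custom HasJava attribute; one way to do that, e.g. in the execute machines' condor_config,

# advertise a custom machine attribute jobs can match against
HasJava = True
STARTD_ATTRS = $(STARTD_ATTRS) HasJava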

In operation:

Submit the namenode and find its endpoints,

$ condor_submit hdfs_namenode.job
Submitting job(s).
1 job(s) submitted to cluster 208.

$ condor_q -long 208 | grep NameNode
NameNodeHTTPAddress = "eeyore.local:60182"
NameNodeIPCAddress = "eeyore.local:38174"
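
For scripting, an address can also be pulled straight out of the job ad, e.g.

$ condor_q -format "%s\n" NameNodeIPCAddress 208
eeyore.local:38174

which is handy for feeding the -a NameNodeAddress=... submissions below.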

Open a browser window to http://eeyore.local:60182 to find the cluster summary,

1 files and directories, 0 blocks = 1 total. Heap Size is 44.81 MB / 888.94 MB (5%)
  Configured Capacity                   :        0 KB
  DFS Used                              :        0 KB
  Non DFS Used                          :        0 KB
  DFS Remaining                         :        0 KB
  DFS Used%                             :       100 %
  DFS Remaining%                        :         0 %
 Live Nodes                             :           0
 Dead Nodes                             :           0
 Decommissioning Nodes                  :           0
  Number of Under-Replicated Blocks     :           0

Add a datanode,

$ condor_submit -a NameNodeAddress=hdfs://eeyore.local:38174 hdfs_datanode.job 
Submitting job(s).
1 job(s) submitted to cluster 209.

Refresh the cluster summary,

1 files and directories, 0 blocks = 1 total. Heap Size is 44.81 MB / 888.94 MB (5%)
  Configured Capacity                   :     9.84 GB
  DFS Used                              :       28 KB
  Non DFS Used                          :     9.54 GB
  DFS Remaining                         :   309.79 MB
  DFS Used%                             :         0 %
  DFS Remaining%                        :      3.07 %
 Live Nodes                             :           1
 Dead Nodes                             :           0
 Decommissioning Nodes                  :           0
  Number of Under-Replicated Blocks     :           0

And another,

$ condor_submit -a NameNodeAddress=hdfs://eeyore.local:38174 hdfs_datanode.job
Submitting job(s).
1 job(s) submitted to cluster 210.

Refresh,

1 files and directories, 0 blocks = 1 total. Heap Size is 44.81 MB / 888.94 MB (5%)
  Configured Capacity                   :    19.69 GB
  DFS Used                              :       56 KB
  Non DFS Used                          :    19.26 GB
  DFS Remaining                         :   435.51 MB
  DFS Used%                             :         0 %
  DFS Remaining%                        :      2.16 %
 Live Nodes                             :           2
 Dead Nodes                             :           0
 Decommissioning Nodes                  :           0
  Number of Under-Replicated Blocks     :           0

Those are all the building blocks necessary to run HDFS on scheduled resources.
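
Tearing an instance down is just removing the jobs; kill_sig = SIGTERM plus the trap in the control script gives the daemons a chance to shut down cleanly, e.g.

$ condor_rm 210 209   # the datanodes
$ condor_rm 208       # the namenode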
