Archive for December, 2011

Amazon S3 – Object Expiration, what about Instance Expiration

December 28, 2011

AWS is providing APIs that take distributed computing concerns into account. One could call them cloud concerns these days. Unfortunately, not all cloud providers are doing the same.

Idempotent instance creation showed up in September 2010, simplifying interactions with EC2: a launch request can be retried without creating duplicate instances. Idempotent resource allocation is critical for distributed systems.

S3 object expiration appeared in Dec 2011, allowing for service-side managed deallocation of S3 resources.

Next up? It would be great to have an EC2 instance expiration feature. One that could be (0) assigned per instance and (1) adjusted while the instance exists. Bonus if it can also be (2) adjusted from within the instance without credentials. Think leases.

New toy: newpgid

December 13, 2011

Useful with cpusoak and memsoak,

newpgid.c

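#include <stdio.h>
#include <unistd.h>

int
main(int argc, char *argv[])
{
  if (argc < 2) {
    fprintf(stderr, "usage: %s command [args...]\n", argv[0]);
    return 2;
  }

  /* Become the leader of a new process group. */
  setpgid(0, 0);

  execvp(argv[1], &(argv[1]));

  /* execvp only returns on failure. */
  perror(argv[1]);
  return 1;
}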

Use it when you want to start a new process in its own process group, so the whole group is easy to kill.

If you have coreutils 7.0+, you can take advantage of timeout, which happens to call setpgid.
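
As a quick illustration of the timeout route, here is a minimal sketch. It assumes GNU coreutils timeout is on the PATH; the one-second limit and the sleep command are placeholders for a real workload such as cpusoak.

```shell
#!/bin/sh
# Placeholder workload: a sleep that would outlast the limit.
# timeout runs the command and sends it SIGTERM once the limit passes.
timeout 1 sleep 30
status=$?

# GNU timeout exits with status 124 when the command timed out.
echo "exit status: $status"
```

Checking for 124 is a simple way to tell a timed-out run apart from a command that failed on its own.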

Service as a Job: Memcached

December 5, 2011

Running services such as Tomcat or Qpidd shows how to schedule and manage a service’s life-cycle via Condor. It is also possible to gather and centralize statistics about a service as it runs. Here is an example of how, using memcached.

As with tomcat and qpidd, there is a control script and a job description.

New in the control script for memcached is a loop that monitors the server and chirps back statistics.

memcached.sh

#!/bin/bash

# condor_chirp in /usr/libexec/condor
export PATH=$PATH:/usr/libexec/condor

PORT_FILE=$TMP/.ports

# When we get SIGTERM, which Condor will send when
# we are kicked, kill off memcached.
function term {
   rm -f $PORT_FILE
   kill %1
}

# Spawn memcached, and make sure we can shut it down cleanly.
trap term SIGTERM
# memcached will write port information to env(MEMCACHED_PORT_FILENAME)
env MEMCACHED_PORT_FILENAME=$PORT_FILE memcached -p -1 "$@" &

# We might have to wait for the port
while [ ! -s $PORT_FILE ]; do sleep 1; done

# The port file's format is:
#  TCP INET: 56697
#  TCP INET6: 47318
#  UDP INET: 34453
#  UDP INET6: 54891
sed -i -e 's/ /_/' -e 's/\(.*\): \(.*\)/\1=\2/' $PORT_FILE
source $PORT_FILE
rm -f $PORT_FILE

# Record the port number where everyone can see it
condor_chirp set_job_attr MemcachedEndpoint \"$HOSTNAME:$TCP_INET\"
condor_chirp set_job_attr TCP_INET $TCP_INET
condor_chirp set_job_attr TCP_INET6 $TCP_INET6
condor_chirp set_job_attr UDP_INET $UDP_INET
condor_chirp set_job_attr UDP_INET6 $UDP_INET6

# While memcached is running, collect and report back stats
while kill -0 %1; do
   # Collect stats and chirp them back into the job ad
   echo stats | nc localhost $TCP_INET | \
    grep -v -e END -e version | tr -d '\r' | \
     awk '{print "stat_"$2,$3}' | \
      while read -r stat; do
         condor_chirp set_job_attr $stat
      done
   sleep 30
done
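
The port-file rewrite in the script can be tried on its own. The sketch below is only an illustration: it assumes GNU sed (for -i) and uses a throwaway temp file filled with the sample values from the comment above, rather than a file written by a live memcached.

```shell
#!/bin/sh
# Throwaway stand-in for the port file memcached would write.
PORT_FILE=$(mktemp)
cat > "$PORT_FILE" <<'EOF'
TCP INET: 56697
TCP INET6: 47318
UDP INET: 34453
UDP INET6: 54891
EOF

# First expression: replace the first space with "_" (TCP INET -> TCP_INET).
# Second expression: turn "NAME: VALUE" into "NAME=VALUE".
sed -i -e 's/ /_/' -e 's/\(.*\): \(.*\)/\1=\2/' "$PORT_FILE"

# The file is now a list of shell assignments, so it can be sourced.
. "$PORT_FILE"
rm -f "$PORT_FILE"

echo "$TCP_INET $TCP_INET6 $UDP_INET $UDP_INET6"
# prints: 56697 47318 34453 54891
```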

A refresher about chirp. Jobs are stored in condor_schedd processes. They are described using the ClassAd language, a set of extensible name-value pairs. chirp is a tool a job can use while it runs to modify its own ClassAd stored in the schedd.
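
To make the stats-to-ClassAd step concrete, here is a hedged sketch of how one line of memcached stats output becomes one set_job_attr call. So that it runs without Condor, condor_chirp is replaced by a hypothetical shell function that only prints the command; report_stats is likewise a name invented for this sketch, and the sample values are made up. This version deletes the trailing carriage return with tr -d.

```shell
#!/bin/sh
# Hypothetical stand-in for the real condor_chirp, so the sketch runs
# without Condor: it only prints the command that would be issued.
condor_chirp() {
   echo "condor_chirp $*"
}

# Turn memcached "stats" output on stdin into set_job_attr calls.
report_stats() {
   grep -v -e END -e version | tr -d '\r' | \
    awk '{print "stat_"$2,$3}' | \
     while read -r stat; do
        # $stat is unquoted on purpose: it splits into name and value.
        condor_chirp set_job_attr $stat
     done
}

# Sample input in the shape memcached returns: "STAT name value" lines
# ending in \r\n, then an END marker. The values are made up.
out=$(printf 'STAT total_items 480\r\nSTAT bytes 47740\r\nEND\r\n' | report_stats)
echo "$out"
```

Each printed line corresponds to one attribute, such as stat_total_items, that would end up in the job ad and become visible through condor_q.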

The job description, passed to condor_submit, is vanilla except for how arguments are passed to memcached.sh. The $$() substitution (see man condor_submit) lets memcached use as much memory as is available on the slot where it gets scheduled. Slots may have different amounts of Memory available.

memcached.job

cmd = memcached.sh
args = -m $$(Memory)

log = memcached.log

kill_sig = SIGTERM

# Want chirp functionality
+WantIOProxy = TRUE

should_transfer_files = if_needed
when_to_transfer_output = on_exit

queue

An example follows. Note that the set of memcached servers to use is generated from condor_q,

$ condor_submit -a "queue 4" memcached.job
Submitting job(s)....
4 job(s) submitted to cluster 80.

$ condor_q -format "%s\t" MemcachedEndpoint -format "total_items: %d\t" stat_total_items -format "memory: %d/" stat_bytes -format "%d\n" stat_limit_maxbytes
eeyore.local:50608	total_items: 0	memory: 0/985661440
eeyore.local:47766	total_items: 0	memory: 0/985661440
eeyore.local:39130	total_items: 0	memory: 0/985661440
eeyore.local:57410	total_items: 0	memory: 0/985661440

$ SERVERS=$(condor_q -format "%s," MemcachedEndpoint); for word in $(cat words); do echo $word > $word; memcp --servers=$SERVERS $word; \rm $word; done &
[1] 959

$ condor_q -format "%s\t" MemcachedEndpoint -format "total_items: %d\t" stat_total_items -format "memory: %d/" stat_bytes -format "%d\n" stat_limit_maxbytes
eeyore.local:50608	total_items: 480	memory: 47740/985661440
eeyore.local:47766	total_items: 446	memory: 44284/985661440
eeyore.local:39130	total_items: 504	memory: 50140/985661440
eeyore.local:57410	total_items: 490	memory: 48632/985661440

$ condor_q -format "%s\t" MemcachedEndpoint -format "total_items: %d\t" stat_total_items -format "memory: %d/" stat_bytes -format "%d\n" stat_limit_maxbytes
eeyore.local:50608	total_items: 1926	memory: 191264/985661440
eeyore.local:47766	total_items: 1980	memory: 196624/985661440
eeyore.local:39130	total_items: 2059	memory: 204847/985661440
eeyore.local:57410	total_items: 2053	memory: 203885/985661440

$ condor_q -format "%s\t" MemcachedEndpoint -format "total_items: %d\t" stat_total_items -format "memory: %d/" stat_bytes -format "%d\n" stat_limit_maxbytes
eeyore.local:50608	total_items: 3408	memory: 338522/985661440
eeyore.local:47766	total_items: 3542	memory: 351784/985661440
eeyore.local:39130	total_items: 3666	memory: 364552/985661440
eeyore.local:57410	total_items: 3600	memory: 357546/985661440

[1]  + done       for word in $(cat words); do; echo $word > $word; memcp --servers=$SERVERS ; 

Enjoy.

