Archive for January, 2012

Pool utilization

January 31, 2012

Here is a utilization script for a Condor pool.

$ ./utilization.sh
       Unavailable Available    Total     Used:  Avail   Total
Slots         5968      5451    11419     4179  76.66%  36.59%
Cpus          6314      5903    12217     4631  78.45%  37.90%
Memory    14277325  11776800 26054125  9908190  84.13%  38.02%

And, if you know your workload will not run on slots with less than 1GB of memory, you can filter out slots that are too small:

$ ./utilization.sh 'Memory < 1024'
       Unavailable Available    Total     Used:  Avail   Total
Slots         6292      5127    11419     4177  81.47%  36.57%
Cpus          6638      5579    12217     4629  82.97%  37.88%
Memory    14592711  11461414 26054125  9904193  86.41%  38.01%
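
How utilization.sh derives its columns is not shown here, but the figures in both tables are consistent with Total = Unavailable + Available, Avail% = Used / Available and Total% = Used / Total, with percentages truncated (not rounded) to two decimals. A minimal Python sketch under that assumption, using the Slots row of the first table (the function name is hypothetical):

```python
def percent(used: int, capacity: int) -> str:
    """Return used/capacity as a percentage string truncated to two decimals."""
    if capacity <= 0:
        return "n/a"
    # Integer arithmetic: hundredths of a percent, truncated toward zero.
    hundredths = used * 10000 // capacity
    return f"{hundredths // 100}.{hundredths % 100:02d}%"


# Slots row from the first table above.
unavailable, available, used = 5968, 5451, 4179
total = unavailable + available

avail_pct = percent(used, available)  # share of available slots in use
total_pct = percent(used, total)      # share of all slots in use

print(total, avail_pct, total_pct)  # prints: 11419 76.66% 36.59%
```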

Remember, if an attribute is not on all slots you need to use the meta-comparison operators: =?= and =!=, e.g. 'MyCustomAttr =!= True'.
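
To see why the meta-comparison operators matter, here is a rough Python model of the ClassAd behaviour: an ordinary comparison against a missing attribute evaluates to UNDEFINED, which a constraint does not treat as a match, while =?= and =!= always produce True or False. This is only an illustration, not Condor code; UNDEFINED, the helper names and the two example slots are made up for the sketch.

```python
UNDEFINED = object()  # stand-in for the ClassAd UNDEFINED value


def ne(a, b):
    """Model of != : the result is UNDEFINED if either operand is UNDEFINED."""
    if a is UNDEFINED or b is UNDEFINED:
        return UNDEFINED
    return a != b


def meta_eq(a, b):
    """Model of =?= : always True or False, even with UNDEFINED operands."""
    if a is UNDEFINED or b is UNDEFINED:
        return a is b
    # =?= means "is identical to": same type and same value.
    return type(a) is type(b) and a == b


def meta_ne(a, b):
    """Model of =!= : the negation of =?=."""
    return not meta_eq(a, b)


def matches(result) -> bool:
    """A constraint only selects a slot when it evaluates to True."""
    return result is True


# Two hypothetical slots: only the first advertises MyCustomAttr.
slot_with = {"MyCustomAttr": True}
slot_without = {}

for slot in (slot_with, slot_without):
    attr = slot.get("MyCustomAttr", UNDEFINED)
    print(matches(ne(attr, True)), matches(meta_ne(attr, True)))
```

With plain != neither slot is selected; with =!= the slot that lacks the attribute is selected, which is usually the intent.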

EC2, VNC and Fedora

January 24, 2012

If you have ever wondered about running a desktop session in EC2, here is one way to set it up and some pointers.

First, start an instance; my preferred way is via Condor. I used ami-60bd4609 on an m1.small, which provides a basic Fedora 15 server. Make sure the instance’s security group has port 22 (ssh) open.

Second, install a desktop environment, e.g. yum groupinstall 'GNOME Desktop Environment'. This is 467 packages and will take about 18 minutes.

Third, install and set up a VNC server: yum install vnc-server ; vncpasswd ; vncserver :1. This produces a running desktop that a vncviewer can connect to.

Finally, connect via an SSH-secured VNC session.

VNC_VIA_CMD='/usr/bin/ssh -i KEYPAIR.pem -l ec2-user -f -L "$L":"$H":"$R" "$G" sleep 20' vncviewer localhost:1 -via INSTANCE_ADDRESS

What’s going on here? vncviewer allows for a proxy host when connecting to the vncserver. That is the -via argument. The VNC_VIA_CMD is an environment variable that specifies the command used to connect to the proxy. Here it is modified to provide the keypair needed to access the instance, and the user ec2-user, which is the default user on Fedora AMIs. The INSTANCE_ADDRESS is the Hostname from condor_ec2_q.
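
To make the placeholders concrete: vncviewer fills in $L (local port), $H (target host as seen from the gateway), $R (remote VNC port) and $G (the -via gateway) before running the command, and display :1 corresponds to TCP port 5901 (5900 plus the display number). The Python sketch below only imitates that substitution so the resulting ssh tunnel is visible. The function name is hypothetical, and the local port 5599 is an arbitrary stand-in for whatever free port vncviewer picks.

```python
from string import Template

VNC_BASE_PORT = 5900  # VNC display :N listens on TCP port 5900 + N


def expand_via_cmd(template: str, target: str, gateway: str, local_port: int) -> str:
    """Expand $L, $H, $R and $G in a VNC_VIA_CMD-style template."""
    host, display = target.rsplit(":", 1)
    values = {
        "L": str(local_port),                    # local end of the tunnel
        "H": host,                               # VNC host, relative to the gateway
        "R": str(VNC_BASE_PORT + int(display)),  # remote VNC port
        "G": gateway,                            # host given to -via
    }
    return Template(template).substitute(values)


via_cmd = ('/usr/bin/ssh -i KEYPAIR.pem -l ec2-user -f '
           '-L "$L":"$H":"$R" "$G" sleep 20')

# local_port=5599 is an assumption for illustration only.
expanded = expand_via_cmd(via_cmd, "localhost:1", "INSTANCE_ADDRESS", 5599)
print(expanded)
```

The expanded command forwards a local port to port 5901 on the instance, and vncviewer then connects to that local port. The sleep 20 keeps the tunnel open long enough for the viewer to attach.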

Alternatively, run ssh-add KEYPAIR.pem followed by vncviewer localhost:1 -via ec2-user@INSTANCE_ADDRESS. However, be careful if you have many keys stored in your ssh-agent. They will all be tried, and the remote sshd may reject your connection before the proper keypair is found.

Tips:

  • It takes about 20 minutes from start to vncviewer. Once the instance is set up, consider creating your own AMI.
  • Set a password for ec2-user, otherwise the screensaver will lock you out. Use sudo passwd ec2-user.
  • Remember AWS charges for data transmitted out of the instance, as well as the uptime of the instance, see EC2 Pricing. You will want to figure out how much bandwidth your workflow takes on average to figure out total cost. For me, a half hour of browsing Planet Fedora, editing with emacs, and compiling some code transmitted about 60MB of data. That measurement is the difference in eth0’s “TX bytes” as reported by ifconfig. This is not a perfect estimate because there may have been data transferred within EC2, which is not charged.
  • For transmit rates, consider running bwm-ng to see what actions use the most bandwidth.
  • Generally, make the screen update as little as possible. Constantly changing graphics on web pages can run 60-120KB/s. Compare that to a text console and emacs producing a TX rate closer to 5-25KB/s.
  • Cover consoles that are running compilations, or compile in a low verbosity mode.
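
The bandwidth estimate from the tips above can be sketched in a few lines of Python: take the difference between two “TX bytes” readings, divide by the elapsed time, and scale by a per-GB price. The price used here is a made-up placeholder, not an actual AWS rate, and the function names are hypothetical. The 60MB per half hour figure is the one measured above, treated as 60 * 10^6 bytes.

```python
def average_tx_rate(tx_bytes_start: int, tx_bytes_end: int, seconds: float) -> float:
    """Average transmit rate in bytes per second between two TX-bytes readings."""
    return (tx_bytes_end - tx_bytes_start) / seconds


def transfer_cost(bytes_out: float, price_per_gb: float) -> float:
    """Cost of sending bytes_out bytes at price_per_gb (1 GB taken as 10**9 bytes)."""
    return bytes_out / 1e9 * price_per_gb


PRICE_PER_GB = 0.12  # hypothetical placeholder; check EC2 Pricing for real rates

# About 60MB transmitted during a half hour session, as measured above.
session_bytes = 60 * 10**6
rate = average_tx_rate(0, session_bytes, 30 * 60)

hourly_bytes = rate * 3600
hourly_cost = transfer_cost(hourly_bytes, PRICE_PER_GB)

print(f"{rate / 1000:.1f} KB/s average, {hourly_bytes / 1e6:.0f} MB/hour, "
      f"${hourly_cost:.4f}/hour at the placeholder price")
```

As noted above, this overestimates the bill if some of the traffic stayed inside EC2.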

Manage inventory with Wallaby

January 16, 2012

Wallaby will manage your configuration, as well as an inventory of your machines. It can differentiate between machines that are expected to be present and those that opportunistically appear.

Build the roster with wallaby add-node

$ wallaby add-node node0.local node1.local node2.local
Adding the following node: node0.local
Console Connection Established...
Adding the following node: node1.local
Adding the following node: node2.local
$ for i in $(seq 3 10); do wallaby add-node node$i.local; done
Adding the following node: node3.local
Console Connection Established...
Adding the following node: node4.local
Console Connection Established...
...

List expected nodes (provisioned) –

$ wallaby inventory
Console Connection Established...
P        Node name                 Last checkin
-        ---------                 ------------
+      node0.local Wed Jan 11 07:32:33 -0500 20
+      node1.local Thu Jan 05 12:15:00 -0500 20
+     node10.local Wed Jan 11 07:31:56 -0500 20
+      node2.local Wed Jan 11 07:31:56 -0500 20
+      node3.local Wed Jan 11 07:15:21 -0500 20
+      node4.local Wed Jan 11 07:31:42 -0500 20
+      node5.local Wed Jan 11 07:16:47 -0500 20
+      node6.local                        never
+      node7.local Wed Jan 11 07:32:33 -0500 20
+      node8.local Wed Jan 11 07:32:33 -0500 20
+      node9.local Wed Jan 11 07:30:47 -0500 20
-      robin.local Thu Dec 15 14:11:35 -0500 20
-      woods.local Tue Jan 10 20:33:47 -0500 20

List opportunistic, bonus nodes (unprovisioned) –

$ wallaby inventory -o unprovisioned
Console Connection Established...
P        Node name                 Last checkin
-        ---------                 ------------
-      robin.local Thu Dec 15 14:11:35 -0500 20
-      woods.local Tue Jan 10 20:33:47 -0500 20

Provisioned nodes that have never checked in, maybe setup failed –

$ wallaby inventory -c 'last_checkin == 0 && provisioned'
Console Connection Established...
P        Node name                 Last checkin
-        ---------                 ------------
+      node6.local                        never

Provisioned nodes that have not checked in for the past 4 hours, maybe machine is down –

$ wallaby inventory -c 'last_checkin > 0 && last_checkin < 4.hours_ago && provisioned'
Console Connection Established...
P        Node name                 Last checkin
-        ---------                 ------------
+      node1.local Thu Jan 05 12:15:00 -0500 20

Unprovisioned nodes that have not checked in for 48 hours, candidates for wallaby remove-node –

$ wallaby inventory -c 'last_checkin < 48.hours_ago && !provisioned'
Console Connection Established...
P        Node name                 Last checkin
-        ---------                 ------------
-      robin.local Thu Dec 15 14:11:35 -0500 20 
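
For readers who want to reason about the three -c constraints above, here is a small Python model of their logic. It is not Wallaby code. It assumes last_checkin is a numeric timestamp with 0 meaning "never" (as the first constraint suggests), models it in seconds, and uses a made-up clock and made-up check-in ages, not the real ones from the listings.

```python
from dataclasses import dataclass

HOUR = 3600


@dataclass
class Node:
    name: str
    last_checkin: int  # seconds since epoch; 0 means never checked in (assumption)
    provisioned: bool


def hours_ago(now: int, hours: int) -> int:
    """Timestamp that lies `hours` before `now`."""
    return now - hours * HOUR


def never_checked_in(nodes):
    # last_checkin == 0 && provisioned
    return [n.name for n in nodes if n.last_checkin == 0 and n.provisioned]


def stale_provisioned(nodes, now, hours=4):
    # last_checkin > 0 && last_checkin < 4.hours_ago && provisioned
    cutoff = hours_ago(now, hours)
    return [n.name for n in nodes
            if 0 < n.last_checkin < cutoff and n.provisioned]


def removal_candidates(nodes, now, hours=48):
    # last_checkin < 48.hours_ago && !provisioned
    cutoff = hours_ago(now, hours)
    return [n.name for n in nodes
            if n.last_checkin < cutoff and not n.provisioned]


NOW = 1_000_000_000  # fixed, hypothetical clock so the example is repeatable
nodes = [
    Node("node0.local", hours_ago(NOW, 1), True),
    Node("node1.local", hours_ago(NOW, 139), True),
    Node("node6.local", 0, True),
    Node("robin.local", hours_ago(NOW, 600), False),
    Node("woods.local", hours_ago(NOW, 11), False),
]

print(never_checked_in(nodes))
print(stale_provisioned(nodes, NOW))
print(removal_candidates(nodes, NOW))
```

In this model the last constraint would also select an unprovisioned node whose last_checkin is 0, since 0 is older than any cutoff.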

Enjoy.

