Posts Tagged ‘Firewall’

Getting Started: Multiple node Condor pool with firewalls

June 21, 2011

Creating a Condor pool with no firewalls up is quite a simple task. Before the condor_shared_port daemon, doing the same with firewalls was a bit painful.

Condor uses dynamic ports for everything except the Collector, whose endpoint is the bootstrap. This means a Schedd might start up on a random ephemeral port, and each of its shadows might as well. This causes headaches for firewalls, as large ranges of ports need to be opened for communication. There are ways to control the ephemeral range used. Unfortunately, doing so only narrowed the port range somewhat, did not guarantee that Condor was the one using those ports, and could limit scale.

The condor_shared_port daemon allows Condor to use a single inbound port on a machine.

Again, I am using Fedora 15. I had no luck with firewalld and firewall-cmd, so I fell back to plain iptables.

The first thing to do is pick a port for Condor to use on your machines. The simplest thing to do is pick 9618, the port typically known as the Collector’s port.

On all machines where Condor is going to run, you want to –

# lokkit --enabled

# service iptables start
Starting iptables (via systemctl):  [  OK  ]

# service iptables status
Table: filter
Chain INPUT (policy ACCEPT)
num  target     prot opt source               destination
1    ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           state RELATED,ESTABLISHED
2    ACCEPT     icmp --  0.0.0.0/0            0.0.0.0/0
3    ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
4    REJECT     all  --  0.0.0.0/0            0.0.0.0/0 reject-with icmp-host-prohibited

Chain FORWARD (policy ACCEPT)
num  target     prot opt source               destination
1    REJECT     all  --  0.0.0.0/0            0.0.0.0/0 reject-with icmp-host-prohibited

Chain OUTPUT (policy ACCEPT)
num  target     prot opt source               destination

If you want to ssh to the machine again, be sure to insert rules above the “REJECT all -- …” rule, which is rule 4 of the INPUT chain in the listing above –

# iptables -I INPUT 4 -p tcp -m tcp --dport 22 -j ACCEPT

And open a port, both TCP and UDP, for the shared port daemon –

# iptables -I INPUT 5 -p tcp -m tcp --dport condor -j ACCEPT
# iptables -I INPUT 6 -p udp -m udp --dport condor -j ACCEPT
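
The rules above name the port by its service name, condor, rather than by number. A minimal sketch for checking what that name maps to on a machine, assuming service names are resolved from a services(5)-format file at /etc/services (other lookup sources are possible). The helper name service_port is made up for this example.

```shell
# service_port NAME PROTO [FILE]
# Print the port number listed for NAME/PROTO in a services(5)-format file.
# FILE defaults to /etc/services (an assumption about where names are resolved).
service_port() {
    awk -v name="$1" -v proto="$2" '
        $1 == name && $2 ~ ("/" proto "$") {
            split($2, parts, "/")   # "9618/tcp" -> parts[1] = "9618"
            print parts[1]
            exit
        }
    ' "${3:-/etc/services}"
}

# Example: show the TCP and UDP ports behind the name "condor".
if [ -r /etc/services ]; then
    service_port condor tcp
    service_port condor udp
fi
```

If nothing is printed, the name is not listed in that file, and using the number directly (--dport 9618) may be the safer choice.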

Next you want to configure Condor to use the shared port daemon, with port 9618 –

# cat > /etc/condor/config.d/41shared_port.config
SHARED_PORT_ARGS = -p 9618
DAEMON_LIST = $(DAEMON_LIST), SHARED_PORT
COLLECTOR_HOST = $(CONDOR_HOST)?sock=collector
USE_SHARED_PORT = TRUE
^D

In order, SHARED_PORT_ARGS tells the shared port daemon to listen on port 9618, DAEMON_LIST tells the master to start the shared port daemon, COLLECTOR_HOST specifies that the collector will be on the sock named “collector”, and finally USE_SHARED_PORT tells all daemons to register and use the shared port daemon.

After you put that configuration on all your systems, run service condor restart, and go.

You will have the shared port daemon listening on 9618 (condor), and all communication between machines will go through it.

# lsof -i | grep $(pidof condor_shared_port)
condor_sh 31040  condor    8u  IPv4  74105      0t0  TCP *:condor (LISTEN)
condor_sh 31040  condor    9u  IPv4  74106      0t0  UDP *:condor

That’s right, you have a condor pool with firewalls and a single port opened for communication on each node.


Firewalling execute nodes: Avoid LOWPORT/HIGHPORT, use IN_LOWPORT/IN_HIGHPORT

August 8, 2010

We all know why firewalls are set up. Typical firewall configurations minimize inbound connections and allow unrestricted outbound connections.

Condor primarily uses ephemeral ports for inbound connections. To assist configuration with firewalls, it has long provided LOWPORT and HIGHPORT configuration options to constrain the port range it uses. Going beyond port range management, Condor has grown to include the Condor Connection Broker (CCB), to reverse connections when components are entirely hidden by firewalls, and condor_shared_port, to reduce the inbound port footprint on a machine to one.

Unfortunately, there is a disconnect between the typical firewall configuration and what the LOWPORT/HIGHPORT configuration expresses. LOWPORT/HIGHPORT constrains both inbound and outbound port usage.

On an execute node, Condor will run a condor_master, a condor_startd and a few condor_starter processes, one per job. All must be able to accept connections. For a node that can run 4 jobs, the minimum number of inbound ports open in the node’s firewall is 6, one for each of the 6 potential processes. However, each of those processes will use more than one port during its lifetime. In fact, a process may have 3 open connections at some point. Using LOWPORT/HIGHPORT, that means setting a range that is 3 times wider than the inbound ports alone require, 18 ports instead of 6. It is possible to shrink the range, because the processes will not all use all 3 connections at once, until they do. Going low is fragile.

Luckily, Condor provides IN_LOWPORT/IN_HIGHPORT and OUT_LOWPORT/OUT_HIGHPORT. For a typical firewall configuration, ignore the OUT_’s and use the IN_’s, e.g. IN_LOWPORT = 10000, IN_HIGHPORT = 10005. You will be much happier.
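
The arithmetic above can be sketched as a small shell helper. It assumes one listening port per process (one condor_master, one condor_startd, and one condor_starter per slot) and the factor of 3 connections per process mentioned above. The function names are hypothetical and not part of Condor.

```shell
# inbound_ports SLOTS
# Inbound ports needed: master + startd + one starter per slot.
inbound_ports() {
    echo $(( $1 + 2 ))
}

# combined_ports SLOTS
# Width of a combined LOWPORT/HIGHPORT range if every process
# may hold up to 3 connections at once (the factor used in the text).
combined_ports() {
    echo $(( 3 * ($1 + 2) ))
}

# in_port_range LOW SLOTS
# Print IN_LOWPORT/IN_HIGHPORT settings covering exactly the inbound ports.
in_port_range() {
    echo "IN_LOWPORT = $1"
    echo "IN_HIGHPORT = $(( $1 + $(inbound_ports "$2") - 1 ))"
}

# Example: a node with 4 slots, starting at port 10000.
in_port_range 10000 4
```

For 4 slots this gives the 10000 to 10005 range used in the example above, against 18 ports for a combined range.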

Port usage when running 4 jobs with the configuration,

ALL_DEBUG = D_NETWORK
IN_LOWPORT = 10000
IN_HIGHPORT = 10015
OUT_LOWPORT = 20000
OUT_HIGHPORT = 20015

Looks like,

MasterLog:08/08/10 09:25:10 Sock::bindWithin - bound to 10012...
MasterLog:08/08/10 09:25:13 Sock::bindWithin - bound to 10000...
MasterLog:08/08/10 09:25:18 Sock::bindWithin - bound to 20009...
StartLog:08/08/10 09:25:24 Sock::bindWithin - bound to 20015...
StartLog:08/08/10 09:25:24 Sock::bindWithin - bound to 20007...
StartLog:08/08/10 09:25:28 Sock::bindWithin - bound to 20003...
StartLog:08/08/10 09:25:34 Sock::bindWithin - bound to 10007...
StartLog:08/08/10 09:25:34 Sock::bindWithin - bound to 10013...
StartLog:08/08/10 09:25:34 Sock::bindWithin - bound to 10006...
StartLog:08/08/10 09:25:34 Sock::bindWithin - bound to 10015...
StarterLog.slot1:08/08/10 09:25:34 Sock::bindWithin - bound to 20007...
StarterLog.slot1:08/08/10 09:25:34 Sock::bindWithin - bound to 20008...
StarterLog.slot2:08/08/10 09:25:34 Sock::bindWithin - bound to 20003...
StarterLog.slot2:08/08/10 09:25:34 Sock::bindWithin - bound to 20013...
StarterLog.slot3:08/08/10 09:25:34 Sock::bindWithin - bound to 20011...
StarterLog.slot3:08/08/10 09:25:34 Sock::bindWithin - bound to 20012...
StarterLog.slot4:08/08/10 09:25:34 Sock::bindWithin - bound to 20013...
StarterLog.slot4:08/08/10 09:25:34 Sock::bindWithin - bound to 20004...

That’s 6 inbound ports and 12 outbound ports, with a few of the outbound ports (20003, 20007 and 20013) reused.
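
The tally above can be reproduced from the logs with a short awk filter. This is a sketch that assumes the “Sock::bindWithin - bound to N...” line format shown above and counts one bind per line; count_binds is a made-up helper name, and the ranges are passed in by hand rather than read from the Condor configuration.

```shell
# count_binds IN_LOW IN_HIGH OUT_LOW OUT_HIGH [FILE...]
# Classify each "Sock::bindWithin - bound to N..." line by port range and
# print "inbound=<n> outbound=<n> other=<n>". Reads stdin if no FILE is given.
count_binds() {
    il=$1 ih=$2 ol=$3 oh=$4
    shift 4
    awk -v il="$il" -v ih="$ih" -v ol="$ol" -v oh="$oh" '
        /Sock::bindWithin - bound to / {
            port = $NF                  # last field, e.g. "10012..."
            sub(/\.+$/, "", port)       # drop the trailing dots
            port += 0                   # force a numeric comparison
            if (port >= il && port <= ih)      in_n++
            else if (port >= ol && port <= oh) out_n++
            else                               other++
        }
        END { printf "inbound=%d outbound=%d other=%d\n", in_n, out_n, other }
    ' "$@"
}

# Example with the first three log lines from above.
count_binds 10000 10015 20000 20015 <<'EOF'
MasterLog:08/08/10 09:25:10 Sock::bindWithin - bound to 10012...
MasterLog:08/08/10 09:25:13 Sock::bindWithin - bound to 10000...
MasterLog:08/08/10 09:25:18 Sock::bindWithin - bound to 20009...
EOF
# prints: inbound=2 outbound=1 other=0
```

Reused ports are counted once per bind, so the totals match the number of log lines rather than the number of distinct ports.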

