Running Distributed Experiments on the DAS

In the previous tutorials we have devised experiments that spawn multiple instances on a single computer. In this tutorial, we will show how to run an experiment on the DAS compute cluster. The DAS is a Dutch nation-wide compute infrastructure that consists of multiple clusters, managed by different universities. This tutorial assumes that the reader has access to the DAS head nodes.

The experiment we will run on the DAS is the same experiment that we described in the previous tutorial. In this experiment, we spawn multiple (synchronized) instances, and each instance writes its ID to a file five seconds after the experiment starts. The experiment is started on the DAS head node. Before the instances spawn, Gumby automatically reserves a number of compute nodes. Each compute node then spawns a number of instances, depending on the experiment configuration. When the experiment ends, all data generated by the instances is collected by the head node.

The configuration file for this DAS experiment looks as follows:

experiment_name = synchronized_instances_das5
instances_to_run = 16
local_instance_cmd = das_reserve_and_run.sh
post_process_cmd = post_process_write_ids.sh
scenario_file = write_ids.scenario
sync_port = __unique_port__

# The command that is executed prior to starting the experiment. This script prepares the DAS environment.
local_setup_cmd = das_setup.sh

# We use a venv on the DAS since installing packages might lead to conflicts with other experiments.
use_local_venv = TRUE

# The number of DAS compute nodes to use.
node_amount = 2

# The experiment timeout after which the connection with the compute node is closed.
node_timeout = 20

# What command do we want to run?
das_node_command = launch_scenario.py

The new configuration options are annotated with a short explanation. The local_setup_cmd option specifies a command that is executed before the experiment starts. Here, the das_setup.sh script checks the user quota on the DAS and invokes the build_virtualenv.sh script, which prepares a virtual environment with various Python packages. To use this virtual environment, the use_local_venv option is set to TRUE.
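To check which DAS-related values a configuration sets before starting a run, the key = value lines can be read with a few lines of Python. The following is a minimal sketch and not Gumby's own configuration loader: the function names parse_options and das_settings are hypothetical, and the sample text is only an excerpt of the configuration shown above.

```python
from typing import Dict

# Excerpt of the configuration shown above, embedded so the sketch is self-contained.
SAMPLE_CONFIG = """\
experiment_name = synchronized_instances_das5
instances_to_run = 16

# The number of DAS compute nodes to use.
node_amount = 2
node_timeout = 20
use_local_venv = TRUE
"""


def parse_options(text: str) -> Dict[str, str]:
    """Parse 'key = value' lines, skipping blank lines and '#' comments.

    Hypothetical helper for illustration; Gumby has its own loader.
    """
    options: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a 'key = value' line: {raw_line!r}")
        options[key.strip()] = value.strip()
    return options


def das_settings(options: Dict[str, str]) -> Dict[str, object]:
    """Convert the options discussed in this tutorial to typed values."""
    return {
        "instances_to_run": int(options["instances_to_run"]),
        "node_amount": int(options["node_amount"]),
        "node_timeout": int(options["node_timeout"]),
        "use_local_venv": options.get("use_local_venv", "FALSE").upper() == "TRUE",
    }


if __name__ == "__main__":
    print(das_settings(parse_options(SAMPLE_CONFIG)))
```

Converting the values to integers early makes a typo such as node_amount = two fail immediately on the head node, rather than after compute nodes have been reserved.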

Additionally, there are a few configuration options that are specific to DAS-related experiments. The node_amount configuration option indicates how many DAS compute nodes are used. The maximum number of compute nodes in each cluster can be found at the DAS sites listed in the Quick Overview. In our experiment, we spawn 16 instances and use 2 compute nodes. Gumby automatically balances instances over the compute nodes, so in our experiment each compute node hosts 8 instances. The node_timeout configuration option indicates the timeout of the experiment. To prevent premature termination of an experiment, we recommend setting this value a bit higher than the time of the latest event in the scenario file.
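The arithmetic behind these two options can be sketched in Python. This is an illustration under stated assumptions, not Gumby's actual balancing code: the function names are hypothetical, the rule that the first nodes take any leftover instances is an assumption, and the 15-second margin is an arbitrary choice that happens to reproduce the node_timeout of 20 used above for an event at five seconds.

```python
from typing import List


def distribute_instances(instances_to_run: int, node_amount: int) -> List[int]:
    """Return the number of instances per node, spread as evenly as possible.

    Hypothetical helper; assumes leftover instances go to the first nodes.
    """
    if node_amount <= 0:
        raise ValueError("node_amount must be positive")
    if instances_to_run < 0:
        raise ValueError("instances_to_run must not be negative")
    base, remainder = divmod(instances_to_run, node_amount)
    return [base + 1 if i < remainder else base for i in range(node_amount)]


def suggest_node_timeout(event_times: List[float], margin: float = 15.0) -> float:
    """Return the time of the latest scenario event plus a safety margin.

    The margin is an assumed value, not a Gumby default.
    """
    if not event_times:
        raise ValueError("at least one event time is required")
    return max(event_times) + margin


if __name__ == "__main__":
    # 16 instances over 2 compute nodes, as in this tutorial.
    print(distribute_instances(16, 2))   # [8, 8]
    # The only scenario event fires five seconds after the start.
    print(suggest_node_timeout([5.0]))   # 20.0
```

When the instance count does not divide evenly, for example 17 instances over 3 nodes, this sketch yields [6, 6, 5]; how Gumby itself assigns the remainder is not specified here.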

To run this experiment, execute the following command on one of the DAS system head nodes:

$ gumby/run.py gumby/docs/tutorials/synchronized_instances_das.conf