Manage

Functions to manage the generation and execution of experiments.

All functions can be accessed under the module: exputils.manage

We recommend storing experiments in a specific folder structure, which makes it possible to save and load experimental data in a structured manner. This is a default structure that can be adapted if required. Elements in angle brackets (<custom name>) can have custom names.

Folder structure:

<experiments> folder: Holds all your campaigns.
- <experimental campaign> folders:
  - <analyze> folder: Scripts such as Jupyter notebooks to analyze the different experiments in this experimental campaign.
  - experiment_configurations.ods file: ODS file that contains the configuration parameters of the different experiments in this campaign.
  - src folder: Holds code templates of the experiments.
    - rep folder: Code templates that are used under the repetition folders of the experiments. These contain the actual experimental code that should be run.
    - exp folder: Code templates that are used under the experiment folder of the experiment. These usually contain code to compute statistics over all repetitions of an experiment.
  - experiments folder: Contains generated code for experiments and the collected experimental data.
    - experiment_{id} folders:
      - repetition_{id} folders:
        - data folder: Experimental data for the single repetitions, such as logs.
        - code files: Generated code and resource files.
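
The hand-written part of this layout can be sketched as follows. This is a minimal illustration, not part of exputils: the helper name `create_campaign_skeleton` and the campaign name are hypothetical. It only creates the folders that a user maintains; the `experiments` folder is produced later by `generate_experiment_files`, and the ODS file has to be written with a spreadsheet tool.

```python
import tempfile
from pathlib import Path


def create_campaign_skeleton(experiments_root, campaign_name):
    """Create the user-maintained folders of one campaign and return its path.

    Hypothetical helper for illustration only; exputils does not provide it.
    """
    campaign_dir = Path(experiments_root) / campaign_name
    # analyze: notebooks and scripts, src/rep and src/exp: code templates
    for sub_dir in (Path("analyze"), Path("src", "rep"), Path("src", "exp")):
        (campaign_dir / sub_dir).mkdir(parents=True, exist_ok=True)
    return campaign_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        campaign = create_campaign_skeleton(tmp, "my_campaign")
        for path in sorted(campaign.rglob("*")):
            print(path.relative_to(campaign).as_posix())
```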

Generation & Execution

`generate_experiment_files`

Generates experiments based on a configuration ODS file (LibreOffice Spreadsheet) and template source code.

The configuration ODS has to be in a specific form. See resources/experiment_configurations.ods for an example file.

The template source code is usually located in ./src/exp for code on the experiment level and ./src/rep for code on the repetition level.

Parameters:

Name	Type	Description	Default
`ods_filepath`	`str`	Path to the ODS configuration file that defines the experiments. Default is `'./experiment_configurations.ods'`	`None`
`directory`	`str`	Path to directory where the experiments will be generated. Default is `'./experiments'`.	`None`
`verbose`	`bool`	Whether to print verbose output with more information. Default is `False`.	`False`
`copy_operator`	`str`	Copy operator for source code files. Either 'shutil' (default) for the Python copy functions or 'cp' for the Linux terminal cp command. The 'cp' option was introduced because 'shutil' did not work on some operating systems under Python 3.8.	`'shutil'`

Notes:

Sheets in the configuration ODS file define groups of experiments for which an extra subfolder in the output directory will be generated.

Source code in exputils/manage/experimentgenerator.py

def generate_experiment_files(ods_filepath: Optional[str] = None,
                              directory: Optional[str] = None,
                              extra_files: Optional[list] = None,
                              extra_experiment_files: Optional[list] = None,
                              verbose: bool = False,
                              copy_operator: str = 'shutil'):
    """
    Generates experiments based on a configuration ODS file (LibreOffice Spreadsheet) and template
    source code.

    The configuration ODS has to be in a specific form.
    See `resources/experiment_configurations.ods` for an example file.

    The template source code is usually located in `./src/exp` for code on the experiment level
    and `./src/rep` for code on the repetition level.

    [//]: # (TODO: either remove or document the options for extra-files)

    Parameters:
        ods_filepath (str):
            Path to the ODS configuration file that defines the experiments.
            Default is `'./experiment_configurations.ods'`
        directory (str):
            Path to directory where the experiments will be generated.
            Default is `'./experiments'`.
        verbose (bool):
            Whether to print verbose output with more information. Default is `False`.
        copy_operator (str):
            Copy operator for source code files. Either 'shutil' (default) for the Python
            copy functions or 'cp' for the Linux terminal cp command. The 'cp' option was
            introduced because 'shutil' did not work on some operating systems under Python 3.8.

    Notes:

    - Sheets in the configuration ODS file define groups of experiments for which an extra
       subfolder in the output directory will be generated.
    """

    if ods_filepath is None:
        ods_filepath = os.path.join('.', exputils.DEFAULT_ODS_CONFIGURATION_FILE)

    if directory is None:
        directory = os.path.join('.', exputils.DEFAULT_EXPERIMENTS_DIRECTORY)
    elif directory == '':
        directory = '.'

    if verbose:
        print('Load config from {!r} ...'.format(ods_filepath))

    config_data = _load_configuration_data_from_ods(ods_filepath)

    # generate experiment files based on the loaded configurations
    if verbose:
        print('Generate experiments ...')

    _generate_files_from_config(
        config_data, directory,
        extra_files=extra_files, extra_experiment_files=extra_experiment_files, verbose=verbose, copy_operator=copy_operator
    )

`start_experiments`

Searches for all start scripts of experiments and/or repetitions in the experiments folder and executes them either in parallel or sequentially.

It also documents their execution status (todo, running, finished, error) in a status file, which lets it decide whether a script should be executed when it is run again on the same target directory.

Parameters:

Name	Type	Description	Default
`directory`	`str`	Directory in which the start scripts are searched. Default is `'./experiments'`.	`None`
`start_scripts`	`str`	Filename pattern of the start scripts that are searched for under the given target directory. Can include '*' as a wildcard, for example 'run_*.py'. The default `'run_*.py'` matches all Python files whose names start with 'run_'.	`'run_*.py'`
`start_command`	`str`	Command template used to start each script. The placeholder `'{}'` is replaced by the script path and the result is split on whitespace into the command and its arguments, for example `'python {}'`.	`'{}'`
`parallel`	`(bool, int)`	Defines if scripts should be started in parallel and how many are allowed to run in parallel. If `False` then the scripts are started sequentially one after another. If `True` then the scripts are started and executed in parallel all at once. If an integer, then the number defines how many scripts can run in parallel.	`True`
`is_chdir`	`bool`	Before starting a script, should the main process change to its working directory.	`True`
`verbose`	`bool`	Whether to print verbose output with more information. Default is `False`.	`False`
`post_start_wait_time`	`float`	Time in seconds to wait after starting a process before the next one is started.	`0.0`
`write_status_files_automatically`	`bool`	Should status files that document if scripts were started and executed be written by the manager. These are important to identify if an experiment or repetition did run already.	`True`
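
How the `parallel` argument maps to a number of concurrent processes can be sketched as below. The function name `resolve_n_parallel` is hypothetical; the logic follows the parameter description above, with `True` meaning "as many processes as there are scripts to run".

```python
from typing import Union


def resolve_n_parallel(parallel: Union[bool, int], n_todo: int) -> int:
    """Translate the `parallel` argument into a maximum number of processes.

    Hypothetical helper for illustration; n_todo is the number of pending scripts.
    """
    # bool has to be checked first because bool is a subclass of int
    if isinstance(parallel, bool):
        return n_todo if parallel else 1
    if isinstance(parallel, int):
        if parallel <= 0:
            raise ValueError("Number of parallel processes must be larger than 0!")
        return parallel
    raise ValueError("Argument 'parallel' must be either a bool or an integer number!")


if __name__ == "__main__":
    for value in (False, True, 4):
        print(value, "->", resolve_n_parallel(value, n_todo=10))
```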

Source code in exputils/manage/experimentstarter.py

def start_experiments(directory: Optional[str] = None,
                      start_scripts: Optional[str] = 'run_*.py',
                      start_command: Optional[str] = '{}',
                      parallel: Union[bool, int] = True,
                      is_chdir: bool = True,
                      verbose: bool = False,
                      post_start_wait_time: float = 0.,
                      write_status_files_automatically: bool = True):
    """
    Searches all the start scripts of experiments and/or repetitions in the experiments folder
    and executes them either in parallel or sequentially.

    It also documents their execution status (todo, running, finished, error) in a status file
    that allows it to identify if a script should be executed or not when used again on the
    same target directory.

    Parameters:
        directory (str):
            Directory in which the start scripts are searched.
            Default is `'./experiments'`.
        start_scripts (str):
            Filename pattern of the start scripts that are searched for under the given target
            directory. Can include '*' as a wildcard, for example 'run_*.py'.
            The default `'run_*.py'` matches all Python files whose names start with 'run_'.
        parallel (bool, int):
            Defines if scripts should be started in parallel and how many are allowed to run in parallel.
            If `False` then the scripts are started sequentially one after another.
            If `True` then the scripts are started and executed in parallel all at once.
            If an integer, then the number defines how many scripts can run in parallel.
        is_chdir (bool):
            Before starting a script, should the main process change to its working directory.
        verbose (bool):
            Whether to print verbose output with more information. Default is `False`.
        post_start_wait_time (float):
            Time in seconds to wait after starting a process before the next one is started.
        write_status_files_automatically (bool):
            Should status files that document if scripts were started and executed be
            written by the manager. These are important to identify if an experiment or repetition
            did run already.
    """

    if directory is None:
        directory = os.path.join('.', exputils.DEFAULT_EXPERIMENTS_DIRECTORY)

    # handle number of parallel processes
    if isinstance(parallel, bool):
        if parallel:
            n_parallel = np.inf
        else:
            n_parallel = 1
    elif isinstance(parallel, int):
        if parallel <= 0:
            raise ValueError('Number of parallel processes must be larger than 0!')
        else:
            n_parallel = parallel
    else:
        raise ValueError('Argument \'parallel\' must be either a bool or an integer number!')

    if is_chdir:
        cwd = os.getcwd()

    # get all scripts
    all_scripts = get_scripts(directory=directory, start_scripts=start_scripts)

    ignored_scripts = []
    todo_scripts = []
    # check their initial status and write one for the scripts that will be started
    for script in all_scripts:

        # lock processing of the script, so that no other running experimentstarter is updating its status in parallel
        with _get_script_lock(script):

            status = get_script_status(script)

            if status is None:
                if write_status_files_automatically:
                    _update_script_status(script, 'todo')
                todo_scripts.append(script)

            elif status.lower() != 'finished':
                todo_scripts.append(script)

            else:
                ignored_scripts.append((script, status))

    # start all in parallel if wanted
    if n_parallel == np.inf:
        n_parallel = len(todo_scripts)

    # started process and their corresponding scripts
    started_processes = []
    started_scripts = []
    finished_processes_idxs = []

    next_todo_script_idx = 0
    n_active_processes = 0

    # run as long as there is an active process or we did not finish all processes yet
    while n_active_processes > 0 or next_todo_script_idx < len(todo_scripts):

        # start as many processes as parallel processes are allowed
        for i in range(n_parallel - n_active_processes):

            # stop starting processes when all scripts are started
            if next_todo_script_idx < len(todo_scripts):

                script = todo_scripts[next_todo_script_idx]
                next_todo_script_idx += 1

                # lock processing of the script, so that no other running experimentstarter is starting it in parallel
                with _get_script_lock(script):

                    # check the script status, only start if needed
                    status = get_script_status(script)
                    if _is_to_start_status(status):

                        if write_status_files_automatically:
                            _update_script_status(script, 'running')

                        # start
                        script_directory = os.path.dirname(script)
                        script_path_in_its_working_directory = os.path.join('.', os.path.basename(script))

                        print('{} start {!r} (previous status: {}) ...'.format(datetime.now().strftime("%Y/%m/%d %H:%M:%S"), script, status))

                        process_environ = {
                            **os.environ,
                            "EU_STATUS_FILE": script_path_in_its_working_directory + STATUS_FILE_EXTENSION,
                        }

                        if is_chdir:
                            os.chdir(script_directory)
                            process = subprocess.Popen(start_command.format(script_path_in_its_working_directory).split(), env=process_environ)
                            os.chdir(cwd)
                        else:
                            process = subprocess.Popen(start_command.format(script).split(), cwd=script_directory, env=process_environ)

                        started_processes.append(process)
                        started_scripts.append(script)

                        if post_start_wait_time > 0:
                            time.sleep(post_start_wait_time)

                    else:
                        # do not start
                        ignored_scripts.append((script, status))

        # check the activity of the started processes
        n_active_processes = 0
        for p_idx, process in enumerate(started_processes):

            if p_idx not in finished_processes_idxs:

                if process.poll() is None:
                    n_active_processes += 1
                else:
                    finished_processes_idxs.append(p_idx)
                    if process.returncode == 0:
                        status = 'finished'
                    else:
                        status = 'error'

                    if write_status_files_automatically:
                        _update_script_status(started_scripts[p_idx], status)

                    print('{} finished {!r} (status: {})'.format(datetime.now().strftime("%Y/%m/%d %H:%M:%S"), started_scripts[p_idx], status))

        if n_active_processes > 0:
            time.sleep(0.5) # sleep half a second before checking again

    if verbose:
        if ignored_scripts:
            print('Ignored scripts:')
            for (script_path, status) in ignored_scripts:
                print('\t- {!r} (status: {})'.format(script_path, status))
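
The core of the scheduling loop above, stripped of status files, locking and working-directory handling, can be sketched as a small standalone function. The name `run_with_limit` and its return value are hypothetical simplifications, not part of exputils.

```python
import subprocess
import sys
import time


def run_with_limit(commands, n_parallel, poll_interval=0.05):
    """Run commands with at most n_parallel processes at once.

    Returns 'finished' or 'error' per command, in the order of `commands`.
    Simplified sketch: no status files, no locks, no chdir.
    """
    processes = []
    finished_idxs = set()
    statuses = {}
    next_idx = 0
    n_active = 0

    # loop while something is running or not everything has been started
    while n_active > 0 or next_idx < len(commands):
        # fill the free slots with new processes
        for _ in range(n_parallel - n_active):
            if next_idx < len(commands):
                processes.append(subprocess.Popen(commands[next_idx]))
                next_idx += 1

        # poll all started processes that are not yet known to be finished
        n_active = 0
        for idx, process in enumerate(processes):
            if idx in finished_idxs:
                continue
            if process.poll() is None:
                n_active += 1
            else:
                finished_idxs.add(idx)
                statuses[idx] = "finished" if process.returncode == 0 else "error"

        if n_active > 0:
            time.sleep(poll_interval)

    return [statuses[idx] for idx in range(len(commands))]


if __name__ == "__main__":
    demo = [[sys.executable, "-c", "print('hello from a script')"]] * 3
    print(run_with_limit(demo, n_parallel=2))
```

Polling with `Popen.poll()` keeps the manager in a single thread, at the cost of a short delay between a process ending and its status being recorded.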

Helper

A couple of extra functions exist that can help determine how best to start experiments, for example by identifying how many scripts need to be executed and asking a cluster manager to provide the required resources, such as the number of cores.
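
A possible way to combine these helpers with `start_experiments` is sketched below. It assumes exputils is installed; the wrapper name `run_pending_experiments` is hypothetical, and `os.cpu_count()` merely stands in for whatever a cluster manager would report.

```python
import os


def run_pending_experiments(experiments_dir: str = "./experiments") -> int:
    """Start all pending scripts, using no more processes than local cores.

    Hypothetical wrapper for illustration. Returns the number of pending scripts.
    """
    # imported lazily so that this sketch can be loaded without exputils installed
    import exputils as eu

    n_pending = eu.manage.get_number_of_scripts_to_execute(directory=experiments_dir)
    if n_pending == 0:
        return 0

    # stand-in for a resource request to a cluster manager
    n_cores = os.cpu_count() or 1
    eu.manage.start_experiments(
        directory=experiments_dir,
        parallel=min(n_pending, n_cores),
    )
    return n_pending
```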

`get_scripts`

Searches all start scripts in the experiments directory.

Parameters:

Name	Type	Description	Default
`directory`	`str`	Directory in which the start scripts are searched. Default is `'./experiments'`.	`None`
`start_scripts`	`str`	Filename pattern of the start scripts that are searched for under the given target directory. Can include '*' as a wildcard, for example 'run_*.py'. The default `'run_*.py'` matches all Python files whose names start with 'run_'.	`'run_*.py'`

Returns:

Name	Type	Description
`scripts`	`list`	List of filepaths to the start scripts.

Source code in exputils/manage/experimentstarter.py

def get_scripts(directory: Optional[str] = None,
                start_scripts: Optional[str] = 'run_*.py') -> list:
    """
    Searches all start scripts in the experiments directory.

    Parameters:
        directory (str):
            Directory in which the start scripts are searched.
            Default is `'./experiments'`.
        start_scripts (str):
            Filename pattern of the start scripts that are searched for under the given target
            directory. Can include '*' as a wildcard, for example 'run_*.py'.
            The default `'run_*.py'` matches all Python files whose names start with 'run_'.

    Returns:
        scripts (list): List of filepaths to the start scripts.
    """

    if directory is None:
        directory = os.path.join('.', exputils.DEFAULT_EXPERIMENTS_DIRECTORY)

    # find all start scripts
    scripts = glob.iglob(os.path.join(directory, '**', start_scripts), recursive=True)
    scripts = list(scripts)
    scripts.sort()

    return scripts
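
The recursive search can be tried out in isolation with the standalone sketch below. The function name `find_start_scripts` and the folder and file names in the demo are hypothetical; the glob pattern is the same one used in the source above, where `**` together with `recursive=True` matches the directory itself and all of its subdirectories.

```python
import glob
import os
import tempfile


def find_start_scripts(directory, start_scripts="run_*.py"):
    """Return a sorted list of all files below `directory` that match the pattern."""
    pattern = os.path.join(directory, "**", start_scripts)
    return sorted(glob.iglob(pattern, recursive=True))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # hypothetical layout: one experiment with one repetition
        rep_dir = os.path.join(tmp, "experiment_000001", "repetition_000000")
        os.makedirs(rep_dir)
        for path in (
            os.path.join(tmp, "experiment_000001", "run_experiment.py"),
            os.path.join(rep_dir, "run_repetition.py"),
            os.path.join(rep_dir, "notes.txt"),
        ):
            open(path, "w").close()

        for script in find_start_scripts(tmp):
            print(os.path.relpath(script, tmp))
```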

`get_script_status`

Returns the execution status of a certain start script.

Parameters:

Name	Type	Description	Default
`script_file`	`str`	Path to the script file.	required

Returns:

Name	Type	Description
`status`	`(str, None)`	Status as a string. Usually `'todo'`, `'error'`, `'running'`, or `'finished'`. `None` if no status exists.

Source code in exputils/manage/experimentstarter.py

def get_script_status(script_file: str) -> Optional[str]:
    """
    Returns the execution status of a certain start script.

    Parameters:
        script_file (str): Path to the script file.

    Returns:
        status (str, None):
            Status as a string. Usually `'todo'`, `'error'`, `'running'`, or `'finished'`.
            `None` if no status exists.
    """
    status = None

    status_file_path = script_file + STATUS_FILE_EXTENSION

    if os.path.isfile(status_file_path):
        # read status
        with open(status_file_path, 'r') as f:
            lines = f.read().splitlines()
            if len(lines) > 0:
                status = lines[-1]

    return status
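
The lookup relies on a status file that sits next to the script and whose last line holds the current status. The standalone sketch below mirrors that logic; the extension `'.status'` is an assumption based on the '*.status' files mentioned for `get_experiments_status`, and `read_script_status` is a hypothetical name.

```python
import os
import tempfile
from typing import Optional

# assumption: the real value is defined by exputils as STATUS_FILE_EXTENSION
STATUS_FILE_EXTENSION = ".status"


def read_script_status(script_file: str) -> Optional[str]:
    """Return the last line of the script's status file, or None if there is none."""
    status_file_path = script_file + STATUS_FILE_EXTENSION
    if not os.path.isfile(status_file_path):
        return None
    with open(status_file_path, "r") as f:
        lines = f.read().splitlines()
    return lines[-1] if lines else None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "run_repetition.py")
        print(read_script_status(script))  # no status file yet
        with open(script + STATUS_FILE_EXTENSION, "w") as f:
            f.write("todo\nrunning\n")
        print(read_script_status(script))
```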

`get_number_of_scripts`

Identifies the number of all scripts in the experiments directory regardless of their execution status.

Parameters:

Name	Type	Description	Default
`directory`	`str`	Directory in which the start scripts are searched. Default is `'./experiments'`.	`None`
`start_scripts`	`str`	Filename pattern of the start scripts that are searched for under the given target directory. Can include '*' as a wildcard, for example 'run_*.py'. The default `'run_*.py'` matches all Python files whose names start with 'run_'.	`'run_*.py'`

Returns:

Name	Type	Description
`n_scripts`	`int`	Number of scripts.

Source code in exputils/manage/misc.py

def get_number_of_scripts(directory: Optional[str] = None,
                          start_scripts: str = 'run_*.py'):
    """
    Identifies the number of all scripts in the experiments directory regardless of their execution
    status.

    Parameters:
        directory (str):
            Directory in which the start scripts are searched.
            Default is `'./experiments'`.
        start_scripts (str):
            Filename pattern of the start scripts that are searched for under the given target
            directory. Can include '*' as a wildcard, for example 'run_*.py'.
            The default `'run_*.py'` matches all Python files whose names start with 'run_'.

    Returns:
        n_scripts (int): Number of scripts.
    """

    scripts = get_scripts(directory=directory, start_scripts=start_scripts)
    return len(scripts)

`get_number_of_scripts_to_execute`

Identifies the number of scripts that have to be executed in the experiments directory. Scripts that have to be executed have either the status 'none', 'todo', 'error', or 'unfinished'.

Parameters:

Name	Type	Description	Default
`directory`	`str`	Directory in which the start scripts are searched. Default is `'./experiments'`.	`None`
`start_scripts`	`str`	Filename pattern of the start scripts that are searched for under the given target directory. Can include '*' as a wildcard, for example 'run_*.py'. The default `'run_*.py'` matches all Python files whose names start with 'run_'.	`'run_*.py'`

Returns:

Name	Type	Description
`n_scripts`	`int`	Number of scripts that have to be executed.

Source code in exputils/manage/misc.py

def get_number_of_scripts_to_execute(directory: Optional[str] = None,
                                     start_scripts: str = 'run_*.py') -> int:
    """
    Identifies the number of scripts that have to be executed in the experiments directory.
    Scripts that have to be executed have either the status 'none', 'todo', 'error', or 'unfinished'.

    Parameters:
        directory (str):
            Directory in which the start scripts are searched.
            Default is `'./experiments'`.
        start_scripts (str):
            Filename pattern of the start scripts that are searched for under the given target
            directory. Can include '*' as a wildcard, for example 'run_*.py'.
            The default `'run_*.py'` matches all Python files whose names start with 'run_'.

    Returns:
        n_scripts (int): Number of scripts that have to be executed.
    """

    scripts = get_scripts(directory=directory, start_scripts=start_scripts)

    n = 0
    for script in scripts:
        status = get_script_status(script)
        if _is_to_start_status(status):
            n += 1

    return n
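
The private helper `_is_to_start_status` is not shown on this page. The sketch below is therefore only an assumption derived from the description above (a missing status, 'none', 'todo', 'error' or 'unfinished' means the script still has to run); the names `is_to_start` and `count_scripts_to_execute` are hypothetical.

```python
from typing import Iterable, Optional

# assumption: taken from the docstring, not from the actual implementation
TO_START_STATUSES = {"none", "todo", "error", "unfinished"}


def is_to_start(status: Optional[str]) -> bool:
    """Decide whether a script with this status still has to be executed."""
    return status is None or status.lower() in TO_START_STATUSES


def count_scripts_to_execute(statuses: Iterable[Optional[str]]) -> int:
    """Count the statuses that mark a script as still to be executed."""
    return sum(1 for status in statuses if is_to_start(status))


if __name__ == "__main__":
    print(count_scripts_to_execute([None, "todo", "running", "finished", "error"]))
```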

`get_experiments_status`

Returns the status of all scripts for all experiments under a specific directory and summary statistics.

This function only detects the status based on existing '*.status' files. Scripts for which no status file exists are ignored.

Parameters:

Name	Type	Description	Default
`directory`	`str`	Path to directory under which experiments are located. Default is `'./experiments'`.	`None`

Returns:

Name	Type	Description
`script_properties`	`List[ScriptProperties]`	List of scripts and their statuses in form of ScriptProperties. The properties have the following attributes: date, time, experiment_id, repetition_idx, script_path, status.
`statistics`	`ScriptStatistics`	Summary statistics for each status (total, todo, running, finished, error).

Source code in exputils/manage/misc.py

def get_experiments_status(directory: Optional[str] = None,
                           status_file_extension: Optional[str] = None) -> Tuple[List[ScriptProperties], ScriptStatistics]:
    """
    Returns the status of all scripts for all experiments under a specific directory and summary
    statistics.

    This function only detects the status based on existing '*.status' files.
    Scripts for which no status file exists are ignored.

    Arguments:
        directory (str):
            Path to directory under which experiments are located.
            Default is `'./experiments'`.

    Returns:
        script_properties (List[ScriptProperties]):
            List of scripts and their statuses in form of ScriptProperties.
            The properties have the following attributes: date, time, experiment_id, repetition_idx,
            script_path, status.

        statistics (ScriptStatistics):
            Summary statistics for each status (total, todo, running, finished, error).
    """

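    # apply the documented default, otherwise os.walk would receive None
    if directory is None:
        directory = os.path.join('.', eu.DEFAULT_EXPERIMENTS_DIRECTORY)

    if status_file_extension is None:
        status_file_extension = STATUS_FILE_EXTENSION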

    #################
    # Walk through the directory to find all status files

    status_files = []
    for dirpath, _, filenames in os.walk(directory):
        for filename in filenames:
            if filename.endswith(status_file_extension):
                status_files.append(os.path.join(dirpath, filename))
    status_files = sorted(status_files)

    ##################
    # create script properties and statistics by going over each status file

    script_properties = []

    statistics = ScriptStatistics()

    # identify and count their statuses
    for file in status_files:
        with open(file, 'r') as f:
            # read properties from the status file
            lines = f.readlines()
            status_message = lines[-2].strip().split(' ') + lines[-1].strip().split(' ')
            date, time, status = status_message[0], status_message[1], " ".join(status_message[2:]).lower()

            # save properties in respective dataclass
            prop = ScriptProperties()
            prop.date = date
            prop.time = time
            prop.status = status
            prop.script_path = file.replace(status_file_extension, '')

            # detect the experiment id
            # remove part in template that defines the number, for example: {:6d}
            experiment_substr = re.sub('{.*}', '', eu.EXPERIMENT_DIRECTORY_TEMPLATE)
            # search for the experiment sub_directory
            re_match = re.search('/(' + experiment_substr + r'\d*)', prop.script_path)
            if re_match is not None:
                prop.experiment_id = re_match.group(1).replace(experiment_substr, '')

            # detect repetition id
            # remove part in template that defines the number, for example: {:6d}
            repetition_substr = re.sub('{.*}', '', eu.REPETITION_DIRECTORY_TEMPLATE)
            # search for the repetition sub_directory
            re_match = re.search('/(' + repetition_substr + r'\d*)', prop.script_path)
            if re_match is not None:
                prop.repetition_idx = int(re_match.group(1).replace(repetition_substr, ''))

            script_properties.append(prop)

            # compute the statistics
            if status in ['todo', 'running', 'finished', 'error']:
                cur_stat = getattr(statistics, status)
                setattr(statistics, status, cur_stat + 1)
            else:
                # all other status messages are considered as some info message while the script
                # is still running
                statistics.running += 1
            statistics.total += 1

    return script_properties, statistics
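
The parsing step above reads the last two lines of a status file and treats the first two tokens as date and time and the rest as the status. The standalone sketch below reproduces that step and the counting rule. The example file content, with a timestamp line followed by a status line, is an assumption inferred from the parsing code, and the function names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

KNOWN_STATUSES = ("todo", "running", "finished", "error")


def parse_status_lines(lines: List[str]) -> Tuple[str, str, str]:
    """Extract (date, time, status) from the last two lines of a status file."""
    tokens = lines[-2].strip().split(" ") + lines[-1].strip().split(" ")
    date, time_of_day = tokens[0], tokens[1]
    status = " ".join(tokens[2:]).lower()
    return date, time_of_day, status


def tally_statuses(statuses: Iterable[str]) -> Dict[str, int]:
    """Count statuses; unknown messages count as 'running', as in the source above."""
    counts = {name: 0 for name in KNOWN_STATUSES}
    counts["total"] = 0
    for status in statuses:
        key = status if status in KNOWN_STATUSES else "running"
        counts[key] += 1
        counts["total"] += 1
    return counts


if __name__ == "__main__":
    # assumed file content: a timestamp line followed by a status line
    example = ["2024/05/01 10:00:00\n", "running\n", "2024/05/01 10:05:00\n", "finished\n"]
    print(parse_status_lines(example))
    print(tally_statuses(["finished", "epoch 3 done", "error"]))
```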