filehandling

pilot.util.filehandling._define_tabledict_keys(header, fields, separator)[source]

Define the keys for the tabledict dictionary. Note: this function is only used by parse_table_from_file().

Parameters:
  • header – header string.

  • fields – header content string.

  • separator – separator character (char).

Returns:

tabledict (dictionary), keylist (ordered list with dictionary key names).

pilot.util.filehandling.add_to_total_size(path, total_size)[source]

Add the size of the file at the given path to the total size of all input/output files.

Parameters:
  • path – path to file (string).

  • total_size – prior total size of all input/output files (long).

Returns:

total size of all input/output files (long).

pilot.util.filehandling.calculate_adler32_checksum(filename)[source]

Calculate the adler32 checksum for the given file. The file is assumed to exist.

Parameters:

filename – file name (string).

Returns:

checksum value (string).

pilot.util.filehandling.calculate_checksum(filename, algorithm='adler32')[source]

Calculate the checksum value for the given file. The default algorithm is adler32; md5 is also supported. Valid algorithm names are 1) adler32/adler/ad32/ad, 2) md5/md5sum/md.

Parameters:
  • filename – file name (string).

  • algorithm – optional algorithm string.

Raises:

FileHandlingFailure, NotImplementedError – exception raised when file does not exist or for unknown algorithm.

Returns:

checksum value (string).
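As an illustration of the alias handling described above, here is a minimal sketch built on the standard zlib and hashlib modules. The name sketch_checksum, the blocksize parameter and the zero-padded hexadecimal output format are assumptions made for this example; they are not taken from the pilot source, and the sketch does not reproduce the FileHandlingFailure check for missing files.

```python
import hashlib
import zlib


def sketch_checksum(filename, algorithm="adler32", blocksize=1024 * 1024):
    """Hypothetical stand-in for calculate_checksum(), not the pilot code."""
    if algorithm in ("adler32", "adler", "ad32", "ad"):
        value = 1  # adler32 starting value
        with open(filename, "rb") as handle:
            # Read the file block by block so large files fit in memory.
            for block in iter(lambda: handle.read(blocksize), b""):
                value = zlib.adler32(block, value)
        # Assumption: report the result as 8 zero-padded hex digits.
        return "%08x" % (value & 0xFFFFFFFF)
    if algorithm in ("md5", "md5sum", "md"):
        digest = hashlib.md5()
        with open(filename, "rb") as handle:
            for block in iter(lambda: handle.read(blocksize), b""):
                digest.update(block)
        return digest.hexdigest()
    # Unknown algorithm names are rejected, as in the Raises section above.
    raise NotImplementedError("unknown checksum algorithm: %s" % algorithm)
```

Reading in blocks keeps memory use flat regardless of file size; both branches return a string, matching the documented return type.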

pilot.util.filehandling.calculate_md5_checksum(filename)[source]

Calculate the md5 checksum for the given file. The file is assumed to exist.

Parameters:

filename – file name (string).

Returns:

checksum value (string).

pilot.util.filehandling.convert(data)[source]

Convert unicode data to utf-8.

Usage examples:

  1. Dictionary:

data = {u'Max': {u'maxRSS': 3664, u'maxSwap': 0, u'maxVMEM': 142260, u'maxPSS': 1288}, u'Avg': {u'avgVMEM': 94840, u'avgPSS': 850, u'avgRSS': 2430, u'avgSwap': 0}}

convert(data)

{'Max': {'maxRSS': 3664, 'maxSwap': 0, 'maxVMEM': 142260, 'maxPSS': 1288}, 'Avg': {'avgVMEM': 94840, 'avgPSS': 850, 'avgRSS': 2430, 'avgSwap': 0}}

  2. String:

data = u'hello'

convert(data)

'hello'

  3. List:

data = [u'1',u'2','3']

convert(data)

['1', '2', '3']

Parameters:

data – unicode object to be converted to utf-8

Returns:

converted data to utf-8

pilot.util.filehandling.copy(path1, path2)[source]

Copy path1 to path2.

Parameters:
  • path1 – file path (string).

  • path2 – file path (string).

Raises:

PilotException – FileHandlingFailure, NoSuchFile

Returns:

pilot.util.filehandling.copy_pilot_source(workdir)[source]

Copy the pilot source into the work directory.

Parameters:

workdir – working directory (string).

Returns:

diagnostics (string).

Create a symlink from/to the given paths.

Parameters:
  • from_path – from path (string).

  • to_path – to path (string).

pilot.util.filehandling.dump(path, cmd='cat')[source]

Dump the content of the file in the given path to the log.

Parameters:
  • path – file path (string).

  • cmd – optional command (string).

Returns:

cat (string).

pilot.util.filehandling.establish_logging(debug=True, nopilotlog=False, filename='pilotlog.txt', loglevel=0)[source]

Set up and establish logging.

Option loglevel can be used to decide which (predetermined) logging format to use. Example:

loglevel=0: '%(asctime)s | %(levelname)-8s | %(name)-32s | %(funcName)-25s | %(message)s'

loglevel=1: 'ts=%(asctime)s level=%(levelname)-8s event=%(name)-32s.%(funcName)-25s msg="%(message)s"'
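To show what the two layouts produce, the following sketch feeds one hand-made log record through each format using the standard logging module. LOG_FORMATS and sketch_formatter are hypothetical names for this example only; how establish_logging() itself installs the format (handlers, file name, debug level) is not shown here.

```python
import logging

# The two predetermined formats quoted above, keyed by loglevel.
LOG_FORMATS = {
    0: "%(asctime)s | %(levelname)-8s | %(name)-32s | %(funcName)-25s | %(message)s",
    1: 'ts=%(asctime)s level=%(levelname)-8s event=%(name)-32s.%(funcName)-25s msg="%(message)s"',
}


def sketch_formatter(loglevel=0):
    """Hypothetical helper: return a Formatter for the selected layout."""
    return logging.Formatter(LOG_FORMATS[loglevel])


# Build one record by hand and render it with each layout.
record = logging.LogRecord(
    "demo", logging.INFO, "demo.py", 1, "hello", None, None, func="run"
)
line0 = sketch_formatter(0).format(record)  # pipe-separated columns
line1 = sketch_formatter(1).format(record)  # key=value pairs
```

The second layout is easier to parse mechanically because every field carries its own key.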

Parameters:
  • debug – debug mode (Boolean).

  • nopilotlog – True when pilot log is not known (Boolean).

  • filename – name of log file (string).

  • loglevel – selector for the logging format (int).

Returns:

pilot.util.filehandling.find_executable(name)[source]

Is the command ‘name’ available locally?

Parameters:

name – command name (string).

Returns:

full path to command if it exists, otherwise empty string.

pilot.util.filehandling.find_last_line(filename)[source]

Find the last line in a (not too large) file.

Parameters:

filename – file name, full path (string).

Returns:

last line (string).

pilot.util.filehandling.find_latest_modified_file(list_of_files)[source]

Find the most recently modified file among the list of given files. In case int conversion of getmtime() fails, int(time.time()) will be returned instead.

Parameters:

list_of_files – list of files with full paths.

Returns:

most recently updated file (string), modification time (int).
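A minimal sketch of this behaviour, assuming the standard os.path.getmtime() call is what supplies the modification time. The function name is hypothetical, and an empty input list is not handled.

```python
import os
import time


def sketch_find_latest_modified_file(list_of_files):
    """Hypothetical stand-in: return (latest file, modification time)."""
    # Pick the path with the largest modification timestamp.
    latest = max(list_of_files, key=os.path.getmtime)
    try:
        mtime = int(os.path.getmtime(latest))
    except (OSError, ValueError):
        # Fall back to the current time, as the description above states.
        mtime = int(time.time())
    return latest, mtime
```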

pilot.util.filehandling.find_text_files()[source]

Find all non-binary files.

Returns:

list of files.

pilot.util.filehandling.get_checksum_type(checksum)[source]

Return the checksum type (ad32 or md5). The given checksum can either be a standard ad32 or md5 value, or a dictionary with the format { checksum_type: value } as defined in the FileSpec class. If the checksum type cannot be identified, the function returns 'unknown'.

Parameters:

checksum – checksum string or dictionary.

Returns:

checksum type (string).

pilot.util.filehandling.get_checksum_value(checksum)[source]

Return the checksum value. The given checksum might either be a standard ad32 or md5 string, or a dictionary with the format { checksum_type: value } as defined in the FileSpec class. This function extracts the checksum value from this dictionary (or immediately returns the checksum value if the given value is a string).

Parameters:

checksum – checksum object (string or dictionary).

Returns:

checksum string.
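The two accepted input shapes for get_checksum_type() and get_checksum_value() can be illustrated with the sketch below. Telling ad32 from md5 by the length of the hexadecimal string (8 versus 32 characters) is an assumption made for this example, as are the function names; the documentation above does not say how the type is actually identified.

```python
def sketch_checksum_value(checksum):
    """Hypothetical stand-in: return the checksum string."""
    if isinstance(checksum, dict):
        # A {checksum_type: value} dictionary: return its (single) value.
        return next(iter(checksum.values()), "")
    return checksum


def sketch_checksum_type(checksum):
    """Hypothetical stand-in: return the checksum type."""
    if isinstance(checksum, dict):
        # The dictionary key already names the type; return it unchanged.
        return next(iter(checksum), "unknown")
    # Assumption: infer the type from the length of the hex string.
    if len(checksum) == 8:
        return "ad32"
    if len(checksum) == 32:
        return "md5"
    return "unknown"
```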

pilot.util.filehandling.get_disk_usage(start_path='.')[source]

Calculate the disk usage of the given directory (including any sub-directories).

Parameters:

start_path – directory (string).

Returns:

disk usage in bytes (int).
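One common way to compute such a total is to walk the directory tree and sum the file sizes, as sketched below. The function name is hypothetical, and skipping symbolic links is an assumption of this example rather than documented behaviour.

```python
import os


def sketch_disk_usage(start_path="."):
    """Hypothetical stand-in: total size in bytes of all files under start_path."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(start_path):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Assumption: symbolic links are skipped so that their targets
            # are not counted.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total
```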

pilot.util.filehandling.get_files(pattern='*.log')[source]

Find all files whose names follow the given pattern.

Parameters:

pattern – file name pattern (string).

Returns:

list of files.

pilot.util.filehandling.get_guid()[source]

Generate a GUID using the uuid library. E.g. guid = ‘92008FAF-BE4C-49CF-9C5C-E12BC74ACD19’

Returns:

a random GUID (string)

pilot.util.filehandling.get_local_file_size(filename)[source]

Get the file size of a local file.

Parameters:

filename – file name (string).

Returns:

file size (int).

pilot.util.filehandling.get_nonexistant_path(fname_path)[source]

Get the path to a filename which does not exist by incrementing path.

Parameters:

fname_path – file name path (string).

Returns:

file name path (string).
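The incrementing scheme can be sketched as follows. The 'out.txt' -> 'out-1.txt' naming is borrowed from the write_file() description in this module; whether get_nonexistant_path() uses exactly that pattern is an assumption, and the function name below is hypothetical.

```python
import os


def sketch_nonexistent_path(fname_path):
    """Hypothetical stand-in: return a path that does not exist yet."""
    if not os.path.exists(fname_path):
        return fname_path
    root, ext = os.path.splitext(fname_path)
    index = 1
    candidate = "%s-%d%s" % (root, index, ext)
    # Keep increasing the index until an unused name is found.
    while os.path.exists(candidate):
        index += 1
        candidate = "%s-%d%s" % (root, index, ext)
    return candidate
```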

pilot.util.filehandling.get_pilot_work_dir(workdir)[source]

Return the full path to the main PanDA Pilot work directory. Called once at the beginning of the batch job.

Parameters:

workdir – The full path to where the main work directory should be created

Returns:

The name of main work directory

pilot.util.filehandling.get_table_from_file(filename, header=None, separator='\t', convert_to_float=True)[source]

Extract a table of data from a text file. The column names are taken either from the header argument, e.g. header="Time VMEM PSS RSS Swap rchar wchar rbytes wbytes", or from the first line of the file. Each column name becomes a key in the dictionary, and the corresponding value is a list holding that column's entries, one for each row of the input file.

The output dictionary will have the format {'Time': [ .. data from first column .. ], 'VMEM': [ .. data from second column .. ], ..}

Parameters:
  • filename – name of input text file, full path (string).

  • header – header string.

  • separator – separator character (char).

  • convert_to_float – boolean, if True, all values will be converted to floats.

Returns:

dictionary.
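The column-wise layout of the returned dictionary is easiest to see in a small sketch. To stay self-contained it works on a list of lines rather than a file name, and the name sketch_table_from_lines is hypothetical. Values that cannot be converted to float raise ValueError here, which may differ from the real function.

```python
def sketch_table_from_lines(lines, header=None, separator="\t", convert_to_float=True):
    """Hypothetical stand-in: build {column name: [values]} from text lines."""
    rows = [line.rstrip("\r\n") for line in lines]
    rows = [row for row in rows if row]  # drop empty lines
    if header is None:
        if not rows:
            return {}
        # No explicit header: the first line supplies the column names.
        header, rows = rows[0], rows[1:]
    keys = header.split(separator)
    table = {key: [] for key in keys}
    for row in rows:
        # Each field is appended to the list of the column it belongs to.
        for key, field in zip(keys, row.split(separator)):
            table[key].append(float(field) if convert_to_float else field)
    return table
```

For example, the lines "Time\tVMEM", "1\t10" and "2\t20" yield one list per column, each with two entries.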

pilot.util.filehandling.get_valid_path_from_list(paths)[source]

Return the first valid path from the given list.

Parameters:

paths – list of file paths.

Returns:

first valid path from list (string).

pilot.util.filehandling.grep(patterns, file_name)[source]

Search for the patterns in the given list in a file.

Example:

grep(["St9bad_alloc", "FATAL"], "athena_stdout.txt") -> [list containing the lines below]

CaloTrkMuIdAlg2.sysExecute() ERROR St9bad_alloc

AthAlgSeq.sysExecute() FATAL Standard std::exception is caught

Parameters:
  • patterns – list of regexp patterns.

  • file_name – file name (string).

Returns:

list of matched lines in file.
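A minimal sketch of such a search using the standard re module. It assumes that a line matches when any pattern is found anywhere in it (re.search semantics) and that trailing newlines are stripped from the returned lines; both are assumptions of this example, and the function name is hypothetical.

```python
import re


def sketch_grep(patterns, file_name):
    """Hypothetical stand-in: return the lines matching any of the patterns."""
    compiled = [re.compile(pattern) for pattern in patterns]
    matched = []
    with open(file_name) as handle:
        for line in handle:
            # Assumption: a pattern may match anywhere in the line.
            if any(regex.search(line) for regex in compiled):
                matched.append(line.rstrip("\n"))
    return matched
```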

pilot.util.filehandling.is_json(input_file)[source]

Check if the file is in JSON format. The function reads the first character of the file, and if it is “{” then returns True.

Parameters:

input_file – file name (string)

Returns:

Boolean.
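The first-character test described above amounts to the following sketch (the function name is hypothetical). Note the consequence of the rule as stated: a JSON file that starts with "[" or with leading whitespace is reported as not being JSON.

```python
def sketch_is_json(input_file):
    """Hypothetical stand-in: True if the file starts with '{'."""
    with open(input_file) as handle:
        # Only the very first character is inspected.
        return handle.read(1) == "{"
```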

pilot.util.filehandling.locate_file(pattern)[source]

Locate a file defined by the pattern.

Example:

pattern = os.path.join(os.getcwd(), ‘**/core.123’) -> /Users/Paul/Development/python/tt/core.123

Parameters:

pattern – pattern name (string).

Returns:

path (string).
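A pattern containing '**', as in the example above, can be resolved with the standard glob module in recursive mode. The sketch below assumes that the first match is returned and that an empty string signals no match; neither detail is stated in this documentation, and the function name is hypothetical.

```python
import glob


def sketch_locate_file(pattern):
    """Hypothetical stand-in: return the first path matching the pattern."""
    # recursive=True lets '**' match any number of directory levels.
    matches = glob.glob(pattern, recursive=True)
    # Assumption: first match wins, empty string when nothing is found.
    return matches[0] if matches else ""
```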

pilot.util.filehandling.mkdirs(workdir, chmod=504)[source]

Create a directory. Perform a chmod if set.

Parameters:
  • workdir – Full path to the directory to be created

  • chmod – chmod code (default 0770 in octal, shown as decimal 504 in the signature).

Raises:

PilotException – MKDirFailure.

Returns:

pilot.util.filehandling.move(path1, path2)[source]

Move a file from path1 to path2.

Parameters:
  • path1 – source path (string).

  • path2 – destination path (string).

pilot.util.filehandling.open_file(filename, mode)[source]

Open and return a file pointer for the given mode. Note: the caller needs to close the file.

Parameters:
  • filename – file name (string).

  • mode – file mode (character).

Raises:

PilotException – FileHandlingFailure.

Returns:

file pointer.

pilot.util.filehandling.read_file(filename, mode='r')[source]

Open, read and close a file.

Parameters:
  • filename – file name (string).

  • mode

Returns:

file contents (string).

pilot.util.filehandling.read_json(filename)[source]

Read a dictionary with unicode to utf-8 conversion

Parameters:

filename

Raises:

PilotException – FileHandlingFailure, ConversionFailure

Returns:

json dictionary

pilot.util.filehandling.read_list(filename)[source]

Read a list from a JSON file.

Parameters:

filename – file name (string).

Returns:

list.

pilot.util.filehandling.remove(path)[source]

Remove file.

Parameters:

path – path to file (string).

Returns:

0 if successful, -1 if failed (int)

pilot.util.filehandling.remove_core_dumps(workdir, pid=None)[source]

Remove any remaining core dumps so they do not end up in the log tarball.

A core dump from the payload process should not be deleted if in debug mode (checked by the caller). A core dump found from a non-payload process should still be removed, but the function should then return False.

Parameters:
  • workdir – working directory for payload (string).

  • pid – payload pid (integer).

Returns:

Boolean (True if a payload core dump is found)

pilot.util.filehandling.remove_dir_tree(path)[source]

Remove directory tree.

Parameters:

path – path to directory (string).

Returns:

0 if successful, -1 if failed (int)

pilot.util.filehandling.remove_empty_directories(src_dir)[source]

Removal of empty directories in the given src_dir tree. Only empty directories will be removed.

Parameters:

src_dir – directory to be purged of empty directories.

Returns:

pilot.util.filehandling.remove_files(workdir, files)[source]

Remove all given files from workdir.

Parameters:
  • workdir – working directory (string).

  • files – file list.

Returns:

exit code (0 if all went well, -1 otherwise)

pilot.util.filehandling.rmdirs(path)[source]

Remove directory in path.

Parameters:

path – path to directory to be removed (string).

Returns:

Boolean (True if success).

pilot.util.filehandling.scan_file(path, error_messages, warning_message=None)[source]

Scan file for known error messages.

Parameters:
  • path – path to file (string).

  • error_messages – list of error messages.

  • warning_message – optional warning message to be printed if any of the error_messages have been found (string).

Returns:

Boolean. (note: True means the error was found)

pilot.util.filehandling.tail(filename, nlines=10)[source]

Return the last n lines of a file. Note: the function uses the posix tail function.

Parameters:
  • filename – name of file to do the tail on (string).

  • nlines – number of lines (int).

Returns:

file tail (str).

pilot.util.filehandling.tar_files(wkdir, excludedfiles, logfile_name, attempt=0)[source]

Tarring of files in given directory.

Parameters:
  • wkdir – work directory (string)

  • excludedfiles – list of files to be excluded from tar operation (list)

  • logfile_name – file name (string)

  • attempt – attempt number (integer)

Returns:

0 if successful, 1 in case of error (int)

pilot.util.filehandling.touch(path)[source]

Touch a file and update mtime in case the file exists. Falls back to execute() in case of a Python problem with appending to a non-existent path.

Parameters:

path – full path to file to be touched (string).

Returns:

pilot.util.filehandling.update_extension(path='', extension='')[source]

Update the file name extension to the given extension.

Parameters:
  • path – file path (string).

  • extension – new extension (string).

Returns:

file path with new extension (string).
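Swapping an extension is typically done with os.path.splitext(), as in this sketch. Whether the extension argument is expected with or without the leading dot is not stated above, so the sketch accepts both; that choice and the function name are assumptions of this example.

```python
import os


def sketch_update_extension(path="", extension=""):
    """Hypothetical stand-in: replace the extension of path."""
    root, _old_extension = os.path.splitext(path)
    # Assumption: accept the new extension with or without a leading dot.
    if extension and not extension.startswith("."):
        extension = "." + extension
    return root + extension
```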

pilot.util.filehandling.verify_file_list(list_of_files)[source]

Make sure that the files in the given list exist, and return the list of files that do exist.

Parameters:

list_of_files – file list.

Returns:

list of existing files.

pilot.util.filehandling.write_file(path, contents, mute=True, mode='w', unique=False)[source]

Write the given contents to a file. If unique=True, then if the file already exists, an index will be added (e.g. 'out.txt' -> 'out-1.txt').

Parameters:
  • path – full path for file (string).

  • contents – file contents (object).

  • mute – boolean to control stdout info message.

  • mode – file mode (e.g. 'w', 'r', 'a', 'wb', 'rb') (string).

  • unique – file must be unique (Boolean).

Raises:

PilotException – FileHandlingFailure.

Returns:

True if successful, otherwise False.

pilot.util.filehandling.write_json(filename, data, sort_keys=True, indent=4, separators=(',', ': '))[source]

Write the dictionary to a JSON file.

Parameters:
  • filename – file name (string).

  • data – object to be written to file (dictionary or list).

  • sort_keys – should entries be sorted? (boolean).

  • indent – indentation level, default 4 (int).

  • separators – field separators (default (',', ': ') for dictionaries, use e.g. (',\n') for lists) (tuple)

Raises:

PilotException – FileHandlingFailure.

Returns:

status (boolean).
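The parameters above map directly onto the keyword arguments of the standard json.dump() call, as the following sketch shows. The function name is hypothetical, and returning False on failure instead of raising PilotException is a simplification of this example.

```python
import json


def sketch_write_json(filename, data, sort_keys=True, indent=4, separators=(",", ": ")):
    """Hypothetical stand-in: write data to a JSON file, return a status flag."""
    try:
        with open(filename, "w") as handle:
            json.dump(data, handle, sort_keys=sort_keys, indent=indent,
                      separators=separators)
    except (OSError, TypeError, ValueError):
        # Simplification: report failure through the return value only.
        return False
    return True
```

With the default separators, each key and value are joined by ': ' and items are separated by ',' followed by the newline that the indent setting adds.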