errorcodes

class pilot.common.errorcodes.ErrorCodes

Pilot error codes.

Note: Error code numbering is the same as in Pilot 1, since the PanDA server and monitor expect it.

Note 2: Add error codes as other modules need them. Do not import the full Pilot 1 list at once, since it may well contain codes that can be reassigned or removed.

BADALLOC = 1223
BADMEMORYMONITORJSON = 1337
BADQUEUECONFIGURATION = 1341
BADXML = 1247
BLACKHOLE = 1332
CHKSUMNOTSUP = 1242
CHMODTRF = 1143
COMMUNICATIONFAILURE = 1318
CONVERSIONFAILURE = 1302
COREDUMP = 1355
DBRELEASEFAILURE = 1339
EMPTYOUTPUTFILE = 1350
ESFATAL = 1228
ESNOEVENTS = 1238
ESRECOVERABLE = 1224
EXCEEDEDMAXWAITTIME = 1317
EXECUTEDCLONEJOB = 1234
FAILEDBYSERVER = 1236
FILEEXISTS = 1221
FILEHANDLINGFAILURE = 1303
GENERALCPUCALCPROBLEM = 1354
GENERALERROR = 1008
GETADMISMATCH = 1171
GETGLOBUSSYSERR = 1180
GETMD5MISMATCH = 1145
IMAGENOTFOUND = 1360
INTERNALPILOTPROBLEM = 1319
JOBALREADYRUNNING = 1336
JSONRETRIEVALTIMEOUT = 1330
KILLPAYLOAD = 1363
KILLSIGNAL = 1200
LFNTOOLONG = 1190
LOGFILECREATIONFAILURE = 1320
LOOPINGJOB = 1150
MESSAGEHANDLINGFAILURE = 1240
MIDDLEWAREIMPORTFAILURE = 1342
MISSINGCREDENTIALS = 1364
MISSINGINPUTFILE = 1331
MISSINGINSTALLATION = 1211
MISSINGOUTPUTFILE = 1165
MISSINGRELEASEUNPACKED = 1358
MISSINGUSERCODE = 1335
MKDIR = 1199
NFSSQLITE = 1115
NOCTYPES = 1365
NOLOCALSPACE = 1098
NONDETERMINISTICDDM = 1329
NOOUTPUTINJOBREPORT = 1343
NOPAYLOADMETADATA = 1187
NOPROXY = 1163
NORELEASEFOUND = 1244
NOREMOTESPACE = 1333
NOREPLICAS = 1326
NOSOFTWAREDIR = 1186
NOSTORAGE = 1133
NOSTORAGEPROTOCOL = 1313
NOSUCHFILE = 1103
NOSUCHPROCESS = 1353
NOTDEFINED = 1311
NOTIMPLEMENTED = 1300
NOTSAMELENGTH = 1312
NOUSERTARBALL = 1246
NOVOMSPROXY = 1177
OUTPUTFILETOOLARGE = 1124
PANDAKILL = 1144
PANDAQUEUENOTACTIVE = 1359
PAYLOADEXCEEDMAXMEM = 1235
PAYLOADEXECUTIONEXCEPTION = 1310
PAYLOADEXECUTIONFAILURE = 1305
PAYLOADOUTOFMEMORY = 1212
PAYLOADSIGSEGV = 1328
POSTPROCESSFAILURE = 1357
PREPROCESSFAILURE = 1356
PUTADMISMATCH = 1172
PUTGLOBUSSYSERR = 1181
PUTMD5MISMATCH = 1141
QUEUEDATA = 1116
QUEUEDATANOTOK = 1117
REACHEDMAXTIME = 1213
REMOTEFILECOULDNOTBEOPENED = 1361
REPLICANOTFOUND = 1100
RESOURCEUNAVAILABLE = 1344
RUCIOLISTREPLICASFAILED = 1322
RUCIOLOCATIONFAILED = 1321
RUCIOSERVICEUNAVAILABLE = 1316
SERVICENOTAVAILABLE = 1324
SETUPFAILURE = 1110
SETUPFATAL = 1334
SIGBUS = 1206
SIGQUIT = 1202
SIGSEGV = 1203
SIGTERM = 1201
SIGUSR1 = 1207
SIGXCPU = 1204
SINGULARITYBINDPOINTFAILURE = 1308
SINGULARITYFAILEDUSERNAMESPACE = 1345
SINGULARITYGENERALFAILURE = 1306
SINGULARITYIMAGEMOUNTFAILURE = 1309
SINGULARITYNEWUSERNAMESPACE = 1340
SINGULARITYNOLOOPDEVICES = 1307
SINGULARITYNOTINSTALLED = 1325
SINGULARITYRESOURCEUNAVAILABLE = 1348
SIZETOOLARGE = 1168
STAGEINAUTHENTICATIONFAILURE = 1338
STAGEINFAILED = 1099
STAGEINTIMEOUT = 1151
STAGEOUTFAILED = 1137
STAGEOUTTIMEOUT = 1152
STATFILEPROBLEM = 1352
STDOUTTOOBIG = 1106
TRANSFORMNOTFOUND = 1346
TRFDOWNLOADFAILURE = 1149
UNKNOWNCHECKSUMTYPE = 1314
UNKNOWNCOPYTOOL = 1323
UNKNOWNEXCEPTION = 1301
UNKNOWNPAYLOADFAILURE = 1220
UNKNOWNTRFFAILURE = 1315
UNREACHABLENETWORK = 1327
UNRECOGNIZEDTRFARGUMENTS = 1349
UNRECOGNIZEDTRFSTDERR = 1351
UNSUPPORTEDSL5OS = 1347
USERDIRTOOLARGE = 1104
USERKILL = 1205
XRDCPERROR = 1362
ZEROFILESIZE = 1191
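
The constants above are plain integer class attributes, so a reverse lookup from code to name can be built by inspecting the class. The following is a minimal sketch under stated assumptions: `ErrorCodesSketch` and `code_to_name` are hypothetical names introduced for illustration, and the class holds only a handful of the constants listed above rather than the real `pilot.common.errorcodes.ErrorCodes`.

```python
class ErrorCodesSketch:
    """Hypothetical stand-in holding a few of the constants listed above."""

    GENERALERROR = 1008
    STAGEINFAILED = 1099
    STAGEOUTFAILED = 1137
    LOOPINGJOB = 1150


def code_to_name(cls):
    """Build {code: NAME} from the upper-case integer attributes of a class."""
    return {
        value: name
        for name, value in vars(cls).items()
        if name.isupper() and isinstance(value, int)
    }


names = code_to_name(ErrorCodesSketch)
print(names[1099])
```

Filtering on upper-case names skips dunder attributes and methods, so the same helper should work on any class that follows this naming convention.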

_error_messages = {
    1008: 'General pilot error, consult batch log',
    1098: 'Not enough local space',
    1099: 'Failed to stage-in file',
    1100: 'Replica not found',
    1103: 'No such file or directory',
    1104: 'User work directory too large',
    1106: 'Payload log or stdout file too big',
    1110: 'Failed during payload setup',
    1115: 'NFS SQLite locking problems',
    1116: 'Pilot could not download queuedata',
    1117: 'Pilot found non-valid queuedata',
    1124: 'Output file too large',
    1133: 'Fetching default storage failed: no activity related storage defined',
    1137: 'Failed to stage-out file',
    1141: 'md5sum mismatch on output file',
    1143: 'Failed to chmod transform',
    1144: 'This job was killed by panda server',
    1145: 'md5sum mismatch on input file',
    1149: 'Transform could not be downloaded',
    1150: 'Looping job killed by pilot',
    1151: 'File transfer timed out during stage-in',
    1152: 'File transfer timed out during stage-out',
    1163: 'Grid proxy not valid',
    1165: 'Local output file is missing',
    1168: 'Total file size too large',
    1171: 'adler32 mismatch on input file',
    1172: 'adler32 mismatch on output file',
    1177: 'Voms proxy not valid',
    1180: 'Globus system error during stage-in',
    1181: 'Globus system error during stage-out',
    1186: 'Software directory does not exist',
    1187: 'Payload metadata does not exist',
    1190: 'LFN too long (exceeding limit of 255 characters)',
    1191: 'File size cannot be zero',
    1199: 'Failed to create local directory',
    1200: 'Job terminated by unknown kill signal',
    1201: 'Job killed by signal: SIGTERM',
    1202: 'Job killed by signal: SIGQUIT',
    1203: 'Job killed by signal: SIGSEGV',
    1204: 'Job killed by signal: SIGXCPU',
    1205: 'Job killed by user',
    1206: 'Job killed by signal: SIGBUS',
    1207: 'Job killed by signal: SIGUSR1',
    1211: 'Missing installation',
    1212: 'Payload ran out of memory',
    1213: 'Reached batch system time limit',
    1220: 'Job failed due to unknown reason (consult log file)',
    1221: 'File already exists',
    1223: 'Transform failed due to bad_alloc',
    1224: 'Event service: recoverable error',
    1228: 'Event service: fatal error',
    1234: 'Clone job is already executed',
    1235: 'Payload exceeded maximum allowed memory',
    1236: 'Failed by server',
    1238: 'Event service: no events',
    1240: 'Failed to handle message from payload',
    1242: 'Query checksum is not supported',
    1244: 'No release candidates found',
    1246: 'User tarball could not be downloaded from PanDA server',
    1247: 'Badly formed XML',
    1300: 'The class or function is not implemented',
    1301: 'An unknown pilot exception has occurred',
    1302: 'Failed to convert object data',
    1303: 'Failed during file handling',
    1305: 'Failed to execute payload',
    1306: 'Singularity: general failure',
    1307: 'Singularity: No more available loop devices',
    1308: 'Singularity: Not mounting requested bind point',
    1309: 'Singularity: Failed to mount image',
    1310: 'Exception caught during payload execution',
    1311: 'Not defined',
    1312: 'Not same length',
    1313: 'No protocol defined for storage endpoint',
    1314: 'Unknown checksum type',
    1315: 'Unknown transform failure',
    1316: 'Rucio: Service unavailable',
    1317: 'Exceeded maximum waiting time',
    1318: 'Failed to communicate with server',
    1319: 'An internal Pilot problem has occurred (consult Pilot log)',
    1320: 'Failed during creation of log file',
    1321: 'Failed to get client location for Rucio',
    1322: 'Failed to get replicas from Rucio',
    1323: 'Unknown copy tool',
    1324: 'Service not available at the moment',
    1325: 'Singularity: not installed',
    1326: 'No matching replicas were found in list_replicas() output',
    1327: 'Unable to stage-in file since network is unreachable',
    1328: 'SIGSEGV: Invalid memory reference or a segmentation fault',
    1329: 'Failed to construct SURL for non-deterministic ddm (update CRIC)',
    1330: 'JSON retrieval timed out',
    1331: 'Input file is missing in storage element',
    1332: 'Black hole detected in file system (consult Pilot log)',
    1333: 'No space left on device',
    1334: 'Setup failed with a fatal exception (consult Payload log)',
    1335: 'User code not available on PanDA server (resubmit task with --useNewCode)',
    1336: 'Job is already running elsewhere',
    1337: 'Memory monitor produced bad output',
    1338: 'Authentication failure during stage-in',
    1339: 'Local DBRelease handling failed (consult Pilot log)',
    1340: 'Singularity: Failed invoking the NEWUSER namespace runtime',
    1341: 'Bad queue configuration detected',
    1342: 'Failed to import middleware (consult Pilot log)',
    1343: 'Found no output in job report',
    1344: 'Resource temporarily unavailable',
    1345: 'Singularity: Failed to create user namespace',
    1346: 'Transform not found',
    1347: 'Unsupported SL5 OS',
    1348: 'Singularity: Resource temporarily unavailable',
    1349: 'Unrecognized transform arguments',
    1350: 'Empty output file detected',
    1351: 'Unrecognized fatal error in transform stderr',
    1352: 'Failed to stat proc file for CPU consumption calculation',
    1353: 'CPU consumption calculation failed: No such process',
    1354: 'General CPU consumption calculation problem (consult Pilot log)',
    1355: 'Core dump detected',
    1356: 'Pre-process command failed',
    1357: 'Post-process command failed',
    1358: 'Missing release setup in unpacked container',
    1359: 'PanDA queue is not active',
    1360: 'Image not found',
    1361: 'Remote file could not be opened',
    1362: 'Xrdcp was unable to open file',
    1363: 'Raythena has decided to kill payload',
    1364: 'Unable to locate credentials for S3 transfer',
    1365: 'Python module ctypes not available on worker node',
}
add_error_code(errorcode, pilot_error_codes=[], pilot_error_diags=[], priority=False, msg=None)

Add pilot error code to list of error codes. This function adds the given error code to the list of all errors that have occurred. This is needed since several errors can happen; e.g. a stage-in error can be followed by a stage-out error during the log transfer. The full list of errors is dumped to the log, but only the first error is reported to the server. The function also sets the corresponding error message.

Parameters:
  • errorcode – pilot error code (integer)

  • pilot_error_codes – list of pilot error codes (list of integers)

  • pilot_error_diags – list of pilot error diags (list of strings)

  • priority – if set to True, the new errorcode will be added to the error code list first (highest priority)

  • msg – error message (more detailed) to overwrite standard error message (string).

Returns:

pilot_error_codes, pilot_error_diags
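
The behaviour described above can be illustrated with a small stand-alone function. This is a sketch under stated assumptions, not the Pilot implementation: `MESSAGES` is a two-entry excerpt of the message table, the fallback text for unknown codes is invented for the example, and the sketch uses `None` defaults (rather than the documented `[]` defaults) so that no list is shared between calls.

```python
# Excerpt of the message table; the real class keeps the full mapping.
MESSAGES = {
    1099: 'Failed to stage-in file',
    1137: 'Failed to stage-out file',
}


def add_error_code(errorcode, pilot_error_codes=None, pilot_error_diags=None,
                   priority=False, msg=None):
    """Return new code and diag lists with errorcode added (sketch only)."""
    codes = list(pilot_error_codes or [])
    diags = list(pilot_error_diags or [])
    # A caller-supplied msg overrides the standard message.
    # The fallback text for unknown codes is an assumption.
    diag = msg if msg else MESSAGES.get(errorcode, f'Unknown error code: {errorcode}')
    if priority:
        # Highest priority: the new error goes first.
        codes.insert(0, errorcode)
        diags.insert(0, diag)
    else:
        codes.append(errorcode)
        diags.append(diag)
    return codes, diags


# A stage-in error followed by a stage-out error during the log transfer.
codes, diags = add_error_code(1099)
codes, diags = add_error_code(1137, codes, diags)
print(codes[0], diags[0])
```

In this sketch the first list element plays the role of the error reported to the server, which is why `priority=True` inserts at index 0.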

extract_stderr_error(stderr)

Extract the ERROR message from the payload stderr.

Parameters:

stderr – string.

Returns:

string.

extract_stderr_warning(stderr)

Extract the WARNING message from the payload stderr.

Parameters:

stderr – string.

Returns:

string.

format_diagnostics(code, diag)

Format the error diagnostics by adding the standard error message and the tail of the longer piloterrordiag. If there is any kind of failure handling the diagnostics string, the standard error description will be returned.

Parameters:
  • code – standard error code (int).

  • diag – dynamic error diagnostics (string).

Returns:

formatted error diagnostics (string).
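
One way to read the description above is sketched below. The separator, the `max_tail` length, and the one-entry `MESSAGES` excerpt are assumptions made for illustration; the actual formatting used by Pilot may differ.

```python
# Excerpt of the message table, for illustration only.
MESSAGES = {1305: 'Failed to execute payload'}


def format_diagnostics(code, diag, max_tail=256):
    """Combine the standard message with the tail of diag (sketch only).

    max_tail is a hypothetical limit on how much of diag is kept.
    """
    standard = MESSAGES.get(code, '')
    try:
        # Keep only the end of a long diagnostics string.
        tail = diag[-max_tail:] if len(diag) > max_tail else diag
        return f'{standard}: {tail}' if tail else standard
    except Exception:
        # Any failure while handling diag falls back to the standard text.
        return standard


print(format_diagnostics(1305, 'payload exited with code 1'))
```

Passing `None` as `diag` makes `len()` raise, which exercises the fallback path and returns the standard description alone.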

get_error_message(errorcode)

Return the error message corresponding to the given error code.

Parameters:

errorcode – Pilot error code (integer).

Returns:

errormessage (string)
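
The lookup amounts to a dictionary access on the message table. A minimal sketch, assuming a fallback string for codes that are missing from the table (the documentation above does not say what the real method returns in that case):

```python
# Two entries copied from _error_messages; the real table is much longer.
MESSAGES = {
    1008: 'General pilot error, consult batch log',
    1333: 'No space left on device',
}


def get_error_message(errorcode):
    """Return the standard message for errorcode (sketch only)."""
    # The fallback text is an assumption, not documented behaviour.
    return MESSAGES.get(errorcode, f'Unknown error code: {errorcode}')


print(get_error_message(1333))
```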

get_kill_signal_error_code(signal)

Match a kill signal with a corresponding Pilot error code.

Parameters:

signal – signal name (string).

Returns:

Pilot error code (integer).
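
The signal constants listed earlier suggest a simple name-to-code table. The sketch below copies those values; falling back to `KILLSIGNAL` (1200, "Job terminated by unknown kill signal") for unrecognised names is an assumption based on that message, not documented behaviour.

```python
# Values copied from the constants listed earlier on this page.
SIGNAL_CODES = {
    'SIGTERM': 1201,
    'SIGQUIT': 1202,
    'SIGSEGV': 1203,
    'SIGXCPU': 1204,
    'SIGBUS': 1206,
    'SIGUSR1': 1207,
}
KILLSIGNAL = 1200


def get_kill_signal_error_code(signal_name):
    """Map a signal name such as 'SIGTERM' to a Pilot error code (sketch only)."""
    # Assumed fallback: unknown signals map to the generic KILLSIGNAL code.
    return SIGNAL_CODES.get(signal_name, KILLSIGNAL)


print(get_kill_signal_error_code('SIGTERM'))
```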

get_message_for_pattern(patterns, stderr)
Parameters:
  • patterns – list of patterns.

  • stderr – string.

Returns:

string.

classmethod is_recoverable(code=0)

Determine whether code is a recoverable error code or not.

Parameters:

code – Pilot error code (int).

Returns:

boolean.

put_error_codes = [1135, 1136, 1137, 1141, 1152, 1181]
recoverable_error_codes = [0, 1135, 1136, 1137, 1141, 1152, 1181]
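
Given the two lists above, `is_recoverable` can be pictured as a membership test on `recoverable_error_codes`. The class below is a hypothetical stand-in (`ErrorCodesSketch`) that copies the list values shown here; it is a sketch of the idea, not the Pilot source.

```python
class ErrorCodesSketch:
    """Hypothetical stand-in carrying only the recoverable-code list."""

    recoverable_error_codes = [0, 1135, 1136, 1137, 1141, 1152, 1181]

    @classmethod
    def is_recoverable(cls, code=0):
        """Return True if code appears in recoverable_error_codes."""
        return code in cls.recoverable_error_codes


# Callable on the class itself, no instance needed.
print(ErrorCodesSketch.is_recoverable(1137))
print(ErrorCodesSketch.is_recoverable(1099))
```

Because the default `code=0` is itself in the list, calling the method without arguments reports a recoverable state in this sketch.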
remove_error_code(errorcode, pilot_error_codes=[], pilot_error_diags=[])

Silently remove an error code and its diagnostics from the internal error lists. There is no warning or exception thrown in case the error code is not present in the lists.

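Parameters:
  • errorcode – error code (int).

  • pilot_error_codes – list of pilot error codes (list of integers)

  • pilot_error_diags – list of pilot error diags (list of strings)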

Returns:

pilot_error_codes, pilot_error_diags
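
A minimal sketch of the silent removal described above. It assumes the two lists are parallel, so that the diagnostics for a code sit at the same index as the code itself; that pairing is an assumption of the example, as are the `None` defaults.

```python
def remove_error_code(errorcode, pilot_error_codes=None, pilot_error_diags=None):
    """Return copies of both lists without errorcode (sketch only)."""
    codes = list(pilot_error_codes or [])
    diags = list(pilot_error_diags or [])
    if errorcode in codes:
        index = codes.index(errorcode)
        del codes[index]
        # Assumes diags[index] belongs to codes[index].
        if index < len(diags):
            del diags[index]
    # A missing code is not an error: the lists come back unchanged.
    return codes, diags


codes, diags = remove_error_code(1099, [1099, 1137], ['stage-in', 'stage-out'])
print(codes, diags)
```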

report_errors(pilot_error_codes, pilot_error_diags)

Report all errors that occurred during running. The function should be called towards the end of running a job.

Parameters:
  • pilot_error_codes – list of pilot error codes (list of integers)

  • pilot_error_diags – list of pilot error diags (list of strings)

Returns:

error_report (string)
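
The report can be thought of as a numbered listing of the collected codes and diagnostics. The layout below (tab-separated columns, the header line, and the text returned for an empty list) is entirely an assumption for illustration; only the inputs and the string return type come from the documentation above.

```python
def report_errors(pilot_error_codes, pilot_error_diags):
    """Build a multi-line error report string (sketch only)."""
    if not pilot_error_codes:
        # Hypothetical text for the no-error case.
        return 'no pilot errors were reported'
    lines = ['Nr.\tError code\tError diagnostics']
    pairs = zip(pilot_error_codes, pilot_error_diags)
    for number, (code, diag) in enumerate(pairs, start=1):
        lines.append(f'{number}.\t{code}\t{diag}')
    return '\n'.join(lines)


print(report_errors([1099, 1137],
                    ['Failed to stage-in file', 'Failed to stage-out file']))
```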

resolve_transform_error(exit_code, stderr)

Assign a pilot error code to a specific transform error.

Parameters:
  • exit_code – transform exit code.

  • stderr – transform stderr.

Returns:

pilot error code (int).