RiskScape not finishing Slurm job

I am running RiskScape with sbatch under Slurm (the Simple Linux Utility for Resource Management). When RiskScape finishes, the --progress-indicator file reports “Complete!”, but the process keeps running and the Slurm job has to be cancelled manually.

powellj@mahuika01 /nesi/project/niwa03150/road $ /nesi/project/niwa03670/riskscape_1_5/bin/riskscape --version
[WARNING] The 'beta' plugin is enabled. This contains experimental features that may significantly change or be deprecated in future releases.
RiskScape Core Engine v1.5.0
----------------------------
Build time - Thu Jul 06 14:02:24 UTC 2023
Git SHA1   - e908b62c21539872d670206e73a26b42c871efc5

Plugins:
engine       1.5.0  nz.org.riskscape.engine.core.EnginePlugin
cli          1.5.0  nz.org.riskscape.engine.cli.CliPlugin
defaults     1.5.0  nz.org.riskscape.engine.defaults.Plugin
postgis      1.5.0  nz.org.riskscape.postgis.Plugin
beta         1.5.0  nz.org.riskscape.beta.Plugin
jython       1.5.0  nz.org.riskscape.jython.Plugin
wizard       1.5.0  nz.org.riskscape.wizard.WizardPlugin
cpython      1.5.0  nz.org.riskscape.cpython.CPythonPlugin
wizard-cli   1.5.0  nz.org.riskscape.wizard.WizardCliPlugin

System:
Linux 3.10.0-693.2.2.el7.x86_64
Java 17.0.1 Java HotSpot(TM) 64-Bit Server VM 17.0.1+12-LTS-39

Hello John,

Can you paste the command you’re using to run RiskScape, and the output as well if possible?

Cheers,
Nick

Hey Nick

The Slurm file is:

#!/bin/bash -e
#SBATCH --job-name=RisckscapeRoadModel
#SBATCH --time=00:10:00
#SBATCH --mem=100G
#SBATCH --account=NIWA03150
#SBATCH --cpus-per-task=72

module load Riskscape
export RISKSCAPE_OPTS="-Xmx80g"
region="STHL"
/nesi/project/niwa03670/riskscape_1_5/bin/riskscape --pipeline-threads=72 model run RoadSegmentFloodDamage_0_clipped --progress-indicator=progress_roads0.txt

and the output of progress_roads0.txt is

powellj@mahuika01 /nesi/project/niwa03150/road $ cat progress_roads0.txt
Complete!

Hi John,

It’s possible RiskScape might still be busy generating the QGIS project file (project.qgs). The QGIS project file gets generated after the model pipeline has completed. This process could take a while if you have very large output files, or if RiskScape encounters some problem trying to re-read the output files it just created.

You could try turning off the QGIS project file generation by adding something like this to your settings.ini file:

[output]
qgis-project = false

Cheers,
Tim

Thanks Tim, I have added that to my settings now.

One last output file to add

powellj@mahuika01 /nesi/project/niwa03150/road $ cat slurm-43620011.out
[WARNING] The 'beta' plugin is enabled. This contains experimental features that may significantly change or be deprecated in future releases.
file:/scale_wlg_persistent/filesets/project/niwa03150/road/output/RoadSegmentFloodDamage_0_clipped/2024-02-07T03_21_53/max-loss.csv
file:/scale_wlg_persistent/filesets/project/niwa03150/road/output/RoadSegmentFloodDamage_0_clipped/2024-02-07T03_21_53/average-loss.csv
file:/scale_wlg_persistent/filesets/project/niwa03150/road/output/RoadSegmentFloodDamage_0_clipped/2024-02-07T03_21_53/moredetail_aggregate.csv
file:/scale_wlg_persistent/filesets/project/niwa03150/road/output/RoadSegmentFloodDamage_0_clipped/2024-02-07T03_21_53/aggregate.csv
slurmstepd: error: *** JOB 43620011 ON wbn083 CANCELLED AT 2024-02-07T03:28:20 ***

I cancelled the job manually, which is what produced the error at the end.

After adding

[output]
qgis-project = false

I have had one model finish correctly and another that kept running after it had finished.

Hmmmm, I guess you could try running the model with the --log-level INFO CLI option and see what the last thing the engine tries to do is.
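
With INFO logging on, the job output gets long, so it can help to pull out just the last engine log line. A minimal sketch, assuming the job output lands in a slurm-<jobid>.out file and that log lines begin with an HH:MM:SS.mmm timestamp followed by a [thread] name; the function name last_engine_step is made up for illustration:

```shell
# Print the last timestamped log line from a Slurm output file.
# Assumption: log lines look like
#   21:40:06.635 [scheduler-thread] INFO  ... - Completed step(s) ...
# Lines without a leading timestamp (e.g. the file:/... output paths) are skipped.
last_engine_step() {
    grep -E '^[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3} \[' "$1" | tail -n 1
}

# Usage: last_engine_step slurm-43630198.out
```

If the last line is a "Completed step(s)" message for a save step, the pipeline itself has probably finished and whatever is keeping the process alive happens afterwards.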

Hi Tim,
Here is the output

powellj@mahuika01 /nesi/project/niwa03150/road $ cat slurm-43630198.out
21:38:14.388 [main] INFO  nz.org.riskscape.engine.cli.Main - RiskScape engine started with 81920 Mb max memory
21:38:14.446 [main] WARN  org.jline - Unable to create a system terminal, creating a dumb terminal (enable debug logging for more information)
21:38:14.465 [main] INFO  n.o.r.e.i18n.ResourceBundleControl - Created resource bundle for base+locale cli-help+ from [file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/i18n/cli-help.properties]
21:38:14.467 [main] INFO  n.o.r.e.i18n.ResourceBundleControl - Created resource bundle for base+locale cli-help+en from []
21:38:14.467 [main] INFO  n.o.r.e.i18n.ResourceBundleControl - Created resource bundle for base+locale cli-help+en_US from []
21:38:14.474 [main] INFO  n.o.r.engine.cli.CliBootstrap - Looking for settings file - /home/powellj/.config/riskscape/settings.ini
21:38:14.496 [main] INFO  n.o.r.engine.plugin.PluginRepository - Plugins: looking in: /scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins
21:38:14.503 [main] INFO  n.o.r.engine.cli.CliBootstrap - Starting plugins ...
21:38:14.503 [main] INFO  n.o.r.engine.plugin.PluginRepository - Attempting to activate plugin BuiltInPluginDescriptor(engine nz.org.riskscape.engine.core.EnginePlugin)
21:38:14.506 [main] INFO  n.o.r.engine.cli.CliBootstrap -   engine...
21:38:14.506 [main] INFO  n.o.r.engine.plugin.PluginRepository - Attempting to activate plugin BuiltInPluginDescriptor(cli nz.org.riskscape.engine.cli.CliPlugin)
21:38:14.506 [main] INFO  n.o.r.engine.cli.CliBootstrap -   cli...
21:38:14.506 [main] INFO  n.o.r.engine.plugin.PluginRepository - Attempting to activate plugin DefaultPluginDescriptor(source=file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins-optional/beta/, pluginId=beta)
21:38:14.507 [main] INFO  n.o.r.engine.plugin.PluginRepository - Activating plugin dependency DefaultPluginDescriptor(source=file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/defaults/, pluginId=defaults)
21:38:14.507 [main] INFO  n.o.r.engine.plugin.PluginRepository - Attempting to activate plugin DefaultPluginDescriptor(source=file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/defaults/, pluginId=defaults)
21:38:14.510 [main] INFO  n.o.r.engine.cli.CliBootstrap -   defaults...
21:38:14.523 [main] INFO  n.o.r.e.d.c.CoverageFileBookmarkResolver - Default coverage tile cache changed from 16.0 to 4096.0 Mb
21:38:14.523 [main] INFO  n.o.r.engine.plugin.PluginRepository - Activating plugin dependency DefaultPluginDescriptor(source=file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/postgis/, pluginId=postgis)
21:38:14.523 [main] INFO  n.o.r.engine.plugin.PluginRepository - Attempting to activate plugin DefaultPluginDescriptor(source=file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/postgis/, pluginId=postgis)
21:38:14.523 [main] INFO  n.o.r.engine.plugin.PluginRepository - Activating plugin dependency DefaultPluginDescriptor(source=file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/defaults/, pluginId=defaults)
21:38:14.523 [main] INFO  n.o.r.engine.plugin.PluginRepository - Attempting to activate plugin DefaultPluginDescriptor(source=file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/defaults/, pluginId=defaults)
21:38:14.523 [main] INFO  n.o.r.engine.plugin.PluginRepository - Not activating DefaultPluginDescriptor(source=file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/defaults/, pluginId=defaults) - already activated
21:38:14.528 [main] INFO  n.o.r.engine.cli.CliBootstrap -   postgis...
21:38:14.529 [main] INFO  n.o.r.engine.cli.CliBootstrap -   beta...
21:38:14.534 [main] INFO  n.o.r.engine.plugin.PluginRepository - Attempting to activate plugin DefaultPluginDescriptor(source=file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/postgis/, pluginId=postgis)
21:38:14.536 [main] INFO  n.o.r.engine.plugin.PluginRepository - Not activating DefaultPluginDescriptor(source=file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/postgis/, pluginId=postgis) - already activated
21:38:14.536 [main] INFO  n.o.r.engine.plugin.PluginRepository - Attempting to activate plugin DefaultPluginDescriptor(source=file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/defaults/, pluginId=defaults)
21:38:14.536 [main] INFO  n.o.r.engine.plugin.PluginRepository - Not activating DefaultPluginDescriptor(source=file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/defaults/, pluginId=defaults) - already activated
21:38:14.536 [main] INFO  n.o.r.engine.plugin.PluginRepository - Attempting to activate plugin DefaultPluginDescriptor(source=file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/jython/, pluginId=jython)
21:38:14.552 [main] INFO  n.o.r.engine.cli.CliBootstrap -   jython...
21:38:14.552 [main] INFO  n.o.r.engine.plugin.PluginRepository - Attempting to activate plugin DefaultPluginDescriptor(source=file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/wizard/, pluginId=wizard)
21:38:14.553 [main] INFO  n.o.r.engine.cli.CliBootstrap -   wizard...
21:38:14.553 [main] INFO  n.o.r.engine.plugin.PluginRepository - Attempting to activate plugin DefaultPluginDescriptor(source=file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/cpython/, pluginId=cpython)
21:38:14.554 [main] INFO  n.o.r.engine.cli.CliBootstrap -   cpython...
21:38:14.556 [reaper] INFO  nz.org.riskscape.cpython.Reaper - CPython process reaper thread starting
21:38:14.557 [main] INFO  n.o.riskscape.cpython.CPythonSpawner - Launching command [python3, /scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/cpython/./checkPython.py]
21:38:14.596 [main] INFO  n.o.r.engine.plugin.PluginRepository - Attempting to activate plugin DefaultPluginDescriptor(source=file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/wizard-cli/, pluginId=wizard-cli)
21:38:14.596 [main] INFO  n.o.r.engine.plugin.PluginRepository - Activating plugin dependency DefaultPluginDescriptor(source=file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/wizard/, pluginId=wizard)
21:38:14.596 [main] INFO  n.o.r.engine.plugin.PluginRepository - Attempting to activate plugin DefaultPluginDescriptor(source=file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/wizard/, pluginId=wizard)
21:38:14.596 [main] INFO  n.o.r.engine.plugin.PluginRepository - Not activating DefaultPluginDescriptor(source=file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/wizard/, pluginId=wizard) - already activated
21:38:14.598 [main] INFO  n.o.r.engine.cli.CliBootstrap -   wizard-cli...
21:38:14.625 [main] INFO  n.o.r.engine.plugin.PluginRepository - Plugin features initialized
21:38:14.628 [main] WARN  org.jline - Unable to create a system terminal, creating a dumb terminal (enable debug logging for more information)
21:38:14.716 [main] INFO  n.o.r.e.i18n.ResourceBundleControl - Created resource bundle for base+locale cli-help+ from [file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/i18n/cli-help.properties, jar:file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/wizard/plugin.jar!/i18n/cli-help.properties, jar:file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/wizard-cli/plugin.jar!/i18n/cli-help.properties]
21:38:14.717 [main] INFO  n.o.r.e.i18n.ResourceBundleControl - Created resource bundle for base+locale cli-help+en from []
21:38:14.718 [main] INFO  n.o.r.e.i18n.ResourceBundleControl - Created resource bundle for base+locale cli-help+en_US from []
21:38:15.045 [main] INFO  n.o.r.e.i18n.ResourceBundleControl - Created resource bundle for base+locale problems+ from [file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/i18n/problems.properties, jar:file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/lib/api.jar!/i18n/problems.properties, jar:file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/defaults/plugin.jar!/i18n/problems.properties, jar:file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/postgis/plugin.jar!/i18n/problems.properties, jar:file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins-optional/beta/plugin.jar!/i18n/problems.properties, jar:file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/jython/plugin.jar!/i18n/problems.properties, jar:file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/wizard/plugin.jar!/i18n/problems.properties, jar:file:/scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/cpython/plugin.jar!/i18n/problems.properties]
21:38:15.048 [main] INFO  n.o.r.e.i18n.ResourceBundleControl - Created resource bundle for base+locale problems+en from []
21:38:15.049 [main] INFO  n.o.r.e.i18n.ResourceBundleControl - Created resource bundle for base+locale problems+en_US from []
[WARNING] The 'beta' plugin is enabled. This contains experimental features that may significantly change or be deprecated in future releases.
21:38:15.062 [reaper] INFO  nz.org.riskscape.cpython.Reaper - CPython process reaper thread starting
21:38:16.884 [main] INFO  n.o.r.e.data.relation.LockDefeater - Started lock defeat thread Thread[lock-defeater,5,main]
21:38:16.885 [main] INFO  n.o.r.e.data.relation.LockDefeater - Waiting for task create-datastore to complete
21:38:16.903 [main] INFO  n.o.r.e.data.relation.LockDefeater - ... task create-datastore is complete
21:38:16.904 [main] INFO  n.o.r.e.data.relation.LockDefeater - Waiting for task get-feature-source to complete
21:38:16.914 [main] INFO  n.o.r.e.data.relation.LockDefeater - ... task get-feature-source is complete
21:38:16.914 [main] INFO  n.o.r.e.data.relation.LockDefeater - Waiting for task get-crs to complete
21:38:17.134 [main] INFO  n.o.r.e.data.relation.LockDefeater - ... task get-crs is complete
21:38:17.317 [main] INFO  n.o.r.e.data.relation.LockDefeater - Waiting for task create-datastore to complete
21:38:17.318 [main] INFO  n.o.r.e.data.relation.LockDefeater - ... task create-datastore is complete
21:38:17.318 [main] INFO  n.o.r.e.data.relation.LockDefeater - Waiting for task get-feature-source to complete
21:38:17.318 [main] INFO  n.o.r.e.data.relation.LockDefeater - ... task get-feature-source is complete
21:38:17.318 [main] INFO  n.o.r.e.data.relation.LockDefeater - Waiting for task get-crs to complete
21:38:17.323 [main] INFO  n.o.r.e.data.relation.LockDefeater - ... task get-crs is complete
21:38:17.333 [main] INFO  n.o.r.e.data.relation.LockDefeater - Waiting for task create-datastore to complete
21:38:17.333 [main] INFO  n.o.r.e.data.relation.LockDefeater - ... task create-datastore is complete
21:38:17.333 [main] INFO  n.o.r.e.data.relation.LockDefeater - Waiting for task get-feature-source to complete
21:38:17.334 [main] INFO  n.o.r.e.data.relation.LockDefeater - ... task get-feature-source is complete
21:38:17.334 [main] INFO  n.o.r.e.data.relation.LockDefeater - Waiting for task get-crs to complete
21:38:17.337 [main] INFO  n.o.r.e.data.relation.LockDefeater - ... task get-crs is complete
21:38:17.340 [main] INFO  n.o.r.e.data.relation.LockDefeater - Waiting for task create-datastore to complete
21:38:17.341 [main] INFO  n.o.r.e.data.relation.LockDefeater - ... task create-datastore is complete
21:38:17.341 [main] INFO  n.o.r.e.data.relation.LockDefeater - Waiting for task get-feature-source to complete
21:38:17.341 [main] INFO  n.o.r.e.data.relation.LockDefeater - ... task get-feature-source is complete
21:38:17.341 [main] INFO  n.o.r.e.data.relation.LockDefeater - Waiting for task get-crs to complete
21:38:17.344 [main] INFO  n.o.r.e.data.relation.LockDefeater - ... task get-crs is complete
21:38:17.353 [main] INFO  n.o.r.engine.DefaultFunctionResolver - function ['sample_centroid' type=IDENTIFIER (10 : 15-30)] (Coercing(nz.org.riskscape.engine.function.geometry.SampleCoverageAtCentroid$1@1fc1c7e)) returned a struct which could not be normalized.  RiskscapeFunction implementations should normalize any struct return types to avoid possible struct member owner errors
21:38:17.356 [main] INFO  n.o.r.engine.DefaultFunctionResolver - function ['sample_centroid' type=IDENTIFIER (12 : 15-30)] (Coercing(nz.org.riskscape.engine.function.geometry.SampleCoverageAtCentroid$1@75cacb3e)) returned a struct which could not be normalized.  RiskscapeFunction implementations should normalize any struct return types to avoid possible struct member owner errors
21:38:17.531 [scheduler-thread] INFO  n.o.r.e.data.relation.LockDefeater - Waiting for task build-fs-from-query to complete
21:38:17.543 [scheduler-thread] INFO  n.o.r.e.data.relation.LockDefeater - ... task build-fs-from-query is complete
21:38:17.546 [scheduler-thread] INFO  n.o.r.e.data.relation.LockDefeater - Waiting for task build-fs-from-query to complete
21:38:17.547 [scheduler-thread] INFO  n.o.r.e.data.relation.LockDefeater - ... task build-fs-from-query is complete
21:38:17.547 [scheduler-thread] INFO  n.o.r.e.data.relation.LockDefeater - Waiting for task build-fs-from-query to complete
21:38:17.548 [scheduler-thread] INFO  n.o.r.e.data.relation.LockDefeater - ... task build-fs-from-query is complete
21:38:17.549 [scheduler-thread] INFO  n.o.r.e.data.relation.LockDefeater - Waiting for task build-fs-from-query to complete
21:38:17.550 [scheduler-thread] INFO  n.o.r.e.data.relation.LockDefeater - ... task build-fs-from-query is complete
21:38:17.583 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Added 1249 new tasks
21:38:17.590 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) hazard_input:[input]
21:38:17.628 [scheduler-thread] INFO  n.o.r.e.data.relation.LockDefeater - Waiting for task close to complete
21:38:17.628 [scheduler-thread] INFO  n.o.r.e.data.relation.LockDefeater - ... task close is complete
21:38:17.629 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) exposures_input:[input]
21:38:17.773 [scheduler-thread] INFO  n.o.r.e.data.relation.LockDefeater - Waiting for task close to complete
21:38:17.773 [scheduler-thread] INFO  n.o.r.e.data.relation.LockDefeater - ... task close is complete
21:38:17.773 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) resources_input:[input]
21:38:17.802 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) resources:[group]
21:38:17.803 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - marking AccumulatorProcessorTask(resources:[group])'s dependency on SinkTask(resources:[group]) as satisfied
21:38:17.804 [execution-worker-1] INFO  n.o.r.e.t.AccumulatorProcessorTask - About to reduce 72 accumulators...
21:38:17.807 [scheduler-thread] INFO  n.o.r.e.data.relation.LockDefeater - Waiting for task close to complete
21:38:17.807 [scheduler-thread] INFO  n.o.r.e.data.relation.LockDefeater - ... task close is complete
21:38:17.807 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) areas_input:[input]
21:38:17.811 [execution-worker-1] INFO  n.o.r.e.t.AccumulatorProcessorTask -    ...done
21:38:17.819 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) resources:[group]
21:38:17.823 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) areas:[group]
21:38:17.823 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - marking AccumulatorProcessorTask(areas:[group])'s dependency on SinkTask(areas:[group]) as satisfied
21:38:17.823 [execution-worker-1] INFO  n.o.r.e.t.AccumulatorProcessorTask - About to reduce 72 accumulators...
21:38:17.824 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) exposures_join_resources:[join]
21:38:17.824 [execution-worker-1] INFO  n.o.r.e.t.AccumulatorProcessorTask -    ...done
21:38:17.824 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - marking IndexEmitterTask(exposures_join_resources:[join])'s dependency on IndexBuilderTask(exposures_join_resources:[join]) as satisfied
21:38:17.833 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) areas:[group]
21:38:17.846 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) exposures_join_areas:[join]
21:38:17.847 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - marking IndexEmitterTask(exposures_join_areas:[join])'s dependency on IndexBuilderTask(exposures_join_areas:[join]) as satisfied
21:38:18.050 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) select:[select], unnest:[unnest], select_1:[select], select_2:[select], exposure_geoprocessed:[select], select_3:[select]
21:38:18.912 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) hazard_and_coverage:[select]
21:38:18.912 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) exposures_join_hazards:[join]
21:38:18.913 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - marking IndexEmitterTask(exposures_join_hazards:[join])'s dependency on IndexBuilderTask(exposures_join_hazards:[join]) as satisfied
21:38:33.802 [scheduler-thread] INFO  n.o.r.e.data.relation.LockDefeater - Waiting for task close to complete
21:38:33.803 [scheduler-thread] INFO  n.o.r.e.data.relation.LockDefeater - ... task close is complete
21:38:33.803 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) resources_input2:[input]
21:38:33.818 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) resource2:[group]
21:38:33.818 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - marking AccumulatorProcessorTask(resource2:[group])'s dependency on SinkTask(resource2:[group]) as satisfied
21:38:33.819 [execution-worker-1] INFO  n.o.r.e.t.AccumulatorProcessorTask - About to reduce 72 accumulators...
21:38:33.822 [execution-worker-1] INFO  n.o.r.e.t.AccumulatorProcessorTask -    ...done
21:38:33.846 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) resource2:[group]
21:38:33.847 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) exposures_join_resources2:[join]
21:38:33.847 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - marking IndexEmitterTask(exposures_join_resources2:[join])'s dependency on IndexBuilderTask(exposures_join_resources2:[join]) as satisfied
21:39:06.259 [execution-worker-56] INFO  n.o.riskscape.cpython.CPythonSpawner - Launching command [python3, /scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/cpython/./rs2CPyAdaptor.py, /scale_wlg_persistent/filesets/project/niwa03150/road/Function/RoadSegmentDamageCFunc.py, Struct[DR_Low=>Floating, DR_High=>Floating, DR=>Floating, SegmentCost=>Floating], Struct[ONRC=>Nullable[Text], Road_ID=>Nullable[Integer], segment_area=>Nullable[Floating]], Nullable[Floating], Struct[LowVel=>Nullable[Integer]], Struct[gridcode=>Nullable[Integer]]]
21:39:07.104 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) exposures_join_resources:[join]
21:39:32.995 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) select_4:[select]
21:39:51.369 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) exposures_join_resources2:[join]
21:40:03.554 [execution-worker-5] INFO  n.o.riskscape.cpython.CPythonSpawner - Launching command [python3, /scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/cpython/./rs2CPyAdaptor.py, /scale_wlg_persistent/filesets/project/niwa03150/road/Function/RoadSegmentDamageCFunc.py, Struct[DR_Low=>Floating, DR_High=>Floating, DR=>Floating, SegmentCost=>Floating], Struct[ONRC=>Nullable[Text], Road_ID=>Nullable[Integer], segment_area=>Nullable[Floating]], Nullable[Floating], Struct[LowVel=>Nullable[Integer]], Struct[gridcode=>Nullable[Integer]]]
21:40:03.554 [execution-worker-8] INFO  n.o.riskscape.cpython.CPythonSpawner - Launching command [python3, /scale_wlg_persistent/filesets/project/niwa03670/riskscape_1_5/plugins/cpython/./rs2CPyAdaptor.py, /scale_wlg_persistent/filesets/project/niwa03150/road/Function/RoadSegmentDamageCFunc.py, Struct[DR_Low=>Floating, DR_High=>Floating, DR=>Floating, SegmentCost=>Floating], Struct[ONRC=>Nullable[Text], Road_ID=>Nullable[Integer], segment_area=>Nullable[Floating]], Nullable[Floating], Struct[LowVel=>Nullable[Integer]], Struct[gridcode=>Nullable[Integer]]]
21:40:06.604 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) select_5:[select]
21:40:06.607 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) exposures_join_areas:[join]
21:40:06.610 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) select_6:[select], exposures:[select]
21:40:06.612 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) exposures_join_hazards:[join]
21:40:06.614 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) select_7:[select], sampled:[select], select_8:[select], analysis:[select]
21:40:06.621 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) select_9:[select], select_9-sink:[select]
21:40:06.622 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - marking AccumulatorProcessorTask(group:[group])'s dependency on ChainTask(select_9:[select], select_9-sink:[select]) as satisfied
21:40:06.622 [execution-worker-1] INFO  n.o.r.e.t.AccumulatorProcessorTask - About to reduce 72 accumulators...
21:40:06.622 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) select_10:[select], select_10-sink:[select]
21:40:06.622 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - marking AccumulatorProcessorTask(group_1:[group])'s dependency on ChainTask(select_10:[select], select_10-sink:[select]) as satisfied
21:40:06.622 [execution-worker-2] INFO  n.o.r.e.t.AccumulatorProcessorTask - About to reduce 72 accumulators...
21:40:06.626 [execution-worker-2] INFO  n.o.r.e.t.AccumulatorProcessorTask -    ...done
21:40:06.626 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) group_1:[group]
21:40:06.626 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) aggregate_region:[sort]
21:40:06.626 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - marking AccumulatorProcessorTask(aggregate_region:[sort])'s dependency on SinkTask(aggregate_region:[sort]) as satisfied
21:40:06.627 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) aggregate_region:[sort]
21:40:06.628 [execution-worker-1] INFO  n.o.r.e.t.AccumulatorProcessorTask -    ...done
21:40:06.628 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) group:[group]
21:40:06.629 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) save_1:[save]
21:40:06.630 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) event_loss_table:[select], event_loss_table-sink:[select]
21:40:06.630 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - marking AccumulatorProcessorTask(group_2:[group])'s dependency on ChainTask(event_loss_table:[select], event_loss_table-sink:[select]) as satisfied
21:40:06.630 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - marking AccumulatorProcessorTask(max:[group])'s dependency on ChainTask(event_loss_table:[select], event_loss_table-sink:[select]) as satisfied
21:40:06.630 [execution-worker-1] INFO  n.o.r.e.t.AccumulatorProcessorTask - About to reduce 43 accumulators...
21:40:06.630 [execution-worker-2] INFO  n.o.r.e.t.AccumulatorProcessorTask - About to reduce 43 accumulators...
21:40:06.630 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) sort:[sort]
21:40:06.630 [execution-worker-1] INFO  n.o.r.e.t.AccumulatorProcessorTask -    ...done
21:40:06.630 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - marking AccumulatorProcessorTask(sort:[sort])'s dependency on SinkTask(sort:[sort]) as satisfied
21:40:06.631 [execution-worker-2] INFO  n.o.r.e.t.AccumulatorProcessorTask -    ...done
21:40:06.631 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) sort:[sort]
21:40:06.631 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) max:[group]
21:40:06.631 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) save_3:[save]
21:40:06.632 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) save:[save]
21:40:06.633 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) group_2:[group]
21:40:06.633 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) max_loss_join:[join]
21:40:06.633 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - marking IndexEmitterTask(max_loss_join:[join])'s dependency on IndexBuilderTask(max_loss_join:[join]) as satisfied
21:40:06.634 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) max_loss_join:[join]
21:40:06.635 [scheduler-thread] INFO  n.o.riskscape.engine.sched.Scheduler - Completed step(s) save_2:[save]
file:/scale_wlg_persistent/filesets/project/niwa03150/road/output/RoadSegmentFloodDamage_0_clipped/2024-02-07T21_38_17/max-loss.csv
file:/scale_wlg_persistent/filesets/project/niwa03150/road/output/RoadSegmentFloodDamage_0_clipped/2024-02-07T21_38_17/average-loss.csv
file:/scale_wlg_persistent/filesets/project/niwa03150/road/output/RoadSegmentFloodDamage_0_clipped/2024-02-07T21_38_17/moredetail_aggregate.csv
file:/scale_wlg_persistent/filesets/project/niwa03150/road/output/RoadSegmentFloodDamage_0_clipped/2024-02-07T21_38_17/aggregate.csv

Potentially relevant: I had what looks like an I/O error a few days ago, maybe the second or third I have had in the 3 or so years I have been using this machine. I wonder whether RiskScape is unable to close a file descriptor because of an I/O problem and then hangs waiting.

[WARNING] The 'beta' plugin is enabled. This contains experimental features that may significantly change or be deprecated in future releases.
03:16:34.577 [reaper] WARN  n.o.riskscape.cpython.CPythonSpawner - Non-zero exit status (137) from Cpython process CPythonProcess(script=file:///scale_wlg_persistent/filesets/project/niwa03150/road/Function/RoadSegmentDamageCFunc.py alive?=false)
03:16:34.660 [reaper] WARN  n.o.riskscape.cpython.CPythonSpawner - Non-zero exit status (137) from Cpython process CPythonProcess(script=file:///scale_wlg_persistent/filesets/project/niwa03150/road/Function/RoadSegmentDamageCFunc.py alive?=false)
Problems found with pipeline model
  - Execution of your data processing pipeline failed. The reasons for this follow:
    - Failed to evaluate `{*, consequence: map(hazard, hv -> road_segment_flood_damage(exposure, hv, resource, resource2))}`
      - Problems found with 'road_segment_flood_damage' function (from source file:/scale_wlg_persistent/filesets/project/niwa03150/road/Function/RoadSegmentDamageCFunc.py)
        - java.io.IOException: Stream closed
/nesi/project/niwa03670/riskscape_1_5/bin/riskscape: line 171: 118049 Killed
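
A note on reading the exit status in that log: by shell convention, a status above 128 usually means the process was terminated by signal number (status - 128), so 137 corresponds to signal 9 (SIGKILL), which is consistent with the "Killed" line from the wrapper script. What sent the signal (an out-of-memory kill, the scheduler enforcing a limit, or something else) is not visible in this output, so that part remains a guess. A minimal sketch of the arithmetic; the function name describe_exit_status is made up for illustration:

```shell
# Decode a process exit status using the usual shell convention:
# values above 128 mean "terminated by signal (status - 128)".
describe_exit_status() {
    status="$1"
    if [ "$status" -gt 128 ]; then
        signum=$((status - 128))
        # kill -l <number> prints the signal's name without the SIG prefix
        echo "exit status $status: terminated by signal $signum (SIG$(kill -l "$signum"))"
    else
        echo "exit status $status: exited with that code"
    fi
}

describe_exit_status 137
```

On Slurm, running sacct -j <jobid> for the failed job may show whether the scheduler recorded a memory or time limit being hit, which would help separate a resource kill from a genuine I/O fault.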