Merge pull request #746 from lanl/proc_leak_fix
Fix issues with leaking gdb processes and add to unit test
pagrubel authored Dec 5, 2023
2 parents f53a3e6 + 4aea980 commit 9f2f631
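
As a rough illustration of the kind of cleanup this commit is about, the sketch below starts a throwaway child process and terminates it by PID. It is not BEE's `dep_manager.kill_gdb`, whose implementation is not shown in this diff; `kill_process` and `demo` are hypothetical names, and the example assumes a POSIX system.

```python
import os
import signal
import subprocess
import sys


def kill_process(pid, sig=signal.SIGTERM):
    """Hypothetical helper: send sig to pid; return False if no such process."""
    try:
        os.kill(pid, sig)
    except ProcessLookupError:
        # The process has already exited (and been reaped); nothing to do.
        return False
    return True


def demo():
    """Start a sleeping child, terminate it by PID, and return its exit status."""
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    kill_process(child.pid)
    # Reap the child so it does not linger as a zombie.
    return child.wait(timeout=10)
```

On POSIX, `subprocess` reports a child terminated by a signal as the negated signal number, so `demo()` gives a quick check that the process really went away instead of leaking.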
Showing 11 changed files with 199 additions and 4 deletions.
1 change: 1 addition & 0 deletions beeflow/data/cwl/bee_workflows/cat-grep-fail/README.md
@@ -0,0 +1 @@
This workflow is designed to fail on purpose, so that we can test that case
17 changes: 17 additions & 0 deletions beeflow/data/cwl/bee_workflows/cat-grep-fail/cat.cwl
@@ -0,0 +1,17 @@
#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
baseCommand: cat
stdout: cat.txt
stderr: cat.err
inputs:
  input_file:
    type: File
    inputBinding:
      position: 1
outputs:
  contents:
    type: stdout
  cat_stderr:
    type: stderr
18 changes: 18 additions & 0 deletions beeflow/data/cwl/bee_workflows/cat-grep-fail/grep0.cwl
@@ -0,0 +1,18 @@
#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
baseCommand: grep
stdout: occur0.txt
inputs:
  word:
    type: string
    inputBinding:
      position: 1
  text_file:
    type: File
    inputBinding:
      position: 2
outputs:
  occur:
    type: stdout
18 changes: 18 additions & 0 deletions beeflow/data/cwl/bee_workflows/cat-grep-fail/grep1.cwl
@@ -0,0 +1,18 @@
#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
baseCommand: grep
stdout: occur1.txt
inputs:
  word:
    type: string
    inputBinding:
      position: 1
  text_file:
    type: File
    inputBinding:
      position: 2
outputs:
  occur:
    type: stdout
4 changes: 4 additions & 0 deletions beeflow/data/cwl/bee_workflows/cat-grep-fail/input.yml
@@ -0,0 +1,4 @@
input_file: lorem.txt
word0: Vivamus
word1: pulvinar
tarball_fname: out.tgz
49 changes: 49 additions & 0 deletions beeflow/data/cwl/bee_workflows/cat-grep-fail/lorem.txt
@@ -0,0 +1,49 @@
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ut ut metus urna.
Morbi tortor libero, suscipit hendrerit lacus et, condimentum porta orci.
Maecenas vulputate lectus lorem, ac eleifend massa consectetur finibus. Integer
vitae sem sit amet quam pharetra consectetur vel vel augue. Vivamus finibus
metus mauris, sed ultricies purus placerat id. Vivamus eget auctor dui. Cras
viverra rutrum neque, eu imperdiet ante sagittis ac. Donec iaculis, lacus sit
amet mollis viverra, ipsum tortor luctus metus, sit amet dignissim orci massa
non lectus. Nullam vitae dui placerat, condimentum nibh eget, dignissim erat.
Phasellus gravida pretium facilisis. Phasellus pharetra mattis risus nec
imperdiet.

Fusce facilisis finibus dolor. Nunc et posuere ante. Praesent blandit
vestibulum egestas. Maecenas auctor nulla maximus tortor condimentum luctus.
Orci varius natoque penatibus et magnis dis parturient montes, nascetur
ridiculus mus. Quisque scelerisque turpis sed aliquet blandit. Praesent eget
lobortis urna. Ut aliquam enim sit amet est ornare, sit amet viverra est
tempus. Nunc blandit sollicitudin enim vel cursus. Fusce sit amet tincidunt
eros. Nulla iaculis, est vitae consectetur imperdiet, ante urna aliquam libero,
id vehicula tortor justo a dolor. Vestibulum id faucibus nisi, nec vehicula
lacus.

Duis feugiat consequat quam eu lobortis. Aliquam erat volutpat. Integer egestas
justo sit amet dui malesuada ullamcorper. Quisque tincidunt lacinia purus, id
facilisis risus convallis id. Fusce ex ligula, consectetur ac dui sed, blandit
suscipit diam. Vivamus sit amet porta dui, nec faucibus arcu. Phasellus et
mauris eu elit molestie pulvinar eget ut orci. Quisque quam est, varius vitae
tristique luctus, tincidunt non lectus. Donec luctus molestie ex. Morbi dui
arcu, rhoncus volutpat felis ut, ultricies vestibulum ligula. Ut metus ex,
mollis eget lacus sed, venenatis aliquam purus. Aenean id imperdiet tortor.

Phasellus mollis vulputate libero. In erat sapien, tempor nec libero ac, tempor
lacinia velit. Curabitur vehicula, arcu eu mollis ultricies, ex ex rutrum
risus, id accumsan eros lectus at velit. In sem erat, sagittis pulvinar porta
et, pulvinar nec ligula. Ut rhoncus lorem vulputate aliquet cursus. Cras
efficitur erat posuere, faucibus mi et, tincidunt quam. Suspendisse eleifend ac
justo ac fermentum. Nulla in lorem nec neque lacinia pharetra. Donec eget elit
id magna mollis interdum. Vivamus pellentesque diam volutpat sollicitudin
mattis. Interdum et malesuada fames ac ante ipsum primis in faucibus. Nulla
aliquam tellus porta sem venenatis imperdiet. Nunc pretium lorem sit amet ipsum
aliquam, sed dapibus turpis consequat.

Aliquam elementum rhoncus placerat. Praesent augue urna, euismod sit amet
dignissim nec, ultricies nec dui. Vestibulum ante ipsum primis in faucibus orci
luctus et ultrices posuere cubilia curae; Nulla lobortis tincidunt leo.
Maecenas finibus quam mauris, a molestie nisi sodales id. Morbi mollis, libero
id tristique ultrices, purus leo faucibus lectus, in consectetur justo eros ut
risus. Integer aliquam fermentum elit, id lacinia turpis tempus et. Donec vitae
sem lobortis eros vulputate blandit nec vel leo. Nullam posuere aliquet dui non
mattis.
24 changes: 24 additions & 0 deletions beeflow/data/cwl/bee_workflows/cat-grep-fail/tar.cwl
@@ -0,0 +1,24 @@
#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
baseCommand: tar-not-a-real-command-for-failure
inputs:
  tarball_fname:
    type: string
    inputBinding:
      position: 1
      prefix: -cf
  file0:
    type: File
    inputBinding:
      position: 2
  file1:
    type: File
    inputBinding:
      position: 3
outputs:
  tarball:
    type: File
    outputBinding:
      glob: $(inputs.tarball_fname)
43 changes: 43 additions & 0 deletions beeflow/data/cwl/bee_workflows/cat-grep-fail/workflow.cwl
@@ -0,0 +1,43 @@
#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: Workflow
inputs:
  input_file: File
  word0: string
  word1: string
  tarball_fname: string

outputs:
  tarball:
    type: File
    outputSource: tar/tarball
  cat_stderr:
    type: File
    outputSource: cat/cat_stderr

steps:
  cat:
    run: cat.cwl
    in:
      input_file: input_file
    out: [contents, cat_stderr]
  grep0:
    run: grep0.cwl
    in:
      word: word0
      text_file: cat/contents
    out: [occur]
  grep1:
    run: grep1.cwl
    in:
      word: word1
      text_file: cat/contents
    out: [occur]
  tar:
    run: tar.cwl
    in:
      file0: grep0/occur
      file1: grep1/occur
      tarball_fname: tarball_fname
    out: [tarball]
10 changes: 10 additions & 0 deletions beeflow/tests/test_wf_manager.py
@@ -200,6 +200,16 @@ def test_cancel_workflow(client, mocker, setup_teardown_workflow, temp_db):
     mocker.patch('beeflow.wf_manager.resources.wf_utils.get_db_path', temp_db.db_file)
     mocker.patch('beeflow.wf_manager.resources.wf_actions.db_path', temp_db.db_file)

+    wf_name = 'wf'
+    wf_status = 'Pending'
+    bolt_port = 3030
+    gdb_pid = 12345
+
+    temp_db.workflows.init_workflow(WF_ID, wf_name, wf_status, 'dir', bolt_port, gdb_pid)
+    temp_db.workflows.add_task(123, WF_ID, 'task', "WAITING")
+    temp_db.workflows.add_task(124, WF_ID, 'task', "RUNNING")
+    mocker.patch('beeflow.wf_manager.resources.wf_actions.dep_manager.kill_gdb', return_value=None)
+
     request = {'wf_id': WF_ID}
     resp = client().delete(f'/bee_wfm/v1/jobs/{WF_ID}', json=request)
     assert resp.json['status'] == 'Cancelled'
5 changes: 4 additions & 1 deletion beeflow/wf_manager/resources/wf_actions.py
@@ -7,6 +7,7 @@

 from beeflow.common.db import wfm_db
 from beeflow.common.db.bdb import connect_db
+from beeflow.wf_manager.common import dep_manager

 log = bee_logging.setup(__name__)
 db_path = wf_utils.get_db_path()
@@ -62,8 +63,10 @@ def delete(wf_id):
 wfi.finalize_workflow()
 wf_utils.update_wf_status(wf_id, 'Cancelled')
 db.workflows.update_workflow_state(wf_id, 'Cancelled')
-db.workflows.delete_workflow(wf_id)
 log.info("Workflow cancelled")
+log.info("Shutting down gdb")
+pid = db.workflows.get_gdb_pid(wf_id)
+dep_manager.kill_gdb(pid)
 resp = make_response(jsonify(status='Cancelled'), 202)
 return resp

14 changes: 11 additions & 3 deletions beeflow/wf_manager/resources/wf_update.py
@@ -99,7 +99,7 @@ def put(self):
 db.workflows.add_task(new_task.id, wf_id, new_task.name, "WAITING")
 if new_task is None:
 log.info('No more restarts')
-state = wfi.get_task_state(task)
+wf_state = wfi.get_task_state(task)
 return make_response(jsonify(status=f'Task {task_id} set to {job_state}'))
 # Submit the restart task
 tasks = [new_task]
@@ -113,8 +113,8 @@ def put(self):
 else:
 wfi.set_task_output(task, output.id, "temp")
 tasks = wfi.finalize_task(task)
-state = wfi.get_workflow_state()
-if tasks and state != 'PAUSED':
+wf_state = wfi.get_workflow_state()
+if tasks and wf_state != 'PAUSED':
 wf_utils.schedule_submit_tasks(wf_id, tasks)

 if wfi.workflow_completed():
@@ -123,6 +123,14 @@ def put(self):
 archive_workflow(db, wf_id)
 pid = db.workflows.get_gdb_pid(wf_id)
 dep_manager.kill_gdb(pid)
+if wf_state == 'FAILED':
+log.info("Workflow failed")
+log.info("Shutting down GDB")
+wf_id = wfi.workflow_id
+archive_workflow(db, wf_id)
+pid = db.workflows.get_gdb_pid(wf_id)
+dep_manager.kill_gdb(pid)
+
 resp = make_response(jsonify(status=(f'Task {task_id} belonging to WF {wf_id} set to'
 f'{job_state}')), 200)
 return resp
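
The wf_update.py change above extends the gdb shutdown from the completed case to the failed case. A minimal sketch of that pattern follows, with every name hypothetical (`TERMINAL_STATES`, `cleanup_if_terminal`, and the `kill` callback are illustrative and not taken from BEE):

```python
# Hypothetical state names; only 'FAILED' and 'PAUSED' appear in the diff itself.
TERMINAL_STATES = frozenset({"COMPLETED", "FAILED"})


def cleanup_if_terminal(wf_state, pid, kill):
    """Call kill(pid) when wf_state is terminal; return whether cleanup ran."""
    if wf_state in TERMINAL_STATES:
        kill(pid)
        return True
    return False
```

Routing every terminal state through one check like this is one way to keep a newly added state from silently skipping the cleanup.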