start two scripts in parallel and stop one based on the other’s return
I want to start two different Python scripts (the TensorFlow object detection train.py and eval.py) in parallel on different GPUs and, when train.py has completed, kill eval.py.
I have the following code to start two subprocesses in parallel (based on "How to terminate a python subprocess launched with shell=True"). However, the subprocesses are started on the same device. I can guess why; I just don’t know how to start them on different devices.
    import os
    import signal
    import subprocess

    start_train = "CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=0 train.py ..."
    start_eval = "CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=1 eval.py ..."
    commands = [start_train, start_eval]
    procs = [subprocess.Popen(i, shell=True, stdout=subprocess.PIPE, preexec_fn=os.setsid) for i in commands]
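One way to pin each child to its own GPU without relying on shell syntax is to pass the variables through the env argument of Popen. The following is only a sketch under stated assumptions: the helper name launch_on_gpu is hypothetical, and a tiny inline Python command stands in for train.py and eval.py so that the example runs on a machine without GPUs.

```python
import os
import subprocess
import sys


def launch_on_gpu(args, gpu_id):
    """Start `args` as a child process that can only see GPU `gpu_id`.

    Hypothetical helper for illustration; not part of the original post.
    """
    env = os.environ.copy()                  # inherit the parent's environment
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return subprocess.Popen(args, env=env, stdout=subprocess.PIPE, text=True)


# Stand-in for train.py / eval.py: a child that reports which GPU it may use.
report = [sys.executable, "-c",
          "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"]

procs = [launch_on_gpu(report, 0), launch_on_gpu(report, 1)]
visible = [p.communicate()[0].strip() for p in procs]
print(visible)
```

Passing an argument list instead of a shell string also means that the PID returned by Popen belongs to the script itself rather than to an intermediate shell.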
After this point I don’t know how to proceed. Do I need something like the code below? Should I use p.communicate() instead to avoid deadlocks? Or is it enough to call wait() or communicate() only for train.py, since I only need its completion?
    for p in procs:
        p.wait()  # I assume this call won’t affect the parallel execution
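On the deadlock question: wait() only blocks the calling script, so both children keep running in parallel. The risk comes from combining stdout=subprocess.PIPE with wait(), because a child that writes more than the pipe buffer can hold will block until someone reads from the pipe. A minimal sketch, assuming the parent does not need the output in memory, sends it to a file instead; the chatty child below is a stand-in for a script that logs heavily to stdout.

```python
import subprocess
import sys
import tempfile

# Stand-in child that writes far more than a typical pipe buffer
# (about 64 KiB on Linux).
chatty = [sys.executable, "-c", "print('x' * 1_000_000)"]

# With stdout=subprocess.PIPE, wait() could block forever here, because
# nobody drains the pipe. Sending the output to a file avoids that.
with tempfile.TemporaryFile() as log:
    proc = subprocess.Popen(chatty, stdout=log)
    return_code = proc.wait()        # safe: the child never blocks on a full pipe
    log.seek(0)
    bytes_logged = len(log.read())

# If the output is needed in memory, communicate() drains the pipe while waiting.
proc = subprocess.Popen(chatty, stdout=subprocess.PIPE)
out, _ = proc.communicate()
print(return_code, proc.returncode, bytes_logged, len(out))
```

subprocess.DEVNULL is another option when the output can simply be discarded.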
Then I need to use the following command somehow. I don’t need a return value from train.py, only the return code of the subprocess. From the Popen.returncode documentation, wait() and communicate() look like they need a return code to be set, and I don’t understand how to set this. I would prefer something like:
    if train is done without any error:
        os.killpg(os.getpgid(procs[1].pid), signal.SIGTERM)
    else:
        write the error to the console, or to a file (but how?)
OR?
    train_return = procs[0].wait()
    if train_return == 0:
        os.killpg(os.getpgid(procs[1].pid), signal.SIGTERM)
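The return code does not have to be set by the caller: wait() returns the child's exit status and also stores it in the returncode attribute. The sketch below illustrates this with stand-in children; the function name run_pair is hypothetical, and stopping the evaluator even when training fails is an assumption made so that the example always finishes.

```python
import subprocess
import sys


def run_pair(train_exit_code):
    """Run a fake 'train' and a fake 'eval'; stop eval once train has exited."""
    # Stand-ins: "train" exits with a chosen status, "eval" runs until stopped.
    train = subprocess.Popen(
        [sys.executable, "-c", f"import sys; sys.exit({train_exit_code})"])
    evaluator = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"])

    train_return = train.wait()      # blocks until train exits, returns its code
    if train_return != 0:
        # Report the failure on the console; a file object would work as well.
        print(f"train failed with return code {train_return}", file=sys.stderr)

    evaluator.terminate()            # sends SIGTERM on POSIX
    evaluator.wait()                 # reap the child so that returncode is set
    return train_return, evaluator.returncode


print(run_pair(0))
print(run_pair(3))
```

To capture the error text of the child itself rather than just its status, an open file can be passed as stderr when the process is created.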
UPDATE AFTER SOLVING THE PROBLEM:
This is my main:
    # MAIN_PATH, CUDA_COMMAND_PREFIX, create_dir, copy_tree, update_train_set and
    # clean_train_set are defined or imported elsewhere in the script.
    if __name__ == "__main__":
        exp = 1
        go = True
        while go:
            create_dir(os.path.join(MAIN_PATH, 'kitti', str(exp), 'train'))
            create_dir(os.path.join(MAIN_PATH, 'kitti', str(exp), 'eval'))
            copy_tree(os.path.join(MAIN_PATH, "kitti/eval_after_COCO"), os.path.join(MAIN_PATH, "kitti", str(exp), "eval"))
            copy_tree(os.path.join(MAIN_PATH, "kitti/train_after_COCO"), os.path.join(MAIN_PATH, "kitti", str(exp), "train"))
            err_log = open('./kitti/' + str(exp) + '/error_log' + str(exp) + '.txt', 'w')
            train_command = (CUDA_COMMAND_PREFIX + "0 python3 " + str(MAIN_PATH) + "legacy/train.py"
                             + " --logtostderr --train_dir " + str(MAIN_PATH) + "kitti/"
                             + str(exp) + "/train/ --pipeline_config_path " + str(MAIN_PATH)
                             + "kitti/faster_rcnn_resnet101_coco.config")
            eval_command = (CUDA_COMMAND_PREFIX + "1 python3 " + str(MAIN_PATH) + "legacy/eval.py"
                            + " --logtostderr --eval_dir " + str(MAIN_PATH) + "kitti/"
                            + str(exp) + "/eval/ --pipeline_config_path " + str(MAIN_PATH)
                            + "kitti/faster_rcnn_resnet101_coco.config --checkpoint_dir "
                            + str(MAIN_PATH) + "kitti/" + str(exp) + "/train/")
            os.system("python3 dataset_tools/random_sampler_with_replacement.py --random_set_id " + str(exp))
            time.sleep(20)
            update_train_set(exp)
            train_proc = subprocess.Popen(train_command,
                                          stdout=subprocess.PIPE,
                                          stderr=err_log,  # write errors to a file
                                          shell=True)
            time.sleep(20)
            eval_proc = subprocess.Popen(eval_command,
                                         stdout=subprocess.PIPE,
                                         shell=True)
            time.sleep(20)
            if train_proc.wait() == 0:  # successful termination
                os.killpg(os.getpgid(eval_proc.pid), subprocess.signal.SIGTERM)
            clean_train_set(exp)
            time.sleep(20)
            exp += 1
            if exp == 51:
                go = False
python-3.x subprocess python-multiprocessing kill-process
edited Nov 25 '18 at 13:25
asked Nov 19 '18 at 22:58 by kneazle
1 Answer
By default, TensorFlow assigns operations to "/gpu:0" (or "/cpu:0") even if you have multiple GPUs. One way to solve this is to assign the operations manually to the second GPU in one of your scripts using the tf.device context manager:
    with tf.device("/gpu:1"):
        # your ops here
UPDATE
If I understand you correctly, what you need is the following:
    import os
    import subprocess

    # start_train and start_eval are the command strings from the question
    err_log = open('error_log.txt', 'w')
    train_proc = subprocess.Popen(start_train,
                                  stdout=subprocess.PIPE,
                                  stderr=err_log,  # write errors to a file
                                  shell=True)
    eval_proc = subprocess.Popen(start_eval,
                                 stdout=subprocess.PIPE,
                                 shell=True,
                                 preexec_fn=os.setsid)  # own process group, so killpg spares this script
    if train_proc.wait() == 0:  # successful termination
        os.killpg(os.getpgid(eval_proc.pid), subprocess.signal.SIGTERM)
    # else, errors will have been written to the 'error_log.txt' file
I can start the scripts on the preferred GPUs with that command. However, my question was different.
– kneazle
Nov 21 '18 at 16:10
Check the update
– Vlad
Nov 21 '18 at 17:46
Thanks! It is working. I have one question, though: I want to use this setup in a loop (say 100 times). In other words, after train is done and eval is killed, I want to start another pair of subprocesses just like those. However, after eval is killed, nothing starts again and “Terminated” is printed on my console. How can I prevent this from happening and start other processes after eval is killed?
– kneazle
Nov 24 '18 at 21:52
I need to see the error you get to help you.
– Vlad
Nov 25 '18 at 11:28
Instead of os.killpg(os.getpgid(eval_proc.pid), subprocess.signal.SIGTERM), use eval_proc.kill().
– Vlad
Nov 26 '18 at 11:47
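A likely explanation for the “Terminated” message: a child started without its own session stays in the launching script's process group, so os.killpg on that group also signals the script itself. Calling eval_proc.kill() avoids this, although with shell=True it signals the shell and may not reach the Python process the shell started. Another option is to give the child its own session, so that killpg reaches only that child and its descendants. The sketch below takes this second route, assuming a POSIX system; the sleep command stands in for eval.py and three rounds stand in for the experiment loop.

```python
import os
import signal
import subprocess
import sys

rounds_completed = 0
for exp in range(1, 4):
    # start_new_session=True runs setsid() in the child, so it leads its own
    # process group, separate from this script's group (POSIX only).
    eval_proc = subprocess.Popen(
        f'"{sys.executable}" -c "import time; time.sleep(60)"',
        shell=True,
        start_new_session=True,
    )
    assert os.getpgid(eval_proc.pid) != os.getpgrp()

    # Signals the shell and everything it started, but not this script.
    os.killpg(os.getpgid(eval_proc.pid), signal.SIGTERM)
    eval_proc.wait()                 # reap the child before the next round
    rounds_completed += 1

print(rounds_completed)
```

Because the script is no longer part of the signalled group, the loop continues with the next pair of processes instead of being terminated along with eval.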
edited Nov 21 '18 at 17:52
answered Nov 20 '18 at 1:12 by Vlad