start two scripts in parallel and stop one based on the other’s return

I want to start two different python scripts (tensorflow object detection train.py and eval.py) in parallel on different GPUs, and when train.py is completed, kill eval.py.

I have the following code to start two subprocesses in parallel (How to terminate a python subprocess launched with shell=True). But the subprocesses are started on the same device (I can guess why. I just don’t know how to start them on different devices).

start_train = “CUDA_DEVICE_ORDER= PCI_BUS_ID CUDA VISIBLE_DEVICES=0 train.py ...”



start_eval = “CUDA_DEVICE_ORDER= PCI_BUS_ID CUDA VISIBLE_DEVICES=1 eval.py ...”



commands = [start_train, start_eval]



procs = [subprocess.Popen(i, shell=True, stdout=subprocess.PIPE, preexec_fn=os.setsid) for i in commands]

After this point I don’t know how to proceed. Do I need something like below? Should I use p.communicate() instead to avoid deadlocks? Or is it enough if I just call wait() or communicate() for train.py as I need only its completion.

for p in procs:

    p.wait() # I assume this command won’t affect the parallel running

Then I need to use the following command somehow. I don’t need a return value from train.py, but a return code from subprocess alone. Popen.returncode documentation wait() and communicate() look like needing a return code setting. I don’t understand how to set this. I prefer something like

if train is done without any error:

    os.killpg(os.getpgid(procs[1].pid), signal.SIGTERM) 

else:

    write the error to the console, or to a file (but how?)

OR?

train_return = proc[0].wait() 

if train_return == 0:

    os.killpg(os.getpgid(procs[1].pid), signal.SIGTERM)

UPDATE AFTER SOLVING THE PROBLEM:

This is my main:

if __name__ == "__main__":

    exp = 1

    go = True

    while go:





        create_dir(os.path.join(MAIN_PATH,'kitti',str(exp),'train'))

        create_dir(os.path.join(MAIN_PATH,'kitti',str(exp),'eval'))





        copy_tree(os.path.join(MAIN_PATH,"kitti/eval_after_COCO"), os.path.join(MAIN_PATH,"kitti",str(exp),"eval"))

        copy_tree(os.path.join(MAIN_PATH,"kitti/train_after_COCO"), os.path.join(MAIN_PATH,"kitti",str(exp),"train"))



        err_log = open('./kitti/'+str(exp)+'/error_log' + str(exp) + '.txt', 'w')



        train_command = CUDA_COMMAND_PREFIX + "0 python3 " + str(MAIN_PATH) + "legacy/train.py 

                                            --logtostderr --train_dir " + str(MAIN_PATH) + "kitti/" 

                                            + str(exp) + "/train/ --pipeline_config_path " + str(MAIN_PATH) 

                                            + "kitti/faster_rcnn_resnet101_coco.config"

        eval_command = CUDA_COMMAND_PREFIX + "1 python3 " + str(MAIN_PATH) + "legacy/eval.py 

                                            --logtostderr --eval_dir " + str(MAIN_PATH) + "kitti/" 

                                            + str(exp) + "/eval/ --pipeline_config_path " + str(MAIN_PATH) 

                                            + "kitti/faster_rcnn_resnet101_coco.config --checkpoint_dir " + 

                                            str(MAIN_PATH) + "kitti/" + str(exp) + "/train/"



        os.system("python3 dataset_tools/random_sampler_with_replacement.py --random_set_id " + str(exp))

        time.sleep(20)

        update_train_set(exp)







        train_proc = subprocess.Popen(train_command,

                                  stdout=subprocess.PIPE,

                                  stderr=err_log, # write errors to a file

                                  shell=True)

        time.sleep(20)      

        eval_proc = subprocess.Popen(eval_command,

                                 stdout=subprocess.PIPE,

                                 shell=True)

        time.sleep(20)



        if train_proc.wait() == 0: # successfull termination

            os.killpg(os.getpgid(eval_proc.pid), subprocess.signal.SIGTERM)



        clean_train_set(exp)

        time.sleep(20)

        exp += 1

        if exp == 51:

            go = False

edited Nov 25 '18 at 13:25

asked Nov 19 '18 at 22:58

kneazle

397

add a comment |

I want to start two different python scripts (tensorflow object detection train.py and eval.py) in parallel on different GPUs, and when train.py is completed, kill eval.py.

start_train = “CUDA_DEVICE_ORDER= PCI_BUS_ID CUDA VISIBLE_DEVICES=0 train.py ...”



start_eval = “CUDA_DEVICE_ORDER= PCI_BUS_ID CUDA VISIBLE_DEVICES=1 eval.py ...”



commands = [start_train, start_eval]



procs = [subprocess.Popen(i, shell=True, stdout=subprocess.PIPE, preexec_fn=os.setsid) for i in commands]

for p in procs:

    p.wait() # I assume this command won’t affect the parallel running

if train is done without any error:

    os.killpg(os.getpgid(procs[1].pid), signal.SIGTERM) 

else:

    write the error to the console, or to a file (but how?)

OR?

train_return = proc[0].wait() 

if train_return == 0:

    os.killpg(os.getpgid(procs[1].pid), signal.SIGTERM)

UPDATE AFTER SOLVING THE PROBLEM:

This is my main:

if __name__ == "__main__":

    exp = 1

    go = True

    while go:





        create_dir(os.path.join(MAIN_PATH,'kitti',str(exp),'train'))

        create_dir(os.path.join(MAIN_PATH,'kitti',str(exp),'eval'))





        copy_tree(os.path.join(MAIN_PATH,"kitti/eval_after_COCO"), os.path.join(MAIN_PATH,"kitti",str(exp),"eval"))

        copy_tree(os.path.join(MAIN_PATH,"kitti/train_after_COCO"), os.path.join(MAIN_PATH,"kitti",str(exp),"train"))



        err_log = open('./kitti/'+str(exp)+'/error_log' + str(exp) + '.txt', 'w')



        train_command = CUDA_COMMAND_PREFIX + "0 python3 " + str(MAIN_PATH) + "legacy/train.py 

                                            --logtostderr --train_dir " + str(MAIN_PATH) + "kitti/" 

                                            + str(exp) + "/train/ --pipeline_config_path " + str(MAIN_PATH) 

                                            + "kitti/faster_rcnn_resnet101_coco.config"

        eval_command = CUDA_COMMAND_PREFIX + "1 python3 " + str(MAIN_PATH) + "legacy/eval.py 

                                            --logtostderr --eval_dir " + str(MAIN_PATH) + "kitti/" 

                                            + str(exp) + "/eval/ --pipeline_config_path " + str(MAIN_PATH) 

                                            + "kitti/faster_rcnn_resnet101_coco.config --checkpoint_dir " + 

                                            str(MAIN_PATH) + "kitti/" + str(exp) + "/train/"



        os.system("python3 dataset_tools/random_sampler_with_replacement.py --random_set_id " + str(exp))

        time.sleep(20)

        update_train_set(exp)







        train_proc = subprocess.Popen(train_command,

                                  stdout=subprocess.PIPE,

                                  stderr=err_log, # write errors to a file

                                  shell=True)

        time.sleep(20)      

        eval_proc = subprocess.Popen(eval_command,

                                 stdout=subprocess.PIPE,

                                 shell=True)

        time.sleep(20)



        if train_proc.wait() == 0: # successfull termination

            os.killpg(os.getpgid(eval_proc.pid), subprocess.signal.SIGTERM)



        clean_train_set(exp)

        time.sleep(20)

        exp += 1

        if exp == 51:

            go = False

edited Nov 25 '18 at 13:25

asked Nov 19 '18 at 22:58

kneazle

397

add a comment |

I want to start two different python scripts (tensorflow object detection train.py and eval.py) in parallel on different GPUs, and when train.py is completed, kill eval.py.

start_train = “CUDA_DEVICE_ORDER= PCI_BUS_ID CUDA VISIBLE_DEVICES=0 train.py ...”



start_eval = “CUDA_DEVICE_ORDER= PCI_BUS_ID CUDA VISIBLE_DEVICES=1 eval.py ...”



commands = [start_train, start_eval]



procs = [subprocess.Popen(i, shell=True, stdout=subprocess.PIPE, preexec_fn=os.setsid) for i in commands]

for p in procs:

    p.wait() # I assume this command won’t affect the parallel running

if train is done without any error:

    os.killpg(os.getpgid(procs[1].pid), signal.SIGTERM) 

else:

    write the error to the console, or to a file (but how?)

OR?

train_return = proc[0].wait() 

if train_return == 0:

    os.killpg(os.getpgid(procs[1].pid), signal.SIGTERM)

UPDATE AFTER SOLVING THE PROBLEM:

This is my main:

if __name__ == "__main__":

    exp = 1

    go = True

    while go:





        create_dir(os.path.join(MAIN_PATH,'kitti',str(exp),'train'))

        create_dir(os.path.join(MAIN_PATH,'kitti',str(exp),'eval'))





        copy_tree(os.path.join(MAIN_PATH,"kitti/eval_after_COCO"), os.path.join(MAIN_PATH,"kitti",str(exp),"eval"))

        copy_tree(os.path.join(MAIN_PATH,"kitti/train_after_COCO"), os.path.join(MAIN_PATH,"kitti",str(exp),"train"))



        err_log = open('./kitti/'+str(exp)+'/error_log' + str(exp) + '.txt', 'w')



        train_command = CUDA_COMMAND_PREFIX + "0 python3 " + str(MAIN_PATH) + "legacy/train.py 

                                            --logtostderr --train_dir " + str(MAIN_PATH) + "kitti/" 

                                            + str(exp) + "/train/ --pipeline_config_path " + str(MAIN_PATH) 

                                            + "kitti/faster_rcnn_resnet101_coco.config"

        eval_command = CUDA_COMMAND_PREFIX + "1 python3 " + str(MAIN_PATH) + "legacy/eval.py 

                                            --logtostderr --eval_dir " + str(MAIN_PATH) + "kitti/" 

                                            + str(exp) + "/eval/ --pipeline_config_path " + str(MAIN_PATH) 

                                            + "kitti/faster_rcnn_resnet101_coco.config --checkpoint_dir " + 

                                            str(MAIN_PATH) + "kitti/" + str(exp) + "/train/"



        os.system("python3 dataset_tools/random_sampler_with_replacement.py --random_set_id " + str(exp))

        time.sleep(20)

        update_train_set(exp)







        train_proc = subprocess.Popen(train_command,

                                  stdout=subprocess.PIPE,

                                  stderr=err_log, # write errors to a file

                                  shell=True)

        time.sleep(20)      

        eval_proc = subprocess.Popen(eval_command,

                                 stdout=subprocess.PIPE,

                                 shell=True)

        time.sleep(20)



        if train_proc.wait() == 0: # successfull termination

            os.killpg(os.getpgid(eval_proc.pid), subprocess.signal.SIGTERM)



        clean_train_set(exp)

        time.sleep(20)

        exp += 1

        if exp == 51:

            go = False

edited Nov 25 '18 at 13:25

asked Nov 19 '18 at 22:58

kneazle

397

I want to start two different python scripts (tensorflow object detection train.py and eval.py) in parallel on different GPUs, and when train.py is completed, kill eval.py.

start_train = “CUDA_DEVICE_ORDER= PCI_BUS_ID CUDA VISIBLE_DEVICES=0 train.py ...”



start_eval = “CUDA_DEVICE_ORDER= PCI_BUS_ID CUDA VISIBLE_DEVICES=1 eval.py ...”



commands = [start_train, start_eval]



procs = [subprocess.Popen(i, shell=True, stdout=subprocess.PIPE, preexec_fn=os.setsid) for i in commands]

for p in procs:

    p.wait() # I assume this command won’t affect the parallel running

if train is done without any error:

    os.killpg(os.getpgid(procs[1].pid), signal.SIGTERM) 

else:

    write the error to the console, or to a file (but how?)

OR?

train_return = proc[0].wait() 

if train_return == 0:

    os.killpg(os.getpgid(procs[1].pid), signal.SIGTERM)

UPDATE AFTER SOLVING THE PROBLEM:

This is my main:

if __name__ == "__main__":

    exp = 1

    go = True

    while go:





        create_dir(os.path.join(MAIN_PATH,'kitti',str(exp),'train'))

        create_dir(os.path.join(MAIN_PATH,'kitti',str(exp),'eval'))





        copy_tree(os.path.join(MAIN_PATH,"kitti/eval_after_COCO"), os.path.join(MAIN_PATH,"kitti",str(exp),"eval"))

        copy_tree(os.path.join(MAIN_PATH,"kitti/train_after_COCO"), os.path.join(MAIN_PATH,"kitti",str(exp),"train"))



        err_log = open('./kitti/'+str(exp)+'/error_log' + str(exp) + '.txt', 'w')



        train_command = CUDA_COMMAND_PREFIX + "0 python3 " + str(MAIN_PATH) + "legacy/train.py 

                                            --logtostderr --train_dir " + str(MAIN_PATH) + "kitti/" 

                                            + str(exp) + "/train/ --pipeline_config_path " + str(MAIN_PATH) 

                                            + "kitti/faster_rcnn_resnet101_coco.config"

        eval_command = CUDA_COMMAND_PREFIX + "1 python3 " + str(MAIN_PATH) + "legacy/eval.py 

                                            --logtostderr --eval_dir " + str(MAIN_PATH) + "kitti/" 

                                            + str(exp) + "/eval/ --pipeline_config_path " + str(MAIN_PATH) 

                                            + "kitti/faster_rcnn_resnet101_coco.config --checkpoint_dir " + 

                                            str(MAIN_PATH) + "kitti/" + str(exp) + "/train/"



        os.system("python3 dataset_tools/random_sampler_with_replacement.py --random_set_id " + str(exp))

        time.sleep(20)

        update_train_set(exp)







        train_proc = subprocess.Popen(train_command,

                                  stdout=subprocess.PIPE,

                                  stderr=err_log, # write errors to a file

                                  shell=True)

        time.sleep(20)      

        eval_proc = subprocess.Popen(eval_command,

                                 stdout=subprocess.PIPE,

                                 shell=True)

        time.sleep(20)



        if train_proc.wait() == 0: # successfull termination

            os.killpg(os.getpgid(eval_proc.pid), subprocess.signal.SIGTERM)



        clean_train_set(exp)

        time.sleep(20)

        exp += 1

        if exp == 51:

            go = False

python-3.x subprocess python-multiprocessing kill-process

edited Nov 25 '18 at 13:25

asked Nov 19 '18 at 22:58

kneazle

397

edited Nov 25 '18 at 13:25

asked Nov 19 '18 at 22:58

kneazle

397

edited Nov 25 '18 at 13:25

asked Nov 19 '18 at 22:58

kneazle

397

asked Nov 19 '18 at 22:58

kneazle

397

asked Nov 19 '18 at 22:58

kneazle

397

add a comment |

1 Answer
1

active

oldest

votes

By default, TensorFlow assigns operations to the "/gpu:0" (or "/cpu:0") even if you have multiple GPUs. The only way to solve it is to assign each operation manually to the second GPU in one of your scripts using context manager

with tf.device("/gpu:1"):

    # your ops here

UPDATE

If I understand you correctly, what you need is the following:

import subprocess

import os

err_log = open('error_log.txt', 'w')

train_proc = subprocess.Popen(start_train,

                              stdout=subprocess.PIPE,

                              stderr=err_log, # write errors to a file

                              shell=True)

eval_proc = subprocess.Popen(start_eval,

                             stdout=subprocess.PIPE,

                             shell=True)



if train_proc.wait() == 0: # successfull termination

    os.killpg(os.getpgid(eval_proc.pid), subprocess.signal.SIGTERM)

# else, errors will be written to the 'err_log.txt' file

edited Nov 21 '18 at 17:52

answered Nov 20 '18 at 1:12

Vlad

310411

I can start the scripts on preferred gpu's with that command. However my question was different..

– kneazle
Nov 21 '18 at 16:10

Check the update

– Vlad
Nov 21 '18 at 17:46

Thanks! It is working. I have one question though, I want to use this setting in a loop (say 100 times). In other words, after train is done and eval is killed, I want to start another pair of subprocesses just like those. However, after eval is killed, nothing starts again and “Terminated” is printed on my console. How could I prevent this to be happening, and start other processes after eval is killed?

– kneazle
Nov 24 '18 at 21:52

I need to see the error you get to help you.

– Vlad
Nov 25 '18 at 11:28

1

instead of os.killpg(os.getpgid(eval_proc.pid), subprocess.signal.SIGTERM) use eval_proc.kill().

– Vlad
Nov 26 '18 at 11:47

|
show 5 more comments

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53383847%2fstart-two-scripts-in-parallel-and-stop-one-based-on-the-other-s-return%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

with tf.device("/gpu:1"):

    # your ops here

UPDATE

If I understand you correctly, what you need is the following:

import subprocess

import os

err_log = open('error_log.txt', 'w')

train_proc = subprocess.Popen(start_train,

                              stdout=subprocess.PIPE,

                              stderr=err_log, # write errors to a file

                              shell=True)

eval_proc = subprocess.Popen(start_eval,

                             stdout=subprocess.PIPE,

                             shell=True)



if train_proc.wait() == 0: # successfull termination

    os.killpg(os.getpgid(eval_proc.pid), subprocess.signal.SIGTERM)

# else, errors will be written to the 'err_log.txt' file

edited Nov 21 '18 at 17:52

answered Nov 20 '18 at 1:12

Vlad

310411

I can start the scripts on preferred gpu's with that command. However my question was different..

– kneazle
Nov 21 '18 at 16:10

Check the update

– Vlad
Nov 21 '18 at 17:46

Thanks! It is working. I have one question though, I want to use this setting in a loop (say 100 times). In other words, after train is done and eval is killed, I want to start another pair of subprocesses just like those. However, after eval is killed, nothing starts again and “Terminated” is printed on my console. How could I prevent this to be happening, and start other processes after eval is killed?

– kneazle
Nov 24 '18 at 21:52

I need to see the error you get to help you.

– Vlad
Nov 25 '18 at 11:28

1

instead of os.killpg(os.getpgid(eval_proc.pid), subprocess.signal.SIGTERM) use eval_proc.kill().

– Vlad
Nov 26 '18 at 11:47

|
show 5 more comments

with tf.device("/gpu:1"):

    # your ops here

UPDATE

If I understand you correctly, what you need is the following:

import subprocess

import os

err_log = open('error_log.txt', 'w')

train_proc = subprocess.Popen(start_train,

                              stdout=subprocess.PIPE,

                              stderr=err_log, # write errors to a file

                              shell=True)

eval_proc = subprocess.Popen(start_eval,

                             stdout=subprocess.PIPE,

                             shell=True)



if train_proc.wait() == 0: # successfull termination

    os.killpg(os.getpgid(eval_proc.pid), subprocess.signal.SIGTERM)

# else, errors will be written to the 'err_log.txt' file

edited Nov 21 '18 at 17:52

answered Nov 20 '18 at 1:12

Vlad

310411

I can start the scripts on preferred gpu's with that command. However my question was different..

– kneazle
Nov 21 '18 at 16:10

Check the update

– Vlad
Nov 21 '18 at 17:46

Thanks! It is working. I have one question though, I want to use this setting in a loop (say 100 times). In other words, after train is done and eval is killed, I want to start another pair of subprocesses just like those. However, after eval is killed, nothing starts again and “Terminated” is printed on my console. How could I prevent this to be happening, and start other processes after eval is killed?

– kneazle
Nov 24 '18 at 21:52

I need to see the error you get to help you.

– Vlad
Nov 25 '18 at 11:28

1

instead of os.killpg(os.getpgid(eval_proc.pid), subprocess.signal.SIGTERM) use eval_proc.kill().

– Vlad
Nov 26 '18 at 11:47

|
show 5 more comments

with tf.device("/gpu:1"):

    # your ops here

UPDATE

If I understand you correctly, what you need is the following:

import subprocess

import os

err_log = open('error_log.txt', 'w')

train_proc = subprocess.Popen(start_train,

                              stdout=subprocess.PIPE,

                              stderr=err_log, # write errors to a file

                              shell=True)

eval_proc = subprocess.Popen(start_eval,

                             stdout=subprocess.PIPE,

                             shell=True)



if train_proc.wait() == 0: # successfull termination

    os.killpg(os.getpgid(eval_proc.pid), subprocess.signal.SIGTERM)

# else, errors will be written to the 'err_log.txt' file

edited Nov 21 '18 at 17:52

answered Nov 20 '18 at 1:12

Vlad

310411

with tf.device("/gpu:1"):

    # your ops here

UPDATE

If I understand you correctly, what you need is the following:

import subprocess

import os

err_log = open('error_log.txt', 'w')

train_proc = subprocess.Popen(start_train,

                              stdout=subprocess.PIPE,

                              stderr=err_log, # write errors to a file

                              shell=True)

eval_proc = subprocess.Popen(start_eval,

                             stdout=subprocess.PIPE,

                             shell=True)



if train_proc.wait() == 0: # successfull termination

    os.killpg(os.getpgid(eval_proc.pid), subprocess.signal.SIGTERM)

# else, errors will be written to the 'err_log.txt' file

edited Nov 21 '18 at 17:52

answered Nov 20 '18 at 1:12

Vlad

310411

edited Nov 21 '18 at 17:52

answered Nov 20 '18 at 1:12

Vlad

310411

answered Nov 20 '18 at 1:12

Vlad

310411

answered Nov 20 '18 at 1:12

Vlad

310411

I can start the scripts on preferred gpu's with that command. However my question was different..

– kneazle
Nov 21 '18 at 16:10

Check the update

– Vlad
Nov 21 '18 at 17:46

Thanks! It is working. I have one question though, I want to use this setting in a loop (say 100 times). In other words, after train is done and eval is killed, I want to start another pair of subprocesses just like those. However, after eval is killed, nothing starts again and “Terminated” is printed on my console. How could I prevent this to be happening, and start other processes after eval is killed?

– kneazle
Nov 24 '18 at 21:52

I need to see the error you get to help you.

– Vlad
Nov 25 '18 at 11:28

1

instead of os.killpg(os.getpgid(eval_proc.pid), subprocess.signal.SIGTERM) use eval_proc.kill().

– Vlad
Nov 26 '18 at 11:47

|
show 5 more comments

I can start the scripts on preferred gpu's with that command. However my question was different..

– kneazle
Nov 21 '18 at 16:10

Check the update

– Vlad
Nov 21 '18 at 17:46

Thanks! It is working. I have one question though, I want to use this setting in a loop (say 100 times). In other words, after train is done and eval is killed, I want to start another pair of subprocesses just like those. However, after eval is killed, nothing starts again and “Terminated” is printed on my console. How could I prevent this to be happening, and start other processes after eval is killed?

– kneazle
Nov 24 '18 at 21:52

I need to see the error you get to help you.

– Vlad
Nov 25 '18 at 11:28

1

instead of os.killpg(os.getpgid(eval_proc.pid), subprocess.signal.SIGTERM) use eval_proc.kill().

– Vlad
Nov 26 '18 at 11:47

I can start the scripts on preferred gpu's with that command. However my question was different..

– kneazle
Nov 21 '18 at 16:10

Check the update

– Vlad
Nov 21 '18 at 17:46

Thanks! It is working. I have one question though, I want to use this setting in a loop (say 100 times). In other words, after train is done and eval is killed, I want to start another pair of subprocesses just like those. However, after eval is killed, nothing starts again and “Terminated” is printed on my console. How could I prevent this to be happening, and start other processes after eval is killed?

– kneazle
Nov 24 '18 at 21:52

I need to see the error you get to help you.

– Vlad
Nov 25 '18 at 11:28

instead of os.killpg(os.getpgid(eval_proc.pid), subprocess.signal.SIGTERM) use eval_proc.kill().

– Vlad
Nov 26 '18 at 11:47

|
show 5 more comments

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Agfdhyk