Semantic Segmentation to Bounding Boxes












Suppose you are performing semantic segmentation. For simplicity, let's assume this is 1D segmentation rather than 2D (i.e. we only care about finding objects with width).



So the desired output of our model might be something like:



[
[0, 0, 0, 0, 1, 1, 1], # label channel 1
[1, 1, 1, 0, 0, 1, 1], # label channel 2
[0, 0, 0, 1, 1, 1, 0], # label channel 3
#...
]


However, the output of our trained, imperfect model might look more like:



[
[0.1, 0.1, 0.1, 0.4, 0.91, 0.81, 0.84], # label channel 1
[0.81, 0.79, 0.85, 0.1, 0.2, 0.61, 0.91], # label channel 2
[0.3, 0.1, 0.24, 0.87, 0.62, 1, 0 ], # label channel 3
#...
]


What would be a performant way, using Python, to get the boundaries of each label (i.e. its 1D bounding boxes)?



e.g. (zero-indexed)



[
[[4, 6]], # "objects" of label 1
[[0, 2], [5, 6]], # "objects" of label 2
[[3, 5]], # "objects" of label 3
]


If it helps, perhaps transforming the output into a binary mask would be more useful?

import numpy as np

def binarize(arr, cutoff=0.5):
    return (np.asarray(arr) > cutoff).astype(int)


With a binary mask, we just need to find the runs of consecutive integers among the indices of the nonzero values:

def consecutive(data, stepsize=1):
    return np.split(data, np.where(np.diff(data) != stepsize)[0] + 1)



Find the "runs" of each label:

def binary_boundaries(labels, cutoff=0.5):
    return [consecutive(channel.nonzero()[0]) for channel in binarize(labels, cutoff)]


Name the objects according to their channel:

def binary_objects(labels, cutoff=0.5, channel_names=None):
    if channel_names is None:
        channel_names = ['channel {}'.format(i) for i in range(len(labels))]

    return dict(zip(channel_names, binary_boundaries(labels, cutoff)))
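For illustration, here is a minimal sketch of how the helpers above could be reduced to the [first, last] pairs shown in the desired output. The name run_bounds is hypothetical (it is not part of the code above), and the sketch assumes the scores are given as a 2D array of shape (channels, positions).

```python
import numpy as np

def binarize(arr, cutoff=0.5):
    return (np.asarray(arr) > cutoff).astype(int)

def consecutive(data, stepsize=1):
    return np.split(data, np.where(np.diff(data) != stepsize)[0] + 1)

def run_bounds(labels, cutoff=0.5):
    """Hypothetical helper: [first, last] index of every run, per channel."""
    bounds = []
    for channel in binarize(labels, cutoff):
        runs = consecutive(channel.nonzero()[0])
        # An all-zero channel yields a single empty run; skip it.
        bounds.append([[int(r[0]), int(r[-1])] for r in runs if r.size])
    return bounds

predicted = np.array([
    [0.1, 0.1, 0.1, 0.4, 0.91, 0.81, 0.84],   # label channel 1
    [0.81, 0.79, 0.85, 0.1, 0.2, 0.61, 0.91], # label channel 2
    [0.3, 0.1, 0.24, 0.87, 0.62, 1, 0],       # label channel 3
])
print(run_bounds(predicted))
# → [[[4, 6]], [[0, 2], [5, 6]], [[3, 5]]]
```

Note that np.split returns every index of a run, so only the first and last element of each run are kept here.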









  • Your question is not clear, but if you are asking how to compute the bounding box from a segmentation image, see my answer.

    – Nouman Riaz Khan
    Nov 22 '18 at 16:42











  • @NoumanRiazKhan I appreciate your time and input. What part of my question is not clear? This is not quite bounding boxes, as it is in 1D, so (1) the labels' dimensions as well as the output would have to be reformatted, and (2) I was hoping for something a bit more sophisticated than a binary mask (e.g. if a channel was [0, 1, 0.6, 1, 1, 1], then adjust 0.6 according to its neighbors)...

    – SumNeuron
    Nov 22 '18 at 16:50











  • Did you try the solution I proposed? The unclear part is: do you want the indices of all non-zero values for every label, or do you want the 4 bounding box points? Do you need to find the optimal cutoff point?

    – Nouman Riaz Khan
    Dec 3 '18 at 12:14













  • @NoumanRiazKhan I just tried your solution; however, that entire module is centered around 3-channel images, so it is not ideal in this case.

    – SumNeuron
    Dec 3 '18 at 15:23
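
One possible reading of "adjust 0.6 according to its neighbors" from the comments above is to smooth each channel before thresholding, so that an isolated dip inside a run does not split the object in two. The following is only a sketch under that assumption: it uses a 3-wide median filter with edge padding, and median_smooth is a hypothetical name that appears nowhere in the question or the answer.

```python
import numpy as np

def median_smooth(channel):
    """Hypothetical helper: 3-wide median filter over a 1D score vector."""
    x = np.asarray(channel, dtype=float)
    padded = np.pad(x, 1, mode="edge")  # repeat the two end values
    # Row k holds the left neighbor, the value itself, or the right neighbor.
    windows = np.stack([padded[:-2], padded[1:-1], padded[2:]])
    return np.median(windows, axis=0)

scores = [0.9, 0.8, 0.3, 0.9, 0.85, 0.1, 0.1]  # dip at index 2
smoothed = median_smooth(scores)
mask = (smoothed > 0.5).astype(int)
print(mask.tolist())
# → [1, 1, 1, 1, 1, 0, 0]
```

Without smoothing, the same cutoff would give [1, 1, 0, 1, 1, 0, 0], i.e. two objects instead of one. A wider window or a different filter may suit other data better.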
















python numpy machine-learning computer-vision














edited Nov 18 '18 at 13:58







SumNeuron

















asked Nov 18 '18 at 11:42









SumNeuron


























1 Answer














Your trained model returns a float image (per-pixel scores) rather than the integer image you were looking for; the decimals alone do not make it "imperfect". And yes, you do need to threshold it to get a binary image.

Once you have the binary image, you can do the rest with skimage:

from skimage import measure

label_mask = measure.label(mask)
props = measure.regionprops(label_mask)

Here mask is your binary image, and props holds the properties of every connected region, i.e. of every detected object.

Among these properties is the bounding box (the bbox attribute of each region).
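
For the 1D case in the question, the same idea (find connected runs, then read off their extents) can also be sketched in plain NumPy. This is not the skimage approach above but a swapped-in alternative: it pads the mask with zeros and looks at where np.diff rises and falls. The name run_edges is hypothetical, and the sketch assumes a single 1D binary channel.

```python
import numpy as np

def run_edges(mask):
    """Hypothetical helper: inclusive [start, end] of each run of ones in a 1D mask."""
    mask = np.asarray(mask).astype(int)
    # Zero-pad so that runs touching either border still produce an edge.
    padded = np.concatenate(([0], mask, [0]))
    change = np.diff(padded)
    starts = np.flatnonzero(change == 1)     # 0 -> 1 transitions
    ends = np.flatnonzero(change == -1) - 1  # 1 -> 0 transitions, made inclusive
    return np.column_stack((starts, ends)).tolist()

print(run_edges([1, 1, 1, 0, 0, 1, 1]))
# → [[0, 2], [5, 6]]
```

Applied per channel after thresholding, this yields the nested list asked for in the question, and it avoids building the full index array for each run.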
















        answered Nov 22 '18 at 16:48









        Nouman Riaz Khan
