Semantic Segmentation to Bounding Boxes

Suppose you are performing semantic segmentation. For simplicity, let's assume this is 1D segmentation rather than 2D (i.e., objects have only a width, so we only care about where each one starts and ends).

So the desired output of our model might be something like:

[
    [0, 0, 0, 0, 1, 1, 1],  # label channel 1
    [1, 1, 1, 0, 0, 1, 1],  # label channel 2
    [0, 0, 0, 1, 1, 1, 0],  # label channel 3
    #...
]

However, the output of our trained, imperfect model might look more like:

[
    [0.1,  0.1,  0.1,  0.4,  0.91, 0.81, 0.84],  # label channel 1
    [0.81, 0.79, 0.85, 0.1,  0.2,  0.61, 0.91],  # label channel 2
    [0.3,  0.1,  0.24, 0.87, 0.62, 1,    0   ],  # label channel 3
    #...
]

What would be a performant way, using Python, of getting the boundaries (or bounding boxes) of the labels? For example (zero-indexed, inclusive):

[
    [[4, 6]],          # "objects" of label 1
    [[0, 2], [5, 6]],  # "objects" of label 2
    [[3, 5]],          # "objects" of label 3
]

If it helps, transforming the output to a binary mask might be a useful first step:

import numpy as np

def binarize(arr, cutoff=0.5):
    # 1 where the score exceeds the cutoff, 0 elsewhere
    return (arr > cutoff).astype(int)

With a binary mask, we just need to find runs of consecutive integers among the indices of the nonzero values:

def consecutive(data, stepsize=1):
    # split the sorted indices wherever the gap between neighbours is not `stepsize`
    return np.split(data, np.where(np.diff(data) != stepsize)[0] + 1)

Find the "runs" of each label (each run is returned as an array of its indices):

def binary_boundaries(labels, cutoff=0.5):
    return [consecutive(channel.nonzero()[0]) for channel in binarize(labels, cutoff)]

Name the objects according to the channel name:

def binary_objects(labels, cutoff=0.5, channel_names=None):
    if channel_names is None:
        channel_names = ['channel {}'.format(i) for i in range(labels.shape[0])]

    return dict(zip(channel_names, binary_boundaries(labels, cutoff)))
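The functions above return every index of every run, whereas the desired output lists only the first and last index. As a minimal sketch (not part of the original attempt; the names `run_boundaries` and `boundaries` are made up for illustration), the inclusive [start, end] pairs can be read off the rising and falling edges of the zero-padded mask with np.diff:

```python
import numpy as np

def run_boundaries(mask_row):
    """Return inclusive [start, end] pairs for each run of nonzero values in a 1D mask."""
    # Pad with zeros so runs touching either edge still have a rising and a falling edge.
    padded = np.concatenate(([0], np.asarray(mask_row, dtype=np.int8), [0]))
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)       # first index of each run
    ends = np.flatnonzero(edges == -1) - 1    # last index of each run (inclusive)
    return np.column_stack((starts, ends)).tolist()

def boundaries(labels, cutoff=0.5):
    """Threshold every channel and collect its runs."""
    return [run_boundaries(row > cutoff) for row in np.asarray(labels)]

# The model output from the question
predicted = np.array([
    [0.1,  0.1,  0.1,  0.4,  0.91, 0.81, 0.84],
    [0.81, 0.79, 0.85, 0.1,  0.2,  0.61, 0.91],
    [0.3,  0.1,  0.24, 0.87, 0.62, 1,    0],
])
result = boundaries(predicted)
print(result)  # [[[4, 6]], [[0, 2], [5, 6]], [[3, 5]]]
```

A channel with no value above the cutoff yields an empty list.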

edited Nov 18 '18 at 13:58

asked Nov 18 '18 at 11:42

SumNeuron


Your question is not clear, but if you are asking how to compute the bounding box from a segmentation image, see my answer.

– Nouman Riaz Khan
Nov 22 '18 at 16:42

@NoumanRiazKhan I appreciate your time and input. What part of my question is not clear? This is not quite bounding boxes, as it is in 1D, so (1) the labels' dimensions as well as the output would have to be reformatted, and (2) I was hoping for something a bit more sophisticated than a binary mask (e.g. if a channel was [0, 1, 0.6, 1, 1, 1], then adjust 0.6 according to its neighbors)...

– SumNeuron
Nov 22 '18 at 16:50

Did you try the solution I proposed? The unclear part is: do you want the indices of all non-zero values for every label, or do you want the 4 bounding box points? Do you need to find an optimal cutoff point?

– Nouman Riaz Khan
Dec 3 '18 at 12:14

@NoumanRiazKhan I just tried your solution; however, that entire module is centered around 3-channel images, so it is not ideal in this case.

– SumNeuron
Dec 3 '18 at 15:23
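The comment above about adjusting a value such as 0.6 according to its neighbors does not say how the adjustment should work. One possible reading, sketched here purely as an assumption, is to threshold first and then bridge short gaps between runs. The helper names `runs_of` and `merge_close_runs` and the `max_gap` parameter are hypothetical, and a stricter cutoff of 0.7 is used so that the 0.6 in the comment's example actually falls below it:

```python
import numpy as np

def runs_of(mask):
    """Inclusive [start, end] pairs for each run of True values in a 1D mask."""
    padded = np.concatenate(([0], np.asarray(mask, dtype=np.int8), [0]))
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1) - 1
    return np.column_stack((starts, ends)).tolist()

def merge_close_runs(runs, max_gap=1):
    """Merge runs separated by at most `max_gap` below-cutoff positions."""
    merged = []
    for start, end in runs:
        if merged and start - merged[-1][1] - 1 <= max_gap:
            merged[-1][1] = end          # extend the previous run across the small gap
        else:
            merged.append([start, end])
    return merged

channel = np.array([0, 1, 0.6, 1, 1, 1])   # example channel from the comment
raw_runs = runs_of(channel > 0.7)           # the 0.6 splits the object in two
bridged = merge_close_runs(raw_runs, max_gap=1)
print(raw_runs)  # [[1, 1], [3, 5]]
print(bridged)   # [[1, 5]]
```

Bridging gaps after thresholding leaves the outer boundaries of each object untouched, which a symmetric smoothing filter applied before thresholding would not guarantee.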

Tags: python, numpy, machine-learning, computer-vision


1 Answer

Your trained model returned a float image rather than the int image you were looking for (that alone does not make it "imperfect", if the decimals were what bothered you), and yes, you do need to threshold it to get a binary image.

Once you have the binary image, let's do some work with skimage:

from skimage import measure

label_mask = measure.label(mask)
props = measure.regionprops(label_mask)

Here mask is your binary image, and props holds the properties of all the labeled regions, which are in effect the detected objects.

Among these properties is the bounding box (the bbox attribute of each region).
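To apply this to the 1D channels from the question, one option is to treat each channel as a one-row image. The following is only a sketch under that assumption (the answer itself does not show it, and `channel_boxes` is a made-up helper name); it relies on bbox being reported as (min_row, min_col, max_row, max_col) with exclusive maxima:

```python
import numpy as np
from skimage import measure

def channel_boxes(mask_row):
    """Inclusive [start, end] extent of each connected run in a 1D binary mask."""
    # Assumption: regionprops wants at least a 2D label image,
    # so the row is viewed as a 1 x width image.
    image = np.asarray(mask_row, dtype=np.uint8)[np.newaxis, :]
    label_mask = measure.label(image)
    boxes = []
    for region in measure.regionprops(label_mask):
        min_row, min_col, max_row, max_col = region.bbox  # max values are exclusive
        boxes.append([int(min_col), int(max_col) - 1])
    return boxes

# The model output from the question, thresholded at 0.5
predicted = np.array([
    [0.1,  0.1,  0.1,  0.4,  0.91, 0.81, 0.84],
    [0.81, 0.79, 0.85, 0.1,  0.2,  0.61, 0.91],
    [0.3,  0.1,  0.24, 0.87, 0.62, 1,    0],
])
mask = predicted > 0.5
boxes = [channel_boxes(row) for row in mask]
print(boxes)  # [[[4, 6]], [[0, 2], [5, 6]], [[3, 5]]]
```

Each channel is labeled separately so that objects in neighboring channels are never joined into one region.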

answered Nov 22 '18 at 16:48

Nouman Riaz Khan

