igraph: adding vertices = X creating clusters of size = 1
I am currently working through some graph theory problems and have a question I can't seem to find an answer to. When creating a graph using:
x <- graph_from_data_frame(el, directed = F, vertices = x)
The addition of the vertices = x creates components of size = 1.
I want to look at cluster size i.e. extracting the components and looking at a table of size using:
comp <- components(x)
table(comp$csize)
Given the nature of edgelists, I would expect no clusters to have size < 2, seeing as each row of the edgelist is a relationship between two nodes. If I run the exact same code without the vertices = x, my table starts with clusters of size = 2.
Why does the addition of vertices = x do this?
Thanks
EDIT:
My edgelist has the variables:
ID   ID.2   soure
x1   x2     healthcare
x1   x3     child benefit
The vertices data frame contains general information for the nodes (IDs):
ID   date_of_birth   nationality
x1   02/09/1999      French
x2   12/12/1997      French
x3   22/01/2002      French
r igraph graph-theory sna
The vertices argument is there to include vertex metadata. Without knowing what is in x, it's hard to say. If you post some of your data with dput() or make a minimal reproducible example, it would be easier to diagnose.
– gfgm
Nov 21 '18 at 12:48
Hi, thanks for the quick response. I have edited the thread and added a small reproducible example.
– williamg15
Nov 21 '18 at 12:57
edited Nov 23 '18 at 9:17
Szabolcs
asked Nov 21 '18 at 12:44
williamg15
1 Answer
I suspect that you have IDs in your data.frame of node metadata x that do not appear in the edge list. igraph will add these nodes as isolated vertices. Some sample code below illustrates the problem:
library(igraph)
# generate some fake data
set.seed(42)
e1 <- data.frame(ID = sample(1:10, 5), ID.2 = sample(1:10, 5))
head(e1)
#> ID ID.2
#> 1 10 6
#> 2 9 7
#> 3 3 2
#> 4 6 5
#> 5 4 9
# make the desired graph object
x <- graph_from_data_frame(e1, directed = FALSE)
# make some attribute data that only matches the nodes that have edges
v_atts1 <- data.frame(ID = names(V(x)), foo = rnorm(length(names(V(x)))))
v_atts1
#> ID foo
#> 1 10 -0.10612452
#> 2 9 1.51152200
#> 3 3 -0.09465904
#> 4 6 2.01842371
#> 5 4 -0.06271410
#> 6 7 1.30486965
#> 7 2 2.28664539
#> 8 5 -1.38886070
g1 <- graph_from_data_frame(e1, directed = FALSE, vertices = v_atts1)
# we can see only groups of size 2 and greater
comp1 <- components(g1)
table(comp1$csize)
#>
#> 2 3
#> 1 2
# now make attribute data that includes nodes that don't appear in e1
v_atts2 <- data.frame(ID = 1:10, foo = rnorm(10))
g2 <- graph_from_data_frame(e1, directed = FALSE, vertices = v_atts2)
# now we see that there are isolated nodes
comp2 <- components(g2)
table(comp2$csize)
#>
#> 1 2 3
#> 2 1 2
# and inspecting the number of vertices we see that
# this is because the graph has incorporated vertices
# that appear in the metadata but not the edge list
length(V(g1))
#> [1] 8
length(V(g2))
#> [1] 10
If you want to avoid this, you could try graph_from_data_frame(el, directed = FALSE, vertices = x[x$ID %in% c(el$ID, el$ID.2), ]), using the el and x from your question, which should subset your metadata to only the vertices that are connected. Note that you may want to check that your IDs are not being encoded as factors with levels that do not appear in the data.
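To see which IDs are responsible before building the graph, something like the following base R check may help. It is only a sketch on made-up data modelled on the question: the extra row x4 and the names edge_ids, isolated_ids, x_connected and id_factor are my own assumptions, not part of the original data. The as.character() calls are there as a guard in case the ID columns are stored as factors.

```r
# Hypothetical data modelled on the question; x4 is an invented ID that
# appears in the vertex metadata but never in the edge list.
el <- data.frame(
  ID   = c("x1", "x1"),
  ID.2 = c("x2", "x3"),
  stringsAsFactors = FALSE
)
x <- data.frame(
  ID          = c("x1", "x2", "x3", "x4"),
  nationality = c("French", "French", "French", "French"),
  stringsAsFactors = FALSE
)

# All IDs that occur at either end of an edge.
# as.character() guards against ID columns stored as factors.
edge_ids <- unique(c(as.character(el$ID), as.character(el$ID.2)))

# IDs in the metadata that never appear in the edge list;
# these are the ones that would end up as components of size 1.
isolated_ids <- setdiff(as.character(x$ID), edge_ids)
isolated_ids
#> [1] "x4"

# Metadata restricted to connected vertices (same idea as the %in% subset)
x_connected <- x[as.character(x$ID) %in% edge_ids, ]

# If the ID column is a factor, drop levels that no longer occur
id_factor <- droplevels(factor(x$ID)[as.character(x$ID) %in% edge_ids])
levels(id_factor)
#> [1] "x1" "x2" "x3"
```

If isolated_ids comes back empty but you still see components of size 1, the mismatch is more likely in how the IDs are encoded (factor levels, stray whitespace, differing types) than in the metadata itself.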
Sorry for only getting back to you now! The %in% procedure solves the problem. Many thanks
– williamg15
Dec 3 '18 at 10:51
answered Nov 21 '18 at 13:12
gfgm