Frequencies of all subsequences of size 3 in a given 0-1 sequence?
Given the data
s<-c(1,0,0,0,1,0,0,0,0,0,1,1,1,0,0)
I can count the 1s and 0s with table or ftable
ftable(s,row.vars =1:1)
and count how often each of the pairs 11, 01, 10, 00 occurs in s with
table(s[-length(s)],s[-1])
What would be a clever way to count the occurrences of 111, 011, ..., 100, 000? Ideally, I want a table of counts x like
     0  1
11   x  x
01   x  x
10   x  x
00   x  x
Is there a general way to count the occurrences of every possible subsequence of length k = 1, 2, 3, 4, ... in the data?
r count sequence
asked Feb 17 '10 at 7:22 by andrekos (edited Nov 22 '18 at 5:38 by Cœur)
2 Answers
Well, it seems like you first need to generate the n-tuples (every run of n consecutive values) from your vector. The following function should accomplish that:
makeTuples <- function( x, n ){
  # Element i of the result holds the i-th member of every length-n tuple.
  # Very inefficient way to loop... but what the heck
  tuples <- list()
  for( i in 1:n ){
    tuples[[i]] <- x[i:(length(x)-n+i)]
  }
  return(tuples)
}
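To see what makeTuples() actually returns, here is a small check on a toy vector (x is a made-up input, not taken from the question). Each list element holds one position of the tuple, so reading across the elements at the same index gives one tuple.

```r
# Same definition as above, repeated so the snippet runs on its own.
makeTuples <- function(x, n) {
  tuples <- list()
  for (i in 1:n) {
    tuples[[i]] <- x[i:(length(x) - n + i)]
  }
  tuples
}

x <- c(1, 0, 0, 1)          # toy input, for illustration only
parts <- makeTuples(x, 2)

parts[[1]]                  # first members of each pair:  1 0 0
parts[[2]]                  # second members of each pair: 0 0 1

# Column-binding the parts puts one tuple per row: (1,0), (0,0), (0,1).
windows <- do.call(cbind, parts)
windows
```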
Then you could feed the results of makeTuples() to table() using do.call():
do.call( table, makeTuples(s,3) )
, ,  = 0

    0 1
  0 4 1
  1 3 1

, ,  = 1

    0 1
  0 2 1
  1 0 1
This works because makeTuples() returns a list of n vectors, one per position in the tuple, and do.call() passes them to table() as separate arguments. The output isn't quite as nice as you wanted, but you could write a function to reformat, say:
, ,  = 0

    0 1
  0 4 1
  1 3 1
To:
     0 1
00   4 1
01   3 1
It would require looping over the outer n-2 dimensions of the n-dimensional array returned by table, creating row names and concatenating things together.
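A minimal sketch of that reformatting step, assuming base R only and n >= 2. Instead of an explicit loop it relies on the array being stored with the first dimension varying fastest, which is the same order expand.grid() uses. The names cols, tab, labels and flat are mine, and the row labels come out in expand.grid() order, which may differ from the ordering sketched above.

```r
s <- c(1,0,0,0,1,0,0,0,0,0,1,1,1,0,0)
n <- 3

# One vector per tuple position (what makeTuples(s, n) returns).
cols <- lapply(seq_len(n), function(i) s[i:(length(s) - n + i)])
tab <- do.call(table, cols)            # n-dimensional array of counts

d <- dim(tab)
# Collapse the first n-1 dimensions into rows; the last one becomes columns.
flat <- matrix(as.vector(tab), nrow = prod(d[-n]), ncol = d[n])

# Row labels: every combination of the first n-1 positions, pasted together.
labels <- expand.grid(unname(dimnames(tab)[-n]), stringsAsFactors = FALSE)
rownames(flat) <- do.call(paste0, labels)
colnames(flat) <- dimnames(tab)[[n]]

flat
```

Each row label is the first n-1 values of the tuple and each column is the last value, so flat["00", "1"] is the number of times 001 occurs.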
Update
So, I was just sitting in a stochastic processes class when I figured out a more or less straightforward way to produce the output you want without trying to unwind the output of table(). First you will need a function that generates all possible permutations of n selections from your population (strictly speaking these are ordered selections with repetition, so there are length(population)^n of them). The generation can be done with expand.grid(), but it needs a little sugar-coating:
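permute <- function( population, n ){
  # One row per combination: n columns, each ranging over the population.
  permutations <- do.call( expand.grid, rep( list(population), n ) )
  # Collapse each row into a single string such as "01".
  permutations <- apply( permutations, 1, paste, collapse = '' )
  return( permutations )
}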
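As a quick illustration of what this helper produces, here is a sketch built on the same expand.grid() idea, with do.call(paste0, ...) swapped in for apply() (that substitution is mine). For a population of size m it yields m^n strings, with the first position varying fastest.

```r
population <- c(1, 0)
n <- 2

# expand.grid() accepts a list of vectors and returns one row per combination.
grid <- expand.grid(rep(list(population), n))

# Paste the columns together row by row to get one string per combination.
keys <- do.call(paste0, grid)

keys                                    # "11" "01" "10" "00"
length(keys) == length(population)^n    # TRUE
```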
The basic idea is to iterate over the list of permutations and count the number of tuples that match the given permutation. Since you want the results split out into a table, we should select a permutation of n-1 elements from the population and let the last position form the columns of the table. Here's a function that takes a permutation of size n-1, a character vector of tuples, and the population the tuples were drawn from, and produces a named vector of match counts:
countFrequency <- function(permutation,tuples,population){
  permutations <- paste( permutation, population, sep = '' )
  # Inner lapply applies the equality operator `==` to each
  # permutation and returns a list of TRUE/FALSE vectors.
  # Outer lapply sums the number of TRUE values in each vector.
  frequencies <- lapply(lapply(permutations,`==`,tuples),sum)
  names( frequencies ) <- as.character( population )
  return( unlist(frequencies) )
}
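To make the counting step concrete, here is a sketch of the same idea for a single prefix, written with vapply() instead of the nested lapply() calls (my substitution). The tuples vector is the 13 length-3 windows of s from the question, typed out by hand.

```r
# The 13 length-3 windows of s, already pasted into strings.
tuples <- c("100", "000", "001", "010", "100", "000", "000",
            "000", "001", "011", "111", "110", "100")
population <- c(1, 0)

# Candidates for the prefix "00": "001" and "000".
candidates <- paste0("00", population)

# Count exact matches for each candidate.
counts <- vapply(candidates, function(p) sum(tuples == p), integer(1))
names(counts) <- as.character(population)

counts   # named vector: "1" -> 2, "0" -> 4
```

These two numbers are the "00" row of the k = 3 frequency table.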
Finally, all three functions can be combined into a bigger function that takes a vector, splits it into n-tuples and returns a frequency table. The final aggregation is done using ldply() from Hadley Wickham's plyr package, as it does a nice job of preserving which permutation corresponds to which row of the output:
permutationFrequency <- function( vector, n, population = unique( vector ) ){
  # Split the vector into tuples.
  tuples <- makeTuples( vector, n )
  # Coerce and compact the tuples to a vector of strings.
  tuples <- do.call(cbind,tuples)
  tuples <- apply( tuples, 1, paste, collapse = '' )
  # Generate permutations of n-1 elements from the population.
  # Turn into a named vector for ldply() to work its magic.
  permutations <- permute( population, n-1 )
  names( permutations ) <- permutations
  frequencies <- ldply( permutations, countFrequency,
                        tuples = tuples, population = population )
  return( frequencies )
}
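plyr is not strictly required for the final step. Below is a sketch of an alternative that gets the same layout from base R's table() by turning the prefixes and the last position into factors with fixed levels, so combinations that never occur still show up as rows of zeros. tupleFrequency is a hypothetical name, the function assumes n >= 2, and it returns a table rather than the data frame that ldply() produces.

```r
tupleFrequency <- function(x, n, population = unique(x)) {
  stopifnot(n >= 2, length(x) >= n)
  m <- length(x) - n + 1                       # number of windows

  # cols[[i]] holds the i-th member of every window.
  cols <- lapply(seq_len(n), function(i) x[i:(i + m - 1)])

  # Observed prefixes (first n-1 members) as strings.
  prefix <- do.call(paste0, cols[-n])

  # All possible prefixes, so unseen ones still get a row of zeros.
  all_prefixes <- do.call(paste0, expand.grid(rep(list(population), n - 1)))

  table(prefix = factor(prefix, levels = all_prefixes),
        last   = factor(cols[[n]], levels = population))
}

s <- c(1,0,0,0,1,0,0,0,0,0,1,1,1,0,0)
res3 <- tupleFrequency(s, 3)
res4 <- tupleFrequency(s, 4)
res3
```

Fixing the factor levels up front is what keeps rows such as 101 in the k = 4 table even though that prefix never appears in s.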
And there you go:
require( plyr )
permutationFrequency( s, 2 )
  .id 1 0
1   1 2 3
2   0 2 7
permutationFrequency( s, 3 )
  .id 1 0
1  11 1 1
2  01 1 1
3  10 0 3
4  00 2 4
permutationFrequency( s, 4 )
  .id 1 0
1 111 0 1
2 011 1 0
3 101 0 0
4 001 1 1
5 110 0 1
6 010 0 1
7 100 0 2
8 000 2 2
# Random input, so your counts will differ from these.
permutationFrequency( sample( -1:1, 10, replace = TRUE ), 2 )
  .id 1 -1 0
1   1 1  2 0
2  -1 0  1 2
3   0 1  0 2
Apologies to my stochastic processes teacher, but functional programming problems in R were just more interesting than the Gambler's Ruin today...
answered Feb 17 '10 at 20:38 by Sharpie (edited Feb 20 '10 at 19:42)
Thanks very much for this, but the .id column appears to be missing in my output. Or am I missing something? The rest is exactly what I needed. – andrekos, Feb 18 '10 at 1:30
Hmm, I noticed the .id column didn't show up if I gave an unnamed list or vector to ldply(). Did you include names(permutations) <- permutations? – Sharpie, Feb 18 '10 at 1:40
Yes, to start with, I copy-pasted your code. – andrekos, Feb 18 '10 at 8:26
Interesting. Could be a version thing: I'm using R 2.10.1 and plyr 0.1.9. – Sharpie, Feb 18 '10 at 9:31
sessionInfo() showed I was using plyr 0.1.3, and update.packages() did not help. But upgrading from R 2.9.2 did help :) – andrekos, Feb 19 '10 at 0:11
One approach is to create a data frame of the subsequences and then use the table function:
s <- c(1,0,0,0,1,0,0,0,0,0,1,1,1,0,0)
n <- length(s)
k <- 3
# One row per window of length k (sapply returns them as columns, hence t()).
subseqs <- t(sapply(1:(n-k+1), function(i){ s[i:(i+k-1)] }))
colnames(subseqs) <- paste('Y', 1:k, sep="")
subseqs <- data.frame(subseqs)
table(subseqs)
This produces
, , Y3 = 0

   Y2
Y1  0 1
  0 4 1
  1 3 1

, , Y3 = 1

   Y2
Y1  0 1
  0 2 1
  1 0 1
Use ftable() instead of table(), or apply it to the output of table(), for a display similar to the one in your question:
ftable(subseqs)
      Y3 0 1
Y1 Y2
0  0     4 2
   1     1 1
1  0     3 0
   1     1 1
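For general k, the window matrix can also be built with embed() from the stats package, which returns the lagged windows with the most recent value in the first column (hence the column reversal below). This is only a sketch under the assumption k >= 2, and countWindows is a made-up name.

```r
countWindows <- function(s, k) {
  stopifnot(k >= 2, length(s) >= k)
  # embed() puts s[i + k - 1] in column 1 and s[i] in column k; reverse them.
  windows <- embed(s, k)[, k:1, drop = FALSE]
  colnames(windows) <- paste0("Y", seq_len(k))
  # By default ftable() uses the last variable as columns, the rest as rows.
  ftable(as.data.frame(windows))
}

s <- c(1,0,0,0,1,0,0,0,0,0,1,1,1,0,0)
ft <- countWindows(s, 3)
ft
```

Calling countWindows(s, 4) gives the same layout with three row variables, so the one function covers any window length the data can support.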
answered Feb 18 '10 at 9:13 by Jyotirmoy Bhattacharya