While reading Aboy et al. (2006), I needed a function for computing Lempel-Ziv complexity.
I searched the internet but couldn't find one,
so I wrote one myself in R.
Here are the source, an example, and the result.
# FUNCTION lempel.ziv(____.VEC, ____.VEC)
# s is a sequence vector
# alphabet is a vector of alphabet letters
# The function counts the unique sub-sequences and normalizes the count.
# It returns a vector: the normalized count after each symbol of s.
# ref) Aboy et al. (2006),
# Interpretation of the Lempel-Ziv Complexity Measure
# in the Context of Biomedical Signal Analysis,
# IEEE Trans Biomed Eng. 2006 Nov;53(11):2282-8.
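lempel.ziv=function(s, alphabet) {
  # drop missing values, then check every symbol against the alphabet
  s=s[!is.na(s)]
  n=length(s)
  if (!all(s %in% alphabet)) { stop("Alphabet error!") }
  if (n==0) { return(numeric(0)) }
  # voc: phrases seen so far; cmpl[k]: number of phrases after k symbols
  voc=s[1]; cmpl=1
  r=1; i=1   # r: last parsed position, i: length of the candidate phrase
  while (r+i<=n) {
    Q=""
    repeat {
      # extend the candidate phrase by one symbol
      Q=paste(Q,s[r+i], sep="")
      if (Q %in% voc) {
        cmpl[r+i]=cmpl[r+i-1]; i=i+1 }
      if (!(Q %in% voc) || !(r+i<=n)) { break }
    } # repeat
    if (r+i > n) break   # the input ended inside a known phrase
    # Q is new: add it to the vocabulary and count it
    voc=c(voc, Q); cmpl[r+i]=cmpl[r+i-1]+1
    r=r+i; i=1
  }
  # normalize by k/log_alpha(k), where alpha is the alphabet size
  # (the alphabet needs at least two letters for the logarithm to be finite)
  cmpl=cmpl/(1:n/log(1:n,length(alphabet)))
  return(cmpl)
}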
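To see what the parsing loop above collects, here is a minimal sketch that applies the same rule to a short binary sequence and returns the phrases themselves rather than the running count. The helper name `lz.phrases` and the sample sequence are made up for illustration and are not part of the original code. The last lines apply the same normalization, c(n) / (n / log_alpha(n)), assuming a binary alphabet (alpha = 2).

```r
# Hypothetical helper, not part of the original post: it repeats the parsing
# rule of lempel.ziv() but returns the phrases instead of the running count.
lz.phrases <- function(s) {
  s <- as.character(s[!is.na(s)])
  n <- length(s)
  if (n == 0) return(character(0))
  voc <- s[1]   # the first symbol is always the first phrase
  r <- 1        # position of the last symbol already parsed
  while (r < n) {
    i <- 1
    Q <- s[r + 1]
    # grow the candidate phrase while it is already in the vocabulary
    while (Q %in% voc && r + i < n) {
      i <- i + 1
      Q <- paste(Q, s[r + i], sep = "")
    }
    if (Q %in% voc) break   # the input ended inside a known phrase
    voc <- c(voc, Q)
    r <- r + i
  }
  voc
}

s <- c(0, 1, 0, 1, 1, 0, 1, 1, 1)
phrases <- lz.phrases(s)
print(phrases)   # "0" "1" "01" "10" "11"

# Normalization as in the last step of lempel.ziv(): c(n) / (n / log_alpha(n)),
# here with alpha = 2 and n = 9 symbols.
n <- length(s)
normalized <- length(phrases) / (n / log(n, base = 2))
print(normalized)
```

Five phrases are found in nine symbols, so the final normalized value is 5 * log2(9) / 9. A constant sequence such as `rep(0, 4)` yields only the phrases "0" and "00", which is why repetitive inputs score low.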
# FUNCTION lempel.ziv2(____.CHR, ____.CHR)
# Wrapper for lempel.ziv
# str is a vector of strings
# str.alphabet is a vector of alphabet strings, of length 1 or length(str)
# returns the normalized complexity at the end of each string
lempel.ziv2=function(str, str.alphabet) {
  s2=strsplit(str,"")
  alphabet=strsplit(str.alphabet,"")
  # a single alphabet is shared by all strings, so the index must not advance
  if (length(alphabet) ==1) { inc.alphabet = 0 }
  else {
    if (length(alphabet) != length(s2))
      { stop("Number of Strings and alphabets aren't the same.") }
    else { inc.alphabet=1 }}   # one alphabet per string
  index.alphabet = 1
  lzs=c()
  for (s in s2) {
    # use the alphabet that belongs to the current string
    lzs=c(lzs, lempel.ziv(s, alphabet[[index.alphabet]])[length(s)])
    index.alphabet=index.alphabet+inc.alphabet
  }
  lzs
}
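The wrapper rests on two steps: splitting each string into single characters with `strsplit(x, "")`, and pairing every string with its alphabet. The sketch below shows those two steps in isolation. The input strings are hypothetical, and `rep(..., length.out = ...)` is used here as one possible alternative to the index counter, not as the approach of the code above.

```r
# Hypothetical inputs, for illustration only
strs <- c("0101", "abca")
alphabets <- c("01", "abc")

symbols <- strsplit(strs, "")          # one character vector per string
letter.sets <- strsplit(alphabets, "") # one character vector per alphabet

# rep() with length.out recycles a single alphabet over all strings and
# leaves a one-per-string list unchanged.
letter.sets <- rep(letter.sets, length.out = length(symbols))

# Pair each string with its alphabet and check that every symbol is allowed.
ok <- mapply(function(s, a) all(s %in% a), symbols, letter.sets)
print(ok)
```

With a single alphabet such as `alphabets <- "01"`, the same code would check both strings against "0" and "1", and the second string would fail the check.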
# examples for lempel.ziv
par(mfcol=c(1,1))
a=list(); lz=list();
a[[1]]<-floor(runif(1000,min=0,max=4))
a[[2]]<-rep(0,1000)
a[[3]]<-floor(rnorm(1000,2,0.5))
a[[3]][a[[3]]<0 | a[[3]]>3]=1
a[[4]]<-rep(c(0,1,2,3),250)
a[[5]]<-rep(c(0,1),500)
a[[6]]<-floor(runif(1000,min=0,max=2))
temp=a[[1]]; temp[rep(c(TRUE,FALSE),500)]=1; a[[7]]=temp
# Caution: a logical index that is longer than the vector it indexes
# returns NA for every TRUE beyond the end of the vector, e.g.
# idx=c(T,F,T,T,F,F,T,T,T); v=1:3; v[idx]   # 1 3 NA NA NA NA
leg = c("unif","rep0","norm/2/0.5","rep0123","rep01",
"unif(0,1)","unif+T")
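for (i in seq_along(a)) {
  lz[[i]]<-lempel.ziv(a[[i]],c(0,1,2,3))
}
# use a common y-range so that no curve is clipped
plot(lz[[1]], type="l", col=1, ylim=range(unlist(lz), finite=TRUE),
     xlab="sequence length", ylab="normalized LZ complexity")
for (i in 2:length(lz)) {
  lines(lz[[i]], col=i)
}
legend(x="topright",legend=leg, col=seq_along(leg), lty=rep("solid",length(leg)))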
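The caution in the example script about logical indices can be reproduced with a few lines. This is only a small demonstration of standard R subsetting; the variable names are chosen for illustration.

```r
v <- 1:3
idx <- c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE)

res <- v[idx]      # TRUE at positions 4, 7, 8 and 9 points past the end of v
print(res)         # 1 3 NA NA NA NA

# One possible guard: compare the lengths before indexing
same.length <- length(idx) == length(v)
print(same.length) # FALSE
```

In the examples above the index `rep(c(TRUE,FALSE),500)` has exactly 1000 elements, the same length as the sequence, so no NA values are introduced there.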
Reference
Aboy et al. (2006). Interpretation of the Lempel-Ziv Complexity Measure in the Context of Biomedical Signal Analysis. IEEE Trans Biomed Eng. 2006 Nov;53(11):2282-8.