티스토리 뷰
최근에 R에서 sum(a*b)를 해야 하는 상황에 직면했다. 문제는 a와 b가 매우 크거나 작은 수라는 것!
Recently, I came across the situation in which I had to sum(a*b) in R. The problem was that a and b are vectors of very large or small numbers.
다음의 R implementation은 log a b = log c b/log c a라는 것을 안다면 쉽게 이해할 수 있을 것이다.
The implementation of log-sum-exp below must be easily understandable if you know log a b = log c b/log c a
# log-sum-exp trick
# calculating sum(a*b) using log-sum-exp trick
# Case 1
a = runif(1000000, 1e+220, 1e+276)
b = runif(1000000, 1e+40, 1e+44)
# Case 2
#a = runif(1000000, 1e-200, 1e-180)
#b = runif(1000000, 1e-200, 1e-180)
sum(a*b)
log.a = log(a)
log.b = log(b)
sum.log.ab = log.a + log.b
max.log = max(sum.log.ab)
result.log = log(sum(exp(sum.log.ab - max.log))) + max.log
# log_10 X = log_10 e * log_e x
result.log10 = log10(exp(1))*result.log
paste(10^(result.log10 - floor(result.log10)), "*10^",floor(result.log10), sep="")
# compare with the result above
sum((a/10000000000)*(b/10000000000))
#sum((a*100000000000000000000)*(b*100000000000000000000)) for case 2
#log.sum.exp:
#input a, b on log-scale
# output sum(a*b) on log-scale
log.sum.exp=function(a.log, b.log) {
stopifnot(is.vector(a.log) & is.vector(b.log))
stopifnot(is.numeric(a.log) & is.numeric(b.log))
sum.log.ab = a.log + b.log
max.log = max(sum.log.ab)
log(sum(exp(sum.log.ab - max.log))) + max.log
}
#log10.sum.exp : same with log.sum.exp except for that the base is 10 instead of e
log10.sum.exp=function(a.log, b.log) {
stopifnot(is.vector(a.log) & is.vector(b.log))
stopifnot(is.numeric(a.log) & is.numeric(b.log))
sum.log.ab = a.log + b.log
max.log = max(sum.log.ab)
log10(sum(10^(sum.log.ab - max.log))) + max.log
}
log.sum.exp(log(a), log(b))
log10.sum.exp(log10(a), log10(b))
'차기작 : R을 배우자' 카테고리의 다른 글
Dealing with multidimensional array (0) | 2015.11.10 |
---|---|
8 dimensions (0) | 2015.11.02 |
log-sum-exp trick (0) | 2015.10.15 |
Against underflow (0) | 2015.10.10 |
fitting data to gpcm(ltm or mirt?) (0) | 2015.09.26 |