R - Grouping very small numbers (e.g. 1e-28) and 0.0 in data.table v1.8.10 vs v1.9.2
I noticed that frequency tables created by data.table in R do not seem to distinguish between very small numbers and zero. Can I change this behavior, or is this a bug?
Reproducible example:
    library(data.table)
    DT <- data.table(c(0.0000000000000000000000000001, 2, 9999, 0))
    # distinct values of the column
    test1 <- as.data.frame(unique(DT[, V1]))
    # frequency table: number of rows per distinct value
    test2 <- DT[, .N, by = V1]
As you can see, the frequency table (test2) does not recognize the difference between 0.0000000000000000000000000001 and 0, and puts both observations in the same class.
data.table version: 1.8.10
R: 3.02
It is worth reading R FAQ 7.31 and thinking about the accuracy of floating point representations.
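As a small illustration of the floating point point above, here is a base R sketch (no data.table involved) showing that an exact comparison and a tolerance-based comparison can disagree. The variable names are made up for this example.

```r
# Base R only: exact comparison vs. comparison within a tolerance.
x <- 1e-28

# Exact comparison of the two doubles: 1e-28 is representable and is not 0.
exact_equal <- (x == 0)

# all.equal() compares within a tolerance; its default is sqrt(.Machine$double.eps).
tolerant_equal <- isTRUE(all.equal(x, 0))
default_tol <- sqrt(.Machine$double.eps)

# A classic representation example: 0.1 + 0.2 is not stored exactly as 0.3.
sum_exact <- (0.1 + 0.2 == 0.3)
sum_tolerant <- isTRUE(all.equal(0.1 + 0.2, 0.3))

print(c(exact_equal = exact_equal, tolerant_equal = tolerant_equal))
print(c(sum_exact = sum_exact, sum_tolerant = sum_tolerant))
print(default_tol)
```

So whether 1e-28 and 0 count as "the same" depends entirely on which notion of equality the grouping code uses.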
I can't reproduce this in the current CRAN version (1.9.2), using
R version 3.0.3 (2014-03-06), Platform: x86_64-w64-mingw32/x64 (64-bit)
My guess is that the change in behaviour is related to this NEWS item:
o Numeric data is still joined and grouped within tolerance as before, but instead of the tolerance being sqrt(.Machine$double.eps) == 1.490116e-08 (the same as base::all.equal's default), the significand is now rounded to the last 2 bytes, apx 11 s.f. This is more appropriate for large (1.23e20) and small (1.23e-20) numerics and is faster via a simple bit twiddle. A few functions provided a 'tolerance' argument, but this wasn't being passed through, so it has been removed. We aim to add a global option (e.g. 2, 1 or 0 byte rounding) in a future release.
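To make the NEWS item more concrete, here is a rough base R sketch contrasting the two notions of "equal". The helpers same_abs_tol() and same_sig_fig() are hypothetical names invented for this example, and signif() is only a decimal stand-in for the binary significand rounding; this is not data.table's actual implementation.

```r
# Hypothetical helpers for illustration only; NOT data.table internals.

# Old idea: two values match if they differ by less than a fixed absolute tolerance.
same_abs_tol <- function(a, b, tol = sqrt(.Machine$double.eps)) {
  abs(a - b) < tol
}

# New idea (approximated): two values match if they agree to about 11 significant figures.
same_sig_fig <- function(a, b, digits = 11) {
  signif(a, digits) == signif(b, digits)
}

# Tiny value vs. zero: within the absolute tolerance, but different significands.
tiny_abs <- same_abs_tol(1e-28, 0)
tiny_sig <- same_sig_fig(1e-28, 0)

# Small numerics that clearly differ: the absolute tolerance is far too coarse.
small_abs <- same_abs_tol(1.23e-20, 1.24e-20)
small_sig <- same_sig_fig(1.23e-20, 1.24e-20)

# Large numerics that differ only beyond the 11th significant figure:
# the absolute tolerance is far too strict.
large_abs <- same_abs_tol(1.23e20, 1.23e20 + 1e6)
large_sig <- same_sig_fig(1.23e20, 1.23e20 + 1e6)

print(c(tiny_abs = tiny_abs, tiny_sig = tiny_sig))
print(c(small_abs = small_abs, small_sig = small_sig))
print(c(large_abs = large_abs, large_sig = large_sig))
```

The fixed tolerance lumps together small values that differ and separates large values that agree to many digits, whereas rounding the significand scales with the magnitude of the number.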
Update from Matt
Yes, this was a deliberate change in v1.9.2, and data.table now distinguishes 0.0000000000000000000000000001 from 0 (as user3340145 rightly thought it should) due to the improved rounding method highlighted above from NEWS.
I've also added the for loop test from Rick's answer to the test suite.
Btw, #5369 is now implemented in v1.9.3 (although neither of these is needed for this question):
o bit64::integer64 now works in grouping and joins, #5369. Thanks to James Sams for highlighting UPCs.
o New function setNumericRounding() may be used to reduce to 1 byte or 0 byte rounding when joining to or grouping columns of type 'numeric', #5369. See the example in ?setNumericRounding and the NEWS item from v1.9.2. getNumericRounding() returns the current setting.
Notice that rounding is now (as from v1.9.2) about the accuracy of the significand; i.e. the number of significant figures. 0.0000000000000000000000000001 == 1.0e-28 is accurate to just 1 s.f., so the new rounding method doesn't group it together with 0.0.
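As a hedged sketch of how this can be checked, the snippet below reruns the grouping from the question with the rounding level set explicitly. It assumes a data.table version that provides setNumericRounding() and getNumericRounding() (v1.9.3 or later according to the NEWS item above); the default rounding level may differ between releases, which is why it is set by hand here.

```r
library(data.table)

# Remember the current rounding level so it can be restored afterwards.
old_rounding <- getNumericRounding()

# Use 2-byte rounding, the method described in the v1.9.2 NEWS item.
setNumericRounding(2)

DT <- data.table(V1 = c(0.0000000000000000000000000001, 2, 9999, 0))

# Frequency table: one row per distinct (rounded) value of V1.
counts <- DT[, .N, by = V1]
print(counts)

# Restore the previous setting.
setNumericRounding(old_rounding)
```

If 1e-28 and 0 are kept apart, counts has four rows, each with N equal to 1.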
In short, the answer to the question is: upgrade from v1.8.10 to v1.9.2 or greater.