R - Grouping very small numbers (e.g. 1e-28) and 0.0 in data.table v1.8.10 vs v1.9.2

September 15, 2013

I noticed that frequency tables created by data.table in R do not seem to distinguish between very small numbers and zero. Can I change this behavior, or is this a bug?

Reproducible example:

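    library(data.table)
    # One unnamed numeric column; data.table names it V1 automatically
    dt <- data.table(c(0.0000000000000000000000000001, 2, 9999, 0))
    # Unique values of the column
    test1 <- as.data.frame(unique(dt[, V1]))
    # Frequency table: number of rows (.N) per distinct value of V1
    test2 <- dt[, .N, by = V1]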

As you can see, the frequency table (test2) does not recognize the difference between 0.0000000000000000000000000001 and 0, and puts both observations in the same class.

data.table version: 1.8.10
R: 3.0.2

It is worth reading R FAQ 7.31 and thinking about the accuracy of floating-point representations.
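As a small illustration of the FAQ 7.31 point, here is a base R sketch (it does not use data.table, and the variable names are only for illustration). It shows that exact comparison of doubles can fail where a tolerance-based comparison succeeds, and that 1e-28 is exactly distinguishable from 0 even though it falls within `all.equal`'s default tolerance.

```r
# Exact comparison of doubles can surprise: 0.1 + 0.2 is not stored as 0.3
x <- 0.1 + 0.2
exact_equal <- (x == 0.3)                  # FALSE
near_equal  <- isTRUE(all.equal(x, 0.3))   # TRUE, compared within a tolerance

# 1e-28 is a perfectly representable non-zero double ...
tiny <- 1e-28
tiny_is_zero <- (tiny == 0)                # FALSE

# ... but it lies within all.equal's default tolerance of zero
tiny_near_zero <- isTRUE(all.equal(tiny, 0))   # TRUE

print(c(exact_equal, near_equal, tiny_is_zero, tiny_near_zero))
```

So whether 1e-28 and 0 count as "the same" depends entirely on which notion of equality the grouping code uses.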

I can't reproduce this in the current CRAN version (1.9.2), using:

R version 3.0.3 (2014-03-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)

My guess is that the change in behaviour is related to this NEWS item:

    o Numeric data is still joined and grouped within tolerance as before, but instead of the tolerance being sqrt(.Machine$double.eps) == 1.490116e-08 (the same as base::all.equal's default), the significand is now rounded to the last 2 bytes, apx 11 s.f. This is more appropriate for large (1.23e20) and small (1.23e-20) numerics and is faster via a simple bit twiddle. A few functions provided a 'tolerance' argument, but this wasn't being passed through, so it has been removed. We aim to add a global option (e.g. 2, 1 or 0 byte rounding) in a future release.
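To make the difference between the two schemes concrete, here is a rough base R sketch. It is not data.table's implementation: the NEWS item describes rounding the binary significand by bit manipulation, whereas this sketch uses `signif(x, 11)` as a decimal stand-in for "about 11 significant figures". The variable names are illustrative only.

```r
# Old scheme (as described in the NEWS item): a fixed absolute tolerance
tol  <- sqrt(.Machine$double.eps)    # about 1.490116e-08
tiny <- 1e-28

# Under an absolute tolerance, 1e-28 is indistinguishable from 0
same_by_tolerance <- abs(tiny - 0) < tol                # TRUE

# Stand-in for the new scheme: compare after keeping ~11 significant figures.
# signif() is an assumption used for illustration, not what data.table calls.
same_by_sigfig <- signif(tiny, 11) == signif(0, 11)     # FALSE

# Values that differ only beyond the 11th significant figure still match
a <- 1.23e-20
b <- a * (1 + 1e-14)
close_by_sigfig <- signif(a, 11) == signif(b, 11)       # TRUE

print(c(same_by_tolerance, same_by_sigfig, close_by_sigfig))
```

Rounding by significant figures scales with the magnitude of the number, which is why it separates 1e-28 from 0 while still absorbing representation noise in the trailing digits.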

Update from Matt

Yes, this was a deliberate change in v1.9.2, and data.table now distinguishes 0.0000000000000000000000000001 from 0 (as user3340145 rightly thought it should) due to the improved rounding method highlighted above from NEWS.

I've also added the for loop test from Rick's answer to the test suite.

Btw, #5369 is now implemented in v1.9.3 (although neither of these is needed for this question):

    o bit64::integer64 now works in grouping and joins, #5369. Thanks to James Sams for highlighting UPCs.

    o New function setNumericRounding() may be used to reduce to 1 byte or 0 byte rounding when joining to or grouping columns of type 'numeric', #5369. See example in ?setNumericRounding and the NEWS item from v1.9.2. getNumericRounding() returns the current setting.
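A minimal sketch of how these two functions might be used, assuming a data.table version that provides them (v1.9.3 or later according to the NEWS item above). The default rounding level may differ between versions, so the sketch sets it explicitly and restores the previous value afterwards; the names `old`, `DT` and `res` are only for illustration.

```r
library(data.table)

# Remember the current setting so it can be restored afterwards
old <- getNumericRounding()

# Explicitly request the 2-byte rounding described in the v1.9.2 NEWS item
setNumericRounding(2)

DT <- data.table(V1 = c(0.0000000000000000000000000001, 2, 9999, 0))

# Frequency table: one row per distinct value of V1
res <- DT[, .N, by = V1]
print(res)

# Restore whatever rounding level was in effect before
setNumericRounding(old)
```

Even with 2-byte rounding, 1e-28 and 0 should land in separate groups, because the rounding applies to the significand rather than to the absolute value.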

Notice that the rounding is now (as from v1.9.2) about the accuracy of the significand; i.e. the number of significant figures. 0.0000000000000000000000000001 == 1.0e-28 is accurate to just 1 s.f., so the new rounding method doesn't group it with 0.0.

In short, the answer to the question is: upgrade from v1.8.10 to v1.9.2 or greater.
