Hmisc - How to assign "cut" range midpoints in R?
I am using cut to divide my data into bins, which gives each resulting bin in the form (x1,x2]. Can anyone tell me how I might make a new column that expresses these bins as the midpoint of the bin? For example, given the following data frame:

    test <- structure(list(x = c(1L, 4L, 6L, 7L, 8L, 9L, 12L, 18L, 19L), y = 1:9),
                      .Names = c("x", "y"), class = "data.frame",
                      row.names = c(NA, -9L))

I can use

    test$xrange <- cut(test$x, breaks = seq(0, 20, 5))

to give

       x y  xrange
    1  1 1   (0,5]
    2  4 2   (0,5]
    3  6 3  (5,10]
    4  7 4  (5,10]
    5  8 5  (5,10]
    6  9 6  (5,10]
    7 12 7 (10,15]
    8 18 8 (15,20]
    9 19 9 (15,20]

but the result I need should instead look like:

       x y  xrange xmidpoint
    1  1 1   (0,5]       2.5
    2  4 2   (0,5]       2.5
    3  6 3  (5,10]       7.5
    4  7 4  (5,10]       7.5
    5  8 5  (5,10]       7.5
    6  9 6  (5,10]       7.5
    7 12 7 (10,15]      12.5
    8 18 8 (15,20]      17.5
    9 19 9 (15,20]      17.5

I've done some searching and came upon a similar question, "Divide a range of values in bins of equal length: cut vs cut2", which gives a solution as

    cut2 <- function(x, breaks) {
      r <- range(x)
      b <- seq(r[1], r[2], length = 2 * breaks + 1)
      brk <- b[0:breaks * 2 + 1]
      mid <- b[1:breaks * 2]
      brk[1] <- brk[1] - 0.01
      k <- cut(x, breaks = brk, labels = FALSE)
      mid[k]
    }

but when I try it on my case, using

    test$xmidpoint <- cut2(test$x, 5)

it does not return the correct midpoints. Perhaps I am entering the breaks incorrectly in cut2? Can anyone tell me what I'm doing incorrectly?

Unless I'm missing something, this looks valid:

    brks = seq(0, 20, 5)
    ints = findInterval(test$x, brks, all.inside = TRUE)

    # mapply(function(x, y) (x + y) / 2, brks[ints], brks[ints + 1])  # which is ridiculous
    # [1] 2.5 2.5 7.5 7.5 7.5 7.5 12.5 17.5 17.5

    (brks[ints] + brks[ints + 1]) / 2  # as sgibb noted
    # [1] 2.5 2.5 7.5 7.5 7.5 7.5 12.5 17.5 17.5

    (head(brks, -1) + diff(brks) / 2)[ints]  # or using thelatemail's idea from the comments
    # [1] 2.5 2.5 7.5 7.5 7.5 7.5 12.5 17.5 17.5
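
One thing that may be worth watching: by default cut builds right-closed bins (a,b], while findInterval builds left-closed bins [a,b), so a value sitting exactly on a break (say x = 5) can end up with the midpoint of the neighbouring bin. A minimal sketch that sidesteps this by taking the bin index from cut itself is below. The helper name bin_midpoints is made up for illustration and is not from the answer above, and the sketch assumes explicit, sorted breaks.

```r
# Hypothetical helper (name is an assumption, not part of the original answer):
# returns the midpoint of the cut() bin that each value of x falls into.
bin_midpoints <- function(x, brks) {
  # Midpoint of each consecutive pair of breaks
  mids <- head(brks, -1) + diff(brks) / 2
  # labels = FALSE makes cut() return integer bin codes instead of a factor,
  # using the same (a,b] convention as the default cut() call
  idx <- cut(x, breaks = brks, labels = FALSE)
  mids[idx]
}

# The data frame from the question
test <- data.frame(x = c(1L, 4L, 6L, 7L, 8L, 9L, 12L, 18L, 19L), y = 1:9)
brks <- seq(0, 20, 5)

test$xrange <- cut(test$x, breaks = brks)
test$xmidpoint <- bin_midpoints(test$x, brks)
test$xmidpoint
# [1]  2.5  2.5  7.5  7.5  7.5  7.5 12.5 17.5 17.5
```

With this version, values outside the breaks come back as NA (just as cut gives NA for them) rather than being pushed into the first or last bin the way all.inside = TRUE does. If you would rather stay with findInterval, its left.open = TRUE argument should give the same (a,b] convention.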