java - How to write Hadoop MapReduce programs in Scala -
I am writing a MapReduce application in Scala. The map function works fine, but I am facing a problem while writing the reducer:

override def reduce(key: Text, values: java.lang.Iterable[Text], context: ReducerContext) { }

Here ReducerContext is a type alias I defined to refer to the Context inner class, so that part is fine. The issue is the java.lang.Iterable[Text] parameter: I am not able to iterate through it. I understand that I first have to convert it to a Scala Iterable and then iterate over it, which I did, but I still didn't get the result.
I have tried both scala.collection.JavaConverters._ and JavaConversions._. Here are a few scenarios that didn't work out:

val jit: java.util.Iterator[Text] = values.iterator()
val abc = JavaConversions.asScalaIterator(jit) // or: val abc = jit.asScala
println("size " + abc.size) // displays the proper size
for (temp <- abc) {
  // it doesn't come inside the loop
}
Similarly, I have tried converting the iterator to a List/Array, in vain. Once I convert it (toList/toArray), the size of the resulting List/Array becomes 0. No matter what I do, I am not able to iterate through it.

I would appreciate any help on this.

Thanks
You can import JavaConversions to convert the Iterable automatically:

import scala.collection.JavaConversions._

If you still have a problem, can you paste your code?
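A minimal sketch of what that looks like inside a reducer. The word-count-style job (IntWritable counts, a SumReducer class) and the ReducerContext alias are assumptions for illustration, not from the question:

```scala
import java.lang.{Iterable => JIterable}
import org.apache.hadoop.io.{IntWritable, Text}
import org.apache.hadoop.mapreduce.Reducer

// Hypothetical word-count reducer: sums the counts for each key.
class SumReducer extends Reducer[Text, IntWritable, Text, IntWritable] {
  // Type alias for the inner Context class, as in the question.
  type ReducerContext = Reducer[Text, IntWritable, Text, IntWritable]#Context

  override def reduce(key: Text, values: JIterable[IntWritable],
                      context: ReducerContext): Unit = {
    import scala.collection.JavaConversions._
    // The implicit conversion lets the Java Iterable be used
    // directly in a Scala for comprehension.
    var sum = 0
    for (value <- values) {
      sum += value.get()
    }
    context.write(key, new IntWritable(sum))
  }
}
```

Note the import is placed inside the method here; placing it at the top of the file works just as well.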
The tricky thing is that the values you receive in reduce can only be traversed once. Calling abc.size traverses values; after that, values is invalid.

So the correct code should be:

// don't use values here
for (value <- values) {
  // val v: String = value.toString
  // don't save value itself: it is reused, so its content changes
  // while the reference stays the same
}
// don't use values here either
As mentioned in the comment, the type of value is Text. When you traverse values, the content of value is changed but the reference stays the same. Don't try to save value in a collection, or all the items of the collection will end up the same.
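If you do need to keep the values around, copy their contents out before storing them. A sketch of the safe pattern; the collectValues helper is hypothetical, and Text is Hadoop's org.apache.hadoop.io.Text:

```scala
import java.lang.{Iterable => JIterable}
import scala.collection.JavaConversions._
import scala.collection.mutable.ArrayBuffer
import org.apache.hadoop.io.Text

// Hypothetical helper, called from inside reduce(key, values, context).
def collectValues(values: JIterable[Text]): Seq[String] = {
  val saved = ArrayBuffer[String]()
  for (value <- values) {
    saved += value.toString // toString copies the bytes into a new String
  }
  saved
  // Unsafe alternative: values.toList would store the same reused Text
  // object repeatedly, so every element would show the final value.
}
```

Copying to String (or `new Text(value)`) breaks the aliasing, because each stored element owns its own data instead of pointing at the single Text instance Hadoop recycles across iterations.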