hadoop - MapReduce: multiple mappers and reducers


I have CSV files with data as follows:

lat,lng
18.1234,77.3443
18.345,77.335
18.356,77.345

Each file contains latitude and longitude, and each CSV file is up to 1 MB. I need to calculate the distance between the latitude/longitude of the first record and that of the second record in the CSV,

i.e. between 18.1234,77.3443 and 18.345,77.335.

But the mapper reads one line at a time, so I am thinking of adding a delimiter ('|') between lines, so that the records of the CSV file above become a single line that is the input to the mapper:

key -> filename, value -> the CSV records as one line of Text (all records separated by the delimiter):

filename  18.1234,77.3443|18.345,77.335|18.356,77.345....
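To make the idea concrete, here is a minimal sketch in plain Java (no Hadoop classes) of how the rows of one file could be joined into that single line. The class and method names (`CsvFlattener`, `flatten`) are made up for illustration, and the sketch assumes the header is literally `lat,lng`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper, not part of Hadoop: turns the rows of one CSV file
// into a single '|'-delimited line.
class CsvFlattener {

    // Joins the data rows with '|', skipping blank lines and the header.
    // Assumption: the header line is exactly "lat,lng" (case-insensitive).
    static String flatten(List<String> lines) {
        StringJoiner joined = new StringJoiner("|");
        for (String line : lines) {
            String row = line.trim();
            if (row.isEmpty() || row.equalsIgnoreCase("lat,lng")) {
                continue;
            }
            joined.add(row);
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "lat,lng", "18.1234,77.3443", "18.345,77.335", "18.356,77.345");
        // Prints the three data rows joined by '|'.
        System.out.println(flatten(lines));
    }
}
```

The joined string would then be the value paired with the file name as key.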

In the reducer I split on the delimiter and calculate the distance between subsequent records (first and second coordinates, and so on).
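The reducer-side step could look roughly like the sketch below, again in plain Java so it runs without a cluster. The question does not say which distance formula is meant; this assumes great-circle (haversine) distance with a mean Earth radius of 6371 km. The names `TrackDistance`, `haversineKm` and `consecutiveDistancesKm` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a '|'-delimited record line and computes the
// distance between each pair of consecutive coordinates.
class TrackDistance {

    // Assumption: mean Earth radius in kilometres.
    static final double EARTH_RADIUS_KM = 6371.0;

    // Great-circle distance between two points given in decimal degrees.
    static double haversineKm(double lat1, double lng1, double lat2, double lng2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLng = Math.toRadians(lng2 - lng1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLng / 2) * Math.sin(dLng / 2);
        // Clamp to 1.0 to guard against rounding slightly above the asin domain.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    // Returns one distance per pair of consecutive records, so n records
    // give n - 1 distances (an empty list for zero or one record).
    static List<Double> consecutiveDistancesKm(String joinedRecords) {
        String[] records = joinedRecords.split("\\|");
        List<Double> distances = new ArrayList<>();
        for (int i = 1; i < records.length; i++) {
            String[] prev = records[i - 1].split(",");
            String[] curr = records[i].split(",");
            distances.add(haversineKm(
                    Double.parseDouble(prev[0].trim()), Double.parseDouble(prev[1].trim()),
                    Double.parseDouble(curr[0].trim()), Double.parseDouble(curr[1].trim())));
        }
        return distances;
    }

    public static void main(String[] args) {
        String value = "18.1234,77.3443|18.345,77.335|18.356,77.345";
        for (double km : consecutiveDistancesKm(value)) {
            System.out.printf("%.3f km%n", km);
        }
    }
}
```

Note that `split("\\|")` escapes the pipe, since `String.split` takes a regular expression and a bare `|` means alternation.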

So if I have 30 CSV files, I want 30 mappers and 30 reducers to process them. I also need to store the data in MySQL, as lat,lng,distance.

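If each CSV file is smaller than the default block size, each file is handled by a single mapper, so you can get the id of the current mapper and emit it as the key.

I believe you can get the id with conf.get("mapred.tip.id") from the mapper's configuration.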

