perl - Compare lines in a file

January 15, 2011

I have a large dataset that looks like this:

identifier,feature 1, feature 2, feature 3, ...
29239999, 2,5,3,...
29239999, 2,4,3,...
29239999, 2,6,7,...
17221882, 2,6,7,...
17221882, 1,1,7,...

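I want to write a script that groups these lines by identifier (so the first 3 and the last 2 would be grouped) in order to compare them. So, for example, of the 3 lines for 29239999, I would take one of the 2 where feature 3 is 3, and the last one, where feature 3 is 7. In particular, I want to take the one that has the largest feature 2 (that would be the third line for 29239999).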

My specific question: of the 2 options, (1) using hashes and (2) making each identifier an object and comparing them, which is best?

If you are working with a "large" data set and the data is already grouped by ID, as in your example, I would suggest processing the groups as you go instead of building a huge hash.

use strict;
use warnings;

# Skip the header row
<DATA>;

my @group;
my $lastid = '';

while (<DATA>) {
    chomp;
    my ($id, $data) = split /,\s*/, $_, 2;

    # A new identifier means the previous group is complete
    if ($id ne $lastid) {
        processdata($lastid, @group);
        @group = ();
    }

    push @group, $data;
    $lastid = $id;
}

# Flush the final group
processdata($lastid, @group);

sub processdata {
    my $id = shift;

    # Nothing to do for an empty group (e.g. the very first call)
    return if ! @_;

    print "$id " . scalar(@_) . "\n";

    # Rest of code here
}

__DATA__
identifier,feature 1, feature 2, feature 3, ...
29239999, 2,5,3,...
29239999, 2,4,3,...
29239999, 2,6,7,...
17221882, 2,6,7,...
17221882, 1,1,7,...

Outputs:

29239999 3
17221882 2
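
The "rest of code here" placeholder in processdata is where the per-group comparison would go. As a minimal sketch, assuming the goal is the one stated in the question (keep the row with the largest feature 2), something like the following might work. The helper name pick_largest_feature2 is hypothetical, and the sample rows drop the elided trailing features for simplicity.

```perl
use strict;
use warnings;

# Hypothetical helper: given the feature strings collected for one
# identifier, return the one whose second feature is numerically largest.
# On a tie, the earliest row is kept.
sub pick_largest_feature2 {
    my @rows = @_;
    my ($best, $best_value);

    for my $row (@rows) {
        my @features = split /,\s*/, $row;
        my $value    = $features[1];    # feature 2 is the second field
        next unless defined $value;

        if (!defined $best_value || $value > $best_value) {
            ($best, $best_value) = ($row, $value);
        }
    }

    return $best;
}

# Sample rows for identifier 29239999 (assumed to have three features each)
my @group = ('2,5,3', '2,4,3', '2,6,7');
print pick_largest_feature2(@group), "\n";    # prints 2,6,7
```

Because only one group is held in memory at a time, this keeps the streaming approach intact; calling the helper as pick_largest_feature2(@_) inside processdata, after the $id has been shifted off, would be one way to wire it in.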
