if statement - How to find text in data file and calculate average using Perl
I want to replace a grep | awk | perl command with a pure Perl solution to make it quicker and simpler to run.
I want to match each line in input.txt against the data.txt file and calculate the average of the values for the matched ID names and numbers.
input.txt contains one column of ID numbers:
    fbgn0260798
    fbgn0040007
    fbgn0046692
I want to match each ID number to its corresponding ID names and associated value. Here's an example of data.txt, where column 1 is the ID number, columns 2 and 3 are ID name1 and ID name2, and column 4 contains the values I want to average.
    fbgn0260798 cg17665 cg17665 21.4497
    fbgn0040007 gprk1   cg40129 22.4236
    fbgn0046692 rpl38   cg18001 1182.88
So far I have used grep and awk to produce an output file containing the matched ID numbers and their values, and then used that output file to calculate the count and average with the following commands:
    # first part using grep | awk
    exec < input.txt
    while read line
    do
        grep -w $line data.txt | cut -f1,2,3,4 | awk '{print $1,$2,$3,$4} ' >> output.txt
    done

    # second part perl
    open my $input, '<', "output_1.txt" or die; ## the output file is from the first part and has the same layout as the data.txt file
    my $total = 0;
    my $count = 0;
    while (<$input>) {
        my ($name, $id1, $id2, $value) = split;
        $total += $value;
        $count += 1;
    }
    print "the total is $total\n";
    print "the count is $count\n";
    print "the average is ", $total / $count, "\n";
Both parts work OK, but I would like to simplify things by running just one script. I've been trying to find a quicker way of running the whole lot in Perl but, after several hours of reading, I'm totally stuck on how to do it. I've been playing around with hashes, arrays, and if and elsif statements without any success. If anyone has suggestions etc., that would be great.
Thanks, Harriet
If I understand you, you have a data file that contains the name of each line and the value for that line. The other two IDs are not important.
You use another file, the input file, which contains the names to match in the data file. The values for these names are the ones you want to average.
The fastest way is to create a hash keyed by the names, where each value is the value for that name in the data file. Because it's a hash, you can quickly locate the corresponding value. This is faster than grepping the same array over and over again.
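To make the lookup idea concrete, here is a minimal sketch that uses the three sample rows from the question. The `build_lookup` sub and the `%value_for` name are made up for illustration; they are not part of the original script.

```perl
use strict;
use warnings;

# Sample rows copied from the question's data.txt (whitespace separated).
my @rows = (
    "fbgn0260798 cg17665 cg17665 21.4497",
    "fbgn0040007 gprk1 cg40129 22.4236",
    "fbgn0046692 rpl38 cg18001 1182.88",
);

# Hypothetical helper: build the lookup table once,
# mapping the first column (ID) to the fourth column (value).
sub build_lookup {
    my @lines = @_;
    my %value_for;
    for my $line (@lines) {
        my ($id, undef, undef, $value) = split ' ', $line;
        $value_for{$id} = $value;
    }
    return \%value_for;
}

my $value_for = build_lookup(@rows);

# Each lookup is a single hash access rather than a scan of every row.
print "$value_for->{fbgn0040007}\n";    # prints 22.4236
```

The rows are scanned once to build the hash, and after that each ID costs one hash access, whereas the grep loop rereads data.txt for every line of input.txt.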
The first part reads in the data.txt file and stores the name and value in a hash keyed by the name.
    use strict;
    use warnings;
    use autodie;    # This way, you don't have to check whether the file could be opened
    use feature qw(say);

    use constant {
        input_file => "input.txt",
        data_file  => "data.txt",
    };

    #
    # Read in data.txt and get the values and keys
    #
    open my $data_fh, "<", data_file;
    my %ids;
    while ( my $line = <$data_fh> ) {
        chomp $line;
        my ($name, $id1, $id2, $value) = split /\s+/, $line;
        $ids{$name} = $value;
    }
    close $data_fh;
Now that you have this hash, it's easy to read through the input.txt file and locate the matching name from the data.txt file:
    open my $input_fh, "<", input_file;
    my $count = 0;
    my $total = 0;
    while ( my $name = <$input_fh> ) {
        chomp $name;
        if ( not defined $ids{$name} ) {
            die qq(Cannot find matching ID "$name" in data file\n);
        }
        $total += $ids{$name};
        $count += 1;
    }
    close $input_fh;
    say "Average = ", $total / $count;
You read through each file only once. I am assuming that you have only a single instance of each name in each file.
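If that assumption might not hold, or if some IDs in input.txt could be missing from data.txt, the averaging step can be made more forgiving. The following is only a sketch of one possible variation: the `average_for` helper is hypothetical, and skipping unknown IDs with a warning (instead of dying) is a design choice of this sketch, not what the script above does. If a name appears more than once in data.txt, a plain hash keeps only the last value read.

```perl
use strict;
use warnings;

# Hypothetical helper: average the values of the requested IDs.
# IDs missing from the lookup are skipped with a warning instead of dying.
sub average_for {
    my ($value_for, @wanted) = @_;
    my ($total, $count) = (0, 0);
    for my $id (@wanted) {
        if ( not exists $value_for->{$id} ) {
            warn qq(No entry for "$id" in data file, skipping\n);
            next;
        }
        $total += $value_for->{$id};
        $count++;
    }
    # Avoid dividing by zero when nothing matched.
    my $average = $count ? $total / $count : undef;
    return ($total, $count, $average);
}

# Two values taken from the question's sample data, plus one ID that is not there.
my %value_for = ( fbgn0260798 => 21.4497, fbgn0040007 => 22.4236 );
my ($total, $count, $average)
    = average_for( \%value_for, qw(fbgn0260798 fbgn0040007 not_in_data) );
print "count = $count\n";    # prints count = 2
```

Whether to skip or die on a missing ID depends on how much you trust the input. Dying, as in the script above, is the safer default when every ID is expected to be present.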