unix - Remove all lines from file with duplicate value in field, including the first occurrence
I want to remove the lines in a data file that contain a value in column 2 that is repeated in column 2 of other lines.
I've sorted by the value in column 2, but can't figure out how to use uniq on just one field when the values are not all of the same length.
Alternatively, I can remove lines with a duplicate using an awk one-liner like
awk -F"[,]" '!_[$2]++'
but this retains the line with the first occurrence of the repeated value in column 2.
As an example, if my data is
a,b,c
c,b,a
d,e,f
h,i,j
j,b,h
I want to remove all lines (including the first) where b occurs in the second column. Something like this:
d,e,f
h,i,j
Thanks for any advice!!
If order is not important, then the following should work:
awk -F, '!seen[$2]++ { line[$2] = $0 } END { for (val in seen) if (seen[val] == 1) print line[val] }' file
Output:
h,i,j
d,e,f
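If the original line order does matter, one possible alternative is to let awk read the file twice: the first pass counts each column 2 value, and the second pass prints only the lines whose value was seen exactly once. This is just a sketch, not part of the answer above; the function name unique_by_col2 and the temporary file are made up for illustration, and it assumes the input is a regular file (not a pipe), since it has to be read twice.

```shell
# Hypothetical helper: print lines whose column-2 value occurs only once,
# keeping the original order. Takes the path of a comma-separated file.
unique_by_col2() {
    # NR == FNR is true only while awk reads the first copy of the file.
    # Pass 1: count occurrences of each column-2 value.
    # Pass 2: print lines whose column-2 count is exactly 1.
    awk -F, 'NR == FNR { count[$2]++; next } count[$2] == 1' "$1" "$1"
}

# Demo with the sample data from the question.
data=$(mktemp)
printf '%s\n' 'a,b,c' 'c,b,a' 'd,e,f' 'h,i,j' 'j,b,h' > "$data"
unique_by_col2 "$data"
rm -f "$data"
```

With the sample data this should print d,e,f followed by h,i,j, in the same order as they appear in the file.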