shell - Regex to replace single quote up to a constant
I have a CSV file that I need to load into Postgres.
The values look like this:
    date, col1, col2, col3, col4
    20140101,value1, value2, value3, "http//,bar"
    20130102,value1, value2, value3, "http//,bar"
    20130103,value1, value2", value3, "http//,xxx"
Some of the data is bad: as shown on the third data line, value2 has a double quote at the end of it.
Unfortunately, I have no control over the input data.
Using the Postgres COPY command with "as csv", I get an error about an unterminated quote.
I cannot simply remove all quotes in the file, because the last column (col4) has commas embedded in its value, and the comma is the delimiter.
Basically, I'm looking for some type of sed script that can delete all occurrences of a double quote up to the first occurrence of "http//..... The last column will always have "http//" in it, so I'm using that as a constant.
In the example above, data lines 1 and 2 are correct. Data line 3 should change from

    20130103,value1, value2", value3, "http//,xxx"

to

    20130103,value1, value2, value3, "http//,xxx"
It is easier using perl, since it supports lookahead:
    perl -pe 's/"(?=.*?"http)//g' file.csv

    date, col1, col2, col3, col4
    20140101,value1, value2, value3, "http//,bar"
    20130102,value1, value2, value3, "http//,bar"
    20130103,value1, value2, value3, "http//,xxx"
Or using awk, splitting each line on "http and setting OFS to the same string so the separator is put back unchanged when the record is rebuilt:

    awk -F'"http' -v OFS='"http' 'index($1, "\"") { gsub(/"/, "", $1) } 1' file

    date, col1, col2, col3, col4
    20140101,value1, value2, value3, "http//,bar"
    20130102,value1, value2, value3, "http//,bar"
    20130103,value1, value2, value3, "http//,xxx"
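Since the question asked for sed, which has no lookahead, a substitution loop can approximate the same idea. This is only a sketch, assuming a POSIX-style sed and that every data line contains "http exactly once in the last column; the function name fix_quotes is made up for illustration.

```shell
# fix_quotes (hypothetical name): delete every double quote that is
# followed, somewhere later on the same line, by the constant "http.
# Reads the files given as arguments, or stdin if there are none.
fix_quotes() {
  # :a  defines a label.
  # s/// removes the first quote on the line, but only if a later "http exists.
  # ta  jumps back to :a while a substitution was made, so all stray quotes go.
  sed -e ':a' -e 's/^\([^"]*\)"\(.*"http\)/\1\2/' -e 'ta' "$@"
}

# Sample rows taken from the question.
printf '%s\n' \
  'date, col1, col2, col3, col4' \
  '20140101,value1, value2, value3, "http//,bar"' \
  '20130103,value1, value2", value3, "http//,xxx"' | fix_quotes
```

The header and the already-correct row pass through untouched, and the stray quote after value2 is dropped. The quotes around the last column survive because no further "http follows them.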