C# - CSV text file parser with TextFieldParser - MalformedLineException
I am working on a CSV parser using the C# TextFieldParser class.
My CSV data is delimited by , and strings are enclosed by a " character.
However, a data row cell can itself contain a " character, which appears to make the parser throw an exception.
This is my C# code so far:
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.IO;
    using Microsoft.VisualBasic.FileIO;

    namespace csv_parser
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Init
                string csv_file = "test.csv";

                // Proceed only if the file is found
                if (File.Exists(csv_file))
                {
                    // Test
                    parse_csv(csv_file);
                }

                // Finished
                Console.WriteLine("Press any key to exit ...");
                Console.ReadKey();
            }

            static void parse_csv(string filename)
            {
                using (TextFieldParser parser = new TextFieldParser(filename))
                {
                    parser.TextFieldType = FieldType.Delimited;
                    parser.SetDelimiters(",");
                    parser.TrimWhiteSpace = true;

                    while (!parser.EndOfData)
                    {
                        string[] fieldRow = parser.ReadFields();
                        foreach (string fieldRowCell in fieldRow)
                        {
                            // todo
                        }
                    }
                }
            }
        }
    }
This is the content of the test.csv file:

    " dummy test"s data", b , c
    d,e,f
    gh,ij
What is the best way to deal with a " in the row cell data?
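To see the failure in isolation, here is a minimal sketch that feeds rows to a TextFieldParser with its default quote handling and catches the MalformedLineException. The class name MalformedLineDemo, the helper TryParse, and the use of a StringReader in place of the file are assumptions made for illustration, not part of the original code.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

static class MalformedLineDemo
{
    // Hypothetical helper: returns true if every row parsed,
    // false as soon as a row is reported as malformed.
    public static bool TryParse(string csvText)
    {
        using (var parser = new TextFieldParser(new StringReader(csvText)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            // HasFieldsEnclosedInQuotes is left at its default (true),
            // so a field starting with " must end with " before the delimiter.

            try
            {
                while (!parser.EndOfData)
                {
                    string[] fields = parser.ReadFields();
                    Console.WriteLine(string.Join(" | ", fields));
                }
                return true;
            }
            catch (MalformedLineException ex)
            {
                // ErrorLine and ErrorLineNumber describe the row that failed.
                Console.WriteLine("Malformed line " + parser.ErrorLineNumber + ": " + parser.ErrorLine);
                Console.WriteLine(ex.Message);
                return false;
            }
        }
    }

    static void Main()
    {
        // First row of test.csv: the quote after "test" is not followed by a delimiter.
        TryParse("\" dummy test\"s data\", b , c");
    }
}
```

Catching the exception only reports the bad row; it does not recover the fields from it.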
update
Based on Tim Schmelter's answer, I have modified my code to the following:
    static void parse_csv(string filename)
    {
        using (TextFieldParser parser = new TextFieldParser(filename))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = false;
            parser.TrimWhiteSpace = true;

            while (parser.PeekChars(1) != null)
            {
                // Strip surrounding spaces and quote characters from each cell
                var cleanFieldRowCells = parser.ReadFields().Select(
                    f => f.Trim(new[] { ' ', '"' }));

                Console.WriteLine(string.Join(" | ", cleanFieldRowCells));
            }
        }
    }
This appears to produce the correct output.
Is this the best way to deal with strings that are enclosed by quotes and also contain quotes?
Could you omit the quoting character by setting HasFieldsEnclosedInQuotes to false?
    using (var parser = new TextFieldParser(@"path"))
    {
        parser.HasFieldsEnclosedInQuotes = false;
        parser.Delimiters = new[] { "," };

        while (parser.PeekChars(1) != null)
        {
            string[] fields = parser.ReadFields();
        }
    }
You can remove the quotes manually:
    var cleanFields = fields.Select(f => f.Trim(new[] { ' ', '"' }));
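Putting the two pieces of this answer together, a self-contained sketch might look like the following. It reads the sample rows from the question through a StringReader instead of a file; the class name QuoteTrimDemo and the helper ParseLoose are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.VisualBasic.FileIO;

static class QuoteTrimDemo
{
    // Hypothetical helper: splits on commas only, then trims spaces and quotes per field.
    public static List<string[]> ParseLoose(string csvText)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(new StringReader(csvText)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = false; // treat " as an ordinary character
            parser.TrimWhiteSpace = true;

            while (!parser.EndOfData)
            {
                string[] cleanFields = parser.ReadFields()
                    .Select(f => f.Trim(' ', '"'))
                    .ToArray();
                rows.Add(cleanFields);
            }
        }
        return rows;
    }

    static void Main()
    {
        // Sample rows from the question, inlined so the sketch needs no file.
        string csv = "\" dummy test\"s data\", b , c\nd,e,f\ngh,ij";
        foreach (string[] row in ParseLoose(csv))
        {
            Console.WriteLine(string.Join(" | ", row));
        }
    }
}
```

One trade-off to keep in mind: with HasFieldsEnclosedInQuotes set to false, a comma inside a quoted field is treated as a delimiter and splits that field, and Trim also strips any quote that legitimately begins or ends a value. This approach fits data like the sample above, where quoted fields never contain commas.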