CsvHelper: Improperly escaped quotes in CSV not detected, remaining record fields mismapped

Created on 25 Jun 2019 · 1 Comment · Source: JoshClose/CsvHelper

When parsing the misquoted record below, BadDataFound is not invoked, and every field after the bad one is shifted into the wrong column.

The field with the issue is:
"RUSSELL "HILLS," "

* data *

Owner,Store_Name,Store_Num,Contact_Name,Address,Address2,City,State_Cd,Postal_Cd,Country,Store_Phone_Number
"28941","SOMESTORE & HILLS INC                   ","28941","RUSSELL "HILLS,"     ","123 Main St                             ","                                        ","Springfield                                 ","NY","12345-1111","US","(781) 555-1111"

* Result *

Key                Value
---                -----
Owner              28941
Store_Name         SOMESTORE & HILLS INC
Store_Num          28941
Contact_Name       RUSSELL HILLS
Address
Address2           123 Main St
City
State_Cd           Springfield
Postal_Cd          NY
Country            12345-1111
Store_Phone_Number US

* Expected Result *
This row should be flagged as bad data, and I should be able to handle it through BadDataFound.

bug

Most helpful comment

I believe the code linked below controls the observed behavior. After the parser reads the second quote (the one that closes the quoted section), it keeps reading the rest of the field as if it were a normal, unquoted field. I agree with @jasonchester that the desired behavior is to treat the field as bad data when the character following that second quote is neither a delimiter nor another quote (an escaped quote).
https://github.com/JoshClose/CsvHelper/blob/2914f6856febbf7488c53ed65948a00a39f95a22/src/CsvHelper/CsvParser.cs#L695-L700
RFC 4180 seems to indicate that a field cannot be partially quoted:

  5. Each field may or may not be enclosed in double quotes (however
    some programs, such as Microsoft Excel, do not use double quotes
    at all). If fields are not enclosed with double quotes, then
    double quotes may not appear inside the fields.
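
To make the behavior described above concrete, here is a minimal Python sketch. It is not CsvHelper's code; `parse_record` is a hypothetical helper written for this illustration. It reads fields leniently, the way the comment describes, and also records the offset of any character that follows a closing quote without being a delimiter. A caller could treat a non-empty offset list as bad data.

```python
def parse_record(line, delimiter=",", quote='"'):
    """Split one CSV record into fields (hypothetical helper, illustration only).

    Returns (fields, bad_offsets). bad_offsets holds the index of every
    character that follows a closing quote but is not a delimiter.
    """
    fields = []
    bad_offsets = []
    i, n = 0, len(line)
    while True:
        buf = []
        if i < n and line[i] == quote:
            i += 1  # skip the opening quote
            while i < n:
                ch = line[i]
                if ch != quote:
                    buf.append(ch)
                    i += 1
                elif i + 1 < n and line[i + 1] == quote:
                    buf.append(quote)  # "" inside quotes is an escaped quote
                    i += 2
                else:
                    i += 1  # closing quote
                    break
            # After a closing quote, only a delimiter or end of record is valid.
            if i < n and line[i] != delimiter:
                bad_offsets.append(i)
        # Lenient part: read whatever is left up to the next delimiter
        # as plain, unquoted text.
        while i < n and line[i] != delimiter:
            buf.append(line[i])
            i += 1
        fields.append("".join(buf))
        if i >= n:
            break
        i += 1  # skip the delimiter
    return fields, bad_offsets


# A shortened version of the record from the report.
record = '"28941","RUSSELL "HILLS,"     ","123 Main St"'
fields, bad_offsets = parse_record(record)
print(fields)       # ['28941', 'RUSSELL HILLS', '     ', '123 Main St']
print(bad_offsets)  # [18]
```

The lenient read yields one more field than the quoting intended, which matches the shifted columns in the result above. The recorded offset marks the spot where a stricter parser could report the row as bad. The sketch deliberately leaves out other malformed input, such as unterminated quotes or quotes inside unquoted fields.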
