Csvhelper: Empty Unquoted String Fields Not Passed Through as NULL

Created on 9 Apr 2014  Â·  18Comments  Â·  Source: JoshClose/CsvHelper

I am having issues trying to read in CSV documents where a string field can be null.

For example a record looks like this...
ABC,,"",DEF,"GHI"

Five string fields should be read as...
0: "ABC"
1: null
2: ""
3: "DEF"
4: "GHI"

But CsvReader reads field 1 as "" but then I no longer can tell if it was originally supposed to be a null or an empty string.

Other value fields seem to handle nulls fine, basically "" is null. But for strings "" is not null. How do I preserve truly null strings?

I need to have string fields with zero length which are not quoted to pass through as null, not an empty string.

feature

Most helpful comment

You can do per field:

Map(m => m.Name).TypeConverterOption.NullValues("null");

Or for a whole type:

csv.Configuration.TypeConverterOptionsCache.GetOptions<string>().NullValues.Add("null");

All 18 comments

There is no way to do this. With a CSV file, there is no difference; it's just an empty field.

If you have control over the export, you could write a special value that means null, and when reading, look for that value and make it null.

Shouldn't a field with no content represent a null?
I don't see why field 1 and 2 in my example above is not the perfect example of how a null string is differentiated from an empty string. In a similar manner as indicating embedded quotes and line feeds.
Basically for each field parsed, I need to know if the raw field was quoted. Then if the field value is an empty string and it wasn't quoted then it is null. If it was quoted, then it is an empty string.
Seems logical to me at least.
Using a magic string like "NULL" or the like just seems like a bad idea.

No, a field with no content is just an empty string. Having quotes around a field doesn't mean anything. It just says that there may be things in the field like a quote, newline, or delimiter. Quotes are optional.

Changing this behavior would be changing how the CSV format works. It's a text file and should be treated as such. If a field has special meaning, you will need to determine how to convert the content.

I understand what you're saying, but it doesn't make sense to do based on the CSV format.

This may be a config option that I could incorporate in the future. Something like TreatEmptyFieldsAsNull. When this is turned on, it would do what you're asking.

Thank you for speedy feedback Josh
I am tempted to fork and implement it myself. I started having a nose around the code. But just at this moment I'm strapped for time with project deadlines.
Maybe in a week or too I'll have a crack at it. It shouldn't too onerous.

If you want a little more verbose way of doing this inside of the existing framework, you can always create a new String ITypeConverter that inherits from the default CsvHelper.TypeConversion.StringConverter, and overrides the ConvertFromString method as such:

public override object ConvertFromString(CsvHelper.TypeConversion.TypeConverterOptions options, string text)
{
    if (string.IsNullOrEmpty(text))
        return null;
    else
        return base.ConvertFromString(options, text);
}

Then explicitly specify that you want to use this overriden type converter for every field in which a null value is acceptable.

This feature would be great to have. I've used other csv readers and have had this behavior previously. I've started using your library recently to replace my old readers so would preferably like to have this null feature available.

Do you have any plans to implement this?

This feature would also help me out also.

I second a "treat empties as nulls" option. This makes records much more manageable with null coalescence, propagation, etc.

I know it's not the quite fix asked for, but you can write a generic
string => null handling type converter, or just use a call to ConvertUsing
in your class maps. They're both fairly easy to use for such situations,
I've found.

On Sat, Nov 14, 2015 at 1:05 PM, Daniel Rasmussen [email protected]
wrote:

I second a "treat empties as nulls" option. This makes records much more
manageable with null coalescence, propagation, etc.

—
Reply to this email directly or view it on GitHub
https://github.com/JoshClose/CsvHelper/issues/252#issuecomment-156729411
.

Version 3.0 will interpret null and NULL as null's and set the property with a null. You are able to change what values are used for null in the configuration.

What will Version 3.0 do if you actually want to store the string "null"?

It will output string.Empty. If you want to have it output something else, you could create a custom type converter that you use for that field. If it's null, conver to "null", otherwise get the default type converter for that type and return the result of that instead.

Having a config for a custom null value when writing would be a nice feature. It would probably just use the first in the list of null values actually. If you want to create a new issue for writing custom values for null, I'll add it to the 3.0 release.

Hi @JoshClose thanks for responding and apologies for the delay on my end.

The context I'm coming from is using CsvHelper to pack up some tables in a database into files which we later unpack into another datastore. Recently, we got burned by the fact that null string values weren't round-tripping: they unpacked as the empty string. In hindsight this should have been obvious given the nature of CSV, but it tripped us up.

Changing null be be stored as null rather than empty doesn't seem unreasonable, but it still has the problem that not all values will round-trip, since I'm assuming that the string "null" will be encoded the same as the value null. Being able to choose a custom null placeholder doesn't solve this problem: there can always be collisions.

The way we solved our specific problem was to introduce a wrapper type NullableString with a custom type converter such that all non-null values were wrapped with quotes to distinguish them from null values which are encoded as un-quoted null.

As far a general library behavior, my personal preference would be:

  • Keep the behavior that nullable DateTime, int, etc. store null as empty. This seems most true to the language-agnostic CSV way of storing missing data (empty cell).
  • Throw an exception when trying to write a null string by default
  • Allow null strings to be encoded via the caller explicitly choosing an encoding string (e. g. null). Obviously people would still have the option to use more robust methods like the one we did as well.

I'd be happy to open a separate issue if you were interested in pursuing this direction, but I understand if you want to stick with the library's existing approach.

Version 3.0 will interpret null and NULL as null's and set the property with a null. You are able to change what values are used for null in the configuration.

Sorry to open such an old issue. But, my NULL string values are coming in as "null" rather than an actual null. Is there a setting I need to have it ? I'm using v 7.1.1?

You can do per field:

Map(m => m.Name).TypeConverterOption.NullValues("null");

Or for a whole type:

csv.Configuration.TypeConverterOptionsCache.GetOptions<string>().NullValues.Add("null");

No, a field with no content is just an empty string. Having quotes around a field doesn't mean anything. It just says that there may be things in the field like a quote, newline, or delimiter. Quotes are optional.

Changing this behavior would be changing how the CSV format works. It's a text file and should be treated as such. If a field has special meaning, you will need to determine how to convert the content.

I understand what you're saying, but it doesn't make sense to do based on the CSV format.

This may be a config option that I could incorporate in the future. Something like TreatEmptyFieldsAsNull. When this is turned on, it would do what you're asking.

Hey Josh
Did you add this feature??
This is really needed, more people are running into this problem.

@jainrishabh67 just did this with his post above.
csv.Configuration.TypeConverterOptionsCache.GetOptions<string>().NullValues.Add(string.Empty); (or just 2 double quotes)
This will turn empty strings into null values in your object.

You can set whatever strings you want to be considered null as @chadthompsonallscripts demonstrated.

Was this page helpful?
0 / 5 - 0 ratings