CsvHelper: considerations about my parser, which I compared to this one

Created on 3 Jan 2020 · 1 Comment · Source: JoshClose/CsvHelper

Hello. Before I knew about this project, I wrote my own (simple) CSV helper (source code here). After finding this project, I compared the two in use and found some points where mine looks simpler and faster (I don't know whether the speed difference is real; maybe I didn't configure your CSV library appropriately). Below is one example of how my helper is used. I hope it gives you some insight.

Here are some points I found notable about my helper when comparing:

  1. My helper's configuration is simpler: you create a new instance of `GenericRecordParser<T>` and pass its constructor an `IEnumerable<string>`, where each string is the name of a property of `T`. The first property in the list is filled from the first column of the CSV line, the second property from the second column, and so on. To ignore a column, simply pass `null` in its position.
  2. The previous point enforces POCO classes.
  3. If I understood correctly, your helper becomes verbose when there are nested objects to configure. In mine, we simply refer to a nested column with an "X.Y" string in the list passed to the constructor, where X is a property (a nested object) of T and Y is a property of X. That is, nested objects need no separate configuration.
  4. In the example below I used a circular reference to test my helper. When I tried the same with your helper, it threw an exception.
  5. In a simple benchmark reading a file with a very large number of records (every line was the same: the one in the example below), my helper was 10x faster.
  6. My helper parses strings, not streams. We chose this approach because any CSV file can easily be converted into a sequence of CSV record strings, but the opposite is not true: a CSV string might not come from a CSV file. In fact, the file I needed to parse was only partially CSV. Some lines were CSV and others were not, so I could not parse the whole file and had to parse only the lines I found appropriate. In another case, each line represented an array of CSV records, so for each line I had to split it into several records, flatten them all, and finally parse each string.
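
To make points 1 and 3 concrete, here is a minimal sketch of how positional mapping with dotted "X.Y" paths could be done with reflection. It is not the actual `GenericRecordParser<T>` code: `PositionalMapperSketch` and `SketchPerson` are hypothetical names, and the sketch assumes a `;` separator, public settable properties only, and nested types with a parameterless constructor.

```c#
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Reflection;

// Hypothetical sample type for this sketch (not the Person class from the issue).
public class SketchPerson
{
    public string Name { get; set; }
    public int Age { get; set; }
    public SketchPerson Father { get; set; } // self-referencing type
}

// Hypothetical sketch; NOT the real GenericRecordParser<T>.
public static class PositionalMapperSketch
{
    // Assigns column i to the property path paths[i]; a null path skips that column.
    public static T Parse<T>(string line, IReadOnlyList<string> paths, char separator = ';')
        where T : class, new()
    {
        var record = new T();
        string[] columns = line.Split(separator);
        int count = Math.Min(paths.Count, columns.Length);

        for (int i = 0; i < count; i++)
        {
            if (paths[i] == null) continue; // ignored column
            SetByPath(record, paths[i].Split('.'), columns[i].Trim());
        }
        return record;
    }

    private static void SetByPath(object target, string[] segments, string raw)
    {
        // Walk to the owner of the last segment, creating nested objects on demand.
        // Objects are only created when a path names them, so a self-referencing
        // type does not cause unbounded recursion.
        for (int i = 0; i < segments.Length - 1; i++)
        {
            PropertyInfo nested = GetProperty(target, segments[i]);
            object child = nested.GetValue(target);
            if (child == null)
            {
                child = Activator.CreateInstance(nested.PropertyType);
                nested.SetValue(target, child);
            }
            target = child;
        }

        PropertyInfo leaf = GetProperty(target, segments[segments.Length - 1]);
        if (raw.Length == 0) return; // empty column: keep the property's default value

        // Unwrap Nullable<T> so "DateTime?" and similar are converted as their underlying type.
        Type type = Nullable.GetUnderlyingType(leaf.PropertyType) ?? leaf.PropertyType;
        object value = type.IsEnum
            ? Enum.Parse(type, raw, ignoreCase: true)
            : Convert.ChangeType(raw, type, CultureInfo.InvariantCulture);
        leaf.SetValue(target, value);
    }

    private static PropertyInfo GetProperty(object owner, string name) =>
        owner.GetType().GetProperty(name)
        ?? throw new ArgumentException($"'{owner.GetType().Name}' has no public property '{name}'.");
}
```

With this sketch, `PositionalMapperSketch.Parse<SketchPerson>("Bob; 25; skipped; Alice", new[] { "Name", "Age", null, "Father.Name" })` fills `Name`, `Age`, and `Father.Name`, and leaves `Father.Father` as `null`. Fields and types without a parameterless constructor (such as `RG` and `Document` in the example below) are outside what this sketch handles.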

Example of the class that a CSV line represents:

```c#
internal class Person
{
    public Color Eye { get; set; }
    public Color Color { get; set; }
    public bool IsAlive { get; set; }
    public char Gender { get; set; }
    public Document RG;
    public int Age { get; set; }
    public Decimal Money { get; set; }
    public string Name { get; set; }
    public DateTime BirthDay { get; set; }
    public DateTime? DeathDay { get; set; }
    public Person Father { get; set; }
}

public class Document
{
    public Document(int _) { }

    public string Name;
    public double Id { get; set; }
}

internal enum Color
{
    Black,
    White,
    Yellow,
    LightBlue,
}
```

Example of translating a CSV line into a `Person` object:

```c#
    public class GenericRecordParserTests
    {
        [Fact]
        public void Parse()
        {
            CultureInfo.CurrentCulture = CultureInfo.InvariantCulture;

            // Arrange

            Color Color = Color.Yellow;
            bool IsAlive = true;
            char Gender = 'M';
            double Id = 12.34;
            int Age = 25;
            Decimal Money = 123.345M;
            string Name = "Bob";
            DateTime BirthDay = DateTime.Today.AddYears(-Age);
            DateTime? DeathDay = null;
            string FatherName = nameof(FatherName);
            string GrandpaName = nameof(GrandpaName);
            int GrandpaId = 734;
            Color Eye = Color.LightBlue;

            var mapped = new[]
            {
                nameof(Eye),
                nameof(Color),
                nameof(IsAlive),
                nameof(Gender),
                "RG.Id",
                "RG.Name",
                nameof(Age),
                nameof(Money),
                nameof(Name),
                nameof(BirthDay),
                nameof(DeathDay),
                "Father.Name",
                "Father.Father.Name",
                "Father.Father.RG.Id",
            };

            var EyeColor = "   LIGHT    BLUE   ";
            var parser = new GenericRecordParser<Person>(mapped);

            var csvLine = $@"{EyeColor};{Color};{IsAlive};{Gender};{Id};
         {Name};{Age};{Money};{Name};
         {BirthDay};{DeathDay};{FatherName};{GrandpaName};{GrandpaId}";

            // Act

            Person person = parser.Parse(csvLine);

            // Assert

            person.Eye.Should().Be(Eye);
            person.Color.Should().Be(Color);
            person.IsAlive.Should().Be(IsAlive);
            person.Gender.Should().Be(Gender);
            person.RG.Id.Should().Be(Id);
            person.RG.Name.Should().Be(Name);
            person.Age.Should().Be(Age);
            person.Money.Should().Be(Money);
            person.Name.Should().Be(Name);
            person.BirthDay.Should().Be(BirthDay);
            person.DeathDay.Should().Be(DeathDay);

            person.Father.Name.Should().Be(FatherName);
            person.Father.Father.Name.Should().Be(GrandpaName);
            person.Father.Father.RG.Id.Should().Be(GrandpaId);
            person.Father.Father.Father.Should().BeNull();
        }
    }
```

Label: feature

Most helpful comment

> If I understood correctly, your helper becomes verbose when there are nested objects to configure. In mine, we simply refer to a nested column with an "X.Y" string in the list passed to the constructor, where X is a property (a nested object) of T and Y is a property of X. That is, nested objects need no separate configuration.

You can do nested mapping as far down the tree as you like. I guess I don't have an example on the documentation site.

```c#
Map(m => m.A.B.C.D.E.F);
```

Circular references seem to work fine the way you're doing them.

```c#
// LINQPad script: Dump() is a LINQPad extension method that prints the result.
void Main()
{
    var s = new StringBuilder();
    s.AppendLine("Name,Grandpa's Name");
    s.AppendLine("1,one");
    using (var reader = new StringReader(s.ToString()))
    using (var csv = new CsvReader(reader))
    {
        csv.Configuration.RegisterClassMap<FooMap>();
        csv.GetRecords<Person>().ToList().Dump();
    }
}

public class Person
{
    public string Name { get; set; }
    public Person Father { get; set; }
}

public class FooMap : ClassMap<Person>
{
    public FooMap()
    {
        Map(m => m.Name);
        Map(m => m.Father.Name);
    }
}
```

> My helper parses strings, not streams. We chose this approach because any CSV file can easily be converted into a sequence of CSV record strings, but the opposite is not true: a CSV string might not come from a CSV file. In fact, the file I needed to parse was only partially CSV. Some lines were CSV and others were not, so I could not parse the whole file and had to parse only the lines I found appropriate. In another case, each line represented an array of CSV records, so for each line I had to split it into several records, flatten them all, and finally parse each string.

How do you get the lines that you're passing in?
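
For context on that question, the workflow in the quoted paragraph (drop non-CSV lines, split lines that hold several records, flatten, then parse each string) might be sketched roughly as follows. The thread does not say how the lines were actually selected or separated, so everything here is an assumption: `LineSelectionSketch` is a hypothetical name, a line is treated as a record when it has the expected number of `;`-separated fields, and `|` is an invented separator between records on the same line.

```c#
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of the "select and flatten" step; not code from either library.
public static class LineSelectionSketch
{
    // Assumed rule: a string is a CSV record when it has exactly
    // `expectedColumns` fields separated by ';'.
    public static bool LooksLikeRecord(string candidate, int expectedColumns) =>
        candidate.Split(';').Length == expectedColumns;

    // Splits each line into candidate records (assumed '|' separator),
    // flattens them into one sequence, and keeps only those that look like records.
    public static IEnumerable<string> SelectRecords(
        IEnumerable<string> lines, int expectedColumns, char recordSeparator = '|') =>
        lines
            .SelectMany(line => line.Split(recordSeparator))
            .Select(candidate => candidate.Trim())
            .Where(candidate => LooksLikeRecord(candidate, expectedColumns));
}
```

Each string yielded by `SelectRecords` could then be handed to a per-line parser, or joined back together and wrapped in a `StringReader` for a stream-based reader.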

