OpenRefine: Fail to separate .txt file with ;

Created on 5 Mar 2020 · 7 Comments · Source: OpenRefine/OpenRefine

Hi,
I hope I am in the right place for my question. A friend of mine told me I should ask here no matter how "stupid" my question is...

I have a .txt file with about 8 to 10 lines. After a varying number of lines there is a ";", and I want all the text between one ";" and the next to go into its own column, and so on. Is that somehow possible? I tried it via "Parse data as CSV/TSV/separator-based files" and set "Columns are separated by" to custom ";". I received a new column, but it wasn't the result I needed. This is my example file.

example_Open_refine.txt

In a perfect world I would like to receive 4 columns, with every column ending at a ";".
Is it somehow possible?

question

All 7 comments

Hi, and welcome!

I hope I am in the right place for my question.

It would be better to use the mailing list or StackOverflow for questions.

Is it somehow possible?

I would import the file as text lines, and then clean it up from there:

  • remove lines with ; and blank lines;
  • split the lines into two columns;
  • key-value columnize.

I would not use the CSV/TSV importer for that.
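For readers who want to see the "treat it as plain text first" idea outside OpenRefine, here is a minimal Python sketch. It assumes that ";" only ever appears as a segment separator in the file and that everything between two separators belongs together, including inner blank lines. The function name `split_segments` and the sample text are made up for illustration; this is not OpenRefine's own importer logic.

```python
def split_segments(text, delimiter=";"):
    """Split raw text into segments at each delimiter.

    Blank lines inside a segment are preserved; only the newlines
    directly next to a delimiter are trimmed.
    """
    segments = [part.strip("\n") for part in text.split(delimiter)]
    # Drop segments that are empty or whitespace-only,
    # for example the piece after a trailing delimiter.
    return [seg for seg in segments if seg.strip()]


# Invented sample data: three segments, the second contains a blank line.
sample = "line one\nline two\n;\nline three\n\nline four\n;\nline five\n;"
columns = split_segments(sample)
```

Each element of `columns` could then become one cell (or one column) in a table. Because the blank line inside the second segment is kept, existing blank lines do not cause extra splits.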

Thanks for your advice!

I would import the file as text lines, and then clean it up from there:

* remove lines with `;` and blank lines;

If I remove the ";" then I get blank lines, which is not good for the dataset, because sometimes I already have blank lines in my dataset that should not be separated from each other.

* split the lines into two columns;

* key-value columnize.

I would not use the CSV/TSV importer for that.

Is it maybe somehow possible to import 100 .txt files and get one .txt file per column in one project? I could easily produce a hundred separate .txt files, if that's easier...

Creating a project with hundreds of columns and few rows sounds like a bad idea in OpenRefine. I would rather transpose and use few columns, many rows.
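As a rough illustration of the "few columns, many rows" layout, the following Python sketch gathers a folder of .txt files into one table with one row per file. It assumes the files are UTF-8 encoded; the function names `files_to_rows` and `write_rows` and the column names `file` and `text` are hypothetical choices for this example, not anything prescribed by OpenRefine.

```python
import csv
from pathlib import Path


def files_to_rows(folder, pattern="*.txt"):
    """Return one row per matching file: its name and its full text."""
    rows = []
    # Sort the paths so the row order is predictable.
    for path in sorted(Path(folder).glob(pattern)):
        text = path.read_text(encoding="utf-8")
        rows.append({"file": path.name, "text": text.strip()})
    return rows


def write_rows(rows, out_path):
    """Write the rows to a CSV file with a header line."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["file", "text"])
        writer.writeheader()
        writer.writerows(rows)
```

For example, `write_rows(files_to_rows("my_texts"), "combined.csv")` would produce a two-column file, where `my_texts` is a placeholder folder name. The `file` column keeps the texts separated from each other, and multi-line text is quoted by the `csv` module so that each file stays in a single cell.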

Creating a project with hundreds of columns and few rows sounds like a bad idea in OpenRefine. I would rather transpose and use few columns, many rows.

You are totally right to do it that way. But I still have the problem that I want to get 100 .txt files into OpenRefine. I could do it in Excel as well, but I didn't find out how to do it there... so I thought I would ask here.

You are right to transpose it, but I still need to separate the .txt files from each other. And I am still not able to do it :)

And I am still not able to do it :)

Could you ask your question on the mailing list or StackOverflow? Thanks!

@HSHMartin if you ask this on the mailing list, I've got a possible approach I will post in response, and I'm sure others will have as well - there are many more OpenRefine users on the mailing list than monitor these GitHub issues.

Thanks

@HSHMartin if you ask this on the mailing list, I've got a possible approach I will post in response, and I'm sure others will have as well - there are many more OpenRefine users on the mailing list than monitor these GitHub issues.

Thanks

Thanks! I will try to post it there :)
