Twint: [REQUEST] New lines in tweet text stripped

Created on 6 Oct 2019  ·  3 Comments  ·  Source: twintproject/twint

Issue Template

Initial Check

  • ✅ Python version is 3.6;
  • ✅ Updated Twint with pip3 install --user --upgrade -e git+https://github.com/twintproject/twint.git@origin/master#egg=twint;
  • ✅ I have searched the issues and there are no duplicates of this issue/question/request.

Command Ran


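import twint

config = twint.Config()
# Search for tweets containing all three quoted phrases
config.Search = '"First language" AND "Most used" AND "Most loved"'
# Write the results as JSON to the given file
config.Store_json = True
config.Output = "data/languages.json"
# Do not print tweets to the console
config.Hide_output = True

twint.run.Search(config)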

Description of Issue

It looks like twint strips newlines from tweet text. JSON and CSV can both contain newlines, and newlines are sometimes significant when you are analyzing tweets, for example when parsing these tweets.

I was curious why they are being stripped out.
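To illustrate the claim that both formats can carry newlines, here is a minimal sketch using only Python's standard library. The tweet text is made up for illustration and nothing here calls twint itself.

```python
import csv
import io
import json

# Hypothetical tweet text with line breaks (not real data).
tweet = "First language: BASIC\nMost used: Python\nMost loved: Python"

# JSON escapes the newline as the two characters \ and n inside the
# string, and restores the real newline when the value is loaded.
encoded = json.dumps({"tweet": tweet})
json_ok = json.loads(encoded)["tweet"] == tweet

# CSV keeps the newline inside a quoted field. newline="" stops the
# buffer from translating line endings, as the csv docs recommend.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerow(["1", tweet])
buffer.seek(0)
row = next(csv.reader(buffer))
csv_ok = row[1] == tweet

print(json_ok, csv_ok)
```

Both checks compare the value read back against the original string, so a stripped or altered newline would make them fail.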

Environment Details

OS X (Mojave 10.14.6)

question

All 3 comments

I just confirmed that I'm having the same issue. Latest Python/Twint under Windows Server 2012.

I don't know why \n characters are stripped out; I did not write that part. Anyway, I think it's better not to strip them out. That said, the output might not be clean, and 'raw' saving (not to CSV or JSON) might not be very handy with embedded newlines.

So \n characters are now stripped out only in the cases where c.Store_csv and c.Store_json are not specified.

Pushing updates right now
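The behavior described in this comment could be sketched roughly as follows. This is an assumption-laden illustration, not twint's actual code: clean_text is a hypothetical helper, and a SimpleNamespace stands in for twint.Config, using only the Store_csv and Store_json attribute names mentioned above.

```python
from types import SimpleNamespace


def clean_text(text, config):
    """Hypothetical helper: keep newlines for CSV/JSON, flatten otherwise."""
    store_csv = getattr(config, "Store_csv", False)
    store_json = getattr(config, "Store_json", False)
    if store_csv or store_json:
        # Structured formats can represent newlines, so keep the text as-is.
        return text
    # Raw output: replace each newline with a space to keep one tweet per line.
    return text.replace("\n", " ")


# Stand-ins for twint.Config objects (assumed attribute names only).
json_config = SimpleNamespace(Store_csv=False, Store_json=True)
raw_config = SimpleNamespace(Store_csv=False, Store_json=False)

tweet = "Most used:\nPython"
kept = clean_text(tweet, json_config)
flattened = clean_text(tweet, raw_config)
```

With json_config the text comes back unchanged, while with raw_config the newline is replaced by a space.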

Thanks so much @pielco11!
