Twint: Cannot clean data stored in pandas

Created on 23 May 2019 · 2 Comments · Source: twintproject/twint

Issue Template

Please use this template!

Initial Check

If the issue is a request please specify that it is a request in the title (Example: [REQUEST] more features). If this is a question regarding 'twint' please specify that it's a question in the title (Example: [QUESTION] What is x?). Please only submit issues related to 'twint'. Thanks.

Make sure you've checked the following:

  • [x] Python version is 3.6;
  • [x] Updated Twint with pip3 install --upgrade -e git+https://github.com/twintproject/twint.git@origin/master#egg=twint;
  • [x] I have searched the issues and there are no duplicates of this issue/question/request.

Command Ran

Please provide the _exact_ command ran including the username/search/code so I may reproduce the issue.

import nest_asyncio 
import twint

nest_asyncio.apply()

c = twint.Config()
c.Pandas = True
c.Store_pandas = True
c.Pandas_clean=True
c.Username = 'noneprivacy'
c.Limit = 20

twint.run.Search(c)
df = twint.storage.panda.Tweets_df
twint.storage.panda.clean()

Description of Issue

Please use as much detail as possible.

When I run the code above twice, the dataframe still contains the results stored from the previous search (in this case it holds 40 tweets, although I only wanted the data from the last scrape). The automatic cleaning option (c.Pandas_clean=True) doesn't seem to work, and running twint.storage.panda.clean() doesn't work either :(

Environment Details

Using Windows, Linux? What OS version? Running this in Anaconda? Jupyter Notebook? Terminal?

Running it on Spyder 3.2.8 with Python 3.6. Using macOS 10.14.5

bug

All 2 comments

I guess I missed something while updating the code base.

For now, as a workaround, I suggest you "reset" the dataframe. Simply do twint.storage.panda.Tweets_df = None instead of twint.storage.panda.clean()

At startup, every dataframe is defined as None, so this just puts it back in its initial state.
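A minimal sketch of how the workaround could be wrapped so that every search starts from an empty dataframe. The function name `fresh_search` and the `twint_module` parameter are hypothetical and not part of twint; the parameter exists only so a stand-in object can be passed in when twint is not installed. The config attributes and the `Tweets_df` reset are the ones used in this issue.

```python
def fresh_search(username, limit, twint_module=None):
    """Run one search and return only the tweets from that run.

    `fresh_search` and `twint_module` are hypothetical names for this sketch.
    """
    if twint_module is None:
        import twint as twint_module  # the real library, imported lazily

    # Workaround from this thread: reset the module-level dataframe to None
    # so that rows stored by earlier runs are not carried over.
    twint_module.storage.panda.Tweets_df = None

    # Same configuration attributes as in the reported command.
    c = twint_module.Config()
    c.Username = username
    c.Limit = limit
    c.Pandas = True
    c.Store_pandas = True

    twint_module.run.Search(c)

    # Whatever the search stored now comes from this run only.
    return twint_module.storage.panda.Tweets_df
```

Calling `fresh_search('noneprivacy', 20)` twice should then return only the latest scrape each time, assuming the accumulation really does happen in `Tweets_df` as described above. In Spyder or Jupyter, `nest_asyncio.apply()` would presumably still need to be called first, as in the original script.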

And thanks for reporting this!

Your workaround works perfectly, thank you very much for the help!
