Twint: [QUESTION] Pandas compatibility - error

Created on 13 Aug 2019  ·  5 Comments  ·  Source: twintproject/twint

  • [Yes] Python version is 3.6;
  • [Yes] Updated Twint with pip3 install --user --upgrade -e git+https://github.com/twintproject/twint.git@origin/master#egg=twint;
  • [Yes] I have searched the issues and there are no duplicates of this issue/question/request.

Command Ran

Hey guys, love this library. I wanted to test out the pandas compatibility, but every time I set c.Pandas=True and run twint.run.Search(c) I get an error. I think it might be related to date formatting?

c.Pandas = True works when I just grab the followers or users.

import twint
from datetime import datetime, timedelta
import nest_asyncio
import pandas as pd
nest_asyncio.apply()

c = twint.Config()
c.Limit=10
c.Username='ProtonMail'
c.Store_object=True
c.Pandas=True
twint.run.Search(c)

Description of Issue

Using c.Pandas=True yields
File "c:\usersxx\appdata\local\programs\python\python37-32\lib\site-packages\twint\storage\panda.py", line 67, in update
day = weekdays[strftime("%A", localtime(Tweet.datetime))]

OSError: [Errno 22] Invalid argument

Environment Details

I am using Windows and running this with Spyder, without Anaconda. I also ran it with Visual Studio Code.

Workaround

All 5 comments

Hmm, strange. Maybe it's related to an old bug. Could you update and retry?

This is what I get:
[image]

I have the same problem, and I know I have the latest version of twint because I just installed it. I also reproduced the problem in Jupyter, by running a local .py file, and in a Python shell, entering exactly the same things you typed. I get the same error as in the original post every time. When I run it in the Python command line, each line works, including c.Pandas = True, and I only get the error when I reach the .run.Search() line.

Full error output:

OSError Traceback (most recent call last)
in
19 c.Pandas = True
20
---> 21 twint.run.Search(c)
22
23 Tweets_df = twint.storage.panda.Tweets_df

~\Anaconda3\lib\site-packages\twint\run.py in Search(config, callback)
290 config.Profile = False
291 config.Profile_full = False
--> 292 run(config, callback)
293 if config.Pandas_au:
294 storage.panda._autoget("tweet")

~\Anaconda3\lib\site-packages\twint\run.py in run(config, callback)
211 def run(config, callback=None):
212 logme.debug(__name__+':run')
--> 213 get_event_loop().run_until_complete(Twint(config).main(callback))
214
215 def Favorites(config):

~\Anaconda3\lib\asyncio\base_events.py in run_until_complete(self, future)
571 raise RuntimeError('Event loop stopped before Future completed.')
572
--> 573 return future.result()
574
575 def stop(self):

~\Anaconda3\lib\asyncio\futures.py in result(self)
176 self.__log_traceback = False
177 if self._exception is not None:
--> 178 raise self._exception
179 return self._result
180

~\Anaconda3\lib\asyncio\tasks.py in __step(failed resolving arguments)
223 result = coro.send(None)
224 else:
--> 225 result = coro.throw(exc)
226 except StopIteration as exc:
227 if self._must_cancel:

~\Anaconda3\lib\site-packages\twint\run.py in main(self, callback)
152 task.add_done_callback(callback)
153
--> 154 await task
155
156 async def run(self):

~\Anaconda3\lib\asyncio\futures.py in __await__(self)
258 if not self.done():
259 self._asyncio_future_blocking = True
--> 260 yield self # This tells Task to wait for completion.
261 if not self.done():
262 raise RuntimeError("await wasn't used with future")

~\Anaconda3\lib\asyncio\tasks.py in __wakeup(self, future)
290 def __wakeup(self, future):
291 try:
--> 292 future.result()
293 except Exception as exc:
294 # This may also be a cancellation.

~\Anaconda3\lib\asyncio\futures.py in result(self)
176 self.__log_traceback = False
177 if self._exception is not None:
--> 178 raise self._exception
179 return self._result
180

~\Anaconda3\lib\asyncio\tasks.py in __step(failed resolving arguments)
221 # We use the send method directly, because coroutines
222 # don't have __iter__ and __next__ methods.
--> 223 result = coro.send(None)
224 else:
225 result = coro.throw(exc)

~\Anaconda3\lib\site-packages\twint\run.py in run(self)
196 elif self.config.TwitterSearch:
197 logme.debug(__name__+':Twint:main:twitter-search')
--> 198 await self.tweets()
199 else:
200 logme.debug(__name__+':Twint:main:no-more-tweets')

~\Anaconda3\lib\site-packages\twint\run.py in tweets(self)
143 for tweet in self.feed:
144 self.count += 1
--> 145 await output.Tweets(tweet, self.config, self.conn)
146
147 async def main(self, callback=None):

~\Anaconda3\lib\site-packages\twint\output.py in Tweets(tweets, config, conn, url)
140 elif config.TwitterSearch:
141 logme.debug(__name__+':Tweets:TwitterSearch')
--> 142 await checkData(tweets, config, conn)
143 else:
144 logme.debug(__name__+':Tweets:else')

~\Anaconda3\lib\site-packages\twint\output.py in checkData(tweet, config, conn)
114 if config.Pandas:
115 logme.debug(__name__+':checkData:Pandas')
--> 116 panda.update(tweet, config)
117
118 if config.Store_object:

~\Anaconda3\lib\site-packages\twint\storage\panda.py in update(object, config)
65 if _type == "tweet":
66 Tweet = object
---> 67 day = weekdays[strftime("%A", localtime(Tweet.datetime))]
68 dt = f"{object.datestamp} {object.timestamp}"
69 _data = {

OSError: [Errno 22] Invalid argument

I guess there's a compatibility issue between Twint and your OS.

Well, the workaround is just to save to CSV/JSON and then use pandas to read it, though having pandas integration built in was a cool feature!
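
For anyone landing here, that workaround might look roughly like the sketch below. It assumes Twint's Store_csv and Output config options write the results to the given path; the function names scrape_to_csv and load_tweets are made up for illustration, and the column names in the output file depend on the Twint version.

```python
import pandas as pd


def scrape_to_csv(username, path, limit=10):
    """Run a Twint search and write the results to a CSV file.

    Needs twint installed and network access. c.Pandas is left off,
    so the failing panda.update() call is never reached.
    """
    import twint  # imported lazily so load_tweets() works without twint

    c = twint.Config()
    c.Username = username
    c.Limit = limit
    c.Store_csv = True   # write results to disk instead of a DataFrame
    c.Output = path      # destination file
    twint.run.Search(c)


def load_tweets(path):
    """Read the CSV written by scrape_to_csv() into a DataFrame."""
    return pd.read_csv(path)
```

Usage would be scrape_to_csv('ProtonMail', 'tweets.csv') followed by df = load_tweets('tweets.csv').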

I am also having this problem and error on Windows. Is it a Windows-specific issue? Would it work on Linux?
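
A possible explanation, offered as a guess rather than something confirmed in this thread: on Windows, time.localtime() raises OSError: [Errno 22] when the value it receives is outside the range the platform supports, while Linux tends to accept much larger values. If Tweet.datetime holds epoch milliseconds rather than seconds (an assumption), the failing line in panda.py would hit exactly that. Below is a minimal standalone sketch of computing the weekday in a way that tolerates either unit. The name weekday_name and the threshold are hypothetical and not part of Twint.

```python
from datetime import datetime, timezone

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Heuristic, assumed for this sketch: values this large are taken to be
# milliseconds. 10**11 seconds is far in the future, while 10**11 ms is
# in 1973, so any tweet-era millisecond timestamp is above the threshold.
MS_THRESHOLD = 10**11


def weekday_name(epoch, tz=timezone.utc):
    """Return the English weekday name for an epoch value in s or ms."""
    seconds = epoch / 1000 if abs(epoch) >= MS_THRESHOLD else epoch
    # datetime.weekday() returns 0 for Monday through 6 for Sunday.
    return WEEKDAYS[datetime.fromtimestamp(seconds, tz=tz).weekday()]
```

For example, weekday_name(1565654400000) and weekday_name(1565654400) both refer to 13 Aug 2019 (UTC). Using a fixed list of names instead of strftime("%A") also keeps the result independent of the system locale.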
