If the issue is a request please specify that it is a request in the title (Example: [REQUEST] more features). If this is a question regarding 'twint' please specify that it's a question in the title (Example: [QUESTION] What is x?). Please only submit issues related to 'twint'. Thanks.
Make sure you've checked the following: Checked
pip3 install --user --upgrade -e git+https://github.com/twintproject/twint.git@origin/master#egg=twint;
Please provide the _exact_ command ran including the username/search/code so I may reproduce the issue.
import twint
c=twint.Config()
c.Limit = 20
c.Search = 'dog'
c.Until = "2019-01-24 11:10:00"
twint.run.Search(c)
I can provide a more detailed view of my original program if necessary.
Please use as much detail as possible.
The above command returns no tweets. I had created an algorithm to scrape over ten-minute intervals, and it randomly started retrieving zero tweets around January 24th, 2019 at 10:30 AM when I tried to reconnect to the script.
The error it provides is:
Expecting value: line 1 column 1 (char 0) [x] run.Feed
[!] if get this error but you know for sure that more tweets exist, please open an issue and we will investigate it!
I tried testing various searches to see what exactly was wrong, and this search, which clearly should return tweets, returned none. I tried a few other searches, and they occasionally work. It seems like there may be a server timeout of some sort, because after waiting roughly 5 minutes since the last call, the search goes through.
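For anyone reproducing the ten-minute-interval scraping described above, here is a minimal sketch of how the windows could be generated. It is not the reporter's original program. The helper name `ten_minute_windows` is hypothetical, and the idea that each pair would be assigned to `c.Since` and `c.Until` before calling `twint.run.Search(c)` is an assumption; only the `c.Until` format is taken from the report.

```python
from datetime import datetime, timedelta

# Same timestamp shape as the c.Until value in the report.
FMT = "%Y-%m-%d %H:%M:%S"


def ten_minute_windows(start, end, step_minutes=10):
    """Yield (since, until) string pairs covering [start, end) in fixed steps.

    Hypothetical helper: the last window is clipped so it never passes `end`.
    """
    step = timedelta(minutes=step_minutes)
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)
        yield cursor.strftime(FMT), upper.strftime(FMT)
        cursor = upper


# Example range ending at the c.Until value from the report.
windows = list(ten_minute_windows(
    datetime(2019, 1, 24, 10, 30, 0),
    datetime(2019, 1, 24, 11, 10, 0),
))

for since, until in windows:
    # Assumption: each pair would be set on the config before a search.
    print(since, "->", until)
```

Keeping the window generation separate from the search call makes it easier to log which window came back empty.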
Using Windows, Linux? What OS version? Running this in Anaconda? Jupyter Notebook? Terminal?
I am using OS X 10.15. I haven't updated anything since it stopped working, and I am using IDLE & Terminal to test.
Quick update: The program is also not running on other computers where it previously worked.
I have the same problem. Beginning this morning I get zero tweets for whatever search I try. I tried on multiple operating systems and it doesn't change anything.... I also checked all of the requirements.
The problem has extended for me. I have tried running on AWS, my school services, VPNs, hotspot, and various networks. It started off as just a problem on my home wifi, but I cannot scrape on any network/machine combination now.
That's really strange. Could you guys try with --debug or c.Debug = True and upload somewhere (not here, Pastebin is OK) the content of the file named twint-last-request.log, please? That way I can see what Twitter returns.
Here is a link.
Strangely, you are getting this response:
<table class="content">
<tr>
<td>
<div class="title">Sorry, that page doesn't exist</div>
<div class="subtitle"><a href="/">Back to home</a></div>
</td>
</tr>
</table>
</div>
</div>
And that's why you are not getting tweets; I tried your query on my end and it works perfectly, so it's not Twint. Could you try with a VPN with the endpoint in the EU (for example)?
I ran into a similar issue trying to scrape a profile (using the profile scrape) using the python API. I got the same "Sorry, that page doesn't exist" for twint.run.Profile(c), but twint.run.Search(c) seems to work fine. Tried North American and EU endpoints, as suggested above.
Sample url from debug:
https://twitter.com/i/profiles/show/jeffykao/timeline/tweets?include_available_features=1&lang=en&include_entities=1&include_new_items_bar=true
Thanks for any light you can shed on the issue!
So I guess Twitter changed something on its side and so that feature (.Profile) is currently broken
Very strange -- I tried again today, and it appears to be working consistently. I'll report back if I see any other hints about what could be happening.
Cool then!
Note that the above url (https://twitter.com/i/profiles/show/jeffykao/timeline/tweets?include_available_features=1&lang=en&include_entities=1&include_new_items_bar=true) is no longer working. :-(
I'm having the same issue. The first time it happened, I stopped for several hours, came back, and it worked again. Most recently, I haven't been able to retrieve tweets for the past 48hrs. I tried using a VPN, but that didn't fix the issue. Here is what the last-request.log is giving me:
{"min_position":"thGAVUV0VFVBYBFgESNQAVACUAVQAVAAA=","has_more_items":false,"items_html":"\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n \n","new_latent_count":0,"focused_refresh_interval":30000}
@xxristoskk it means that there are no more tweets, at least that's what Twitter's saying
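The two kinds of log content posted in this thread can be told apart with a few lines of Python. This is only a sketch: the function name `classify_response` and its labels are hypothetical, and it assumes the log file holds the raw response body. The only field used is `items_html`, taken from the JSON above. Note that `json.loads` on an HTML page raises exactly the "Expecting value: line 1 column 1 (char 0)" message quoted in the original report.

```python
import json


def classify_response(body):
    """Return a rough label for a saved response body (labels are hypothetical)."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        # Not JSON at all, e.g. the HTML error page quoted earlier in the thread.
        if "Sorry, that page doesn't exist" in body:
            return "not_found_page"
        return "not_json"
    if not isinstance(data, dict):
        return "unexpected_json"
    # A feed whose items_html is only whitespace carries no tweets.
    if not data.get("items_html", "").strip():
        return "empty_feed"
    return "has_items"


sample = r'{"has_more_items":false,"items_html":"\n\n\n \n","new_latent_count":0}'
print(classify_response(sample))
```

An "empty_feed" result for a query that shows tweets in a browser would match the behaviour described in this thread, but it does not by itself prove a ban.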
@pielco11 Ahh, thank you for replying. I'm still trying to find a workaround for this since there are definitely more tweets (I searched for recent trending hashtags as a test). I cleared cookies & cache and used a VPN. The only other thing I can think of is trying it on a completely different computer. If you have any recommendations, please share!
Hey all,
I'm commenting with a (relative) solution and cause. It seems that Twitter throws a more strict device + IP-ban after a certain amount of queries. This resets after time, but I was making roughly 2160 queries through Twint that would each return ~150 search results. I don't really have more information on this, but it prevents Twint from seeing the Tweets, even though they DO exist.
It is correct that it is not Twint. The ban must use some combination of device & IP address, because a device change or a network change alone does not fix it.
The workaround for me was to use AWS. Create an instance in a different time zone each time you get 'banned'. Note that this should only be affecting people with high volumes of queries. If you are getting this issue with a small number of queries, I am not sure why that might be happening.
Also, one final note is that this 'ban' does not seem to be a wall. Instead, queries become more and more sporadic in successfully returning the search results. A 'try/except' or something of the sort will clearly fix this so that any 'null' values are thrown out.
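A rough sketch of the 'try/except' idea above, written as a generic retry helper rather than as Twint code. The name `fetch_with_retries` and its parameters are hypothetical, and `fetch` stands for whatever callable runs one search and returns a list of results. Whether Twint raises or just returns nothing on a bad response is not assumed here; both cases are treated as a miss. The waits double each time, starting at one minute.

```python
import time


def fetch_with_retries(fetch, attempts=5, base_delay=60.0, sleep=time.sleep):
    """Call fetch() until it returns a non-empty list, waiting longer each time.

    Hypothetical helper: `fetch` is any zero-argument callable returning a list.
    """
    for attempt in range(attempts):
        try:
            results = fetch()
        except ValueError:
            # json.JSONDecodeError is a ValueError subclass, so a response that
            # fails to parse as JSON lands here and counts as an empty result.
            results = []
        if results:
            return results
        if attempt < attempts - 1:
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
    return []
```

Passing `sleep` as a parameter keeps the helper testable without real waiting. In a scraper, an empty return after all attempts would be the signal to skip that query or switch machines, as described above.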
@j2kao @xxristoskk @pielco11 @annika-stechemesser
@mshayes18 Thank you! I had a feeling it was something like that.
Thanks @mshayes18 -- really helpful! What's the best indicator for checking whether your device + ip is "banned"?
@j2kao To check this, I search for tweets in the specific date-time range with another device and IP (my smartphone, for example). If I see tweets there but Twitter does not return them on the original device... that's it.
i/profiles/show appears to no longer be a valid way to obtain timeline tweets. This is no longer used, it appears, in the normal twitter browser client. Has this endpoint been deprecated entirely?
Is this an issue with the exact string, i.e. does it search for the exact keywords?