It seems Twint doesn't get the full list of followers for accounts with a large number of followers; it stops abruptly at some random number. For example, I tried twint -u nasa --followers, and each time the script stopped at a random point with a few thousand screen_names.
pip3 install --upgrade -e git+https://github.com/twintproject/twint.git@origin/master#egg=twint
It could be that Twitter stops returning new entities because you (like everyone else in that case) sent too many requests.
Thanks @pielco11, I was wondering what the way around it is, e.g. raising an error so that it can continue pulling the information from that point, or keeping the requests within some limit so that this doesn't happen. Thanks.
I'll look deeper (I can't commit to a deadline). For now I can say that the issue does not seem to have a unique pattern. I tried a couple of queries and got a long list, ~100k users. Plus, when one stopped I started a new one, and this went on for a long time. So I guess it's not Twitter that's blocking you; from what I tested, I think that using a VPN will not get you around the issue.
A solution could be to retry the query when it fails. In any case, the code should only be changed after a deeper look at what is going on.
Thank you @pielco11 !
Hi, first of all, thanks for making such an amazing tool and publishing the code; I am sure I will learn a lot from your work. I tried getting a long list, ~130k, and it stopped at a random number of followers in each query.
On the other hand, I am writing a script to get all the tweet links of a user, because I think your tool does not do that. It works without logging in and without using the API, but after a lot of queries (with my script), Twitter blocks searches of the user's tweets. Using a VPN solved the problem. This is just to give you some information.
Finally, if you have a PayPal account, I would like to buy you a coffee for posting the source code because, as I said, I would like to learn how you built the tool, which would be impossible without the source code.
~130k followers is a lot, so Twitter might be blocking requests at a random time.
For the second point, Twitter blocks an IP if it makes too many requests; that's why using a VPN solves the problem.
What we could try is handling that "followers count" issue by asking the user to change the IP and then retrying the query, to see if this solves the issue.
Unfortunately I do not have enough time to solve every issue, so the patch will be delayed. Any kind of help with the development is very welcome.
@mmosleh Here is what's going on


In the first case there is a "show more" link; Twint extracts that link and makes a new request. Then that button vanishes, so Twint is not able to make a new request.
If I take the last cursor-id and make a new request after changing the IP and so on, nothing changes.
I think we found the origin of the issue and sadly we can't do anything about it, at least for now.
@pielco11 I made a quick and dirty patch to the previous version of Twint (the one with a single file): just a few retries on the last cursor-id when the error message is received. I managed to download all 32M NASA followers this way. (I'm not familiar with the code base of the new version, though.)
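The retry-on-last-cursor idea can be sketched roughly as follows. This is not the actual patch (which is not shown in this thread) and not Twint code: `fetch_page` is a hypothetical callable standing in for "request one page of followers for this cursor", and the retry count and delay are arbitrary assumptions.

```python
import time
from typing import Callable, List, Optional, Tuple

# Hypothetical page fetcher: takes a cursor (None for the first page) and
# returns (screen_names, next_cursor). next_cursor is None when no
# "show more" link came back, which may or may not be the real end.
FetchPage = Callable[[Optional[str]], Tuple[List[str], Optional[str]]]


def collect_followers(fetch_page: FetchPage,
                      max_retries: int = 5,
                      delay: float = 3.0,
                      sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Walk the pages, retrying the same cursor when no next cursor appears."""
    followers: List[str] = []
    cursor: Optional[str] = None
    while True:
        names: List[str] = []
        next_cursor: Optional[str] = None
        for attempt in range(max_retries + 1):
            try:
                names, next_cursor = fetch_page(cursor)
            except IndexError:
                # Treat a parse failure like a page with no "show more" link.
                names, next_cursor = [], None
            if next_cursor is not None:
                break
            if attempt < max_retries:
                sleep(delay)  # wait before asking for the same cursor again
        followers.extend(names)
        if next_cursor is None:
            # Still no next cursor after all retries: assume the list ended.
            return followers
        cursor = next_cursor
```

One consequence of this design is that the true last page is also requested `max_retries` extra times, since the scraper cannot tell a vanished link from the real end of the list.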
@mmosleh oh, nice... may you provide me the commit id?
git rev-parse HEAD
So, was the update uploaded? Is it possible to download a large list of followers, as @mmosleh managed to do?
Adding timeout seems to solve the issue.
Without timeouts I'm able to get up to 40 followers/following; adding time.sleep(3) at line 161 of twint/get.py allows me to get up to 440 followers/following.
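For anyone who prefers not to hard-code time.sleep(3), the same pause can be expressed as a small throttle that enforces a minimum interval between requests. This is a generic sketch and not part of twint/get.py; the class name and parameters are hypothetical, and the thread does not confirm that a longer pause reliably fixes the issue.

```python
import time


class Throttle:
    """Ensure at least `min_interval` seconds pass between calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)  # only sleep for the missing time
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` right before each request would then behave like the sleep described above, but without adding a delay when the previous request was already slow.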
In the current iteration of get.py, has this issue been resolved? I'm not seeing the time.sleep(3) line in the script.
thanks once again!
@KristopherMakuch I did not apply that "patch" since I'm not sure it really is a patch. More testing is needed; everyone is welcome to find a workaround.
How do I include a control file to know which page it stopped on?
Example:
twint -u username --followers -o username_followers.txt username_page.txt -t 3 -r 15
username_page.txt = file with the id of the last followers page.
-t = 3 (time to wait before the next followers page)
-r = 15 (random time to wait before the next followers page)
The time to wait before going to the next page of followers should be the sum t + r. r is always random and can be 2, 3 or 15, so the time will vary.
If processing is interrupted, it could try to run again after a while and continue from the followers page given by the page id in the file.
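A minimal sketch of the two pieces described above: a delay of t + r between pages, and a small control file holding the id of the last page. Everything here is hypothetical (the function names, the JSON layout, the file name); the -t and -r options are a proposal, not existing Twint flags.

```python
import json
import os
import random


def next_delay(base, max_random, rng=random):
    """Return base + r, with r drawn uniformly from [0, max_random] seconds."""
    return base + rng.uniform(0, max_random)


def load_checkpoint(path):
    """Return the last saved page id, or None if there is no control file."""
    if not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as fh:
        return json.load(fh).get("last_page_id")


def save_checkpoint(path, page_id):
    """Write the page id to a temp file, then rename it over the control file,
    so an interruption never leaves a half-written file behind."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump({"last_page_id": page_id}, fh)
    os.replace(tmp_path, path)
```

A scraping loop would call `load_checkpoint` once at start-up, `save_checkpoint` after each page, and `time.sleep(next_delay(3, 15))` between pages.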
My original command:
twint -u username --followers -o username_followers.txt
my error today:
CRITICAL:root:twint.feed:Follow:IndexError
The file has 1036 followers, but this profile has 3,800 followers.
Thanks again for all the work.
Congrats
Sorry for my poor English;
my first language is Portuguese.
Hi @Matiusco
to add some timeouts you just have to add a line as described above.
You could resume the scrape with something like twint -u username --followers -o user_followers.txt --resume username_followers_resume.txt
When Twint stops (most probably because Twitter does not return more data), you just have to re-run the command to resume from where it stopped.
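Re-running the command by hand can be automated with a small wrapper. The sketch below only uses the flags shown in the command above; the stop rule ("a run that adds no new lines to the output file") is my own heuristic, and it assumes Twint appends to the -o file rather than overwriting it. The function names and the pause length are hypothetical.

```python
import subprocess
import time
from pathlib import Path


def count_lines(path):
    """Number of lines in the output file, or 0 if it does not exist yet."""
    p = Path(path)
    if not p.exists():
        return 0
    with p.open(encoding="utf-8", errors="replace") as fh:
        return sum(1 for _ in fh)


def build_command(username, output, resume):
    """The resume command from the comment above, as an argument list."""
    return ["twint", "-u", username, "--followers",
            "-o", output, "--resume", resume]


def run_until_stalled(username, output, resume, max_runs=20, pause=60.0,
                      run=subprocess.run, sleep=time.sleep):
    """Re-run the command until a run adds nothing or max_runs is reached."""
    total = count_lines(output)
    for _ in range(max_runs):
        run(build_command(username, output, resume), check=False)
        new_total = count_lines(output)
        if new_total == total:
            break  # this run added no new lines: stop retrying
        total = new_total
        sleep(pause)  # give Twitter some time before resuming
    return total
```

For example, `run_until_stalled("username", "user_followers.txt", "username_followers_resume.txt")` would keep resuming until one run returns nothing new.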
hi @pielco11
Thanks for the information.
I will try to perform this operation.
Edited report:
Yes, it works perfectly now.
resume.txt is [] if finished. ;)
I cannot get all the followers.
Sometimes it reaches up to 15,000, other times it ends at 9,000, but the resume file is empty [].
My command in the terminal:
twint -u zehdeabreu --followers -o user_followers.txt --resume zehdeabreu_followers_resume.txt -t 15 -l 50
-t 15 and -l 50 are not working...
How could I control the time of each request from inside a Python file, to allow a much longer time between requests?
=====
The max number of followers at the moment is:
wc -l user_followers.txt
Thanks for all the help.
-t is not implemented (at least not yet); -l is for the language, and --limit is for the limit. If you want to control the time between requests, you have to play with get.py.
Your query should be something like twint -u zehdeabreu --followers -o user_followers.txt --resume zehdeabreu_followers_resume.txt --limit 60
I also tried the resume option and it works fine
tks @pielco11 , I'll try
I had a similar issue. The problem seems to be that sometimes the "more" button doesn't appear on Twitter's end, and Twint assumes that it has reached the end of the list of followers. However, if it tries again and sends another request, the button might be available to the scraper. So one quick fix could be to get the number of followers first and then keep retrying until the number of followers collected matches.
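The "keep retrying until the count matches" fix could look roughly like this. It is a sketch under assumptions: `fetch_batch` is a hypothetical callable that performs one scraping attempt and returns whatever screen names it got, and `expected_count` is the follower count read from the profile beforehand. Because a live follower count changes while scraping, the loop uses >= and a cap on attempts instead of waiting for exact equality.

```python
import time
from typing import Callable, Iterable, Set


def fetch_until_complete(expected_count: int,
                         fetch_batch: Callable[[], Iterable[str]],
                         max_attempts: int = 10,
                         delay: float = 5.0,
                         sleep: Callable[[float], None] = time.sleep) -> Set[str]:
    """Repeat scraping attempts until enough unique followers are collected."""
    seen: Set[str] = set()
    for attempt in range(max_attempts):
        seen.update(fetch_batch())  # duplicates across attempts are merged
        if len(seen) >= expected_count:
            break
        if attempt < max_attempts - 1:
            sleep(delay)
    return seen
```

Collecting into a set means repeated attempts that return overlapping names do not inflate the count.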
OK @mmosleh, but I do not know how to change this part of the code in get.py.
I still cannot get all the followers.
How can I get the IDs of the followers instead of the username, please? Thank you
@nxhuy-github please keep comments on the topic of the issue. Anyway, you can do that using .Lookup as shown in the wiki.
Hi, what is the current status of the code that retrieves all the followers of one person? I am still having the problem that only a subset of followers is downloaded. I am using the command
twint -u SpeakerPelosi --followers but am unable to get all 3 million followers (my result is only about 30k users). I saw that line 161 has a timeout. Would increasing this timeout help?
@datduong Twitter actively works to prevent Twint from getting all the followers; I highly suggest you use the API.
Hi. I'm facing the same issue now.
After a lot of trial and error, I may have found a workaround for this issue.
My findings:
twint -u nasa --following --resume nasa_following_resume.txt --limit 60 basically works well.
CRITICAL:root:twint.feed:Follow:IndexError
My proposal:
--wait-random 120, for example.
On CRITICAL:root:twint.feed:Follow:IndexError, twint should wait a random number of seconds and try the command again.
twint -u nasa --following --wait-random 120.
--resume filename should be determined automatically or stored only in memory.
--limit 60 should be given an appropriate default value.