Twint: Get user profile without scraping for all tweets and other code snippets

Created on 16 Jan 2019 · 5 Comments · Source: twintproject/twint

Issue Template

Please use this template!

Initial Check

I have tried scraping a Twitter user's profile. It works, but it also scrapes all of the user's tweets and retweets. Is there a way to stop it from scraping anything beyond the profile information, such as bio, username, etc.?

Make sure you've checked the following:

  • [] Python version is 3.6;
  • [] Updated Twint with pip3 install --upgrade -e git+https://github.com/twintproject/twint.git@origin/master#egg=twint;
  • [] I have searched the issues and there are no duplicates of this issue/question/request.

Command Ran

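import twint

# Configure a profile scrape for a single account
c = twint.Config()
c.Username = "dusadv"
c.Custom["user"] = ["bio"]  # restrict the user output to the bio field
c.User_full = False
c.Output = "users.csv"
# Profile walks the account's timeline, so tweets and retweets are fetched too
twint.run.Profile(c)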

Description of Issue

The commands work as described in the wiki, but I want a little more control over scraping a user's profile. Currently, scraping a profile also scrapes all of the user's tweets and retweets. I want to limit the process to fetching only profile information such as bio, location, etc.

Environment Details

Using Windows 10, running from terminal.

enhancement question

All 5 comments

import twint

c = twint.Config()
c.Username = "username"

twint.run.Lookup(c)

Thanks for this example (and asking the question). Some examples of using twint as a library would be super handy. Or do those already exist somewhere?

I'm editing the wiki (quite slowly, honestly, due to lack of time), updating fields and so on. Feel free to post here the snippets that you would like to see, so that I can make the list as complete as possible.

Thanks a lot, pielco11, for building this library as well as providing the response to the above question.

Hi @pielco11 thank you so much for this module, it's incredibly useful!

My request for documentation is Store_object examples. I was able to find that .Lookup places the user objects in a list at twint.output.user_object, but it would be helpful to have that documented. Documenting twint.output.follow_object and twint.output.clean_follow_list() would be helpful too. I used my IDE to find these, but I am sure others would benefit. Processing in memory is so useful for remote servers with little or unmonitored storage but sufficient memory and computing power, and I don't have to worry about cleaning up a ton of folders/files. :)

Thank you again!
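
For anyone looking for a starting point, here is a minimal sketch of the in-memory approach described in the comment above. It is not taken from the twint documentation. It assumes that setting Store_object on the config keeps results in memory and that Lookup fills twint.output.user_object, as reported in this thread; that name may differ between twint versions. The helper names lookup_users and summarize_users are hypothetical, and the attribute names username, bio and location are assumptions about the user objects.

```python
from types import SimpleNamespace


def lookup_users(username):
    """Run a Lookup for one username and return the in-memory results.

    Assumptions: twint is installed, Config.Store_object enables in-memory
    storage, and results land in twint.output.user_object (the name reported
    in this thread; it may differ between twint versions).
    """
    import twint  # imported lazily so summarize_users works without twint

    c = twint.Config()
    c.Username = username
    c.Store_object = True
    twint.run.Lookup(c)
    return twint.output.user_object


def summarize_users(users, fields=("username", "bio", "location")):
    """Reduce user objects to plain dicts holding only the requested fields.

    Missing attributes become None, so the field names are safe to adjust.
    """
    return [{f: getattr(u, f, None) for f in fields} for u in users]


# Stand-in object so the summary step can be tried without network access.
demo = [SimpleNamespace(username="example", bio="hello", location="Earth", tweets=10)]
print(summarize_users(demo))
```

With twint installed, summarize_users(lookup_users("username")) would chain the two steps. Nothing is written to disk, which suits the low-storage server case mentioned above.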
