InstaPy: Skip Private/Business/No Profile Pic, Min/Max posts

Created on 17 Sep 2018  ·  12 Comments  ·  Source: timgrossmann/InstaPy

https://www.bountysource.com/issues/63470533-skip-private-business-no-profile-pic-min-max-posts

Expected Behavior

There are basically two parts to this request.

  1. Extending the relationship bounds with the min/max amount of posts.
  2. Adding the set_skip_users feature, which will allow InstaPy users to skip:

    • Private profiles
    • Profiles without profile pictures
    • Business profiles
      • Since there is a fixed number of business categories, also skipping users of specific categories, or only using users with a specific category

There already are older feature requests for this stuff.

#980 (Skip Business - $25 Bounty)

#1006 (Skip private - $33 Bounty)

I will add another $42 to this Issue. So the whole feature will have a bounty of $100.
Whoever implements this, please think about sharing it with @marcomokastyle since he already did a lot of the work.

Possible Solution (optional)

There already is an old, outdated PR that had some of this implemented; unfortunately, the author is not able to complete it.

#2242

The comment section there also has information about how to find out whether it's a user profile or not.

InstaPy configuration

Min/Max media
set_relationship_bounds(enabled=True,
                        min_media=5,
                        max_media=100,
                        max_following=5555)
Profile skipping
# skip all private profiles, profiles without profile pictures and all business profiles
set_skip_users(skip_private=True,
               skip_no_profile_pic=True,
               skip_business=True)

# only skip all business profiles of the given categories
set_skip_users(skip_business=True,
               skip_business_categories=['Creators & Celebrities', '...'])

# skip all but the given categories
set_skip_users(skip_business=True,
               dont_skip_business_categories=['Home Services', '...'])
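
To make the intended semantics of these options concrete, here is a minimal sketch of the skip decision. The function name should_skip, its argument list and the returned reason strings are hypothetical and not part of InstaPy; only the option names come from the request above. It also assumes that dont_skip_business_categories takes precedence when both category lists are given, which the request does not specify.

```python
from typing import Optional, Sequence, Tuple


def should_skip(is_private: bool,
                has_profile_pic: bool,
                is_business: bool,
                business_category: Optional[str],
                skip_private: bool = False,
                skip_no_profile_pic: bool = False,
                skip_business: bool = False,
                skip_business_categories: Sequence[str] = (),
                dont_skip_business_categories: Sequence[str] = ()
                ) -> Tuple[bool, str]:
    """Return (skip, reason) for one profile. Hypothetical helper."""
    if skip_private and is_private:
        return True, "private"
    if skip_no_profile_pic and not has_profile_pic:
        return True, "no profile picture"
    if skip_business and is_business:
        if dont_skip_business_categories:
            # "skip all but the given categories"
            # (assumption: this list wins if both lists are set)
            if business_category not in dont_skip_business_categories:
                return True, "business category not in dont_skip list"
        elif skip_business_categories:
            # "only skip business profiles of the given categories"
            if business_category in skip_business_categories:
                return True, "business category in skip list"
        else:
            # no category filter: skip every business profile
            return True, "business"
    return False, ""


# a private profile is skipped when skip_private is on
print(should_skip(True, True, False, None, skip_private=True))
```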

C&C very welcome, this is a community project so we want to have everyone agree on what we want for InstaPy.
Thanks guys for making this tool awesome.

Any comments @sionking @CharlesCCC @uluQulu ?

Best
Tim

bounty feature request

All 12 comments

I've actually just implemented something similar to suit my own needs - happy to contribute if you like.
My solution was a bit of a "quick & dirty" approach, but I think it could easily be adapted to be more flexible.

While rooting about in the code it struck me that there is a LOT of repetition that really should be refactored into more flexible functions...

For example, wherever we attempt to get some information by using graphql and then falling back to find_element_by_xpath - this should be encapsulated in a function.
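
One way such a helper could look, as a rough sketch only: it takes a list of zero-argument callables and returns the first value that can be read. The name first_available is made up for illustration, and the two getters below are plain stand-ins; in InstaPy they would wrap the graphql lookup and the find_element_by_xpath fallback, and the caught exception type would presumably be WebDriverException.

```python
from typing import Callable, Iterable, Tuple, Type, TypeVar

T = TypeVar("T")


def first_available(getters: Iterable[Callable[[], T]],
                    default: T,
                    errors: Tuple[Type[BaseException], ...] = (Exception,)
                    ) -> T:
    """Try each getter in order and return the first non-None result.

    Hypothetical helper: falls back to `default` if every getter
    raises one of `errors` or returns None.
    """
    for getter in getters:
        try:
            value = getter()
        except errors:
            continue  # this source failed, try the next one
        if value is not None:
            return value
    return default


# Stand-ins for the two real data sources (assumptions for the demo)
def from_graphql():
    raise RuntimeError("graphql data not available")


def from_xpath():
    return 42


post_count = first_available([from_graphql, from_xpath], default=0)
print(post_count)
```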

Anyway, my solution was to add a function into utils.py as follows:

from selenium.common.exceptions import WebDriverException

# JavaScript path to the profile data embedded in the page
USER_DATA = "return window._sharedData.entry_data.ProfilePage[0].graphql.user"


def read_profile(browser):
    # read privacy flag, biography and post count from the loaded profile page
    private = browser.execute_script(USER_DATA + ".is_private")
    bio = browser.execute_script(USER_DATA + ".biography")
    post_count = browser.execute_script(USER_DATA + ".edge_owner_to_timeline_media.count")
    return private, bio, post_count


def is_suitable_user(browser, username):
    link = 'https://www.instagram.com/' + username + '/'
    web_address_navigator(browser, link)
    try:
        private, bio, post_count = read_profile(browser)
    except WebDriverException:
        try:
            # reload the page once and retry before giving up
            browser.execute_script("location.reload()")
            private, bio, post_count = read_profile(browser)
        except WebDriverException:
            private, bio, post_count = None, "", 0

    # guard against a missing biography before lowercasing
    bio = (bio or "").lower()

    # (is private, has at least 10 posts, bio contains one of the keywords)
    return private, post_count >= 10, any(word in bio for word in ["bio", "word", "to", "find"])

I then use this function in validate_username as follows:

    private, enoughposts, biomatch = is_suitable_user(browser, username)

    if private:
        return False, \
                "---> {} is private ~skipping user\n".format(username)
    if (not enoughposts):
        return False, \
                "---> {} not enough posts ~skipping user\n".format(username)
    if (not biomatch):
        return False, \
                "---> {} no match in bio ~skipping user\n".format(username)

@timgrossmann

Extending the relationship bounds with the min/max amount of posts.

We have discussed it with @marcomokastyle before and I think it does not fit in the relationship bounds setting. He moved it to somewhere else.

I think [ALL] of the features you have requested are implemented by @marcomokastyle already, it's just business tools that are new cos earlier it was not shared in the graphql entry.
Now that can be implemented easily.

You really invest big in this project 😉 those are gonna be the most expensive lines

I think #2242 is not outdated, it's just that @marcomokastyle could not push a final rebase, which made it stall for weeks.

The skip and dont_skip business categories parameters look great

I think let's merge @marcomokastyle PR and then many people will have additions on it.
Adding new business profile filtrations, etc.

If @marcomokastyle is not available we can make a new PR out of the code in there but I hope @marcomokastyle can be there to do it cos he is the author.
@marcomokastyle got some time?


Cheers 😁

@uluQulu I remember that discussion about the relationship_bounds, still it's the most "understandable" point of code where this min/max is repeated twice which makes it the best place to put it for now.
Even if the method relationship_bounds is not really the right name for it.
I'm pretty sure people will get confused if this logic of min/max will be added to another location.

I agree, all the blessing belongs to @marcomokastyle !
Still, I haven't heard from him since the PR...

Hopefully he'll come back and finish it. 🙏

I have all of this implemented already.

business skip by percentage.
private users by percentage.
minimum num of posts.

I hope to find time to open a PR for this this week.

0 = skip all
50 = skip about half
100 = don't skip
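
A percentage setting like this could be applied with a random draw per profile. The sketch below is only an illustration of that idea under the semantics listed above (0 = skip all, 100 = don't skip); the function name should_interact is hypothetical and this is not the implementation referred to in this comment.

```python
import random


def should_interact(percentage: int, rng: random.Random) -> bool:
    """Return True if the profile should be interacted with.

    `percentage` is the share of matching profiles to keep:
    0 skips all of them, 100 skips none. Hypothetical helper.
    """
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    # rng.random() is in [0.0, 1.0), so 0 never passes and 100 always does
    return rng.random() * 100 < percentage


rng = random.Random(0)  # seeded only to make the demo repeatable
kept = sum(should_interact(50, rng) for _ in range(1000))
print("kept", kept, "of 1000 business profiles")
```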

BTW is that acceptable:
I pass most of my parameters using a dictionary:
validate_username( ... self.parameters)
while:
self.parameters['interact_business'] = 0-100

This way I avoid passing so many parameters to the function:
e.g
self.max_followers,
self.max_following,
self.min_followers,
self.min_following,
....

OK ? I am starting to work on it.
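
For illustration, a check that reads its limits from a single dictionary might look roughly like the sketch below. The function name within_relationship_bounds is made up here, and the dictionary keys simply mirror the attribute names listed above; the real validate_username signature is not shown in this thread.

```python
from typing import Any, Dict


def within_relationship_bounds(followers: int, following: int,
                               parameters: Dict[str, Any]) -> bool:
    """Check follower/following counts against optional limits.

    Hypothetical helper: limits that are missing from `parameters`
    are treated as "no limit".
    """
    checks = [
        (followers, parameters.get("min_followers"), parameters.get("max_followers")),
        (following, parameters.get("min_following"), parameters.get("max_following")),
    ]
    for value, low, high in checks:
        if low is not None and value < low:
            return False
        if high is not None and value > high:
            return False
    return True


# one dictionary instead of four separate arguments
parameters = {"min_followers": 50, "max_followers": 5000, "max_following": 5555}
print(within_relationship_bounds(100, 200, parameters))
```

The trade-off is that a misspelled key silently disables a limit, so a real implementation would probably want to validate the keys up front.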

@sionking In JS I use the object passing with destructuring all the time so I don't think it's a problem.
Also haven't seen something like this in the style guide so it shouldn't be a problem.

What's still open about this issue?

Can you explain better point 1(min/max posts) of this request?

@GabrieleCalarota Actually a lot of it.
Most of it has already been implemented in some way in #2242

But that one lacks the described API and all of the business functionality.

To help you with your point, the min max posts simply say:

  • If user has < min posts, skip
  • If user has > max posts, skip
  • If user is in bounds, then interact
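
The three rules above can be sketched as a small check. The function name posts_within_bounds is hypothetical, and treating a missing bound as "no limit" is an assumption; the example values 5 and 100 come from the min_media/max_media configuration in the issue description.

```python
from typing import Optional


def posts_within_bounds(post_count: int,
                        min_posts: Optional[int] = None,
                        max_posts: Optional[int] = None) -> bool:
    """Return True if the user should be interacted with.

    Hypothetical helper; a bound set to None is not enforced.
    """
    if min_posts is not None and post_count < min_posts:
        return False  # fewer than min posts: skip
    if max_posts is not None and post_count > max_posts:
        return False  # more than max posts: skip
    return True       # in bounds: interact


for count in (3, 5, 42, 100, 250):
    print(count, posts_within_bounds(count, min_posts=5, max_posts=100))
```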

@timgrossmann I implemented it in a few lines of code using apinsta.herokuapp.com.
Should I create a new PR?

#3048

Hi guys, Instagram has changed the profile picture name from "11906329_960233084022564_1448528159_a.jpg" to "44884218_345707102882519_2446069589734326272_n.jpg". I guess this is why InstaPy is not skipping profiles with no picture.
