InstaPy: How Instagram is banning bots using AI, and a proposed solution

Created on 17 Aug 2019 · 20 Comments · Source: timgrossmann/InstaPy

Motivation

Lately, and strangely given that non-spam bots increase usage and user numbers, Instagram has been cracking down on bots. I opened this issue because I heard from a friend who is close to Instagram that they are going to download or buy every bot and run them all to train their AI. This of course includes InstaPy. Given the current wave of bans I believe him, and I think it is better to change a few things in InstaPy, since the risk is not worth saving a few lines of code.

Expected Behavior

Run InstaPy without worry of getting banned, using a combination of customization + randomness + human-like behavior.

Current Behavior

Risk of getting banned.

Possible Solution

More customization. Currently each user leaves almost the same footprint as every other user. For example, the break_after_n_actions parameter could be added to all functions (comment, follow, like...) so that each user has a unique footprint. Each user can then randomize their own break_after_n_actions value however they want.

sleep_delay: Time interval (in seconds) to sleep after n actions. Default time is 600.

break_after_n_actions: Sleep after n follows/unfollows/whatever. Default n is 11
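
A minimal sketch of how such a per-action break counter could behave. `ActionPacer`, its `jitter` setting and its `sleep_fn` hook are hypothetical names for illustration and are not part of InstaPy; only the `break_after_n_actions` and `sleep_delay` names and their defaults (11 and 600) come from the description above.

```python
import random
import time


class ActionPacer:
    """Hypothetical helper: pause after a randomized number of actions."""

    def __init__(self, break_after_n_actions=11, sleep_delay=600,
                 jitter=0.25, sleep_fn=time.sleep, rng=None):
        self.break_after_n_actions = break_after_n_actions
        self.sleep_delay = sleep_delay
        self.jitter = jitter      # fraction of random variation, 0.25 means +/-25%
        self.sleep_fn = sleep_fn  # injectable so the pacer can be tested without waiting
        self.rng = rng or random.Random()
        self._count = 0
        self._threshold = self._next_threshold()

    def _vary(self, value):
        # Spread the value uniformly within +/- jitter of its nominal size.
        return value * self.rng.uniform(1 - self.jitter, 1 + self.jitter)

    def _next_threshold(self):
        # Always allow at least one action between two breaks.
        return max(1, round(self._vary(self.break_after_n_actions)))

    def record_action(self):
        """Call once per like/follow/comment; returns the seconds slept (0.0 if none)."""
        self._count += 1
        if self._count < self._threshold:
            return 0.0
        pause = self._vary(self.sleep_delay)
        self.sleep_fn(pause)
        # Start a new cycle with a freshly randomized threshold.
        self._count = 0
        self._threshold = self._next_threshold()
        return pause
```

Because both the threshold and the pause are re-drawn after every break, two users with the same nominal settings would still not pause at the same moments.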

More randomness. For example, use time_util's sleep() rather than time.sleep() in all files, or at least time.sleep(random.uniform(a, b)) or similar.
For example: https://github.com/timgrossmann/InstaPy/pull/4814
InstaPy's default base sleeps leave a footprint whether or not they are somewhat random. I propose adding an easy way for users to modify all base delays/sleeps in InstaPy from quickstart.py, for example:

session = InstaPy(username, password)
session.modify_base_sleeps(multiplier=None) # default is 1
session.modify_base_sleeps(multiplier=0.75) # custom example

Maybe session.set_action_delays() can do it with a new parameter: set_action_delays(modify_base_sleeps=2). Or maybe set_sleep_percentage(percentage).
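
One way the proposed multiplier could work, sketched as standalone functions. Everything here is an assumption for illustration: InstaPy has no `modify_base_sleeps` today, and `scaled_delay`, `base_sleep` and the `spread` parameter are hypothetical names.

```python
import random
import time

# Hypothetical module-level state; InstaPy does not expose this.
_base_sleep_multiplier = 1.0


def modify_base_sleeps(multiplier=None):
    """Set the global multiplier for base sleeps; None restores the default of 1."""
    global _base_sleep_multiplier
    if multiplier is None:
        multiplier = 1.0
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    _base_sleep_multiplier = float(multiplier)


def scaled_delay(base_seconds, spread=0.25, rng=random):
    """Return base_seconds scaled by the multiplier, then varied by +/- spread."""
    scaled = base_seconds * _base_sleep_multiplier
    return rng.uniform(scaled * (1 - spread), scaled * (1 + spread))


def base_sleep(base_seconds, spread=0.25):
    """Stand-in for a hard-coded time.sleep(base_seconds) call."""
    time.sleep(scaled_delay(base_seconds, spread))
```

With this shape, every hard-coded `time.sleep(2)` in the code base would become `base_sleep(2)`, and a single call such as `modify_base_sleeps(0.75)` in quickstart.py would shift all of them at once.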

More human-like behavior:
- For 90% (or any custom percentage) of the multi-picture posts or _"carousels"_ that the bot views, likes or comments on, it should stop and look through all or several of the pictures, like people do.
- Scroll pace should be human-like: less abrupt, smoother and slightly slower.
- Currently, while scrolling, the bot stops and clicks on pictures to simulate human activity. This is great but should be customizable.
- Click timing should be human-like. For example, I found time.sleep(random.uniform(0.9, 1.47)), or its sibling time_util.sleep(1.185), to match the time I spend clicking something, moving the cursor and clicking again.
- The bot always opens instagram.com first, which I think is a huge footprint. If the cookies haven't expired and going through instagram.com at the beginning isn't necessary, the bot should randomly choose its first page from: own profile, the profile of an account the user follows, tags, locations, and instagram.com itself. The probabilities should be customizable.

Instead of always starting with only:

browser.get("https://www.instagram.com/")

if login isn't required, first choose between:

browser.get("https://www.instagram.com/")
browser.get("https://www.instagram.com/own_profile/") # user's own profile
browser.get("https://www.instagram.com/following_profile/") # random followed account
browser.get("https://www.instagram.com/explore/tags/following_tag/") # random followed tag
browser.get("https://www.instagram.com/explore/location/user_city/") # user's city
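
The customizable probabilities could be handled with a weighted choice, sketched below. `pick_start_url` and its `weights` argument are hypothetical names, and the URLs in the usage note are the placeholder paths from the list above, not verified Instagram routes.

```python
import random


def pick_start_url(candidates, weights=None, rng=random):
    """Pick one URL from `candidates` (name -> URL) using per-name weights.

    Names missing from `weights` default to a weight of 1.
    A weight of 0 disables that start page.
    """
    weights = weights or {}
    names = list(candidates)
    values = [float(weights.get(name, 1.0)) for name in names]
    if any(v < 0 for v in values) or sum(values) <= 0:
        raise ValueError("weights must be non-negative and not all zero")
    chosen = rng.choices(names, weights=values, k=1)[0]
    return candidates[chosen]
```

A caller would pass something like `{"home": "https://www.instagram.com/", "own_profile": "https://www.instagram.com/own_profile/"}` together with `weights={"home": 1, "own_profile": 3}`, then hand the result to `browser.get()`.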

What do you think? Any suggestions? Thanks for reading.


I can do most of these things myself, but I don't want to change all the code only to have the pull request ignored.

wontfix

jm-willy

👍10

All 20 comments

That is a good idea, but are you sure that randomizing timeouts alone will help us? I think other factors now contribute to bans; we first need to find all or most of them.
Also, I think it should be possible for every user who launches InstaPy to have their own randomized timeouts.

Mehran on 18 Aug 2019

@Mehran it isn't only timeout randomization. I listed several other things besides randomization, because randomness can help, but customization + randomness, as you said, is the way to go since it gives each user a more unique footprint. And yes, each user should be able to have randomized sleeps; this is already possible for some sleeps (like the sleep_delay parameter of some functions). But the _"base"_ sleeps are scattered all over the InstaPy code and cannot be easily customized, which is why I proposed the following function:

import random

session = InstaPy(username, password)
session.modify_base_sleeps(multiplier=0.75) # custom example
# custom random example:
session.modify_base_sleeps(multiplier=random.uniform(0.9, 2.5))

jm-willy on 18 Aug 2019

On that note, I think that for some requests it could be more interesting to access the page without Selenium, using just the requests library.
For example, set_relationship_bounds checks the user's page before liking a picture.
Using the requests library with the cookie/token could save some browser requests.

I think that for some requests it is even possible to avoid using the cookie/token.

maxxfly on 18 Aug 2019

Maybe the bot could generate a random key when it creates the workspace, and based on that key set all the time / delays / other stuff, giving every workspace a unique behavior.
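
A rough sketch of that idea, assuming the key lives in a small JSON file inside the workspace. The file name `behavior_key.json`, the function name and the value ranges are all hypothetical choices for illustration; InstaPy's workspace does not contain such a file.

```python
import json
import random
import secrets
from pathlib import Path


def load_or_create_profile(workspace_dir):
    """Return a per-workspace delay profile derived from a persistent random key."""
    key_file = Path(workspace_dir) / "behavior_key.json"  # hypothetical file name
    if key_file.exists():
        key = json.loads(key_file.read_text())["key"]
    else:
        key = secrets.token_hex(16)
        key_file.parent.mkdir(parents=True, exist_ok=True)
        key_file.write_text(json.dumps({"key": key}))

    # Seeding a private generator with the key gives the same profile on every run.
    rng = random.Random(key)
    return {
        # The ranges below are arbitrary placeholders, not tuned values.
        "base_sleep_multiplier": round(rng.uniform(0.8, 2.0), 3),
        "break_after_n_actions": rng.randint(7, 16),
        "sleep_delay": rng.randint(420, 900),
    }
```

The profile stays stable for a given workspace, so each account keeps a consistent "personality", while different workspaces end up with different settings.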

iperich on 19 Aug 2019

I think what should be added are redundant actions like:

- Opening multiple tabs with photos and just leaving them open.
- Writing a comment but not actually posting it. (I believe that's also being monitored on some sites.)
- Going back to a previous photo that's already been liked or opened before. Or opening it in a new tab and closing all tabs later at the end of the session.
- Clicking on the heart menu to show the number of likes when there's new interaction, or just randomly, as if expecting someone.
- Using that heart menu to go to users in that list and like their last photo/video, or just scroll through their photos.

VNRARA on 19 Aug 2019

👍6

Some other randomness that could be added:

- Opening the search tab and entering the username there to find them, instead of going directly to the link.
- When checking the profile of the user you want to follow, opening their followers/following tabs and opening photos on the profile.

FancyRobot on 19 Aug 2019

👍3

Yes, I agree. I think mirroring exactly how a person interacts would be helpful.
In case Instagram tracks mouse movement, could the bot also hover the mouse over objects at random?

masto182 on 19 Aug 2019

My 50 cents here.

- Browse the user's feed and like a few pictures every now and then.
- Scroll and like while exploring, instead of going directly to the media URL; this can be random.
- View the slides if the media is a carousel. The number viewed can be random too.
- Use the back button instead of just going to a link.
- Set referers for the links if you're liking multiple pictures from one profile.
- Maybe have an initial config where you can set up how the user normally starts using the app: whether they go to their profile and check the new followers, or just open the heart menu and check out new followers there.
- Having random sleep times for every user's bot will help a lot.
- Have some way to easily mix user interactions. Normally you don't search for something, follow 20 users, wait 30 minutes and then follow 20 more. That's not how a human interacts with Instagram.
- You can use Schedule to set up custom functions, but on Instagram people don't open the app and do the same thing over and over for a whole month. People open the app, like 10 posts in the feed and that's it; or they watch some stories; or they search for something, like 30 or 40 pictures really fast, then stop and return in an hour or so. People also just open the app, scroll, do 0 actions and close it.

Adding some human behavior between the actions will drastically help bypass the bot detection.
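
The mix of session styles described above could be drawn at random each time the bot starts, roughly as sketched here. The session names, weights and ranges are hypothetical placeholders loosely based on the examples in this comment (10 feed likes, a 30-40 like burst, zero-action visits); nothing here is an InstaPy API.

```python
import random

# Hypothetical session styles and their relative weights.
SESSION_TYPES = {
    "scroll_only": 3,   # open the app, scroll, do nothing
    "light_feed": 4,    # like a handful of feed posts
    "stories": 2,       # only watch stories
    "search_burst": 1,  # search, then like many posts quickly
}


def plan_session(rng=random):
    """Return a dict describing one randomly chosen session."""
    kind = rng.choices(list(SESSION_TYPES),
                       weights=list(SESSION_TYPES.values()), k=1)[0]
    if kind == "light_feed":
        likes = rng.randint(3, 10)
    elif kind == "search_burst":
        likes = rng.randint(30, 40)
    else:
        likes = 0  # scroll-only and stories sessions perform no likes
    return {
        "kind": kind,
        "likes": likes,
        # Placeholder gap before the next session, in minutes.
        "minutes_until_next": rng.randint(20, 180),
    }
```

A scheduler would call `plan_session()` before each run and dispatch to the matching InstaPy actions, instead of repeating one fixed routine.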

Josexv1 on 19 Aug 2019

👍6

Would it be possible to use Android emulation and automated app testing tools to simulate human-like behavior?

masto182 on 20 Aug 2019

👎3

@masto182 that is a completely different project. Another technology, etc.

maxxfly on 20 Aug 2019

> That is a good idea, but are you sure that randomizing timeouts alone will help us? I think other factors now contribute to bans; we first need to find all or most of them.
> Also, I think it should be possible for every user who launches InstaPy to have their own randomized timeouts.

Yes, it could just use something like:

import random
from datetime import datetime
random.seed(datetime.now().timestamp()) # seed() needs an int/float/str/bytes on Python 3.11+

ghost on 21 Aug 2019

These are amazing ideas!! What if we also added something like randomly backspacing when writing comments and rewriting them as well?
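
A small sketch of what typing with occasional corrections might look like, kept independent of any browser driver. `typing_events`, `replay`, the `"BACKSPACE"` marker and the delay ranges are hypothetical; the generator only produces (key, delay) pairs and does not send anything.

```python
import random
import string


def typing_events(text, typo_rate=0.05, rng=random):
    """Yield (key, delay_seconds) pairs that type `text` with occasional fixed typos."""
    for ch in text:
        if ch.isalpha() and rng.random() < typo_rate:
            # Type a random letter, pause a bit longer, then delete it.
            yield rng.choice(string.ascii_lowercase), rng.uniform(0.08, 0.30)
            yield "BACKSPACE", rng.uniform(0.20, 0.60)
        yield ch, rng.uniform(0.08, 0.30)


def replay(events):
    """Apply events to an empty buffer to check that the final text is intact."""
    buffer = []
    for key, _delay in events:
        if key == "BACKSPACE":
            if buffer:
                buffer.pop()
        else:
            buffer.append(key)
    return "".join(buffer)
```

With Selenium, a caller could sleep for each delay and send the key to the comment box, mapping the `"BACKSPACE"` marker to `Keys.BACKSPACE`.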

Flaunkerton2395 on 21 Aug 2019

👍1

These are all great ideas, but the more important question is how exactly Instagram detects bots. Does anyone have data on that?

macd2 on 22 Aug 2019

> These are all great ideas, but the more important question is how exactly Instagram detects bots. Does anyone have data on that?

To me the main reason is that they can see you are not using a phone to access their app. InstaPy should change the user agent, even though they did add something like that for us to use. The problem is that so many features are being added that you don't even know where to look to use some of them. I tried to change the user agent myself and couldn't, since I'm using the Docker version rather than the regular InstaPy.

ghost on 22 Aug 2019

I'm working with another library (private API) and came here because of similar problems. First of all, I have the impression that it's not only ML/big-data analysis: they have also started to blacklist accounts. While you stay below the radar you can use anything and do almost anything. I also have the impression that it is mostly related to mass liking; nothing else puts an account on this blacklist as quickly, to the point where automation is detected instantly and you have to wait 24h after a few actions.
I have a few accounts, and those that have been liking regularly for months now get feedback_required spam:true actionBlocked (without specifics) after 1-20 likes. Those that used automation sporadically and didn't go on a liking spree can still mass comment and mass like (from the same IP, with the same API).
A good idea is to collect all possible signs of non-human behavior and avoid them. My top list is the following (just randomizing ALL delays/numbers, even in the wide range of 0.8, doesn't help):

- always liking just the last post(s) of a user
- only liking without ever commenting
- similar comments (don't play with smileys, they're ignored; you need thousands of really different comments of different length and vocabulary to stay out of suspicion, checked and worked)
- only liking without ever fetching some other feeds in between (liked by, comments, followers or recommended posts/users)
- not scrolling carousels (mentioned here)

Doesn't it look like going through some feeds (like hashtag or geo) is safe while others are toxic (followers, followings)?

almarax on 22 Aug 2019

👍1

Another thing, maybe the worst: InstaPy uses GraphQL, probably the most obvious hint that an account is using a bot. See https://github.com/timgrossmann/instagram-profilecrawl/blob/refacotor_to_graphql/util/extractor.py and https://github.com/timgrossmann/InstaPy/issues/1073, and tell me that doesn't make it obvious for Instagram to spot us.

I'll summarize the ideas from the comments that I think are both better and simpler. Complex ones can be done later, once the easy ones are done. Other ideas can be done directly in quickstart.py, so I don't include those:

- from @VNRARA: opening multiple tabs with photos and just leaving them open
- from @Josexv1: using the back button instead of just going to a link
- from @FancyRobot: opening the search tab and entering the username there to find them, instead of going directly to the link

Edit, forgot this one:

- from @dandy27: use random.seed(datetime.now()), but instead I would do:

seed(repr(datetime.now()) + "username" + str(hash("password")) + str(randint(0, 100_000)) + repr(os.times()) + repr(sys.path) + repr(sys.platform))

I know it's fucked up, and that's what we need.

jm-willy on 23 Aug 2019

I have somewhat gotten around this by turning off the headless browser and opening tabs myself to random pages like Google and Yahoo.

This could be an interesting add-on. Previously my setup only lasted about 30 min, but with randomly going to other pages I am at 2 hours and counting. The InstaPy tab also works by itself and doesn't need to be the active tab in my Firefox session. I'm free to randomly go around Yahoo and the InstaPy tab continues to work.

Is it possible to add another tab to the InstaPy session that just goes around GitHub or a different website at random? That may be a simpler solution than gutting the entire program and rewriting it.

Flaunkerton2395 on 23 Aug 2019

😕1

> I have somewhat gotten around this by turning off the headless browser and opening tabs myself to random pages like Google and Yahoo.
>
> This could be an interesting add-on. Previously my setup only lasted about 30 min, but with randomly going to other pages I am at 2 hours and counting. The InstaPy tab also works by itself and doesn't need to be the active tab in my Firefox session. I'm free to randomly go around Yahoo and the InstaPy tab continues to work.
>
> Is it possible to add another tab to the InstaPy session that just goes around GitHub or a different website at random? That may be a simpler solution than gutting the entire program and rewriting it.

Does this still work for you? It worked once for me, for 5 hours, and then I was temporarily blocked again, even when I repeat the random page visits.

thematsanity on 29 Aug 2019

Add action delays for keystrokes (including backspace) during login to simulate how fast a user would type their username and password. Also, include and randomize the number of incorrect login attempts.

simonel on 2 Sep 2019

👍1

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.