Twint: [QUESTION] Is it possible to collect multiple users at the same time?

Created on 26 Jul 2018 · 4 Comments · Source: twintproject/twint

Is it possible to collect from multiple users in parallel? I've tried:

import twint
from multiprocessing import Process


def process(user):
    c = twint.Config()
    c.Username = user
    c.Since = '2018-07-01'
    c.Store_json = True
    c.Output = '{}.json'.format(user)
    twint.run.Search(c)


users = ['otojya', 'otsuichich']

for user in users:
    p = Process(target=process, args=(user,))
    p.start()
    p.join()

But it doesn't seem to run in parallel. It finishes collecting otojya and then it collects from otsuichich.
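
A likely cause in the snippet above: `p.join()` blocks until that process exits, and because it is called inside the loop, the second process is not started until the first has finished. A minimal sketch of the usual pattern, starting every worker first and joining afterwards, is shown below. `run_parallel` and its `worker_cls` parameter are hypothetical names introduced for illustration, and the thread does not confirm how twint itself behaves when run this way.

```python
import multiprocessing


def run_parallel(target, users, worker_cls=multiprocessing.Process):
    """Start one worker per user, then wait for all of them.

    worker_cls is a hypothetical knob: anything with the
    Process/Thread interface (target=, args=, start(), join()).
    """
    workers = [worker_cls(target=target, args=(user,)) for user in users]
    for w in workers:
        w.start()  # returns immediately, so every worker gets started
    for w in workers:
        w.join()   # only now block until each worker has finished
    return workers


# Usage with the names from the question, placed under an
# `if __name__ == '__main__':` guard:
#     run_parallel(process, users)
```

The only change from the original loop is that no `join()` happens until every `start()` has been called.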

New Feature question

All 4 comments

Hi @sshum00, the Pythonic way to run multiple scrapes concurrently requires some code changes.
This feature is not ready yet. For now, I suggest running separate scripts.

If you want to run many scraping sessions, I suggest writing a shell script (bash/sh/zsh/pwsh/etc.).

Thank you for your patience.

I think I can help here.

If you use threading instead of Process, you can achieve it.

I did something similar to run three searches in parallel: the same query, searched in three different languages.

https://github.com/twintproject/twint/issues/171
It works; you can run several different searches in parallel. You only have to adapt the code to search by different users.
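
The code from issue 171 is not reproduced in this thread, so the following is only a rough sketch of the threading idea, using the standard library's `ThreadPoolExecutor` rather than whatever that issue actually does. `run_in_threads` is a hypothetical helper name; the worker passed in could be the `process(user)` function from the question.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_threads(worker, items):
    """Call worker(item) for every item, each in its own thread.

    Returns the results in the same order as items.
    """
    # One thread per item; max(1, ...) keeps the pool valid for an empty list.
    with ThreadPoolExecutor(max_workers=max(1, len(items))) as pool:
        return list(pool.map(worker, items))


# Usage with the names from the question:
#     run_in_threads(process, users)
```

Threads tend to suit this kind of work because most of the time is spent waiting on the network. Whether a given twint version runs cleanly in several threads at once is an assumption to verify.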

Ah, thank you, I'll look into using your code. At the moment I'm using subprocess.Popen to handle it, but it's a little janky, since it actually runs the Twint.py file as opposed to using the module.
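
For reference, the subprocess.Popen workaround mentioned here could look roughly like the sketch below. `build_command` and `run_commands` are hypothetical helpers, the `Twint.py` path is taken from the comment above, and the CLI flags (`-u`, `--since`, `-o`, `--json`) are assumed from the twint command line, so check them against `--help` for your version.

```python
import subprocess
import sys


def build_command(user, script='Twint.py'):
    """Hypothetical helper: argv for one command-line scrape of `user`.

    The flags are assumptions about the twint CLI, not confirmed here.
    """
    return [sys.executable, script,
            '-u', user,
            '--since', '2018-07-01',
            '-o', '{}.json'.format(user),
            '--json']


def run_commands(commands):
    """Launch every command first, then wait; returns exit codes in order."""
    procs = [subprocess.Popen(cmd) for cmd in commands]  # all start here
    return [p.wait() for p in procs]                     # then wait for each


# Usage:
#     run_commands([build_command(u) for u in ['otojya', 'otsuichich']])
```

Each scrape runs in its own interpreter, so this is close to the "separate scripts" suggestion, just driven from Python instead of a shell.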

I am using this code with even more threads and it runs all of them in parallel.

For me it works, but maybe that is because I am not using the latest version on twitter...

On Fri, 27 Jul 2018 at 22:59, Shelby S notifications@github.com wrote:

@Nestor75 https://github.com/Nestor75 I tried out your code and it
doesn't seem to run in parallel. It finishes for one entry before scraping
another.
