Goaccess: Support multiple parsing threads

Created on 22 Feb 2016 · 19 Comments · Source: allinurl/goaccess

The ability to parse logs with multiple threads would be a great improvement.

enhancement

Most helpful comment

Multithreading would be a very useful feature.
I ran goaccess on a k8s cluster, and it takes a very long time to start when the pod is scheduled to another node.

All 19 comments

Would this be multiple threads when parsing one log? If it's for multiple logs, can you run multiple instances of goaccess?

The idea is that multiple threads parse a single log file.
A server with 32 cores currently needs almost an hour to parse a 2 GB log file. That would be much faster if all 32 cores were parsing the log file.

Are you using the on-disk storage?

Folks, is there any news here?

@freefd I haven't had the chance to look at this particular request. As soon as some progress is made, I'll post back.

Dear Gerardo,
We have a huge number of log files that contain about 60,400,000 requests. To speed up file processing, we would like to parse them with parallel goaccess processes. Currently, parsing ~200 files takes about 3 hours on our VM's hardware.
So, is it possible to use taskset to run a few concurrent goaccess processes that use on-disk storage?

@Blaar good question, you should be able to use taskset with goaccess. Give it a shot and let me know how it goes for you. The times that I've used taskset, I noticed that it would use only one core. I'd bump this on the to-do list, I still want to get the filters done first though.

@allinurl,
Ok, we'll check and get back to you. Thanks.

I tried to do:
taskset 0x00000001 goaccess
and
taskset 0x00000002 goaccess
Got:
taskset -cp 28057
pid 28057's current affinity list: 0
taskset -cp 28059
pid 28059's current affinity list: 1

But only one of them was working at any given time.

If it's for multiple logs, can you run multiple instances of goaccess?
With multiple results, how can I merge them into one HTML report?
And, as @d3f3kt said, my log file was 100 GB, but goaccess used just one of the server's 32 cores.

@toontong You could run multiple instances but you won't be able to merge the results. Definitely need to look into this request, #117 will make use of multiple threads, so this request could be part of that one as well. Stay tuned!

Why was this closed? I am also trying to use GoAccess on many logs and find it incredible that such a powerful piece of software is single-threaded.

@shaun-ba This is still open (see the label at the top of the page); issue 799 was closed.

Multiple parsing threads or multiple parsing projects...
I have daily access logs, from midnight to midnight, and goaccess works fine. Sometimes however I create hourly pages as well, using a loop like

for H in $(seq -w 0 23); do grep "20/Feb/2018:$H:" 2018-02-20_access.log | goaccess - >$H.html & done

24 goaccess processes start immediately, but they are executed sequentially (perhaps files like /tmp/-1mdb_hostnames.tcb and the bunch of other similar files lock the execution?). I will not use any analyzer other than goaccess, so I may get used to this. But it would be better to process them in parallel.

@gitqlt Are you using the same database files for the 24 processes? If they are all different, you could execute them all in parallel, and use different folder paths, i.e., --db-path <dir>.

Yep, I didn't pay attention to those database files... Then I created 24 subfolders, specified them with --db-path, and all the processes worked like a charm. Thank you for your help.
However, couldn't that be the default behaviour? Or couldn't there be an option, something like --par-procs <num> (with a default value of ... 24 ... for the lazy guy)?
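
For anyone landing here later, the workaround described above (one database folder per process) might look roughly like the sketch below. It assumes a goaccess build with on-disk storage that accepts --db-path <dir>, as mentioned in this thread. The function name hourly_reports and its three arguments are made up for illustration and are not part of goaccess.

```shell
#!/usr/bin/env bash
# Sketch only: run one goaccess process per hour, each with its own
# database folder so the processes do not block on shared db files.
# hourly_reports and its arguments are hypothetical, not a goaccess feature.
#   $1 = access log, $2 = day as it appears in the log (e.g. 20/Feb/2018),
#   $3 = parent folder for the per-hour database folders
hourly_reports() {
  local log="$1" day="$2" db_root="$3" H
  for H in $(seq -w 0 23); do
    # A separate folder per process, passed via --db-path.
    mkdir -p "$db_root/$H"
    grep "$day:$H:" "$log" | goaccess - --db-path "$db_root/$H" > "$H.html" &
  done
  # Block until all 24 background pipelines have finished.
  wait
}

# Example call (file name taken from the loop posted earlier):
# hourly_reports 2018-02-20_access.log "20/Feb/2018" ./db
```

This still produces 24 separate reports. It does not merge them into one, and it does not make a single goaccess process use more than one core.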

Any news about this one?

@balazsbaranyi still in the works. I need to address a few other issues on the to-do list before getting to this.
