go-ipfs: HTTP gateway freezes: no response code is ever returned

Created on 25 Mar 2020 · 10 comments · Source: ipfs/go-ipfs

Version information:

go-ipfs version: 0.4.23-6ce9a355f
Repo version: 7
System version: amd64/linux
Golang version: go1.14

Description:

I tried to list a directory via the local web gateway and never got a response in the browser. Doing the same via the ipfs command line, on the other hand, is very fast:

$ time ipfs ls QmdQogukw7icepHraCEgrqUiYRehqXbGiXkRXYX1i8rUZq > /dev/null
real    0m0,839s
user    0m0,191s
sys     0m0,312s

Using an (already cached) IPNS path is also very fast:

$ time ipfs ls /ipns/pkg.pacman.store/arch/x86_64/default/community > /dev/null
real    0m0,806s
user    0m0,197s
sys     0m0,299s

This wget request, however, connects but never gets a response code back (waited 2 minutes):

$ time wget http://127.0.0.1:8080/ipfs/QmdQogukw7icepHraCEgrqUiYRehqXbGiXkRXYX1i8rUZq

After a restart of the node this issue disappeared. Any idea how to debug this in the future?

kind/bug need/author-input

All 10 comments

Follow the guide in https://github.com/ipfs/go-ipfs/blob/master/docs/debug-guide.md. The goroutine dump will be especially helpful in figuring out where the request is stuck.
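
For reference, a minimal sketch of what that guide boils down to, assuming the daemon's API is listening on the default 127.0.0.1:5001 (adjust if your Addresses.API config differs):

$ curl -o ipfs.stacks  'http://127.0.0.1:5001/debug/pprof/goroutine?debug=2'   # goroutine dump
$ curl -o ipfs.cpuprof 'http://127.0.0.1:5001/debug/pprof/profile'             # ~30s CPU profile
$ curl -o ipfs.heap    'http://127.0.0.1:5001/debug/pprof/heap'                # heap profile

The goroutine dump is the one that shows where a stuck gateway request is blocked.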

@Stebalien

I had a similar occurrence, where the web API stalled for 2-3 minutes for no apparent reason.

I captured a goroutine dump for you while it happened (same computer/setup but a different IPFS version, this time built from the git repo: 729e0533f7e91c67c4c9c589bd7ab937145345b6).

ipfs.stacks.log
ipfs_config.txt

Could you take the rest of the dumps? One thing I'm noticing is that zap (the logging framework) appears to be blocking. Do you have debug logging enabled?

Yes, debug logging was on.

When it occurs the next time, I'll do all dumps :)
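
If zap's blocking turns out to matter, one way to rule it out is to drop the log level back down at runtime and see whether the stalls go away. A rough sketch (subsystem names can be listed with `ipfs log ls`; `all` targets every subsystem):

$ ipfs log ls                  # list available logging subsystems
$ ipfs log level all error     # reduce verbosity for all subsystems without restarting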

Well, my testing installation (0.5.0-dev-84724a360) has now gotten completely stuck. This time I started with a completely empty repo/block storage and just tried to follow the ipfs-websites cluster with the cluster follower.

This is what the cluster follower logged to the console before I terminated it:

output_hanging_ipfs.txt

Nothing else was running - just the cluster follower and the WebUI.

The ipfs log hasn't shown any errors (I haven't changed the log level to debug this time), and I cannot get any of the dumps from the API... the curls have been running for 10 minutes now with 0 bytes fetched.

Any idea how to debug this? :)
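
One fallback when the API itself is wedged and the pprof curls hang: the Go runtime dumps the stacks of all goroutines to the process's stderr when it receives SIGQUIT, so a trace can still be taken out-of-band. Note that this terminates the daemon, and the dump lands wherever the daemon's stderr goes (the terminal it was started from, or the journal when run under systemd). A sketch, using the daemon PID from the process listing below:

$ pgrep -f 'ipfs daemon'   # confirm the daemon's PID (243084 below)
$ kill -QUIT 243084        # Go prints all goroutine stacks to the daemon's stderr, then exits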

An overview of the processes:

$ pstree -acnp | grep ipfs | grep -v grep
  |   |   |   `-ipfs,324159 diag sys
  |   |   |       |-{ipfs},324160
  |   |   |       |-{ipfs},324161
  |   |   |       |-{ipfs},324162
  |   |   |       |-{ipfs},324163
  |   |   |       |-{ipfs},324164
  |   |   |       |-{ipfs},324165
  |   |   |       |-{ipfs},324166
  |   |   |       `-{ipfs},324167
  |-ipfs,243084 daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
  |   |-{ipfs},243085
  |   |-{ipfs},243086
  |   |-{ipfs},243087
  |   |-{ipfs},243088
  |   |-{ipfs},243089
  |   |-{ipfs},243090
  |   |-{ipfs},243091
  |   |-{ipfs},243092
  |   |-{ipfs},243207
  |   |-{ipfs},243297
  |   |-{ipfs},244563
  |   `-{ipfs},244564
$ ps -eLfa | grep -E "324159|324160|324161|324162|324163|324164|324166|324165|324167|243084|243085|243086|243087|243088|243089|243090|243091|243092|243207|243297|244563|244564|^UID" | grep -v "grep"
UID          PID    PPID     LWP  C NLWP STIME TTY          TIME CMD
ruben     243084       1  243084  0   13 06:28 ?        00:00:00 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  243085  3   13 06:28 ?        00:24:02 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  243086  0   13 06:28 ?        00:00:02 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  243087  0   13 06:28 ?        00:00:00 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  243088  0   13 06:28 ?        00:00:02 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  243089  0   13 06:28 ?        00:00:00 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  243090  0   13 06:28 ?        00:00:02 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  243091  0   13 06:28 ?        00:00:03 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  243092  0   13 06:28 ?        00:00:00 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  243207  0   13 06:29 ?        00:00:02 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  243297 98   13 06:30 ?        10:26:05 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  244563  0   13 06:35 ?        00:00:00 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     243084       1  244564  1   13 06:35 ?        00:12:12 /usr/bin/ipfs daemon --enable-pubsub-experiment --enable-namesys-pubsub --enable-mplex-experiment --enable-gc
ruben     324159  324140  324159  0    9 15:50 pts/13   00:00:00 ipfs diag sys
ruben     324159  324140  324160  0    9 15:50 pts/13   00:00:00 ipfs diag sys
ruben     324159  324140  324161  0    9 15:50 pts/13   00:00:00 ipfs diag sys
ruben     324159  324140  324162  0    9 15:50 pts/13   00:00:00 ipfs diag sys
ruben     324159  324140  324163  0    9 15:50 pts/13   00:00:00 ipfs diag sys
ruben     324159  324140  324164  0    9 15:50 pts/13   00:00:00 ipfs diag sys
ruben     324159  324140  324165  0    9 15:50 pts/13   00:00:00 ipfs diag sys
ruben     324159  324140  324166  0    9 15:50 pts/13   00:00:00 ipfs diag sys
ruben     324159  324140  324167  0    9 15:50 pts/13   00:00:00 ipfs diag sys

So just one thread is busy-looping; how can we get a stack trace of it? :)
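
One option that doesn't kill the process, sketched here on the assumption that gdb is installed: attach to the daemon and dump native backtraces of every thread, then look for the LWP that ps shows as busy (243297 above). The C-level frames won't name goroutines, but they usually reveal whether the thread is spinning in the Go runtime, stuck in a syscall, or inside cgo:

$ gdb -p 243084 -batch -ex 'info threads' -ex 'thread apply all bt' > ipfs_native_stacks.txt
$ grep -A 20 'LWP 243297' ipfs_native_stacks.txt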

I talked with ttocslliw on IRC; he assumed it's a bug in go-yamux and that go-libp2p 0.7.2 should have fixed it (my build is g84724a360).

I used SIGABRT to kill it; here's the stack trace:

ipfs_stacktrace.txt

So, there _was_ a bug in go-libp2p but it's unrelated (potentially causes streams to fail).

The bug here looks like https://github.com/golang/go/issues/38070, which will be fixed in Go 1.14.2.

If you want to work around the issue _now_, you can use https://github.com/libp2p/go-yamux/tree/fix/re-disable-write-coalescing. It doesn't fix the bug in go, but this branch reduces the likelihood that the bug will be triggered.
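
For anyone who wants to try that branch, a rough sketch of wiring it into a go-ipfs build via a go.mod replace directive (the ../go-yamux checkout path is just an example, and this assumes the module path github.com/libp2p/go-yamux):

$ git clone -b fix/re-disable-write-coalescing https://github.com/libp2p/go-yamux ../go-yamux
$ go mod edit -replace=github.com/libp2p/go-yamux=../go-yamux   # run inside the go-ipfs source tree
$ make build                                                    # rebuild ipfs against the patched yamux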

No, it's not needed; I just want to keep testing this build until it gets released, so I can point out bugs early. :)
