V2ray-core: 访问某些网址导致程序阻塞其他请求,大概1分15秒后恢复

Created on 11 Sep 2020  ·  12Comments  ·  Source: v2ray/v2ray-core

除非特殊情况,请完整填写所有问题。不按模板发的 issue 将直接被关闭。
如果你遇到的问题不是 V2Ray 的 bug,比如你不清楚要如何配置,请使用Discussion进行讨论。

1) 你正在使用哪个版本的 V2Ray?(如果服务器和客户端使用了不同版本,请注明)
V2Ray 4.27.5 (V2Fly, a community-driven edition of V2Ray.) Custom (go1.14.7 windows/amd64)

2) 你的使用场景是什么?比如使用 Chrome 通过 Socks/VMess 代理观看 YouTube 视频。
Edge (Chrome内核)浏览器使用插件SwitchyOmega,通过本地v2ray的socks代理(port 1080)访问https://p3terx.com/go/aHR0cHM6Ly9oZWxwLmdpdGh1Yi5jb20vZW4vZ2l0aHViL2F1dGhlbnRpY2F0aW5nLXRvLWdpdGh1Yi9jcmVhdGluZy1hLXBlcnNvbmFsLWFjY2Vzcy10b2tlbi1mb3ItdGhlLWNvbW1hbmQtbGluZQ==

3) 你看到的不正常的现象是什么?(请描述具体现象,比如访问超时,TLS 证书错误等)
请求被阻塞大概一分半钟,期间其他请求也被阻塞。请求该网址会有两次跳转。第一次跳转到主机help.github.com;第二次跳转到主机docs.github.com。随后程序开始阻塞请求,时间大概在21:06:17(见客户端访问日志第4行)。大概在21:07:30后程序可以正常处理请求(见客户端错误日志第33行)。

4) 你期待看到的正确表现是怎样的?
程序在任何情况下都不应该阻塞其他请求。

5) 请附上你的配置(提交 Issue 前请隐藏服务器端IP地址)。

服务器端配置:

    // 在这里附上服务器端配置文件
服务端配置和本问题关系不大,所以没有附上。

客户端配置:

    // 在这里附上客户端配置
{
    "log": {
        "loglevel": "debug",
        "access": "access.log",
        "error": "error.log"
    },
    "inbounds": [
        {
            "port": 1080,
            "protocol": "socks",
            "tag": "inTag1"
        },
        {
            "port": 1081,
            "protocol": "socks",
            "tag": "inTag2"
        },
        {
            "port": 8080,
            "protocol": "http",
            "tag": "inTag3"
        }
    ],
    "outbounds": [
        {
            "protocol": "http",
            "tag": "outTag1",
            "settings": {
                "servers": [
                    {
                        "address": "http-tag1.test.com",
                        "port": 8080
                    }
                ]
            }
        },
        {
            "protocol": "freedom",
            "tag": "outDirect"
        },
        {
            "protocol": "http",
            "tag": "outTag2",
            "settings": {
                "servers": [
                    {
                        "address": "10.10.1.1",
                        "port": 8080
                    }
                ]
            }
        },
        {
            "protocol": "vmess",
            "tag": "outTag3",
            "settings": {
                "vnext": [
                    {
                        "address": "vmess-tag3.test.com",
                        "port": 6000,
                        "users": [
                            {
                                "id": "xxxxxxxx-50f9-c2a2-xxxx-xxxxdccexxxx",
                                "alterId": 4
                            }
                        ]
                    }
                ]
            },
            "proxySettings": {
                "tag": "outTag1"
            }
        }
    ],
    "routing": {
        "rules": [
            {
                "type": "field",
                "inboundTag": "inTag1",
                "outboundTag": "outTag3"
            },
            {
                "type": "field",
                "ip": ["10.0.0.0/8"],
                "outboundTag": "outDirect"
            },
            {
                "type": "field",
                "ip": ["192.168.0.0/16"],
                "outboundTag": "outTag3"
            },
            {
                "type": "field",
                "domain": [
                    "geosite:google",
                    "geosite:github",
                    "geosite:stackexchange",
                    "geosite:sourceforge",
                    "geosite:wikimedia",
                    "geosite:medium",
                    "geosite:reddit",
                    "domain:wireshark.org",
                    "domain:3gpp.org",
                    "domain:openwrt.org"
                ],
                "outboundTag": "outTag2"
            }
        ]
    }
}

6) 请附上出错时软件输出的错误日志。在 Linux 中,日志通常在 /var/log/v2ray/error.log 文件中。

服务器端错误日志:

    // 在这里附上服务器端日志
服务端日志和本问题无关,所以没有附上。

客户端错误日志:

    // 在这里附上客户端日志
2020/09/11 21:06:12 [Debug] v2ray.com/core/app/log: Logger started
2020/09/11 21:06:12 [Debug] v2ray.com/core/app/proxyman/inbound: creating stream worker on 0.0.0.0:1080
2020/09/11 21:06:12 [Debug] v2ray.com/core/app/proxyman/inbound: creating stream worker on 0.0.0.0:1081
2020/09/11 21:06:12 [Debug] v2ray.com/core/app/proxyman/inbound: creating stream worker on 0.0.0.0:8080
2020/09/11 21:06:12 [Info] v2ray.com/core/transport/internet/tcp: listening TCP on 0.0.0.0:8080
2020/09/11 21:06:12 [Info] v2ray.com/core/transport/internet/tcp: listening TCP on 0.0.0.0:1080
2020/09/11 21:06:12 [Info] v2ray.com/core/transport/internet/tcp: listening TCP on 0.0.0.0:1081
2020/09/11 21:06:12 [Warning] v2ray.com/core: V2Ray 4.27.5 started
2020/09/11 21:06:14 [Info] [1989321405] v2ray.com/core/proxy/socks: TCP Connect request to tcp:p3terx.com:443
2020/09/11 21:06:14 [Info] [1989321405] v2ray.com/core/app/dispatcher: default route for tcp:p3terx.com:443
2020/09/11 21:06:14 [Info] [1989321405] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080
2020/09/11 21:06:15 [Info] [4210779375] v2ray.com/core/proxy/socks: TCP Connect request to tcp:cdn.zrahh.com:443
2020/09/11 21:06:15 [Info] [4210779375] v2ray.com/core/app/dispatcher: default route for tcp:cdn.zrahh.com:443
2020/09/11 21:06:15 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080
2020/09/11 21:06:17 [Info] [4000677990] v2ray.com/core/proxy/socks: TCP Connect request to tcp:help.github.com:443
2020/09/11 21:06:17 [Info] [4000677990] v2ray.com/core/app/dispatcher: taking detour [outTag2] for [tcp:help.github.com:443]
2020/09/11 21:06:17 [Info] [926572094] v2ray.com/core/proxy/socks: TCP Connect request to tcp:docs.github.com:443
2020/09/11 21:06:17 [Info] [926572094] v2ray.com/core/app/dispatcher: taking detour [outTag2] for [tcp:docs.github.com:443]
2020/09/11 21:06:19 [Info] [1318124717] v2ray.com/core/proxy/http: request to Method [GET] Host [192.168.0.1] with URL [http://192.168.0.1/]
2020/09/11 21:06:19 [Info] [1318124717] v2ray.com/core/app/dispatcher: taking detour [outTag3] for [tcp:192.168.0.1:80]
2020/09/11 21:06:19 [Debug] [1318124717] v2ray.com/core/app/proxyman/outbound: proxying to outTag1 for dest tcp:vmess-tag3.test.com:6000
2020/09/11 21:06:19 [Info] [1318124717] v2ray.com/core/proxy/vmess/outbound: tunneling request to tcp:192.168.0.1:80 via tcp:vmess-tag3.test.com:6000
2020/09/11 21:06:46 [Info] [4210779375] v2ray.com/core/app/proxyman/inbound: connection ends > v2ray.com/core/proxy/socks: connection ends > context canceled
2020/09/11 21:06:47 [Info] [1052261188] v2ray.com/core/proxy/socks: TCP Connect request to tcp:docs.github.com:443
2020/09/11 21:06:47 [Info] [1052261188] v2ray.com/core/app/dispatcher: taking detour [outTag2] for [tcp:docs.github.com:443]
2020/09/11 21:06:48 [Info] [4000677990] v2ray.com/core/app/proxyman/inbound: connection ends > v2ray.com/core/proxy/socks: connection ends > context canceled
2020/09/11 21:06:48 [Info] [926572094] v2ray.com/core/app/proxyman/inbound: connection ends > v2ray.com/core/proxy/socks: connection ends > context canceled
2020/09/11 21:07:17 [Info] [1395419795] v2ray.com/core/proxy/socks: TCP Connect request to tcp:edge.microsoft.com:80
2020/09/11 21:07:17 [Info] [1395419795] v2ray.com/core/app/dispatcher: default route for tcp:edge.microsoft.com:80
2020/09/11 21:07:18 [Info] [1052261188] v2ray.com/core/app/proxyman/inbound: connection ends > v2ray.com/core/proxy/socks: connection ends > context canceled
2020/09/11 21:07:22 [Info] [3499243683] v2ray.com/core/proxy/socks: TCP Connect request to tcp:docs.github.com:443
2020/09/11 21:07:22 [Info] [3499243683] v2ray.com/core/app/dispatcher: taking detour [outTag2] for [tcp:docs.github.com:443]
2020/09/11 21:07:30 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080
2020/09/11 21:07:30 [Info] [4000677990] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:10.10.1.1:8080
2020/09/11 21:07:30 [Info] [4000677990] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:10.10.1.1:8080
2020/09/11 21:07:30 [Info] [926572094] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:10.10.1.1:8080
2020/09/11 21:07:30 [Info] [1318124717] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080
2020/09/11 21:07:30 [Info] [1052261188] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:10.10.1.1:8080
2020/09/11 21:07:30 [Info] [1395419795] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080
2020/09/11 21:07:30 [Info] [1318124717] v2ray.com/core/app/proxyman/outbound: failed to process outbound traffic > v2ray.com/core/proxy/vmess/outbound: connection ends > context canceled
2020/09/11 21:07:30 [Info] [1318124717] v2ray.com/core/app/proxyman/outbound: failed to process outbound traffic > v2ray.com/core/proxy/http: connection ends > context canceled
2020/09/11 21:07:30 [Info] [3499243683] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:10.10.1.1:8080
2020/09/11 21:07:31 [Info] [926572094] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:10.10.1.1:8080
2020/09/11 21:07:31 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080
2020/09/11 21:07:31 [Info] [4000677990] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:10.10.1.1:8080
2020/09/11 21:07:31 [Info] [1052261188] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:10.10.1.1:8080
2020/09/11 21:07:31 [Info] [926572094] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:10.10.1.1:8080
2020/09/11 21:07:31 [Info] [1052261188] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:10.10.1.1:8080
2020/09/11 21:07:31 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080
2020/09/11 21:07:31 [Info] [4000677990] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:10.10.1.1:8080
2020/09/11 21:07:31 [Info] [1052261188] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:10.10.1.1:8080
2020/09/11 21:07:31 [Info] [926572094] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:10.10.1.1:8080
2020/09/11 21:07:31 [Info] [4000677990] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:10.10.1.1:8080
2020/09/11 21:07:31 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080
2020/09/11 21:07:31 [Info] [1052261188] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:10.10.1.1:8080
2020/09/11 21:07:31 [Info] [926572094] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:10.10.1.1:8080
2020/09/11 21:07:32 [Info] [4210779375] v2ray.com/core/app/proxyman/outbound: failed to process outbound traffic > v2ray.com/core/proxy/http: failed to find an available destination > v2ray.com/core/common/retry: [v2ray.com/core/proxy/http: Proxy responded with non 200 code: 503 Service Unavailable dial tcp: operation was canceled] > v2ray.com/core/common/retry: all retry attempts failed
2020/09/11 21:07:32 [Info] [4000677990] v2ray.com/core/app/proxyman/outbound: failed to process outbound traffic > v2ray.com/core/proxy/http: failed to find an available destination > v2ray.com/core/common/retry: [dial tcp 10.10.1.1:8080: operation was canceled] > v2ray.com/core/common/retry: all retry attempts failed
2020/09/11 21:07:32 [Info] [1052261188] v2ray.com/core/app/proxyman/outbound: failed to process outbound traffic > v2ray.com/core/proxy/http: failed to find an available destination > v2ray.com/core/common/retry: [dial tcp 10.10.1.1:8080: operation was canceled] > v2ray.com/core/common/retry: all retry attempts failed
2020/09/11 21:07:32 [Info] [926572094] v2ray.com/core/app/proxyman/outbound: failed to process outbound traffic > v2ray.com/core/proxy/http: failed to find an available destination > v2ray.com/core/common/retry: [dial tcp 10.10.1.1:8080: operation was canceled] > v2ray.com/core/common/retry: all retry attempts failed
2020/09/11 21:07:32 [Debug] v2ray.com/core/app/log: Logger closing

7) 请附上访问日志。在 Linux 中,日志通常在 /var/log/v2ray/access.log 文件中。

    // 在这里附上服务器端日志
服务端日志和本问题无关,所以没有附上。
// 客户端访问日志
2020/09/11 21:06:14 tcp:127.0.0.1:2967 accepted tcp:p3terx.com:443 [outTag1] 
2020/09/11 21:06:15 tcp:127.0.0.1:2969 accepted tcp:cdn.zrahh.com:443 [outTag1] 
2020/09/11 21:06:17 tcp:127.0.0.1:2973 accepted tcp:help.github.com:443 [outTag2] 
2020/09/11 21:06:17 tcp:127.0.0.1:2974 accepted tcp:docs.github.com:443 [outTag2] 
2020/09/11 21:06:19 127.0.0.1:2972 accepted http://192.168.0.1/ [outTag3] 
2020/09/11 21:06:47 tcp:127.0.0.1:2988 accepted tcp:docs.github.com:443 [outTag2] 
2020/09/11 21:07:17 tcp:127.0.0.1:3006 accepted tcp:edge.microsoft.com:80 [outTag1] 
2020/09/11 21:07:22 tcp:127.0.0.1:3009 accepted tcp:docs.github.com:443 [outTag2] 

8) 其它相关的配置文件(如 Nginx)和相关日志。

9) 如果 V2Ray 无法启动,请附上 --test 输出。

通常的命令为 /usr/bin/v2ray/v2ray --test --config /etc/v2ray/config.json。请按实际情况修改。

10) 如果 V2Ray 服务运行不正常,请附上 journal 日志。

通常的命令为 journalctl -u v2ray

请预览一下你填的内容再提交。

Most helpful comment


根据以上推测,便可以复盘日志中阻塞现象的过程了:

用户向socks inbound发出第一个有问题的请求(session id 4210779375)。请求被分发到http outbound (outTag1),产生阻塞。

2020/09/11 21:06:15 [Info] [4210779375] v2ray.com/core/proxy/socks: TCP Connect request to tcp:cdn.zrahh.com:443
2020/09/11 21:06:15 [Info] [4210779375] v2ray.com/core/app/dispatcher: default route for tcp:cdn.zrahh.com:443
2020/09/11 21:06:15 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080

过了大约30秒,socks inbound没有收到响应,认定为超时,关闭了连接。

2020/09/11 21:06:46 [Info] [4210779375] v2ray.com/core/app/proxyman/inbound: connection ends > v2ray.com/core/proxy/socks: connection ends > context canceled
> socks inbound超时相关的代码如下,它会按Policy配置的`connIdle`项中设定的时间设置超时: > https://github.com/v2ray/v2ray-core/blob/7a5079496423d5dda61726299eea402e48c97127/proxy/socks/server.go#L147-L149

尽管inbound主动关闭了连接,但outbound仍在堵塞,直到1分钟后503错误返回,然后继续尝试dialing。

2020/09/11 21:07:30 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080
2020/09/11 21:07:31 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080
2020/09/11 21:07:31 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080
2020/09/11 21:07:31 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080

此后的dialing operation,由于连接的context已经被socks inbound取消了,因此全部失败。

2020/09/11 21:07:32 [Info] [4210779375] v2ray.com/core/app/proxyman/outbound: failed to process outbound traffic > v2ray.com/core/proxy/http: failed to find an available destination > v2ray.com/core/common/retry: [v2ray.com/core/proxy/http: Proxy responded with non 200 code: 503 Service Unavailable dial tcp: operation was canceled] > v2ray.com/core/common/retry: all retry attempts failed
> `operation was canceled`与 `context cancelled`是同一个错误,只是在dial的过程中被重命名了一下: > ``` > errCanceled = errors.New("operation was canceled") > switch err { > case context.Canceled: > return errCanceled > } > ```

其他连接因为http outbound被一直堵塞,直到503错误返回后,才可尝试dialing,但由于连接早已被各自的inbound取消而失败。

以vmess outbound的连接为例:

2020/09/11 21:06:19 [Info]  [1318124717] v2ray.com/core/proxy/http: request to Method [GET] Host [192.168.0.1] with URL [http://192.168.0.1/]
2020/09/11 21:06:19 [Info]  [1318124717] v2ray.com/core/app/dispatcher: taking detour [outTag3] for [tcp:192.168.0.1:80]
2020/09/11 21:06:19 [Debug] [1318124717] v2ray.com/core/app/proxyman/outbound: proxying to outTag1 for dest tcp:vmess-tag3.test.com:6000
2020/09/11 21:06:19 [Info]  [1318124717] v2ray.com/core/proxy/vmess/outbound: tunneling request to tcp:192.168.0.1:80 via tcp:vmess-tag3.test.com:6000

# 503返回,4210779375开始尝试第二次dialing
2020/09/11 21:07:30 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080

2020/09/11 21:07:30 [Info] [1318124717] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080
2020/09/11 21:07:30 [Info] [1318124717] v2ray.com/core/app/proxyman/outbound: failed to process outbound traffic > v2ray.com/core/proxy/vmess/outbound: connection ends > context canceled
2020/09/11 21:07:30 [Info] [1318124717] v2ray.com/core/app/proxyman/outbound: failed to process outbound traffic > v2ray.com/core/proxy/http: connection ends > context canceled
对session id 1318124717,在`21:06:19`发出请求后,一直无响应,直到`2020/09/11 21:07:30` 503返回之后,才开始第一次dialing。

All 12 comments

2020/09/11 21:07:32 [Info] [4210779375] v2ray.com/core/app/proxyman/outbound: failed to process outbound traffic > v2ray.com/core/proxy/http: failed to find an available destination > v2ray.com/core/common/retry: [v2ray.com/core/proxy/http: Proxy responded with non 200 code: 503 Service Unavailable dial tcp: operation was canceled] > v2ray.com/core/common/retry: all retry attempts failed

可以关注下这个

意思是访问cdn.zrahh.com这个域名引起的问题吗?我有个疑问,为啥这个请求出问题会阻塞其他请求呢?后面所有的请求都被阻塞了。

今天又测试了一下,发现阻塞行为只发生在非freedom协议的outbound上。比如,我在阻塞发生后,用curl分别发起了两个请求,一个请求出口协议是freedom,另一个请求出口协议是vmess。

出口协议是freedom的请求:

$ curl -i --proxy http://127.0.0.1:8080 http://10.85.55.218/ && date
HTTP/1.0 302 Moved Temporarily
Cache-Control: no-cache
Connection: keep-alive
Keep-Alive: timeout=4
Location: https://cpathxxx.xxx.net:6082/php/uid.php?vsys=1&rule=0&url=http://10.85.55.218%2f
Pragma: no-cache
Proxy-Connection: keep-alive
Content-Length: 0

Sat, Sep 12, 2020 15:58:28

在请求发出后,响应立即返回。

出口协议是vmess的请求:

$ curl -i --proxy http://127.0.0.1:8080 http://192.168.0.1/ && date
HTTP/1.1 200 OK
Content-Length: 522
Connection: keep-alive
Content-Type: text/html
Date: Sat, 12 Sep 2020 07:59:40 GMT
Etag: "454-20a-5ec031b4"
Keep-Alive: timeout=4
Last-Modified: Sat, 16 May 2020 18:32:20 GMT
Proxy-Connection: keep-alive

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Cache-Control" content="no-cache, no-store, must-revalidate" />
<meta http-equiv="refresh" content="0; URL=cgi-bin/luci/" />
</head>
<body style="background-color: white">
<a style="color: black; font-family: arial, helvetica, sans-serif;" href="cgi-bin/luci/">LuCI - Lua Configuration Interface</a>
</body>
</html>
Sat, Sep 12, 2020 15:59:40

响应在阻塞解除后才返回。

两个请求差不多是同时发出,但返回时间相差了一分多钟。阻塞解除应该是在日志all retry attempts failed产生后,对比了下时间,阻塞时间大概持续1分15秒。

@Vigilans 你有看法吗

@kslr 这个Issue相关的模块代码我都还是第一次看。。。调查了下是http outbound有问题,应该transfer到v2fly去。(让我看是觉得跟DNS或routing有关嘛 :cry: )


根据这两行:

[1318124717] v2ray.com/core/app/dispatcher: taking detour [outTag3] for [tcp:192.168.0.1:80]
[1318124717] v2ray.com/core/app/proxyman/outbound: proxying to outTag1 for dest tcp:vmess-tag3.test.com:6000

可以注意到vmess outbound是使用了http outbound outTag1作proxy。据此在JSON配置中找到了对应项:

{
    "protocol": "vmess",
    "tag": "outTag3",
    // ...
    "proxySettings": {
        "tag": "outTag1"
    }
}

看最后几行的报错,全部都源自proxy/http

[4210779375] v2ray.com/core/app/proxyman/outbound: failed to process outbound traffic > v2ray.com/core/proxy/http: failed to find an available destination
[4000677990] v2ray.com/core/app/proxyman/outbound: failed to process outbound traffic > v2ray.com/core/proxy/http: failed to find an available destination
[1052261188] v2ray.com/core/app/proxyman/outbound: failed to process outbound traffic > v2ray.com/core/proxy/http: failed to find an available destination
[926572094]  v2ray.com/core/app/proxyman/outbound: failed to process outbound traffic > v2ray.com/core/proxy/http: failed to find an available destination

可以推测问题的源头或许在http outbound,与vmess并无关系。


再看最后几行的报错内容:

v2ray.com/core/common/retry: [v2ray.com/core/proxy/http: Proxy responded with non 200 code: 503 Service Unavailable dial tcp: operation was canceled] > v2ray.com/core/common/retry: all retry attempts failed
v2ray.com/core/common/retry: [dial tcp 10.10.1.1:8080: operation was canceled] > v2ray.com/core/common/retry: all retry attempts failed
v2ray.com/core/common/retry: [dial tcp 10.10.1.1:8080: operation was canceled] > v2ray.com/core/common/retry: all retry attempts failed
v2ray.com/core/common/retry: [dial tcp 10.10.1.1:8080: operation was canceled] > v2ray.com/core/common/retry: all retry attempts failed

common/retry在http outbound中的相关源码:
https://github.com/v2ray/v2ray-core/blob/e8a27648a1e35cf8317c48c811f947a72b599c85/proxy/http/client.go#L93-L105
可以看到,在failed to find an available destination错误产生之前,setUpHTTPTunnel会尝试5次(Dial就发生在这里),这与日志中一个连接尝试Dial的次数一致:

2020/09/11 21:06:15 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080
2020/09/11 21:07:30 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080
2020/09/11 21:07:31 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080
2020/09/11 21:07:31 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080
2020/09/11 21:07:31 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080

上面日志里的时间十分值得关注。对session id 4210779375,第一次dialing发生在阻塞前,此后4次dialing都在1分15秒阻塞完毕之后了。但是其他连接,如session id 4000677990,并不是这样:

2020/09/11 21:07:30 [Info] [4000677990] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:10.10.1.1:8080
2020/09/11 21:07:30 [Info] [4000677990] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:10.10.1.1:8080
2020/09/11 21:07:31 [Info] [4000677990] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:10.10.1.1:8080
2020/09/11 21:07:31 [Info] [4000677990] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:10.10.1.1:8080
2020/09/11 21:07:31 [Info] [4000677990] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:10.10.1.1:8080

所有dialing发生在堵塞之后。再次观察这两个连接各自的报错:
(Note: 其实以下错误是个数组,每行总共应该有5个错误,但是common/retry把相邻的相同错误给合并了)

4210779375: [v2ray.com/core/proxy/http: Proxy responded with non 200 code: 503 Service Unavailable dial tcp: operation was canceled]
4000677990: [dial tcp 10.10.1.1:8080: operation was canceled] # 4000677990是其他连接的代表

根据以上日志可以推测:

  • 所有堵塞之后发生的错误都是operation was canceled
  • 是第一个503 Service Unavailable错误造成了堵塞。也即第一次尝试连接http-tag1.test.com时,过了1分15秒才返回一个503 Service Unavailable,在此期间的所有dialing全部被堵塞,503返回后,所有dialing operation都已经取消了。

根据以上推测,便可以复盘日志中阻塞现象的过程了:

用户向socks inbound发出第一个有问题的请求(session id 4210779375)。请求被分发到http outbound (outTag1),产生阻塞。

2020/09/11 21:06:15 [Info] [4210779375] v2ray.com/core/proxy/socks: TCP Connect request to tcp:cdn.zrahh.com:443
2020/09/11 21:06:15 [Info] [4210779375] v2ray.com/core/app/dispatcher: default route for tcp:cdn.zrahh.com:443
2020/09/11 21:06:15 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080

过了大约30秒,socks inbound没有收到响应,认定为超时,关闭了连接。

2020/09/11 21:06:46 [Info] [4210779375] v2ray.com/core/app/proxyman/inbound: connection ends > v2ray.com/core/proxy/socks: connection ends > context canceled
> socks inbound超时相关的代码如下,它会按Policy配置的`connIdle`项中设定的时间设置超时: > https://github.com/v2ray/v2ray-core/blob/7a5079496423d5dda61726299eea402e48c97127/proxy/socks/server.go#L147-L149

尽管inbound主动关闭了连接,但outbound仍在堵塞,直到1分钟后503错误返回,然后继续尝试dialing。

2020/09/11 21:07:30 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080
2020/09/11 21:07:31 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080
2020/09/11 21:07:31 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080
2020/09/11 21:07:31 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080

此后的dialing operation,由于连接的context已经被socks inbound取消了,因此全部失败。

2020/09/11 21:07:32 [Info] [4210779375] v2ray.com/core/app/proxyman/outbound: failed to process outbound traffic > v2ray.com/core/proxy/http: failed to find an available destination > v2ray.com/core/common/retry: [v2ray.com/core/proxy/http: Proxy responded with non 200 code: 503 Service Unavailable dial tcp: operation was canceled] > v2ray.com/core/common/retry: all retry attempts failed
> `operation was canceled`与 `context cancelled`是同一个错误,只是在dial的过程中被重命名了一下: > ``` > errCanceled = errors.New("operation was canceled") > switch err { > case context.Canceled: > return errCanceled > } > ```

其他连接因为http outbound被一直堵塞,直到503错误返回后,才可尝试dialing,但由于连接早已被各自的inbound取消而失败。

以vmess outbound的连接为例:

2020/09/11 21:06:19 [Info]  [1318124717] v2ray.com/core/proxy/http: request to Method [GET] Host [192.168.0.1] with URL [http://192.168.0.1/]
2020/09/11 21:06:19 [Info]  [1318124717] v2ray.com/core/app/dispatcher: taking detour [outTag3] for [tcp:192.168.0.1:80]
2020/09/11 21:06:19 [Debug] [1318124717] v2ray.com/core/app/proxyman/outbound: proxying to outTag1 for dest tcp:vmess-tag3.test.com:6000
2020/09/11 21:06:19 [Info]  [1318124717] v2ray.com/core/proxy/vmess/outbound: tunneling request to tcp:192.168.0.1:80 via tcp:vmess-tag3.test.com:6000

# 503返回,4210779375开始尝试第二次dialing
2020/09/11 21:07:30 [Info] [4210779375] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080

2020/09/11 21:07:30 [Info] [1318124717] v2ray.com/core/transport/internet/tcp: dialing TCP to tcp:http-tag1.test.com:8080
2020/09/11 21:07:30 [Info] [1318124717] v2ray.com/core/app/proxyman/outbound: failed to process outbound traffic > v2ray.com/core/proxy/vmess/outbound: connection ends > context canceled
2020/09/11 21:07:30 [Info] [1318124717] v2ray.com/core/app/proxyman/outbound: failed to process outbound traffic > v2ray.com/core/proxy/http: connection ends > context canceled
对session id 1318124717,在`21:06:19`发出请求后,一直无响应,直到`2020/09/11 21:07:30` 503返回之后,才开始第一次dialing。


再来看哪里导致了http outbound的堵塞。被retry 5次的操作中:
https://github.com/v2ray/v2ray-core/blob/e8a27648a1e35cf8317c48c811f947a72b599c85/proxy/http/client.go#L93-L105

只有setUpHTTPTunnel会返回错误。

setUpHTTPTunnel中,会返回Proxy responded with non 200 code:错误的,只有connectHTTP1connectHTTP2两个函数,在这个issue里自然是connectHTTP1
https://github.com/v2ray/v2ray-core/blob/e8a27648a1e35cf8317c48c811f947a72b599c85/proxy/http/client.go#L153-L178

setUpHTTPTunnel中,调用connectHTTP1的位置是:
https://github.com/v2ray/v2ray-core/blob/e8a27648a1e35cf8317c48c811f947a72b599c85/proxy/http/client.go#L246-L248

注意到调用完connectHTTP1后函数直接就return了。因此当前函数的阻塞应该是发生在connectHTTP1中,如http.ReadResponse函数(不太了解相关函数的阻塞性)。

再来看dialing发生的位置,也在setUpHTTPTunnel中:
https://github.com/v2ray/v2ray-core/blob/e8a27648a1e35cf8317c48c811f947a72b599c85/proxy/http/client.go#L212-L230

这里发现一个会导致其他连接阻塞的东西了:cachedH2Mutex。这个锁在dialing前被锁上,且只有一个defer语句解锁。

这就是说,直到上一个setUpHTTPTunnel返回之前,后续的setUpHTTPTunnel调用都无法dialing。而在这个issue里,第一个setUpHTTPTunnel因为connectHTTP1而被阻塞了,未能退出,因此后续的连接都被阻塞了。

这个锁的相关代码来自https://github.com/v2fly/v2ray-core/commit/d29370a6544177ff67ff0c7b32af27a0a48da44a ,PR https://github.com/v2fly/v2ray-core/pull/18

给大佬认真点赞

@darhwa 能来看下这个问题吗

之前用了个粒度很粗的锁,比较欠考虑。已经提交补丁。

给三位大佬点赞,效率如此之高👍

厉害了

Was this page helpful?
0 / 5 - 0 ratings