Geth version: 1.8.6-stable
OS & Version: Ubuntu 16.04.2
Expected behaviour: Geth should start syncing the blockchain.
Actual behaviour: Geth crashes with a segmentation violation.
Steps to reproduce: No specific steps. Simply starting Geth.
````
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xa6faab]
goroutine 83 [running]:
github.com/ethereum/go-ethereum/eth/filters.(*EventSystem).eventLoop(0xc420662940)
/build/ethereum-LEwR9U/ethereum-1.8.6+build13246+xenial/build/_workspace/src/github.com/ethereum/go-ethereum/eth/filters/filter_system.go:434 +0x2eb
created by github.com/ethereum/go-ethereum/eth/filters.NewEventSystem
/build/ethereum-LEwR9U/ethereum-1.8.6+build13246+xenial/build/_workspace/src/github.com/ethereum/go-ethereum/eth/filters/filter_system.go:113 +0x104
````
Are you starting geth with RPC enabled?
We get this error if we start geth like this:
./geth --rpc --rpcaddr [redacted1]
But if we start it with a different IP address then it works fine!
./geth --rpc --rpcaddr [redacted2]
Fixed! (See below)
This is happening on both 1.8.3 and 1.8.7, running on CentOS.
@joeysino I'm using WebSockets, not RPC. Are you using AWS by any chance?
@AyushG3112 Yes we are also using AWS. It seems to happen with either --rpc or with --ws.
Oh I see my mistake. --rpcaddr should be given the local machine's IP address, not the remote machine's IP address. (That is for the firewall to handle.)
So that's why it exploded when I changed IP address.
@joeysino From what I've been able to diagnose: the private IP of your AWS instance is assigned to the instance itself, so you can bind to it. The public and Elastic IPs are allocated to the AWS NATs, which forward your requests, so if you try binding to one of them, geth crashes.
The same thing happened with our other nodes: they were not able to start servers on the public or Elastic IP.
Yes that's right. AWS instances have multiple IPs, internal and external, and for this binding we need to use the internal one. (Which can be found from ifconfig.)
So I think we solved our problem at least. But perhaps geth could provide a better error message when the IP cannot be bound to.
Possible message: geth cannot bind to IP 20.30.40.50. Bind IP must be an interface on the local machine
I'm not sure I follow... The original report is about a segfault -- under what circumstances does the segfault occur?
Afaik, when trying to bind to a non-existing address an error message is shown that the address is unavailable. Are you saying it segfaults instead?
@holiman Yes. When binding to an unavailable IP, Geth throws SIGSEGV. Removing the IP or changing it to an available one works perfectly. I can reproduce it every time.