Version: v1.0.3-dev
Hi,
We are having trouble with our three-server cluster. In a recent test (started ~18:00 on 27/02/2018), one of our servers (server-1) appeared to use far more resources than the others. They all run on separate nodes (4 cores, 16 GB RAM) that are managed through Kubernetes.
dgraph-server-0 graphs:

dgraph-server-0 logs:
dgraph-server-0_logs.txt
dgraph-server-1 graphs:

dgraph-server-1 logs:
dgraph-server-1_logs.txt
dgraph-server-2 graphs:

dgraph-server-2 logs:
dgraph-server-2_logs.txt
dgraph-zero-0 logs:
dgraph-zero-0_logs.txt
What I notice is that server-1 seems to start at 18:05, server-0 at 18:25 and server-2 at 18:50, even though they were all deployed at the same time. Is it possible that server-1 took all the load initially and that Zero could not rebalance? (I see predicate move errors in Zero's logs.)
@jimanvlad
Most probably your queries are using the predicates present on group 1. Zero balances groups based on size.
Can you please share the Zero logs? (Maybe a predicate move failed.)
The zero logs are attached in the first post.
Thanks,
Vlad
I have been looking at the logs here. One strange thing that I see is:
Groups sorted by size: [{gid:1 size:57783} {gid:3 size:38972087} {gid:2 size:221618040}]
2018/02/27 19:41:17 tablet.go:170: size_diff 221560257
2018/02/27 19:41:17 tablet.go:87: Going to move predicate _predicate_ from 2 to 1
2018/02/27 20:01:17 tablet.go:91: Error while trying to move predicate _predicate_ from 2 to 1: rpc error: code = DeadlineExceeded desc = context deadline exceeded
So _predicate_ could have been at most around 221 MB, and it couldn't be moved in 20 minutes (the timeout for a predicate move), which makes me wonder whether your nodes can communicate with each other. Are your queries which touch predicates on multiple machines working fine? How and where is your Kubernetes cluster set up? I have tried replicating this in a cluster but haven't been able to.
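For reference, the size_diff value in these logs matches the gap between the largest and the smallest group. The following is a minimal Python sketch under that assumption; the function names are illustrative and are not part of Dgraph.

```python
import re


def parse_group_sizes(line):
    """Extract {gid: size} from a 'Groups sorted by size' log line."""
    return {int(gid): int(size)
            for gid, size in re.findall(r"gid:(\d+) size:(\d+)", line)}


def size_diff(sizes):
    """Largest group size minus smallest (assumed meaning of size_diff)."""
    return max(sizes.values()) - min(sizes.values())


line = ("Groups sorted by size: [{gid:1 size:57783} "
        "{gid:3 size:38972087} {gid:2 size:221618040}]")
sizes = parse_group_sizes(line)
print(size_diff(sizes))  # 221560257, the value logged by tablet.go:170
```

Running the same calculation on the later "Groups sorted by size" lines in this thread reproduces their logged size_diff values as well.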
Another thing is that the server which gets a schema update, mutation or query for a predicate ends up serving it. I see that server 1 received a bunch of schema updates initially and hence ended up serving all those predicates. So you might want to randomize that a bit to distribute the load equally at the start.
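One way to randomize the initial assignment on the client side is to shuffle the predicates and deal them out round-robin, so that each server receives the first schema update for a similar number of predicates. This is only a sketch: the server names are the pod names from this thread, the extra predicate names are made up, and sending the schema update itself is left to whichever client is in use.

```python
import random


def assign_predicates(predicates, servers, seed=None):
    """Shuffle predicates, then deal them out round-robin across servers."""
    rng = random.Random(seed)
    shuffled = list(predicates)
    rng.shuffle(shuffled)
    plan = {server: [] for server in servers}
    for i, predicate in enumerate(shuffled):
        plan[servers[i % len(servers)]].append(predicate)
    return plan


servers = ["dgraph-server-0", "dgraph-server-1", "dgraph-server-2"]
# The first three names appear in this thread; the last two are hypothetical.
predicates = ["_ip_xid", "_programme_created", "_elysium_account_xid",
              "example_a", "example_b"]
for server, assigned in assign_predicates(predicates, servers).items():
    print(server, assigned)
```

Note that this only evens out the number of predicates per server. Zero balances on size, so groups can still end up unequal if a few predicates hold most of the data.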
Hi @pawanrawal, thanks for coming back.
I am using the following deployment in Kubernetes:
########## Services
##### Public
# Zero Public
apiVersion: v1
kind: Service
metadata:
  name: dgraph-zero-public
  labels:
    app: dgraph-zero
spec:
  type: LoadBalancer
  ports:
  - port: 5080
    targetPort: 5080
    name: zero-grpc
    nodePort: 30006
  - port: 6080
    targetPort: 6080
    name: zero-http
    nodePort: 30005
  selector:
    app: dgraph-zero
---
# Server Public
apiVersion: v1
kind: Service
metadata:
  name: dgraph-server-public
  labels:
    app: dgraph-server
spec:
  type: LoadBalancer
  ports:
  - port: 8080
    targetPort: 8080
    name: server-http
    nodePort: 30003
  - port: 9080
    targetPort: 9080
    name: server-grpc
    nodePort: 30004
  selector:
    app: dgraph-server
---
# Ratel Public
apiVersion: v1
kind: Service
metadata:
  name: dgraph-ratel-public
  labels:
    app: dgraph-ratel
spec:
  type: LoadBalancer
  ports:
  - port: 8000
    targetPort: 8000
    name: ratel-http
    nodePort: 30007
  selector:
    app: dgraph-ratel
---
##### Headless
# Zero Headless
apiVersion: v1
kind: Service
metadata:
  name: dgraph-zero
  labels:
    app: dgraph-zero
spec:
  ports:
  - port: 5080
    targetPort: 5080
    name: zero-grpc
  clusterIP: None
  selector:
    app: dgraph-zero
---
# Server Headless
apiVersion: v1
kind: Service
metadata:
  name: dgraph-server
  labels:
    app: dgraph-server
spec:
  ports:
  - port: 7080
    targetPort: 7080
    name: server-grpc-int
  clusterIP: None
  selector:
    app: dgraph-server
---
##### Specific
# Server 0
apiVersion: v1
kind: Service
metadata:
  name: dgraph-server-0-http-public
  labels:
    app: dgraph-server
spec:
  type: LoadBalancer
  ports:
  - port: 9080
    targetPort: 9080
    name: server-http
    nodePort: 30011
  selector:
    statefulset.kubernetes.io/pod-name: dgraph-server-0
---
# Server 1
apiVersion: v1
kind: Service
metadata:
  name: dgraph-server-1-http-public
  labels:
    app: dgraph-server
spec:
  type: LoadBalancer
  ports:
  - port: 9080
    targetPort: 9080
    name: server-http
    nodePort: 30012
  selector:
    statefulset.kubernetes.io/pod-name: dgraph-server-1
---
# Server 2
apiVersion: v1
kind: Service
metadata:
  name: dgraph-server-2-http-public
  labels:
    app: dgraph-server
spec:
  type: LoadBalancer
  ports:
  - port: 9080
    targetPort: 9080
    name: server-http
    nodePort: 30013
  selector:
    statefulset.kubernetes.io/pod-name: dgraph-server-2
---
########## StatefulSets
# Zero - StatefulSet
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: dgraph-zero
spec:
  selector:
    matchLabels:
      app: dgraph-zero
  serviceName: "dgraph-zero"
  replicas: 1
  template:
    metadata:
      labels:
        app: dgraph-zero
        type: zero
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: type
                operator: In
                values:
                - zero
            topologyKey: kubernetes.io/hostname
      containers:
      - name: zero
        image: dgraph/dgraph:latest
        ports:
        - containerPort: 6080
          name: zero-http
        - containerPort: 5080
          name: zero-grpc
        volumeMounts:
        - name: datadir
          mountPath: /dgraph
        command:
        - bash
        - "-c"
        - |
          set -ex
          dgraph zero --my=$(hostname -f):5080
      terminationGracePeriodSeconds: 60
      volumes:
      - name: datadir
        hostPath:
          path: /opt/dgraph
---
# Server - StatefulSet
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: dgraph-server
spec:
  selector:
    matchLabels:
      app: dgraph-server
  serviceName: "dgraph-server"
  replicas: 3
  template:
    metadata:
      labels:
        app: dgraph-server
        type: server
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: type
                operator: In
                values:
                - server
            topologyKey: kubernetes.io/hostname
      containers:
      - name: server
        image: dgraph/dgraph:latest
        ports:
        - containerPort: 7080
          name: server-grpc-int
        - containerPort: 8080
          name: server-http
        - containerPort: 9080
          name: server-grpc
        volumeMounts:
        - name: datadir
          mountPath: /dgraph
        command:
        - bash
        - "-c"
        - |
          set -ex
          dgraph server --my=$(hostname -f):7080 --memory_mb 8192 --zero dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080
      terminationGracePeriodSeconds: 60
      volumes:
      - name: datadir
        hostPath:
          path: /opt/dgraph
---
# Ratel Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dgraph-ratel-deployment
  labels:
    app: dgraph-ratel
spec:
  selector:
    matchLabels:
      app: dgraph-ratel
  replicas: 1
  template:
    metadata:
      labels:
        app: dgraph-ratel
    spec:
      containers:
      - name: dgraph-ratel
        image: dgraph/dgraph:latest
        ports:
        - containerPort: 8000
          name: ratel-http
        command:
        - bash
        - "-c"
        - |
          set -ex
          dgraph-ratel -port 8000 -addr gb1-li-cortex-001.io.thehut.local:30003
Sometimes the predicate move is unsuccessful. I am not getting the error you highlighted above very often, but I do get this:
2018/03/01 17:45:01 tablet.go:178: size_diff 2445967
2018/03/01 17:45:01 tablet.go:178: size_diff 1468433
2018/03/01 17:45:01 tablet.go:71: Going to move predicate _ip_xid from 1 to 3
2018/03/01 17:45:01 tablet.go:64: Error while trying to move predicate _ip_xid from 1 to 3: rpc error: code = Unknown desc = Conflicts with pending transaction. Please abort.
2018/03/01 17:53:01 tablet.go:173:
Having said that, there are situations when the predicates move fine, but the server is still stuck (see https://github.com/dgraph-io/dgraph/issues/2054 )
I am going to spin up a cluster using Kubernetes on AWS and give this a try today.
2018/03/01 17:45:01 tablet.go:64: Error while trying to move predicate _ip_xid from 1 to 3: rpc error: code = Unknown desc = Conflicts with pending transaction. Please abort.
This error is possible if you had mutations for the predicate going on at that point. Is that always the case? I have not seen a successful predicate move in any of your logs. Do you have a consistent write workload, or could it be that the client you are using is not aborting transactions on an error?
This and the other issue are definitely related to the connection being broken between the nodes. I am investigating why and how that happens. I can see the following logs after dgraph live has been running for a bit on a Kubernetes cluster.
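To illustrate what aborting transactions on an error could look like on the client side, here is a minimal Python sketch. ConflictError and FakeTxn are hypothetical stand-ins written for this example, not the API of any Dgraph client. The point is only the try/finally shape: the transaction is always discarded, so it cannot stay pending after a failure.

```python
class ConflictError(Exception):
    """Stand-in for the error a client raises when a transaction conflicts."""


def run_txn(new_txn, work, max_attempts=3):
    """Run work(txn) and commit; on conflict, discard and retry with a new txn."""
    for attempt in range(1, max_attempts + 1):
        txn = new_txn()
        try:
            result = work(txn)
            txn.commit()
            return result
        except ConflictError:
            if attempt == max_attempts:
                raise
        finally:
            # Always release the transaction, whether it committed or failed.
            txn.discard()


class FakeTxn:
    """Hypothetical in-memory transaction, used only to exercise run_txn."""
    failures_left = 2
    discarded = 0

    def commit(self):
        if FakeTxn.failures_left > 0:
            FakeTxn.failures_left -= 1
            raise ConflictError("Conflicts with pending transaction. Please abort.")

    def discard(self):
        FakeTxn.discarded += 1


print(run_txn(FakeTxn, lambda txn: "ok"))  # two conflicts, then "ok"
print(FakeTxn.discarded)                   # 3: every attempt was discarded
```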
Total Txns done: 339 RDFs per second: 8932 Time Elapsed: 6m24s, Aborts: 221
Total Txns done: 339 RDFs per second: 8886 Time Elapsed: 6m26s, Aborts: 221
Total Txns done: 339 RDFs per second: 8840 Time Elapsed: 6m28s, Aborts: 221
2018/03/02 11:30:22 batch.go:133: Error while mutating rpc error: code = Unavailable desc = transport is closing
2018/03/02 11:30:22 batch.go:133: Error while mutating rpc error: code = Unavailable desc = transport is closing
Total Txns done: 339 RDFs per second: 8821 Time Elapsed: 6m30s, Aborts: 223
2018/03/02 11:30:23 batch.go:133: Error while mutating rpc error: code = Unknown desc = No connection exists
2018/03/02 11:30:24 batch.go:133: Error while mutating rpc error: code = Unknown desc = No connection exists
2018/03/02 11:30:24 batch.go:133: Error while mutating rpc error: code = Unknown desc = No connection exists
I tried it on a cluster brought up using kops on AWS, with the same machine specs as you have. I did see the intermittent issue with a predicate move not completing and context deadline exceeded, which is most probably a connection issue, but mostly things went smoothly and predicate moves completed.
Is your Kubernetes setup in-house? Are you sure that your networking is set up properly? I am just trying to find a setup in which the problem occurs more frequently. I am also updating the nightly binaries to update the size of the tablet being moved among other things. I would recommend trying with dgraph/dgraph:master then.
Hi @pawanrawal:
Yes, we are continuously writing to the graph, that's the whole point of what we're trying to build. I would expect dgraph to pause ingestion for a bit until it moves predicates around, and then continue operation. Is this not the case?
So for the below error, do we know the cause?
2018/03/01 19:45:06 oracle.go:381: Error while fetching minTs from group 1, err: rpc error: code = Unavailable desc = transport is closing
You're saying you do see intermittent issues with a similar set-up in AWS, is that normal? Why would it happen?
I will try with :master
Yes, we spin our Kubernetes cluster in our own data-centre. What other tests can I perform, or what other information can I give you to confirm whether this is a networking issue? What I don't understand is how it works 'some of the time'... If there was a fundamental networking issue, you'd think that it wouldn't work at all.
Yeah, that is exactly what happens.
This error would happen if the grpc connection between Zero and group 1 server was closed.
I saw the issue with a server which had 4 GB RAM, and it was caused by some containers restarting after going OOM. I tried this on a larger machine (16 GB) and didn't face any issues.
Ok.
What is interesting is that the predicate move never worked (from your logs) but the servers are able to communicate otherwise (we have an Echo gRPC which is working fine). Can you add the following environment variable to your dgraph server pods and share the logs? I am interested in a log from when the predicate move fails with a context deadline exceeded error.
env:
- name: GODEBUG
  value: http2debug=2
Hi again,
Regarding the predicate moves, it seems that our clients are indeed waiting for the zero to move them. However, the zero never seems to succeed with its moves (we get more conflicts than deadline exceeded errors in recent tests). We tried using GODEBUG, but we got ~500MB files for just 10 minutes of log data and couldn't see anything telling around the time of a move:
(zero)
Groups sorted by size: [{gid:3 size:276244} {gid:1 size:1071332} {gid:2 size:9993521}]
2018/03/07 15:14:00 tablet.go:170: size_diff 9717277
2018/03/07 15:14:00 tablet.go:87: Going to move predicate _predicate_ from 2 to 3
2018/03/07 15:14:00 oracle.go:84: purging below ts:5317, len(o.commits):135, len(o.aborts):12
2018/03/07 15:14:00 tablet.go:91: Error while trying to move predicate _predicate_ from 2 to 3: rpc error: code = Unknown desc = Conflicts with pending transaction. Please abort.
(clients)
Predicate is being moved, please retry later
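A client that receives this message can wait and retry with increasing delays instead of failing outright. The sketch below assumes the client surfaces the message as the text of an exception. RuntimeError is used here purely as a stand-in for whatever error type the real client raises, and the delay values are arbitrary.

```python
import time

MOVE_MESSAGE = "Predicate is being moved, please retry later"


def retry_while_moving(op, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call op(); if it fails with the move message, back off and try again.

    The delay doubles on each retry: base_delay, 2 * base_delay, and so on.
    Any other error, or a failure on the last attempt, is re-raised.
    """
    for attempt in range(attempts):
        try:
            return op()
        except RuntimeError as err:
            if MOVE_MESSAGE not in str(err) or attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Demo with a fake operation that fails twice while the predicate "moves".
state = {"calls": 0}


def fake_mutation():
    state["calls"] += 1
    if state["calls"] <= 2:
        raise RuntimeError(MOVE_MESSAGE)
    return "committed"


delays = []
print(retry_while_moving(fake_mutation, sleep=delays.append))  # committed
print(delays)  # [0.5, 1.0]
```

Passing the sleep function as a parameter keeps the demo instant; in real use the default time.sleep applies.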
We then implemented a way to randomise how the predicates are distributed. However, it seems like our group sizes are still very unbalanced:
(zero)
Groups sorted by size: [{gid:3 size:4320255} {gid:1 size:20623424} {gid:2 size:140097589}]
We have found that our dgraph servers sometimes crash. We initially thought this might be due to resource issues but have since upgraded both our cpu and memory specifications for our nodes to no avail. When this does happen, we tend to see only one of the three go down.
Interestingly, in a recent test, the pipeline recovered from an initial crash but not a later one.
In the second crash, it seems that the recovering server could not become the leader of the group it had before (that group being gid:2, the largest):
(zero)
2018/03/07 16:16:15 zero.go:322: Got connection request: id:2 addr:"dgraph-server-1.dgraph-server.default.svc.cluster.local:7080"
2018/03/07 16:16:15 zero.go:419: Connected
2018/03/07 16:18:00 tablet.go:165:Groups sorted by size: [{gid:3 size:4320255} {gid:1 size:20623424} {gid:2 size:140097589}]
2018/03/07 16:18:00 tablet.go:170: size_diff 135777334
2018/03/07 16:18:00 tablet.go:87: Going to move predicate _predicate_ from 2 to 3
2018/03/07 16:18:00 tablet.go:91: Error while trying to move predicate _predicate_ from 2 to 3: rpc error: code = Unknown desc = Server is not leader of this group
After this last restart, the pipeline gets stuck (the other two servers will still accept alterations and queries but none will accept mutations [results in timeouts]).
^ This server getting stuck is incredibly frustrating as we can't get the whole system to run for more than ~1hr.
We have found that our dgraph servers sometimes crash.
Do you have logs before the crash or could you check the pod for the reason for the crash?
In the second crash, it seems that the recovering server could not become the leader of the group it had before (that group being gid:2, the largest):
Do you have logs after the restart when this server couldn't become the leader? Since this is the only node serving the group it should be able to become the leader.
After this last restart, the pipeline gets stuck (the other two servers will still accept alterations and queries but none will accept mutations [results in timeouts]).
Since you have three servers all serving different groups, if one of them goes down then all mutations which touch the predicates on that server will get stuck. What we have to find out is why the server could not become a leader after a restart. You could also mitigate this problem by having replicas.
Right now during a predicate move, if there are pending transactions, those are aborted and we do not go ahead with the predicate move. We should go ahead with the predicate move after aborting the pending transactions. I will make that change; it should help rebalance the load for your cluster.
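The effect of replicas can be put in numbers. A Raft group stays available as long as a majority of its members are up, so the sketch below computes how many node failures a group of a given size can tolerate. The function names are illustrative only.

```python
def quorum(replicas):
    """Smallest majority of a Raft group with this many members."""
    return replicas // 2 + 1


def tolerated_failures(replicas):
    """How many members can be down while a majority is still reachable."""
    return replicas - quorum(replicas)


for n in (1, 3, 5):
    print(n, "replica(s): quorum", quorum(n),
          "tolerates", tolerated_failures(n), "failure(s)")
```

With one server per group, as in this deployment, a group tolerates zero failures, which is why a single crashed server blocks every mutation that touches its predicates.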
Thanks for getting back to us on this. We have managed to save down the logs for a server pre-crash:
Server event history:
2018/03/08 16:24:25 attr: "_programme_created" groupId: 1 Request sent to wrong server.
github.com/dgraph-io/dgraph/x.AssertTruef
/home/travis/gopath/src/github.com/dgraph-io/dgraph/x/error.go:67
github.com/dgraph-io/dgraph/worker.(*grpcWorker).ServeTask
/home/travis/gopath/src/github.com/dgraph-io/dgraph/worker/task.go:1250
github.com/dgraph-io/dgraph/protos/intern._Worker_ServeTask_Handler
/home/travis/gopath/src/github.com/dgraph-io/dgraph/protos/intern/internal.pb.go:2563
google.golang.org/grpc.(*Server).processUnaryRPC
/home/travis/gopath/src/google.golang.org/grpc/server.go:900
google.golang.org/grpc.(*Server).handleStream
/home/travis/gopath/src/google.golang.org/grpc/server.go:1122
google.golang.org/grpc.(*Server).serveStreams.func1.1
/home/travis/gopath/src/google.golang.org/grpc/server.go:617
runtime.goexit
/home/travis/.gimme/versions/go1.9.2.linux.amd64/src/runtime/asm_amd64.s:2337
- Restarts at 16:24:29, seems to become follower of group 2 even though there is no leader.
Logs from other servers and zero:
logs-server-0.txt
logs-server-2.txt
logs-zero.txt
We will next look at adding the replicas as suggested. We would be very interested in this feature that allows the predicate moves to be forced through our continuous ingestion stream.
We would be very interested in this feature that allows the predicate moves to be forced through our continuous ingestion stream.
Sure, I am on it and will have something for you soon. Could you share details about how you spin up your kubernetes cluster so that I could replicate this issue?
Deployment file below (slightly different than the one above, as we now have a way to connect to specific servers - dgraph-server-[0-2]-specific - so we can do the load balancing manually as per Lloyd's explanation above).
This is spun up in our own private datacentre, on CentOS 7 machines with 8 cores and 16 GB RAM for each server (of which there are three, as per the deployment). Kubernetes' networking layer is on Calico.
########## Services
##### Public
# Zero Public
apiVersion: v1
kind: Service
metadata:
  name: dgraph-zero-public
  labels:
    app: dgraph-zero
spec:
  type: LoadBalancer
  ports:
  - port: 5080
    targetPort: 5080
    name: zero-grpc
    nodePort: 30006
  - port: 6080
    targetPort: 6080
    name: zero-http
    nodePort: 30005
  selector:
    app: dgraph-zero
---
# Server Public
apiVersion: v1
kind: Service
metadata:
  name: dgraph-server-public
  labels:
    app: dgraph-server
spec:
  type: LoadBalancer
  ports:
  - port: 8080
    targetPort: 8080
    name: server-http
    nodePort: 30003
  - port: 9080
    targetPort: 9080
    name: server-grpc
    nodePort: 30004
  selector:
    app: dgraph-server
---
# Ratel Public
apiVersion: v1
kind: Service
metadata:
  name: dgraph-ratel-public
  labels:
    app: dgraph-ratel
spec:
  type: LoadBalancer
  ports:
  - port: 8000
    targetPort: 8000
    name: ratel-http
    nodePort: 30007
  selector:
    app: dgraph-ratel
---
##### Headless
# Zero Headless
apiVersion: v1
kind: Service
metadata:
  name: dgraph-zero
  labels:
    app: dgraph-zero
spec:
  ports:
  - port: 5080
    targetPort: 5080
    name: zero-grpc
  clusterIP: None
  selector:
    app: dgraph-zero
---
# Server Headless
apiVersion: v1
kind: Service
metadata:
  name: dgraph-server
  labels:
    app: dgraph-server
spec:
  ports:
  - port: 7080
    targetPort: 7080
    name: server-grpc-int
  clusterIP: None
  selector:
    app: dgraph-server
---
##### Specific
# Server 0
apiVersion: v1
kind: Service
metadata:
  name: dgraph-server-0-specific
  labels:
    app: dgraph-server
spec:
  type: LoadBalancer
  ports:
  - port: 9080
    targetPort: 9080
    name: server-grpc
    nodePort: 30011
  - port: 8080
    targetPort: 8080
    name: server-http
    nodePort: 30014
  selector:
    statefulset.kubernetes.io/pod-name: dgraph-server-0
---
# Server 1
apiVersion: v1
kind: Service
metadata:
  name: dgraph-server-1-specific
  labels:
    app: dgraph-server
spec:
  type: LoadBalancer
  ports:
  - port: 9080
    targetPort: 9080
    name: server-grpc
    nodePort: 30012
  - port: 8080
    targetPort: 8080
    name: server-http
    nodePort: 30015
  selector:
    statefulset.kubernetes.io/pod-name: dgraph-server-1
---
# Server 2
apiVersion: v1
kind: Service
metadata:
  name: dgraph-server-2-specific
  labels:
    app: dgraph-server
spec:
  type: LoadBalancer
  ports:
  - port: 9080
    targetPort: 9080
    name: server-grpc
    nodePort: 30013
  - port: 8080
    targetPort: 8080
    name: server-http
    nodePort: 30016
  selector:
    statefulset.kubernetes.io/pod-name: dgraph-server-2
---
########## StatefulSets
# Zero - StatefulSet
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: dgraph-zero
spec:
  selector:
    matchLabels:
      app: dgraph-zero
  serviceName: "dgraph-zero"
  replicas: 1
  template:
    metadata:
      labels:
        app: dgraph-zero
        type: zero
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: "kubernetes.io/hostname"
                operator: In
                values:
                - gb1-li-cortex-007
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: type
                operator: In
                values:
                - zero
            topologyKey: kubernetes.io/hostname
      containers:
      - name: zero
        image: dgraph/dgraph:master
        ports:
        - containerPort: 6080
          name: zero-http
        - containerPort: 5080
          name: zero-grpc
        volumeMounts:
        - name: datadir
          mountPath: /dgraph
        command:
        - bash
        - "-c"
        - |
          set -ex
          dgraph zero --my=$(hostname -f):5080 |& tee -a /dgraph/logs_zero.txt
      terminationGracePeriodSeconds: 60
      volumes:
      - name: datadir
        hostPath:
          path: /opt/dgraph
---
# Server - StatefulSet
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: dgraph-server
spec:
  selector:
    matchLabels:
      app: dgraph-server
  serviceName: "dgraph-server"
  replicas: 3
  template:
    metadata:
      labels:
        app: dgraph-server
        type: server
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: "kubernetes.io/hostname"
                operator: In
                values:
                - gb1-li-cortex-003
                - gb1-li-cortex-004
                - gb1-li-cortex-005
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: type
                operator: In
                values:
                - server
            topologyKey: kubernetes.io/hostname
      containers:
      - name: server
        image: dgraph/dgraph:master
        ports:
        - containerPort: 7080
          name: server-grpc-int
        - containerPort: 8080
          name: server-http
        - containerPort: 9080
          name: server-grpc
        volumeMounts:
        - name: datadir
          mountPath: /dgraph
        command:
        - bash
        - "-c"
        - |
          set -ex
          dgraph server --my=$(hostname -f):7080 --memory_mb 16384 --zero dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080 --posting_tables memorymap |& tee -a /dgraph/logs_server.txt
        resources:
          requests:
            memory: 24Gi
      terminationGracePeriodSeconds: 60
      volumes:
      - name: datadir
        hostPath:
          path: /opt/dgraph
---
# Ratel Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dgraph-ratel-deployment
  labels:
    app: dgraph-ratel
spec:
  selector:
    matchLabels:
      app: dgraph-ratel
  replicas: 1
  template:
    metadata:
      labels:
        app: dgraph-ratel
    spec:
      containers:
      - name: dgraph-ratel
        image: dgraph/dgraph:master
        ports:
        - containerPort: 8000
          name: ratel-http
        command:
        - bash
        - "-c"
        - |
          set -ex
          dgraph-ratel -port 8000 -addr gb1-li-cortex-001.io.thehut.local:30003
Note that before the crash that Lloyd mentioned above, memory consumption and CPU usage were low (nowhere near the available resources).
Error again for your reference, and all 4 logs attached above.
2018/03/08 16:24:25 attr: "_programme_created" groupId: 1 Request sent to wrong server.
github.com/dgraph-io/dgraph/x.AssertTruef
/home/travis/gopath/src/github.com/dgraph-io/dgraph/x/error.go:67
github.com/dgraph-io/dgraph/worker.(*grpcWorker).ServeTask
/home/travis/gopath/src/github.com/dgraph-io/dgraph/worker/task.go:1250
github.com/dgraph-io/dgraph/protos/intern._Worker_ServeTask_Handler
/home/travis/gopath/src/github.com/dgraph-io/dgraph/protos/intern/internal.pb.go:2563
google.golang.org/grpc.(*Server).processUnaryRPC
/home/travis/gopath/src/google.golang.org/grpc/server.go:900
google.golang.org/grpc.(*Server).handleStream
/home/travis/gopath/src/google.golang.org/grpc/server.go:1122
google.golang.org/grpc.(*Server).serveStreams.func1.1
/home/travis/gopath/src/google.golang.org/grpc/server.go:617
runtime.goexit
/home/travis/.gimme/versions/go1.9.2.linux.amd64/src/runtime/asm_amd64.s:2337
Are there any instructions that I can follow to set up this networking layer using Calico? I am just trying to reduce the number of variables here, Kubernetes being one.
Sure: https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/
It’s step 3.
Thanks,
Vlad
We see successful predicate moves after https://github.com/dgraph-io/dgraph/pull/2215. We'll keep testing and see if the crashes or the infinite loops mentioned in https://github.com/dgraph-io/dgraph/issues/2054 stop as well.
Groups sorted by size: [{gid:3 size:24968897} {gid:1 size:32250160} {gid:2 size:35334754}]
2018/03/13 10:35:02 tablet.go:188: size_diff 10365857
2018/03/13 10:35:02 tablet.go:78: Going to move predicate: [_elysium_account_xid], size: [95 kB] from group 2 to 3
2018/03/13 10:35:03 tablet.go:113: Predicate move done for: [_elysium_account_xid] from group 2 to 3
2018/03/13 10:38:12 raft.go:556: While applying proposal: Tablet is already being served
2018/03/13 10:38:12 raft.go:556: While applying proposal: Tablet is already being served
2018/03/13 10:43:02 tablet.go:183:
That is good to know. I am interested in this issue (seems like some sort of race condition) and would suggest creating a separate issue for it so that it can be tracked separately.
2018/03/08 16:24:25 attr: "_programme_created" groupId: 1 Request sent to wrong server.
github.com/dgraph-io/dgraph/x.AssertTruef
/home/travis/gopath/src/github.com/dgraph-io/dgraph/x/error.go:67
github.com/dgraph-io/dgraph/worker.(*grpcWorker).ServeTask
/home/travis/gopath/src/github.com/dgraph-io/dgraph/worker/task.go:1250
github.com/dgraph-io/dgraph/protos/intern._Worker_ServeTask_Handler
/home/travis/gopath/src/github.com/dgraph-io/dgraph/protos/intern/internal.pb.go:2563
google.golang.org/grpc.(*Server).processUnaryRPC
/home/travis/gopath/src/google.golang.org/grpc/server.go:900
google.golang.org/grpc.(*Server).handleStream
/home/travis/gopath/src/google.golang.org/grpc/server.go:1122
google.golang.org/grpc.(*Server).serveStreams.func1.1
/home/travis/gopath/src/google.golang.org/grpc/server.go:617
runtime.goexit
/home/travis/.gimme/versions/go1.9.2.linux.amd64/src/runtime/asm_amd64.s:2337