Dgraph: zero /state does not report predicates size

Created on 15 Apr 2020 · 17 Comments · Source: dgraph-io/dgraph

Dgraph 20.03.0 from docker image

According to the docs:

/state -- Information about the nodes that are part of the cluster.
Also contains information about size of predicates and groups they belong to.

However, on my cluster the information on the size of predicates is missing:

```
$ curl http://localhost:6080/state
{"counter":"456","groups":{"1":{"members":{"1":{"id":"1","groupId":1,"addr":"alpha1:7080","leader":true,"lastUpdate":"1586968945"}},"tablets":{"dgraph.acl.rule":{"groupId":1,"predicate":"dgraph.acl.rule"},"dgraph.graphql.schema":{"groupId":1,"predicate":"dgraph.graphql.schema"},"dgraph.password":{"groupId":1,"predicate":"dgraph.password"},"dgraph.rule.permission":{"groupId":1,"predicate":"dgraph.rule.permission"},"dgraph.rule.predicate":{"groupId":1,"predicate":"dgraph.rule.predicate"},"dgraph.type":{"groupId":1,"predicate":"dgraph.type"},"dgraph.user.group":{"groupId":1,"predicate":"dgraph.user.group"},"dgraph.xid":{"groupId":1,"predicate":"dgraph.xid"}},"checksum":"3323928967622107550"},"2":{}},"zeros":{"1":{"id":"1","addr":"zero1:5080","leader":true}},"maxLeaseId":"20000","maxTxnTs":"30000","removed":[{"id":"2","groupId":2,"addr":"alpha2:7080","leader":true,"lastUpdate":"1586968942"}],"cid":"379c86dd-c029-4ae3-b93b-9fd94feb1ad8","license":{"maxNodes":"18446744073709551615","expiryTs":"1589558468","enabled":true}}
```

There's currently no way to get the number of n-quads in my predicates (I've checked /debug/vars, /debug/prometheus_metrics, Zero's /state, Alpha's /state, etc.).

area/documentation kind/bug status/accepted

All 17 comments

From my practical experience, if Alpha has not done a rollup and snapshot, /state reports no predicate size. I don't know whether that is a bug.

This is not a bug per se. Note that the space is only an estimated value, and as such it will be 0 for small predicates, typically those under 64MB. This is an optimization: instead of going through each predicate, we obtain an estimated size of the predicates so that Zero can move tablets if needed.

Closing.

Sorry @parasssh, but that doesn't feel like a complete answer to the issue. Paul mentioned that the value is missing, and I agree with him because I found myself digging into it for other purposes (I was working in Grafana) and got frustrated with those values. The "n-quad" metric is really messy. Dunno if someone has fixed it; I guess not.

> However, on my cluster the information on the size of predicates is missing

If the issue is not a bug, let's reformulate it and change the flag. It is still an issue. Call it a new feature or enhancement, and let's see what the JIRA admins have to say about it.

I think Paul needs this to show some size information in the Ratel UI. I also think some other kinds of size information still need to be implemented.

I'm going to reopen it. @paulftw can you confirm this?

Pinging @lgalatin to decide what to do.

Cheers.

@MichelDiz
We did talk about this internally with @paulftw and @manishrjain on Slack. That was the consensus. The predicate size is not shown because its estimated value is indeed 0 and the JSON output omits empty (0) values.

One enhancement, which @paulftw was going to take up and which I have noted on JIRA, is to print something like "<64MB" if the value is 0. We can re-purpose this issue for that or open a new one.
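A minimal sketch of how a client could cope with the omitted field, along the lines of the "<64MB" suggestion above. The sample payload is hypothetical: it is modeled on the response in the issue, and the `space` value on the `name` tablet is invented for illustration. The thread does not confirm the unit of `space`, so treating it as a byte count is an assumption, as are the helper names `tablet_sizes` and `describe`.

```python
import json

# Hypothetical /state fragment modeled on the response in this issue.
# The "space" value on "name" is invented; real values will differ.
SAMPLE_STATE = json.loads("""
{
  "groups": {
    "1": {
      "tablets": {
        "dgraph.type": {"groupId": 1, "predicate": "dgraph.type"},
        "name": {"groupId": 1, "predicate": "name", "space": "157286400"}
      }
    },
    "2": {}
  }
}
""")


def tablet_sizes(state):
    """Map each predicate to its reported size, or None if "space" is absent."""
    sizes = {}
    for group in state.get("groups", {}).values():
        for name, tablet in group.get("tablets", {}).items():
            raw = tablet.get("space")
            # 64-bit integers appear as JSON strings in this response
            # (see "counter" above), so convert explicitly.
            sizes[name] = int(raw) if raw is not None else None
    return sizes


def describe(size):
    """Render a size for display, using "<64MB" when no estimate is reported."""
    if not size:  # covers both a missing field (None) and an explicit 0
        return "<64MB"
    # Assumption: the value is a byte count.
    return f"{size / (1024 * 1024):.1f}MB"


for predicate, size in sorted(tablet_sizes(SAMPLE_STATE).items()):
    print(f"{predicate}: {describe(size)}")
```

To run this against a live cluster, the sample could be replaced with the parsed body of `http://localhost:6080/state`, the URL used in the original report.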

Alright, thanks for the update. Let's re-purpose this issue then. Bring the JIRA notes (if they can be public) to the top of his text. There is no need to delete what he wrote previously. Alternatively, we can wait for Paul to make the changes himself, or let him open a new issue and close this one.

BTW, issues closed without context can frustrate users.

Can we move this issue to Ratel repo if @paulftw is going to work on it?

> Alright, thanks for the update. Let's re-purpose this issue then. Bring the JIRA notes (if they can be public) to the top of his text. There is no need to delete what he wrote previously. Alternatively, we can wait for Paul to make the changes himself.

I indeed posted the JIRA notes before closing the issue. https://github.com/dgraph-io/dgraph/issues/5215#issuecomment-624888906

> BTW, issues closed without context can frustrate users.

The comment https://github.com/dgraph-io/dgraph/issues/5215#issuecomment-624888906 is the reason / context to close the issue.

@lgalatin If it needs to go to Ratel's repo, it is better to start from scratch there and close this one.

I'll wait for @paulftw to confirm that he will work on it in Ratel, and then we can open a new issue there. Thanks.

I thought this issue had been closed after the Slack discussion with @parasssh.
Can confirm that after inserting ~10 million predicates a tablet did start showing the space field.
That was good enough for Ratel.

As far as I'm concerned this dgraph issue can be closed.

However, I would recommend documenting this behavior better: 1) mention units used for reporting size (number of bytes or number of edges?); 2) exact or approximate threshold when space stops being 0; and of course 3) explicitly say that the field is called space.

An example response with space values present and missing would have made my life a lot easier.

@lgalatin leaving this to you to decide on closing / renaming / filing a docs issue / none of the above.

Let's keep this issue open and use it for documentation. I've added the 'area/documentation' label so we know.

@parasssh @paulftw Are those values right?
For some reason, Ratel (I mean, the values from Dgraph) says that the user has a predicate 3TB in size.
But he is using less than 560GB in the whole system.
https://discuss.dgraph.io/t/ratel-predicate-capactiy/6655

@MichelDiz Paras did mention those sizes are estimates. Will wait for him to confirm

If the user has ACL disabled, can you ask them to post the output of <alphaurl:8080>/state,
just to make sure that's not a bug in Ratel? (I don't think it is, but why not double check.)

We don't calculate the size of the predicate on disk until it occupies enough space on disk (about 64 MB by default). The following code in Dgraph skips the calculation for any table that is not used entirely by a single predicate. https://github.com/dgraph-io/dgraph/blob/632666fad5968b5d76e1e468ffc13f71e131ca22/worker/draft.go#L1257-L1266
We will always have discrepancies because having a table with a single predicate is not easy. You would need a sufficient amount of data to get a table with a single predicate.
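The skipping rule described above can be modeled roughly as follows. This is a simplified illustration, not the code in `worker/draft.go`: the table records, their sizes, and the function name `estimate_sizes` are all hypothetical.

```python
from collections import defaultdict

MB = 1024 * 1024

# Hypothetical table metadata: (predicate of the smallest key,
# predicate of the largest key, estimated table size in bytes).
TABLES = [
    ("age", "name", 40 * MB),   # shared by several predicates
    ("name", "name", 64 * MB),  # wholly owned by "name"
    ("name", "name", 64 * MB),  # wholly owned by "name"
    ("name", "zip", 10 * MB),   # shared again
]


def estimate_sizes(tables):
    """Sum table sizes per predicate, counting only single-predicate tables."""
    sizes = defaultdict(int)
    for left_pred, right_pred, size in tables:
        if left_pred != right_pred:
            # The table spans more than one predicate, so it is skipped.
            continue
        sizes[left_pred] += size
    return dict(sizes)


print(estimate_sizes(TABLES))
```

In this model `age` and `zip` get no estimate at all, because they never fill a table on their own, and the part of `name` that lives in shared tables is not counted either.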

> We don't calculate the size of the predicate on disk until it occupies enough space on disk (about 64 MB by default). The following code in dgraph will skip calculation if the entire table is not used for a single predicate.

Thanks. That is correct. However, that means the estimated predicate size should always be equal to or less than the actual space. But in this case, the disk space is less than the sum of the predicate sizes which is confusing.
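The expectation stated here (the sum of the estimates should not exceed the space actually used) can be checked with a short sketch. The payload below is hypothetical and only echoes the rough figures from the forum post (a 3TB predicate on a system using under 560GB). The function names are illustrative, and reading `space` as a byte count is an assumption.

```python
def total_reported_space(state):
    """Sum the "space" values of all tablets across all groups."""
    total = 0
    for group in state.get("groups", {}).values():
        for tablet in group.get("tablets", {}).values():
            # A missing "space" field is treated as 0.
            total += int(tablet.get("space", 0))
    return total


def estimates_fit_on_disk(state, disk_bytes):
    """Return (ok, total); ok is True when the estimates do not exceed disk usage."""
    total = total_reported_space(state)
    return total <= disk_bytes, total


# Hypothetical payload: one large estimate, one tablet with no estimate.
SAMPLE_STATE = {
    "groups": {
        "1": {
            "tablets": {
                "big": {"predicate": "big", "space": str(3 * 10**12)},
                "small": {"predicate": "small"},
            }
        }
    }
}

ok, total = estimates_fit_on_disk(SAMPLE_STATE, disk_bytes=560 * 10**9)
print(ok, total)
```

A result of `False` here corresponds to the confusing situation described above, where the reported estimates add up to more than the disk space in use.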

Yes, the estimate is wrong and we're tracking the estimate issue in https://github.com/dgraph-io/dgraph/issues/5408 .

This issue can be closed.

Closing this issue as @jarifibrahim suggested. There's another open bug #5408 to improve the estimate.
