Dgraph v2.0.0-rc1 (dgraph/standalone:v2.0.0-rc1 Docker image)
This is the latest release.
MacBook Pro (15-inch, 2019), Mojave, 10.14.6
Following the graphql example, and running the following query:
mutation {
addAuthor(input: [
{ username: "John" },
{ username: "John" }
]) {
author {
username
}
}
}
I get a response like the following:
{
"data": {
"addAuthor": {
"author": [
{
"username": "John"
},
{
"username": "John"
}
]
}
}
}
When I query authors again, I can confirm that the multiple authors with the same username are indeed inserted into the database.
Looking at the schema definition:
type Author {
username: String! @id
# ...
}
The username field uses the @id directive, which in the example it says :
The schema specified those usernames as
@id, so Dgraph makes sure that they are unique and you鈥檒l get an error if you try to add an author with a username that already exists (you can update an existing author with the updateAuthor mutation).
So I expect an error if I try to insert multiple authors with the same username.
@abhimanyusinghgaur this is similar to the issue you discovered for adding rules within a group. We were also discussing this with @MichaelJCompton. One solution here is that the upsert variables that are generated by the query should also have the value for @id field in their name.
Just FYI, I have seen this bug as well and would appreciate if either @id gets enforced,
or a @unique annotation would be added that enforces the unique criterion.
Currently, I circumvent the problem but I don't exactly feel confident doing so b/c I already have at least one use case on my plate that would require a properly enforced @unique annotation of fields.
Hi, thanks for the report.
For clarification: this bug only occurs when the same id value is occurs within a single mutation. Two mutations within a block or separate mutations work as expected. So a viable workaround for now is to break up your mutation.
Also, note that any mutation that looks like it's asking for the same unique id to inserted twice is going to produce an error, so the result of mutations like the example is a response with an error and no new stored values.
We are working on a fix.
@MichaelJCompton You're the champ, yes, splitting mutations does the trick for now.
There are a few possible approaches with this:
graphql
type Author {
username: String! @id
phone: String
address: String
# ...
}
graphql
mutation {
addAuthor(input: [
{ username: "John", phone: "1234" },
{ username: "John", address: "SF" },
{ username: "John", phone: "5678" }
]) {
author {
username
phone
address
}
}
}
json
{
"data": {
"addAuthor": {
"author": [
{
"username": "John",
"phone": "5678",
"address": "SF"
}
]
}
}
}
1st Approach:
The 1st approach is easier when there are mutations like addAuthor where the @id directive is a property of top-level object in input. But, in case of mutations like addPost where the @id directive is applicable on a field deep inside input :
mutation {
addPost(input: [
{ ..., author: { username: "John" }},
{ ..., author: { username: "John" }}
])
}
There are 2 cases:
a. The author may be already existing in db, in which case we just need to link the post with the existing author.
b. The author may not exist in db, in which case we need to give error and not do any change in db.
Such deep mutations get a bit complex to handle although perfectly doable.
2nd Approach:
The 2nd approach returns error the moment it encounters a repeated @id value without proceeding further. This kind of leaves the system in an ambiguous state where some part of the mutation gets applied while some part may not execute. So, we are not going with this approach.
3rd Approach:
The 3rd approach is also as complex as the 1st approach, but just a bit easier.
So, between the 1st and the 3rd approach, we are going with the 3rd one, as rather than giving an error it is better to give a correct response if we can.
@abhimanyusinghgaur This seems kind of side-effecty. I as a consumer of the API would prefer a more strict and predicable behavior, like the 1st approach mentioned.
Hi @n8jadams, @marvin-hansen,
Just wanted your thoughts on a hybrid approach that we are taking:
Given the following schema:
type Author {
username: String! @id
phone: String
address: String
questions: [Question] @hasInverse(field: author)
answers: [Answer] @hasInverse(field: author)
}
interface Post {
id: ID!
text: String! @search(by: [fulltext])
datePublished: DateTime @search
author: Author!
comments: [Comment] @hasInverse(field: commentsOn)
}
type Question implements Post {
answers: [Answer] @hasInverse(field: inAnswerTo)
}
type Answer implements Post {
inAnswerTo: Question!
}
type Comment implements Post {
commentsOn: Post!
}
Mutation Types
There can only be two kinds of cases where the @id directive can be involved:
addAuthor:graphql
mutation {
addAuthor(input: [
{ username: "John", ...},
...
]) {
author {
username
}
}
}
addPost (it will be something more concrete like addQuestion or addComment, but using addPost just for demo purposes):graphql
mutation {
addPost(input: [
{ ..., author: { username: "John", ... }},
...
])
}
Client Inputs
And, irrespective of where @id is involved, there can be following kinds of cases that clients can send as input for the Author type:
Author objects have a unique username, e.g.,:graphql
{ username: "John", ...},
{ username: "Mark", ...},
{ username: "Chris", ...}
graphql
{ username: "John", ...},
{ username: "John", ...}
@id properties, rest of the objects just refer the @id:graphql
{ username: "John", phone: "1234", address: "SF"},
{ username: "John"},
{ username: "John"},
...
{ username: "John"}
graphql
{ username: "John", phone: "1234", address: "SF"},
{ username: "John", phone: "1234", address: "SF"},
{ username: "John", phone: "1234", address: "SF"},
...
{ username: "John", phone: "1234", address: "SF"}
graphql
{ username: "John", phone: "1234", address: "SF"},
{ username: "John", phone: "5678", address: "SF"},
...
graphql
{ username: "John", phone: "1234"},
{ username: "John", address: "SF"},
...
DB State
Now again, irrespective of the input being sent by the client, there can be two cases:
Expected Behaviour
Now, let me list which approach is allowed in which case:
@abhimanyusinghgaur I like the approach you came up with, with the minor exception that I would rather have an error message when the Client Input is of types 2.i and 2.ii.
I guess ultimately I just prefer strictness, so you create a "pit of success" for consumers of the API. More specifically I think it's better to place the burden of "merging" Authors on the user, and have DGraph provide a useful error message if their query or variables are malformed. (And consider duplicate, unmerged Author objects as malformed.)
But that's just my opinion. I'm just the guy that reported the bug. 馃槈
Hi @n8jadams, thanks for reporting the bug and your thoughtful comments. I think some of the discussion above got a bit sidetracked with lots of possible options, so I just wanted to touch base on the approach and why.
Dgraph has always had the focus that there's nothing special you have to do to add data - there's no special graph format, no structure you have to build - just serialise to json.
We've kept the same approach in our GraphQL offering. Compare the ease of use of our mutations (where you can just serialise your client side data structures), with some other GraphQL offerings (where you have to write code to build special mutation structures where links to existing data and inserting new data require you to client side process your data structures into the structure that the API needs).
We are taking a 'strict' approach, but also considering serliasation of graph structures and ease of use. As an example of the way we'll handle this issue, let's assume the author and posts example in the comments above.
In my client-side program, I've got data structures for posts and authors and a post has a property with a 'pointer' to an author. So, client-side, I might build two new posts (P1 and P2) and a new author (A1) who authored both (so A1 is the target of both P1.author and P2.author). That's just three things, a simple graph, but if I serialise an array of [P1, P2], I'll generally get:
[
{ ...P1..., "author": { "username": "mjc", ...A1... } },
{ ...P2..., "author": { "username": "mjc", ...A1... } },
]
And that perfectly reasonable set of data structures and perfectly reasonable serialisation, just became an invalid mutation 馃槥... and then you'd have to write code client-side to pick through and unwind your data structures and make multiple mutations or make sure you mutate particular objects in particular orders. And that makes using the API so much harder.
The answer is simply that the schema said username is the unique identifier for the User type and the mutation gave us two JSON objects that are the serialisation of exactly the same thing (same id, same everything else), so we'll treat that as a reference to the one user node in the graph. (Dgraph already does the same thing with it's JSON mutation format and blank nodes.)
I think it's exactly the pit of success that you are talking about - makes it easy to serialise data in code and use the API.
@MichaelJCompton thanks for the explanation of the why behind the approach! It all sounds good to me.
Most helpful comment
Hi @n8jadams, thanks for reporting the bug and your thoughtful comments. I think some of the discussion above got a bit sidetracked with lots of possible options, so I just wanted to touch base on the approach and why.
Dgraph has always had the focus that there's nothing special you have to do to add data - there's no special graph format, no structure you have to build - just serialise to json.
We've kept the same approach in our GraphQL offering. Compare the ease of use of our mutations (where you can just serialise your client side data structures), with some other GraphQL offerings (where you have to write code to build special mutation structures where links to existing data and inserting new data require you to client side process your data structures into the structure that the API needs).
We are taking a 'strict' approach, but also considering serliasation of graph structures and ease of use. As an example of the way we'll handle this issue, let's assume the author and posts example in the comments above.
In my client-side program, I've got data structures for posts and authors and a post has a property with a 'pointer' to an author. So, client-side, I might build two new posts (P1 and P2) and a new author (A1) who authored both (so A1 is the target of both P1.author and P2.author). That's just three things, a simple graph, but if I serialise an array of [P1, P2], I'll generally get:
And that perfectly reasonable set of data structures and perfectly reasonable serialisation, just became an invalid mutation 馃槥... and then you'd have to write code client-side to pick through and unwind your data structures and make multiple mutations or make sure you mutate particular objects in particular orders. And that makes using the API so much harder.
The answer is simply that the schema said username is the unique identifier for the User type and the mutation gave us two JSON objects that are the serialisation of exactly the same thing (same id, same everything else), so we'll treat that as a reference to the one user node in the graph. (Dgraph already does the same thing with it's JSON mutation format and blank nodes.)
I think it's exactly the pit of success that you are talking about - makes it easy to serialise data in code and use the API.