Dgraph: Pick nodes randomly

Created on 29 Mar 2019 · 4 Comments · Source: dgraph-io/dgraph

In the closed issue, you suggested has(predicate). But when you build a recommendation system, you need to recommend M items out of N to users, where N is large enough that items should not repeat too often. In that case, you need random selection.

kind/question

yupengfei

All 4 comments

I looked at your discuss post: https://discuss.dgraph.io/t/feature-request-random-pick-node/2141
I guess you're using Golang, right?
I think you could write a "dice" function in your client code (e.g. with https://github.com/justinian/dice) that produces a random value and pass that value as the "offset" in the query itself.

You can run these queries at http://play.dgraph.io

You can then use that random value like so:

{
  me(func: allofterms(name@en, "Hark Tsui")) @filter(ge (count(director.film), 40)) {
    director.film (orderasc: name@en) (first:1, offset:3)  {
      name@zh
      name@en
      initial_release_date
    }
    count(director.film)
  }
}

# Returns

{
  "data": {
    "me": [
      {
        "director.film": [
          {
            "name@en": "All the Wrong Clues"
          }
        ],
        "count(director.film)": 41
      }
    ]
  }
}

So we have 41 nodes, which means 41 possible values for the offset argument.
Because offset: N skips the first N results, your dice value needs to be between 0 and 40. But remember that the first: argument needs to be 1.
Otherwise you no longer pick an exact node. For example, with first: 10 the 41 nodes fall into roughly 4 pages, and each query returns a run of 10 consecutive nodes. The contents of that run are not random; they are always the same neighbors in the sort order. The only random thing would be where the run starts, not the nodes in it.

So the (first: 1, offset: 3) argument means "skip the first three nodes and give me the next one", i.e. the fourth of the 41 nodes.

{
  me(func: allofterms(name@en, "Hark Tsui")) @filter(ge (count(director.film), 40)) {
    director.film (orderasc: name@en) (first:1, offset:1)  {
      name@zh
      name@en
      initial_release_date
    }
    count(director.film)
  }
}

# Returns

{
  "data": {
    "me": [
      {
        "director.film": [
          {
            "name@zh": "最佳拍档之女皇密令",
            "name@en": "Aces Go Places 3",
            "initial_release_date": "1984-01-01T00:00:00Z"
          }
        ],
        "count(director.film)": 41
      }
    ]
  }
}
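
The offset arithmetic discussed in this comment can be sketched client-side. This is only an illustration in Python, assuming that offset: N skips the first N results, so valid offsets run from 0 to count - 1. The helper names pick_offset and page, and the stand-in film list, are hypothetical and not part of Dgraph.

```python
import random


def pick_offset(count, rng=random):
    """Return a uniformly random offset in [0, count - 1]."""
    if count <= 0:
        raise ValueError("count must be positive")
    return rng.randrange(count)


def page(nodes, first, offset):
    """Mimic (first: F, offset: O): skip `offset` items, then take `first`."""
    return nodes[offset:offset + first]


# Stand-in for the 41 films in sorted order (hypothetical data).
films = [f"film-{i}" for i in range(41)]

offset = pick_offset(len(films))
print(page(films, first=1, offset=offset))   # exactly one randomly chosen film
print(page(films, first=10, offset=offset))  # a run of consecutive films
```

With first=1 every film is equally likely to be returned. With first=10 only the start of the run is random, and the films inside it are always neighbors in the sort order.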

Maybe we could add a "dice" function to Dgraph itself, but I'm not sure it's necessary, since anyone can write a simple dice algorithm.

If we added a "dice" function, it would look like this:

{
  me(func: allofterms(name@en, "Hark Tsui")) @filter(ge (count(director.film), 40)) {
    NODES as count(director.film) # set the size of your dice by the count of nodes.
    director.film (orderasc: name@en) (first:1, offset: dice(NODES) ) {
      name@zh
      name@en
      initial_release_date
    }
  }
}

BTW

You could use GraphQL Variables in this operation.
1 - Run a query to get the node count, which bounds the offset.

{
  me(func: allofterms(name@en, "Hark Tsui")) @filter(ge (count(director.film), 40)) {
    NODES : count(director.film)
  }
}

2 - In your code, generate a dice value from the result of the first query.
3 - Use GraphQL Variables to inject the dice value into the second query.

e.g:

query test($a: int) {
  me(func: allofterms(name@en, "Hark Tsui")) @filter(ge (count(director.film), 40)) {
    director.film (orderasc: name@en) (first:1, offset: $a ) {
      name@zh
      name@en
      initial_release_date
    }
  }
}
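
The three steps above can be sketched as follows. This is a rough Python illustration, not an official client recipe. It assumes a hypothetical run_query(query, variables) callable that sends the query to Dgraph and returns the parsed "data" object of the response. It also assumes variables are passed as strings and that offset: N skips the first N results.

```python
import random

COUNT_QUERY = """{
  me(func: allofterms(name@en, "Hark Tsui")) @filter(ge (count(director.film), 40)) {
    NODES : count(director.film)
  }
}"""

PICK_QUERY = """query test($a: int) {
  me(func: allofterms(name@en, "Hark Tsui")) @filter(ge (count(director.film), 40)) {
    director.film (orderasc: name@en) (first:1, offset: $a ) {
      name@zh
      name@en
      initial_release_date
    }
  }
}"""


def pick_random_film(run_query, rng=random):
    """Pick one random film using two requests.

    run_query is a hypothetical callable: (query, variables) -> "data" dict.
    """
    # Step 1: ask for the count, which bounds the offset.
    count = run_query(COUNT_QUERY, {})["me"][0]["NODES"]
    # Step 2: roll the dice in client code (valid offsets: 0 .. count - 1).
    offset = rng.randrange(count)
    # Step 3: inject the dice value through the $a variable.
    data = run_query(PICK_QUERY, {"$a": str(offset)})
    return data["me"][0]["director.film"][0]
```

Passing the transport in as run_query keeps the sketch independent of any particular Dgraph client library; in real code it would wrap whatever query call your client exposes.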

MichelDiz on 29 Mar 2019

Thanks for the notification.
I ended up doing the same mechanism.
Unfortunately, we still have to make 2 requests: the first to get the max count, the second with the random offset.

mehdiym on 29 Mar 2019

👍1

Hello,
Any update on this?

As my queries get bigger and bigger, with lots of query blocks and filters, running them twice to pick random nodes takes more and more time.
Moreover, picking multiple random nodes requires one query block per picked node.
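
To illustrate the "one query block per picked node" workaround, here is a minimal Python sketch that generates such a query. It assumes the count is already known from an earlier request and that offset: N skips the first N results. The function name build_multi_pick_query and the block names pick0, pick1, ... are hypothetical.

```python
import random


def build_multi_pick_query(count, k, rng=random):
    """Build one query with k blocks, each selecting a different random node."""
    # k distinct offsets in [0, count - 1]; raises ValueError if k > count.
    offsets = rng.sample(range(count), k)
    blocks = []
    for i, off in enumerate(offsets):
        blocks.append(
            f'  pick{i}(func: allofterms(name@en, "Hark Tsui")) {{\n'
            f'    director.film (orderasc: name@en) (first:1, offset:{off}) {{\n'
            f'      name@en\n'
            f'    }}\n'
            f'  }}'
        )
    return "{\n" + "\n".join(blocks) + "\n}", offsets


query, offsets = build_multi_pick_query(count=41, k=3)
print(query)
```

This still needs the count up front, so it does not remove the extra round trip; it only shows how the per-node blocks multiply as k grows.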

The query syntax with a random keyword suggested by @yupengfei could really improve performance and simplify code.

Graph databases are a great fit for statistics and science apps, which regularly need to pick random elements.

Keep up the good work!

mehdiym on 12 Mar 2020

+1 random picking would be very useful

samsends on 4 Apr 2020

👍4
