Presto: Add support for MongoDB Database References

Created on 18 Mar 2020 · 10 Comments · Source: prestosql/presto

presto> select * from mongodb.test.test_ref;
 $ref |                 $id                 | $db  
------+-------------------------------------+------
 test | 5e 71 93 9b 7d 7f e6 32 5a 0b 33 07 | test 
Labels: enhancement, good first issue

All 10 comments

Please may I work on this?

@kristina-head sure, go for it!

@kristina-head are you still working on this? I'd like to give this a shot

@achyudh it's all yours :)

Hi @achyudh are you still working on this? I would like to take a look at this issue as well.

@jasonyanwenl nope, feel free to take it up

Hi @ebyhr, may I ask which part of the code is a good place to start? Thanks!

@jasonyanwenl I think you can start with MongoSession#getTableMetadata, which has the "guess fields" mechanism.
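
To make the "guess fields" idea concrete, here is a minimal sketch in Python rather than the connector's Java. It is not the code in MongoSession: the `DBRef` class, `guess_type`, `guess_fields`, the `support_dbref` flag and the row field names are all hypothetical, and ObjectId values are modelled as plain strings. It only illustrates why a column can go missing when the guesser has no mapping for a value's type.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class DBRef:
    """Hypothetical stand-in for a driver-level DBRef value."""
    collection: str
    id: Any
    database: Optional[str] = None


def guess_type(value, support_dbref=True):
    """Guess a SQL-like type name for one sampled value, or None if unknown."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "bigint"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "varchar"
    if isinstance(value, dict):
        # Nested document: keep only the sub-fields whose type can be guessed.
        parts = []
        for key, sub_value in value.items():
            sub_type = guess_type(sub_value, support_dbref)
            if sub_type is not None:
                parts.append(f"{key} {sub_type}")
        return f"row({', '.join(parts)})" if parts else None
    if support_dbref and isinstance(value, DBRef):
        # Assumed mapping; the field names are illustrative only.
        return "row(databaseName varchar, collectionName varchar, id varchar)"
    return None


def guess_fields(document, support_dbref=True):
    """Build a column -> type mapping, silently skipping unknown types."""
    columns = {}
    for name, value in document.items():
        column_type = guess_type(value, support_dbref)
        if column_type is not None:
            columns[name] = column_type
    return columns


sample = {"col1": "foo", "creator": DBRef("creators", "5126bc054aed4daf9e2ab772", "users")}
print(guess_fields(sample, support_dbref=False))  # creator is dropped
print(guess_fields(sample))                       # creator appears as a row type
```

With `support_dbref=False` the sketch mirrors the reported behaviour, where the reference column does not show up at all.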

Hi @ebyhr, I am currently trying to get more background on this issue. I have read the two links you shared. As I understand it, DBRefs are like pointers to another document in MongoDB. One thing that confuses me is the example you gave:

presto> select * from mongodb.test.test_ref;
 $ref |                 $id                 | $db  
------+-------------------------------------+------
 test | 5e 71 93 9b 7d 7f e6 32 5a 0b 33 07 | test 

Essentially, I don't know what the exact requirement for solving this issue is. Here are the two interpretations I could come up with:

  1. Do you mean that the SQL query result above is exactly what we expect to get? If so, do you mean that the test_ref table is automatically created whenever the test table is created, and that the fields inside test_ref are auto-filled with $ref, $id and $db?

  2. Do you mean that when a document is queried, we need to automatically check whether it contains any DBRefs and, if so, retrieve the referenced document and return the entire document together with the referenced document as the query result?

Which one is correct? Or have I misunderstood something? Thanks!
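
The difference between the two interpretations can be sketched with plain Python dicts standing in for MongoDB. Everything here is an assumption made for illustration: `DATABASES`, `flatten_reference` and `resolve_reference` are hypothetical names, ObjectIds are shown as hex strings, and the content of the referenced document is made up. Only the `$ref`, `$id` and `$db` values come from the example output above.

```python
# In-memory stand-in for MongoDB: database -> collection -> list of documents.
DATABASES = {
    "test": {
        "test": [
            {"_id": "5e71939b7d7fe6325a0b3307", "name": "referenced document"},
        ],
    },
}

# A document whose "ref" field is DBRef-shaped.
source = {
    "_id": "1",
    "ref": {"$ref": "test", "$id": "5e71939b7d7fe6325a0b3307", "$db": "test"},
}


def flatten_reference(document, field):
    """Interpretation 1: expose the reference's own parts as values."""
    ref = document[field]
    return {"$ref": ref["$ref"], "$id": ref["$id"], "$db": ref.get("$db")}


def resolve_reference(databases, document, field, current_db):
    """Interpretation 2: replace the reference with the document it points to."""
    ref = document[field]
    db_name = ref.get("$db", current_db)  # $db is optional in a DBRef
    collection = databases.get(db_name, {}).get(ref["$ref"], [])
    target = next((d for d in collection if d["_id"] == ref["$id"]), None)
    resolved = dict(document)  # shallow copy so the input stays untouched
    resolved[field] = target
    return resolved


print(flatten_reference(source, "ref"))
print(resolve_reference(DATABASES, source, "ref", "test"))
```

The first function needs no extra lookup, while the second needs one lookup per reference, which is part of why the two interpretations differ so much in scope.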

@jasonyanwenl Thanks for looking into the details.

  1. I have mostly forgotten how I created the example above, but test_ref has to be created beforehand. A DBRef field in MongoDB has $ref, $id and $db. You can reproduce the issue with these steps:
  • Start MongoQueryRunner
  • Run the mongo shell: docker exec -it <container id> mongo
  • Prepare data:
use users;

db.creators.insert({
    "_id": ObjectId("5126bc054aed4daf9e2ab772"),
    "name": "Broadway Center",
    "url": "bc.example.net"
})

db.test_ref.insert({
    "_id": ObjectId("5126bbf64aed4daf9e2ab771"),
    "col1": "foo",
    "creator": {
        "$ref": "creators",
        "$id": ObjectId("5126bc054aed4daf9e2ab772"),
        "$db": "users"
    }
})
  • Select the table from Presto. The creator column is missing because its type is DBRef:
select * from mongodb.users.test_ref;
  2. That is the final goal, but let's start with a simple step. The minimum goal is to show DBRef fields so that we can use them in a join condition.
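
As a rough illustration of that minimum goal, the sketch below joins the two sample collections on the reference's `$id` once the reference is visible as ordinary fields. It is plain Python over in-memory rows, not Presto or connector code; `join_on_reference` is a hypothetical helper and ObjectIds are written as hex strings.

```python
# Sample rows mirroring the reproduction data, with ObjectIds as hex strings.
creators = [
    {"_id": "5126bc054aed4daf9e2ab772", "name": "Broadway Center", "url": "bc.example.net"},
]
test_ref = [
    {
        "_id": "5126bbf64aed4daf9e2ab771",
        "col1": "foo",
        "creator": {"$ref": "creators", "$id": "5126bc054aed4daf9e2ab772", "$db": "users"},
    },
]


def join_on_reference(left_rows, right_rows, ref_field):
    """Inner join where left[ref_field]['$id'] equals right['_id']."""
    by_id = {row["_id"]: row for row in right_rows}  # build side of a hash join
    joined = []
    for left in left_rows:
        ref = left.get(ref_field)
        if ref is None:
            continue  # rows without a reference cannot match
        right = by_id.get(ref["$id"])
        if right is not None:
            joined.append((left, right))
    return joined


for left, right in join_on_reference(test_ref, creators, "creator"):
    print(left["col1"], right["name"])  # prints "foo Broadway Center"
```

The join only works because `$id` is readable as a value, which is the point of exposing the DBRef fields first and leaving automatic dereferencing for later.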