Postgrest: Multiple "hops" for embed(Unnest/flatten embeds)

Created on 24 Jan 2019  路  4Comments  路  Source: PostgREST/postgrest

Environment

  • PostgreSQL version: (if using docker, specify the image)
    docker postgres:10-alpine
  • PostgREST version: (if using docker, specify the image)
    docker postgrest/postgrest (pulled fresh this morning)
  • Operating system:
    Ubuntu 18

Description of issue (less an issue than a question)

Is it possible to do multiple "hops" to embed something?

For example, I have a schema similar to this (the real thing has more columns and checks)

create table dimension (
    id serial primary key,
    dimension_type_name text not null,

    "text" text not null
);

create table dimension_relationship (
    ancestor_id int references dimension(id) not null,
    descendent_id int references dimension(id) not null,
    check (ancestor_id != descendent_id),
    primary key (ancestor_id, descendent_id)
    -- other checks and indices to ensure these only go one way
);

The basic idea behind this structure is that it can store any arbitrary hierarchy of data. The relationships can only go in one direction, but it doesn't have to be tree-like (a dimension can can multiple parents).

I want to get a particular dimension (let's say with dimension_type_name "a"), and all the descendent dimensions with dimension_type_name "b", in a json structure similar to this:

[{
    // this is a dimension of type "a"
    "id": 1, "text": "stuff",
    // these should all be dimensions of type "b"
    "attributes": [{
        "id": 2, "text": "other"
    }],
    // ...
}]

This gets me close:

"/dimension?
    select=*,
    relationships:dimension_relationship.ancestor_id(
        attribute:dimension.descendent_id(*)
    )
    &dimension_type_name=eq.a
    &relationships.attribute.dimension_type_name=eq.b"

But there's an extra level of nesting that I don't want, and a bunch of null attributes:

[{
    "id": 32,
    "dimension_type_name": "a",
    "text": "tmgxledhjxjlgqh",
    "relationships": [
        {
            "attribute": {
                "id": 76,
                "dimension_type_name": "b",
                "text": "didpwfqkgxyfpkr"
            }
        },
        {
            "attribute": null
        },
        {
            "attribute": null
        }
    ]
},
// ...
]

Is something akin to this supported?

"/dimension?
    select=*,
    attributes:dimension_relationship.ancestor_id.dimension.descendent_id(*)
    &dimension_type_name=eq.a
    &attributes.dimension_type_name=eq.b"

Here's a sql schema you can use to quickly get running with this structure:

create function random_between(low int, high int) returns int as $$
begin
    return floor(random() * (high - low + 1) + low);
end;
$$ language plpgsql strict;

create function random_text() returns text as $$
    select array_to_string(array(select chr((97 + round(random() * 25)) :: integer)
    from generate_series(1, 15)), '');
$$ language sql strict;

create table dimension_type (
    "name" text not null unique
);


create table dimension (
    id serial primary key,
    dimension_type_name text references dimension_type("name") not null,

    "text" text not null
);


create table dimension_relationship (
    ancestor_id int references dimension(id) not null,
    descendent_id int references dimension(id) not null,
    check (ancestor_id != descendent_id),
    primary key (ancestor_id, descendent_id)
);

create unique index idx_symmetric_unique_dimension_relationship on dimension_relationship
(greatest(ancestor_id, descendent_id), least(ancestor_id, descendent_id));


insert into dimension_type ("name") values
('a'), ('b'), ('c'), ('d'), ('e'), ('f'), ('g'), ('h'), ('i'), ('j'), ('k'), ('l'), ('m'), ('n'), ('o');

insert into dimension (dimension_type_name, "text")
select dimension_type_name, "text"
from (
    select
    generate_series(1, 100) as nums,
    (ARRAY['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o'])[floor(random()*15)+1] as dimension_type_name,
    random_text() as "text"
) vals;

insert into dimension_relationship (ancestor_id, descendent_id)
select ancestor_id, descendent_id
from (select generate_series(1, 300) as nums, random_between(1, 50) as ancestor_id, random_between(51, 100) as descendent_id) vals
on conflict do nothing;
embedding enhancement

Most helpful comment

Using some our test suite fixtures(clients-has many-projects-has many-tasks), suppose you want to get all tasks with their projects and only the client name, you could do this:

GET /tasks?select=name,projects(*,client_name:clients(*)->name)
 [{"name":"Design w7","projects":{"id":1,"name":"Windows 7","client_name":"Microsoft"}}, 
  {"name":"Code w7","projects":{"id":1,"name":"Windows 7","client_name":"Microsoft"}},   
  {"name":"Design w10","projects":{"id":2,"name":"Windows 10","client_name":"Microsoft"}}
  {"name":"Code w10","projects":{"id":2,"name":"Windows 10","client_name":"Microsoft"}}, 
  {"name":"Design IOS","projects":{"id":3,"name":"IOS","client_name":"Apple"}},          
  {"name":"Code IOS","projects":{"id":3,"name":"IOS","client_name":"Apple"}},            
  {"name":"Design OSX","projects":{"id":4,"name":"OSX","client_name":"Apple"}},          
  {"name":"Code OSX","projects":{"id":4,"name":"OSX","client_name":"Apple"}}]            

This would generate the following query:

with pg_source as (
  select 
    "test"."tasks"."name", 
    row_to_json("projects_projects".*) as "projects" 
  from "test"."tasks"  
  left join lateral( 
    select 
      "test"."projects"."id", 
      "test"."projects"."name", 
      row_to_json("clients_clients".*)->'name' as "client_name" 
    from "test"."projects"  
    left join lateral( 
      select 
        "test"."clients"."id", 
        "test"."clients"."name" 
      from "test"."clients"  
      where "test"."clients"."id" = "test"."projects"."client_id"   ) as "clients_clients" on true  
    where "test"."projects"."id" = "test"."tasks"."project_id"   ) as "projects_projects" on true    ) 
select coalesce(json_agg(_postgrest_t), '[]')::character varying as body
from ( select * from pg_source) _postgrest_t;

I think this would be useful and we'd only have to allow the -> after the embeds.

@blainehansen Thanks for the report, this feature will be in the plans.

All 4 comments

@blainehansen #1075(pending enhancement) would clear the {"attribute": null} results.

Doing embeds like attributes:dimension_relationship.ancestor_id.dimension.descendent_id(*) is not supported.

Is there any intention of a feature like this? Or should I just close?

Thank you :smile:

Not with "multiple hops" but I think we can offer "flattening" by using PostgreSQL json arrow operators, in your example this would look like:

"/dimension?
    select=*,
    relationships:dimension_relationship.ancestor_id(
        attribute:dimension.descendent_id(*)
    )->attribute
    &dimension_type_name=eq.a
    &relationships.attribute.dimension_type_name=eq.b"

Using some our test suite fixtures(clients-has many-projects-has many-tasks), suppose you want to get all tasks with their projects and only the client name, you could do this:

GET /tasks?select=name,projects(*,client_name:clients(*)->name)
 [{"name":"Design w7","projects":{"id":1,"name":"Windows 7","client_name":"Microsoft"}}, 
  {"name":"Code w7","projects":{"id":1,"name":"Windows 7","client_name":"Microsoft"}},   
  {"name":"Design w10","projects":{"id":2,"name":"Windows 10","client_name":"Microsoft"}}
  {"name":"Code w10","projects":{"id":2,"name":"Windows 10","client_name":"Microsoft"}}, 
  {"name":"Design IOS","projects":{"id":3,"name":"IOS","client_name":"Apple"}},          
  {"name":"Code IOS","projects":{"id":3,"name":"IOS","client_name":"Apple"}},            
  {"name":"Design OSX","projects":{"id":4,"name":"OSX","client_name":"Apple"}},          
  {"name":"Code OSX","projects":{"id":4,"name":"OSX","client_name":"Apple"}}]            

This would generate the following query:

with pg_source as (
  select 
    "test"."tasks"."name", 
    row_to_json("projects_projects".*) as "projects" 
  from "test"."tasks"  
  left join lateral( 
    select 
      "test"."projects"."id", 
      "test"."projects"."name", 
      row_to_json("clients_clients".*)->'name' as "client_name" 
    from "test"."projects"  
    left join lateral( 
      select 
        "test"."clients"."id", 
        "test"."clients"."name" 
      from "test"."clients"  
      where "test"."clients"."id" = "test"."projects"."client_id"   ) as "clients_clients" on true  
    where "test"."projects"."id" = "test"."tasks"."project_id"   ) as "projects_projects" on true    ) 
select coalesce(json_agg(_postgrest_t), '[]')::character varying as body
from ( select * from pg_source) _postgrest_t;

I think this would be useful and we'd only have to allow the -> after the embeds.

@blainehansen Thanks for the report, this feature will be in the plans.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

RB14 picture RB14  路  32Comments

LorenzHenk picture LorenzHenk  路  23Comments

steve-chavez picture steve-chavez  路  26Comments

sscarduzio picture sscarduzio  路  30Comments

posix4e picture posix4e  路  34Comments