Hi!
I am currently developing an adapter for Apache Cassandra and got stuck on schema definition and querying with Cassandra's special types.
There must be an option to define custom composite types.
I came up with two variants:
defmodule Post do
  use Schema

  schema "posts" do
    field :title, :string
    field :text, :string
    field :public, :boolean
    field :tags, {Cassandra.Types.Set, :string}
    field :location, {Cassandra.Types.Tuple, {:float, :float}}
    field :links, {Cassandra.Types.Map, {:string, :string}}
    timestamps()
  end
end

defmodule Post do
  use Schema

  schema "posts" do
    field :title, :string
    field :text, :string
    field :public, :boolean
    field :tags, {:set, :string}
    field :location, {:tuple, {:float, :float}}
    field :links, {:map, :string, :string}
    timestamps()
  end
end
Right now it is impossible to implement update operations like :push and :pull, because they expect only the array field type.
A select with an :in clause also fails on non-array fields.
There must also be a way to define nested types, like so:
field :shape, {:list, {:tuple, {:float, :float}}}
# we have list of tuples here
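To make the nesting concrete, here is a minimal sketch of how such a type spec could be walked recursively to check a value. The module `NestedTypeSketch` and its `valid?/2` function are hypothetical names invented for this illustration; they are not part of Ecto or of any Cassandra adapter, and the set of primitive types is only an assumption.

```elixir
defmodule NestedTypeSketch do
  # Hypothetical helper, not part of Ecto: checks a value against a nested type spec.

  # Primitive types (an assumed, incomplete set)
  def valid?(:float, value), do: is_float(value)
  def valid?(:integer, value), do: is_integer(value)
  def valid?(:string, value), do: is_binary(value)
  def valid?(:boolean, value), do: is_boolean(value)

  # {:list, inner}: every element must match the inner type
  def valid?({:list, inner}, value) when is_list(value) do
    Enum.all?(value, &valid?(inner, &1))
  end

  # {:set, inner}: a MapSet whose members all match the inner type
  def valid?({:set, inner}, %MapSet{} = value) do
    Enum.all?(value, &valid?(inner, &1))
  end

  # {:tuple, {t1, t2, ...}}: same size, each element matches its positional type
  def valid?({:tuple, types}, value)
      when is_tuple(types) and is_tuple(value) and tuple_size(types) == tuple_size(value) do
    types
    |> Tuple.to_list()
    |> Enum.zip(Tuple.to_list(value))
    |> Enum.all?(fn {type, elem} -> valid?(type, elem) end)
  end

  # {:map, key_type, value_type}: plain maps only, structs are rejected
  def valid?({:map, key_type, value_type}, value)
      when is_map(value) and not is_struct(value) do
    Enum.all?(value, fn {k, v} -> valid?(key_type, k) and valid?(value_type, v) end)
  end

  # Anything else does not match
  def valid?(_type, _value), do: false
end
```

With this sketch, `NestedTypeSketch.valid?({:list, {:tuple, {:float, :float}}}, [{0.0, 1.0}])` walks the list, then each tuple, then each float.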
Nice idea, I'll try to implement custom types for map keys right now. However, I noticed that your proposed syntax for defining the type of the key wouldn't work with {:map, {:string, :string}}, because you can already nest types in the 2nd item of the type tuple. So I propose writing it as {:map, :string, :string} so it can be pattern matched properly.
@narrowtux can you provide an example of the pattern matching collision that you noticed?
Issue 1784 added similar functionality for types in migrations, and there was no problem with pattern matching. So I thought it would be great if complex type definitions for migrations and schemas stayed the same.
This is only a problem with the :map type. It already supports nested types for values.
In the unit tests test/type_test.exs, I found
assert load({:map, {:array, :integer}}, %{"a" => [0, 0], "b" => [1, 1]}) == {:ok, %{"a" => [0, 0], "b" => [1, 1]}}
on line 67. Using a {:map, {key_type, value_type}} syntax would mean that you'd have to change this already existing syntax.
I already implemented {:map, key_type, value_type} which was easily done with pattern matching.
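The collision can be illustrated with a small sketch. The module `MapTypeSketch` and the `:any` placeholder for untyped keys are assumptions made for this example only; this is not the actual implementation mentioned above. The two spellings are told apart purely by tuple size.

```elixir
defmodule MapTypeSketch do
  # Hypothetical: normalises both spellings to {key_type, value_type}.

  # Existing two-element form: only the value type is given.
  # :any is a placeholder invented for this sketch.
  def key_and_value({:map, value_type}), do: {:any, value_type}

  # Proposed three-element form: key and value types are separate elements.
  def key_and_value({:map, key_type, value_type}), do: {key_type, value_type}
end
```

Under these assumptions, `{:map, {:string, :string}}` matches the first clause, so the inner tuple is read as a (nested) value type rather than as a key/value pair, which is why a separate three-element form avoids the ambiguity.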
@narrowtux Oh, now I see. I have updated the issue description accordingly.
Also I'm unsure what to use for the underlying tuple type (which type goes into load/2 and which type is returned by dump/2).
Other databases don't support the tuple type AFAIK, so it would be simpler to use a list and have the Cassandra adapter handle the extra step. Would that work?
Same thing for the set.
CQL (Cassandra's query language) has distinct syntax for working with each type. So, for example, if I use the Ecto array type for the underlying Cassandra tuple type, I need an additional round trip to the database to determine the underlying type. That will hurt performance badly.
Also, inserting, updating, and selecting Elixir tuple values without converting them to an array and back looks more natural.
They all need to be converted to an array anyway, because you can't use Enum or tail recursion on a tuple.
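For reference, the list-backed variant discussed here could look roughly like the sketch below. `TupleAsListSketch` is a hypothetical module name; it only shows the conversion step using the standard `Tuple.to_list/1` and `List.to_tuple/1` functions and says nothing about how an adapter would encode the result.

```elixir
defmodule TupleAsListSketch do
  # Hypothetical: stores a tuple as a list and restores it on load.

  # Elixir tuple -> list handed to the adapter
  def dump(value) when is_tuple(value), do: {:ok, Tuple.to_list(value)}
  def dump(_), do: :error

  # List coming back from the adapter -> Elixir tuple
  def load(value) when is_list(value), do: {:ok, List.to_tuple(value)}
  def load(_), do: :error
end
```

The trade-off raised in the thread still applies: once the value is a plain list, the adapter no longer sees that it was meant to be a tuple.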
Sounds really complex. Maybe we are talking about different things. Right now I have implemented an Ecto custom type that passes tuples as is. It looks like:
defmodule Tuple do
  @moduledoc """
  Represents Cassandra tuple type.
  """

  @behaviour Ecto.Type

  def type, do: :custom

  def cast(value), do: {:ok, value}

  def load(value), do: {:ok, value}

  def dump(value) when is_tuple(value), do: {:ok, value}
  def dump(_), do: :error
end
As you can see, it simply passes the value "as is" without any conversion. I don't like this solution because I can't validate the types of the tuple elements; the type has no information about them.
Let's return to my very first variant:
field :location, {Cassandra.Types.Tuple, {:float, :float}}
It would be really nice if the second element of the type tuple were passed to the custom type. My custom type could then look like:
defmodule Tuple do
  @moduledoc """
  Represents Cassandra tuple type.
  """

  @behaviour Ecto.Type

  def type, do: :custom

  def cast(value, opts), do: ...

  def load(value, opts), do: ...

  def dump(value, opts) when is_tuple(value), do: ...
  def dump(_, _), do: :error
end
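As an illustration of what the elided bodies might do, here is a standalone sketch that validates tuple elements against the types passed as `opts`. It is deliberately not declared as an `Ecto.Type` behaviour, because the two-argument callbacks are only a proposal in this thread. The module name `TupleTypeSketch`, the private helpers, and the supported primitive types are all assumptions.

```elixir
defmodule TupleTypeSketch do
  # Hypothetical: opts is the second element of the field type, e.g. {:float, :float}.

  def cast(value, opts), do: check(value, opts)
  def load(value, opts), do: check(value, opts)
  def dump(value, opts), do: check(value, opts)

  # Accept only tuples of the declared size whose elements match positionally.
  defp check(value, types)
       when is_tuple(value) and is_tuple(types) and tuple_size(value) == tuple_size(types) do
    pairs = Enum.zip(Tuple.to_list(value), Tuple.to_list(types))

    if Enum.all?(pairs, fn {elem, type} -> element?(type, elem) end) do
      {:ok, value}
    else
      :error
    end
  end

  defp check(_value, _types), do: :error

  # Assumed, incomplete set of element types
  defp element?(:float, v), do: is_float(v)
  defp element?(:integer, v), do: is_integer(v)
  defp element?(:string, v), do: is_binary(v)
  defp element?(:boolean, v), do: is_boolean(v)
  defp element?(_type, _v), do: false
end
```

In this sketch the value is still passed through unchanged as a tuple; the only difference from the pass-through type is that mismatched sizes or element types return `:error`.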
I actually have strictly typed casts working for tuples; they allow nesting as well, just like the map and array types.
The only thing I'm wondering about is how the tuple should be sent to the adapter.
Maybe one of the pros could chime in @josevalim @michalmuskala
If the tuple type should only work with Cassandra, yes, we could simply return a tuple all the time, but I'd like it to work with Postgres and MySQL too
@narrowtux if you are really planning to support tuples for PostgreSQL and MySQL, it is possible to convert a tuple to an underlying array of strings, but it is important to define the type of each tuple element so the conversion is correct. So the syntax might be the same as the one I proposed for Cassandra:
field :location, {:tuple, {:float, :float}}
One of the principles of ecto is not to emulate things that are not supported by the databases themselves. Given that neither PostgreSQL nor MySQL support tuple types natively, I don't see a possibility of having support for them in ecto for those databases.
PostgreSQL has composite types [1] that are essentially tuples and that's how postgrex decodes them.
[1] https://www.postgresql.org/docs/9.6/static/rowtypes.html
I think implementing the tuple type for PostgreSQL and MySQL is out of scope for the current issue. I only need a way to work with tuples for Apache Cassandra, which supports them natively.
https://github.com/elixir-ecto/ecto/issues/1871#issuecomment-269208170
I did exactly that.
Ok so I'll just have it return a tuple for now.
Pull requests that add those features to Ecto are welcome. If you need help on getting it done, please let us know and we will be glad to provide directions!