Ecto: utc_datetime having no affect on PostgreSQL DDL

Created on 20 Dec 2016  路  11Comments  路  Source: elixir-ecto/ecto

Precheck

  • Related to issue #535 which introduced the concept of explicitly storing timestamps with a time zone (in particular, for explicitly storing the UTC TZ).

Environment

  • Elixir version (elixir -v): Elixir 1.3.4, Erlang/OTP 19 [erts-8.1] [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]
  • Database and version (PostgreSQL 9.4, MongoDB 3.2, etc.): PostgreSQL 9.6.1
  • Ecto version (mix deps): ecto 2.1.0, phoenix_ecto 3.1.0
  • Database adapter and version (mix deps): postgrex 0.13.0
  • Operating system: Client macOS 10.11.6, DB server Alpine Linux 5.3.0

Current behavior

Migration:

    create table(:foo) do
      add :utcDT, :utc_datetime
      add :naiveDT, :naive_datetime
      add :userId, :uuid

      timestamps(type: :utc_datetime)
    end

DDL:

CREATE TABLE IF NOT EXISTS public.foo (
    id integer DEFAULT nextval('foo_id_seq'::regclass) NOT NULL,
    "utcDT" timestamp without time zone,
    "naiveDT" timestamp without time zone,
    "userId" uuid,
    inserted_at timestamp without time zone NOT NULL,
    updated_at timestamp without time zone NOT NULL,
    PRIMARY KEY(id)
);

Expected behavior

The DDL for :utc_datetime columns should be timestamp with time zone, not timestamp without time zone.

Most helpful comment

@delitescere the current behaviour is correct. The :utc_datetime means Ecto will guarantee dates put in are in UTC and consider the dates coming out are in UTC as well. So we are aligning with the database expectation except we guarantee it at the Ecto level. We may introduce a database backed one in the future.

All 11 comments

Looks like it's a change needed to lib/ecto/adapters/postgres/connection.ex:

    defp ecto_to_db(:utc_datetime),   do: "timestamptz"
    defp ecto_to_db(:naive_datetime), do: "timestamp"

rather than

    defp ecto_to_db(:utc_datetime),   do: "timestamp"
    defp ecto_to_db(:naive_datetime), do: "timestamp"

The timestamp with timezone type does not do what you think it does, in particular it doesn't store the timezone in the database.

Citing from https://www.postgresql.org/docs/9.6/static/datatype-datetime.html

For timestamp with time zone, the internally stored value is always in UTC (Universal Coordinated Time, traditionally known as Greenwich Mean Time, GMT). An input value that has an explicit time zone specified is converted to UTC using the appropriate offset for that time zone. (...)
When a timestamp with time zone value is output, it is always converted from UTC to the current timezone zone, and displayed as local time in that zone. To see the time in another time zone, either change timezone or use the AT TIME ZONE construct (see Section 9.9.3).

Especially the last part of converting the stored value to the server's local timezone is rarely the desired behaviour. That's why Ecto does not use this type - you can always select it explicitly using timestamptz in the migration.

If you need to store the timezone in the database (which is probably what you want) you need to use a custom type.

The storage as "always in UTC" is precisely what we want, as it is explicitly part of the type.

See #535 for the rationale for distinguishing it.

@delitescere the current behaviour is correct. The :utc_datetime means Ecto will guarantee dates put in are in UTC and consider the dates coming out are in UTC as well. So we are aligning with the database expectation except we guarantee it at the Ecto level. We may introduce a database backed one in the future.

and consider the dates coming out are in UTC as well.

By default json representation doesn't cast it as UTC (doesn't add the "Z" character), meaning client is free to interpret it any way it wants, e.g. by default a Ruby client will interpret it as local client time.

Given these constraints, is there any canonical way to ensure all representation of the time after it was casted by Phoenix will be explicitly UTC with the Z character?

@egilburg you can convert them to DateTime when converting to JSON. The goal is to eventually include a :datetime in Ecto that will include Z by default.

@delitescere the current behaviour is correct. The :utc_datetime means Ecto will guarantee dates put in are in UTC and consider the dates coming out are in UTC as well. So we are aligning with the database expectation except we guarantee it at the Ecto level. We may introduce a database backed one in the future.

If another program (not necessarily written in elixir) reads the same database that ecto wrote to using :utc_datetime as timestamp without time zone then this program will have to _make the same assumption_ that ecto does: implicitly interpret a timestamp without time zone as a timestamp without time zone AT TIME ZONE 'UTC'.

SET TIME ZONE 'UTC'; -- to display result in UTC
SELECT (TIMESTAMP WITHOUT TIME ZONE '2019-01-28T11:00:00') AT TIME ZONE 'UTC';

I believe _repeating the same assumption_ in every programs that access the data is not following the DRY convention.

Because timestamp with time zone can _explicitly_ encode a timestamp without time zone AT TIME ZONE 'UTC', that would make the overwhole system more DRY and less ambiguous to use timestamp with time zone as a mapping to :utc_datetime.

implicitly interpret a timestamp without time zone as a timestamp without time zone AT TIME ZONE 'UTC'.

That's a dangerous assumption to make. A timestamp without time zone literally means "a time stamp with an unknown timezone". Whether or not that can be assumed to be UTC depends on knowledge not possessed by Ecto. For example, you can have a web app which only accepts dates from a logged-in user who has a time zone preference, and any dates saved are using that timezone for interpretation. Another application may assume all stored dates are always using server time. While I think it's best practice to save dates as UTC, it's not something that can be reliably done for all use cases out there.

That's a dangerous assumption to make. A timestamp without time zone literally means "a time stamp _with an unknown_ timezone".

That is my point, let's not make assumptions. Let's make things explicit, in that case by using timestamp with time zone.

Whether or not that can be assumed to be UTC depends on knowledge not possessed by Ecto. For example, you can have a web app which only accepts dates from a logged-in user who has a time zone preference, and any dates saved are using that timezone for interpretation. Another application may assume all stored dates are always using server time. While I think it's _best practice_ to save dates as UTC, it's not something that can be reliably done for all use cases out there.

Assuming the "dates" you are referring to are a datetime tuple of {year, month, day, hour, minute, second}, I see 2 use cases here.

  1. The "dates" represent points in time in the past: which means that with the combination of the datetime tuple and a time zone (from the preference or server in your example) you can build an _unambiguous_ representation of that point in time using elixir DateTime. The same way you could encode a distance by using cm or mm internally without any ambiguity on the meaning of the data, it does not matter which time zone is used internally to store the point in time ("Europe/Paris" or "Etc/UTC" for example). If you think it does matter then see the second point below. You can always retrieve a datetime tuple in a specified time zone (from the preference or server in your example) from a given past DateTime irrespective of the time zone used internally by DateTime. That means we can get an unambiguous representation in "Etc/UTC" of any elixir DateTime struct and feed that into a PostgreSQL timestamp with time zone (although PostgreSQL allow for even more permissive input format such as '2003-04-12 04:05:06 America/New_York'). PostgreSQL will internally store the timestamp in UTC but it does not matter because it represents the same point in time as originally intended.

  2. The "dates" represent wall times: in that case, I would argue that elixir DateTime is not appropriate and that you should use elixir NaiveDateTime in combination with a time zone string to represent that wall time. An then it is outside the scope of that discussion because DateTime is not involved.

After a few hours of thinking, I realize that I am wrong on the use case 2 in my previous message.

Indeed, when daylight saving occur they can be 2 different wall times represented by the sole combination of a datetime tuple and a time zone. So, the elixir DateTime seem to be exactly what is needed to represent a _wall time_, which can be in the past or not.

And indeed, storing a wall time as timestamp with time zone in PostgreSQL would not be a smart idea.

Whether or not that can be assumed to be UTC depends on knowledge not possessed by Ecto. For example, you can have a web app which only accepts dates from a logged-in user who has a time zone preference, and any dates saved are using that timezone for interpretation. Another application may assume all stored dates are always using server time. While I think it's _best practice_ to save dates as UTC, it's not something that can be reliably done for all use cases out there.

Sorry if my questions seem dumb but I realize I did not fully get what you mean in your last message (the emphasis is mine).

The documentation indicates (the emphasis is mine):

utc_datetime - has a precision of seconds and casts values to Elixir鈥檚 DateTime struct and expects the time zone to be set to UTC.

So it seems to me that Ecto has the knowledge that a utc_datetime is in UTC.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

sntran picture sntran  路  4Comments

nathanjohnson320 picture nathanjohnson320  路  4Comments

kelostrada picture kelostrada  路  3Comments

madshargreave picture madshargreave  路  3Comments

brandonparsons picture brandonparsons  路  3Comments