Servo: Implement IndexedDB.

Created on 4 Aug 2015 · 40 Comments · Source: servo/servo

(Just filing as a placeholder.)

A-content/dom C-assigned E-hard I-enhancement

Most helpful comment

Just to provide some context on SQLite and other storage choices as they're being used in Gecko/Firefox:

Re: SQLite

  • SQLite 4 was an experiment, and the fancy bits (WITHOUT ROWID in particular, which allows tables to effectively maintain the lexical row ordering directly, without indirecting through primary keys) are now in SQLite 3.
  • Firefox is overhauling how we implement storage as it relates to PrivateBrowsing. Previously implementations had to be in-memory only and data would never touch disk. We're now moving to store data on disk in an encrypted form with the keys never persisted. We delete the storage at the end of the private browsing session. In the event of a crash, the data is encrypted and then only subject to analysis of what's on disk. SQLite's VFS layer makes this not too difficult, and SQLite's page-centric storage model potentially helps with the analysis-at-rest concern. (I haven't thought too much about how that would compare with log-structured-merge independent files on disk.)
  • Mozilla Corporation is a SQLite consortium member which gets us insanely professional levels of support in terms of investigations of problems we're experiencing involving SQLite, guidance, etc. They take writing safe C seriously and have an incredibly comprehensive test corpus.
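To make the WITHOUT ROWID point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `object_data` and its columns are hypothetical and are not taken from Gecko's schema; the only thing being illustrated is that a WITHOUT ROWID table is stored in primary-key order, so a key-range scan walks the table's own B-tree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per stored record, clustered by its key.
# WITHOUT ROWID makes the PRIMARY KEY the table's own B-tree key, so there
# is no separate rowid lookup.
conn.execute(
    "CREATE TABLE object_data ("
    "  key BLOB PRIMARY KEY,"
    "  value BLOB NOT NULL"
    ") WITHOUT ROWID"
)

records = [(b"pear", b"3"), (b"apple", b"1"), (b"mango", b"2")]
conn.executemany("INSERT INTO object_data (key, value) VALUES (?, ?)", records)

# A key-range scan, the shape of query an IndexedDB cursor would need.
rows = conn.execute(
    "SELECT key FROM object_data WHERE key >= ? ORDER BY key", (b"b",)
).fetchall()
keys_in_range = [row[0] for row in rows]
```

BLOB keys compare bytewise, which is why this sketch stores keys as BLOBs; how real IndexedDB keys would be encoded into such blobs is left open here.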

Re: other options:

  • https://github.com/mozilla/rkv has landed in Gecko as a Rust layer on top of the LMDB engine. I think rkv/LMDB sounded more promising than it's turned out to be, especially with there being limitations like 512-byte key limits, but it sounds like the Browser Architecture Group team may be serious about keeping this around.

Re: selfish concerns on my part ;)

  • The existing Gecko IndexedDB implementation is a lot of fairly grungy C++, and we will be looking longer term to replace it with a Rust implementation. It's not clear that Gecko could justify the increase in binary size of the RocksDB engine, so selfishly it would be nice if Servo used a database implementation compatible with what Gecko is already using.

    • I think there may be in-tree https://github.com/jgallagher/rusqlite users already? But honestly, based on a quick skim of diesel, its ORM-ish mapping potentially seems like it allows for fewer accidental mismatches when binding statements.

  • That said, the Gecko implementation may have some architectural considerations that don't make sense for an initial Servo IndexedDB implementation. Specifically, with Fission potentially consolidating origins into a single process (but potentially also not), we may want to shift where we store the canonical state and perform caching. Especially for APIs like https://github.com/WICG/kv-storage that build on IndexedDB and can potentially be very performance-conscious as they're competing against localStorage.

All 40 comments

I'd like to work on this.
Some questions before I start:
Can someone who works on this in Firefox help me?
I need to create many WebIDL files. Is it possible to create a folder like "indexed" and put all the files inside it? For example, see http://mxr.mozilla.org/mozilla-central/source/dom/

cc @nox

Note that this is a pretty large project.

To start, you'll need to pick a backend and write Rust bindings for it. We were thinking SQLite4 or a LevelDB fork, probably the former.

@farodin91 We don't support subdirectories in dom/ yet; I don't think it's worth blocking on that.

@jdm Technically we just need to use reexports and it will work fine.

@Manishearth SQLite 4 suffers from a complete lack of bindings, and my days are unfortunately made of 24 hours like everyone else's. :(

@Manishearth I can start by building a Rust API for SQLite 4.

That sounds valuable :)

I'd like to write a Rust binding for SQLite. Is there a preferred library to use for it?

This is my first time creating a Rust binding. Is there a small example of one?

Please note that SQLite resides in a Fossil repository. I had started writing bindings for it, but they're very primitive for now. I can push that somewhere this week if you want.

I created a repo for this. I can start working on it this week.
Thank you.

@nox Any update on pushing them to my repo?

Sorry things were a bit hectic for me this week.

No problem

I started working on it at https://github.com/farodin91/sqlite-rs. But currently my build script fails to build sqlite4 on Linux, and I have no idea why.

Any update on pushing to my repo?

@nox Any updates?
It is also possible to use LevelDB with the library https://github.com/skade/leveldb.

I'm picking this up. Will start by writing SQLite 4 Rust bindings.

It seems there has been no commit to SQLite 4 since 2015-08-15, and it never had an official release. Doesn't LevelDB seem more mature (it is already used for IndexedDB in Chrome)?

That might just be because they intend SQLite 4 to be stable. SQLite is under active development, see https://www.sqlite.org/src/timeline . That might be SQLite 3, but I think the codebase is mostly shared.

@Manishearth What was the reason for choosing SQLite4 over SQLite3? If we are to choose SQLite3 instead, we can use existing crates that have already generated bindings for it.

4 is a kv store, 3 isn't. It's not just a version number.

We need a NoSQL db for IDB. Ironically, SQLite 4 is more or less NoSQL.

Here's what I discovered - someone has already written Python bindings for the SQLite 4 backend LSM DB. I wonder if I should do the same, instead of writing bindings directly to SQLite 4?

"Need" is a strong word, given an existing implementation in Gecko that uses sqlite3.

Per the conversation on IRC, it looks like there are only two viable alternatives:

  • SQLite3; or
  • LevelDB

SQLite4 was determined not to be mature enough to build on for now.

I'm going to use SQLite3.

For database storage, we have two ways:

  • We can do like Gecko, and create a SQLite database per IndexedDB database.

    • 👍 We can open the database with any SQLite tool and we have everything at once.

    • 👎 We will never have concurrent transactions when one of them is a write transaction.




  • We can create one SQLite database per IndexedDB object store.

    • 👍 By using ATTACH, we can lock on a per-object store basis and still do transactions that encompass multiple object stores.

    • 👎 It becomes a bit harder to see everything from an IndexedDB database at once with SQLite tools.
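The second option (one SQLite file per object store, joined with ATTACH) could look roughly like the sketch below. It uses Python's built-in `sqlite3` module purely for illustration; the file names, the `authors` schema name, and the `records` tables are all hypothetical. Note that a later comment in this thread raises crash-robustness concerns about ATTACH, so treat this as a picture of the idea rather than a recommendation.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical layout: one SQLite file per IndexedDB object store.
    books_path = os.path.join(workdir, "books.sqlite")
    authors_path = os.path.join(workdir, "authors.sqlite")

    conn = sqlite3.connect(books_path)
    # Make the second object store's file visible under the schema "authors".
    conn.execute("ATTACH DATABASE ? AS authors", (authors_path,))
    conn.execute("CREATE TABLE main.records (key BLOB PRIMARY KEY, value BLOB)")
    conn.execute("CREATE TABLE authors.records (key BLOB PRIMARY KEY, value BLOB)")
    conn.commit()

    # One transaction that writes to both object stores; "with conn" commits
    # on success and rolls back if an exception is raised.
    with conn:
        conn.execute("INSERT INTO main.records VALUES (?, ?)", (b"b1", b"Dune"))
        conn.execute(
            "INSERT INTO authors.records VALUES (?, ?)", (b"a1", b"Herbert")
        )

    book_count = conn.execute("SELECT count(*) FROM main.records").fetchone()[0]
    author_count = conn.execute(
        "SELECT count(*) FROM authors.records"
    ).fetchone()[0]
    conn.close()
```

A transaction that touches only one of the files takes locks on that file alone, which is where the per-object-store locking in the first bullet would come from.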




For indices, we also have two ways, the old Gecko way and a very fancy way that I would like to experiment with:

  • We can do like Gecko, and have SQLite tables for IndexedDB indices. In Gecko, elements from all object stores are stored in a single table, and all the entries in all indices are also stored in two tables, one with a PK for unique indices, and another one for non-unique indices.

    • 👍 We can open the database with any SQLite tool and we have everything at once.

    • 👎 It duplicates a lot of data for each index.




  • We can do like the json1 SQLite extension and implement and document a "Serialised Structured Clone" binary subtype in SQLite, and then implement a function to extract a key path from this format (similar to the json function in json1). We can then use that method to create an index on an expression for each IndexedDB index we want. We can make that function return NULL when the key path is not found, and use a partial index to ignore such values.

    • 👍 It is quite cool as a showcase of what fancy things we do.

    • 👍 We can enforce that the things we put in that database are actually serialised structured clones.

    • 👍 It makes IndexedDB indices piggyback SQLite indices.

    • 👎 It requires plugging custom code into SQLite, and these APIs aren't bound in the existing SQLite crates.
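As a rough illustration of the "fancy" option, the sketch below registers a custom key-path extraction function with SQLite and builds a partial index on an expression that calls it. It is written with Python's built-in `sqlite3` module rather than a Rust crate, and it uses JSON as a stand-in for a serialised structured clone, which is an assumption made only to keep the example short. The function name `extract_key_path`, the `records` table, and the `by_author` index are all hypothetical.

```python
import json
import sqlite3


def extract_key_path(blob, key_path):
    """Return the scalar at a dotted key path, or None if it is absent."""
    value = json.loads(blob)
    for part in key_path.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    # Only scalar values are treated as index keys in this sketch.
    return value if isinstance(value, (str, int, float)) else None


conn = sqlite3.connect(":memory:")
# Functions used in an index expression must be registered as deterministic.
conn.create_function("extract_key_path", 2, extract_key_path, deterministic=True)

conn.execute("CREATE TABLE records (key BLOB PRIMARY KEY, value BLOB NOT NULL)")
# Expression index plus partial-index predicate: rows lacking the key path
# yield NULL and are left out of the index.
conn.execute(
    "CREATE INDEX by_author ON records (extract_key_path(value, 'author.name')) "
    "WHERE extract_key_path(value, 'author.name') IS NOT NULL"
)

docs = {
    b"k1": {"title": "Dune", "author": {"name": "Herbert"}},
    b"k2": {"title": "Anonymous pamphlet"},
    b"k3": {"title": "Emma", "author": {"name": "Austen"}},
}
for key, doc in docs.items():
    conn.execute(
        "INSERT INTO records VALUES (?, ?)", (key, json.dumps(doc).encode("utf-8"))
    )

# The query repeats the indexed expression so SQLite can match it to the index.
rows = conn.execute(
    "SELECT key FROM records "
    "WHERE extract_key_path(value, 'author.name') IS NOT NULL "
    "ORDER BY extract_key_path(value, 'author.name')"
).fetchall()
keys_by_author = [row[0] for row in rows]
```

The record without an `author.name` path never enters the index, which mirrors the "return NULL and use a partial index" idea above. A real implementation would need the same hook (an application-defined deterministic function) exposed by whichever Rust binding is used.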




Some interesting links:

This would also require a reimplementation of the serialised structured clone format described here (or something similar to it that we can store and introspect directly from SQLite).

We can't use ATTACH for my first idea because its documentation states that it's not really robust when faced with crashes.

There are various things necessary on the Servo side, at least these three things:

Blocks #13942

hi @nox, do you know if the issue has advanced any further? Thanks!

It has not.

I am looking for a new task, and I thought this one was interesting. I have a few questions before I am comfortable claiming this issue.

As I understand it, IndexedDB should be implemented using a WAL to favor performance. This decision was made in Gecko, and the same approach is seen in Chrome/IE. See change here. This is despite the new 3.0 specification stating that it should be durable.
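If the backend ended up being SQLite (an assumption; the comment above does not name one), the performance-versus-durability trade-off it describes roughly corresponds to two pragmas. The sketch below uses Python's built-in `sqlite3` module and a hypothetical file name, and only shows how the settings are applied and read back.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    # WAL needs a file-backed database; it is not available for ":memory:".
    conn = sqlite3.connect(os.path.join(workdir, "idb.sqlite"))
    # Switch to write-ahead logging; the pragma reports the mode now in effect.
    journal_mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    # NORMAL avoids syncing on every commit: faster, but the most recent
    # commits can be lost on power failure (the database stays consistent).
    conn.execute("PRAGMA synchronous=NORMAL")
    synchronous = conn.execute("PRAGMA synchronous").fetchone()[0]
    conn.close()
```

Setting `synchronous` to `FULL` instead would trade that speed back for durability on every commit.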

Since SQLite4 never got released and SQLite3 is not the obvious choice, perhaps RocksDB is a valid option?

It is widely used and seems to fit IndexedDB and Servo quite well: it is a fast, concurrent, embedded KV store.
https://rocksdb.org/
https://rocksdb.org/docs/support/faq.html

It also has great Rust bindings through rust-rocksdb, used by e.g. TiDB. This would make it a lot easier to get a working IndexedDB.

Thoughts?

Seems like it's worth a shot!

@highfive: assign me

Hey @rasviitanen! Thanks for your interest in working on this issue. It's now assigned to you!
