Parse-server: [PROPOSAL/DISCUSSION] AWS DynamoDB support

Created on 24 May 2017 · 31 Comments · Source: parse-community/parse-server

I think DynamoDB would be a good integration for Parse, for those who don't want to manage their own database.

I will start writing an Adapter for this.

But which is better: creating a separate module called parse-server-dynamodb-adapter, or putting it in this repo?

All 31 comments

That's a great idea. The biggest issue would be testing the adapter, as we don't have a specific spec for the DB adapters; instead we run the full test suite against them. Ideally, we'd need a single set of specs that fully covers the database adapters. In the meantime, I believe you can start in that repo, and we'll see how we go from there.

Once the database adapter specs are properly written, I'd move the Postgres adapter out to its own repository.

It is also possible to migrate data from MongoDB to DynamoDB using the AWS Database Migration Service. AWS claims zero downtime.

https://aws.amazon.com/about-aws/whats-new/2017/04/aws-database-migration-service-adds-support-for-mongodb-and-amazon-dynamodb/

@flovilmart

I have almost finished implementing this as an independent adapter. I will publish the repo as soon as I finish testing.

I was able to test it with the Parse JS SDK and the results are very promising.
I used the same structure as the MongoAdapter to minimize or eliminate any changes in the controllers. I also used the MongoTransform from Parse Server; I only had to override some methods and handle the Date type, as DynamoDB doesn't support dates at all! I'm storing dates as strings, and it seems that Parse Server is able to parse them on read.
DynamoDB also cannot store attributes like __type. I wanted to store pointers as they are, without converting them to strings, but it didn't work, so I followed the MongoAdapter way. This is actually good if some people want to migrate from MongoDB to DynamoDB.

Do you have any suggestions on how best to write tests for the adapter? Is mocking/stubbing the DynamoDB API a good idea to make sure it is working as expected?

store the pointers as it is without converting it to string

It's more efficient to convert them to strings.

I need to find a way to abstract the loading of the adapter from the test suite, this way you would be able to run the full test suite against your adapter and see what's going on.

@flovilmart you can have a look at the code here:
https://bitbucket.org/wahb/parse-server-dynamodb-adapter

I will migrate it to GitHub later, when I'm done with testing.

Some tests are now available.

@benishak you're the best 👍

@benishak Keep up the good work! Will test when ready, would love to see some cost savings switching to DynamoDB!

Would plain old SQL be worth considering?

Thanks everyone for the support, I will share the GitHub repo as soon as I'm done.

@dplewis
I'm not sure what you mean, but DynamoDB doesn't support SQL. It has a custom expression language which looks like this:

#field1 = :value AND #field2 < :value2

Now I'm converting the Parse query, which is compatible with the MongoDB query format (using JSON), into a DynamoDB expression.
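To make the conversion concrete, here is a minimal sketch of how a flat, MongoDB-style query could be turned into DynamoDB expression parameters. It is not the adapter's actual code: `toFilterExpression` and the `#f0` / `:v0` placeholder scheme are hypothetical, only simple comparison operators are handled, and the values are left as plain JavaScript values (the form the SDK's DocumentClient accepts).

```javascript
// Hypothetical mapping from MongoDB-style operators to DynamoDB comparators.
const OPERATORS = { $eq: '=', $ne: '<>', $gt: '>', $gte: '>=', $lt: '<', $lte: '<=' };

// Hypothetical helper: builds a FilterExpression plus its placeholder maps.
function toFilterExpression(query) {
  const names = {};
  const values = {};
  const clauses = [];
  let fieldIndex = 0;
  let valueIndex = 0;

  for (const [field, constraint] of Object.entries(query)) {
    const nameKey = '#f' + fieldIndex;
    fieldIndex += 1;

    // A bare value such as {username: 'bob'} is shorthand for {$eq: 'bob'}.
    const isOperatorObject =
      constraint !== null && typeof constraint === 'object' && !Array.isArray(constraint);
    const conditions = isOperatorObject ? constraint : { $eq: constraint };

    for (const [op, value] of Object.entries(conditions)) {
      if (!(op in OPERATORS)) {
        throw new Error('Unsupported operator: ' + op);
      }
      const valueKey = ':v' + valueIndex;
      valueIndex += 1;
      names[nameKey] = field;
      values[valueKey] = value;
      clauses.push(nameKey + ' ' + OPERATORS[op] + ' ' + valueKey);
    }
  }

  return {
    FilterExpression: clauses.join(' AND '),
    ExpressionAttributeNames: names,
    ExpressionAttributeValues: values,
  };
}

const result = toFilterExpression({ username: 'benishak', score: { $gt: 10, $lte: 20 } });
console.log(result.FilterExpression);
// #f0 = :v0 AND #f1 > :v1 AND #f1 <= :v2
```

Using placeholders for every name and value sidesteps reserved words and attribute names that cannot appear literally in an expression.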

@benishak sorry, I didn't want to open up a new discussion about adding SQL as a database option. Also, good work, I can't wait to see it.

@flovilmart In DynamoDB you cannot use the $ne operator on _id. Is there any reason why Parse is checking the _id in this case?

{username: this.data.username, objectId: {'$ne': this.objectId()}}
...
{email: this.data.email, objectId: {'$ne': this.objectId()}}

https://github.com/parse-community/parse-server/blob/6cc99aa193f7b733a04748a543c48d2f3ee7fc85/src/RestWrite.js#L420

What is Parse trying to check here? I replaced those queries with

{username: this.data.username}
...
{email: this.data.email}

And nothing breaks. Can I open a pull request, or is this a change that could break things?

It's checking that the usernames and emails are not already taken by another user.

@flovilmart I know that and I can see that. I'm asking what the point of passing {'$ne': this.objectId()} to the query is. It makes no sense to me; I think it is safe to remove it and only pass username or email to the query, like this:

{username: this.data.username}
...
{email: this.data.email}

No, it isn't: that would return the user itself.

So what can I do? This is not going to work on DynamoDB. In DynamoDB you cannot apply "not equal" to the primary/range key (objectId).

I really see no sense in passing {'$ne': this.objectId()}: if the username exists, the result.length of the query {username: this.data.username} will be > 0 anyway!

Yeah, but this is something at the core of the server. I understand your frustration, but I'm not really happy changing something like that because Dynamo doesn't support it. What will you do to support all the queries on objectId then?

In DynamoDB there are 2 keys:

  • partition key (required): it defines your partition and it is unique; I'm setting this to the className
  • sort key (optional): the primary key within the partition. It can be anything you want, _id or username, but it has to apply to all partitions (classes), thus I picked _id. Having a sort key helps index your data, otherwise you have to create multiple indexes; it also ensures uniqueness.

You don't need to set up a sort key, so I could just not set one, but this would make it possible to insert multiple items with the same objectId and make getting items by id, which is incredibly fast on DynamoDB, impossible.
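As an illustration of that key layout, a table definition along these lines would use className as the partition key and _id as the sort key. This is only a sketch: the table name and the throughput numbers are placeholders, not values taken from the adapter.

```javascript
// Sketch of createTable parameters for the key layout described above.
// 'parse-server' and the capacity units are placeholder assumptions.
const createTableParams = {
  TableName: 'parse-server',
  AttributeDefinitions: [
    { AttributeName: 'className', AttributeType: 'S' },
    { AttributeName: '_id', AttributeType: 'S' },
  ],
  KeySchema: [
    { AttributeName: 'className', KeyType: 'HASH' }, // partition key
    { AttributeName: '_id', KeyType: 'RANGE' },      // sort key
  ],
  ProvisionedThroughput: { ReadCapacityUnits: 5, WriteCapacityUnits: 5 },
};

// With the AWS SDK for JavaScript (v2) this would be passed along as:
//   new AWS.DynamoDB().createTable(createTableParams, callback);
console.log(createTableParams.KeySchema.map(k => k.AttributeName + ':' + k.KeyType).join(', '));
// className:HASH, _id:RANGE
```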

I actually think this part of the code is not correct in Parse and shouldn't be like that. I will do some tests and come back to you.

In DynamoDB you can use $eq, $gt, $gte, $lt, $lte, BETWEEN and BEGINS_WITH on objectId; on all other keys you can use anything else, like $ne, $not, $in, $nin, $regex (contains), $exists, $type, $null, $undefined.

I really don't see any use for $ne on a database primary key in general. I think many databases don't allow this either, not just DynamoDB.

I only have this problem with user signup; otherwise, whoever uses DynamoDB should follow best practices when querying their data.

So I don't think this is the only way to check whether a username or email already exists.

We don't have the problem with Postgres either. I also read that in Dynamo you can apply filters that are run a posteriori. This is most probably what you should use for all the constraints on the objectId.

Thanks!
In DynamoDB there are 2 kinds of filters/query conditions:

  • KeyConditionExpression / KeyConditions: a query on the partition/sort key. For KeyConditions, only the following comparison operators are supported: EQ | LE | LT | GE | GT | BEGINS_WITH | BETWEEN, and you cannot include attributes that are not the partition or sort key. All queries MUST contain at least the partition key.

  • FilterExpression / QueryFilter: a filter applied to the other, non-key attributes; it supports all kinds of comparisons.

A query can contain both KeyConditions and QueryFilter:

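var params = {
    TableName: "Music",
    // Expressions cannot contain literal values, so placeholders are used
    KeyConditionExpression: "#className = :className AND #id = :id",
    FilterExpression: "#username = :username",
    ExpressionAttributeNames: { "#className": "className", "#id": "_id", "#username": "username" },
    // plain values, in the form accepted by AWS.DynamoDB.DocumentClient
    ExpressionAttributeValues: { ":className": "_User", ":id": 123467, ":username": "benishak" }
};

Dynamo.query(params, callback);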

So, back to checking the existence of a username: I did some tests, and here is a way to make it work on DynamoDB without breaking anything. The $ne query is only useful when an existing user tries to change their username to the one they already have.

// ...
const thisObjectId = this.objectId();
return this.config.database.find(
    this.className,
    {username: this.data.username},
    {limit: 1}
  ).then(results => {
    if (results.length > 0) {
      const foundUser = results[0];
      // The match is the user being saved: the username is unchanged, not taken
      if (foundUser.objectId === thisObjectId) return;
      throw new Parse.Error(Parse.Error.USERNAME_TAKEN, 'Account already exists for this username.');
    }
  });

// do the same for email

I think this is a reasonable solution, isn't it?

I doubt that this will be the only thing that requires a change to be compatible with the whole test suite.

How do you run the tests?
You should probably:

  1. Fork the repo
  2. Add your database adapter
  3. In spec/helper.js load/configure your adapter
  4. Set up Travis accordingly
  5. Probably apply your signup patch

And report back with the failures.

@flovilmart
I have this temporary repo; you can check the tests here:
https://travis-ci.org/benishak/parse-server/jobs/242707542

After a while I start getting these errors; the connection is closed and never opened again:

- Error: listen EADDRINUSE :::8378
      - Failed: There were open connections to the server left after the test finished 
...
 Failed: XMLHttpRequest failed: "Unable to connect to the Parse API"

My plan is to have this as a separate adapter, not to be included in the Parse core project, at least for now. This fork just solves some testing issues for the moment, so I'm not planning to make a PR here.

Some tests are expected to fail because of the limitations of DynamoDB, but I'm working on improvements ...

@benishak when you're done I will use your adapter as a reference for writing one for the Neo4J graph database. Keep up the good work!

For now, most tests are passing for Parse.Query.

TL;DR

The failing tests are expected to fail for these reasons:

  • DynamoDB doesn't support matching strings using $regex. However, I made startsWith and contains work with simple strings, like startsWith('foo') and contains('bob'), and I'm trying to improve this at the moment.
  • DynamoDB doesn't support skipping, but it does support pagination: skip would need to include an objectId from which DynamoDB should skip results. I'm trying to make this possible, but pagination using $gt / $lt should work on any key.
  • DynamoDB doesn't support sorting on keys other than objectId; however, I may do the sorting in the adapter layer.
  • A containedIn array should not include more than 100 values!
  • DynamoDB cannot store bytes as "Byte".

However:

  • query.each and query.each async should succeed. They succeed sometimes, but sometimes they time out; I'm looking into this at the moment.
  • I don't know why, when you insert an array of Pointers, they are not transformed to strings like ['_User$1234', '_User$1230' ...]. Using the attribute name __type is an issue in DynamoDB, as it seems to be a reserved keyword. I can try to transform the array or the attribute __type; using _type or type or ___type is just fine.

I also solved the issue of user signup and of applying filters/queries on the objectId by duplicating the field. This makes fetch, include and get queries fast, because DynamoDB is a hybrid key-value/document-store database: I check whether the query includes the key _id, and if it does I use getItem, otherwise the query method.
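The getItem-or-query decision could be sketched roughly as follows. This is an illustration, not the adapter's code: `chooseOperation` is a hypothetical helper, the table name is a placeholder, and the attribute values are written in the typed `{ S: ... }` form that the low-level getItem and query calls expect. Translating any remaining constraints into a filter is left out.

```javascript
// Hypothetical helper: picks the DynamoDB operation for a Parse-style query.
function chooseOperation(tableName, className, query) {
  // An exact _id match identifies a single item, so a key lookup is enough.
  if (typeof query._id === 'string') {
    return {
      operation: 'getItem',
      params: {
        TableName: tableName,
        Key: { className: { S: className }, _id: { S: query._id } },
      },
    };
  }

  // Otherwise query the whole class (partition); other constraints would
  // still have to be translated into key conditions or a filter expression.
  return {
    operation: 'query',
    params: {
      TableName: tableName,
      KeyConditionExpression: '#className = :className',
      ExpressionAttributeNames: { '#className': 'className' },
      ExpressionAttributeValues: { ':className': { S: className } },
    },
  };
}

console.log(chooseOperation('parse-server', '_User', { _id: 'abc123' }).operation);
// getItem
console.log(chooseOperation('parse-server', '_User', { username: 'bob' }).operation);
// query
```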

I'm working to make more tests succeed as long as it is supported by DynamoDB or a reasonable workaround is possible.

The current state of all tests; we are getting closer ...!

Executed 1269 of 1278 specs (147 FAILED) (6 PENDING) (3 SKIPPED) in 10 mins 48 secs.

https://travis-ci.org/benishak/parse-server/jobs/243806546

Current status:

Executed 1241 of 1271 specs (57 FAILED) (27 PENDING) (3 SKIPPED) in 11 mins 46 secs.

Most of the 57 failing tests are compatibility issues:

  • Unique indexes: DynamoDB doesn't support these. However, I'm thinking of creating a collection named _UNIQUE_INDEX and inserting information about the unique fields into it. I will load the content of that collection into the inMemoryCache at adapter initialization, and before inserting/updating I will check for duplicates by issuing a query.
  • Date: DynamoDB cannot store date objects, so I transform them into strings, which Parse seems to handle without problems. When reading data, I can see createdAt, updatedAt and other date attributes having the correct values, but some tests try something like changedAt.getTime(), which fails because changedAt is a string. I will solve this by loading the schema at query time, checking which fields have a date type and transforming them back into Date objects.
  • GeoPoint: storing/updating and simple queries are fine; however, $near and similar operations are not supported and I have no solution for them.
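The Date workaround from the list above (use the schema to find date fields and turn the stored strings back into Date objects) might look roughly like this. `reviveDates` is a hypothetical helper, and the schema shape `{ fields: { name: { type: 'Date' } } }` is an assumption made for the example.

```javascript
// Hypothetical helper: converts stored date strings back into Date objects
// for every field the schema declares as a Date.
function reviveDates(object, schema) {
  const revived = Object.assign({}, object); // leave the stored item untouched
  for (const [field, definition] of Object.entries(schema.fields)) {
    if (definition.type === 'Date' && typeof revived[field] === 'string') {
      revived[field] = new Date(revived[field]);
    }
  }
  return revived;
}

const schema = { fields: { changedAt: { type: 'Date' }, title: { type: 'String' } } };
const stored = { title: 'hello', changedAt: '2017-06-16T10:00:00.000Z' };
const loaded = reviveDates(stored, schema);

console.log(loaded.changedAt instanceof Date); // true
console.log(typeof loaded.changedAt.getTime()); // number
```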

Other compatibility issues:

  • Skip: can be solved if Parse.Query.skip() can accept a string; by passing an objectId, skip is doable.
  • Sort: sorting after the query is doable with local sorting; however, if you want to sort before you query the data, this is going to be impossible!
  • Matching strings with RegEx: except for contains('string') and startsWith('string'), there is no solution for this.
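For the two supported cases, the mapping onto DynamoDB's begins_with and contains functions could be sketched like this. The sketch assumes that the SDK sends startsWith('foo') as the pattern ^\Qfoo\E and contains('bob') as \Qbob\E; `regexToCondition` is a hypothetical helper, and any other pattern is reported as unsupported.

```javascript
// Hypothetical helper: maps a quoted $regex pattern to a DynamoDB condition.
// Returns null when the pattern is not a plain startsWith/contains match.
function regexToCondition(nameKey, valueKey, pattern) {
  // Optional leading ^, then a literal wrapped in \Q ... \E.
  const match = /^(\^?)\\Q([\s\S]*)\\E$/.exec(pattern);
  if (match === null || match[2].indexOf('\\E') !== -1) {
    return null;
  }
  const anchored = match[1] === '^';
  const fn = anchored ? 'begins_with' : 'contains';
  return {
    expression: fn + '(' + nameKey + ', ' + valueKey + ')',
    value: match[2],
  };
}

console.log(regexToCondition('#name', ':v0', '^\\Qfoo\\E'));
// { expression: 'begins_with(#name, :v0)', value: 'foo' }
console.log(regexToCondition('#name', ':v0', '\\Qbob\\E'));
// { expression: 'contains(#name, :v0)', value: 'bob' }
console.log(regexToCondition('#name', ':v0', '^fo+$'));
// null
```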

Otherwise, these queries are failing and succeeding randomly:

  • Parse.Query query.each
  • Parse.Query query.each async
  • Parse.Query async methods
    The assertion is being executed too early, but I debugged the outputs of the query and they are as expected!
  • I also excluded the ParseGlobalConfig tests because one of the tests sets the objectId to an integer! So I'm not sure whether I need to change my code or change the test to make it work?

What do you think, @flovilmart?

Update on uniqueness:

I managed to implement uniqueness at the application layer. Checking for duplicates will be fast, as DynamoDB is a document and key-value store. However, with async code, uniqueness is not guaranteed.
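A check-then-insert approach of this kind can be sketched as below, which also shows where the guarantee breaks down. `FakeStore` is an in-memory stand-in and `insertUnique` a hypothetical helper; neither is taken from the adapter.

```javascript
// In-memory stand-in for the storage layer, for illustration only.
class FakeStore {
  constructor() {
    this.items = [];
  }
  async find(className, query) {
    return this.items.filter(item =>
      item.className === className &&
      Object.keys(query).every(key => item[key] === query[key]));
  }
  async insert(className, object) {
    this.items.push(Object.assign({ className: className }, object));
  }
}

// Hypothetical helper: rejects the insert when a unique field is already used.
async function insertUnique(store, className, object, uniqueFields) {
  for (const field of uniqueFields) {
    const existing = await store.find(className, { [field]: object[field] });
    if (existing.length > 0) {
      throw new Error('Duplicate value for unique field: ' + field);
    }
  }
  // Not atomic: a parallel insert can land between the check and this write.
  await store.insert(className, object);
}

async function demo() {
  const store = new FakeStore();
  await insertUnique(store, '_User', { username: 'bob' }, ['username']);
  try {
    await insertUnique(store, '_User', { username: 'bob' }, ['username']);
  } catch (error) {
    console.log(error.message); // Duplicate value for unique field: username
  }
}
demo();
```

Two inserts started in parallel can both pass the check before either one writes, which is the gap mentioned above.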

stable release
https://github.com/benishak/parse-server-dynamodb-adapter

also available via npm
https://www.npmjs.com/package/parse-server-dynamodb-adapter

I will include the parse-server specs in the tests soon. So far all tests are passing, except for 18 that fail due to limitations in DynamoDB.

enjoy!

Awesome job! If you want to add the link to the module on parse-community.github.io and in the docs, feel free to open the PRs!

failing due to limitations in DynamoDB.

Limitations with the test cases or permanent limitations due to DynamoDB? I'm confused - if the tests are failing, not 100% pass, shouldn't that mean that DynamoDB can't really fully utilize the Parse platform and there will be broken features?

@flovilmart Thanks! sure I will!

@nitrag
Not necessarily, the specs are already skipping some tests for other databases too.

I guess they are permanent limitations of DynamoDB. They could be solved in the future if AWS adds these features.
I listed the compatibility issues in the repo; you can check them there.
Major limitations include:

  • Skip: this is partly because the Parse SDK cannot pass a string; skip in DynamoDB can be done by passing the objectId.
  • Sort: you can sort by objectId but not by other keys.
  • RegEx is not supported, but you can still use startsWith and contains with strings.
  • You cannot have objects bigger than 400KB.
  • You cannot use the $in operator with an array larger than 100 when using containedIn.
  • GeoPoint operations like $near etc. are not supported, but you can save GeoPoints as an object or array.
  • Uniqueness cannot be guaranteed if you insert data in parallel, but this is not an issue: I don't think that 2 or more users in this world will sign up with the same username and hit the submit button at the exact same moment.

I tried my best to make some operations work with workarounds, like sorting, which works but only after the results are found. This is okay, unless you want to combine it with skip for pagination or the like.
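Sorting after the results are found could be done with something like the sketch below. `sortResults` is a hypothetical helper and the MongoDB-style sort spec (`{ field: 1 }` ascending, `{ field: -1 }` descending) is an assumption; it only orders the items that were already fetched, which is why it does not combine well with skip.

```javascript
// Hypothetical helper: sorts already-fetched results in the adapter layer.
function sortResults(results, sort) {
  const keys = Object.entries(sort);
  // slice() so the caller's array is not reordered in place
  return results.slice().sort((a, b) => {
    for (const [field, direction] of keys) {
      if (a[field] < b[field]) return -direction;
      if (a[field] > b[field]) return direction;
    }
    return 0;
  });
}

const rows = [
  { name: 'b', score: 1 },
  { name: 'a', score: 2 },
  { name: 'c', score: 2 },
];
const sorted = sortResults(rows, { score: -1, name: 1 });
console.log(sorted.map(row => row.name).join(','));
// a,c,b
```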

Otherwise there should be no issue using DynamoDB with Parse; a lot of apps don't need RegEx, GeoPoints, storing large arrays, etc.
