Amplify-cli: @searchable directive does not index existing documents

Created on 3 Sep 2018 · 13 Comments · Source: aws-amplify/amplify-cli

Do you want to request a feature or report a bug?
Bug

What is the current behavior?
When the @searchable directive is added to an existing model (one that already has entries in DynamoDB), the Elasticsearch service does not index the existing documents. It only indexes documents created after the @searchable directive was added.

Steps to reproduce the issue:

  1. Create a model in /amplify/backend/api/schema.graphql
  2. Use amplify push to deploy the CloudFormation template and publish the API
  3. Use the AppSync GraphQL explorer to create new documents for that model
  4. Go back to the schema.graphql file and add the @searchable directive to the model
  5. Use amplify push to update the resources

What is the expected behavior?
Elasticsearch service should index all existing documents for a given model, and not just new ones.

Additional environment details

  • OS version (ie Windows 10 build X, macOS Sierra 10.12.6, etc.): Ubuntu 18, Node v.8.9.3
  • Output of amplify --version: 0.1.16
  • Did this work in previous versions?: N?
feature-request graphql-transformer

All 13 comments

This is a feature request. Backfilling data is a goal in the future but is not immediately in scope.

+1. I am running into this too. Is there a workaround in the meantime like using https://www.npmjs.com/package/dynamodb-to-elasticsearch to backfill the data?

+1 for this feature. Is there an easy way to manually do this?

@mikeparisstuff +1 for the feature.

+1 for this feature.

+1 for this feature

+1 enterprise must-have.

If you have a DDB Stream -> Lambda -> ES setup, then you could add a field, e.g. 'migration_run: 1', to all existing records in the DynamoDB table so that they get streamed to Elasticsearch.
Put some throttling or scheduling in place to not impact normal operation.
That's how AWS would do it 😆

Any updates on this? Without this, it is pretty hard to use @searchable. Even a manual workaround would be great.

@malcomm The fix is fairly simple. Create a migration Lambda function. Query all documents (or query a batch and re-run). Make it update one "migration" field in each document. The DynamoDB stream will then sync the entire document to Elasticsearch. You can then delete the "migration" field or keep it around for a future DB migration.
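
For anyone who wants a starting point, here is a minimal Python sketch of the "touch every item" approach described above. It is not an official script. It assumes a boto3 DynamoDB Table resource (or anything with the same scan and update_item methods); the function names, the "migration_run" field name, and the pause_seconds throttle are made up for illustration.

```python
import time
from typing import Any, Dict, Iterator, List


def scan_keys(table: Any, key_names: List[str]) -> Iterator[Dict[str, Any]]:
    """Yield the primary key of every item, following scan pagination."""
    kwargs: Dict[str, Any] = {
        # Only fetch the key attributes; placeholders avoid reserved words.
        "ProjectionExpression": ", ".join(f"#k{i}" for i in range(len(key_names))),
        "ExpressionAttributeNames": {f"#k{i}": n for i, n in enumerate(key_names)},
    }
    while True:
        page = table.scan(**kwargs)
        for item in page.get("Items", []):
            yield {name: item[name] for name in key_names}
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            break
        kwargs["ExclusiveStartKey"] = last_key


def touch_all(table: Any, key_names: List[str], run_id: int,
              field: str = "migration_run", pause_seconds: float = 0.0) -> int:
    """Set `field` to `run_id` on every item so each one emits a stream record.

    `field` and `pause_seconds` are hypothetical knobs for this sketch.
    Returns the number of items updated.
    """
    touched = 0
    for key in scan_keys(table, key_names):
        table.update_item(
            Key=key,
            UpdateExpression="SET #f = :v",
            ExpressionAttributeNames={"#f": field},
            ExpressionAttributeValues={":v": run_id},
        )
        touched += 1
        if pause_seconds:
            time.sleep(pause_seconds)  # crude throttle to spare normal traffic
    return touched


def run_against_aws(table_name: str, key_names: List[str], run_id: int) -> int:
    """Convenience wrapper; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers above work without it
    table = boto3.resource("dynamodb").Table(table_name)
    return touch_all(table, key_names, run_id, pause_seconds=0.05)
```

Use a new run_id on every run: an update that does not change any data in the item does not produce a stream record, so re-running with the same value would not re-index anything.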

Hi everyone - we've added a section to our documentation explaining how to backfill existing data: https://docs.amplify.aws/cli/graphql-transformer/directives#backfill-your-elasticsearch-index-from-your-dynamodb-table

Thanks @brene, the script works beautifully! For anyone putting together the script by brene: you can find the _Lambda function ARN_ in AWS > Services > Lambda. To my surprise there really is an auto-generated Lambda called DdbToEsFn, which I didn't know about. You can find the _Event source ARN_ in AWS > DynamoDB > <Your Table> > Overview Tab > Stream details.
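
To show where those two ARNs end up, here is a rough Python sketch of the general idea: wrap scanned items in a payload shaped like a DynamoDB Streams event and invoke the indexing Lambda directly. This is not the script from the documentation. The helper names are hypothetical, and which event fields the generated function actually reads is an assumption, so check the function's source before relying on it.

```python
import json
from typing import Any, Dict, List


def build_stream_event(items: List[Dict[str, Any]], key_names: List[str],
                       event_source_arn: str, region: str) -> Dict[str, Any]:
    """Wrap items in a payload shaped like a DynamoDB Streams event.

    `items` must be in DynamoDB attribute-value form, e.g. {"id": {"S": "1"}},
    which is what the low-level boto3 client's scan() returns.
    """
    records = []
    for item in items:
        records.append({
            "eventName": "INSERT",  # pretend each item was just created
            "eventSource": "aws:dynamodb",
            "awsRegion": region,
            "eventSourceARN": event_source_arn,  # the stream ARN of the table
            "dynamodb": {
                "Keys": {name: item[name] for name in key_names},
                "NewImage": item,
                "StreamViewType": "NEW_AND_OLD_IMAGES",
            },
        })
    return {"Records": records}


def invoke_indexer(lambda_client: Any, function_arn: str,
                   event: Dict[str, Any]) -> Any:
    """Synchronously invoke the indexing Lambda with a synthetic event.

    `lambda_client` is expected to be boto3.client("lambda") or a stand-in.
    """
    return lambda_client.invoke(
        FunctionName=function_arn,
        InvocationType="RequestResponse",
        Payload=json.dumps(event).encode("utf-8"),
    )
```

Sending items in small batches keeps each payload within Lambda's invocation size limit. Items with binary attributes would need extra handling before json.dumps.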

It would be even better if the CLI did this easily via something like:
amplify api reindex

  • Delete existing indexes
  • Re-push all @searchable fields via the above Lambda
  • Clean up