Airflow: Cloud Memorystore for Memcached operators

Created on 14 Apr 2020 · 8 Comments · Source: apache/airflow

Description

Hello,

Airflow has extensive support for many GCP services. However, it lacks integration with the Cloud Memorystore for Memcached service.

I think it would be very useful to have a set of operators similar to the ones for Cloud Memorystore for Redis.

Before starting work, I recommend reading the GCP Service Airflow Integration Guide.

The final work should contain:

  • How-to guide (Similar to https://airflow.readthedocs.io/en/latest/howto/operator/gcp/natural_language.html)
  • API Reference (Similar to https://airflow.readthedocs.io/en/latest/_api/airflow/providers/google/cloud/operators/natural_language/index.html#airflow.providers.google.cloud.operators.natural_language.CloudNaturalLanguageAnalyzeEntitiesOperator)
  • Example DAG: https://github.com/apache/airflow/tree/master/airflow/providers/google/cloud/example_dags
  • One operator
  • Hook
  • Unit tests
  • System tests (Similar to https://github.com/apache/airflow/blob/master/tests/providers/google/cloud/operators/test_natural_language_system.py)
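To make the split between hook, operator, and unit test a bit more concrete, here is a minimal sketch in plain Python. It deliberately does not import Airflow or the Google client library: `MemcachedHookSketch`, `CreateInstanceOperatorSketch`, the injected `client`, and its `create_instance` keyword arguments are all hypothetical stand-ins, not the real API. The real classes would extend Airflow's base hook and operator classes and follow the integration guide.

```python
from typing import Any, Dict
from unittest import mock


class MemcachedHookSketch:
    """Hypothetical hook: the only layer that talks to the service client."""

    def __init__(self, client: Any) -> None:
        # Assumption: the client is injected for simplicity. A real hook
        # would build it from an Airflow connection instead.
        self._client = client

    def create_instance(
        self, location: str, instance_id: str, instance: Dict[str, Any]
    ) -> Any:
        # Forward the request to the stand-in client and return its result.
        return self._client.create_instance(
            location=location, instance_id=instance_id, instance=instance
        )


class CreateInstanceOperatorSketch:
    """Hypothetical operator: stores task parameters and delegates to the hook."""

    def __init__(
        self,
        task_id: str,
        location: str,
        instance_id: str,
        instance: Dict[str, Any],
        hook: MemcachedHookSketch,
    ) -> None:
        self.task_id = task_id
        self.location = location
        self.instance_id = instance_id
        self.instance = instance
        self.hook = hook

    def execute(self, context: Dict[str, Any]) -> Any:
        # All service interaction goes through the hook.
        return self.hook.create_instance(
            self.location, self.instance_id, self.instance
        )


def test_operator_delegates_to_hook() -> None:
    # Unit test: the client is mocked, so no GCP project is needed.
    client = mock.MagicMock()
    client.create_instance.return_value = {"name": "demo"}
    operator = CreateInstanceOperatorSketch(
        task_id="create",
        location="europe-north1",
        instance_id="demo",
        instance={"node_count": 1},
        hook=MemcachedHookSketch(client),
    )
    assert operator.execute(context={}) == {"name": "demo"}
    client.create_instance.assert_called_once_with(
        location="europe-north1", instance_id="demo", instance={"node_count": 1}
    )
```

Keeping the client behind the hook is what lets the unit test run against a mock, while the system tests exercise the same operator against a real project.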

If you haven't used GCP yet, you will get $300 in credit after creating an account, which will allow you to get to know this service better.

Implementing this task will give you a better understanding of GCP services and teach you the testing and documentation practices that the community requires. If anyone is interested in this task, I am happy to provide all the necessary tips and information.

Use case / motivation

N/A

Related Issues

N/A

feature Google

All 8 comments

@mik-laj you can assign this to me 🙂

@mik-laj just a few questions and concerns before I dive deeply into this:

  1. The Memorystore Redis Hook and Operator have a 'generic' name (cloud_memorystore.py). This new one should probably be specific (cloud_memorystore_memcache.py). Is this okay?

    • Once this is complete and implemented, it might be worth going back to refactor the Redis one to be more specific (suffix with _redis.py)

  2. It seems like there is an opportunity to have a single hook/abstraction for both Memorystore options, but they have different Python client libraries (one for Redis and one for Memcache). The two libraries share many methods, but there are a few differences. That leads me to conclude they should have their own hooks (keep the existing one for Redis and make a new one for Memcache), at least until the libraries mature some more. At this time the Memcache feature is still in beta. What do you think?
  3. I have gone through the Google documentation and will use #5957 as a guide for the approach. I plan to do some small cleanups, such as making docstrings consistent, if the opportunity is there. Let me know if that is fine.

  1. I would prefer that you still have one file, but you can create multiple classes in one module.
  2. Yes. I think it's a good idea to create a new hook in this case.
  3. This is fine if you only change things related to the Memorystore service.
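
As a rough illustration of answers 1 and 2, a single module could hold one hook class per Memorystore flavour, each wrapping its own client. This is only a sketch under assumptions: the class names, the injected clients, and the `apply_parameters` method used as an example of a Memcached-only call are illustrative and do not claim to match the real client libraries.

```python
"""Sketch of one module (think cloud_memorystore.py) with two hook classes."""
from typing import Any, List


class RedisHookSketch:
    """Stand-in for the existing Redis hook."""

    def __init__(self, client: Any) -> None:
        # Hypothetical Redis admin client, injected for simplicity.
        self._client = client

    def get_instance(self, name: str) -> Any:
        return self._client.get_instance(name=name)


class MemcachedHookSketch:
    """Stand-in for the new Memcached hook, kept separate from the Redis one."""

    def __init__(self, client: Any) -> None:
        # Hypothetical Memcached admin client, injected for simplicity.
        self._client = client

    def get_instance(self, name: str) -> Any:
        # Same shape as the Redis hook, but backed by a different client.
        return self._client.get_instance(name=name)

    def apply_parameters(self, name: str, node_ids: List[str]) -> Any:
        # Illustrative example of a call with no Redis counterpart, which is
        # the kind of difference that argues against one shared hook.
        return self._client.apply_parameters(name=name, node_ids=node_ids)
```

With this layout, operators for either service import from the same module, and a method that exists for only one service does not leak into the other hook.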

@tanjinP Do you need help? If you don't have time, would you like to release the ticket so other people can work on it?

Sorry about the delay. I will begin working on this ticket over the weekend and will have at least a draft PR in a couple of days.

Hello @tanjinP, it's great that you're willing to work on this. I'm looking forward to using these operators.

Thanks again for the reminder - definitely in the works, draft PR coming very soon, I promise 😬

No worries at all ;-)
