Hello,
We've introduced mypy, but it's not very useful if we don't have type annotations. I think it's worth starting with the providers package, which is most often used by end-users. Currently, our annotation coverage is 47%. I will be happy when we have 75% coverage for each provider.
0.000000 dingding
0.000000 opsgenie
0.000000 presto
0.000000 qubole
0.000000 samba
0.000000 zendesk
16.666650 ssh
16.666667 oracle
16.666667 yandex
16.666700 odbc
21.428550 databricks
25.000000 exasol
25.000000 jdbc
25.000000 postgres
25.000000 singularity
25.000000 snowflake
27.380950 slack
27.964227 amazon
33.333333 sftp
46.153850 imap
50.733332 microsoft
66.761375 cncf
67.857150 salesforce
69.255826 google
71.250000 ftp
83.333300 openfaas
87.500000 redis
87.921875 apache
88.333340 mysql
88.888900 papermill
96.875000 mongo
100.000000 celery
100.000000 cloudant
100.000000 datadog
100.000000 discord
100.000000 docker
100.000000 elasticsearch
100.000000 facebook
100.000000 grpc
100.000000 hashicorp
100.000000 http
100.000000 jenkins
100.000000 jira
100.000000 pagerduty
100.000000 segment
100.000000 sendgrid
100.000000 sqlite
100.000000 vertica
You can also check the results using the command below.
MP_DIR=$(mktemp -d); mypy --linecount-report ${MP_DIR} airflow/; cat "${MP_DIR}/linecount.txt" | grep providers | grep -v example_dags | awk '
$4 != 0 {
split($5, a, ".");
print 100.00 * ($3/$4), a[3]
}'| awk '
{ sum[$2] += $1; N[$2]++ }
END {
for (key in sum) {
avg = sum[key] / N[key];
printf "%f %s\n", avg, key;
}
}' | sort -g
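For anyone who finds the awk pipeline hard to read, here is a rough Python sketch of the same aggregation. It assumes the column layout that the awk script implies (third field = annotated functions, fourth = total functions, fifth = module name). The function name `provider_coverage` is made up for illustration and is not part of mypy or Airflow.

```python
from collections import defaultdict


def provider_coverage(linecount_text):
    """Return (average coverage %, provider) pairs, lowest coverage first."""
    per_provider = defaultdict(list)
    for line in linecount_text.splitlines():
        fields = line.split()
        # Assumed layout, inferred from the awk script above: field 3 is the
        # number of annotated functions, field 4 the total number of
        # functions, field 5 the dotted module name.
        if len(fields) < 5:
            continue
        module = fields[4]
        # Same filters as `grep providers | grep -v example_dags`.
        if "providers" not in module or "example_dags" in module:
            continue
        annotated, total = int(fields[2]), int(fields[3])
        if total == 0:  # same guard as `$4 != 0`
            continue
        parts = module.split(".")
        if len(parts) < 3:
            continue
        # parts[2] matches awk's a[3]: airflow.providers.<provider>...
        per_provider[parts[2]].append(100.0 * annotated / total)
    # Average the per-module percentages for each provider, then sort.
    return sorted((sum(v) / len(v), name) for name, v in per_provider.items())
```

Passing it the contents of `linecount.txt` should give the same kind of list as the shell command, with one (percentage, provider) pair per provider.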
If you decide to work on this ticket, you don't have to do all the work yourself. A PR that covers only a single file or provider is fine.
If anyone is interested in this task, I am willing to provide all the necessary tips and information.
Are you wondering how to start contributing to this project? Start by reading our contributor guide.
Best regards,
Kamil Breguła
CC: @ephraimbuddy @OmairK
This looks interesting, can you assign me this task @mik-laj
@OmairK I assigned you to this ticket.
I also added a small note to the first post.
If you decide to work on this ticket, you don't have to do all the work yourself. A PR that covers only a single file or provider is fine.
I'm interested too. We would do it one PR per provider I guess?
Yeah this sounds like a good idea.
Hi @mik-laj - I would love to grab at least one of these providers. May just need a little bit of time 😅
@coopergillan I assigned you to this ticket. When you create a PR, add a link to this ticket so that others can see your work.
Hi @mik-laj , I would love to help with one of the providers.
@rafyzg I assigned you to the ticket. Which provider are you interested in?
I would like to help with this. I am new to the project, so I may need some guidance, if available.
I have no provider preference.
These are small providers, so they can be a good place to start.
If you need help, you can ask @ephraimbuddy or @OmairK, or post in the #newbie-questions channel on our Slack.
@mik-laj Thank you! I will try celery!
I will start with one file in the mysql provider.
I'd be interested in trying one or more of these too. I've been playing around with the Redis provider a bit this week so could start there if that works?
@scrambldchannel I assigned you to this ticket. :-D
I would like to help with this as well. I can start by working on the Discord provider.
@MagicTurtle2203 I assigned you to this ticket. :-D
@mik-laj I'd like to do another one. Is datadog available? If so, feel free to assign any others to me. :)
I think no one has started working on datadog, so why don't you go ahead and make a draft PR :smile_cat:
@scrambldchannel I assigned you to this ticket. :-D
Cool, thanks. I've created a branch and a commit in my fork but have been wrestling with setting up a proper dev env on this machine. I will open a PR once I'm back in front of my proper dev setup.
Hey @scrambldchannel, not sure if you know about the 'breeze' development environment for Airflow? It is easy to work with and can be set up in 5-10 minutes. I just updated the docs, and each chapter now has an accompanying screencast showing how it works: https://github.com/apache/airflow/blob/master/BREEZE.rst. I heartily recommend using it.
Thanks @potiuk, breeze is indeed cool. I was stuck on an older Windows machine and thought I'd cut my losses for now. I just wanted to show I was still interested!
@mik-laj I'd like to try the grpc provider
@chipmyersjr I assigned you to this ticket 😺
I would like to give IMAP a shot
@mik-laj I'd like to get the postgres provider
I can start off with vertica and amazon/athena
Does anyone know what is the best practice for type annotations in test code? Is it recommended to also have type annotations in pytest/unittest functions?
My personal take: it's definitely less of a concern than in prod code. In many cases it is too much of a burden to add and maintain typing information, especially since test code gets copied and pasted more than prod code (https://stackoverflow.com/questions/6453235/what-does-damp-not-dry-mean-when-talking-about-unit-tests).
However, when you have complex structures and want to make use of autocomplete and mypy type checking, feel free to add annotations in the tests as well.
Question:
If a method returns an object type that is not already imported in the file, what's the best approach?
1. botocore.paginate.PageIterator
2. from typing import TYPE_CHECKING
   if TYPE_CHECKING:
       from botocore.paginate import PageIterator
3. Any
Option 1 seems like a waste.
Option 2 works with mypy but not pylint:
Error: E0601: Using variable 'PageIterator' before assignment (used-before-assignment)
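One common variation on the `TYPE_CHECKING` approach is to write the annotation as a string (a forward reference), so the name is never evaluated at runtime. Below is a minimal sketch of that idea. The function `list_pages` and its parameters are hypothetical, and I have not checked whether quoting the annotation silences the pylint E0601 report on the pylint version used in this repository.

```python
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    # Seen only by type checkers; never executed at runtime, so this adds
    # no import cost.
    from botocore.paginate import PageIterator


def list_pages(client: Any, bucket: str) -> "PageIterator":
    """Hypothetical helper: return a page iterator for a bucket listing."""
    # The quoted return annotation stays a plain string at runtime, so the
    # name PageIterator is never looked up when the module is imported.
    paginator = client.get_paginator("list_objects_v2")
    return paginator.paginate(Bucket=bucket)
```

mypy still resolves `"PageIterator"` to the botocore class, while at runtime the annotation is stored as the string `"PageIterator"`.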
@mik-laj 👋🏾 is there an update on what files remain?
In the first message, I included a command that lets you check where type annotations are still missing.
More info: https://mypy.readthedocs.io/en/stable/command_line.html?highlight=Report#report-generation
When I am working on typing coverage, I often use the HTML report.
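As a rough sketch, the report can also be produced from a small Python wrapper. The helper names below are made up for illustration. The `--html-report` and `--linecount-report` flags are the mypy report options described on the documentation page linked above, and I believe the HTML report needs `lxml` installed.

```python
import subprocess


def mypy_report_command(target, report_dir, kind="html"):
    """Build the mypy command line for a report, e.g. kind='html' or 'linecount'."""
    return ["mypy", f"--{kind}-report", report_dir, target]


def run_mypy_report(target, report_dir, kind="html"):
    """Run mypy and return its exit code; the report is written to report_dir."""
    # check=False: mypy exits non-zero when it finds type errors, which is
    # expected while annotations are still being added.
    return subprocess.run(mypy_report_command(target, report_dir, kind), check=False).returncode
```

For example, `run_mypy_report("airflow/providers/ftp", "mypy-html")` should write the report under `mypy-html/`, where the paths are only placeholders.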
@mik-laj what is the status of this issue?
@turbaszek We still have many providers with coverage below 75 percent.
0.000000 dingding
0.000000 opsgenie
0.000000 presto
0.000000 qubole
0.000000 samba
0.000000 zendesk
16.666650 ssh
16.666667 oracle
16.666667 yandex
16.666700 odbc
21.428550 databricks
25.000000 exasol
25.000000 jdbc
25.000000 postgres
25.000000 singularity
25.000000 snowflake
27.380950 slack
27.964227 amazon
33.333333 sftp
46.153850 imap
50.733332 microsoft
66.761375 cncf
67.857150 salesforce
69.255826 google
71.250000 ftp
Hi @mik-laj, can you assign me to this task? I will try to deal with one provider (postgres) for now :smile:
@SalAlba I assigned you to this ticket 🐈 I am waiting for your contribution and more photos of cats.
I'm giving a shot at Zendesk now.
@mik-laj what is the status of this issue?
We haven't had any big changes since the last update.
So, I will try my skills with jdbc.
Current status of type coverage for providers below 75%:
0.000000 dingding
0.000000 opsgenie
0.000000 presto
0.000000 qubole
0.000000 samba
0.000000 zendesk
16.666650 ssh
16.666667 oracle
16.666667 yandex
41.074865 amazon
52.078754 microsoft
69.100860 google
Only the big ones are left (Microsoft, AWS and Google). If someone wants to contribute to this ticket, it is worth hurrying.
There are still 2 providers left, but we are making progress. Anyone wanna help? You don't have to work on the entire provider. Annotating just selected files and modules helps too.
45.860268 amazon
69.098890 google
75.000000 postgres
77.619050 databricks
80.184239 microsoft
80.625000 ftp
82.449500 cncf
83.333300 odbc
83.333300 openfaas
87.421875 apache
87.500000 presto
87.500000 redis
88.333340 mysql
88.888900 papermill
90.178575 slack
90.909100 elasticsearch
94.532960 qubole
96.428575 salesforce
96.875000 mongo
98.076900 imap
100.000000 celery
100.000000 cloudant
100.000000 datadog
100.000000 dingding
100.000000 discord
100.000000 docker
100.000000 exasol
100.000000 facebook
100.000000 grpc
100.000000 hashicorp
100.000000 http
100.000000 jdbc
100.000000 jenkins
100.000000 jira
100.000000 opsgenie
100.000000 oracle
100.000000 pagerduty
100.000000 plexus
100.000000 samba
100.000000 segment
100.000000 sendgrid
100.000000 sftp
100.000000 singularity
100.000000 snowflake
100.000000 sqlite
100.000000 ssh
100.000000 vertica
100.000000 yandex
100.000000 zendesk
@mik-laj I will try amazon AWS SageMaker.
@Swalloow I assigned you to this ticket.
@mik-laj I added type hints to the remaining modules of the amazon provider in #11531. Type coverage of amazon is now 77.7%. But I'm still waiting for a review of #11434.
I wonder whether I should merge these PRs into one and close the old one. Could you advise me?
The assumptions from the first post have been met - each provider has coverage of at least 75%. Therefore, I consider the ticket completed. We can do more type annotations, but I believe that their absence is no longer a defect.
Thanks to everyone who supported this ticket with both contributions and reviews.
75.000000 postgres
77.619050 databricks
77.701421 amazon
77.777800 presto
80.625000 ftp
82.638900 cncf
83.155193 google
83.333300 odbc
83.333300 openfaas
87.421875 apache
87.500000 redis
88.333340 mysql
88.888900 papermill
90.178575 slack
90.909100 elasticsearch
94.532960 qubole
96.428575 salesforce
96.875000 mongo
96.914682 microsoft
98.076900 imap
100.000000 celery
100.000000 cloudant
100.000000 datadog
100.000000 dingding
100.000000 discord
100.000000 docker
100.000000 exasol
100.000000 facebook
100.000000 grpc
100.000000 hashicorp
100.000000 http
100.000000 jdbc
100.000000 jenkins
100.000000 jira
100.000000 opsgenie
100.000000 oracle
100.000000 pagerduty
100.000000 plexus
100.000000 samba
100.000000 segment
100.000000 sendgrid
100.000000 sftp
100.000000 singularity
100.000000 snowflake
100.000000 sqlite
100.000000 ssh
100.000000 vertica
100.000000 yandex
100.000000 zendesk