One of the things we're struggling with is keeping track of how many people are using Kedro, what features they're using and how they're using it. These are important metrics that I think we'd benefit from in order to help focus development and make the feedback cycle more efficient and data-driven.
One solution would be to implement rudimentary telemetry in Kedro. It would, of course, be strictly opt-in and limited to anonymous, aggregate statistics (as opposed to individual or potentially identifying ones). Other tools that do this include Homebrew and DVC.
I thought I'd open this as a discussion to see how users feel about this and gather any ideas for the best way to do this.
If telemetry is introduced, I will definitely disable it as a Kedro user.
To gather more meaningful and useful feedback from users, I would suggest setting up a user community forum.
TensorFlow's forums are a great example:
https://www.tensorflow.org/community/forums
In addition, a web page (or a README.md in the GitHub repository) where Kedro users can show off their use cases would be nice too.
Once it is set up, I will post my use cases.
(e.g. https://github.com/Minyus/pipelinex, https://github.com/Minyus/kaggle_nfl)
Could you give examples of which statistics you would ideally need?
> Could you give examples of which statistics you would ideally need?
I don't think we have any concrete ideas yet, and the statistics we'd collect would likely change, but to start with, we'd like a concrete grasp of the number of Kedro users, the uptake of new features we release, and the like.
> If telemetry is introduced, I will definitely disable it as a Kedro user.
Could you expand a little on why you'd disable it? (it's perfectly valid, and I disable telemetry wherever I can too, but it'd be good to understand your reasons)
> To gather more meaningful and useful feedback from users, I would suggest setting up a user community forum.
I think this is an orthogonal issue. A user community forum will attract different kinds of users and provide a different kind of feedback mechanism, meaningful in its own way. It's a good medium for gathering qualitative feedback and engaging with the super-committed users who care enough to post on a community forum, less so for accurate, quantitative data from everyday users.
I'd almost certainly opt out. Bringing Kedro to projects is always a negotiation, and I wouldn't want to make that harder by having to explain that we're sending telemetry data back and working through concerns there.
For personal projects, I would likely not mind sending usage statistics. For work-related projects, I would likely opt out to stay on the safe side.
Thank you all for chiming in! This issue can now be closed.