In the last couple of months, `transformers` has seen an exponential increase in interest; you have exceeded 20k stars, congrats! @thomwolf wrote a blog post on how to open-source your code for a larger audience, but as expected, a side-effect is that you'll get more issues and more pull requests that need to be monitored. Not too long ago there were only 300 open issues, and now we're at 375. On top of that, many issues are closed by the stale bot and not even _actually_ solved, which is unfortunate.
I am no expert in the finer details of transformers and their implementation, but I often make do. When I have a free moment, I go over issues and see where I can help. Things can get frustrating, though, when general questions about PyTorch or TensorFlow are asked, or when people have a question and don't fill in the template, or ask one-sentence questions. It makes me lose interest and enthusiasm to help out.
Not all of this can be solved, but perhaps it can be of use to direct a stream of questions to Stack Overflow. A few weeks ago I created the tag `huggingface-transformers`, intended for users who have a question about their specific use-case whilst using the transformers library. Considering that it seems hard for you as a company to keep track of all issues (which, again, is understandable), I would propose to direct the "Questions & Help" of the issue template to Stack Overflow. In other words, keep GitHub for feature requests, bug reports, and benchmarks and models, but nothing else. That way, it is easier to keep an overview of _real issues_ without them piling up and getting closed by stalebot, and on top of that you get a huge (free!) support team which is the open source community that is active on Stack Overflow.
It is just an idea, of course, but I think it could help out in the logistics of things.
PS: the issue template also still refers to 'Pytorch Transformers' instead of 'Transformers'.
PPS: I am aware that I also still ask questions and that I am no expert in transformers by far, so I really don't intend to place this issue from atop my high horse. But due to the increased interest and the resulting increase in issues and questions, it seems a good idea to direct future general questions to a more open platform.
Hi Bram, first of all we want to reiterate our appreciation for what you've been doing – the community is very lucky to have you.
You raise some good points. Would you like to update the issue templates, updating what needs to be updated + linking to Stack Overflow for support requests?
In the longer term, we've floated a number of different ideas:
Thoughts?
Thanks @julien-c for the nice words. It's not much, but I help where and when I can.
I think that the decision of how to support the community best depends on how much time, effort, and resources you (as a company) can put into it. I don't mean the platform, but the people that dedicate time to provide support. I can imagine that this is not lucrative because you don't really get anything in return, so it is not an easy decision. It is an important one, though, because as you can see: when I posted this not even two weeks ago there were 375 open issues, now there are 404.
Three examples come to mind of types of support that I came into contact with:
Summary (but still quite long): if you plan to extend the resources that are going to issue support, I think the Discourse forum is the best option. I wouldn't really bother with Discord. Opening up Slack is nice, but then it should be very clear what it should be used for. I wouldn't allow general questions to be asked there, but rather the more one-on-one questions concerning "I have a new model and tokenizer that I wish to add to transformers", i.e. the questions that you can discuss with words where you don't necessarily need to write whole blocks of code.
If you decide that spending more resources on support is not in your plan, then I would just move all general questions to Stack Overflow. I know it's "the easy" option, but I think it's the most viable one. All general questions in one place, tagged with the correct tag, and a whole community that can help out for general PyTorch/TensorFlow questions. On top of that, it's free advertisement, too, because your library will pop up here and there and will get noticed by others. Something you won't have on your private forum.
tl;dr
If you will put resources towards more support
If you won't put resources towards more support
Just my two cents, of course.
Reopened to trigger more discussion
These are very good points, thanks a lot for sharing and summarizing your thoughts @BramVanroy
Regarding the issue template: currently the following "categories" can be used (when opening a new issue):
I think it would be a good idea to automatically add labels for these categories! At the moment I can't really filter out bugs or general questions.
I agree that adding automatic labels would definitely make life easier when looking for specific issues. The templates are in a good place thanks to @BramVanroy; automatic labeling should be the next step.
we might want to look into code owners as a building block for this
Code owners might also seem like a good idea with respect to storing the README.md files of user models in `model_cards/`, as you suggested yesterday. So that everyone can edit their own model card when need be. That being said, that might give more overhead (in the CODEOWNERS file) with not much benefit (reviewing changes to model cards shouldn't take a long time).
I propose the following automatic labels:
- `benchmark`
- `model-addition`
- `bug-report` (after review by a member, and verifying that it actually is a bug, the label should then be changed to `bug` or another relevant label)
- `feature`
- `migration`
- `general`
If agreed, I can do a PR again. Discussion welcome, of course.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I think @BramVanroy did most of this so closing this issue. Thanks Bram! 🤗
I would +1 opening a Discord server. It's pretty great for creating a general point to congregate and categorising multiple subject-channels. I have lots of smaller questions about this project that I don't feel are appropriate for SO or a GitHub issue.
The problem is that with this kind of format there are billions of questions but barely any answers. spaCy's gitter is such an example. I guess something like that could be set up but without the guarantee of any response.
It's of course anecdotal, but I'm a member of many framework-related Discords, and they're typically the most responsive places, compared to IRC, Gitter, Reddit, etc. In my (again anecdotal) experience, Gitter and GitHub are the most barren places for any conversation. I suggest we just do it and see how it goes; it's only one click to make a Discord.
@julien-c What do you think? Should we open a discord (without guarantee)?
Still dying for this :D
Found this thread while googling to see if the HuggingFace community had a Discord. Was it ever created? I feel like it would be a really nice place for people to discuss NLP stuff more freely and share their findings :)
@andantillon Nope, but we do have a forum!