Hey @pcmill how is it going?
I wonder, what is your method to gather/create sentences?
I mostly (roughly) translate sentences from English. I made a list of categories and kind of filled them:
My sentences actually got merged today, but I think I can add some more that reference locations in Belgium, since Dutch is spoken there too.
@pcmill another big merge of Dutch sentences was also done today, again in collaboration with Sjoerd. I included many locations there as well, but the more the better.
Just saw that the limit has gone up from 500 sentences to 5000. Therefore we need an additional 1500 for the launch. @pcmill @sroet, interested in setting up a different collaboration channel?
Sure, sounds like a good idea.
@danielsjf Where did you see the 5k limit?
If you go to the website, go to Languages, and click on the Progress tab, you will see it. Under Dutch, it says 3600/5000.
@danielsjf Ok, I see, thanks. Russian is at 0 at the moment; I'm working on a fork that hasn't been merged yet.
But it's good to know what volume is required. That wasn't apparent to me before.
@danielsjf @pcmill Sorry, I will not be able to contribute constructively anytime soon (I'm currently in the middle of moving countries). I should be able to proofread, if required.
No problem @sroet, I will try my best to come up with some more sentences. You already made most of them anyway.
We are live on the website with Dutch, so I can close this issue :smile:
Finally! Thanks for the great work guys!
Awesome stuff y'all!
Submitted the news on Tweakers and now we are already at 2h30 recorded and 130 speakers ;-)
Dutch is one of our fastest growing datasets right now! Great work everyone!