CNTK: The documentation would be much better if it were more concise and consistent, especially for novices

Created on 9 Apr 2017 · 7 Comments · Source: microsoft/CNTK

For example

From https://github.com/Microsoft/CNTK/blob/v2.0.rc1/Tutorials/CNTK_101_LogisticRegression.ipynb:

The aforementioned model parameter update using a single observation at a time is attractive since it does not require the entire data set (all observation) to be loaded in memory and also requires gradient computation over fewer datapoints, thus allowing for training on large data sets. However, the updates generated using a single observation sample at a time can vary wildly between iterations. An intermediate ground is to load a small set of observations and use an average of the loss or error from that set to update the model parameters. This subset is called a minibatch.

This passage uses different words for the same concepts, for example:

  1. observation, sample, datapoints, data set
  2. loss, error
  3. iteration, update

As a learner of machine learning, I don't think the brain likes this. Why not make it easy? For example:

The previous model parameter update used a single sample at a time. This is good because it doesn't require all the samples to be loaded in memory, but the loss can vary wildly between updates. The proper solution is to use a minibatch of samples in each update and to update the model parameters with the average of the loss over that minibatch.
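To make the point about "varying wildly" concrete, here is a minimal sketch in plain Python. It is not CNTK code and not taken from the tutorial: the toy data, the one-parameter logistic model, and all function names are assumptions made up for illustration. It measures how much the gradient estimate spreads out when it is computed from a single sample versus averaged over a minibatch of 32 samples.

```python
import math
import random
import statistics


def make_samples(n, seed=0):
    """Build a toy data set of (x, y) samples; hypothetical data, for illustration only."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        # The label is 1 with probability sigmoid(2 * x), otherwise 0.
        p = 1.0 / (1.0 + math.exp(-2.0 * x))
        y = 1 if rng.random() < p else 0
        samples.append((x, y))
    return samples


def sample_gradient(w, x, y):
    """Gradient of the cross-entropy loss for one sample, with model sigmoid(w * x)."""
    prediction = 1.0 / (1.0 + math.exp(-w * x))
    return (prediction - y) * x


def minibatch_gradient(w, minibatch):
    """Average of the per-sample gradients over one minibatch."""
    return sum(sample_gradient(w, x, y) for x, y in minibatch) / len(minibatch)


def gradient_spread(samples, w, minibatch_size, trials=2000, seed=1):
    """Standard deviation of the gradient estimate across randomly drawn minibatches."""
    rng = random.Random(seed)
    gradients = [
        minibatch_gradient(w, rng.sample(samples, minibatch_size))
        for _ in range(trials)
    ]
    return statistics.stdev(gradients)


samples = make_samples(1000)
spread_single = gradient_spread(samples, w=0.0, minibatch_size=1)
spread_minibatch = gradient_spread(samples, w=0.0, minibatch_size=32)
print("spread with 1 sample per update:  ", spread_single)
print("spread with 32 samples per update:", spread_minibatch)
```

With this setup the minibatch spread should come out clearly smaller than the single-sample spread, which is the whole argument for minibatches in one number. Note that the sketch uses one word per concept throughout (sample, loss, update, minibatch), which is the kind of consistency I am asking for.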

I know that writing excellent documentation is not easy. The MSDN and FreeBSD documentation are good examples.

All 7 comments

I don't work for Microsoft so I won't speak on their behalf, but I can say that some level of knowledge in neural networks is required to properly use CNTK. I highly doubt developers are going to attempt to simplify concepts when most of the documentation is used by developers like themselves.

Documentation is currently CNTK's weakest point. Doc updates lag and/or are not logged at all. Maybe consider opening the wiki to updates from the community?

I think it's better for the developers to be responsible for the documentation, so that having good developers is enough. Otherwise, good documentation writers are needed as well.

@playgithub your argument makes no sense

There are multiple good points here:
(a) Documentation is not consistent.
(b) Documentation is lagging.
(c) Documentation should be open to the public.

For (a) we really need an editor and we currently don't have one. So you will need to ping us if something looks inconsistent.

For (b) we are trying to address this with new material such as the notebooks under the /Manual folder.

For (c) feel free to submit Pull Requests to improve the notebooks. Before our GitHub wiki moved to docs.microsoft.com, it was also possible to improve those docs by submitting PRs against our wiki repository. Now you can just send your feedback as issues here. For example, I just incorporated the suggestion from #1724 into our docs.

In the spirit of open source, we would appreciate it if you could issue a Pull Request. I agree with your comment, and anything we do to make the text easier to read benefits everyone who uses the material. Looking forward to your contribution.

We look forward to your pull requests for specific improvements. In the meantime, we will do our best to improve our documentation, keeping your feedback in mind. Closing this issue for now.
