I installed CNTK for Windows v1.1 (GPU) and tried to run the tutorial from https://github.com/Microsoft/CNTK/wiki/Tutorial.
The data was generated with the provided Python script, but there seems to be something wrong with the format:
CNTKCommandTrainBegin: Train
LockDevice: Locked GPU 0 to test availability.
LockDevice: Unlocked GPU 0 after testing.
LockDevice: Locked GPU 0 for exclusive use.
attempt: Unexpected character('.') while reading a sequence id at the offset = 1
, retrying 2-th time out of 5...
attempt: success after 2 retries
Creating virgin network.
Post-processing network...
I added "skipSequenceIds = true" to the reader section, and that error seems to disappear.
However, another error remains:
Post-processing network complete.
Created model with 9 nodes on GPU 0.
Training criterion node(s):
(none)
EXCEPTION occurred: TrainOrAdaptModel: No criterion node was specified.
[CALL STACK]
> Microsoft::MSR::CNTK::MatrixBase:: operator=
- Microsoft::MSR::CNTK::MatrixBase:: operator=
- Microsoft::MSR::CNTK::MatrixBase:: operator=
- Microsoft::MSR::CNTK::MatrixBase:: operator=
- Microsoft::MSR::CNTK::MatrixBase:: operator=
- Microsoft::MSR::CNTK::MatrixBase:: operator=
- Microsoft::MSR::CNTK::MatrixBase:: operator=
- BaseThreadInitThunk
- RtlUserThreadStart
I checked the Train section, and the criterionNodes is defined just as in the Tutorial:
criterionNodes = (lr)
I don't know what is wrong with my .cntk file. Does anyone have a suggestion?
Sorry, my fault: I updated the configs to take the data in the new format but overlooked the Python script that generates the datasets. The script is fixed now, so please give it another shot.
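For readers hitting the same reader error: it would be consistent with data lines that start with a bare number, since the CNTK text format allows an optional sequence id at the start of a line, followed by `|name value ...` fields. Below is a minimal Python sketch of writing lines in that shape. The function name `to_ctf_line` and the sample values are made up for illustration, and the input names `features` and `labels` are taken from the validation log in this thread, not from the tutorial's actual generator script.

```python
def to_ctf_line(features, label):
    """Format one sample as a CNTK-text-format line with named inputs."""
    # Input names "features" and "labels" are assumptions based on the log.
    feature_text = " ".join(f"{value:g}" for value in features)
    return f"|features {feature_text} |labels {label:g}"


samples = [((1.5, -0.25), 1), ((0.0, 2.0), 0)]  # made-up data
lines = [to_ctf_line(x, y) for x, y in samples]
print("\n".join(lines))
```

Each line starts with `|`, so the reader has no leading token to mistake for a sequence id.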
I'm having the same problem. I thought I was missing something, so I downloaded the provided script unchanged, and I still get the same thing. The only modification is a rootFolder variable I added; everything else is the same.
This is the one I used:
https://raw.githubusercontent.com/wiki/Microsoft/CNTK/Tutorial/3Classes_bs.cntk
Along with the two data sets:
https://raw.githubusercontent.com/wiki/Microsoft/CNTK/Tutorial/Train-3Classes_cntk_text.txt
https://raw.githubusercontent.com/wiki/Microsoft/CNTK/Tutorial/Test-3Classes_cntk_text.txt
Using 1.1 if that matters.
Edit:
This script:
https://raw.githubusercontent.com/wiki/Microsoft/CNTK/Tutorial/lr_bs.cntk
... and its corresponding data sets do not work for me either, with the same error and the same rootFolder variable.
I also generated fresh data sets with the Python script and used the "sparse" and alias variables; still nothing.
Error that is popping up:
> CNTKCommandTrainBegin: Train
>
> Creating virgin network.
>
> Post-processing network...
>
> 2 roots:
> err = SquareError()
> lr = Logistic()
>
> Validating network. 9 nodes to process in pass 1.
>
> Validating --> labels = InputValue() : -> [1 x *]
> Validating --> w = LearnableParameter() : -> [1 x 2]
> Validating --> features = InputValue() : -> [2 x *]
> Validating --> p.z.PlusArgs[0] = Times (w, features) : [1 x 2], [2 x *] -> [1 x *]
> Validating --> b = LearnableParameter() : -> [1 x 1]
> Validating --> p.z = Plus (p.z.PlusArgs[0], b) : [1 x *], [1 x 1] -> [1 x 1 x *]
>
> Validating --> p = Sigmoid (p.z) : [1 x 1 x *] -> [1 x 1 x *]
> Validating --> err = SquareError (labels, p) : [1 x *], [1 x 1 x *] -> [1]
> Validating --> lr = Logistic (labels, p) : [1 x *], [1 x 1 x *] -> [1]
>
> Validating network. 5 nodes to process in pass 2.
>
>
> Validating network, final pass.
>
>
>
> 4 out of 9 nodes do not share the minibatch layout with the input data.
>
> Post-processing network complete.
>
> Created model with 9 nodes on CPU.
>
> Training criterion node(s):
> (none)
>
> EXCEPTION occurred: TrainOrAdaptModel: No criterion node was specified.
>
> [CALL STACK]
> > Microsoft::MSR::CNTK::MatrixBase:: operator=
> - Microsoft::MSR::CNTK::MatrixBase:: operator=
> - Microsoft::MSR::CNTK::MatrixBase:: operator=
> - Microsoft::MSR::CNTK::MatrixBase:: operator=
> - Microsoft::MSR::CNTK::MatrixBase:: operator=
> - Microsoft::MSR::CNTK::MatrixBase:: operator=
> - Microsoft::MSR::CNTK::MatrixBase:: operator=
> - BaseThreadInitThunk
> - RtlUserThreadStart
Thank you @raaaar . The data format is correct now, but the error remains. Does the version I use matter?
Post-processing network complete.
Created model with 9 nodes on GPU 0.
Training criterion node(s):
(none)
EXCEPTION occurred: TrainOrAdaptModel: No criterion node was specified.
[CALL STACK]
> Microsoft::MSR::CNTK::MatrixBase:: operator=
- Microsoft::MSR::CNTK::MatrixBase:: operator=
- Microsoft::MSR::CNTK::MatrixBase:: operator=
- Microsoft::MSR::CNTK::MatrixBase:: operator=
- Microsoft::MSR::CNTK::MatrixBase:: operator=
- Microsoft::MSR::CNTK::MatrixBase:: operator=
- Microsoft::MSR::CNTK::MatrixBase:: operator=
- BaseThreadInitThunk
- RtlUserThreadStart
It seems that after the NDL -> BrainScript transition, the tutorial no longer works with the 1.1 release. A new release should be published in a few days; meanwhile, please try building from source.
How can I solve this problem? I get it even though I defined criterionNodes as described in tutorial 1:
Training criterion node(s):
(none)
EXCEPTION occurred: TrainOrAdaptModel: No criterion node was specified.
Could you let me know which code version this is? You will need the 1.5 release, or a very recent master source code. Older versions would require the BS to be modified by adding explicit tags to all outputs:
p = Sigmoid (w * features + b, tag="output")
lr = Logistic (labels, p, tag="criterion")
err = SquareError (labels, p, tag="evaluation")
If the latest master or the 1.5 binary do not work for this, please let me know (it would mean that some versions got confused somewhere).
Thanks. As you explained, the 1.5 release works fine.
Yep, I just ran it on 1.5 as well. As output I get:
50 numbered dnn files
an unnumbered dnn file
a ckp file
and an _AllNodes_.txt file.
One warning; I don't know if it's normal, but it didn't seem to affect anything:
Warning: node name 'AllNodes' does not exist in the network. dumping all nodes instead.
I was on the 1.5 GPU 1-bit SGD version, using 0 as the deviceID.
This looks about right. Do you get the expected objective value and error rate in the log (the values are given in the tutorial)?
The warning text is a little misleading. I will put this on the list to fix.
I wasn't able to find the exact text in the log, but I did find the final minibatch results, which were different from but similar to the ones in the example:
lr = 0.04158641 * 1000; err = 0.01101350 * 1000
I posted the log in pastebin here:
http://pastebin.com/m5pHcrEA
The weights though look good as far as I can tell. I posted them as well here:
http://pastebin.com/i24x2FKH
It may also be worth mentioning that the log showed I was not running 1-bit SGD. I was unaware that I had to enable it manually even though I downloaded the 1-bit SGD binary. It probably said so somewhere and I just didn't read it.
Regardless, it seems OK based on what I know about the system and neural networks in general, though I haven't actually tested any weights in a program yet. I'm mostly just learning and getting the gist of CNTK. I'll try out the multiple classification as well and make sure it works properly.
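When comparing a run against the numbers in the tutorial, it can help to pull the criterion values out of the log programmatically. The minibatch line quoted above packs each criterion as `name = value * count`. The following is a rough Python sketch for extracting those pairs; the helper name `parse_criteria` is made up for illustration, and the pattern assumes only the layout seen in that one quoted line, not every CNTK log format.

```python
import re

# Matches "name = value * count" pairs such as "lr = 0.04158641 * 1000".
PAIR = re.compile(r"(\w+)\s*=\s*([-+0-9.eE]+)\s*\*\s*(\d+)")


def parse_criteria(line):
    """Return {name: (value, count)} for one log line."""
    return {name: (float(value), int(count))
            for name, value, count in PAIR.findall(line)}


result = parse_criteria("lr = 0.04158641 * 1000; err = 0.01101350 * 1000")
print(result)
```

The resulting dictionary can then be compared against the tutorial's published values with whatever tolerance seems reasonable.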
Thanks. 1-bit SGD affects the accuracy a little, so it must be enabled explicitly. It also requires running the job under MPI.
I noticed you removed the Test command. You can run that and see if your evaluation error is the same as written in the Tutorial.
Let us know if you have further questions. Would you be OK with closing this one, and opening a new Issue with further questions you may have?
Yeah, this issue was someone else's, and technically it's closed; I just had a similar problem. But yes, and thank you. By the way, the multiple classification works as well. Thanks.