Notepad3 uses Google's "Compact Encoding Detector" (CED) to guide the final encoding decision on file load.
Some people think that Mozilla's (u)chardet would produce better results, and they would like to switch to that encoding detector.
To feed this discussion with data, I created an experimental Notepad3 version to test the differences between the two encoding detectors:
Please test development beta version _5.19.301.1628_XpErImEnTaL from the Xperimental sub-dir.
(For the beta channel, see issue #160) or download it from my Google Drive.
This version shows the original results of both detectors in the title bar.

The result of the new UCHARDET detector is used to calculate the final encoding (depending on some other settings: fallback and reliability).
To get a final result close to the UCHARDET detector's, please use the following default encoding settings:

Don't forget to disable the file history (the encoding of loaded files is persisted there).

Hello @RaiKoHoff,
Attached is a test with RC files "(be_BY)" 😃

All settings are as you requested.
Do you want me to perform other tests?
I would have liked it to detect CP DOS-850.
This is my test:

Hi @RaiKoHoff,
**Notepad3 (64-bit) v5.19.108.1602** (out-of-the-box): the encoding is changed from UTF-8 (no signature) to OEM (CP-850) and the file is saved as try.bat.
**Notepad3 (64-bit) v5.19.301.1628 XpErImEnTaL** 😬
Attachment: Try_bat.zip
Exactly the same as I said.
Obviously too few input characters for both encoding detectors 😞
The easy way out for these cases is "file encoding tags":

Ensure you didn't activate the "Don't use file encoding tags" option.
Yeah!
For me this works fine:
REM encoding: IBM850
Thanks a lot!
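For illustration, a file encoding tag such as the `REM encoding: IBM850` line above could be picked up roughly like this. This is only a Python sketch, not Notepad3's actual parser: the function name `find_encoding_tag`, the accepted `encoding:` / `coding:` spelling and the first-line-only default are assumptions made for the example.

```python
import codecs
import re

# Assumed tag syntax: "encoding: NAME" or "coding: NAME" anywhere in the line.
TAG_RE = re.compile(rb"(?:en)?coding[:=]\s*([-\w.]+)", re.IGNORECASE)


def find_encoding_tag(data: bytes, max_lines: int = 1):
    """Return the normalized codec name of an encoding tag, or None.

    Only the first `max_lines` lines are inspected (assumption: the tag
    has to sit at the very top of the file).
    """
    for line in data.splitlines()[:max_lines]:
        match = TAG_RE.search(line)
        if match is None:
            continue
        name = match.group(1).decode("ascii", errors="ignore")
        try:
            # codecs.lookup() resolves aliases, e.g. "IBM850" -> "cp850".
            return codecs.lookup(name).name
        except LookupError:
            return None  # tag present, but names an unknown encoding
    return None


print(find_encoding_tag(b"REM encoding: IBM850\r\n@echo off\r\n"))  # cp850
```

A tag found this way would overrule any detector result, which is why it helps for files that are too short for statistical detection.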
Some time ago I suggested using tellenc first, then CED. :smile:
Now I have another idea: back to tellenc.
But in another way: add a dialog that shows tellenc's statistics for the current file.
And save and load these statistics in simple configuration files that users can correct themselves.
@data-man: Yes, tellenc is very lean in terms of source code added to Notepad3 (only one file).
Unfortunately, tellenc's "_one language-special-char occurrence frequency analysis_" is a little bit too simple for this purpose.
On the other hand, using two or three detectors (CED and UCHARDET, and maybe tellenc) and choosing the best of two/three worlds would be too much, since the overlap between their detections will be quite large ...
Nice idea to show a selection dialog in case of ambiguity 😃
Development beta version _5.19.304.1630_XpErImEnTaL uses both detector results in the final decision.
Ed.: This version also enhances the command-line encoding selection (using all encoding possibilities).
(For the beta channel, see issue #160) or download it from my Google Drive.
New test with "BAD Detection" of af-AF RC files.

Regarding the last comment https://github.com/rizonesoft/Notepad3/issues/973#issuecomment-469394975:
CED has no confidence level, only reliable=true/false. To compare both reliability levels, I mapped a reliable CED result to a confidence of 75%, so the CED detection won (52% < 75%).
Maybe 75% is too much, will reduce that to ???.
Found a small piece of UCHARDET documentation that is worth knowing:
Uchardet uses language data, and therefore rather than supporting a
charset, we in fact support a couple (language, charset). So for
instance if uchardet supports (French, ISO-8859-15), it should be able
to recognize French text encoded in ISO-8859-15, but may fail at
detecting ISO-8859-15 for non-supported languages.
This is why, though less flexible, it also makes uchardet much more
accurate than other detection system, as well as making it an efficient
language recognition system.
_Since many single-byte charsets actually share the same layout (or very_
_similar ones), it is actually impossible to have an accurate single-byte_
_encoding detector for random text._
Therefore you need to describe the language and the codepoint layouts of
every charset you want to add support for.
The Notepad3 implementation uses UCHARDET code taken from:
https://github.com/PyYoshi/uchardet#supported-languagesencodings.
Unrelated, but probably not worthy of a separate issue...
Between 1633x and 1634x, did something change with regards to the HighDPI toolbar selection code?
Out of the box, on a 1920 x 1080 display:

Hello @RaiKoHoff,
I confirm @craigo-'s issue above: the HighDPI toolbar is no longer selected by default 😉
Same for v5.19.305.1636_XpErImEnTaL.
Tested with: v5.19.305.1636 XpErImEnTaL

With "Training for Afrikaans" in "UCHARDET", the result of RC "af-ZA" detection is correct. 😄
My opinion:
@hpwamr, @craigo-: You are right: the rework of loading the external toolbar bitmap introduced this little problem; it should be fixed with _5.19.305.1637_XpErImEnTaL.
The training capabilities of UCHARDET are limited; it is not a machine-learning artificial intelligence :wink:
Please test version _5.19.307.1647_XpErImEnTaL.
I think, we should keep both detectors, to get the best of both worlds:
New [Settings2] options:
DevDebugMode=1 # Encoding Detector information in Titlebar (maybe later used for other output)
AnalyzeReliableConfidenceLevel=51 # Confidence/Reliability level for reliability switch in encoding dialog
ReliableCEDConfidenceMapping=66 # if CED has reliable result, set this value for Confidence
UnReliableCEDConfidenceMapping=20 # if CED has not a reliable result, set this value for Confidence
With ReliableCEDConfidenceMapping set to 100 (%), reliable results from CED will win over UCHARDET results.
With UnReliableCEDConfidenceMapping set to a value greater than UCHARDET's confidence level,
unreliable CED results will win over UCHARDET's result.
With UnReliableCEDConfidenceMapping set to a value greater than AnalyzeReliableConfidenceLevel,
unreliable CED results may be used, even with the reliability switch ON.
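To make the interplay of these three settings easier to follow, here is a rough Python sketch of how the balancing could work. It is not the Notepad3 source: the names `Settings`, `Decision` and `choose_encoding` are invented for the example, and two details are assumptions, namely that CED wins a tie and that "reliable" means confidence greater than or equal to AnalyzeReliableConfidenceLevel.

```python
from dataclasses import dataclass
from typing import NamedTuple, Optional


@dataclass
class Settings:
    # Defaults mirror the [Settings2] values listed above.
    analyze_reliable_confidence_level: int = 51
    reliable_ced_confidence_mapping: int = 66
    unreliable_ced_confidence_mapping: int = 20


class Decision(NamedTuple):
    encoding: str
    confidence: int
    reliable: bool


def choose_encoding(ced_encoding: str, ced_reliable: bool,
                    uchardet_encoding: str, uchardet_confidence: int,
                    settings: Optional[Settings] = None) -> Decision:
    settings = settings or Settings()
    # CED only reports reliable=True/False, so map that flag to a confidence.
    ced_confidence = (settings.reliable_ced_confidence_mapping if ced_reliable
                      else settings.unreliable_ced_confidence_mapping)
    # Assumption: UCHARDET must be strictly higher to beat CED.
    if uchardet_confidence > ced_confidence:
        encoding, confidence = uchardet_encoding, uchardet_confidence
    else:
        encoding, confidence = ced_encoding, ced_confidence
    # Assumption: ">=" decides whether the reliability switch accepts it.
    reliable = confidence >= settings.analyze_reliable_confidence_level
    return Decision(encoding, confidence, reliable)


# The 75% experiment mentioned earlier in the thread: UCHARDET at 52% loses.
print(choose_encoding("windows-1252", True, "ISO-8859-1", 52,
                      Settings(reliable_ced_confidence_mapping=75)))
```

With the defaults (66/20), the same UCHARDET result of 52% would beat an unreliable CED result but lose to a reliable one.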
I like the UCHARDET analysis and philosophy, but YES, I agree with you, sometimes both detection methods are needed to improve the end result... 😉
:+1: Version _5.19.307.1648_XpErImEnTaL starts the detectors in parallel for large files, so the detection speed is optimized. Unfortunately, the binaries grow in size because of the <future> usage.
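The idea of running both detectors side by side can be sketched as follows. Notepad3 itself is C++; this Python version only shows the structure. The two `stub_*` functions are placeholders, not the real CED or UCHARDET, and the 1 MiB threshold for "large files" is an assumption, since the actual limit is not stated here.

```python
from concurrent.futures import ThreadPoolExecutor


def stub_ced(data: bytes):
    """Placeholder for CED: returns (encoding, reliable)."""
    try:
        data.decode("utf-8")
        return ("UTF-8", True)
    except UnicodeDecodeError:
        return ("windows-1252", False)


def stub_uchardet(data: bytes):
    """Placeholder for UCHARDET: returns (encoding, confidence in percent)."""
    try:
        data.decode("utf-8")
        return ("UTF-8", 99)
    except UnicodeDecodeError:
        return ("ISO-8859-1", 50)


def run_detectors(data: bytes, detectors: dict,
                  parallel_threshold: int = 1024 * 1024) -> dict:
    """Run every detector on `data`; use threads only for large inputs."""
    if len(data) < parallel_threshold or len(detectors) < 2:
        # Small input: thread start-up would cost more than it saves.
        return {name: fn(data) for name, fn in detectors.items()}
    with ThreadPoolExecutor(max_workers=len(detectors)) as pool:
        futures = {name: pool.submit(fn, data)
                   for name, fn in detectors.items()}
        # result() blocks until the corresponding detector has finished.
        return {name: future.result() for name, future in futures.items()}


detectors = {"CED": stub_ced, "UCHARDET": stub_uchardet}
print(run_detectors("héllo".encode("utf-8"), detectors))
```

Both paths return the same dictionary, so the code that balances the two results does not need to know whether the detectors ran in parallel.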
What is happening:

Notepad3 (64-bit) v5.19.309.1652 XpErImEnTaL
I've unchecked the option: 'Don't parse encoding file tags'
Version v5.19.309.1652 and above work fine for me.
Please clear the file history; maybe a wrong encoding is stored there?
There is nothing to clean: I just downloaded the version and tried it (the history was empty)
😮
5.19.309.1654 is available.
From another test that I have done, it seems that the '_Encoding file tag_' must be placed in the 1st line (not in the 2nd).
Is that really mandatory?
In my opinion, encoding tags are useful because they are easy to parse.
@lenny20 , @jczanfona : please see my comment for encoding tags : https://github.com/rizonesoft/Notepad3/issues/964#issuecomment-471358716
As we want to keep both encoding detectors, we need reasonable default values for:
AnalyzeReliableConfidenceLevel=51 # Confidence/Reliability level for reliability switch in encoding dialog
(currently set to 51%)
and
ReliableCEDConfidenceMapping=66 # if CED has reliable result, set this value for Confidence
UnReliableCEDConfidenceMapping=20 # if CED has not a reliable result, set this value for Confidence
The last two parameters are for balancing results between UCHARDET and CED ...
To give CED more weight in decision making, I would like to raise the (initial default) confidence level of a "reliable" CED result to 85% (ReliableCEDConfidenceMapping=85).
(This means: if CED's result is "reliable", it gets a confidence level of 85. UCHARDET's confidence level must be higher to beat CED.)
What do you think?
(Yes, statistics would be a better basis for this decision; currently this value is only a gut feeling.)
Hello @RaiKoHoff,
Tested with all my MUI resource files with the new settings #1093
It seems OK to me.
Hello @lhmouse and @lenny20
It would be very interesting to have feedback from the Chinese testers' side 😉
Latest beta of today: Notepad3Portable_5.19.328.1666_develop.paf.exe.7z
@hpwamr : Did you switch on "Don't parse encoding file tags"? Otherwise there is no encoding detection, the file tags overrule the detection ;-)
Oops, I just forgot...
I will renew my tests and make a statistical table.
@RaiKoHoff Could you please check the file "UCHARDET_CET_Detection_Rate.xlsx" in "Exchange"?
Let me know if you want more details.
I have encountered no encoding detection problems so far.
With beta version v5.19.503.1689, the CED encoding detector has been removed from the binaries.
Some detection analysis shows that there is little or no additional benefit from using CED concurrently with UCHARDET.
(So the "debug" display, as shown above, changed accordingly.)
With the change from CED to UCHARDET, I did not encounter any detection issue.
As far as I am concerned, this issue may be closed....