The word list for wallet recovery keys needs to have all inappropriate, emotionally-charged words removed. Filtering should err on the side of caution.
Not easily reproduced, since the words shown are different with each invocation. Bottom line: sometimes, inappropriate words are shown, due to their inclusion in the dictionary from which the words are selected. (E.g., a user reported seeing "pedophilia" in the set of words generated; this was not well-received.)
Actual result:
Infrequently, inappropriate/charged words will appear in wallet recovery key word lists.
Expected result:
No inappropriate, emotionally-charged terms should appear in Brave-generated word lists.
Reproduces how often:
Rarely, but "never" is a requirement.
about:brave info:
0.19.139
Reproducible on current live release:
Yes.
here's the word list: https://raw.githubusercontent.com/diracdeltas/niceware/master/lib/wordlist.js. i need folks to help look through it for inappropriate words and list them in this issue.
once that is done i will replace each of the inappropriate words with a friendly word that isn't already in the dictionary. then i will add a migration method in niceware to deal with existing users of Brave Sync/Payments so that their old words will still work for recovery.
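A rough sketch of how the migration described above might look, assuming each replacement word takes over the list position of the word it retires. All names here (`wordlist`, `retiredWords`, `wordToIndex`, `phraseToIndexes`) are hypothetical and the four-word list is a stand-in; this is not niceware's actual API.

```javascript
// Hypothetical stand-in for the updated word list.
// A word's position in the array is the value it encodes.
const wordlist = ['apple', 'breeze', 'candle', 'daisy'];

// Retired word -> friendly replacement that now sits at the same index.
// Replacements must not already exist in the list, so the mapping stays unambiguous.
const retiredWords = new Map([
  ['murder', 'breeze'],
  ['vomit', 'daisy']
]);

// Translate a retired word to its replacement, then look up its index.
function wordToIndex(word) {
  const normalized = word.trim().toLowerCase();
  const current = retiredWords.get(normalized) || normalized;
  const index = wordlist.indexOf(current);
  if (index === -1) {
    throw new Error('Unknown recovery word: ' + word);
  }
  return index;
}

// Decode a whole phrase; old and new phrases yield the same indexes.
function phraseToIndexes(phrase) {
  return phrase.trim().split(/\s+/).map(wordToIndex);
}
```

With this shape, a phrase written down before the word swap and the same phrase shown after it decode to identical values, so existing Sync/Payments users would not need to re-record anything.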
Sharing some words I found which some folks may classify as inappropriate
(will continue to update list)
'anal',
'circumcise',
'circumcised',
'circumcising',
'coon',
'defecate',
'defecation',
'decapitate',
'decapitation',
'dehumanization',
'dehumanize',
'dehumanized',
'dehumanizing',
'enslave',
'enslaved',
'enslavement',
'enslaver',
'fecal',
'genitalia',
'gook',
'gooky',
'heterophile',
'heterosexual',
'heterosexuality',
'homosexual',
'homosexuality',
'incest',
'incestuously',
'intravaginal',
'invagination',
'jap',
'jew',
'jigaboo',
'midget',
'murder',
'murderee',
'murderer',
'murdering',
'murderously',
'necrophile',
'necrophilia',
'necrophilic',
'necrophilism',
'pedophile',
'pedophilia',
'pedophiliac',
'pedophilic',
'penile',
'prepubescence',
'prepubescent',
'prostitute',
'prostituted',
'prostituting',
'prostitution',
'pubic',
'rape',
'raped',
'scat',
'scatologic',
'scatological',
'scatology',
'sodom',
'sodomite',
'slave',
'slaver',
'slaverer',
'slavering',
'slavery',
'subhuman',
'torture',
'torturer',
'transsexual',
'transsexualism',
'turd',
'urinal',
'urinary',
'urinate',
'urination',
'urine',
'urinogenital',
'vagina',
'vaginae',
'vaginal',
'vaginate',
'vomit',
'vomited',
'vomiter',
'vomiting',
'vomitive',
'vomitory',
'wetback',
'wop',
I'm no expert on this at all, but the wordlist from a bitcoin wiki (https://en.bitcoin.it/wiki/Mnemonic_phrase#Word_Lists) seems a little safer than the niceware one, e.g. https://github.com/bitcoin/bips/blob/master/bip-0039/english.txt - is there a reason not to start with this one?
@petemill that list is only 2^11 words long whereas mine is 2^16. so if we used the BTC list the recovery codes would be 24 words instead of 16.
having said that, the words on the BTC list are generally easier to write so maybe it's not a big deal.
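The 16 vs. 24 word counts fall out of the list sizes: a list of 2^16 words carries 16 bits per word, a list of 2^11 words carries 11. A small sketch of that arithmetic, assuming a 256-bit secret (the function names are made up for illustration):

```javascript
// Whole bits encoded by one word drawn from a list of the given size.
function bitsPerWord(listSize) {
  let bits = 0;
  while (2 ** (bits + 1) <= listSize) {
    bits++;
  }
  return bits;
}

// Minimum number of words needed to cover targetBits of secret data.
function wordsNeeded(listSize, targetBits) {
  return Math.ceil(targetBits / bitsPerWord(listSize));
}

console.log(wordsNeeded(2 ** 16, 256)); // 16 words at 16 bits each
console.log(wordsNeeded(2 ** 11, 256)); // 24 words at 11 bits each
```

24 words at 11 bits each is 264 bits; in bip39 the extra 8 bits are a checksum over the 256 bits of entropy.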
anecdotally my backup of 16 niceware words and 24 bip39 words took about the same amount of physical space. it's also nice that the bip39 words are unique after the first four characters
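The four-character uniqueness property is easy to check against any candidate list. A minimal sketch, with a hypothetical helper name and a handful of sample words standing in for the full list:

```javascript
// Returns true if no two words share the same first `prefixLength` characters.
function hasUniquePrefixes(words, prefixLength) {
  const seen = new Set();
  for (const word of words) {
    const prefix = word.slice(0, prefixLength);
    if (seen.has(prefix)) {
      return false;
    }
    seen.add(prefix);
  }
  return true;
}

// Sample words only; run it on the full list to verify the real thing.
const sample = ['abandon', 'ability', 'able', 'about'];
console.log(hasUniquePrefixes(sample, 4)); // true
console.log(hasUniquePrefixes(sample, 2)); // false, every word starts with "ab"
```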
the other interesting note wrt bip39 is that the key derivation is actually wordlist agnostic. instead of mapping from words via index lookup to reconstruct the seed, the phrase is used as input to the key derivation function. interesting from an internationalization pov
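To make the "wordlist agnostic" point concrete: bip39 derives the seed by running the phrase itself through PBKDF2-HMAC-SHA512 (2048 iterations, salt `"mnemonic"` plus an optional passphrase, 64-byte output), so no index lookup is involved. A minimal sketch using Node's built-in `crypto` module; the function name is mine, and checksum validation of the phrase is deliberately left out:

```javascript
const nodeCrypto = require('crypto');

// Derive a 64-byte seed from a mnemonic phrase, bip39-style.
// The words are never mapped back to indexes; the phrase text is the KDF input.
function mnemonicToSeed(mnemonic, passphrase = '') {
  const password = Buffer.from(mnemonic.normalize('NFKD'), 'utf8');
  const salt = Buffer.from(('mnemonic' + passphrase).normalize('NFKD'), 'utf8');
  return nodeCrypto.pbkdf2Sync(password, salt, 2048, 64, 'sha512');
}

const seed = mnemonicToSeed('any words in any language work here');
console.log(seed.length); // 64
```

Because any string is valid input to the derivation, a translated word list yields a usable seed without the recovery code needing to know which list was used.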
The BTC list does cap words at a pretty short maximum length, which makes them easier to write down and easier to deal with in the UI. Not a reason on its own to use it, but a positive IMO.
A few more to tack onto the clifton list:
'zion',
'zionism',
'zionist',
'yankee',
'xenophobe',
'xenophobia',
'xenophobic',
'wristy',
'wristdrop',
'woodcock',
'womenfolk',
'womanize',
'womanized',
'womanizer',
'womanizing',
'wino',
'willy',
'wiener',
'wienie',
'welch',
'welched',
'welcher',
'welching',
'weiner',
'weasel',
'weaseled',
'weaseling',
'weaselly',
'wasp',
'waspier',
'waspily',
'waspish',
'waspishly',
'waspy',
'waif',
'waifing',
'vulva',
'vulvae',
'vulval',
'vulvar',
'vulvate',
'vixen',
'vixenish',
'vixenishly',
'vixenly',
'testicle',
'testicular',
'stripper',
'stripping',
'stript',
'striptease',
'stripteased',
'stripteaser',
'stripteasing',
'sperm',
'spermary',
'spermatic',
'spermatocidal',
'spermatocide',
'spermatozoa',
'spermatozoan',
'spermatozoon',
'spermic',
'spermicidal',
'spermicide',
'spew',
'spewed',
'spewer',
'spewing',
'slave',
'slaved',
'slaver',
'slaverer',
'slavering',
'slavery',
'slavey',
I think the EFF already did all this work for us? https://www.eff.org/deeplinks/2016/07/new-wordlists-random-passphrases
I think the QR code already did all of this work for us. ; )
We decided in Slack that new sync/wallet users will get a bip39 wordlist instead of a niceware wordlist. The old niceware words for existing users will still work, however.
So basically the sync/wallet recovery code will do:
if (recoveryWords.length === 16) {
  // recover using niceware
} else if (recoveryWords.length === 24) {
  // recover using bip39
}
To generate code words for new wallets / sync chains, just use bip39.
closing in favor of https://github.com/brave/browser-laptop/issues/13313