Keepassxc: Add ability to search by field with logic keywords

Created on 28 Jan 2017 · 17 Comments · Source: keepassxreboot/keepassxc


Search currently covers all default fields and groups, with no way to narrow it down to a specific field. Logic keywords are not supported either.

This feature would allow named fields to be used in the search dialog to narrow down the search in large databases. It would also introduce the logical keywords AND and OR. Examples:

EPIC discussion new feature

All 17 comments

That would be awesome. Also "notes" would be good

Since this ticket is far more complex and has been moved to several later milestones already, maybe there's a way to add a simple option for now:

Title is usually the main defining field of an entry, so how about an option to search in the title only (next to "case sensitive" and "limit search to selected group")?

The original idea is amazing, but I'm afraid it won't happen for a long time because of its complexity.

It's not too complex: once you put in the ability to specify a field, it is a minor effort to also add conditionals. Although, given that the conditional is AND by default, is OR really that useful? This could be accomplished with a setting of "Search for ANY keyword".
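As a rough illustration of the "Search for ANY keyword" setting, here is a minimal Python sketch. The function name `matches_keywords` and the `match_any` flag are hypothetical and are not taken from the KeePassXC code base; the sketch only shows that switching between AND and OR can be a single toggle.

```python
def matches_keywords(entry_text, keywords, match_any=False):
    """Return True if the entry text contains all keywords (default)
    or at least one keyword (match_any=True). Case-insensitive."""
    haystack = entry_text.casefold()
    combine = any if match_any else all  # the hypothetical GUI setting
    return combine(keyword.casefold() in haystack for keyword in keywords)


print(matches_keywords("Gmail john", ["john", "alice"]))                  # AND
print(matches_keywords("Gmail john", ["john", "alice"], match_any=True))  # OR
```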

The logic can be simplified using implicit AND/OR.
Specifying two or more values for the same field means OR for that field; specifying several different fields means AND across those fields.

user:john -> search for entries that have john in username
user:john url:google -> search for entries that have john in username AND also have google in url
user:john user:alice -> search for entries that have john OR alice as username
user:john user:alice url:google -> search for entries that have john OR alice as username AND also have google in url
title:"Gmail" -> search for entries that have Gmail as exact Title

I think this syntax covers a good part of all the meaningful searches.
An OR between two different fields can be done with two separate searches. The NOT operator isn't very valuable, but I think we could use something like user;john (semicolon) or user:!john (colon + exclamation mark).
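The implicit AND/OR rule above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it handles plain field:value tokens (no quoting, no NOT), matching is a case-insensitive substring test, and the entry dictionaries and function names are made up for the example.

```python
from collections import defaultdict


def parse_query(query):
    """Group 'field:value' tokens by field name (everything lowercased)."""
    terms = defaultdict(list)
    for token in query.split():
        field, _, value = token.partition(":")
        terms[field.lower()].append(value.lower())
    return terms


def matches(entry, terms):
    """AND across different fields, OR across the values of one field."""
    return all(
        any(value in entry.get(field, "").lower() for value in values)
        for field, values in terms.items()
    )


# Hypothetical sample data, not a real database.
entries = [
    {"title": "Gmail", "user": "john", "url": "https://mail.google.com"},
    {"title": "Forum", "user": "alice", "url": "https://forum.example.org"},
    {"title": "Drive", "user": "alice", "url": "https://drive.google.com"},
]


def search(query):
    terms = parse_query(query)
    return [entry["title"] for entry in entries if matches(entry, terms)]


print(search("user:john user:alice"))             # ['Gmail', 'Forum', 'Drive']
print(search("user:john user:alice url:google"))  # ['Gmail', 'Drive']
```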

I like @TheZ3ro's approach.

I think one of the most important parts is that by default, the search should be case insensitive and a substring search.

While title:gmail would return all entries whose titles contain gmail (case insensitive), title:"gmail" should only return an entry with an exact match.
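A small Python sketch of that distinction, assuming the quoted form still compares case-insensitively (the thread has not settled whether an exact match should also be case-sensitive). The function name `term_matches` is hypothetical.

```python
def term_matches(field_value, term):
    """Bare term: case-insensitive substring match.
    Quoted term: the whole field must equal the term (still case-insensitive here)."""
    haystack = field_value.casefold()
    if len(term) >= 2 and term.startswith('"') and term.endswith('"'):
        return haystack == term[1:-1].casefold()
    return term.casefold() in haystack


print(term_matches("Gmail (work)", "gmail"))    # True: substring match
print(term_matches("Gmail (work)", '"gmail"'))  # False: not the whole title
print(term_matches("Gmail", '"gmail"'))         # True: exact match
```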

Attributes should be searchable too, e.g. attr:foobar; this would also fix #1290.
I have some entries with a Token attribute, and templates with an _etm_type_PIN attribute, that hold protected information. Currently it is impossible to search for entries by those attributes.

Seems doable to me :+1:

I would propose a slightly different syntax:

  • Space implies an implicit AND match
  • Builtin fields such as Title, Username, Password, URL are available under some aliases: I would propose to match the Username field with any of these: user:…, username:…, localized term:… (as used in the UI, e.g. benutzername:…). Similar for others.
  • user:john - match part of field
  • !user: - field does not exist
  • user: - field exists
  • user:+ - field exists and is set
  • user:=john - match field exactly
  • user:= - match empty field
  • "john doe" - match john doe anywhere (vs. john doe, which would match john combined with doe anywhere). Any whitespace within "…" should probably be matched by any amount of whitespace (like a regular expression \s+).
  • \" - search for "
  • "account name":john - search for field name containing spaces
  • expression1 | expression2 - means "or". Separated by whitespace. "OR" or "or" aren't too bad either, but aren't language-agnostic and it may be more unexpected if one has to escape either "OR" or "or" (sorry). Lower associativity than implicit "AND".
  • (...) - grouping expressions ((foo:bar | foo:baz)), but might be extended to terms: foo:(bar | baz).
  • ! - Unary not operator. Separated to the left by whitespace or other operators: !user::John. Should have high precedence (higher than the implicit AND).
  • :+i - separated by white space or operators. Set flag for case insensitive matching for this group
  • :-i - case sensitive matching, e.g. :-i user::John (:+i title:foo) surname:Doe (match user and surname case sensitive, but title case insensitive; similar to title:foo :+i user:John surname:Doe)
  • :+p - search protected fields, too. Flags could be combined: :+pi, :-ip, :+p-i or separately written as :+p :-i.
  • @:+ - has attachments
  • !@: - has no attachments
  • @:foo - has attachment with name matching foo (similar for operator :=)
  • /:admin, /:=ssh - match partial/full group path

What I don't like too much is that user:John Doe does not do what might be expected (one of user:"John Doe" or user:John user:Doe would). But interpreting user:John Doe as user:"John Doe" would make parsing harder (interpreting a token would require the context that follows it) and would leave cases such as user:John [email protected] ambiguous.

Invalid expressions could be searched for as ANDed search terms in all fields with or without warning (not sure what's preferable).

Not a very formal spec so far, but that would mostly follow from implementing the parser (and then the documentation). Extensions for additional operators and flags would be easily doable: title:~foo.*bar for regular expressions (needs some escaping considerations at least), but this should already cover most related issues in the tracker.
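To make the tokenizing part of this proposal more concrete, here is a minimal Python sketch. It covers only a subset: bare terms, field:value, the := / :+ / empty-value forms, quoted strings and the \" escape. The !, |, (...) and flag syntax is left out, and the operator names ("contains", "equals", "exists", "is_set") are labels invented for this example.

```python
import re

# One token: a run of quoted strings, backslash escapes and other non-space characters.
TOKEN = re.compile(r'(?:"(?:\\.|[^"\\])*"|\\.|[^\s"\\])+')
# field:value, where the field is either quoted or free of colons, quotes and spaces.
TERM = re.compile(r'("(?:\\.|[^"\\])*"|[^\s":]+):(.*)', re.S)


def unquote(text):
    """Strip one pair of surrounding quotes and resolve backslash escapes."""
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        text = text[1:-1]
    return re.sub(r"\\(.)", r"\1", text)


def tokenize(query):
    """Return (field, operator, value) triples; field is None for a bare term."""
    result = []
    for raw in TOKEN.findall(query):
        match = TERM.fullmatch(raw)
        if not match:
            result.append((None, "contains", unquote(raw)))
            continue
        field, value = unquote(match.group(1)), match.group(2)
        if value == "":
            op = "exists"                      # user:
        elif value == "+":
            op, value = "is_set", ""           # user:+
        elif value.startswith("="):
            op, value = "equals", value[1:]    # user:=john and user:=
        else:
            op = "contains"                    # user:john
        result.append((field, op, unquote(value)))
    return result


print(tokenize('"account name":john user:=john "john doe"'))
print(tokenize("user:John Doe"))  # two tokens: a field term and a bare term
```

The last call shows the caveat mentioned above: without quotes, Doe becomes a separate bare term rather than part of the user value.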

I would offer to implement that feature in this or a similar form later this month if syntax and features can be agreed upon.

I really like your proposal, but it seems too complicated for the Average Joe who just wants to search for his password.
For example another user pointed out Thunderbird's QuickSearch that has an awesome UX implementing the type of filters we are looking for.

IMHO much of the specification is "useless" (Keep It Simple). I don't think I've ever searched for an entry and wanted to filter out entries that don't have a username set, for example, or needed to state case insensitivity explicitly. I would go for case-insensitive by default and "..." for an exact, case-sensitive match.

Anyway there are parts of your specification that I like, namely the protected field search, the in-attachment search (search for an attachment name) and the field localization.

About field localization: this is a very tricky one. Translators can translate "user" into various different strings, and the end user must match the translated string, otherwise the query won't work.
So we should always support English terms as well as localized terms. For example, if my language is set to Italian, I should be able to use both "user" and "utente".
But I would appreciate a different approach that is less error-prone for the end user. (See Thunderbird's QuickSearch above; they do this easily.)

Every contribution is appreciated!

It seems I missed that there are even more open issues regarding search than I had checked. Most features of such a text-driven search interface are of course opt-in if some care is taken with respect to the syntax. It is of course way harder to map these features to a more user-friendly UI. QuickSearch is a great idea, I will try to participate in that discussion. And hopefully contribute some code, too.

I took @luzat's list and changed it to fit what I would do. Simplified it quite a bit. Many things stayed the same, some inconsistencies were resolved and some details were changed. When I talk about GUI selections, I'm referring to my proposal from https://github.com/keepassxreboot/keepassxc/issues/1317. (Is there any way to link to a specific post instead of a whole issue?)

  • Space implies an implicit AND match.
  • Specific fields can be searched by fieldname:value. The field name is case-insensitive. The value's case-sensitivity depends on the option selected in the GUI. The value is searched for with a contains and not with a matches. That is, it works just like the search works right now.
  • Tokens that do not contain a : are searched for in all fields selected in the GUI.
  • all: is available to search in all fields, even if not selected in the GUI.
  • @: is available to search attachment names.
  • Built-in fields such as Title, Username, Password, URL are available under some aliases: e.g. user: and username: work.
  • Built-in fields are available under their localized name such as benutzername: or utente:
  • English field names are always available.
  • Translations can provide their own aliases. (Maybe not immediately needed?)
  • - is the unary excludes operator. user:-john excludes all results where the string "john" is part of the user field. Higher precedence than AND and OR. This operator is already used by Google and tbh, even with a programming background I wouldn't think of ! as my first try for exclusion.
  • "val ue" disables treatment of the space as an implicit AND and includes the space in the search value. Works on field names, too (do field names even support whitespaces internally?). The search is still case-insensitive. Since users expect this to mean "exact search", they probably want to switch from contains mode to matches mode, too.
  • Any amount of whitespace, whether surrounded by "..." or not, is treated as a single space.
  • (...) is used for grouping, e.g. (foo:bar | foo:baz), or even foo:(bar | baz).
  • expression1 | expression2 - means "or". Does not need to be separated by whitespace. Lower precedence than AND.
  • OR is available as an alias to |. Uppercase only, and needs to be separated by whitespace. This is already established by Google in all language versions, so there are probably many people who expect this to work: https://support.google.com/websearch/answer/2466433?hl=en
  • For consistency with OR an explicit AND is also available.
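The precedence rules in this list (exclusion binds tightest, then the implicit or explicit AND, then OR) map naturally onto a small recursive-descent parser. The Python sketch below is only an illustration: it ignores quoting, aliases and GUI options, does not support term-level grouping such as foo:(bar | baz), and the tuple-based tree and the function names are invented for the example.

```python
import re

LEX = re.compile(r"[()|]|[^\s()|]+")  # parentheses, "|", or a run of other characters


def parse(query):
    """Build nested tuples: ('or', a, b), ('and', a, b), ('not', a), ('term', field, value)."""
    tokens = LEX.findall(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of query")
        pos += 1
        return tokens[pos - 1]

    def parse_or():  # lowest precedence: "|" or uppercase OR
        node = parse_and()
        while peek() in ("|", "OR"):
            take()
            node = ("or", node, parse_and())
        return node

    def parse_and():  # a space is an implicit AND; explicit AND is optional
        node = parse_term()
        while peek() not in (None, ")", "|", "OR"):
            if peek() == "AND":
                take()
            node = ("and", node, parse_term())
        return node

    def parse_term():  # grouping, exclusion, or a single term
        token = take()
        if token == "(":
            node = parse_or()
            if take() != ")":
                raise ValueError("expected ')'")
            return node
        field, sep, value = token.partition(":")
        if not sep:
            field, value = None, token  # bare term: search every field
        if value.startswith("-"):
            return ("not", ("term", field, value[1:]))
        return ("term", field, value)

    return parse_or()


def evaluate(node, entry):
    """Evaluate a parsed query against an entry (dict of field name to text)."""
    kind = node[0]
    if kind == "or":
        return evaluate(node[1], entry) or evaluate(node[2], entry)
    if kind == "and":
        return evaluate(node[1], entry) and evaluate(node[2], entry)
    if kind == "not":
        return not evaluate(node[1], entry)
    _, field, value = node
    texts = entry.values() if field is None else [entry.get(field.lower(), "")]
    return any(value.lower() in text.lower() for text in texts)


entry = {"title": "Gmail", "user": "john", "url": "https://mail.google.com"}
print(parse("user:john url:google | user:alice"))
print(evaluate(parse("(user:alice OR user:john) -yahoo"), entry))
```

Because parse_and is called from parse_or, user:john url:google | user:alice groups as (user:john AND url:google) OR user:alice, which matches the "lower precedence than AND" rule for |.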

@TheZ3ro found the protected field search a good idea. Maybe this can be lumped into all:?

Edit: Just fixed some typos.

Like I said above, localized fields can be painful. I think we can put them aside for the moment (and maybe include them later).

I agree with most of the points, except:

  • Remove the all: keyword. The default behavior (with no GUI filter applied) already covers all fields, so it's redundant.
  • Remove grouping. We can use foo:bar foo:baz as an implicit OR instead of (foo:bar | foo:baz), so we don't need grouping and queries stay simpler but still meaningful.

Also note: custom field names can contain one or more meaningful spaces, so queries like

"foo bar":"john"
"foo   bar":"john"

are both valid queries, but they will return different results.

  1. ORing: foo:bar foo:baz should be OR'd? How'd you express AND? AND is probably the most popular operator. Most implementations seem to either provide a selector "match all/any" or parens if they support AND and OR. OR is represented as OR or , (Google/eBay). Both are preferable to a | (as is - to !). Given that this will for any syntax require some sort of parser, it's not really much work to add support for parens, at least for whole expressions. Implementation of ORing string literals (as in foo:(bar OR baz)) could be overkill.

  2. Localized fields: The localized field can be implemented by just mapping the hard-coded English long and short term (title/user/pass or even t/u/p would be more popular than username/password) and the tr(...)anslation to those fields - don't see complications there? Only problem I see: They may match custom attributes of that name, but if this really were a problem for somebody one could interpret field names within " as always being attributes (u:foo matches the Username field, "u":foo matches only the custom attribute u). Probably overkill, though.

  3. Whitespace: For matching whitespace I would suggest to always match /\s+/ with /\s+/, spoken in regex (just like most search engines would do?). Doubt that many will ever notice any differences when matching whitespace between those options, so it's not that important.

  4. Finally, just do it like Google: Everything is very close to Google's syntax already. Would there be any real-world use case which isn't covered by Google's simple syntax and matching (possibly minus * wildcard matches)? It's easily extendable by additional operators if there is a need. It boils down to just:

  • Match any amount of whitespace to any amount of whitespace
  • Use " to concatenate words such that they have to appear in that order
  • Use AND by default
  • Support -, OR, (), field: (including some special field names)
  • Use prefix + for exact match (that is, field must match completely) (Google also uses + for a slightly different sort of exact matching, but close enough)
  • has:field, empty:field, sshagent:true, @:/attachment(s):, ingroup(s):/belowgroup(s): etc. (need not all be known or implemented directly)
  • Allow the "field with whitespace":value syntax, possibly having "field":value always refer to attributes, not the built-in fields (see 2. above).

The implementation isn't hard, the syntax is understood better than any other and it solves just about every real-world query need that I could think of. For edge cases, one should mostly follow Google's syntax and semantics. It'll be hard to get better usability except for very esoteric use cases.
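The whitespace rule (match any amount of whitespace to any amount of whitespace) and the quoted-phrase rule can be sketched together in Python. This is an assumption-laden illustration, not the actual implementation: the helper names are hypothetical, and matching is case-insensitive unless requested otherwise.

```python
import re


def phrase_pattern(phrase, case_sensitive=False):
    """Compile a phrase so that every whitespace run matches any whitespace run (\\s+)."""
    words = phrase.split()
    flags = 0 if case_sensitive else re.IGNORECASE
    # re.escape keeps characters such as "." literal instead of regex wildcards.
    return re.compile(r"\s+".join(re.escape(word) for word in words), flags)


def contains_phrase(text, phrase):
    """True if the words of the phrase appear in the text in that order, separated by whitespace."""
    return phrase_pattern(phrase).search(text) is not None


print(contains_phrase("John   Doe (admin)", "john doe"))  # True
print(contains_phrase("Doe, John", "john doe"))           # False: wrong order
```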

I've looked into Google's search operators for Gmail and I think that's the way to go and it's already well documented and widely used.

I think the list above is mostly fine. I would remove the + and use " for exact match: since " on a custom field always means exact match, we get consistent behavior.
Adding support for ORing with {}

The problem with localized strings:
Let's assume we make tr("user") available; a translator can translate it to some string str = tr("user").
Now an end user enters a query like tr("username"):"bob" and the results are wrong. They will blame us and we can't help them: we can't document every "standard" field for every language, and hinting at them in the UI would be painful and overkill.
Google uses only English keywords by default, and I think that's the way to go to avoid trouble.

Also, in autotype we use only English keywords for standard fields, and we can use the same keywords here
(maybe also implementing the 1-letter variants from the field references).

I think the specification is almost decided.

Agreed with the comment on leaving out + and making " imply that the field must match completely. Still case-insensitive unless enabled in the GUI though, right?

Also, I'd replace sshagent:true by is:sshagent. This would make it just like Google's syntax again and is closer to normal human language.

About using localised strings: Do we agree that it would be cool to have? For me, it's certainly not a priority but if someone comes up with a clean implementation it should be accepted. The translated aliases don't even need to be documented as far as I am concerned. They certainly shouldn't be hinted at in the UI. It would just be useful because people will try them.
I don't quite see why the results would be wrong in your example, @TheZ3ro. It was not my suggestion to make the aliases translatable. That probably doesn't make sense since reasonable abbreviations might differ from English. It might be outside the capabilities of the translation framework you use, but the idea was to give translators the ability to add aliases independently of English.
Let's say the field name is username and the search aliases name and user are provided. The user runs KeePass in German. The translated name of the field would be benutzername, but the translator decided that benutzer, name and nutzer are reasonable aliases.
Now, the user queries like nutzer:john. The search would then check if nutzer is in the union of English name, English aliases, German name, and German aliases. That is (username, name, user, benutzername, benutzer, nutzer).
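The alias lookup described here amounts to a set union per field. A minimal Python sketch, using the names from the example above; the table layout and the function name `resolve_field` are hypothetical, and how a translation framework would actually supply the localized aliases is left open.

```python
# English name and aliases are always available (hypothetical table).
ENGLISH_ALIASES = {
    "username": {"username", "name", "user"},
    "title": {"title"},
}

# Per-locale names and aliases a translator might provide (hypothetical table).
TRANSLATED_ALIASES = {
    "de": {"username": {"benutzername", "benutzer", "name", "nutzer"}},
}


def resolve_field(name, locale):
    """Map a field name typed in a query to the internal field, or None if unknown."""
    key = name.casefold()
    localized = TRANSLATED_ALIASES.get(locale, {})
    for field, english in ENGLISH_ALIASES.items():
        if key in english | localized.get(field, set()):  # union of both alias sets
            return field
    return None


print(resolve_field("nutzer", "de"))  # username
print(resolve_field("user", "de"))    # username: English always works
print(resolve_field("nutzer", "en"))  # None: no German aliases in this locale
```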

One of my passwords was leaked by some site (according to a leaks database). The leaks database does not list which site has leaked it.

For that use case it's useful to be able to search by password.

Workaround: right click on the password column header, uncheck 'Hide passwords', and sort alphabetically by password.

This is implemented in my PR referenced above.
