Tiddlywiki5: Prohibiting tiddlers from starting with '+', '-', or '~'

Created on 9 Jul 2019  Â·  27Comments  Â·  Source: Jermolene/TiddlyWiki5

I propose that tiddlers be restricted from starting with '+', '-', or '~',
OR barring that, I propose that the stringified list syntax specify that such tiddlers must be wrapped in brackets (i.e. "tiddlerA tiddlerB [[-tiddler-starting-with-dash]]")

This is so that lists are genuinely subsets of filters, which it seems like there was a concerted effort to make true. It's not quite there though. A B C +B is not equivalent when interpreted as a list or a filter.

Having true equivalency has several benefits:

Some settings like DefaultTiddler pretend to accept either lists or filters, but that's not true currently. If a new user inputs...

--Dashboard--

Into "Default tiddlers:" under Basic settings, they'll be confused as to why it's not working.

There's another benefit, and this is the cool one. If lists are truly subsets of filters, then you can put in the most beautiful optimization in the filter compiler. First, it's pretty easy and efficient to tell the difference between a list filter and a non-list filter. In the filters module, you'd have...

exports.isList = function(filterString) {
   var index = 0;
   while ((index = filterString.indexOf('[', index)) >= 0) {
      index += 1;
      if (index >= filterString.length || filterString[index] != '[') {
         return false;
      }
      index += 1;
   }
   return true;
};

There may be some edge cases involving illegal filters this isn't accounting for, but otherwise, I'm pretty certain that works. Non-list filters will have a standalone bracket somewhere inside of it, which is what this looks for.

Next, you add (something like but not exactly) this into the beginning of exports.compileFilter:

if (this.isList(filterString)) {
   var list = $tw.utils.parseStringArray(filterString);
   return function() { return list; }
}

Next step: Bask in the glorious optimization. I'm not certain how many filters in tiddlywiki could often be downgraded to quick-return lists like that, but I know at least some would (especially in my own projects), as would any in plugins. The isList check is fairly efficient compared to the actual compilation process, though I could profile a bit to make sure.

So I don't know about you, but I think this is totally worth saying that tiddlers starting in '+', '-', and '~' are either disallowed, or must be wrapped in brackets when specified in lists.

-Flibbles

Most helpful comment

Oh, it seems there's still interest in this (despite it not fixing the language disparity). My pull request #4087 might still be relevant. It fixes stringifyList, and it adds a warning when someone is entering titles starting with +-~=.

If you guys aren't fond of the big blue popup boxes, has anyone ever considered utilizing html5's field validation functionality?

<input type="text" class="tc-titlebar tc-edit-texteditor" pattern="^[^\|\[\]{}+\-~=][^\|\[\]{}]$">

example

All 27 comments

Okay, just did a quick peek at what filters are compiled at vanilla-tiddlywiki startup, and about 1% of them would downgrade into lists. But I still contend that the optimization would be a net-gain. The check is cheap compared to the regexes the compiler proceeds to throw at the filterString. Also any list-or-filter inputs specified by core tiddlywiki or by a plugin would benefit from this.

hmm, That's worth 2 separated issues. ...

  • First is: +, - and ~ as tiddler titles need special handling
  • Second is: A proposal for a possible filter optimization.

We should treat it like that.

I'd support both of them. ;)

I propose that tiddlers be restricted from starting with '+', '-', or '~',

There is a warning when one tries to create a title with any of | [ ] { }. Is it enough to include prefixes + - ~ there?

Okay!

Pull request made. Among other things, it modifies the field validator like @twMat suggested. (It also includes '=' as a bad prefix, since apparently that's new in 5.1.20.)

I'm addressing the bad prefixes first, since the performance improvement shouldn't be added until the list language is a true subset of the filter language. I'll add the other issue once this one is resolved.

...I did not realize the "Tiddlers should contain | [ ] { } was only a warning. If those types of tiddlers are actually allowed then it's hard to say that the list language is a subset of the filter language. Some nasty behavior can crop up if the user ignores those warnings.

Thanks @flibbles. I think indeed that the fundamental issue here is the misapprehension that a stringified list will, if evaluated as a filter, yield the same list of titles. That has never been true; the idea is just that they be familiar, but naturally they are in a sense fundamentally different.

Some nasty behavior can crop up if the user ignores those warnings

Just to be clear, it's perfectly possible to use those characters in tiddler titles; it just makes it less convenient to work with those titles because of the ambiguity inherent in the wikitext syntax. But it's always possible to reformulate things to safely work with arbitrary titles.

I see, @Jermolene. So that makes sense. It seems then that warning users not to use +-~= to start their titles isn't necessary if there's no aim to have the filter language fully contain the list language.

However, I still contend that having $tw.utils.stringifyList wrap such titles in brackets may be worthwhile. It ensures that, at the very least, stringifyList always returns a filter-compatible list, thus making it quite feasible that an end user can maintain the filterLanguage ⊃ listLanguage rule if they wish.

Also, quick point. Are you _sure_ it's not just better to put this rule in place. You say they're fundamentally different, but are they? The documentation itself states: "They are in fact the simplest case of a filter, and are thus a way of expressing a selection of titles." (tw5.com#Title List)

Also, other than this +-~= issue, I can't think of a single reason why they aren't.

Also, if it's decided that they aren't. My super nifty code improvement is problematic now, and that makes me sad.

Also, quick point. Are you _sure_ it's not just better to put this rule in place. You say they're fundamentally different, but are they? The documentation itself states: "They are in fact the simplest case of a filter, and are thus a way of expressing a selection of titles." (tw5.com#Title List)

Good point, it was indeed true a while ago, but not since filters grew more complex (c. 2013).

I'm happy to refine the stringifyList logic if we can also take the opportunity to make the function more efficient (lets do some benchmarks); that would deal with the potential objection around performance.

All right. All right. I've racked my brain on this, and must reluctantly agree that the list language is not a subset of the filter language. It's those danged quotes!

"Tiddler with spaces" Tiddler-without-spaces

Filter parsing returns ["Tiddler with spaces", "Tiddler-without-spaces"]
List parsing returns ['"Tiddler', 'with', 'spaces"', 'Tiddler-without-spaces"]

Hmm. I'll close this issue then, and terminate my merge-request. But I haven't quite given up on my optimization.

Also, other than this +-~= issue, I can't think of a single reason why they aren't.

I think the warning should contain these characters. .. If it doesn't help, it doesn't hurt and makes users live much easier.

I think the warning should contain these characters. .. If it doesn't help, it doesn't hurt and makes users live much easier.

Do you mean that we should warn against starting a tiddler title even though we've decided to fix the stringifyList() end of the problem? Doesn't that make the warning unnecessary. The more complex the warning gets the less likely users are to read and understand it.

Fixing the .stringifyList() function is OK. But users still will use the wrong syntax if they write filters, that contain lists. eg: <$list filter="[tag[test]] -test-tiddler" where it should have been <$list filter="[tag[test]] [[-test-tiddler]]"

So we also need to add the special cases to the docs. ... Since most users don't read the docs, we should add something to the warning too.


I'm not happy with edit area like this, but at least nobody can say, we didn't tell them.

not-happy

For me it would be OK, if we would put a big red exclamation mark in front of the title input. .. If the user clicks it, we can reveal the whole info. .. But the existing overkill is much better then no info.

big red exclamation mark [....] If the user clicks it, we can reveal the whole info.

Possibly for the "tiddler already exists" warning because it is triggered by false positives, e.g if the user wants to type "HelloThere world" AND because you also get the popup confirmation when saving ("Do you wish to overwrite...").

For the forbidden characters the warning is fine as it is because it only shows when it is really relevant.

Oh, it seems there's still interest in this (despite it not fixing the language disparity). My pull request #4087 might still be relevant. It fixes stringifyList, and it adds a warning when someone is entering titles starting with +-~=.

If you guys aren't fond of the big blue popup boxes, has anyone ever considered utilizing html5's field validation functionality?

<input type="text" class="tc-titlebar tc-edit-texteditor" pattern="^[^\|\[\]{}+\-~=][^\|\[\]{}]$">

example

@flibbles I think this would be a much better option, than the existing infos. The main problem imo is translations. ... But with some javascript magic it would be possible

Actually, translation is the easy part. HTML5 form validation normally requires... _a form_. Since tiddlywiki doesn't use those, I made that screenshot above with the following steps:

  1. element.checkValidity() to check against built-in pattern attribute (though Tiddlywiki would likely use it's own check)
  2. element.setCustomValidity("Warning: ...") To set the warning message.
  3. element.reportValidity() to make it emit the warning popup.

It would be trivial to work Tiddlywiki's localization into this mix as part of a <$validate> widget or something. The _tricky_ part is I'm not certain HTML5 form validation is available on all the browsers Tiddlywiki claims to support.

That aside, I'm going to reopen my pull request (#4087). It seems even @Jermolene is interested in having stringifyList check against bad prefixes. That'll be a better place to discuss it.

Hi @flibbles @twMat @pmario in fact I think we should entirely get rid of the warning message about illegal characters in tiddler titles. The reasons are:

  1. The message dated from a time when key core features didn't work properly when some of those characters were used in tiddler titles (e.g. the TOC). Now that those features are fixed the only problems are with custom wikitext, and we shouldn't be trying to teach how to write wikitext in a warning message in the edit template
  2. Users either don't read or misunderstand the message (I think elsewhere @flibbles mentioned that they had read the message as prohibiting those characters, rather than a more nuanced warning)

I remain interested in merging the fix to stringifyList(), but see my comment above. To state it more clearly: the rationale for the change is so thin that to justify it we need to do more than just add another condition that will slow the function down; we should take the opportunity to optimise the function, with the goal that the revised version is faster than what we have at the moment.

Okay, @Jermolene. I've looked into the performance of stringifyList. In the course of opening tw5.com, dicking around a bit, and then making an edit to any page, stringifyList will get called about 1200 times. 1000 of these times occurs during the saving of the edited file. This is all in browser. The nodeJS server called it about 200 times near bootup and didn't really call it again.

That 1000 times during a save ends up taking about 1.3 ms. Nearly all calls to it are for arrays of length 1 or 2, with the occasional length of 12 or 16. The average is less than 2.

Here is my a benchmarking method:

$tw.utils.benchmark = function() {
        $tw.perf.enabled = true;
        const count = 1000000;
        const str = $tw.utils.stringifyList
        const template = ["Love", "Hate", "Dogs and cats", "$:/something/in/the/core", "$:/Probably/more/stuff/in/core", "$:/Thats/probably/it", "$:/getting/bored", "$:/all", "With spaces", "+starts_with_plus", "More stuff with spaces", "-one_more"];
        const list = [template];
        for (var i = 0; i < template.length; i++) {
                list.push(template[i]);
        }       
        var test = function(methodName) {
                const method = $tw.utils[methodName];
                const length = list.length;
                $tw.perf.report(methodName, function() {
                        for (var i = 0; i < count; i++) {
                                const a = list[i % length];
                                method(a); method(a); method(a);
                                method(a); method(a); method(a);
                                method(a); method(a); method(a);
                                method(a);
                        }       
                })();
        };   
        test("stringifyList");
        test("stringifyList_prefix");
        test("stringifyList_optimized");
        test("stringifyList_optimized_prefix");
};
  • stringifyList is the vanilla method.
  • stringifyList_prefix has my prefix test added to it.
  • stringifyList_optimized contains the only optimization I found that works after about 15 minutes of work.
  • stringifyList_optimized_prefix contains both the improvement and the prefix test

Each of these was called 10,000,000 times, and here are the results:

Chrome

  • stringifyList: 1461.88ms
  • stringifyList_prefix: 1447.93ms
  • stringifyList_optimized: 1302.05ms
  • stringifyList_optimized_prefix: 1449.86ms

Firefox

  • stringifyList: 3269.00ms
  • stringifyList_prefix: 3691.00ms
  • stringifyList_optimized: 3285.00ms
  • stringifyList_optimized_prefix: 4020.00ms

Safari

  • stringifyList: 2107.00ms
  • stringifyList_prefix: 2371.00ms
  • stringifyList_optimized: 2129.00ms
  • stringifyList_optimized_prefix: 2435.00ms

Things I tried included template literals, using array.map instead of appending to a new array, and putting in special handling for length==1. When you consider how many times stringifyList gets called, my best improvement saves about 0.03 ms per tiddler save, and my prefix check costs the user about 0.05 ms.

...yeah.

@Jermolene, @pmario, @twMat

I just gotta say. I understand that it's important to have an efficient program, but watching you guys harp on about optimizing this NodeJS javascript program is like watching my neighbors pimp out their pre-owned Ford Aspire.

Okay, @Jermolene. I've looked into the performance of stringifyList

Thanks @flibbles for the thorough analysis. Can you share your optimised version, perhaps in a PR?

I just gotta say. I understand that it's important to have an efficient program, but watching you guys harp on about optimizing this NodeJS javascript program is like watching my neighbors pimp out their pre-owned Ford Aspire.

The reason why I mentioned performance here is that this change is adding a rarely used complication to a frequently used core function. To ensure that there's a net positive for all users we try to take the opportunity to make improvements that benefit everyone.

I made a ticket for removing the warning message at #4107

function stringifyList(value) {
        if($tw.utils.isArray(value)) {
                var result = new Array(value.length);
                for(var t=0, l=value.length; t<l; t++) {
                        var entry = value[t] || "";
                        if(entry.indexOf(" ") !== -1 || "+-~=".indexOf(entry.charAt(0)) !== -1) {
                                result[t] = "[[" + entry + "]]";
                        } else {
                                result[t] = entry;
                        }
                }
                return result.join(" ");
        } else {
                return value || "";
        }
};

Between the moderate improvements I made and the extra check. This is slower, because there really wasn't any room to make any improvements at all. Like I explained, the insignificant improvements I saw were not consistent across different platforms.

For instance, I preallocated the array instead of pushing values on an empty one. A JS implementation might preallocate, and thus save a ton of memory allocation, but it might not. It's not actually in the spec that it has to do it. I also changed the iterator to cache the value of the array length. 6+ years ago, this improvement might have been good. These days, most browsers optimize for-loops anyway.

But I'll say again. Optimizing this function accomplishes nothing. It gets called less than 1000 times on any tiddler save-and-rerender operation, which amounts to ~0.05ms. Total.

Trying to optimize a function simply because you're modifying it's functionality is the wrong way to go about optimizing.

I'm not saying to disregard performance entirely during development, but it's far far more effective to instead look at the program as a whole and see where bottlenecks are. I did this out of spite and came up with #4108 and #4102, which will have a much more profound effect than squeezing clock cycles out of stringifyList.

So @Jermolene, if you didn't want to take my implementation change unless it guaranteed it didn't reduce performance, then will you take it along with my two unrelated optimizations I proposed earlier?

-Flibbles

Hi @flibbles thanks for investigating. Please do go ahead and create a PR for the stringifyList() changes.

@flibbles - I just want to note that your meticulous investigation is appreciated! It is as much a step forward as anything. Thank you!!!

Apologies @flibbles I just discovered that you've already created a PR at #4087

Was this page helpful?
0 / 5 - 0 ratings

Related issues

tobibeer picture tobibeer  Â·  42Comments

joshuafontany picture joshuafontany  Â·  34Comments

felixhayashi picture felixhayashi  Â·  28Comments

twMat picture twMat  Â·  33Comments

inmysocks picture inmysocks  Â·  34Comments