Newsboat: Ignoring articles not working as expected

Created on 16 May 2018 · 4 Comments · Source: newsboat/newsboat

I have a rule like the following in my config:

ignore-article "https://example.com/*" "title !~ \"interesting thing\""

I'd expect this to ignore any articles from any feed from https://example.com/ that does not have the string "interesting thing" in the title. But I see articles slip past this rule anyway :(

I think there could be 3 causes for this:

1) I'm an idiot and this rule does not do what I expect it to do.
2) * doesn't work as a wildcard in feed URLs, but only as a standalone wildcard for _all_ feeds.
3) There's an actual bug!

Is anyone able to tell me what's going on? :)

Edit: I just realised that there's a #newsboat channel on Freenode, which might have been a better place for something questionish like this, but I'll leave it here, to avoid causing too much commotion.

question

All 4 comments

Looks like your second assumption is correct: only the standalone wildcard for all feeds is recognized, and no actual parsing is done on the URL.

The relevant codebit is the following:

https://github.com/newsboat/newsboat/blob/6f3c0ccc037576bf069206f6628925ea5174b375/src/rss.cpp#L408-L419
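For anyone who doesn't want to follow the link, here is a rough Python sketch of the behaviour described above. It is not Newsboat's actual code (that is C++), and `selector_applies` is a name made up for this illustration. It only assumes what this thread states: a lone `*` means every feed, and anything else is compared as a plain string.

```python
def selector_applies(selector: str, feed_url: str) -> bool:
    """Return True if an ignore-article feed selector applies to feed_url.

    Hypothetical helper modelling the behaviour described in this thread:
    "*" on its own selects every feed; any other value must equal the
    feed URL exactly. No glob or regex expansion is attempted.
    """
    return selector == "*" or selector == feed_url


feed = "https://example.com/news.xml"

print(selector_applies("*", feed))                             # standalone wildcard
print(selector_applies("https://example.com/news.xml", feed))  # exact URL
print(selector_applies("https://example.com/*", feed))         # "*" inside a URL is literal
```

Under this model the rule from the original post only applies to a feed whose URL is literally `https://example.com/*`, which is why articles slip past it.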

@tsipinakis answered the question, now let's figure out what else we need to do :)

First of all: can we make the documentation better? Here's what it says at the moment:

> The basic format is that the user specifies an RSS feed for which the ignore shall be applied ("*" matches all RSS feeds), and then a filter expression (see previous section). If newsboat hits an article in the specified RSS feed that matches the specified filter expression, then this article is ignored and never presented to the user.

@decibyte, did you read that? (I won't judge you if you didn't, promise!) Was it misleading in some way, or didn't explain the mechanics of * well enough? Any suggestions on how to improve it?

Second, maybe this should lead to a feature request for globs or regexes in ignore-article? Both will conflict with URL's use of question mark, though.
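To make the question mark conflict concrete, here is a small sketch using Python's standard `fnmatch` and `re` modules as stand-ins for whatever glob or regex engine Newsboat might adopt. The feed URL is invented for the example, and a different engine could behave differently.

```python
import fnmatch
import re

# Hypothetical feed URL with a query string.
url = "https://example.com/feed?id=1"

# Glob: "?" matches any single character, so using the URL as a pattern
# matches the URL itself but also unrelated URLs.
glob_hits_literal = fnmatch.fnmatchcase(url, url)
glob_hits_other = fnmatch.fnmatchcase("https://example.com/feedXid=1", url)

# Regex: "?" makes the preceding "d" optional, so the URL used as a
# pattern no longer matches itself, yet matches a URL without the "d?".
regex_hits_literal = re.fullmatch(url, url) is not None
regex_hits_other = re.fullmatch(url, "https://example.com/feeid=1") is not None

# Escaping the pattern restores literal matching.
escaped_hits_literal = re.fullmatch(re.escape(url), url) is not None

print(glob_hits_literal, glob_hits_other)
print(regex_hits_literal, regex_hits_other)
print(escaped_hits_literal)
```

So with either syntax, users with query strings in their feed URLs would have to escape them, and existing configs that rely on exact matching could change meaning.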

It's indeed better to reserve questions for IRC, but plenty of them are asked here as well, so don't worry about it. ;)

Thanks, both of you!

@Minoru, I _did_ read the docs, and _did_ understand what it said. I guess I was just hoping for a little bit of additional magic :)

> can we make the documentation better?

That would be to explicitly say that * _only_ works to match _all_ feeds, not as a wildcard in feed URLs.
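In the meantime, since only exact URLs and a lone `*` are recognized, one possible workaround is to write one rule per feed. The sketch below generates those lines; `ignore_rules` and the feed URLs are made up for illustration, and the quoting simply mirrors the rule from the original post (only double quotes are escaped).

```python
def ignore_rules(feed_urls, prefix, filter_expr):
    """Build one ignore-article line per feed URL that starts with prefix.

    Hypothetical helper: double quotes in the filter expression are
    escaped with a backslash, as in the rule from the original post.
    """
    escaped = filter_expr.replace('"', '\\"')
    return [
        f'ignore-article "{url}" "{escaped}"'
        for url in feed_urls
        if url.startswith(prefix)
    ]


# Example feed list (invented URLs); in practice these would come from
# the feeds you are subscribed to.
feeds = [
    "https://example.com/news.xml",
    "https://example.com/blog.xml",
    "https://other.example.org/feed.xml",
]

rules = ignore_rules(feeds, "https://example.com/", 'title !~ "interesting thing"')
for line in rules:
    print(line)
```

The resulting lines can be pasted into the config. They have to be regenerated whenever a matching feed is added.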

> Second, maybe this should lead to a feature request for globs or regexes

Oh yes, that would be sweet! Would you prefer to open a new issue for this yourself or do you want me to do it?

> Both will conflict with URL's use of question mark, though

Yeah. Let's save that discussion for the issue for the actual feature request.

I'll close this one, as I consider my question answered. Thanks again.

@decibyte, turns out we already have an issue for that over at the Newsbeuter tracker: akrennmair/newsbeuter#500. But if you have more thoughts on the matter, please do open an issue here, and we'll discuss it further.

(My stance regarding Newsbeuter issues and how they apply to Newsboat is outlined in #31.)
