News: Problems with entire feeds marked as unread after upgrade to 15.3.1

Created on 6 Feb 2021 · 36 Comments · Source: nextcloud/news

IMPORTANT

Read and tick the following checkbox after you have created the issue or place an x inside the brackets ;)

  • [x] I have read the CONTRIBUTING.md and followed the provided tips
  • [x] I accept that the issue will be closed without comment if I do not check here
  • [x] I accept that the issue will be closed without comment if I do not fill out all items in the issue template.

Explain the Problem

What problem did you encounter?
(I would have continued in issue #1122, but it is closed for further comments.)
Since 15.2.2, some (random?) feeds are marked as unread again after every run of cron or the news updater. As @Grotax mentioned in a comment on #1122, this should have been fixed in 15.3.1. Unfortunately, on my Nextcloud installation the problem still occurs even after upgrading News to 15.3.1 and Nextcloud to 20.0.7. It only works for me when I disable purging (as described in a comment). Otherwise, hundreds of items from two or three feeds are marked as unread after every sync or every hour.

Steps to Reproduce

Explain what you did to encounter the issue

  1. Update to 15.3.1
  2. Set feed purging to 200 in the settings of the news app
  3. Wait several hours and use the news app to validate it is not an old sync run or caching problems
  4. Get hundreds of unread items from only two or three feeds.

System Information

  • News app version: 15.3.1
  • Nextcloud version: 20.0.7
  • Cron type: system cron
  • PHP version: 7.4
  • Database and version: mariadb Ver 15.1 Distrib 10.3.25-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2
  • Browser and version: Chrome (Brave) V1.19.92 Chromium to 88.0.4324.152
  • OS and version: Windows 10

Affected URLs:

Thank you very much for your ongoing work and continued development! I like the news reader very much and appreciate the development performance!

0. Needs triage bug help wanted

All 36 comments

Please also provide your used database, the Nextcloud log and some feed urls of affected feeds.

I added the used database and the affected URLs to the Issue description. For the news app I see no relevant log entries on a warning or higher level. Do you also need the logs on debug/info level?

Same here with lots of feeds, for example:

KO:
http://feeds.feedburner.com/contemporist
http://feeds.feedburner.com/canonrumors/rss
https://www.sonyalpharumors.com/feed/
http://www.the-digital-picture.com/News-RSS.aspx
http://toolsandtoys.net/feed/

OK:
youtube feeds, for example: https://www.youtube.com/feeds/videos.xml?channel_id=UC6107grRI4m0o2-emgoDnAA
http://feeds.feedburner.com/MaisonEtDomotique

As I see feedburner in both the OK and KO lists, I imagine the problem does not come from the feeds themselves...
I did nothing but update to the latest News app version (no setting changes)...

I tried different ways to mark feeds as read: scrolling, clicking, using keyboard shortcuts, using the "mark all read" button... no difference. They keep coming back as unread.

My settings:

  • system cron
  • purge: 60
  • max read per feed: 1000
  • max redirect: 10
  • timeout: 60
  • update interval: 3600

I have same problem as described.

  • News 15.3.1 (but same problem on previous version)
  • Nextcloud 20.0.7 snap
  • Ubuntu 20.04 VPS

If I select _All Articles_, I see the last 4 years of content.

If I select Unread Articles, it shows 832 unread (but actually already read) messages, but only from the last 5 months.

  • If I click the three dots and select Mark Read, they are marked as read.
  • Half an hour later or so, they are all there again, as Unread Articles.
  • This only applies to some of my feeds (the majority).

Not marked as read:
https://www.dedoimedo.com/rss_feed.xml
https://www.theguardian.com/sport/cycling/rss
https://feeds.feedburner.com/d0od

Correctly and permanently marked as read:
www.ghacks.net/category/operating-systems/linux/feed/
http://flavio.tordini.org/feed
http://msmvps.com/blogs/hostsnews/rss.aspx
https://github.com/nextcloud/news/releases.atom
https://pi-hole.net/feed/
https://www.schneier.com/blog/atom.xml

Interesting: when I Open feed URL for these correctly handled feeds in Firefox, the pages do not open and instead I get this type of FF pop-up box:

[Screenshot: 20210208_screenshot_001]

I wrote to the webmaster of www.schneier.com some time ago, querying this behaviour and he said to contact you and suggest the following:

Essentially, the feed contains more than one link, and Nextcloud is picking the wrong one.

Here's the detailed version, which may help them find the issue faster:

The feed includes the webpage URL for each entry -- the URL you want -- coded like this:

    <link rel="alternate" type="text/html" href="https://www.schneier.com/blog/archives/2020/11/new-windows-zero-day.html" />

The "alternate" link relation and "text/html" type in that code tells feed readers that this is another version of the entry content, and that it's in the form of a webpage.

The feed also contains a link to the "Subscribe to comments on this entry" feed, like this:

    <link rel="replies" type="application/atom+xml" href="https://www.schneier.com/blog/archives/2020/11/new-windows-zero-day.html/feed/atom/" thr:count="2" />

This code shows that the link contains replies, in Atom feed format.

Nextcloud is ignoring the metadata, and apparently just picking the last link it sees, which is likely to be wrong on any feed that contains more than one link per entry.
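
The selection rule described above can be sketched in a few lines. This is a minimal Python illustration using only the standard library, not the code of Nextcloud News or feed-io. The `pick_entry_link` helper and the trimmed sample entry are made up for the example, and treating a link without a `type` attribute as a web page is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Trimmed sample entry built from the two <link> elements quoted above.
SAMPLE_ENTRY = """\
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:thr="http://purl.org/syndication/thread/1.0">
  <link rel="alternate" type="text/html"
        href="https://www.schneier.com/blog/archives/2020/11/new-windows-zero-day.html" />
  <link rel="replies" type="application/atom+xml"
        href="https://www.schneier.com/blog/archives/2020/11/new-windows-zero-day.html/feed/atom/"
        thr:count="2" />
</entry>
"""


def pick_entry_link(entry):
    """Return the href of the entry's web page link, or None (hypothetical helper)."""
    fallback = None
    for link in entry.findall(ATOM_NS + "link"):
        # RFC 4287: a link without a rel attribute counts as "alternate".
        rel = link.get("rel", "alternate")
        if rel != "alternate":
            continue  # skip "replies", "enclosure", "self", ...
        # Assumption of this sketch: a missing type is treated as text/html.
        if link.get("type", "text/html") == "text/html":
            return link.get("href")
        if fallback is None:
            fallback = link.get("href")  # first non-HTML alternate, if any
    return fallback


entry = ET.fromstring(SAMPLE_ENTRY)
print(pick_entry_link(entry))
```

A reader that simply took the last `<link>` would end up with the comments feed URL, whereas the sketch returns the article URL.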

@alexdebril I expect this is incorrectly handled on a Feed-IO level?

@Jakethethird What the webmaster of http://www.schneier.com/ told you was right, however it had been fixed since (a few weeks ago I think).

@SMillerDev I performed some tests on the feeds reported in this thread and feed-io doesn't seem to complain about any of them. I'll push them in the API's database to monitor how it behaves through time, let's see if we can get something from it

@alexdebril - OK. My point was also that the feeds that are not marked as read seem to be those that don't throw up this dialog box when opening the feed URL in Firefox. Instead I get a page of xml in the browser window.
Maybe that is useful in diagnosing the issue?

I'm hitting the same issue: I mark older articles as read, and on the next update they are marked as unread again.

You can disable item purging in the settings; that will probably help.

There was a bug report for earlier versions which was (probably incorrectly) closed: #1122

One of two that still continuously reset:
https://feeds.megaphone.fm/cyberwire-daily-podcast
the other:
https://www.bnr.nl/podcast/digitaal?widget=rssfeed

"Mark as Read" has been done from Android as well as from the web interface.

Same problem here.
NC: 20.0.7
News: 15.3.1
OS: Ubuntu Server 20.04.1 (NC installed as Snap)

Some of these feeds:
https://netzpolitik.org/feed
https://posteo.de/blog/feed
https://thecodinglove.com/feed
https://www.spektrum.de/alias/rss/spektrum-de-rss-feed/996406
I have more, if needed.

The fix for #1122 solved this problem for me. I'm subscribed to the heise.de and the posteo feed mentioned here (using slightly different links to the same feed) and don't have any problems with it. For testing purposes I've subscribed to the theguardian.com/sport/cycling/rss feed mentioned here and ran a few cronjobs, everything worked as expected.

I'm using PostgreSQL. The NC Snaps use MySQL as the database, so could this only affect people using MySQL/MariaDB?

I am using nextcloudpi with 10.3.27-MariaDB.

I have trouble with the heise.de feed...

I opened #1122 and can confirm that there is still an issue.
Is there a way to clear the database completely and start from a fresh OPML import?

Are some of you willing to test a patch, assuming you have basic knowledge of PHP and server administration?
Since it works on my end I couldn't verify it myself (Grotax already tried a similar version).

diff --git a/lib/Db/ItemMapperV2.php b/lib/Db/ItemMapperV2.php
index ed7efff3f..f51b7af4d 100644
--- a/lib/Db/ItemMapperV2.php
+++ b/lib/Db/ItemMapperV2.php
@@ -167,7 +167,8 @@ class ItemMapperV2 extends NewsMapperV2
             ->from($this->tableName)
             ->where('feed_id = :feedId')
             ->andWhere('starred = false')
-            ->orderBy('last_modified', 'DESC');
+            ->orderBy('last_modified', 'DESC')
+            ->addOrderBy('id', 'DESC');

         if ($removeUnread === false) {
             $rangeQuery->andWhere('unread = false');

1142.patch
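
One plausible reading of why the extra `id` sort key matters (the thread does not spell it out): if several items share the same `last_modified` value, ordering by that column alone leaves their relative order up to the database, so the set of rows that falls beyond the purge limit can differ between runs. The Python sketch below simulates that with made-up rows and a hypothetical `ids_to_keep` helper. It is not the News app's code.

```python
def ids_to_keep(rows, limit, tie_break):
    """Return the ids of the `limit` newest rows (hypothetical helper).

    rows: list of (id, last_modified) tuples, in whatever order the
    database happened to return them.
    """
    def sort_key(row):
        item_id, last_modified = row
        if tie_break:
            return (last_modified, item_id)  # last_modified, then id
        return (last_modified,)              # last_modified only

    # sorted() is stable, so rows with equal keys keep their input order,
    # which stands in for "whatever order the database picks".
    ordered = sorted(rows, key=sort_key, reverse=True)
    return {row[0] for row in ordered[:limit]}


# Four items written by the same update run share one timestamp.
run_a = [(1, 1000), (2, 1000), (3, 1000), (4, 1000)]
run_b = list(reversed(run_a))  # same rows, different physical order

# Without a tie-breaker the kept set depends on the input order.
print(ids_to_keep(run_a, 2, tie_break=False))
print(ids_to_keep(run_b, 2, tie_break=False))

# With id as a secondary key the result is the same for both runs.
print(ids_to_keep(run_a, 2, tie_break=True))
print(ids_to_keep(run_b, 2, tie_break=True))
```

Whether an unstable purge boundary is what made items come back as unread is an assumption here. It would be consistent with the reports that disabling purging avoids the problem.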

I opened #1122 and can confirm that there is still an issue.
Is there a way to clear the database completely and start from a fresh OPML import?

This wouldn't change anything, I've already tested it with a clean dataset. Apparently either it works for the whole server or it doesn't.

This does seem to work, at first glance. I'll report back in a day on whether an update shows all items or just new ones.

Good job, thanks!!!

I also applied this patch and will report later if it works for me.
Thanks for your work!

The fix still fails for this one:
https://feeds.megaphone.fm/cyberwire-daily-podcast

The mentioned feed is gigantic: it contains 1536 items and is nearly 5 MB (most others contain 10 to 50 items and are only a few KB in size).
I suspect that a timeout occurs during parsing, as all items are compared with the database each time. Do you happen to have a message in the log?

@anoymouserver

Yes, I was a bit surprised when I saw the number of items in it:

./bin/feedio check https://feeds.megaphone.fm/cyberwire-daily-podcast                                                

reading https://feeds.megaphone.fm/cyberwire-daily-podcast
----------------------------------------------------------

+----------------------------------------------------+------------+-----------+---------------------------+---------+------------+-----------+
| URL                                                | Accessible | readSince | Last modified             | # items | unique IDs | Date Flow |
+----------------------------------------------------+------------+-----------+---------------------------+---------+------------+-----------+
| https://feeds.megaphone.fm/cyberwire-daily-podcast | OK         | OK        | 2021-02-08T22:00:00+01:00 | 1536    | OK         | OK        |
+----------------------------------------------------+------------+-----------+---------------------------+---------+------------+-----------+

So without any memory limit, it works. But your idea makes sense: parsing so many items can lead to memory exhaustion.
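
To illustrate the memory point, here is a small Python sketch that counts the items of a feed with a streaming parser, so that finished items can be released instead of being held in one large tree. It is independent of feed-io and the News app. The `count_feed_items` helper and the tiny sample feed are invented for the example.

```python
import io
import xml.etree.ElementTree as ET


def count_feed_items(stream):
    """Count <item> (RSS) and <entry> (Atom) elements while streaming."""
    count = 0
    for _event, elem in ET.iterparse(stream, events=("end",)):
        # Strip a "{namespace}" prefix, if any, to get the local tag name.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name in ("item", "entry"):
            count += 1
            # Release the item's text and children; the emptied element
            # itself stays attached to its parent.
            elem.clear()
    return count


SAMPLE_RSS = b"""<?xml version="1.0"?>
<rss version="2.0"><channel>
  <title>Example</title>
  <item><title>One</title></item>
  <item><title>Two</title></item>
  <item><title>Three</title></item>
</channel></rss>
"""

print(count_feed_items(io.BytesIO(SAMPLE_RSS)))  # prints 3
```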

There are error messages:

[no app in context] Info: Deprecated event type for {"[object] (OCP\SabrePluginEvent)":{"*statusCode":200,"*message":"","*server":{"[object] (OCA\DAV\Connector\Sabre\Server)":{"tree":"[object] (OCA\DAV\Connector\Sabre\ObjectTree)","*baseUri":"/remote.php/webdav/","httpResponse":"[object] (Sabre\HTTP\Response)","httpRequest":"[object] (Sabre\HTTP\Request)","sapi":"[object] (Sabre\HTTP\Sapi)","*plugins":[],"transactionType":null,"protectedProperties":{"...":"Over 20 items, aborting normalization"},"debugExceptions":false,"resourceTypeMapping":[],"enablePropfindDepthInfinity":true,"xml":"[object] (Sabre\DAV\Xml\Service)","*listeners":{"...":"Over 20 items, aborting normalization"},"*wildcardListeners":[],"*listenerIndex":[],"*logger":null}},"Symfony\Contracts\EventDispatcher\EventpropagationStopped":false}}: null

PROPFIND /remote.php/webdav/InstantUpload/Camera/2018/
xxxxxxxxxxxxxxxxxxxxxxxx at 2021-02-09T13:19:15+01:00

or

[no app in context] Info: Deprecated event type for OCP\IPreview:PreviewRequested: Symfony\Component\EventDispatcher\GenericEvent is used

GET /index.php/apps/files/api/v1/thumbnail/256/256/Ixxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx  at 2021-02-09T13:27:49+01:00

or

not enough messages to train a classifier

I tried this fix, it seems to work (after 6 hours, no feed came back with unread posts)

So far the patch seems to work for me. I am not 100% sure yet, though, because once I got old feed entries displayed as new ones. Maybe this happened directly after applying the patch, and caching or something else needed some time.

I will give you another update tomorrow. Thank you all for your help with this issue and the good work and fast responses of the developers!

One day later, everything is still ok for me (with the patch)

Same for me. Looks good!

The patch is included in 15.3.2

@Grotax I assume my issue is related to this one, but to be honest I'm not 100% sure. I usually read all news on my mobile phone using the Nextcloud News app. A couple of days ago I noticed a problem with the unread items counter. If, for example, I select the Heise news feed and set the read flag for the whole feed, the unread counter is still greater than 0. If I scroll through the list of displayed news items, I cannot find any unread ones. If I select the "All unread" category, unread items from that feed are shown there as well. All these items are 1 or 2 days old and have definitely been read in the past. I can select all these items and set the read flag, which also sets the unread counter of the Heise news feed itself to 0.
I updated the News app to v15.3.2 yesterday evening, but the described effect reappeared this morning. Is there anything I can do to narrow down the problem?

I can see a similar problem on my installation, even after applying the patch. For me, too, it only affects the heise.de news feed; no other feeds are affected. Not all items of the heise.de feed are marked as unread, only about 10, which are always 2 days old. The problem does not seem to occur after every sync/cron run. It appears to be random: in the last 2 days I had the problem only once a day, but today I have already seen it twice.

Check whether the content is still the same. heise is known to update articles; the longer the interval between fetches, the more likely it is that heise has re-published the item with corrections or updates to the article.
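
A simple way to check whether an article changed between two fetches is to compare a hash of its content. The Python sketch below shows the idea with made-up article data and a hypothetical `content_fingerprint` helper. Which fields the News app actually compares, and how, is not shown in this thread, so the field choice here is an assumption.

```python
import hashlib


def content_fingerprint(title, url, body):
    """Hash the fields that a re-published article is likely to change."""
    joined = "\x1f".join((title, url, body))  # unit separator between fields
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


first_fetch = content_fingerprint(
    "Example headline", "https://example.org/article", "Original text."
)
refetch_same = content_fingerprint(
    "Example headline", "https://example.org/article", "Original text."
)
refetch_edited = content_fingerprint(
    "Example headline",
    "https://example.org/article",
    "Original text. [Update] Fixed a typo.",
)

print(first_fetch == refetch_same)    # True: unchanged article
print(first_fetch == refetch_edited)  # False: the publisher edited the body
```

If the fingerprints of an old and a re-fetched copy differ, the publisher changed the article, which would explain an already read item showing up as unread again.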

Because I can't see that issue, I will re-add the heise feed to get the maximum unread items

If using the Nextcloud News app on Android, I suggest going to settings and deleting the cache so it reloads the feeds. This fixed the wrong number of feeds being shown in the starred category, and might be helpful here.

btw, for me the patch seems to have resolved the main problem of feeds not being marked as read. Thanks!

I have been on the 15.3.2 update for a while now and while nearly all feeds seem to behave well I still have re-appearing messages from 2017 in one (https://puri.sm/feed/).
I have deleted the feed and re-added, maybe that will help.

I also still have old feed items recurring, from at least:

(News app 15.3.2, Nextcloud 20.0.7, PostgreSQL 11.10)

https://feeds.megaphone.fm/cyberwire-daily-podcast - after deleting and re-adding it, this one behaves normally.
Probably the upgrade script doesn't initialize all the needed fields, while adding a new feed does.
