Hello,
I am writing an Android RSS client with FreshRSS support and I encountered a problem when syncing items using the Greader API.
I would like to get from the server the items whose read state has been modified. Let's say I synchronize my item list with my Android client. Then, I mark some items as read in the web client. I would like the mobile client to notice that several items have been marked as read.
I already know that if an item contains user/-/state/com.google/read in the categories field of the returned JSON, like this:
"categories": [
"user/-/state/com.google/reading-list",
"user/-/label/IT",
"user/-/state/com.google/read"
]
it is read and I can mark it as read in the mobile client database.
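As a minimal sketch of that check (the function name and the sample item id are made up for illustration; only the categories values come from the JSON above):

```python
READ_TAG = "user/-/state/com.google/read"


def is_read(item: dict) -> bool:
    """Return True when the item's categories contain the read-state tag."""
    return READ_TAG in item.get("categories", [])


# Sample item shaped like the JSON fragment above; the id is hypothetical.
item = {
    "id": "example-item",
    "categories": [
        "user/-/state/com.google/reading-list",
        "user/-/label/IT",
        "user/-/state/com.google/read",
    ],
}

print(is_read(item))
```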
When fetching items with the reader/api/0/stream/contents/user/-/state/com.google/reading-list endpoint and the parameter ot (unix timestamp), I can get new items, and, if they have been read before the mobile client fetches them, they will have user/-/state/com.google/read in their categories field. But if I mark an item older than the ot parameter as read, I won't get it.
Am I missing something that could solve this problem? If not, is there anything that can be done about it?
Otherwise, thanks for FreshRSS, it is an awesome project!
Hello @Shinokuni and welcome :-)
First, I would like to say that getting the synchronisation strategy right is essential for a good client. Except for News+ and, to a lesser extent, EasyRSS, the other clients I have tested all have inefficient synchronisation strategies (in some cases very bad). By inefficient, I mean far too many requests, redundant requests, as well as expensive requests for the client and/or the server (leading to slow synchronisation, high battery consumption, high bandwidth consumption, high CPU usage on client and server, high database usage on server, etc.).
Therefore, I am always very pleased to provide the exact API calls to perform.
The following seven requests are what News+ does for its global synchronisation (see also full log below), which is both robust and efficient. No need to make a single additional request for that phase. I can also provide logs for other phases such as login, posting changes, etc. In case of doubt, I suggest you install News+ and check on your server the exact calls that are made, and do the same.
1. /reader/api/0/tag/list
2. /reader/api/0/subscription/list
3. /reader/api/0/stream/contents/user/-/state/com.google/reading-list (with some filters in parameter to exclude read items with xt, and get only the new ones with ot, cf. log below)
4. /reader/api/0/stream/items/ids (with a filter in parameter to exclude read items with xt)
5. /reader/api/0/stream/contents/user/-/state/com.google/starred (with some filters in parameter to exclude read items with xt, and get only the new ones with ot)
6. /reader/api/0/stream/contents/user/-/state/com.google/starred (with some other filters, which includes read starred items)
7. /reader/api/0/stream/items/ids (with a filter to get only starred ones)

It is also possible in News+ to synchronise / "pull for refresh" a specific category/folder, or feed, or tag/label, but that is only necessary when the user wants to get read items or more/older items than the global limit.
Full log:
[Mon, 08 Oct 2018 09:02:46 +0200] [debug] --- Array
(
[date] => 2018-10-08T09:02:46+02:00
[headers] => Array
(
[Connection] => Keep-Alive
[Accept-Encoding] => gzip
[Authorization] => GoogleLogin auth=test/ABCDEF0123456789
)
[_SERVER] => Array
(
[PATH_INFO] => /reader/api/0/tag/list
[REQUEST_URI] => /api/greader.php/reader/api/0/tag/list?client=newsplus&output=json&ck=1538982165918
[QUERY_STRING] => client=newsplus&output=json&ck=1538982165918
[REQUEST_METHOD] => GET
[HTTP_AUTHORIZATION] => GoogleLogin auth=test/ABCDEF0123456789
[PHP_SELF] => /api/greader.php/reader/api/0/tag/list
)
[_GET] => Array
(
[client] => newsplus
[output] => json
[ck] => 1538982165918
)
)
[Mon, 08 Oct 2018 09:02:46 +0200] [debug] --- Array
(
[date] => 2018-10-08T09:02:46+02:00
[headers] => Array
(
[Connection] => Keep-Alive
[Accept-Encoding] => gzip
[Authorization] => GoogleLogin auth=test/ABCDEF0123456789
)
[_SERVER] => Array
(
[PATH_TRANSLATED] => /usr/share/FreshRSS/reader/api/0/subscription/list
[PATH_INFO] => /reader/api/0/subscription/list
[REQUEST_URI] => /api/greader.php/reader/api/0/subscription/list?client=newsplus&output=json&ck=1538982165918
[QUERY_STRING] => client=newsplus&output=json&ck=1538982165918
[REQUEST_METHOD] => GET
[HTTP_AUTHORIZATION] => GoogleLogin auth=test/ABCDEF0123456789
[PHP_SELF] => /api/greader.php/reader/api/0/subscription/list
)
[_GET] => Array
(
[client] => newsplus
[output] => json
[ck] => 1538982165918
)
)
[Mon, 08 Oct 2018 09:02:49 +0200] [debug] --- Array
(
[date] => 2018-10-08T09:02:49+02:00
[headers] => Array
(
[Connection] => Keep-Alive
[Accept-Encoding] => gzip
[Authorization] => GoogleLogin auth=test/ABCDEF0123456789
)
[_SERVER] => Array
(
[PATH_TRANSLATED] => /usr/share/FreshRSS/reader/api/0/stream/contents/user/-/state/com.google/reading-list
[PATH_INFO] => /reader/api/0/stream/contents/user/-/state/com.google/reading-list
[REQUEST_URI] => /api/greader.php/reader/api/0/stream/contents/user%2F-%2Fstate%2Fcom.google%2Freading-list?client=newsplus&ck=1538982165918&xt=user/-/state/com.google/read&ot=1538978853&n=1000&r=n
[QUERY_STRING] => client=newsplus&ck=1538982165918&xt=user/-/state/com.google/read&ot=1538978853&n=1000&r=n
[REQUEST_METHOD] => GET
[HTTP_AUTHORIZATION] => GoogleLogin auth=test/ABCDEF0123456789
[PHP_SELF] => /api/greader.php/reader/api/0/stream/contents/user/-/state/com.google/reading-list
)
[_GET] => Array
(
[client] => newsplus
[ck] => 1538982165918
[xt] => user/-/state/com.google/read
[ot] => 1538978853
[n] => 1000
[r] => n
)
)
[Mon, 08 Oct 2018 09:02:50 +0200] [debug] --- Array
(
[date] => 2018-10-08T09:02:50+02:00
[headers] => Array
(
[Connection] => Keep-Alive
[Accept-Encoding] => gzip
[Authorization] => GoogleLogin auth=test/ABCDEF0123456789
)
[_SERVER] => Array
(
[PATH_TRANSLATED] => /usr/share/FreshRSS/reader/api/0/stream/items/ids
[PATH_INFO] => /reader/api/0/stream/items/ids
[REQUEST_URI] => /api/greader.php/reader/api/0/stream/items/ids?output=json&s=user%2F-%2Fstate%2Fcom.google%2Freading-list&xt=user/-/state/com.google/read&n=10000&r=n
[QUERY_STRING] => output=json&s=user%2F-%2Fstate%2Fcom.google%2Freading-list&xt=user/-/state/com.google/read&n=10000&r=n
[REQUEST_METHOD] => GET
[HTTP_AUTHORIZATION] => GoogleLogin auth=test/ABCDEF0123456789
[PHP_SELF] => /api/greader.php/reader/api/0/stream/items/ids
)
[_GET] => Array
(
[output] => json
[s] => user/-/state/com.google/reading-list
[xt] => user/-/state/com.google/read
[n] => 10000
[r] => n
)
)
[Mon, 08 Oct 2018 09:02:50 +0200] [debug] --- Array
(
[date] => 2018-10-08T09:02:50+02:00
[headers] => Array
(
[Connection] => Keep-Alive
[Accept-Encoding] => gzip
[Authorization] => GoogleLogin auth=test/ABCDEF0123456789
)
[_SERVER] => Array
(
[PATH_TRANSLATED] => /usr/share/FreshRSS/reader/api/0/stream/contents/user/-/state/com.google/starred
[PATH_INFO] => /reader/api/0/stream/contents/user/-/state/com.google/starred
[REQUEST_URI] => /api/greader.php/reader/api/0/stream/contents/user%2F-%2Fstate%2Fcom.google%2Fstarred?client=newsplus&ck=1538982165918&xt=user/-/state/com.google/read&ot=1538978853&n=1000&r=n
[QUERY_STRING] => client=newsplus&ck=1538982165918&xt=user/-/state/com.google/read&ot=1538978853&n=1000&r=n
[REQUEST_METHOD] => GET
[HTTP_AUTHORIZATION] => GoogleLogin auth=test/ABCDEF0123456789
[PHP_SELF] => /api/greader.php/reader/api/0/stream/contents/user/-/state/com.google/starred
)
[_GET] => Array
(
[client] => newsplus
[ck] => 1538982165918
[xt] => user/-/state/com.google/read
[ot] => 1538978853
[n] => 1000
[r] => n
)
)
[Mon, 08 Oct 2018 09:02:51 +0200] [debug] --- Array
(
[date] => 2018-10-08T09:02:51+02:00
[headers] => Array
(
[Connection] => Keep-Alive
[Accept-Encoding] => gzip
[Authorization] => GoogleLogin auth=test/ABCDEF0123456789
)
[_SERVER] => Array
(
[PATH_TRANSLATED] => /usr/share/FreshRSS/reader/api/0/stream/contents/user/-/state/com.google/starred
[PATH_INFO] => /reader/api/0/stream/contents/user/-/state/com.google/starred
[REQUEST_URI] => /api/greader.php/reader/api/0/stream/contents/user%2F-%2Fstate%2Fcom.google%2Fstarred?client=newsplus&ck=1538982165918&n=1000&r=n
[QUERY_STRING] => client=newsplus&ck=1538982165918&n=1000&r=n
[REQUEST_METHOD] => GET
[HTTP_AUTHORIZATION] => GoogleLogin auth=test/ABCDEF0123456789
[PHP_SELF] => /api/greader.php/reader/api/0/stream/contents/user/-/state/com.google/starred
)
[_GET] => Array
(
[client] => newsplus
[ck] => 1538982165918
[n] => 1000
[r] => n
)
)
[Mon, 08 Oct 2018 09:02:52 +0200] [debug] --- Array
(
[date] => 2018-10-08T09:02:52+02:00
[headers] => Array
(
[Connection] => Keep-Alive
[Accept-Encoding] => gzip
[Authorization] => GoogleLogin auth=test/ABCDEF0123456789
)
[_SERVER] => Array
(
[PATH_TRANSLATED] => /usr/share/FreshRSS/reader/api/0/stream/items/ids
[PATH_INFO] => /reader/api/0/stream/items/ids
[REQUEST_URI] => /api/greader.php/reader/api/0/stream/items/ids?output=json&s=user%2F-%2Fstate%2Fcom.google%2Fstarred&n=10000&r=n
[QUERY_STRING] => output=json&s=user%2F-%2Fstate%2Fcom.google%2Fstarred&n=10000&r=n
[REQUEST_METHOD] => GET
[HTTP_AUTHORIZATION] => GoogleLogin auth=test/ABCDEF0123456789
[PHP_SELF] => /api/greader.php/reader/api/0/stream/items/ids
)
[_GET] => Array
(
[output] => json
[s] => user/-/state/com.google/starred
[n] => 10000
[r] => n
)
)
Do not hesitate to ask again, but please consider this synchronisation strategy.
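For reference, here is a rough sketch of how a client could build the seven URLs of that global synchronisation. The paths and the xt, ot, n, r, s and output parameters are taken from the log above; the server address and the function names are placeholders, and the client and ck parameters that News+ sends are left out for brevity.

```python
from urllib.parse import quote, urlencode

BASE = "https://rss.example.net/api/greader.php"  # hypothetical server address
READ = "user/-/state/com.google/read"
READING_LIST = "user/-/state/com.google/reading-list"
STARRED = "user/-/state/com.google/starred"


def contents_url(stream, **params):
    """URL for stream/contents; the stream id is percent-encoded in the path."""
    path = f"/reader/api/0/stream/contents/{quote(stream, safe='')}"
    return f"{BASE}{path}?{urlencode(params, safe='/')}"


def ids_url(stream, **params):
    """URL for stream/items/ids; the stream id goes in the s parameter."""
    query = {"output": "json", "s": stream, **params}
    return f"{BASE}/reader/api/0/stream/items/ids?{urlencode(query)}"


def global_sync_urls(last_sync):
    """Return the seven requests in the order shown in the log."""
    return [
        f"{BASE}/reader/api/0/tag/list?output=json",
        f"{BASE}/reader/api/0/subscription/list?output=json",
        contents_url(READING_LIST, xt=READ, ot=last_sync, n=1000, r="n"),
        ids_url(READING_LIST, xt=READ, n=10000, r="n"),
        contents_url(STARRED, xt=READ, ot=last_sync, n=1000, r="n"),
        contents_url(STARRED, n=1000, r="n"),
        ids_url(STARRED, n=10000, r="n"),
    ]


for url in global_sync_urls(1538978853):
    print(url)
```

Each request also needs the Authorization header shown in the log, which this sketch does not handle.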
@Alkarex Sounds like a good thing to stick in the docs which should make it easier to find through a search engine, maybe somewhere in https://freshrss.github.io/FreshRSS/en/developers/01_First_steps.html?
First of all, thank you for your answer.
Here is the way Readrops handles synchronization.
One of the main functionalities of Readrops is to provide an offline experience. Therefore, a large quantity of items is fetched and stored locally when doing the initial synchronization.
Steps:
1. /reader/api/0/subscription/list
2. /reader/api/0/tag/list
3. /reader/api/0/stream/contents/user/-/state/com.google/reading-list

Steps:
1. /reader/api/0/edit-tag
2. /reader/api/0/edit-tag
3. /reader/api/0/subscription/list
4. /reader/api/0/tag/list
5. /reader/api/0/stream/contents/user/-/state/com.google/reading-list

The way Readrops handles synchronization is more or less the same as what you described, except that Readrops doesn't fetch starred items and makes one query per item read state.
The initial point of my issue was to know whether there is a way to get the items modified since a precise time. This would make it possible to keep a coherent read/unread state for all items across all platforms.
Do I have to conclude that there is no way to do this?
One of the main functionalities of Readrops is to provide an offline experience.
Speaking just for myself of course, but I doubt I'd even consider using a third-party client except for the offline experience. ;-)
(That's why I currently use EasyRSS.)
Yeah, I also believe that it is important to have offline access to one's feeds. Personally, not having offline access wouldn't bother me that much, because the situations where I don't have any connection (that damned RER A) are infrequent and I can do something else.
makes one query per item read state
@Shinokuni Could you please explain that again?
Please check requests 4 and 7.
makes one query per item read state
@Shinokuni Could you please explain that again?
Yeah, sorry. I meant one request to mark items as read and one request to mark items as unread with /reader/api/0/edit-tag.
Please check requests 4 and 7.
This is interesting. If I use the parameter ot (newer than), do you know if I will get the ids of the latest modified items or only the latest items? Using ot would avoid fetching an arbitrary number of ids to catch all modified items: only the ids of new and modified items would be returned.
No, it is not the date when the items were modified, but the date when they were discovered / added to the database. They are still the best calls to get the states, as they only retrieve IDs.
Does this mean that if I change the read state of an item created three months ago, I will have to fetch three months of item ids to get it? In that case, it won't be useful because it is too expensive.
I agree that the API could surely be improved. We could make some additions (I am open to that), but changing the behaviour of existing calls risks breaking other clients obeying the Google Reader API.
In any case, there are many more items on the server than on the client, so the client needs to make reasonable calls.
When you want the states and ask the IDs, you ask only the unread ones (The IDs not in the list are read). The length of that list is at max the number of unread articles on the server, and can be limited by number and date, so it is not that bad.
In practice, I have in general between 1k and 4k unread articles, 300k+ read articles, ~160 feeds, ~17 categories, 400+ favourites, ~10 tags. A full sync in News+ takes about ~3s.
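To illustrate the rule that "the IDs not in the list are read", here is a small sketch of the reconciliation on the client side. The function name and the sample ids are invented for the example, and it assumes the unread ID list returned by the server was not truncated by the n limit.

```python
def reconcile_read_state(local_item_ids, unread_ids_from_server):
    """Map each local item id to its read state.

    An item is considered read when its id is absent from the unread list.
    Assumption: the unread list is complete (not cut off by the n parameter).
    """
    unread = set(unread_ids_from_server)
    return {item_id: item_id not in unread for item_id in local_item_ids}


# Hypothetical ids: "a" and "c" are still unread on the server, "b" was read there.
local_ids = ["a", "b", "c"]
states = reconcile_read_state(local_ids, ["a", "c"])
print(states)
```

If the unread list is limited by number or date, ids that fall outside that limit should be left untouched rather than marked as read.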
When you want the states and ask the IDs, you ask only the unread ones (The IDs not in the list are read). The length of that list is at max the number of unread articles on the server, and can be limited by number and date, so it is not that bad.
You are right, it is not that bad. But I will still look for a way to limit this, to avoid fetching all unread item ids. I have 4k unread items on my personal account, so 4k local items to update if I fetch all of them. I don't mind when doing the initial sync, but for a regular one, it is not insignificant.
I agree that the API could surely be improved. We could make some additions (I am open to that)
That's nice!
I see two ways to handle read state synchronization:
1. reader/api/0/stream/contents/user/-/state/com.google/reading-list. The item list would be sorted by last modified date, the insertion date counting as the first modification time. This change could break existing API clients by returning items that already exist locally. If the client doesn't have any kind of upsert strategy, this will create duplicates.
2. /reader/api/0/stream/items/ids. Apply the same strategy as in the first point. This wouldn't break anything because only the order would be modified.

Anyway, a big thanks for taking the time to answer me. I will investigate the /reader/api/0/stream/items/ids solution.
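For what it's worth, the upsert strategy mentioned in the first point could look like the sketch below on the client side. The table and column names are invented for the example, and the ON CONFLICT clause needs SQLite 3.24 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical local schema: one row per item, keyed by the server-side id.
conn.execute(
    "CREATE TABLE item (remote_id TEXT PRIMARY KEY, title TEXT, read INTEGER NOT NULL)"
)


def upsert_items(conn, items):
    """Insert new items and update the ones that already exist locally."""
    with conn:
        conn.executemany(
            """
            INSERT INTO item (remote_id, title, read)
            VALUES (:remote_id, :title, :read)
            ON CONFLICT(remote_id) DO UPDATE SET
                title = excluded.title,
                read = excluded.read
            """,
            items,
        )


# The same item arrives twice: first unread, then read after a change on the server.
upsert_items(conn, [{"remote_id": "item-1", "title": "Hello", "read": 0}])
upsert_items(conn, [{"remote_id": "item-1", "title": "Hello", "read": 1}])
print(conn.execute("SELECT remote_id, read FROM item").fetchall())
```

With this kind of statement, receiving an already known item a second time updates its row instead of creating a duplicate.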
@Shinokuni I have tested your client today, and it looks very good already, congrats :-)
https://github.com/FreshRSS/FreshRSS/pull/2798
Closing here, but do not hesitate to ask again, especially if you need any documentation / feedback
Hello, as promised in https://github.com/readrops/Readrops/issues/53#issuecomment-721269522, here is a post about FreshRSS synchronization in Readrops. Due to a lack of time, I wasn't able to write it sooner.
I recently (more or less) worked on adding FreshRSS starred items to Readrops, and it made me work again on item read state synchronization. I didn't really encounter problems managing the requests, but things were more difficult on the db side.
First, SQLite limits the number of bound parameters in a statement to 999 (the default before SQLite 3.32.0), which caps the number of arguments you can give to an IN operator. It means that when you get more than a thousand item ids with /reader/api/0/stream/items/ids, you have to split them and run multiple queries to update the state of each item, which is really slow. This also affects the star state synchronization.
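To make the splitting concrete, here is a simplified sketch of the batched update (not the actual Readrops code, which runs on Android). The table and column names are hypothetical and the 999 limit is hard-coded as an assumption about the SQLite build in use.

```python
import sqlite3

MAX_VARIABLES = 999  # limit quoted above; assumed to apply to the SQLite build in use


def chunked(values, size=MAX_VARIABLES):
    """Yield consecutive slices of at most `size` elements."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def mark_unread(conn, unread_ids):
    """Set read = 0 for the given ids, with one UPDATE per batch of ids."""
    with conn:
        for batch in chunked(unread_ids):
            placeholders = ",".join("?" * len(batch))
            conn.execute(
                f"UPDATE item SET read = 0 WHERE remote_id IN ({placeholders})",
                batch,
            )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (remote_id TEXT PRIMARY KEY, read INTEGER NOT NULL)")
conn.executemany("INSERT INTO item VALUES (?, 1)", ((f"item-{n}",) for n in range(2500)))

# 2000 ids need three UPDATE statements (999 + 999 + 2 parameters).
mark_unread(conn, [f"item-{n}" for n in range(2000)])
print(conn.execute("SELECT COUNT(*) FROM item WHERE read = 0").fetchone()[0])
```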
D/FreshRSSRepository: FreshRSS sync timer: 704 ms, server queries
D/FreshRSSRepository: FreshRSS sync timer: 9 ms, folders insertion
D/FreshRSSRepository: FreshRSS sync timer: 84 ms, feeds insertion
D/FreshRSSRepository: FreshRSS sync timer: 0 ms, items insertion
D/FreshRSSRepository: FreshRSS sync timer: 495 ms, starred items insertion
D/FreshRSSRepository: FreshRSS sync timer: 528 ms, update starred items state
D/FreshRSSRepository: FreshRSS sync timer: 1071 ms, reset read changes
D/FreshRSSRepository: FreshRSS sync timer: 2322 ms, reset star changes
D/FreshRSSRepository: FreshRSS sync timer: end, 5213 ms
Here is a log of the synchronization after I implemented the fetching of starred items. The starred items insertion checks for each item whether it already exists in the db and inserts it if not, which is pretty slow given that the test data contained only about 10 starred items. Then the star state of the items is updated with the ids from /reader/api/0/stream/items/ids, which is also very slow. Finally, the local read/star flags, which indicate whether an item had one of these states modified, are reset.
This synchronization doesn't include the update of the read state and doesn't fetch any new items, yet it lasts 5 seconds, which is way too much. I had to improve all of this.
Here is the new request strategy:
1. reader/api/0/tag/list
2. reader/api/0/subscription/list
3. reader/api/0/stream/contents/user/-/state/com.google/reading-list
4. reader/api/0/stream/items/ids
5. reader/api/0/stream/contents/user/-/state/com.google/starred

I don't make any further calls, as they are not needed with the database strategy below.
I added a few new tables:
- one for the unread item ids returned by /reader/api/0/stream/items/ids
- one for the starred items returned by reader/api/0/stream/contents/user/-/state/com.google/starred

Instead of directly updating each item read state with the ids (limited to 999) in the query, all unread item ids are stored in a new table and then used to update the read state. Before inserting the unread item ids into the new table, all ids from the previous synchronization are deleted. This process makes the update faster, even if it's not perfect.
Instead of dealing with starred item ids, only the starred items themselves are fetched and stored in a separate table. This way, no extra request or query is needed to update the read and star state of starred items. Before the insertion, all items inserted during the previous synchronization are deleted. The fetch of starred items is limited to 1000 for performance.
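The unread ids table approach can be sketched like this (a simplified illustration, not the actual Readrops code; the table and column names are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE item (remote_id TEXT PRIMARY KEY, read INTEGER NOT NULL);
    CREATE TABLE unread_item_id (remote_id TEXT PRIMARY KEY);
    """
)


def sync_read_state(conn, unread_ids):
    """Replace the stored unread ids, then derive every item's read state from them."""
    with conn:
        # Drop the ids stored by the previous synchronization.
        conn.execute("DELETE FROM unread_item_id")
        conn.executemany(
            "INSERT INTO unread_item_id (remote_id) VALUES (?)",
            ((item_id,) for item_id in unread_ids),
        )
        # A single UPDATE with a subquery: no bound-parameter limit is involved.
        conn.execute(
            "UPDATE item SET read = remote_id NOT IN (SELECT remote_id FROM unread_item_id)"
        )


conn.executemany("INSERT INTO item VALUES (?, 0)", [("a",), ("b",), ("c",)])
sync_read_state(conn, ["b"])  # only "b" is still unread on the server
print(conn.execute("SELECT remote_id, read FROM item ORDER BY remote_id").fetchall())
```

The ids are inserted one row at a time with a single bound parameter each, so the size of the id list no longer matters for the update query.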
D/FreshRSSRepository: FreshRSS sync timer: 530 ms, server queries
D/FreshRSSRepository: FreshRSS sync timer: 10 ms, folders insertion
D/FreshRSSRepository: FreshRSS sync timer: 72 ms, feeds insertion
D/FreshRSSRepository: FreshRSS sync timer: 0 ms, items insertion
D/FreshRSSRepository: FreshRSS sync timer: 7 ms, starred items insertion
D/FreshRSSRepository: FreshRSS sync timer: 760 ms, insert and update items ids
D/FreshRSSRepository: FreshRSS sync timer: end, 1384 ms
The result is a lot better. Some steps were removed and others improved. Of course, this result was obtained in good conditions: fast Wi-Fi connection, fast phone (OP 6), no new items and very few starred items. A synchronization in less favourable conditions would last around 3 seconds.
I didn't write anything about pushing read/star state changes from Readrops. I have two solutions here:
Feel free to suggest changes, I'm totally open to modifications.
@Shinokuni Thanks for the update; that looks very good, congrats 👍
Regarding pushing state changes, I suggest a hybrid approach: you need to maintain a list of state changes anyway (e.g. to change the state of multiple articles at once, or in case a request fails, the phone is offline, etc.), and push it regularly (when synchronising, but also at significant events, for instance when changing view, or before closing the app, if you can catch that).
Keep up the good work!
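A possible shape for such a list of pending state changes, as a rough sketch only: the class and method names are invented, and the actual sending of the /reader/api/0/edit-tag requests is left out.

```python
class PendingStateChanges:
    """Local queue of read-state changes waiting to be pushed to the server."""

    def __init__(self):
        self._read = {}  # item id -> desired read state (True = read)

    def record(self, item_id, read):
        """Remember the latest desired state; a later change overrides an earlier one."""
        self._read[item_id] = read

    def batches(self):
        """Return (ids to mark as read, ids to mark as unread), one batch per request."""
        to_read = sorted(i for i, read in self._read.items() if read)
        to_unread = sorted(i for i, read in self._read.items() if not read)
        return to_read, to_unread

    def acknowledge(self, item_ids):
        """Forget changes once the server has accepted them."""
        for item_id in item_ids:
            self._read.pop(item_id, None)


changes = PendingStateChanges()
changes.record("a", True)
changes.record("b", True)
changes.record("b", False)  # the user changed their mind while offline
to_read, to_unread = changes.batches()
print(to_read, to_unread)
```

A change is removed only after the corresponding request succeeded, so a failed request or an offline phone simply leaves it in the queue for the next push.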
Thanks for the suggestion, I will think about it!