Apps-android-commons: "Page does not exist" near some usernames in Explore

Created on 23 Jan 2020 · 36 Comments · Source: commons-app/apps-android-commons

  1. Go to Explore
  2. Scroll until you see a username with "page does not exist" appended. Example:
    Screenshot_20200123-202312_Commons
    Expected: Just the username.

First figure out where this string comes from (server-side? app side?) then make it not be generated, or if impossible just filter it out.

assigned bug explore good first issue

All 36 comments

can I Work on this? @nicolas-raoul

First figure out where this string comes from (server-side? app side?) then make it not be generated, or if impossible just filter it out.

@vibhusharma101 Sure. Please help in checking this and comment here. Also, I see that you have requested to work on a lot of issues at once. I would suggest that you start with just 1 issue at first and check again for any unassigned issues.

ok Sir, I will take care about this thing in future

@maskaravivek @nicolas-raoul Sir, this is a server-side error, because the author name is set in the GridViewAdapter class, which takes the information we get back from the API call. We can filter this string out, because a user can give any name to the author text view. So the easy option is to filter it out of the string in the author's TextView.

@vibhusharma101 Great investigation! Could you please post here the whole HTTP response that contains that string? I think you can see it in the logs or by putting a debug breakpoint on a low level network method. Thanks :-)

Sir, I am not able to use the app in testing mode. I have to use the production version of the app because I am not able to log in, and if I am not logged in I am not able to explore the images. I have also created an issue about the same thing.

@vibhusharma101 You can see debug logs in the production app as well.

run :app:installProdDebug.

@maskaravivek ok sir

@nicolas-raoul {"error":{"code":"missingtitle","info":"The page you specified doesn't exist.","*":"See https://commons.wikimedia.beta.wmflabs.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce> for notice of API deprecations and breaking changes."},"servedby":"deployment-mediawiki-09"}
This is the JSON we are receiving for the request in which "page does not exist" appears.

And this is the URL that sends this request, taken from Android Studio:
https://commons.wikimedia.beta.wmflabs.org/w/api.php?format=json&action=parse&prop=text&page=File%20talk%3ACenter%20Street%20School%20Branch%20Library.jpg

@vibhusharma101 Are you still working on this, or can I take this issue?

This is a client side error.
    private static String getArtist(ExtMetadata metadata) {
        try {
            String artistHtml = metadata.artist();
            return artistHtml.substring(artistHtml.indexOf("title=\""), artistHtml.indexOf("\">"))
                    .replace("title=\"User:", "");
        } catch (Exception ex) {
            return "";
        }
    }
This method in Media fetches the username with the "page does not exist" string appended.
It is possible to format the string here and return only the username.
Also, if it is possible to prevent the string from being generated, please let me know.
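A minimal sketch of the "format the string here" option, not the fix that was actually merged. It assumes the Artist HTML is a MediaWiki user link whose title attribute reads `User:Name (page does not exist)` when the user page is missing, and that the suffix is the English one. The class name `ArtistNameSketch` and the method `extractUserName` are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the app: pulls a username out of the Artist HTML.
final class ArtistNameSketch {
    // Assumption: the link looks like <a ... title="User:Name ...">Name</a>.
    private static final Pattern USER_TITLE = Pattern.compile("title=\"User:([^\"]*)\"");
    // Assumption: this English suffix is appended to the title of links to missing pages.
    private static final String MISSING_SUFFIX = " (page does not exist)";

    static String extractUserName(String artistHtml) {
        if (artistHtml == null) {
            return "";
        }
        Matcher matcher = USER_TITLE.matcher(artistHtml);
        if (!matcher.find()) {
            return ""; // not a user link, e.g. a free-text author name
        }
        String name = matcher.group(1);
        if (name.endsWith(MISSING_SUFFIX)) {
            // Drop the suffix so only the username remains.
            name = name.substring(0, name.length() - MISSING_SUFFIX.length());
        }
        return name.trim();
    }
}
```

Unlike the `substring`/`indexOf` version, this reads only the inside of the title attribute, so attributes that follow it do not leak into the result. It still returns an empty string when the Artist field is not a user link.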
2020-04-03 00:56:08.307 6584-6650/fr.free.nrw.commons D/OkHttp: --> GET https://commons.wikimedia.org/w/api.php?action=query&format=json&formatversion=2&prop=imageinfo&iiprop=url|extmetadata&iiurlwidth=640&iiextmetadatafilter=DateTime|Categories|GPSLatitude|GPSLongitude|ImageDescription|DateTimeOriginal|Artist|LicenseShortName|LicenseUrl&titles=File%3AEilat%20Dolphin%20Reef%20%283%29.jpg

2020-04-03 00:56:09.093 6584-6650/fr.free.nrw.commons D/OkHttp: <-- 200 https://commons.wikimedia.org/w/api.php?action=query&format=json&formatversion=2&prop=imageinfo&iiprop=url|extmetadata&iiurlwidth=640&iiextmetadatafilter=DateTime|Categories|GPSLatitude|GPSLongitude|ImageDescription|DateTimeOriginal|Artist|LicenseShortName|LicenseUrl&titles=File%3AEilat%20Dolphin%20Reef%20%283%29.jpg (785ms, unknown-length body)

Also there is one more ambiguity in username. The string isn't formatted properly.
image

@318anushka : No news from vibhusharma101 so you can take it.

I have submitted a PR which formats the string and works just fine, but I'm unable to figure out how the string is generated. The media object fetched from the imageInfo API already contains it.

This ticket has caused a great deal of confusion on my part and deeply overlong comments, my apologies anushka.

@nicolas-raoul do you know if this Uploaded by text field is supposed to represent the User ie the username a user uses to login Or is it supposed to be the Author ie the username or the custom author name should one be present?

Just wanted to share a quick comment:

@nicolas-raoul do you know if this Uploaded by text field is supposed to represent the User ie the username a user uses to login ...

EDITED by macgills:
I believe @macgills means: the "uploaded by" text field has the Username of the user who uploaded it and not the currently logged in user.
Just in case someone gets confused (I was, initially). 🙂

No problem @macgills It's best to clarify before updating pr.
And an account in Commons is created through the app, so aren't both usernames (the one in Commons in general and the one in the app) the same, unless the user intentionally creates separate accounts?

I think sivaraam is referring to a non Commons username? Like our Github unames? I don't understand at all. Is there a separate name field we can fill out on a user page or something like that?

The app is commons and commons is the app so when I said username I meant "the username of a user of commons" as is understood on this ticket in title and description. What you put in the username field
image

It looks like my comment led to more confusion than it cleared up. Apologies for that.

I think sivaraam is referring to a non Commons username? Like our Github unames? I don't understand at all. Is there a separate name field we can fill out on a user page or something like that?

I'm not referring to any other username. I'm just referring to the username used to log-in to Commons. I was just trying to avoid a reader misinterpreting the following ...

@nicolas-raoul do you know if this Uploaded by text field is supposed to represent the User ie the username a user uses to login ...

... as the username of the user who is using our app to view the 'Explore' screen instead of as the username of the user who uploaded the image for which the 'Uploaded by:' is shown. Hope that clarifies my intention. Let me know if it doesn't.

The app is commons and commons is the app ...

Well. In a more general sense, not specific to this issue, if I were to interpret that literally: it's not true. Wikimedia Commons is a lot larger. You can do a lot more using the web version of Commons than you can with the app. The app is still limited in its features. There's a lot of room for expansion. For instance, our app is primarily targeted at images, while you can upload all kinds of files to Commons. Anyway, that's not the topic of our discussion, so I'll leave it at that.

@nicolas-raoul Could you weigh in? It is blocking this PR currently.

Sorry, I don't have enough knowledge in this area to provide an insightful answer 😥

@misaochan any insights here? starting this comment

My guess is that it should be the Author, however I cannot find relevant documentation to prove that. Can we not just filter the "page does not exist" string out?

We can, but this PR introduced a new user field, which I do not want because Media already has enough fields for my liking. Hopefully we can rename Media.creator to Media.author to represent what the field actually is, because there is quite a confusing soup of user/author/creator/uploadedBy/artist and I am trying to make sense of it.

In that case, only changing this should be fine. We can remove the user field if it's not required and maybe rename a few others.

This is the API that we use to fetch images in explore page and it doesn't return author/createdby/user fields. So I had to extract the author name from Artist, which is an HTML field. I parse the HTML to extract the author name. As this field isn't very reliable, I added a separate property to Media instead of merging it with author/createdby/user etc. Let's evaluate whether there's a way to consolidate things.

https://commons.wikimedia.org/w/api.php?action=query&format=json&formatversion=2&generator=categorymembers&gcmtype=file&gcmsort=timestamp&gcmdir=desc&prop=imageinfo&iiprop=url%7Cextmetadata&iiurlwidth=640&iiextmetadatafilter=DateTime%7CCategories%7CGPSLatitude%7CGPSLongitude%7CImageDescription%7CDateTimeOriginal%7CArtist%7CLicenseShortName%7CLicenseUrl&gcmtitle=Category:Mango&gcmlimit=10

This is the API that we use to fetch images in explore page and it doesn't return author/createdby/user fields.

I agree that the API we use doesn't provide an easy way to fetch a string for the "author" (the actual "creator"/"artist" of the image/media). When it comes to "user"/"createdby" (the person who uploaded it to commons) the API does provide a way to easily get that without having to do any HTML parsing. We just are not requesting that field to be returned in the API response now. We can get the user who "uploaded" the image by asking for user property to iiprop. Here's a random example to demonstrate that:

https://commons.wikimedia.org/w/api.php?action=query&format=json&prop=imageinfo&generator=categorymembers&formatversion=2&iiprop=url%7Cextmetadata%7Cuser&gcmtitle=Category%3AMango&gcmprop=ids%7Ctitle&gcmtype=file&gcmsort=timestamp&gcmdir=descending

See also: ApiSandbox link

This is how the user field was added in the PR.
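As a rough illustration of the request described above, the sketch below assembles a category-members query that adds `user` to `iiprop`. The parameter names and values are taken from the URLs quoted in this thread. The class `ExploreQuerySketch` and the method `buildUrl` are hypothetical and are not how the app builds its requests. It uses the `URLEncoder.encode(String, Charset)` overload, which needs Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds the Explore query with the uploader requested via iiprop=user.
final class ExploreQuerySketch {
    private static final String API = "https://commons.wikimedia.org/w/api.php";

    static String buildUrl(String categoryTitle, int limit) {
        // LinkedHashMap keeps the parameters in insertion order for readability.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("action", "query");
        params.put("format", "json");
        params.put("formatversion", "2");
        params.put("generator", "categorymembers");
        params.put("gcmtype", "file");
        params.put("gcmsort", "timestamp");
        params.put("gcmdir", "desc");
        params.put("gcmtitle", categoryTitle);
        params.put("gcmlimit", String.valueOf(limit));
        params.put("prop", "imageinfo");
        // "user" is the addition: it asks for the uploader of each file revision.
        params.put("iiprop", "url|extmetadata|user");

        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            query.add(entry.getKey() + "="
                    + URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
        }
        return API + "?" + query;
    }
}
```

Calling `ExploreQuerySketch.buildUrl("Category:Mango", 10)` yields a URL of the same shape as the example above, with the pipe characters percent-encoded as `%7C`.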

My guess is that it should be the Author, however I cannot find relevant documentation to prove that.

For the "Uploaded by" field, I believe that we should show the (user)name of the user who actually "uploaded" the image rather than the name of the person who "created" the image. That makes sense to me, as we say "Uploaded by" and not "Author". For a particular image in Commons, there is a chance that the uploader of the image is _different_ from the creator of the image. Here's just one example: File:Troy Hunt.jpg - Wikimedia Commons. The image was "uploaded" by the user "IagoQnsi" (as can be seen from the "File History" section) but the "Author" is "Troy Hunt", as he's the one who originally took the image.

This is what I've been trying to convey repeatedly in the PR #3620 discussion. For some reason which I couldn't understand, the people there don't seem convinced. I would be glad if someone could kindly point out what I'm conveying incorrectly. Also, correct me if I'm wrong somewhere.

Just wanted to add a few other minor things that came to mind about this Author/Uploader dichotomy.

  • The HTML parsing for the Author field _could fail at times_ (e.g. for Troy Hunt's image), which would result in an empty string being assigned to creator for that image. In case you're wondering why the "Author" field is missing for some images in the Explore screen, this is the reason. This interacts badly with the "Nominate for deletion" flow. Read the next point for the "why?"
  • We're currently posting the nomination for deletion notice on the user page of the Author/Creator of the image. A couple of issues with that:

    • The Author/Creator field of an image being nominated could be empty at times, for the reason I mentioned before. We currently throw a RuntimeException in those cases and do not nominate the image for deletion.

    • As correctly mentioned in issue #3464's description, it should be posted on the user page of the _user who uploaded_ the image. As there is a chance that the uploader is different from the creator, don't be surprised if we receive an issue in the future such as "App posts deletion request notifications ({{subst:idw}}) on the creator's talk page instead of uploader's talk page".

If we introduce an uploader field, we could do away with the HTML parsing of Author, post the deletion notice without worrying about the uploader field being null, and confidently post the deletion notice on the uploader's talk page without having to worry about the case where the image has a different author. These are the reasons I'm strongly arguing for introducing the uploader field. I'll be glad to hear counter-arguments, if any. If anyone identifies anything wrong with the above points, let me know.
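To make the argument concrete, here is a small sketch of choosing the deletion-notice target from the uploader rather than the author. Everything in it is hypothetical: `MediaInfo` is a trimmed-down stand-in for the app's Media class, and `talkPageFor` is an invented helper, not the app's deletion code. It assumes the uploader comes from the `user` value of `iiprop` and the author from the parsed Artist HTML.

```java
// Hypothetical sketch: pick the talk page for a deletion notice from the uploader.
final class DeletionNoticeSketch {

    // Trimmed-down stand-in for the app's Media class (assumed fields).
    static final class MediaInfo {
        final String author;   // parsed from the Artist HTML; may be empty
        final String uploader; // assumed to come from iiprop=user

        MediaInfo(String author, String uploader) {
            this.author = author;
            this.uploader = uploader;
        }
    }

    static String talkPageFor(MediaInfo media) {
        // The author is deliberately ignored: it can be empty or a different person.
        if (media.uploader == null || media.uploader.isEmpty()) {
            throw new IllegalStateException("Uploader missing from the API response");
        }
        return "User talk:" + media.uploader;
    }
}
```

With the Troy Hunt example from earlier, `talkPageFor(new MediaInfo("Troy Hunt", "IagoQnsi"))` gives `User talk:IagoQnsi`, and an empty author no longer blocks the nomination.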

I think there might be a misunderstanding here. When @macgills said "Author", I thought he meant (as per his edit) the "Username of the user who uploaded it".

When it comes to "user"/"createdby" (the person who uploaded it to commons) the API does provide a way to easily get that without having to do any HTML parsing. We just are not requesting that field to be returned in the API response now. We can get the user who "uploaded" the image by asking for user property to iiprop. Here's a random example to demonstrate that:

This makes sense to me. If no one has any objections, we can go with this, and perhaps rename all variables surrounding this to "uploadedBy" or something similar that is more descriptive?

Apologies that I have not been able to follow the PR, the discussion is very long.

image

Should have looked at this long ago, clearly it is the user. Or at least this is the closest analogue I can find.

My confusion is mostly cleared but we could improve the situation by removing getAuthorName from SessionManager, renaming creator in Media to author and instead of setting this author at upload time just create the object with author set to customAuthor if it exists else userName

I think these changes to the data flow are important enough to do now as part of this PR, as they made me stumble a lot and hold up this PR trying to figure out what's what.
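One possible reading of the "customAuthor if it exists else userName" suggestion, as a sketch only. The class `AuthorFallbackSketch` and the method `resolveAuthor` are hypothetical names. The sketch assumes that a null or blank custom author counts as "does not exist".

```java
// Hypothetical sketch of the proposed fallback: custom author first, username otherwise.
final class AuthorFallbackSketch {

    static String resolveAuthor(String customAuthor, String userName) {
        // Assumption: null or whitespace-only means no custom author was entered.
        if (customAuthor != null && !customAuthor.trim().isEmpty()) {
            return customAuthor.trim();
        }
        return userName;
    }
}
```

The object would then be created with `author` already set, instead of setting it separately at upload time.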

We really have to do the cleaning up of the author/artist/creator mess but I believe it's easier to do after all the dust settles. By which I mean it's best done after the replacement PR for #3620, PR #3566 are merged.

Generally, we now seem to agree that we _should not_ assume the "author" and "uploader" of an image to be the same person. The replacement PR for #3620 would fix this for the images shown in the "Explore" screen. For the same reasons, we're better off not assuming that the logged-in user is the "author" of an image shown in the Contribution. It's best to obtain the "author" information from the image metadata received from the server. I hope #3566 does this.

My confusion is mostly cleared but we could improve the situation by removing getAuthorName from SessionManager, renaming creator in Media to author and instead of setting this author at upload time just create the object with author set to customAuthor if it exists else userName

From what I understand, a better way to deal with this is to get the "author" information of an image from the API response. It's the best solution considering the fact that we show all contributions and _not just the ones_ that a user uploads using our app.

I just left a comment on anushka's PR, I'm interested in finishing it up.

@madisoncfallin We haven't heard from @318anushka in a while. So, feel free to take this up by addressing the pending review comments in the PR :)

Looks like the PR is well out of date. Unfortunately, most of the work that 318Anushka did was on files that were removed when the Media section was reorganized this summer.

@sivaraam I'll attempt to fix the issue in a new PR.

Hi @madisoncfallin, any progress on this?
