Apparently this file was first uploaded, then 17 minutes later overwritten with what seems to be the same file:
https://commons.wikimedia.org/wiki/File:King_Long_bus_on_line_1034.jpeg#filehistory
Even if the second upload was human error, I believe an overwrite should not happen.
This is strange. Whenever I upload images with the same filename, the app doesn't overwrite but appends a number to the end of the filename. Can you reproduce the bug?
I have not encountered this bug actually.
wow, that's pretty dangerous :-/
Easy way to accidentally deface many articles.
@misaochan I looked into the stats and two cases were yours:
https://commons.m.wikimedia.org/wiki/File:Hanmer_springs_-_1.jpeg
https://commons.m.wikimedia.org/wiki/File:Hanmer_springs_-_2.jpeg
Another thing I noticed was that all the overwrites were ".jpeg" or ".png" - no ".jpg". Is there "jpg" specific code that needs to be generalized?
I found a procedure to reproduce:
The multi upload issue above should be fixed by #232. However, I see single uploads are also overwritten (see the first link of this issue), so I guess this bug is not completely solved yet.
Here is the list of overwrites I have: overwrite_csv.txt. Do you see any patterns that indicate how they happened?
Several single-upload cases happened in this one month - at least we might want to try asking those users.
Overwrites seemed to have stopped in August 2016, but it looks like they have been back since November 2016:
http://tools.wmflabs.org/commons-app-stats/
Strange. The last release was 27th Oct, which only involved filtering years from the categories. I can't think of what might have caused this.
I have looked at a few of the recent overwrites. The pattern seems to be repeated identical uploads to the same new file, by the user who created it.
https://commons.wikimedia.org/wiki/File:Marstadter_See_-_6.jpeg
https://commons.wikimedia.org/wiki/File:Dresdner_umland_NO_November_2016_-_17.jpeg
https://commons.wikimedia.org/wiki/File:Relic_bibi_ka_alam.jpeg
https://commons.wikimedia.org/wiki/File:Indian_rupee_de_monitisation.jpeg
Just speculation, but does it have to do with an unstable or interrupted network connection? Can we detect an unstable connection somehow? Also, in the long term, maybe we want to provide an option of "Do not upload until connected to wifi" (=delayed upload).
Fortunately, I don't see recent cases of overwriting existing, unrelated files.
I thought that specific bug deserved a separate issue, so I opened one: "Prevent repeated uploads of the same image under the same file name #325"
Interesting. We can check if the device is connected to the internet via the ConnectivityManager class, but I think we would need to repeatedly call that throughout the duration of the upload if we wanted to prevent uploading with an unstable connection? The option to delay uploading until connected to wifi might be a better task to invest effort in IMO.
I don't think ConnectivityManager is necessary.
Just analyze the response of the HTTP server, that will tell you without a
doubt whether the query succeeded or failed.
If it failed, run the query again.
If it fails again, (maybe after a few retries), just give up and if
possible tell the user.
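The retry approach suggested above could look roughly like the following. This is a minimal sketch, not the app's code: `Attempt` and `uploadWithRetries` are hypothetical names, and the actual HTTP call and response parsing are left to the caller.

```java
import java.io.IOException;

public class RetryingUpload {

    /** Hypothetical stand-in for one HTTP upload attempt; returns true on a success response. */
    public interface Attempt {
        boolean run() throws IOException;
    }

    /**
     * Runs the attempt up to maxAttempts times.
     * Returns true as soon as one attempt reports success, false if all attempts fail.
     * An IOException is treated the same way as a failure response.
     */
    public static boolean uploadWithRetries(Attempt attempt, int maxAttempts) {
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                if (attempt.run()) {
                    return true;
                }
            } catch (IOException e) {
                // Treat as a failed attempt and fall through to the next try.
            }
        }
        // All attempts failed: the caller should give up and tell the user.
        return false;
    }
}
```

One caveat, assuming the repeated identical uploads noted earlier are network related: if a response is lost after the server has already stored the file, a blind retry re-sends the same file, so a real implementation would probably need to check whether the upload already landed before retrying.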
Someone else just encountered this issue at #230
Instead of adding the number only after the upload process has begun (which I think is the way it currently happens), is there a way to introduce an additional step where it happens before the upload process starts?
A great idea. I made #703 - please feel free to edit the description and title.
What is the current state of the issue?
Personally I see this as a big, parent issue - as long as any overwrite happens, it will stay open. We don't really know why many of the cases happen, but they need to be prevented eventually.
At the same time, I think sub-issues with known concrete steps to reproduce can be opened. (e.g. "overwrite happens after X is done on Android version Y")
Incidentally the web-based Commons Upload Wizard also has an overwrite problem: https://phabricator.wikimedia.org/T179967
I reproduced the problem.
I have the whole logcat, please ask me if you want it.
The problem is still happening in 2.8.1, see for instance https://commons.wikimedia.org/wiki/File:Venner%C3%B8d_skole.jpeg
I accidentally posted a new issue for this at #1837 . @PeterFisk could you please get back to us, letting us know what Android version you are using (can be found in your phone settings), whether you reinstalled the app between the two uploads (unlikely, looking at the timing), and if you had any connectivity issues while uploading the overwriting file?
If there is indeed a connection with .jpeg files, we need to bump up the priority of #1825 then.
Device and Android version:
Samsung Galaxy S9, SM-G960F
Android 8.0.0
Stock version Samsung Experience 9.0
I did not reinstall between the two image uploads. I accidentally gave them the same name, but expected to get a warning or at least automatic rename of the file.
PeterFisk
I tested the .jpeg hypothesis. It turns out that there is indeed a connection with .jpeg! And unfortunately, now all files submitted via "Share" turn into .jpegs. https://commons.wikimedia.org/wiki/File:Royal_arcade_Melbourne.jpeg overrides, https://commons.wikimedia.org/wiki/File:Creperie_Brisbane_2.jpg does not.
This warrants more investigation, but for the time being I think the simplest way to get a hotfix out quickly is to solve the .jpeg problem #1825. @neslihanturan are you able to do this soon? If you are, please send the fix to 2.8-release so that I can release a hotfix for it ASAP.
Just documenting the results of my debugging here, after delving into the legacy code. The culprit is this block of code in UploadService:
    try {
        filename = Utils.fixExtension(contribution.getFilename(), MimeTypeMap.getSingleton().getExtensionFromMimeType((String)contribution.getTag("mimeType")));
        synchronized (unfinishedUploads) {
            Timber.d("making sure of uniqueness of name: %s", filename);
            filename = findUniqueFilename(filename);
            unfinishedUploads.add(filename);
        }
This code calls fixExtension on the filename. fixExtension doesn't just FIX the extension; it is solely responsible for appending any extension at all prior to the uniqueness check. Before fixExtension is called, the filename never has an extension, it is always "filename". Depending on the result of MimeTypeMap.getSingleton().getExtensionFromMimeType((String)contribution.getTag("mimeType")), ".jpg" is added if the mimeType is image/jpeg; if the mimeType is null, fixExtension just silently returns "filename" unchanged.
So when mimeType is null, "filename", rather than "filename.jpg", gets passed untouched to findUniqueFilename(). "filename" is then sent to the fileExistsWithName() method in ApacheHttpClientMediaWikiApi.java,
    @Override
    public boolean fileExistsWithName(String fileName) throws IOException {
        return api.action("query")
                .param("prop", "imageinfo")
                .param("titles", "File:" + fileName)
                .get()
                .getNodes("/api/query/pages/page/imageinfo").size() > 0;
    }
which never returns a match, because there is no such file as "filename" in Commons: all Commons images have suffixes.
After silently passing this test, "filename" automagically gets a .jpeg suffix at the end of its upload. I haven't had the time yet to find out if this is done on our side or on the server's side.
Basically, aside from a few engineering issues, this current overwrite issue is caused by MimeTypeMap.getSingleton().getExtensionFromMimeType((String)contribution.getTag("mimeType")) returning null. Not sure if it's faster to find out why it's returning null sometimes when Share is used, or to refactor the code to ensure that the uniqueness check is never done with an empty filename suffix. I suppose the absolute fastest way is to simply make fixExtension add ".jpg" even if the tag is null.... but not sure what the side effects of that would be.
Have to head off now as it's 4am, but will be back to look into it tomorrow. @neslihanturan hopefully this might be of help to you, as #1825 appears to be highly linked. I have a branch up with lots of logs that may be helpful, too - https://github.com/misaochan/apps-android-commons/tree/fix-jpeg-overwrite
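To illustrate the failure mode described in this comment, here is a self-contained sketch. It is not the app's code: the class and method names are hypothetical, an in-memory set stands in for Commons, and `finalTitle` encodes the assumption (not yet verified above) that something appends ".jpeg" to an extensionless name at the end of the upload.

```java
import java.util.HashSet;
import java.util.Set;

public class ExtensionGuardSketch {

    // Hypothetical stand-in for the titles that already exist on the server.
    private final Set<String> existing = new HashSet<>();

    public void addExisting(String title) {
        existing.add(title);
    }

    /** Stand-in for fileExistsWithName(): an exact title match only. */
    public boolean fileExistsWithName(String name) {
        return existing.contains(name);
    }

    /** Assumption: a name without any suffix ends up stored with ".jpeg" appended. */
    public static String finalTitle(String name) {
        return name.indexOf('.') == -1 ? name + ".jpeg" : name;
    }

    /** Unguarded: checks the name exactly as given, so "photo" never collides with "photo.jpeg". */
    public boolean passesUnguardedCheck(String name) {
        return !fileExistsWithName(name);
    }

    /** Guarded: resolves the title the file will end up with first, then checks that title. */
    public boolean passesGuardedCheck(String name) {
        return !fileExistsWithName(finalTitle(name));
    }
}
```

With "Royal arcade Melbourne.jpeg" already present, the unguarded check lets "Royal arcade Melbourne" through (and the upload would then overwrite), while the guarded check reports the collision. The design point is only that the uniqueness check should run on the same string that is finally used as the title.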
I did the same debugging as you and came to the same point. But the extension was null in my case, as it should be, because the only files with which I could reproduce this issue were files without an extension. I think adding .jpg (if the extension is null) will be our solution.
@neslihanturan I was thinking about that - what if the file being uploaded is not ACTUALLY a .jpg, though? For instance, if it was an SVG or PNG.
I tested this theory by uploading a .png. When I upload via Share, the .png upload always fails with the error:
<?xml version="1.0" encoding="UTF-8"?><api servedby="deployment-mediawiki-09"><error code="verification-error" info="File extension ".jpg" does not match the detected MIME type of the file (image/png)."><details><detail>filetype-mime-mismatch</detail><detail>jpg</detail><detail>image/png</detail></details><docref xml:space="preserve">See https://commons.wikimedia.beta.wmflabs.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at &lt;https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce&gt; for notice of API deprecations and breaking changes.</docref></error></api>
This happens because the mimeType is not detected in fixExtension, so we add a .jpg to the end of the filename, which doesn't work since the file is a .png.
On the other hand, when I upload the .png via in-app gallery button, it appropriately detects the .png mimeType.
I looked up stats to see how big an issue this would be - at the moment, it seems like .jpg is used by almost every phone for camera, but screenshots can sometimes be .png. So it could be a potentially big issue. However, so far the only phone I can find that specifically says it saves screenshots as .png is the iPhone... which our app doesn't support. :)
I think what I will do is merge your PR, but keep this issue open. While the PR works as a band-aid, we still need to find out why mimeType returns null when uploaded via Share, but not via the in-app gallery button.
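On the concern that defaulting to .jpg mislabels PNGs: one possible alternative, sketched below, is to look at the file's first bytes when no MIME type is available. `ImageTypeSniffer` is a hypothetical helper, not something that exists in the app, and it only covers the two formats discussed here; the byte sequences are the standard PNG and JPEG file signatures.

```java
public class ImageTypeSniffer {

    // Standard 8-byte PNG signature.
    private static final byte[] PNG_SIGNATURE =
            {(byte) 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A};

    // JPEG files start with the SOI marker (FF D8) followed by another marker byte (FF).
    private static final byte[] JPEG_SIGNATURE =
            {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};

    /** Returns "png", "jpg", or null when the header matches neither signature. */
    public static String extensionFromHeader(byte[] header) {
        if (header == null) {
            return null;
        }
        if (startsWith(header, PNG_SIGNATURE)) {
            return "png";
        }
        if (startsWith(header, JPEG_SIGNATURE)) {
            return "jpg";
        }
        return null;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A caller would read the first few bytes of the shared file and fall back to this only when the MIME type tag is null. This does not explain why the tag is null via Share, which is still the open question.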
Still happening: https://commons.wikimedia.org/wiki/File:Okonomiyaki.jpeg
Note: Maybe it is only happening in the structured-data branch.
@nicolas-raoul Can I work on this issue?
@gouri-panda Yes, sure :-)
Please first try to reproduce on the master branch: Upload a picture, then a different one with the same title (caption), then let us know.
I suggest you use the "betaDebug" flavor when building the app.
@nicolas-raoul I tried to reproduce this issue, but I can't reproduce it on the master branch. The app checks whether the page exists, and if it does, it just appends a sequence number to the title.
    private String findUniqueFilename(String fileName) throws IOException {
        String sequenceFileName;
        for (int sequenceNumber = 1; true; sequenceNumber++) {
            if (sequenceNumber == 1) {
                sequenceFileName = fileName;
            } else {
                if (fileName.indexOf('.') == -1) {
                    // We really should have appended a filePath type suffix already.
                    // But... we might not.
                    sequenceFileName = fileName + " " + sequenceNumber;
                } else {
                    Pattern regex = Pattern.compile("^(.*)(\\..+?)$");
                    Matcher regexMatcher = regex.matcher(fileName);
                    sequenceFileName = regexMatcher.replaceAll("$1 " + sequenceNumber + "$2");
                }
            }
            if (!mediaClient.checkPageExistsUsingTitle(String.format("File:%s", sequenceFileName)).blockingGet()
                    && !unfinishedUploads.contains(sequenceFileName)) {
                break;
            }
        }
        return sequenceFileName;
    }
I did it on the structured branch too, but it did the same thing there too.
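For reference, the renaming logic in findUniqueFilename can be exercised outside the app with a sketch like the one below. The regex and the branches are copied from the method above; the class name, the split into `candidate` and `findUnique`, and the `taken` set (an assumed stand-in for both the server-side existence check and unfinishedUploads) are illustrative only.

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UniqueNameSketch {

    // Same pattern as in findUniqueFilename: group 1 is the base name, group 2 the last ".suffix".
    private static final Pattern NAME_AND_SUFFIX = Pattern.compile("^(.*)(\\..+?)$");

    /** Builds the n-th candidate name; the first candidate is the name itself. */
    public static String candidate(String fileName, int sequenceNumber) {
        if (sequenceNumber == 1) {
            return fileName;
        }
        if (fileName.indexOf('.') == -1) {
            // No suffix present: the number is simply appended.
            return fileName + " " + sequenceNumber;
        }
        Matcher matcher = NAME_AND_SUFFIX.matcher(fileName);
        // Insert the number between the base name and the suffix.
        return matcher.replaceAll("$1 " + sequenceNumber + "$2");
    }

    /** Returns the first candidate that is not in the set of taken names. */
    public static String findUnique(String fileName, Set<String> taken) {
        for (int n = 1; ; n++) {
            String name = candidate(fileName, n);
            if (!taken.contains(name)) {
                return name;
            }
        }
    }
}
```

If `taken` holds "Okonomiyaki.jpeg" and "Okonomiyaki 2.jpeg", then `findUnique("Okonomiyaki.jpeg", taken)` yields "Okonomiyaki 3.jpeg", whereas `findUnique("Okonomiyaki", taken)` returns "Okonomiyaki" unchanged. The second case is worth keeping in mind: the numbering only protects names that already carry their final suffix when the check runs.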
Thanks for investigating!
I wonder how this happened...

@macgills do you remember how you uploaded the second okonomiyaki picture? Was it via Nearby? Thanks!
It wouldn't have been through nearby, I was just getting your photos from Google Drive. I think I uploaded 2 separate pictures with the same title but I doubt I had disabled any checks during development
It seems that again all files are automatically converted to .jpeg, sometime between 8 April and 21 April (see https://commons.wikimedia.org/w/index.php?target=Misaochan2&namespace=all&tagfilter=&start=&end=&limit=50&title=Special%3AContributions - all these files should be the same format because they were taken with the same camera). I think that might be related to this issue. https://github.com/commons-app/apps-android-commons/issues/228#issuecomment-414144656
All the .jpegs were uploaded from structured-data branch though, and all the .jpgs were uploaded from other branches. Will need to test again currently on master and 2.13-release to see if the same problem happens.
This is a structured data issue.
Structured data code:
    /**
     * Choose a filename for the media.
     * Currently, the caption is used as a filename. If several languages have been entered, the first language is used.
     */
    public String getFileName() {
        return uploadMediaDetails.get(0).getCaptionText();
    }
Code on master:
    public String getFileName() {
        return title != null
                ? Utils.fixExtension(title.toString(), getFileExt())
                : null;
    }
@gouri-panda do you still want to work on this? If not I can take it over
@macgills Sure, you can take it over.