After a few days of running, client 2.4.2 crashes. I had the same problem earlier.
I'm not sure what's causing the issue. According to the logs, the ownCloud client crashes when dealing with locked (opened) files, in this case a >200 MB *.pst file. But that only happens after a few days of working correctly. The file is constantly open in Outlook.
Operating system: Linux Debian
Web server: Apache
Database: MySql
PHP version: 5.7
ownCloud version: 10.0.8
Storage backend (external storage): local storage
Client version: 2.4.2
Operating system: Windows 7
OS language: PL
Installation path of client: C:\Program Files
tail -1 /tmp/owncloud.log
07-25 16:01:15:819 [ info sync.propagator.upload ]: Resuming "Pliki programu Outlook/Archiwum 2016.pst" from chunk 0 ; sent = 0
grep -A 3 -E 'Resuming.*Archiwum 2016.pst' /tmp/owncloud.log
07-25 15:46:06:463 [ info sync.propagator.upload ]: Resuming "Pliki programu Outlook/Archiwum 2016.pst" from chunk 0 ; sent = 0
07-25 15:46:06:466 [ warning sync.propagator.upload ]: Could not prepare upload device: "Permission denied"
07-25 15:46:06:466 [ info gui.lockwatcher ]: Watching for lock of "C:/Users/pjanusz/Documents/Pliki programu Outlook/Archiwum 2016.pst" being released
07-25 15:46:06:466 [ warning sync.propagator ]: Could not complete propagation of "Pliki programu Outlook/Archiwum 2016.pst" by OCC::PropagateUploadFileNG(0x6acde28) with status 3 and error: "Permission denied"
--
07-25 15:50:11:433 [ info sync.propagator.upload ]: Resuming "Pliki programu Outlook/Archiwum 2016.pst" from chunk 0 ; sent = 0
07-25 15:50:11:437 [ warning sync.propagator.upload ]: Could not prepare upload device: "Permission denied"
07-25 15:50:11:438 [ info gui.lockwatcher ]: Watching for lock of "C:/Users/pjanusz/Documents/Pliki programu Outlook/Archiwum 2016.pst" being released
07-25 15:50:11:438 [ warning sync.propagator ]: Could not complete propagation of "Pliki programu Outlook/Archiwum 2016.pst" by OCC::PropagateUploadFileNG(0x6acdd20) with status 3 and error: "Permission denied"
--
07-25 16:00:43:517 [ info sync.propagator.upload ]: Resuming "Pliki programu Outlook/Archiwum 2016.pst" from chunk 0 ; sent = 0
07-25 16:00:43:520 [ warning sync.propagator.upload ]: Could not prepare upload device: "Permission denied"
07-25 16:00:43:520 [ info gui.lockwatcher ]: Watching for lock of "C:/Users/pjanusz/Documents/Pliki programu Outlook/Archiwum 2016.pst" being released
07-25 16:00:43:521 [ warning sync.propagator ]: Could not complete propagation of "Pliki programu Outlook/Archiwum 2016.pst" by OCC::PropagateUploadFileNG(0x6acdab8) with status 3 and error: "Permission denied"
--
07-25 16:01:15:819 [ info sync.propagator.upload ]: Resuming "Pliki programu Outlook/Archiwum 2016.pst" from chunk 0 ; sent = 0
@piojan *.pst files are a special case. We've seen previously that Windows changes them all the time, so the client has to sync them all the time.
Did you report the crash with the crash reporter?
Looking at the crash reporter, this might be this crash:
https://sentry.io/owncloud/desktop-win-and-mac/issues/623245098/
lockwatcher.cpp in OCC::LockWatcher::LockWatcher at line 27
folderman.cpp in OCC::FolderMan::FolderMan at line 51
application.cpp in OCC::Application::Application at line 151
main.cpp in main at line 65
crtexe.c in __tmainCRTStartup at line 334
Edit: Looking closer at the code: that crash is probably not related.
I haven't used any external tools to analyse the crashes yet.
Would it be beneficial to try to run CrashRpt (first I will have to learn how to use this tool)?
My *.pst file is an archive from previous years, so it shouldn't be changing that often (I guess there is an option to exclude this file from syncs). Most of the time the client manages to sync this file because the changes are infrequent.
I was not able to reproduce the problem.
Having a backtrace would help, for sure.
@ogoffart It looks to me like the error paths in FileSystem::openAndSeekFileSharedRead might not release resources in all cases. I'm on it.
EDIT: See #6707. However, in practice it looks like _open_osfhandle wasn't failing here, and reading the QFile::open() overload also makes me think it'd never error. That means my PR probably doesn't close this issue.
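For illustration, the failure mode being discussed — an early return on an error path leaking the handle acquired earlier — can be sketched with a portable RAII wrapper. This is a hypothetical stand-in using `FILE*`, not the actual ownCloud code (which deals with Windows HANDLEs); the function name `openAndSeekSketch` is made up:

```cpp
#include <cstdio>
#include <memory>

// Deleter for the C stdio handle; keeping it a named function avoids
// taking the address of a standard library function directly.
static int closeFile(std::FILE *fp) { return std::fclose(fp); }

// Owning the handle from the moment it is acquired means every early
// "return false" releases it automatically -- the concern raised about
// openAndSeekFileSharedRead is exactly that a raw handle plus early
// returns makes it easy to miss a close on some error path.
using FilePtr = std::unique_ptr<std::FILE, int (*)(std::FILE *)>;

bool openAndSeekSketch(const char *path, long offset)
{
    FilePtr f(std::fopen(path, "rb"), &closeFile);
    if (!f)
        return false; // nothing acquired, nothing to leak
    if (std::fseek(f.get(), offset, SEEK_SET) != 0)
        return false; // f is closed automatically on this early return
    // ... hand the handle to the caller or read from it here ...
    return true;
}
```

The actual change in #6707 concerns the Windows-specific code path; this only illustrates the general pattern.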
@piojan We don't yet know what's causing the problem unfortunately! When this crash happens, do you get a crash reporter window? If so, please click send and note down the current time.
Otherwise, if you can, running the client in a debugger and getting a stacktrace yourself would be ideal (http://wiki.freepascal.org/Creating_a_Backtrace_with_GDB ? I haven't checked it out in detail).
When starting the gdb I am getting:
Reading symbols from C:\Program Files (x86)\ownCloud\owncloud.exe...(no debugging symbols found)...done.
I will wait for a crash and I will see what I can get out of this.
thread apply all bt or bt all
For space reasons we don't ship the debug symbols, so it might be useless. :-/
Did you enable the crash reporter in the client settings so that the crashes are sent to us?
CC @dschmidt
Together with the "warning" about the invalid parameter, a window shows up:
Microsoft Visual C++ Runtime Library
The application has requested the Runtime to terminate it in an unusual way.
After clicking OK (the only option), the application is closed. Until that button is pressed, new threads are regularly created. The GUI shows "Waiting" on one of the shares.
Crash reports are enabled in the app, but I am not seeing any info about a crash being reported or anything similar. Maybe it's happening in the background?
[New Thread 14420.0x30f4]
[New Thread 14420.0x4c90]
[New Thread 14420.0x1f30]
[New Thread 14420.0x52c4]
[New Thread 14420.0x3c3c]
[New Thread 14420.0x744]
[New Thread 14420.0x278]
warning: Invalid parameter passed to C runtime function.
warning: Invalid parameter passed to C runtime function.
[New Thread 14420.0x3c28]
warning: 08-21 16:16:32:070 [ warning default ]: QWaitCondition: Destroyed while threads are still waiting
warning: 08-21 16:16:32:150 [ warning default ]: QWaitCondition: Destroyed while threads are still waiting
warning: 08-21 16:16:32:160 [ warning default ]: QWaitCondition: Destroyed while threads are still waiting
[Inferior 1 (process 14420) exited with code 03]
(gdb) bt
No stack.
(gdb)
Would you have any other suggestions for what else I can check?
Is there a publicly accessible build with the debug symbols?
Hey @piojan,
thanks for filing the report.
Can't say much about the actual issue, but it's odd that you can't see the crash reporter. I just verified it's working in the "ordinary crash" case in 2.4.2.
Could you maybe give the 2.5beta a spin?
If the problem still occurs, the crash reporter may behave differently now (i.e. it comes up), because we use a different compiler there and therefore a different code path for the crash reporter.
We're releasing 2.5.0 rc1 tomorrow, would be good to test it with the MSI from there.
There should definitely be a crash reporter coming up.
Have tested it on: ownCloud-2.5.0.10300-daily20180830-setup.exe
Still received the same termination window (no crash report) + "warning: Invalid parameter passed to C runtime function".
I have also noticed this (at least when running through gdb):
warning: Could not find " "owncloud_crash_reporter" " in ../libexec or application path
warning: m_crashReporterChar: owncloud_crash_reporter
@piojan Sorry, I should have written this more explicitly. Could you try with the .MSI installer, not the .EXE installer?
Also it needs to be a recent one, e.g. the 2.5.0 rc1.
Thank you for your patience :-/
https://github.com/owncloud/client/releases/tag/v2.5.0-rc1
I can confirm that 2.5.0-rc1 now correctly reports a crash:
bp-e189e43d-bbd6-4511-98e8-b2d192180911
Just in case another one:
bp-01062332-f3f2-4f6e-9c43-0931c2180913
Thanks, here is the backtrace:
https://sentry.io/owncloud/desktop-win-and-mac/issues/683372416/
Crash: Unhandled C++ Exception at 0x769ac54f
File "propagateuploadng.cpp", line 319, in OCC::PropagateUploadFileNG::startNextChunk()
File "propagateuploadng.cpp", line 177, in OCC::PropagateUploadFileNG::slotPropfindFinished()
File "networkjobs.cpp", line 372, in OCC::LsColJob::finished()
File "abstractnetworkjob.cpp", line 234, in OCC::AbstractNetworkJob::slotFinished()
Let's try to fix this for an early 2.5.1
( bp-01062332-f3f2-4f6e-9c43-0931c2180913 is the same backtrace)
I wonder what could cause this exception. Line 319 is:
UploadDevice::prepareAndOpen makes some calls to the Windows API, but I would not think they are allowed to throw exceptions. Still, in prepareAndOpen we see this suspicious code:
I thought maybe this could overflow and cause an allocation too big, so that it would throw bad_alloc. But both size and start are unsigned, and so is the return value of FileSystem::getSize, which could potentially return 0 if the file is locked. According to the definition of qBound, which is { return qMax(min, qMin(max, val)); }, it returns the minimum value if the maximum value is smaller than the minimum one, so it can never return a negative number.
(and size is the size of a chunk).
Maybe there is something else that I do not see.
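To make the qBound reasoning above concrete, here is a tiny self-contained re-statement. `qBoundSketch` mirrors Qt's documented definition of qBound; it is not the client's code, and the values below are illustrative, not taken from the actual upload path:

```cpp
#include <algorithm>
#include <cstdint>

// Mirrors Qt's qBound(min, val, max) == qMax(min, qMin(max, val)).
template <typename T>
T qBoundSketch(T min, T val, T max)
{
    return std::max(min, std::min(max, val));
}

// - Degenerate case from the discussion: FileSystem::getSize() returning 0
//   for a locked file gives val == 0, and the clamp yields the minimum (0),
//   never a negative or wrapped-around huge unsigned number.
// - When max < min, e.g. qBoundSketch(5u, 100u, 3u), the inner qMin gives 3
//   and the outer qMax gives 5: the minimum wins, as stated above.
```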
@ckamm: any idea?
@ogoffart but wouldn't UploadDevice then be in the backtrace instead of only startNextChunk?
But that only happens after a few days of working correctly
@piojan Did you watch the memory usage of owncloud.exe before this crash happens?
We regularly check memory leaks and https://github.com/owncloud/client/pull/6707 is in 2.5.0-rc1 and you still get crashes so maybe there is something else up?
but wouldn't then UploadDevice be in the backtrace instead of of only startNextChunk?
Yes... or maybe the last part of the stacktrace is missing or optimized away.
[...] memory usage [...] memory leaks
Out of memory is possible, but I'd say it is unlikely given that we have the same backtrace twice.
I will pay closer attention to the memory aspects in the future.
In my previous debugging sessions with gdb I got this twice at each crash time:
warning: Invalid parameter passed to C runtime function.
On the other hand, a warning is not an exception (though it might lead to one at some point). But we don't know which parameter and value.
I could check with 2.5.0-rc1 whether anything new shows up with gdb. Maybe a backtrace would be achievable now.
I also don't see where this would be going wrong. Since @piojan can reproduce it, we could create a custom build with more logging if the attempts to get a backtrace with gdb don't work out.
I still could imagine this being caused by frequent lock-related failures in the windows codepath of openAndSeekFileSharedRead() but don't see any concrete problem.
gdb is now useless:
Starting program: C:\Program Files (x86)\ownCloud\owncloud.exe
[New Thread 2152.0x2390]
../../gdb-7.6.1/gdb/minsyms.c:862: internal-error: sect_index_data not initialized
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) n
../../gdb-7.6.1/gdb/minsyms.c:862: internal-error: sect_index_data not initialized
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n)
As for the mem usage after around 1-3 hours of running:
Private bytes: 1194 MB
Working Set: 1123 MB
after starting again: 31 and 54 MB.
after 30 minutes of running: 200-220 MB
but from the historical private bytes I can see a jump from 188 to 198 MB (at 16:14), and from 198 to 207 MB after another 4 minutes (16:19). The next one, of around 10 MB, was at 16:27. So irregular on the time scale, but the jumps are associated with slight CPU load (maybe after each sync attempt).
On my home PC the memory usage after 2 months of running 2.4 is <100 MB.
So looks like we might be leaking ;)
@ogoffart You mentioned you had fixed some memory leaks?
OK just saw @ogoffart 's message:
I looked at it with heaptrack and noticed a couple of small leaks, which are unlikely to be the ones in this bug report
So it's a Windows-only leak?
@piojan I've tried to reproduce on Win10 by creating a locked file and restarting the sync a dozen times, but don't see a leak or anything unexpected yet on my end unfortunately.
Would you be up for testing with a custom build that includes a memory leak checker? (if I get around to making one) (best bet right now would be heob + mingw debug build)
@piojan What's your current state? Do you want to run a 2.5.1 daily build and see if this works better?
https://download.owncloud.com/desktop/daily/ownCloud-2.5.1.10890.10790-daily20181024.en-US.msi
@guruz Looks like the latest 2.5.1 has the same leaking problems (in my usage case):
I will wait for a crash, send it over, and attach its id here.
(Edit: bp-97c06f3e-4852-4cc1-9021-216f12181025)
@ckamm I can give it a try with the special build whenever you're ready.
@piojan Could you email me at mail at ckamm de so I can send you the binaries conveniently?
@ckamm Is this something that can be done in a branch on the client repo here? Then we can also create a build via our build server and have it on download.owncloud.org
@guruz My intention is to package up my debug mingw build with dependencies as well as a heap tracker and a heap-tracker start script. There will likely be minimal actual code changes - some added logging in the critical area at best.
Have sent a debugging package to @piojan and hope it'll work for him.
Unfortunately my custom debug build crashes for @piojan even though dependencies seem to all be provided according to https://github.com/lucasg/Dependencies
@dschmidt How much effort do you think it would be to provide windows debug builds?
I can still see the same backtrace in Sentry with the 2.5.1 daily:
Out of memory is possible, but I'd say it is unlikely given that we have the same backtrace twice.
Well, actually, this is the allocation of the chunk itself, which is probably one of the bigger allocations in the program, so it is actually quite logical that it fails there.
@piojan We've found the leak, will ping here once test builds are available.
I cannot reproduce this as part of 2.5.1 QA. We need to rely on @piojan to report a test result eventually.
@guruz I suggest pushing this past the 2.5.1 milestone, as a promising fix was added.
@ckamm we get compile errors on several Linux platforms due to this fix.
Can you please check the portability of std::make_unique and friends:
335s] /home/abuild/rpmbuild/BUILD/owncloudclient-2.5.1.10969-rc2/src/libsync/propagateuploadng.cpp:316:19: error: 'make_unique' is not a member of 'std'
[ 335s] auto device = std::make_unique<UploadDevice>(&propagator()->_bandwidthManager);
[ 335s] ^
[ 335s] /home/abuild/rpmbuild/BUILD/owncloudclient-2.5.1.10969-rc2/src/libsync/propagateuploadng.cpp:316:48: error: expected primary-expression before '>' token
[ 335s] auto device = std::make_unique<UploadDevice>(&propagator()->_bandwidthManager);
Affected platforms are: CentOS7, RHEL7, Ubuntu14.04, openSUSE_Leap_42.1, openSUSE_Leap_42.2, openSUSE_Leap_42.3
It seems to work on the old CentOS6 compiler and on the new openSUSE_Leap_15.0 compiler, though.
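For reference, std::make_unique is a C++14 addition, and the failing platforms ship compilers that stop at C++11 (e.g. GCC 4.8 on CentOS 7 / Ubuntu 14.04). One common workaround — sketched here as an option, not necessarily the fix that was applied in the client; the `occ_compat` namespace name is made up — is the single-object backport from proposal N3656:

```cpp
#include <memory>
#include <utility>

namespace occ_compat { // hypothetical namespace, not from the client's code
// Single-object form of std::make_unique, as proposed in N3656; compiles
// on any C++11 compiler.
template <typename T, typename... Args>
std::unique_ptr<T> make_unique(Args &&...args)
{
    return std::unique_ptr<T>(new T(std::forward<Args>(args)...));
}
} // namespace occ_compat
```

The failing line could then call `occ_compat::make_unique<UploadDevice>(...)`, or the call site could simply be rewritten as `std::unique_ptr<UploadDevice>(new UploadDevice(...))`.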
@ckamm @guruz I have tested a daily build a day or two after you found the bug and merged the patch (slightly earlier than 20181104). The problem is gone. The memory utilisation was very nicely oscillating around 34-38 MB. In the morning I will also double-check the newest daily build (but I would not expect any problems after identification of the root cause of this bug).
Thank you for all your hard work on this project.
@piojan Thank you for the report and effort!