tests/drone/scality.multibucket.config.phpTests pass
A new test is failing:
There was 1 failure:
1) OCA\Files\Tests\Command\TransferOwnershipTest::testTransferAllFiles
Failed asserting that actual size 3 matches expected size 0.
/var/www/owncloud/server/apps/files/tests/Command/TransferOwnershipTest.php:180
e.g. https://drone.owncloud.com/owncloud/files_primary_s3/1219/11/7
CI with current core master and files_primary_s3 master
The new test was added in PR #36222 by @karakayasemi
That PR was only merged yesterday, so last night was the first night that the test was run in the context of files_primary_s3 object storage.
The test passes for single-bucket scality, but fails in the multi-bucket case. I suspect that in the multi-bucket case the 2 users are on different storages, so maybe an ownership transfer like that does not work (the files would have to be moved between storages?).
Because this is a new test added to core, I don't know if it would have passed or not previously.
I can get the test to fail locally from files_primary_s3 with:
drone exec --pipeline phpunit-scality-multibucket-php7.2-sqlite > unit-output.txt
and look in unit-output.txt
The fail is in line 2176 of the output:
[phpunit-tests:129] ...SSSSSSSSSSSSSSSSSSSF.................................... 7257 / 10000 ( 72%)
Maybe the scality log output before that point has something useful?
Feature seems to work for me with Scality single-bucket:
Background Using 10.0.3.RC1 and files_primary_s3 and Scality
1 - Create a user
2 - Log in with user and create some file/folders
3 - Via CLI run occ files:transfer-ownership user1 admin
Result: Files are transferred without problems to admin user
No logs are shown in owncloud.log
Yes, the "Scality with single bucket" CI passes.
It is multi-bucket that fails. (And maybe just if the files of the 2 users are in different buckets?)
confirmed problem regarding shares (multibucket setup):
root@20216bd2cdea: /tmp/u1 # occ files:transfer-ownership user1 user2
Analysing files of user1 ...
3 [============================]
Collecting all share information for files and folder of user1 ...
3 [============================]
Transferring files to user2/files/transferred from user1 on 20191021_085753 ...
Restoring shares ...
Share with id 1 points at deleted file, skipping
1/3 [=========>------------------] 33%Share with id 2 points at deleted file, skipping
2/3 [==================>---------] 66%Share with id 3 points at deleted file, skipping
3/3 [============================] 100%
files are transferred correctly though
The transferred file has a new fileid with s3-multibucket. This cause problems because the share transfer expects an existing fileid for the source, which is missing with s3-multibucket. This doesn't happen in a local FS.
Option1 -> assume the fileid can change.
This is a problem because we're moving the whole "files" directory of the user. We don't really know what are the fileids of the new files, so we'll have to look for each of them.
The path information we get from the shares is incomplete and doesn't work for subfolders. Note that the fileid store in the shares is no longer valid because the transfer copied the files (different storage), which generates new fileids.
We'd probably need to traverse the transferred tree looking for the matching share in order to update the source fileid of the share.
This approach is discarded for now because the fileid shouldn't change.
Option 2 -> change in the objectstore implementation.
It seems the transfer works for local storage due to a special implemented case (https://github.com/owncloud/core/blob/master/lib/private/Files/Storage/Local.php#L468)
We can try something similar for objectstore, assuming the server for both storages are the same. Main problem is we don't have that information easily accessible at the moment. We'll have to assume it's possible to rename between buckets, or at least moving content without changing the oc's fileid.
It seems to me the main problem is that the storage (backend) modification and the cache modification is very coupled. "Common" operations (from the Common storage) which are inherited aren't considering that the storage changes the cache by itself (or so it seems).
In practice, this means that the Common's moveFromStorage operation relies on several storage operations that shouldn't modify the cache, but the objecstorage operations are modifying the cache with unpredictable results.
So far, it seems the solution is to inhibit the cache operations while the moving is in progress, and change the cache afterwards.
@phil-davis @jvillafanez was working on a fix.
Does this fix the test too?
Checking the results in https://github.com/owncloud/files_primary_s3/pull/262 by running files_primary_s3 tests against core objectstore_transfer branch
Edit: not true - a different test failed. https://drone.owncloud.com/owncloud/files_primary_s3/1265/3/7
See comments below.
https://github.com/owncloud/core/pull/36326 should fix the issue. There are things to decide (https://github.com/owncloud/core/pull/36326#issuecomment-545917964) but those affect normal transfer too, so I think we can merge the PR and fix those (if needed) in a different PR.
CI passes in files_primary_s3 when run against core objectstore_transfer branch. Great stuff.
Assigned myself just so that I see this in my current tasks and push it along. Not much that I can do myself right now - #36326 needs to get reviewed, approved and hopefully merged.
Sadly https://drone.owncloud.com/owncloud/files_primary_s3/1273/3/7
....SSSSSSS......S........................................... 9699 / 9899 ( 97%)
..........................E...............SSS................ 9760 / 9899 ( 98%)
............................................................. 9821 / 9899 ( 99%)
............................................................. 9882 / 9899 ( 99%)
................. 9899 / 9899 (100%)
Time: 6.06 minutes, Memory: 404.00MB
There was 1 error:
1) OCA\Files_Versions\Tests\VersioningTest::testMoveFileIntoSharedFolderAsRecipient
Doctrine\DBAL\Exception\UniqueConstraintViolationException: An exception occurred while executing 'UPDATE "oc_filecache" SET "storage" = ?, "path" = ?, "path_hash" = ?, "name" = ?, "parent" =? WHERE "fileid" = ?' with params [1055, "files\/folder1\/test.txt", "6a98ea9b5e25281b48c8cf1c744e5396", "test.txt", 6379, 6380]:
SQLSTATE[23000]: Integrity constraint violation: 19 UNIQUE constraint failed: oc_filecache.storage, oc_filecache.path_hash
/var/www/owncloud/server/lib/composer/doctrine/dbal/lib/Doctrine/DBAL/Driver/AbstractSQLiteDriver.php:48
/var/www/owncloud/server/lib/composer/doctrine/dbal/lib/Doctrine/DBAL/DBALException.php:128
/var/www/owncloud/server/lib/composer/doctrine/dbal/lib/Doctrine/DBAL/Connection.php:855
/var/www/owncloud/server/lib/private/DB/Connection.php:187
/var/www/owncloud/server/lib/private/Files/Cache/Cache.php:550
/var/www/owncloud/server/lib/private/Files/ObjectStore/ObjectStoreStorage.php:391
/var/www/owncloud/server/lib/private/Files/Storage/Wrapper/Wrapper.php:572
/var/www/owncloud/server/lib/private/Files/Storage/Wrapper/Availability.php:448
/var/www/owncloud/server/lib/private/Files/Storage/Wrapper/Wrapper.php:572
/var/www/owncloud/server/lib/private/Files/Storage/Wrapper/Jail.php:501
/var/www/owncloud/server/lib/private/Files/View.php:876
/var/www/owncloud/server/lib/public/Events/EventEmitterTrait.php:50
/var/www/owncloud/server/lib/private/Files/View.php:918
/var/www/owncloud/server/lib/private/Files/Filesystem.php:766
/var/www/owncloud/server/apps/files_versions/tests/VersioningTest.php:166
Caused by
Doctrine\DBAL\Driver\PDOException: SQLSTATE[23000]: Integrity constraint violation: 19 UNIQUE constraint failed: oc_filecache.storage, oc_filecache.path_hash
It fails the same when I rerun it :(
And the stack has:
/var/www/owncloud/server/lib/private/Files/ObjectStore/ObjectStoreStorage.php:391
which is new/changed code from PR #36326
@jvillafanez ideas?
It seems the test passes if the versioning is active in the buckets, but fails if not.
There are some questionable things that the test is doing, such as https://github.com/owncloud/core/blob/master/apps/files_versions/tests/VersioningTest.php#L158-L163 ...
It seems to not be a "hard" failure:
https://drone.owncloud.com/owncloud/files_primary_s3/1276/12/7 - Sunday morning, scality multibucket with PHP 7.3 passed (total of 10,002 unit tests)
https://drone.owncloud.com/owncloud/files_primary_s3/1277/12/7 - Monday morning, scality multibucket with PHP 7.3 failed (total of 10,002 unit tests)
https://drone.owncloud.com/owncloud/files_primary_s3/1276/11/7 - Sunday morning, scality multibucket with PHP 7.2 failed (total of 10,002 unit tests)
https://drone.owncloud.com/owncloud/files_primary_s3/1277/11/7- Monday morning, scality multibucket with PHP 7.2 passed (total of 10,002 unit tests)
similarly with PHP 7.1 and 7.0 it sometimes passes, sometimes fails.
And after adding a 1 second sleep between file_put_contents calls for the "fake" old versions,
https://drone.owncloud.com/owncloud/files_primary_s3/1282/5/7 - 3 of 4 scality multibucket failed.
Maybe the flakiness is when the 2 users in the test case end up with storage in different buckets?
Then the system will be trying to wanting to move the versions (as well as the current file) from one bucket to another. And, I think, we already acknowledged that versions do not transfer when moving a file to another bucket.
It seems the test passes if the versioning is active in the buckets, but fails if not.
So how do we get intermittent failures?
Does the versioning "become active" at some point?
How would the state of versioning in a bucket be different from run-to-run?
Or is it that on some runs the 2 users end up in the same bucket (thus not triggering the "move between buckets" logic), and on other runs the 2 users are in different buckets and so the "move between buckets" logic gets triggered (and has a problem somewhere)?
I just tested once with multibucket + S3 versioning, maybe I got lucky. The other options is that the flakiness happens with multibucket without S3 versioning.
There are also other problems such as https://github.com/owncloud/core/blob/master/apps/files_versions/tests/VersioningTest.php#L168 (a few lines below), which is checking for the versions, but those versions won't be moved I think
I made some core API test scenarios last night in PR https://github.com/owncloud/core/pull/36436
Those do "simple" moving of file/folder into or out of a share. They pass with multi-bucket, e.g.
https://drone.owncloud.com/owncloud/files_primary_s3/1288/13/16
in PR https://github.com/owncloud/files_primary_s3/pull/265 (just a test PR to run bits of CI)
Of course these kind of tests can pass because the 2 users happen to be in the same bucket anyway. And the "problem" probably needs the file(s) being moved to have some versions - I can easily make test scenarios for that also.
What to do next?
I feel like there could be a problem somewhere in all that delayed cache update logic, so I don't want to just skip the test on objectstore.
Maybe we need to rewrite the test. I mean, I'm not sure if putting the files directly in the files_versions folder is even supported, so having a step in the test that does that is really questionable; and checking directly that the file exists through the view instead of using a version API is also questionable.
Lots of the "integration" test cases there do all that "writing 'fake' versions files" thing to setup the test.
Maybe it was all written before the days of automated acceptance tests?
So one approach would be to cover all the "integration" test cases there with API acceptance tests (e.g. in PR #36445 I added some scenarios for moving files with versions in and out of shared folders). And re-engineer the phpunit stuff so it is just "unit" tests of the relevant class/es.
It might be a problem in the code. I've reproduce the issue manually with and without S3 versioning.
| 261 | 46 | files/fume.txt | 1a6f9c83d602c8510214f5569913bb30 | 209 | fume.txt | 4 | 3 | 1170 | 1539707152 | 1574346534 | 0 | 0 | 5dd69f2634e32 | 27 | SHA1:53096d0463c87e944d4c72fbad276148344847d1 MD5:a7ac697bae548a9d33b9779385a2c6c4 ADLER32:2dc04e78 |
| 271 | 43 | files/fooo1/fume.txt | 17d71a65c0d7d937e3524ce8ca9925be | 201 | fume.txt | 4 | 3 | 0 | 1539707152 | 1539707152 | 0 | 0 | 5dd69f5321fe3 | 27 | NULL |
+--------+---------+----------------------+----------------------------------+--------+----------+----------+----------+------+------------+---------------+-----------+------------------+---------------+-------------+-----------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
It seems the DB entry for the moved file is already there before manipulating the cache.
Unless I've made a mistake in the code and missed a case (which could happen) this could be more problematic to fix if that entry is created outside if the s3 storage (maybe a wrapper).
It's still very weird that the transfer command works but not this test, when it does the same thing.
@micbar I think we'll have to revert the https://github.com/owncloud/core/pull/36326 PR. I've reproduced the problem manually, so it isn't a problem "just" with the tests.
The reason the PR works with the transfer-ownership command seems to be that the transfer-ownership uses a View object.
// move file into the shared folder as recipient
//\OC\Files\Filesystem::rename('/test.txt', '/folder1/test.txt');
$view = new View($this->user2 . "/files");
$view->rename('/test.txt', '/folder1/test.txt');
With the change above, the test fails below the rename trying to check the versions (which is expected because the versions aren't there), but it goes through the rename.
The problem is that the web interface isn't using this View object and still fails
Looks we've stepped into a mine.... we'll need to book time for this, and it seems this will take a lot of time...
@jvillafanez Ok for Revert PR
Note: the OCA\Files_Versions\Tests\VersioningTest::testMoveFileIntoSharedFolderAsRecipient failure is "random" - it seems to fail about 50% of the time (e.g. there are 4 pipelines in the files_primary_s3 drone that do this with multi-bucket S3, and usually either 1,2, or 3 out of 4 fail.
The reason for that could likely be one of:
1) some timing issue with some actions happening in the same second (but I tried to work-around that possibility by putting sleeps in the test and it made no difference to the failure rate)
2) sometimes user0 and user1 files end up in the same bucket (so the test is moving the file within the same bucket, and the special code for cross-bucket move is not executed), and sometimes user0 and user1 are in different buckets (so the test executes the special code for cross-bucket move and that fails)
IMO (2) is likely, and probably indicates that there is a "hard" fail when the users really are in different buckets.
It seems there is also a problem with handling of the root folder of the View
// move file into the shared folder as recipient
//\OC\Files\Filesystem::rename('/test.txt', '/folder1/test.txt');
$view = new View("/{$this->user2}/files");
$view->rename('/test.txt', '/folder1/test.txt');
Note the initial "/" on the view. With that, the test crashes as before.
The problem seems to be in the getRelativePath function of the View. Basically, the function expects the path to start with "/" if the root also starts with it (I haven't seen any warning in the PHPDocs about this). We should open a ticket for this.
The revert and test skip #36460 and #36462 have been merged. So that unblocks files_primary_s3 tests.
IMO the new OCA\Files\Tests\Command\TransferOwnershipTest exposed an issue(s) with moving stuff between object storage buckets. That was most likely already any issue - there was just not a test that exposed the issue.
So we can keep this issue open and work next sprint to sort it all out (without blocking current CI)
I think I've tracked down the issue: https://github.com/owncloud/core/blob/master/lib/private/Files/ObjectStore/ObjectStoreStorage.php#L444 is creating the entry in the DB, called from https://github.com/owncloud/core/blob/master/lib/private/Files/Storage/Common.php#L582.
Currently checking what happens if we inhibit the touch call in the objectstore while moving files.
The problem with the versions will remain though.
Removed my assignment because we have worked around getting CI green.
Now developers can work on new PR(s) like https://github.com/owncloud/core/pull/36464 that re-enable the testTransferAllFiles test on object storage and get it working.