245

lidanger on 4 Nov 2016

Hi guys, after #245 I started a new branch to work better in concurrent environments. The current v2 locks the file using file system locks, which is not a good way to do it. So, in the next v3, I'm working hard to make LiteDB thread safe, using in-process .NET locks via the ReaderWriterLockSlim class.

So, for now, it's better to encapsulate LiteDB in a class and use the lock statement whenever you need read/write access to the data.

mbdavid on 4 Nov 2016
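
A minimal sketch of the workaround described above, assuming a hypothetical `Person` document type, a hypothetical `PersonStore` wrapper, and a collection named "person". It serializes all access from one process with a `lock` statement and opens the datafile only for the duration of each call. It does nothing for access from other processes.

```csharp
using System.Collections.Generic;
using System.Linq;
using LiteDB;

// Hypothetical document type used only for this illustration.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical wrapper: every read and write goes through one lock object.
public class PersonStore
{
    private readonly object _sync = new object();
    private readonly string _path;

    public PersonStore(string path)
    {
        _path = path;
    }

    public void Add(Person person)
    {
        lock (_sync) // only one thread at a time touches the datafile
        {
            using (var db = new LiteDatabase(_path))
            {
                db.GetCollection<Person>("person").Insert(person);
            }
        }
    }

    public List<Person> All()
    {
        lock (_sync)
        {
            using (var db = new LiteDatabase(_path))
            {
                // Materialize the results before the database is disposed.
                return db.GetCollection<Person>("person").FindAll().ToList();
            }
        }
    }
}
```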

v3 will be so great :)

vip32 on 5 Nov 2016

Hello. I also have an issue with the database getting locked up; it shows a timeout error.

I have tried to implement thread-safe locking, which also failed.

I will make an assumption here that the problem is with the journal. For example, my lock is released when the data is written to the journal, but the next write fails. So the assumption goes like this: data is written to the journal and my program tries the next write, but at the same time litedb.dll is trying to write from the journal to the main file. I assume this because I see the journal file staying around, where normally it appears and then disappears within milliseconds.

In addition, I tried setting ; journal=false but I get a weird error: Parameter Name: filename cannot be null.

Please advise.

azasisgod on 9 Nov 2016

@azasisgod Could you explain your setup? Just to make sure, you are using multiple writers concurrently, right?

What do you mean by 'thread safe locking'? Does that mean you wrote a custom synchronization (e.g. via a mutex) so only one writer can use the DB at any time (in _addition_ to the locking provided by LiteDB itself)?

nerai on 17 Nov 2016

hi @mbdavid,

first of all, congrats for v3, it's really improved since v2.

I asked you in the past about concurrency, and v3 appears to do very well in that case, but this time I have another question:

"still no async/await support? if it's possible, how should we do it?"

thanks a lot.

regards,
[]

rmszc81

rmszc81 on 21 Nov 2016

Hi @rmszc81, thanks! It's nice to hear this. I dedicated almost 3 months and many weekends to this :)

LiteDB still has no async support, but now that's only because of .NET 3.5. If the concurrency in this new version works fine, I will add it in a net40 version. For now, Insert/Update/Delete operations are still sync, but I believe it is now possible to simply wrap these methods in a Task (because locks are now in-process and no longer "in-file").

Do you need async for performance reasons? Which platform are you running on?

mbdavid on 21 Nov 2016
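
What "adding a Task around these methods" might look like, as a sketch only: `AsyncWrapper` and `RunAsync` are hypothetical names, not LiteDB API, and `Task.Run` needs .NET 4.5 or later. This only moves the synchronous call to a thread-pool thread so that async callers are not blocked; the disk I/O itself stays synchronous.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not part of LiteDB.
public static class AsyncWrapper
{
    // Runs a synchronous operation on the thread pool and exposes it as a Task.
    public static Task<T> RunAsync<T>(Func<T> operation)
    {
        if (operation == null) throw new ArgumentNullException("operation");
        return Task.Run(operation);
    }
}
```

With a shared database instance, a call could then look like `await AsyncWrapper.RunAsync(() => col.Insert(doc));`, where `col` and `doc` stand for an existing collection and document.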

hi @mbdavid , tks for quick answer.

we're not using this in production, but we're searching for alternative technologies to build our web api. this web api will be used for an erp integration with many e-commerce platforms so, as we tested sqlite and ndatabase already, we're doing this with litedb.

the main reason is that all api code is async, that's why.

I have a personal project, a desktop application, and I'm testing litedb as an alternative to sqlite too but that is another story.

Tks btw.

Regards,
[]

@rmszc81

rmszc81 on 21 Nov 2016

Happiness!

I have concurrency problems with v3. Actually, a locking problem.
@mbdavid could you please give an example of how v3 should be used in multi-threaded scenarios?

MiklosPathy on 29 Nov 2016

Hi @MiklosPathy, in v3, each LiteEngine/LiteDatabase instance opens the datafile in exclusive lock mode. So there cannot be 2 concurrent instances at the same time: the first thread opens the datafile and the second thread must wait until the first disposes it.
So there is no more multi-process access, only multi-thread. To work in a multi-threaded scenario, just open your datafile in a "global" area that all your threads can see. Like this:

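```C#
public class DbManager
{
    // Single shared instance used by all threads
    public static LiteDatabase Instance { get; private set; }

    // The static constructor runs once, so the datafile is opened only once
    static DbManager()
    {
        Instance = new LiteDatabase("...");
    }
}

// to use
DbManager.Instance.GetCollection("person");
```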

mbdavid on 29 Nov 2016

I hadn't fully understood the concurrency behavior in v3. I had assumed that it would be possible to have one write and one read connection.

> in v3, each LiteEngine/LiteDatabase instance opens the datafile in exclusive lock mode.

Are you saying that it will not be possible to have one read connection plus one write connection to the datafile? Will not even multiple read-only connections be possible?

kuiperzone on 5 Dec 2016

@kuiperzone, this behavior is not fully clear, and I will write a dedicated wiki page to explain how to use it in several scenarios.

v3 works differently from old versions. It's thread safe. But, to be thread safe, only a single instance can access the datafile. So it's recommended to keep a "global" single instance of LiteDatabase (or LiteEngine) and have all threads access (to read or write) this instance. With a single instance, LiteDB can manage concurrent access using lock statements.

For a local database it makes more sense to work like this: mobile, local desktop and small web apps run in a single process. It's easier and much faster to access a single instance in these scenarios: the datafile is opened only once and loaded pages are kept in cache, with much less disk access.

The current v3 version also supports opening the datafile in "read only" mode. In this case, it's possible to open many instances that only read data. It's useful in LiteDB.Shell, for example, when another process is using "read/write" mode and you only want to query some data. To do this, just use "read only=true" in the connection string.

mbdavid on 5 Dec 2016
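
As an illustration of in-process locking around a single shared instance, here is a sketch built on `ReaderWriterLockSlim`, the class named at the start of this thread. `ReadWriteGate` is a hypothetical helper and is not how LiteDB itself is implemented: many threads may read at once, while a writer gets exclusive access.

```csharp
using System;
using System.Threading;

// Hypothetical helper: many concurrent readers, one exclusive writer.
public class ReadWriteGate
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();

    // Several threads can be inside Read at the same time.
    public T Read<T>(Func<T> read)
    {
        _lock.EnterReadLock();
        try { return read(); }
        finally { _lock.ExitReadLock(); }
    }

    // Write waits until all readers have left, then blocks new readers.
    public void Write(Action write)
    {
        _lock.EnterWriteLock();
        try { write(); }
        finally { _lock.ExitWriteLock(); }
    }
}
```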

Hi David,

Thanks for the reply.

I understand the limitations of an embedded database. Certainly, things would be simpler with just one file connection and many applications will be fine with this.

For me, however, I have multiple processes with one writer plus one or two readers. The data is purely sequential with no complex relationships, and the writer will be updating once a minute or so, and I was therefore hoping that a simple, efficient, local NoSql solution would have done the trick. If your solution will work with one writer plus multiple readers, then it would work for me perfectly. However, without that I may have to re-think things.

In any case, thank you for writing such a great piece of software. Have been looking around the source code and it's certainly well designed and implemented. Also learned some new stuff.

kuiperzone on 5 Dec 2016

Hi @kuiperzone, in your case you can:

1) Use v2.0.4 - all write locks are based on FileStream - keep the datafile open for as short a time as possible. It can handle multiple processes.

2) Use v3 - Keep a single instance for writing. Use other instances in "ReadOnly" mode, open for as short a time as possible. In this case, it's possible for a read-only connection to read at the same time as the write instance is writing. Be prepared to read uncommitted data (try using a bigger Cache Size parameter or no journal).

mbdavid on 5 Dec 2016

Hi,

Thanks very much indeed for the considered reply. It's possible I could still use LiteDB, but I would need to understand its limitation better than I do currently. Thought I'd explain where I'm coming from on this and share my thoughts on LiteDB. There are some questions as well:

Previously my application used its own file format to store data, but I'm in the process of re-writing and wanted to try a different approach. I've read a bit about SQLite over the years and understood it could handle a limited number of concurrent connections. However, I wanted a NoSQL solution and assumed LiteDB would be similar with respect to multiple connections. Indeed, the initial v1/2 documentation suggested that it could handle this.

My application has one long running writer process which is adding data at a regular interval. I expect a single write once per minute from it, but it could potentially be higher. I don't expect to be writing continuously or intensively.

I have one long running reader process which would expect to see updates once they are written to file by the writer. Additionally, I have a requirement for one or two intermittent reader processes as well. It is fine that a reader be temporarily blocked from reading while the writer updates, but clearly, I never want to see a scenario where the files are corrupted or a process fails in an undefined state.

Now...

> Use v3 - Keep single instance to write. Use others instances with "ReadOnly" with minimum open time as possible. It's possible occurs, in this case, read only connection read at same time on write instance are writing. Be prepared to read uncommitted data (try use bigger Cache Size parameter or no journal).

OK. Sounds promising. For this to work, I need to understand what is meant by "be prepared to read uncommitted data". What would happen if a writer process is updating an area of the file while another process is enumerating over the logical data?

Some kind of file locking with a timeout would be applicable here I guess, which is the way I assumed things would work as per v2. I could potentially use v2, but not sure I feel that committing a new application to the use of an old library is a good idea.

> Use others instances with "ReadOnly" with minimum open time as possible.

How do I guarantee that a long running reader process has minimum open time? How do I close the file? Do I call "Dispose()" on LiteDatabase?

I also see the following in the Wiki:

> Transaction are required to LiteDB works. If omitted in write operations, like Insert(), Update() and Delete(), LiteDB will create an auto transaction for each operation. This is a slow solution if you compare to use BeginTrans() at start and Commit() on end, because auto transaction writes on disk after each document change.

Is this still applicable for v3? Not sure I understand from previous communication what's being held in cache and what's not. I get the feeling a lot has changed in v3. (In my case, it would be possible for the writer process to call BeginTrans() before every update and Commit() after.)

Here are my general thoughts about LiteDB as they are now:

I love the NoSQL and embedded approach, especially if it is able to support limited multi-process concurrency (i.e. one writer only). My view is that if you find yourself looking at multiple writers then you need to be looking at a client-server solution, but 1 writer + multi readers would be extremely valuable if it would work in a safe and reliable way.

You might decide in v3 that multi-process concurrency simply can't be supported going forward. I could certainly understand this. In this case, it's something that should be made clear, especially as other solutions seem to offer it and it was (seemingly) offered in v2. This is a huge and breaking change for any existing applications that are reliant on it.

If it is possible that LiteDB could support 1 writer + limited n readers, however, that would be fantastic. However, you need to define and be clear about what the limitations are. I don't feel I could use a solution where multi-process concurrency "might work" or "might not" if I "tweak this" or "tweak that". It needs to either work within defined limitations, or not at all – so that users know from the outset to adopt a different approach.

Again, thank you for creating LiteDB and taking the time to get back to me.

I'm currently developing my project and will keep with LiteDB v3 for the time being. I'll aim to run some multi-process concurrency tests in the next few weeks and will let you know if I find any problems.

Cheers

Andy

kuiperzone on 6 Dec 2016

Hi @kuiperzone, to help you choose the best version of LiteDB, I will try to explain better how each version handles concurrency.

v1.0.4 - Each LiteDatabase instance opens the datafile with FileShare.ReadWrite. When one instance needs to write to disk, it uses FileStream.Lock()/Unlock() and no others can read/write. The writer instance changes the ChangeID parameter. When the next reader tries to fetch data, it checks the ChangeID in the header before starting. If this ChangeID is not the same as when the reader instance started, it clears all pages in cache to avoid a "dirty read". But the problem here is: if a reader instance is reading the datafile when a writer instance starts writing, the reader can read inconsistent data.

v2.0.4 - Now LiteDB always works as "closed file" by default. Every operation opens the datafile, reads/writes data, and closes the datafile, even if the LiteDatabase is still not closed (via IDisposable). There is no more cache: at the end of each operation, all pages are cleared to avoid dirty reads. Lock control is now based on how I open the datafile: readers are read-only and writers are exclusive. But the problem is performance. Opening/closing the datafile (and journal file) are very slow operations. v2 is slower than v1.

v3 - I changed how multiple processes access the datafile to improve speed, based on how users are using LiteDB (mobile/desktop/web apps). In almost all cases, the scenario is the same: a multi-thread (not multi-process) environment. So I decided to change from "multi-process" to "multi-thread" based data access.

If old apps are using the using statement, there is no change, but they will not get the best performance. LiteDB will block a second concurrent instance until the first is released (by Dispose). Users who decide to keep a single instance will get better performance (by keeping pages in cache and keeping the datafile (and journal) open), and will avoid the v1 problem of reading inconsistent data, since they use the same instance.

When I say keep the reader instance open for as short a time as possible, I mean do not keep a reader instance always open. Use new LiteDatabase(), read data, dispose. Doing this, you avoid (but do not eliminate) the problem of reading a page while the writer is writing it. This can throw an unexpected exception. Data is not corrupted, but it is in an inconsistent state.

After your considerations, if you want to use LiteDB, I recommend you use v3, but not as I told you before. Always use a normal instance, kept open for as short a time as possible, to read or write. Just create an instance, read or write data, and close. This serialized approach will guarantee that you will not have problems reading uncommitted data or an inconsistent state.

Keep in touch about how your tests go.

Mauricio

mbdavid on 6 Dec 2016

OK, I've been thinking hard about implementing my own solution from scratch, but this will be rather a non-trivial affair I fear.

This is such a pity, because if I had knowledge/control over where your code reads and writes, it would be easy for me to open a database-wide file lock when I'm about to write and read, and close it when I'm finished. This way, I could implement file concurrency. In fact, when I think about it like this, I'm not sure why you've abandoned it.

What I'm looking at doing is this:

Encapsulating LiteDB entirely in my own classes so that I create a new LiteDatabase and LiteCollection on every update and read operation I make, and when I've finished, discard and call Dispose(). Prior to doing this, I will open a file lock with an acquisition timer, so that the operation will wait for a timeout before giving up. (I think you used to do something like this in version 2, as there was a timeout option). ReadOnly mode will use a shared lock, and write mode an exclusive lock.
Updates and Find() of single documents will therefore be very inefficient. However, where I care about performance is on reading through the enumerator. To achieve this, when I call Find() I will have to read and cache in memory the results of a large block of documents before releasing the file lock. I will probably have to encapsulate your IEnumerable and IEnumerator results to do this.
This is a major ball breaker for me, so for the love of God, can I not appeal to you to implement a simple database-wide exclusive file locking mechanism when writing, and shared locking when reading with an acquisition timeout setting? This could be optional (so as not to affect the multi-threaded performance) and only used when configured with a file lock "timeout" value via the connection string to the LiteDatabase constructor.

Would this be a major thing to implement? If you do this, I promise to buy you a Latte via a PayPal donation or something. :)

Thanks and best wishes

Andy

kuiperzone on 8 Dec 2016

Just want to add that I see another user posting in the issues who has 10 processes accessing a database concurrently. If you completely abandon file concurrency, will this not be a breaking change for these users? Can you not support it optionally on a database wide level with a separate locking file?

PS. LiteDB looks to be best C# NoSQL embedded/local solution out there.

kuiperzone on 8 Dec 2016

Hi @kuiperzone, after this issue thread I'm re-thinking process concurrency. I read this nice explanation of how SQLite deals with concurrency:

https://www.sqlite.org/lockingv3.html

For SQLite, there is no difference between thread-safe (multi-thread) and process-safe (multi-process). But I don't know yet how to lock a file in so many ways like this (as far as I know, it's only possible to lock/unlock, and a locked file can't be read or written by others).

Also, I read about the Mutex class in C#, with which I can do locks between processes. This could be a good, simple solution.

About removing concurrency: this could be a breaking change in some scenarios, not all. I will update the documentation too, to warn users about that. It was also the main reason to change the major version from 2 to 3 (along with the file format).

mbdavid on 8 Dec 2016
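
A sketch of the Mutex idea mentioned above, assuming a hypothetical helper `CrossProcessLock` and a caller-chosen mutex name. A named `Mutex` is visible to other processes on the same machine, but it is always exclusive: it cannot express "many readers, one writer", and it does not work across machines.

```csharp
using System;
using System.Threading;

// Hypothetical helper, not part of LiteDB.
public static class CrossProcessLock
{
    // Runs `action` while holding a named, system-wide mutex.
    // Returns false if the mutex could not be acquired within `timeout`.
    public static bool TryRun(string mutexName, TimeSpan timeout, Action action)
    {
        using (var mutex = new Mutex(false, mutexName))
        {
            bool acquired;
            try
            {
                acquired = mutex.WaitOne(timeout);
            }
            catch (AbandonedMutexException)
            {
                // A previous owner exited without releasing; this thread now owns the mutex.
                acquired = true;
            }

            if (!acquired) return false;

            try
            {
                action();
                return true;
            }
            finally
            {
                mutex.ReleaseMutex();
            }
        }
    }
}
```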

@kuiperzone, I have been reading about other solutions for process safety and had an idea: use an additional -lock file to control concurrency.

When a process wants to only read data, it opens/creates this file in FileShare.Read mode and releases/deletes it when finished. Many processes can do this at the same time. When a process needs to write, it opens this file in FileShare.None mode. Only one process can do this, and only when no others are reading. It works like ReaderWriterLockSlim, but for multiple processes. All of this can be done inside the current Locker class.

And best of all, it could be optional (via connection string), and the user can still choose to access the datafile in exclusive mode like now (without creating any lock file).

I will run some tests this weekend in a new branch. ps: I prefer Caramel Macchiato... 😄

mbdavid on 8 Dec 2016

@mbdavid, Yes, my thinking was the use of an additional file to control locking on a database.

My thinking is that it should be possible to specify a timeout so that if a reader/writer can't lock the file because it is already held, it will continuously retry every x milliseconds or x microseconds or so before timing out and throwing an exception. If you use a locking file, rather than a mutex, it will also allow a database to be hosted on a shared disk provided it supports remote locking.

You should aim to keep things simple. However, if you could do it without adding much complexity or affecting performance (i.e. make shared file locking optional), it would be a hugely powerful feature.

I was beginning to think that when you release v3, I may end up adding file locking to your code for my own use. It would be easy for me to do on some methods, like Update() perhaps, but would require an in depth understanding before I could do it for IEnumerable read methods. So I'm glad you're re-thinking this. I'm wondering whether I could contribute to your project in some way. Perhaps with some concurrent unit tests? Or perhaps by adding a little documentation in places?

kuiperzone on 8 Dec 2016
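
The lock-file idea with an acquisition timeout, discussed in the last two comments, could be sketched roughly as follows. `LockFile`, its method names, and the 10 ms retry interval are all assumptions made for illustration; this is not LiteDB code. Readers open the lock file with `FileShare.Read`, a writer opens it with `FileShare.None`, and a failed open is retried until the timeout expires.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical helper: the returned stream is the lock; dispose it to release.
public static class LockFile
{
    // Shared lock: several readers may hold it at the same time.
    public static FileStream AcquireShared(string path, TimeSpan timeout)
    {
        return Acquire(path, FileAccess.Read, FileShare.Read, timeout);
    }

    // Exclusive lock: succeeds only when no reader or writer holds the file.
    public static FileStream AcquireExclusive(string path, TimeSpan timeout)
    {
        return Acquire(path, FileAccess.ReadWrite, FileShare.None, timeout);
    }

    private static FileStream Acquire(string path, FileAccess access, FileShare share, TimeSpan timeout)
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        while (true)
        {
            try
            {
                return new FileStream(path, FileMode.OpenOrCreate, access, share);
            }
            catch (IOException)
            {
                // Usually a sharing violation: someone else holds a conflicting lock.
                // Other I/O errors are also retried until the deadline in this sketch.
                if (DateTime.UtcNow >= deadline)
                    throw new TimeoutException("Could not lock " + path);
                Thread.Sleep(10); // retry interval chosen arbitrarily
            }
        }
    }
}
```

A reader would wrap its query in `using (LockFile.AcquireShared(lockPath, timeout)) { ... }` and a writer would do the same with `AcquireExclusive`. Whether this also holds on a network share depends on the file system honouring remote locks, as noted above.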

Hi guys, many changes made. I haven't finished yet, but my first tests are fine. In the dev branch, I added support for multi-process again (and kept multi-thread too).

Checkpoint was removed, but I'm studying how to implement it again. It's important when used in big transactions.

mbdavid on 12 Dec 2016

BUG: A LiteDatabase opened with "read only=true" in connection string allows documents to be inserted.

kuiperzone on 12 Dec 2016

It's changing: use Mode=ReadOnly | Exclusive | Shared (default)

mbdavid on 12 Dec 2016

OK. No worries. Just to confirm though - it's not changed yet? (Source still uses "read only").

Suggest you make it backward compatible with "read only", otherwise existing apps using "read only" in v2 will break.

kuiperzone on 12 Dec 2016

It's on the dev branch. v1/v2 don't have a read-only mode.

mbdavid on 12 Dec 2016

Upsert() now appears to be working. Good stuff!

It has a bool return result. I'm assuming this is true on success.

Question - in what scenario would it return false rather than throwing an exception?

PS. I find it a tad confusing that Upsert() and Update() return bool, whereas Insert() returns BsonValue.

kuiperzone on 12 Dec 2016

I've done a little multi-process concurrency testing with the dev branch. Basically, it doesn't work, not at least in the way I would expect. It works only in the way @mbdavid described previously, namely that the database must be disposed of between each read or write.

Here's what I do:

I run a writer process which opens a Collection in a DB and adds documents at a given interval, say every 50 ms.

I also have a reader unit test which launches the writer process then tries to read documents as they are being added. This test starts the writer process, sleeps for 1 second, and then enters a loop where documents are read with a call to FindAll(). It reads through the enumerator until MoveNext() is false, and then performs FindAll() again to get a new Enumerable object. The loop runs for 10 seconds and checks the document sequence number for the maximum value ever read.

Basically, the reader can only read those documents written to file before the reader LiteDatabase and Collection was instantiated. i.e. those written in the 1 second sleep prior to entering the read loop. Documents added while the read loop is in progress do not get picked up.

I would have expected that the action of creating a new Enumerable result by calling FindAll() would have re-read the disk contents, but this does not appear to be the case.

Am I missing something here?

kuiperzone on 14 Dec 2016

It's all about locking states. You can have N reader processes reading, but only 1 writer (in exclusive mode). So, if any process is still reading, writer processes wait to write to disk. They keep waiting until there are no more reading processes. When all have finished, a writer process can write. During writing, no one can even read. All others must wait for the writer process to finish.

If you start a FindAll() IEnumerable, you have a read process active until it finishes reading, so it's impossible to read more documents during the loop, because the write process will wait to write.

Tip: you must open all your instances (read/write) in Shared mode (default). Only shared mode uses AvoidDirtyRead() before any read to detect whether the file was changed.

Read about locking here: http://www.sqlite.org/lockingv3.html - LiteDB implements the same locking states.

mbdavid on 14 Dec 2016

@mbdavid Hi David, Thanks for reply.

Question - How do I finish reading?

I mean, what if I abandon a FindAll() enumerable/enumerator result? How then is the file lock released?

Or do I need to call Dispose() on the LiteDatabase reader instance to ensure that reading has finished?

kuiperzone on 14 Dec 2016

@mbdavid , I'm really confused here. In a previous comment, you say "use Mode=ReadOnly | Exclusive | Shared (default)"

So I am building a connection string terminating in "...;mode=readonly" and "...;mode=exclusive" etc.

However, in the LiteDB source I can find no code which reads these settings. ConnectionString.cs still uses "read only", and "mode" appears to be entirely ignored. I don't understand - how do I set exclusive?

kuiperzone on 14 Dec 2016

@kuiperzone, there is no need to close the database. Just use foreach or any LINQ method, which will dispose the enumerator. http://stackoverflow.com/questions/35796913/how-can-i-cancel-an-ienumerable
I made a unit test to check this with the LINQ First method

mbdavid on 14 Dec 2016
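
The disposal behaviour described above can be seen with plain C#, independent of LiteDB. In this sketch `Query` is a hypothetical stand-in for a query that holds a read lock while it is enumerated; its `finally` block plays the role of releasing that lock.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EnumeratorDisposal
{
    // Counts how many times the cleanup code ran.
    public static int Released;

    // Stand-in for a query that holds a lock while being enumerated.
    public static IEnumerable<int> Query()
    {
        try
        {
            for (int i = 0; i < 100; i++)
                yield return i;
        }
        finally
        {
            // Runs when the enumerator is disposed or runs to completion.
            Released++;
        }
    }

    public static void Demo()
    {
        foreach (int i in Query())
        {
            if (i == 2) break; // leaving the loop early still disposes the enumerator
        }

        int first = Query().First(); // First() disposes its enumerator as well
    }
}
```

After `Demo()` the cleanup has run once for the abandoned `foreach` and once for `First()`. Calling `GetEnumerator()` and `MoveNext()` by hand without `Dispose()` would leave the `finally` block pending.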

@kuiperzone, ok, I was not clear about the datafile open modes. This is new; I added it in the current version. So let's go through them:

Exclusive: Works exactly as v3-beta. Opens the datafile in exclusive mode. No other instance can even open the datafile. It's ideal if you are using global instances (as mentioned for v3-beta).
ReadOnly: Opens the datafile as read-only. Can't change anything. Does not create any external file (like the journal). It's useful if you have only reader processes or have no permission to write to disk.
Shared (default): Multiple instances can be open at the same time. All instances can read/write data. It works like v1/v2 but has smart lock control (from SQLite). Uses disk locks to control lock states.

The current dev version is not finished yet. ReadOnly and locks must be finished and implemented in NetStandard.

mbdavid on 14 Dec 2016

@mbdavid

> Current dev version are not finish yet.

OK. So only Shared (default) appears to be implemented in the dev version I downloaded a couple of days ago. This is fine of course, so long as I understand correctly.

> Just use foreach or any LINQ that will dispose IEnumerable.

I see!

Now, I did actually check the documentation for IEnumerator before I asked the question -- IEnumerator does not inherit IDisposable.

However IEnumerator<> does. Strange there's a difference between the generic and non-generic variants. I missed that.

kuiperzone on 14 Dec 2016

@kuiperzone, you definitely must use only shared mode in all instances. The other modes are not for your case.

Yes, I never checked the non-generic version of IEnumerator... but LiteDB uses only the generic version.

mbdavid on 14 Dec 2016

Can I be removed from this mail thread, please? You may also delete all my issues. And could someone state on the main doc page (README) that CF is not supported?

Maybe in the future I'll submit a pull request which may contain a CF helper. :)

DeanVanGreunen on 14 Dec 2016

I can remove you, but you can remove yourself. Click "Unsubscribe" in the issue notifications.

mbdavid on 14 Dec 2016

@mbdavid , OK it's working. But I have to dispose of LiteDatabase after every write operation in the writer process in order for them to be picked up by the reader.

Not sure I should have to do this. Am I missing something?

UPDATE: I'm adding data with Upsert(). Is it automatically being flushed/committed to file?

kuiperzone on 14 Dec 2016

Hi @mbdavid, I replaced the v2 dll in the bin directory with v3 and the site did not start.
In code I used using (LiteDatabase(fn)); the start page reads data and shows it. Is it probably a different file format?
Also, in a small new test I stored 200 MB of data on disk (a list of objects, in only one collection). This is a Bll class from a not-so-small app, with Lists as properties of the objects in the stored collection. The next method opens the database and tries FindAll().ToList(), and I got an error (Exception not handled by user...). Is that not a good sign for v3? We want to start using it and are now testing it with our app...

qart2003 on 17 Dec 2016

Hi @qart2003, is your datafile from v2? v3 uses another format. If you want to upgrade from v2 to v3, use "upgrade=true" in the connection string (first time only) or call LiteEngine.Upgrade. This will create a backup file and convert your documents from v2 to v3.
About your other problem, please create a new issue with more information so it can be tracked better.

mbdavid on 17 Dec 2016

@mbdavid Thanks, I will try. The second part is ok, it was my funny error

qart2003 on 17 Dec 2016

@kuiperzone yes, after each Commit, all changes are written to disk. If you are not using transactions, an "auto transaction" is created for each operation.

After making all the changes to work with a shared datafile, I realized that NetStandard has no Lock/Unlock command in 1.x (it will be available only in v2). So the shared datafile will work only in the NET35 build. I tried doing it with an external -lock file and that didn't work either. All changes are in the master branch again and I will continue from there.

mbdavid on 19 Dec 2016

Litedb: Concurrency question
