This relates to v2 alpha, specifically something observed at random using commit 6c7e575bf (Thu Jun 28 21:10:18 -0400). It's been a one-off so far but I'll keep an eye out.
There would have been a few different things going on in Redis-land in the app when this occurred. This specific Leading up to it (real code, but boring stuff omitted):
``c#
// background: this code might happen on its own, or possible after the processing thread here has finished waiting viaWaitHandle.WaitAnyon a small array of wait handles, one of which is set via a_notifySignal.Set()called in aSubscribe` callback.
// get an item from a 'queue' (_database is a Lazy<>)
var itemId = (string)_database.Value.ListRightPopLeftPush(_itemList, _processingList);
// begin an operation to add a key indicating the item is being processed (TTL = when this times out)
// (by now we have a plain old reference to database. This is from a different method)
var addAttemptAndGetCount = database.ListLeftPushAsync(attemptExpiriesKey, expiry.Ticks);
// begin an operation to get the data for the item
var getData = database.StringGetAsync(itemDataKey);
// wait for the attempt-expiry operation to complete (get the number of items in the key = # of attempts to process)
var attempt = (int)await addAttemptAndGetCount;
// wait for the get-item-data operation to complete
var rawItem = (string)await getData; // here, the result is incorrect
The final return value here should be a JSON string, but was in fact the following string:
```c#
{"$type":"Redacted`1[[Redacted, Redacted]], Redacted","g":"e68aa639-afd0-490a-b5f7-5d0f3454dc56","q":"2018-07-03T11:13:17.3632568Z","i":{"$type":"Redacted","ContractId":"217672a3-6ef0-4630-bc69-705fcb7b1c36","TrustId":"105c9707-796c-4772-8ba0-056299a0ad7b","LastScore":0,"NewScore":0,"Threshold":56,"ContractName":"This is a test contract for a red flag",
*1
$4
PING
*4
$4
LREM
$44
redacted-ashsa
(Some class names etc are redacted. The JSON starts valid but is truncated)
Note that the data returned could well have been set moments before , so this could be just an outgoing buffer. However, of course we're expecting to read it back, so it could be an incoming buffer, though it seems from your comments @mgravell that the remaining data is an outgoing buffer (due to PING, not PONG).
This is an excellent report, thanks. I'm going to:
On Tue, 3 Jul 2018, 18:37 Kieren Johnstone, notifications@github.com
wrote:
This relates to v2 alpha, specifically something observed at random using
commit 6c7e575bf (Thu Jun 28 21:10:18 -0400). It's been a one-off so far
but I'll keep an eye out.There would have been a few different things going on in Redis-land in the
app when this occurred. This specific Leading up to it:// get an item from a 'queue'var itemId = (string)_database.Value.ListRightPopLeftPush(_itemList, _processingList);
// begin an operation to add a key indicating the item is being processed (TTL = when this times out)var addAttemptAndGetCount = database.ListLeftPushAsync(attemptExpiriesKey, expiry.Ticks);
// begin an operation to get the data for the itemvar getData = database.StringGetAsync(itemDataKey);
// wait for the attempt-expiry operation to complete (get the number of items in the key = # of attempts to process)var attempt = (int)await addAttemptAndGetCount;
// wait for the get-item-data operation to completevar rawItem = (string)await getData; // here, the result is incorrectThe final return value here should be a JSON string, but was in fact the
following string:{"$type":"Redacted`1[[Redacted, Redacted]], Redacted","g":"e68aa639-afd0-490a-b5f7-5d0f3454dc56","q":"2018-07-03T11:13:17.3632568Z","i":{"$type":"Redacted","ContractId":"217672a3-6ef0-4630-bc69-705fcb7b1c36","TrustId":"105c9707-796c-4772-8ba0-056299a0ad7b","LastScore":0,"NewScore":0,"Threshold":56,"ContractName":"This is a test contract for a red flag",1
$4PING4
$4LREM
$44redacted-ashsa(Some class names etc are redacted. The JSON starts valid but is truncated)
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
https://github.com/StackExchange/StackExchange.Redis/issues/875, or mute
the thread
https://github.com/notifications/unsubscribe-auth/AABDsIedLO1EmJ8qQEB7-l3lqgSxWCmLks5uC6vUgaJpZM4VBXNw
.
Anecdotally, it seems fine - I have a volume test locally that abuses a redis server with random large data:
Verified: 2000x500, 27ms/184ms
Verified: 2000x512, 13ms/165ms
Verified: 2000x600, 16ms/197ms
Verified: 2000x1000, 26ms/230ms
Verified: 2000x1024, 25ms/241ms
Verified: 2000x1200, 32ms/281ms
Verified: 2000x2000, 52ms/436ms
Verified: 2000x2048, 49ms/424ms
Verified: 2000x2400, 49ms/454ms
Verified: 2000x4000, 81ms/718ms
Verified: 2000x4096, 82ms/751ms
Verified: 2000x4800, 90ms/868ms
Verified: 2000x8000, 141ms/1479ms
Verified: 2000x8192, 144ms/1521ms
Verified: 2000x9600, 166ms/1793ms
Verified: 50x1048576, 484ms/3727ms
still trying to look, but it raises my optimism that it is already fixed :)
the bad news is: I can't see any relevant commits... code review time, I guess!
hmmm... question: when and how was itemDataKey set? re-reading your text, I'm confused; I had assumed that the error was in the write queue, but ... if the issue is with the data from itemDataKey - I can't see where that is written. The reason I'm suspicious of "write" here is because of the PING - we should never see PING in the "read" code - it would be PONG. The other possibility is that an active buffer from the write loop has somehow made it into the read space, or v.v.
also: the data when itemDataKey was set: was it a string, or a byte[], or?
Ah - now there's a point, I was thinking that it was the read that was failing, but of course you're right - it's probably in the writer.
The writer code is easier to share:
```c#
// note: _sub = Lazy<> resolved via GetSubscriber() on multiplexer object once in constructor of this class
public async Task WriteAsync(string queueName, T item)
{
var wrapper = new WorkQueueWrapper<T>(item); // wrapper class to help with serialisation
var text = wrapper.ToJsonString(); // just returns a string
var itemList = WorkQueueKeyFormatter.ItemList(_keyPrefix, queueName); // builds a key name
var itemData = WorkQueueKeyFormatter.ItemData(_keyPrefix, wrapper.Id); // likewise
var notifyChannel = WorkQueueKeyFormatter.NewItemNotifyChannel(_keyPrefix, queueName); // likewise, for publish below
var database = _database.Value; // lazy-loaded
await database.StringSetAsync(itemData, text); // do the write here! <--
await database.ListLeftPushAsync(itemList, wrapper.Id.ToString("N")); // add the ID to a 'queue'
_sub.Value.Publish(notifyChannel, "", CommandFlags.FireAndForget); // fire blank string to wake up readers / processors
}
```
at this point it could be either, neither, or both :) Can I assume that text here is string? (hard to be sure with var)
Yes it's a plain old string sorry - added some comments after the initial post.
OK, I'm drawing a blank on this one tonight. This is ... eesh. I can't repro it, but...
Getting more issues! Could be the same thing!
Debug.Assert(totalBytes == encodedLength); in private unsafe void WriteRaw(PipeWriter writer, string value, int encodedLength) is failing.
More screenshots and goodies incoming..
Edit: actually, that's probably it. I'll email you the actual string we're setting there but can't share publicly. After this happens, there are various protocol errors since things I presume get out of sync.
Call stack:
c#
PhysicalConnection.WriteRaw
PhysicalConnection.WriteUnified
PhsyicalConnection.Write
CommandKeyValueMessage.WriteImpl
PhysicalBridge.WriteMessageToServer
PhysicalBridge.WriteMessageDirect
PhysicalBridge.TryEnqueue
ServerEndpoint.TryEnqueue
ConnectionMultiplexer.TryPushMessageToBridge<bool>
ConnectionMultiplexer.ExecuteAsyncImpl<bool>
RedisBase.ExecutAsync<bool>
RedisDatabase.StringSetAsync
Note: my commit is different to the current version I see on the current pipelines branch (https://github.com/StackExchange/StackExchange.Redis/blob/f1003b471d82adf527aa7736834e4425b02f8394/StackExchange.Redis/StackExchange/Redis/PhysicalConnection.cs#L607).
Mine has charsRemaining -= charsRemaining; whereas your line 625 has charsRemaining -= Scratch_CharsPerBlock; which seems more correct..
Oops, no disregard - git history shows newest at the bottom, of course.
Could that be the bug - https://github.com/StackExchange/StackExchange.Redis/blob/53127b871ec5c7374e455ac43243bc90dd3d21ed/StackExchange.Redis/StackExchange/Redis/PhysicalConnection.cs#L787 ?
Looks like charsRemaining -= charsToWrite; would be more correct, and this would be triggered anytime value.Length > 512 and the span returned by GetSpan isn't big enough to do it in one go?
In terms of a test, perhaps first writing very small things so the span pool (showing my ignorance here) gets filled with smaller buffers - then throwing a big one in there - might trigger the problem?
You are a superstar, thanks; that really helps! And yes, I'm an idiot: charsRemaining -= charsRemaining; is clearly wrong.
Yay for a useful Debug.Assert!
I'm switching to a Convert-based implementation as part of the same fix; current approach is overly pessimistic about what it can encode
This should be fixed in https://www.nuget.org/packages/StackExchange.Redis/2.0.0-alpha.79 - if not, please let me know, and: thanks again for the report
Excellent, thanks very much :). Just a quick question - does it make sense to you that this would cause the initial "set" to push weird outgoing buffers into the key, i.e. the very original issue?
I suppose that if the Debug.Assert didn't stop the debugger, this line:
...would have advanced the writer past what was actually written, hence bug triggered?
Just so I can sleep easy :) Thanks again.
yes, the bug makes perfect sense from the error; bytes in the code you're talking about is just the prefix line - so for 172 bytes that would be $712\r\n, and bytes would be 6. The Advance for the actual data is inside the loop in WriteRaw(PipeWriter, string, int) - if this actually wrote 512 bytes, we can get in major trouble. Interestingly, I wonder if you by conincidence happened to get an alignment where there was also a CRLF after the next 200 bytes (712-512 with my made up numbers), because I would expect the server to verify the CRLF after the actual payload, but: yes, this symptom exactly matches this bug.
The Debug.Assert only applies in Debug builds - I have, for now, promoted this to an always enabled check / exception.
Hi @mgravell, after upgrading to 2.0.0-alpha.79 to fix the above bug, we're now getting a new error which looks like it could be related.
Our code
```c#
var items = await _database.Value.ListRangeAsync(processingList);
Causes the following exception
Unhandled Exception: System.IO.IOException: Unable to write data to the transport connection: Cannot access a disposed object.
Object name: 'System.Net.Sockets.Socket'.. ---> System.ObjectDisposedException: Cannot access a disposed object.
Object name: 'System.Net.Sockets.Socket'.
at System.Net.Sockets.Socket.BeginSend(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, SocketError& errorCode, AsyncCallback callback, Object state)
at System.Net.Sockets.Socket.BeginSend(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, AsyncCallback callback, Object state)
at System.Net.Sockets.Socket.SendAsyncApm(ArraySegment1 buffer, SocketFlags socketFlags)
at System.Net.Sockets.Socket.SendAsync(ArraySegment1 buffer, SocketFlags socketFlags, Boolean fromNetworkStream)
at System.Net.Sockets.NetworkStream.WriteAsync(Byte[] buffer, Int32 offset, Int32 size, CancellationToken cancellationToken)
--- End of inner exception stack trace ---
at System.Net.Sockets.NetworkStream.WriteAsync(Byte[] buffer, Int32 offset, Int32 size, CancellationToken cancellationToken)
at System.IO.Stream.WriteAsync(Byte[] buffer, Int32 offset, Int32 count)
at System.Net.Security.SslStreamInternal.StartWriting(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslStreamInternal.ProcessWrite(Byte[] buffer, Int32 offset, Int32 count, LazyAsyncResult asyncResult)
at System.Net.Security.SslStream.BeginWrite(Byte[] buffer, Int32 offset, Int32 count, AsyncCallback asyncCallback, Object asyncState)
at System.IO.Stream.<>c.
at System.Threading.Tasks.TaskFactory1.FromAsyncTrim[TInstance,TArgs](TInstance thisRef, TArgs args, Func5 beginMethod, Func3 endMethod)
at System.IO.Stream.BeginEndWriteAsync(Byte[] buffer, Int32 offset, Int32 count)
at System.IO.Stream.WriteAsync(Byte[] buffer, Int32 offset, Int32 count, CancellationToken cancellationToken)
at Pipelines.Sockets.Unofficial.StreamConnection.AsyncStreamPipe.WriteBuffer(Stream target, ReadOnlySequence1 data, String name) in C:codePipelines.Sockets.UnofficialsrcPipelines.Sockets.UnofficialStreamConnection.AsyncStreamPipe.cs:line 142
at Pipelines.Sockets.Unofficial.StreamConnection.AsyncStreamPipe.
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadPoolWorkQueue.Dispatch()
Continuing with the Visual Studio Debugger prompted me to open the code file for `ConnectionMultiplexer.cs` which I did by cloning this repo and checking out to commit `c8025c226d38b2934a768702b19945234ec3acef`.
The code was throwing a exception on line 1979
StackExchange.Redis.RedisConnectionException: 'No connection is available to service this operation: LRANGE prism-jamessaff_wqproc_PRSM_Development_evt; String encode went negative; IOCP: (Busy=4,Free=996,Min=8,Max=1000), WORKER: (Busy=5,Free=32762,Min=8,Max=32767), Local-CPU: n/a'
With the inner exception of
RedisConnectionException: InternalFailure on redacted.redis.cache.windows.net:6380/Interactive, origin: WriteMessageToServer, outstanding: 1, last-read: 0s ago, last-write: 0s ago, unanswered-write: 0s ago, keep-alive: 60s, state: ConnectedEstablished, mgr: 0 of 10 busy (10 max), in: 0, last-heartbeat: never, last-mbeat: 1s ago, global: 1s ago
Upon rerunning our code several times, we also seem to get the `String encode went negative` message when using `WriteAsync`.
An example
StackExchange.Redis.RedisConnectionException
HResult=0x80131500
Message=InternalFailure on SET
Source=System.Private.CoreLib
StackTrace:
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
Inner Exception 1:
InvalidOperationException: String encode went negative
```
After upgrading to 2.0.0-alpha.86 the problem still persists. Do you have any ideas? If there's any other information you'd find useful just let me know.
TL;DR: String encode went negative is being thrown for multiple commands
Well shoot
On Thu, 5 Jul 2018, 11:07 James Seden Smith, notifications@github.com
wrote:
Hi @mgravell https://github.com/mgravell, after upgrading to
2.0.0-alpha.79 to fix the above bug, we're now getting a new error which
looks like it could be related.Our code
var items = await _database.Value.ListRangeAsync(processingList);
Causes the following exception
Unhandled Exception: System.IO.IOException: Unable to write data to the transport connection: Cannot access a disposed object.
Object name: 'System.Net.Sockets.Socket'.. ---> System.ObjectDisposedException: Cannot access a disposed object.
Object name: 'System.Net.Sockets.Socket'.
at System.Net.Sockets.Socket.BeginSend(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, SocketError& errorCode, AsyncCallback callback, Object state)
at System.Net.Sockets.Socket.BeginSend(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, AsyncCallback callback, Object state)
at System.Net.Sockets.Socket.SendAsyncApm(ArraySegment1 buffer, SocketFlags socketFlags) at System.Net.Sockets.Socket.SendAsync(ArraySegment1 buffer, SocketFlags socketFlags, Boolean fromNetworkStream)
at System.Net.Sockets.NetworkStream.WriteAsync(Byte[] buffer, Int32 offset, Int32 size, CancellationToken cancellationToken)
--- End of inner exception stack trace ---
at System.Net.Sockets.NetworkStream.WriteAsync(Byte[] buffer, Int32 offset, Int32 size, CancellationToken cancellationToken)
at System.IO.Stream.WriteAsync(Byte[] buffer, Int32 offset, Int32 count)
at System.Net.Security.SslStreamInternal.StartWriting(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslStreamInternal.ProcessWrite(Byte[] buffer, Int32 offset, Int32 count, LazyAsyncResult asyncResult)
at System.Net.Security.SslStream.BeginWrite(Byte[] buffer, Int32 offset, Int32 count, AsyncCallback asyncCallback, Object asyncState)
at System.IO.Stream.<>c.b__55_0(Stream stream, ReadWriteParameters args, AsyncCallback callback, Object state)
at System.Threading.Tasks.TaskFactory1.FromAsyncTrim[TInstance,TArgs](TInstance thisRef, TArgs args, Func5 beginMethod, Func3 endMethod) at System.IO.Stream.BeginEndWriteAsync(Byte[] buffer, Int32 offset, Int32 count) at System.IO.Stream.WriteAsync(Byte[] buffer, Int32 offset, Int32 count, CancellationToken cancellationToken) at Pipelines.Sockets.Unofficial.StreamConnection.AsyncStreamPipe.WriteBuffer(Stream target, ReadOnlySequence1 data, String name) in C:codePipelines.Sockets.UnofficialsrcPipelines.Sockets.UnofficialStreamConnection.AsyncStreamPipe.cs:line 142
at Pipelines.Sockets.Unofficial.StreamConnection.AsyncStreamPipe.d__14.MoveNext() in C:codePipelines.Sockets.UnofficialsrcPipelines.Sockets.UnofficialStreamConnection.AsyncStreamPipe.cs:line 106
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadPoolWorkQueue.Dispatch()Continuing with the Visual Studio Debugger prompted me to open the code
file for ConnectionMultiplexer.cs which I did by cloning this repo and
checking out to commit c8025c226d38b2934a768702b19945234ec3acef.The code was throwing a exception on line 1979
StackExchange.Redis.RedisConnectionException: 'No connection is available to service this operation: LRANGE prism-jamessaff_wqproc_PRSM_Development_evt; String encode went negative; IOCP: (Busy=4,Free=996,Min=8,Max=1000), WORKER: (Busy=5,Free=32762,Min=8,Max=32767), Local-CPU: n/a'
With the inner exception of
RedisConnectionException: InternalFailure on redacted.redis.cache.windows.net:6380/Interactive, origin: WriteMessageToServer, outstanding: 1, last-read: 0s ago, last-write: 0s ago, unanswered-write: 0s ago, keep-alive: 60s, state: ConnectedEstablished, mgr: 0 of 10 busy (10 max), in: 0, last-heartbeat: never, last-mbeat: 1s ago, global: 1s ago
Upon rerunning our code several times, we also seem to get the String
encode went negative message when using WriteAsync.An example
StackExchange.Redis.RedisConnectionException
HResult=0x80131500
Message=InternalFailure on SET
Source=System.Private.CoreLib
StackTrace:
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()Inner Exception 1:
InvalidOperationException: String encode went negativeAfter upgrading to 2.0.0-alpha.86 the problem still persists. Do you have
any ideas? If there's any other information you'd find useful just let me
know.TL;DR: String encode went negative is being thrown for multiple commands
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/StackExchange/StackExchange.Redis/issues/875#issuecomment-402673691,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AABDsLm0Rw_C4EG4PFw3krVnuMvfntIEks5uDeVvgaJpZM4VBXNw
.
Wondering if this should somehow take charsRemaining as a param somewhere? (Second param)
@mgravell Just looking at this - shouldn't we need a Math.Min(charsRemaining, span.Length) on the 2nd arg there, else getting an over-sized span would cause it?
FWIW that seems correct to me/us too Nick .. though I'm confused as to how our library's tests all passed.. or the extensive ones you have did too! If it's always writing the whole span length, I would have thought almost every single string write would overflow, unless tiny spans were being returned each time. Weird
All of this just emphasizes that a single correct implementation of multi-buffer encode / decode is needed in the BCL. @davidfowl any ideas? I.e. can we hope for one?
@NickCraver oversized charCount is fine - if it can't use them, it tells you via out charsUsed; but yes, we need to pass in charsRemaining
Sorry for the inconvenience; if you can tolerate my braindead bugs, it should be fixed in https://www.nuget.org/packages/StackExchange.Redis/2.0.0-alpha.87 ... (I hope, whimper)
Thanks for this! We've just updated and so far it seems to be working fine. If anything else comes up we'll let you know :)
All of this just emphasizes that a single correct implementation of multi-buffer encode / decode is needed in the BCL. @davidfowl any ideas? I.e. can we hope for one?
Yes agreed we need to build out the ecosystem of types here so that bugs aren't replicated everywhere. Looking at these methods it looks similar to the https://github.com/aspnet/SignalR/blob/896c6692b4fdaee16e2fbbb26b86821091c32f71/src/Common/Utf8BufferTextWriter.cs that SignalR uses to write text to the underlying IBufferWriter<byte>.
I'm hoping once we have enough real things using pipelines, Memory<T>, Span<T>, ReadOnlySequnce<T> the obvious patterns will emerge. I know it's a bit annoying being an early adopter but we REALLY appreciate it 😄 .