I'm writing a feature that can switch between TCPSocket and OpenSSL::SSL::Socket::Client; the instance variable's type is the union of these classes.
The key point is that the server (an external library) dictates which type to use in its first message.
require "socket"
require "openssl"
require "json"
class Connection
  alias Sock = TCPSocket | OpenSSL::SSL::Socket::Client

  @socket : Sock = TCPSocket.new "localhost", 4222

  def initialize
    json = JSON.parse @socket.gets.to_s
    if json["tls_required"] == true # did not work without "== true"
      context = OpenSSL::SSL::Context::Client.new
      context.verify_mode = OpenSSL::SSL::VerifyMode::NONE
      @socket = OpenSSL::SSL::Socket::Client.new @socket, context, true
    end
    # The problem is here
    @socket.write "Hello, world!\r\n".to_slice
  end
end
Connection.new
p :connected
gets
#write on the SSL socket requires to_slice; TCPSocket works even with plain strings.
1.a) If @socket is a TCPSocket, the server receives the message immediately.
1.b) If @socket is the SSL socket, it buffers writes and does not send anything until #flush is called explicitly:
loop {
  @socket.write "Word\r\n".to_slice
  sleep 0.5
}
did not send a single message for several minutes.
2.a) Using #unbuffered_write on the SSL socket works fine.
2.b) #unbuffered_write on TCPSocket raises: private method 'unbuffered_write' called for IO::FileDescriptor.
Of course I can use a workaround like #as and always check before writing whether the server requires SSL, but that is slower. Or I could keep separate instance variables for the sockets, but I expected the sockets to share the same interface.
Am I missing a better solution? Or is this behaviour considered acceptable for now?
Also, Ruby works similarly in both cases: TCPSocket#write and OpenSSL::SSL::SSLSocket#write.
1) I am certain you are wrong that TCPSocket#write accepts strings. IO#write always takes a slice. If you want to send a string, use io << "string" or even io.puts "string" if you want to get line endings automatically.
2) I'm not sure why TCPSocket doesn't buffer your write; it should. Always call flush when you need to flush the buffered data to the network.
3) unbuffered_write should be private on both.
This is the correct way to use IO:
@socket << "Hello, world!"
@socket.flush
I also suggest defining @socket : IO instead of the more specific subclasses, to keep the interface cleaner and strictly generic to all IO types.
unbuffered_write for TCPSocket is inherited from IO::Buffered and is indeed private.
unbuffered_write for SSL sockets is inherited from OpenSSL::SSL::Socket for some reason, and it is public.
#write for TCPSocket really does use slices, my mistake. But it is unbuffered and behaves the same as Ruby's TCPSocket... I'm a bit confused. Do you mean that Crystal's TCPSocket#write should be buffered and behave the same as the SSL client socket?
# Ruby server
require 'socket'
server = TCPServer.new 'localhost', 4000
client = server.accept
p client.gets
# Crystal client
require "socket"
client = TCPSocket.new "localhost", 4000
client.write "Hello, world!\r\n".to_slice
gets
Server accepts the string immediately when client connects.
I use #write instead of #puts, and I use "\r\n" just to be 100% sure the server always receives two bytes as the line ending, not "\r" nor "\n" alone. In my case the Mac/Unix style does not work.
Also, versions.
$ crystal -v
Crystal 0.23.1 [e2a1389] (2017-07-13) LLVM 3.8.1
$ uname -a
Linux pc_name 4.13.0-1-amd64 #1 SMP Debian 4.13.13-1 (2017-11-16) x86_64 GNU/Linux
Debian 10 Buster
fwiw puts won't drop a trailing \r nor add a second \n to a trailing \n. << will not modify your string at all.
@SlayerShadow yes, TCPSocket should be buffered. I said earlier i'm not sure why it's not. Needs more investigation.
And both unbuffered methods should be private, it's just we missed a private on the SSLSocket one I think.
@jhass as I remember, Ruby's puts appends the trailing line ending (if it isn't already there) separately from the string, just before sending. The difference shows up in async code, something like this (a Crystal example):
2.times { |i|
  spawn {
    loop {
      io.puts "Test #{i}"
      sleep rand
    }
  }
}
io could send the string in 4 different ways:
1) "Test 0\nTest 1\n"
2) "Test 0Test 1\n\n"
3, 4) The same, but in the opposite order.
To fix this, it was recommended to use write and send the whole string with the line ending included. I cannot reproduce it in Crystal (maybe because of fibers).
@RX14 thank you for the advice, just one question please. Why don't you want to keep the Ruby behaviour for the write method?
This behaviour has been around for a looong time, since Crystal 0.19 or earlier. And it is (was?) very useful and convenient, and familiar from Ruby. I (and maybe not only I) have used TCPServer/TCPSocket#write in almost every project where I used Crystal with sockets. If you plan to "fix" it, that may mean extra work before updating to the new language version.
An SSL socket probably goes through different buffers, one at the Crystal level (IO::Buffered) and another one at the OpenSSL level, which we don't really have control over.
Anyway, you must flush explicitly to tell the program that you have finished writing data and want it to be sent. A computer program isn't omniscient and can't infer what you expect; a buffered IO will only flush by itself when, for example, the buffer is full or the IO is closed. It doesn't know when you have finished writing your message, so you must be explicit.
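The same buffered-IO behaviour is easy to observe in Ruby, the language this thread keeps comparing against. A minimal sketch using an in-process pipe, with sync disabled to force userspace buffering (all variable names are local to this example):

```ruby
# With sync disabled, a small write stays in the userspace buffer
# until #flush actually pushes it into the pipe.
r, w = IO.pipe
w.sync = false

w.write "Hello"        # goes into the write buffer only
begin
  r.read_nonblock(5)   # nothing on the pipe yet
  data_before_flush = true
rescue IO::WaitReadable
  data_before_flush = false
end

w.flush                # now the bytes actually hit the pipe
data_after_flush = r.read_nonblock(5)

puts data_before_flush # => false
puts data_after_flush  # => Hello
```

Without the flush, the reader would block (or, as here, see nothing in a non-blocking read) no matter how long it waits, which is exactly the SSL-socket symptom described above.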
Ok, thanks, I understand you. But even the documentation describes the usual behaviour:
https://crystal-lang.org/api/0.23.1/TCPSocket.html
# A Transmission Control Protocol (TCP/IP) socket.
# Usage example:
require "socket"
client = TCPSocket.new("localhost", 1234)
client << "message\n"
response = client.gets
client.close
There is also no flush needed there. I am sure the same thing is described in Ruby's documentation.
I love Ruby and Crystal and I like their similar syntax and behaviour. And I hope you don't treat my comparisons as if I expected both languages to be absolutely identical.
Why do you think the programmer must always strictly care about the buffer? The example above looks friendlier. I think that if you send data to the channel, that data should be transmitted immediately; that is what one expects. I don't need to call "flush" when I use pp or puts on stdout.
If, for example, my program works with chunked data and pushes a bit of information to the channel at a time, I can manage the buffer manually. For this case Ruby has a sync flag, which is true by default. If it is set to false, the socket starts filling its buffer and needs a "flush" to send the data.
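The Ruby behaviour described here can be demonstrated directly. A minimal sketch using a loopback server on an OS-assigned port (a self-contained example, not code from this thread):

```ruby
require "socket"

# Loopback server on an OS-assigned port, just to get a real socket pair.
server = TCPServer.new("127.0.0.1", 0)
client = TCPSocket.new("127.0.0.1", server.addr[1])
peer   = server.accept

default_sync = client.sync  # Ruby sockets have sync enabled by default

client.write "ping\n"       # with sync = true this hits the wire immediately
first = peer.gets

client.sync = false         # now writes are buffered in userspace...
client.write "pong\n"
client.flush                # ...until we flush explicitly
second = peer.gets

puts default_sync # => true
puts first        # => ping
puts second       # => pong
```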
My mistake, the documentation uses <<. Anyway, that is for strings: if I have to send binary data with data.to_slice, I must use write, which is buffered by default.
No, all IO writing methods rely on #write(bytes); they're all buffered as soon as IO::Buffered is included, which is the default for File and Socket (and their descendants).
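The Ruby side behaves analogously: #<<, #print, and #puts all funnel into the same underlying #write, so they share one buffer. A quick check with StringIO (an illustration of the general IO design, not of Crystal's internals):

```ruby
require "stringio"

io = StringIO.new
io << "a"        # #<< returns io, so calls can be chained
io.print "b"     # #print writes the argument as-is
io.puts "c"      # #puts appends "\n" since the argument has none

puts io.string.inspect # => "abc\n"
```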
So, to get similarly unbuffered behaviour when sending both strings and binary data, what is the correct way?
If I need to push strings into sockets, I must use
socket << string
If I need to push binary (Bytes), I must use
socket.write(slice)
socket.flush
Or for the common case:
socket.write(string_or_slice.to_slice)
socket.flush
Right?
sync is true in sockets, so flush is not needed. Maybe for OpenSSL a manual flush needs to be inserted in the SSL code, but it is currently missing. If sync is true (the default), no flush should be needed from users.
Where is sync set for sockets? I can't find it.
Found it: here. Not sure if we want to keep that behaviour personally...
@larubujo of course, if sync is true then there is no need to call flush manually. The problem is that the SSL socket either does not have sync set to true or it does not work correctly. I guess SSL does not manage the sync flag, judging by @RX14's and @ysbaddaden's answers.
@RX14 then what do you plan to do? Remove it completely, move it into some superclass (like IO), move it into a superclass with false by default, or something else? What should I expect in the next releases? :)
@SlayerShadow Expect most IO to be buffered by default.
@RX14 ugh, I have no idea why self.sync = true is default in Socket, that's totally defeating the purpose. We have to get rid of it.
Ok, thank you all for your support.
@SlayerShadow we'll obviously keep sync = true as an option, but we'll make it never be the default.
I expect this change to subtly break a few things though. I'll make this change as part of a rollup PR of misc. IO refactors off the back of the IO::FileDescriptor PR and reviews.
I expect this change to subtly break a few things though.
Thanks for the mention. That's what I noticed earlier when I said "I (and maybe not only I) used TCPServer/TCPSocket#write in almost all projects where I used Crystal on Sockets". Maybe Crystal itself uses it as is.
Besides, I'm planning to make this an option in my library too. I guess the implementation will depend on the native Crystal implementation and should apply to both TCPSocket and the SSL socket.
Ruby's socket has sync true too. The idea is to send data as soon as you write it: more intuitive, no need to flush. Crystal does the same.
I mean, Ruby's socket has sync true by default.
Perhaps a good heuristic is to call flush whenever we read(). It could be confusing and brittle, though.
Coming back to this after a year, I'm interested as to how opinions have changed. I think we all agree that sockets being sync by default is a good idea, but then how should OpenSSL::SSL::Socket behave? As we've seen with the zlib reader/writers, wrapper IOs need to be buffered for good performance. But doing this leads to confusing behaviour such as this.
One option I can think of is to have a convention for wrapping IOs to copy sync status from the underlying IO in initialize.
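As a rough illustration of that convention, here is a hypothetical wrapper in Ruby (WrapperIO and its buffering logic are invented for this sketch; this is not an actual Crystal or Ruby API): it copies the wrapped IO's sync status at construction and honours it on each write.

```ruby
require "stringio"

# Hypothetical wrapper IO illustrating the proposed convention:
# inherit the sync status of the underlying IO at construction time,
# and honour it on every write.
class WrapperIO
  attr_accessor :sync

  def initialize(io)
    @io     = io
    @buffer = +""
    # Copy sync from the underlying IO when it exposes one.
    @sync   = io.respond_to?(:sync) ? io.sync : false
  end

  def write(data)
    @buffer << data
    flush if @sync          # sync mode: pass everything through immediately
    data.bytesize
  end

  def flush
    @io.write @buffer
    @io.flush if @io.respond_to?(:flush)
    @buffer.clear
    self
  end
end

plain = StringIO.new        # StringIO reports sync = true
w = WrapperIO.new(plain)
w.write "hello"             # forwarded immediately, since sync was copied
puts plain.string # => hello
```

With this convention, wrapping a sync TCP socket would yield a sync SSL socket, while wrapping an explicitly buffered IO would keep the wrapper buffered too.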
@RX14 I like that idea.
Was this fixed in https://github.com/crystal-lang/crystal/issues/7458 ?
I can confirm this
ssl_socket.print "hello"
doesn't actually send anything to the wire, whereas
tcp_socket.print "hello"
sends something to the wire immediately. Surprising...