Ksql: docs: Update recommended mode to run ksqlDB in Production

Created on 25 May 2020 · 19 Comments · Source: confluentinc/ksql

Is your feature request related to a problem? Please describe.

~~When running ksqlDB headless mode, REST endpoint is disabled completely.~~

~~While achieving the purpose of isolating resources and focus on queries execution, this disables important functionality that has to be reinvented otherwise, like:~~
~~- Info and Healthcheck REST endpoints not available~~
~~- SHOW ... and DESCRIBE ... statements are not allowed to inspect running queries.~~

Describe the solution you'd like

~~Instead of disabling the REST Endpoint, I'd like to have a restricted version of the REST API to enable monitoring and operation of running queries on Headless mode.~~

UPDATE:

From this discussion, the current recommendation is to use interactive mode in production so that the REST endpoints are available for monitoring and operations.
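
For illustration, a container healthcheck against a server in interactive mode could poll the `/info` and `/healthcheck` endpoints mentioned in this thread. This is a minimal Python sketch, assuming the server listens on ksqlDB's default `localhost:8088` and that `/healthcheck` returns a JSON body with a top-level `isHealthy` field. The helper names (`fetch`, `is_healthy`, `probe`) are illustrative and not part of ksqlDB.

```python
import json
import urllib.request

# Assumption: the server uses ksqlDB's default listener on localhost.
KSQLDB_URL = "http://localhost:8088"


def is_healthy(healthcheck_body: str) -> bool:
    """Return True only when a /healthcheck JSON body reports isHealthy == true."""
    try:
        payload = json.loads(healthcheck_body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("isHealthy") is True


def fetch(path: str, base_url: str = KSQLDB_URL, timeout: float = 5.0) -> str:
    """GET a path on the ksqlDB server and return the response body as text."""
    with urllib.request.urlopen(base_url + path, timeout=timeout) as response:
        return response.read().decode("utf-8")


def probe(base_url: str = KSQLDB_URL) -> bool:
    """Liveness-style probe: False on any connection error or unhealthy report."""
    try:
        return is_healthy(fetch("/healthcheck", base_url))
    except OSError:  # covers URLError, HTTPError, and socket timeouts
        return False
```

Calling `probe()` from a container liveness or readiness check, and `fetch("/info")` for version details, would only work in interactive mode, since headless mode does not expose these endpoints.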

The current docs at https://docs.ksqldb.io/en/latest/ need to be updated, especially this paragraph:

> ksqlDB CLI
>
> You can write SQL queries interactively by using the ksqlDB command line interface (CLI). The ksqlDB CLI acts as a client to ksqlDB Server. For production scenarios, you may also configure ksqlDB clusters to run in non-interactive "headless" configuration, which prevents access from ksqlDB CLI.

bug documentation

jeqo

All 19 comments

Thanks @jeqo,

Such a request makes total sense.

I think long term we need an authorisation model for ksqlDB. This will allow auth rules to be set to allow specific users / groups to perform specific types of operations. This would give the flexibility you require.

@purplefox, is this something already on your radar?

big-andy-coates on 26 May 2020

@jeqo I am curious - have you tried not running headless, but running with the REST API and securing it?

purplefox on 26 May 2020

@purplefox no, I'm running things without security enabled right now.

Although I find adding an authorization model for ksqlDB very useful, as @big-andy-coates suggests, it would be independent from a _restricted_ (not sure if I'm using the best term here) REST endpoint for headless mode, imho.

After looking at the REST endpoint, I can see how SHOW and DESCRIBE might be harder to expose, as those are included in the /ksql endpoint, but we could start with /info and /healthcheck, wdyt?

Maybe SHOW and DESCRIBE could be implemented similarly to ksql-print-metrics and run only on the ksql container side.

jeqo on 26 May 2020

I'm just a little confused about what you're asking here. You mention you are currently running headless, but you'd love some of the functionality that is currently available when using the REST API. So my suggestion is... why not use the REST API! ;)
But I guess I am missing something here.

purplefox on 26 May 2020

Sorry if I wasn't clear.

When running in headless mode, even though I don't want to mutate any entity inside the ksqlDB instance, I still want to be able to monitor health and describe what is running--even more so if this mode is supposed to be used in production.

With interactive mode I'd open up "everything" and have to run scripts _manually_ once the server has started to get things going.

jeqo on 26 May 2020

In what sense do you mean "open up 'everything'"? If you enable auth on the REST API you can prevent unauthorised access.

purplefox on 26 May 2020

I mean any statement on the /ksql endpoint, e.g. CREATE STREAM, etc.

Regardless of auth on the REST API, what would then be the way to monitor health/metrics and check readiness/liveness in headless mode?

jeqo on 26 May 2020

> Regardless of auth on the REST API, what would then be the way to monitor health/metrics and check readiness/liveness in headless mode?

They're not available when running headless. That's why I suggested you don't run headless as it appears that you want them :)

purplefox on 27 May 2020

Got it. I agree that would be the option to follow with the current version.

This is why I was proposing this as a potential enhancement to how applications run in headless mode.

jeqo on 27 May 2020

I'm still not following... What would be an enhancement? You mentioned that you'd like to have the REST API when running headless. I suggested that it sounds like you just want the REST API ;)

If you "enhance" headless by adding the REST API operations, how would that be different to the REST API?

purplefox on 27 May 2020

@purplefox thanks for your feedback. I appreciate you taking the time to follow the discussion.

If healthchecks will only be available in interactive mode:

- Would it be useful to be able to pass a queries file without forcing headless mode?
- Would it make sense to recommend interactive mode, instead of headless, when moving to production, given that no healthcheck is available in headless mode? Docs:

> For production scenarios, you may also configure ksqlDB clusters to run in non-interactive "headless" configuration, which prevents access from ksqlDB CLI.

jeqo on 27 May 2020

I'm sorry, I am still finding it really hard to parse this :(

Could you clarify again, as precisely as you can, what you are requesting?

purplefox on 27 May 2020

I have a bunch of queries in a queries-file and want to run them in a container (ksqldb-server + queries-file)--not a shared ksqlDB server that other users can access to run ad-hoc queries--and I want to monitor/operate this container (e.g. /health, /info, etc.)

If the queries-file property is set, headless mode is enabled, and therefore there is no healthcheck endpoint.

This is where the points I made before come from:

> If healthchecks will only be available in interactive mode:
>
> - Would it be useful to be able to pass a queries file without forcing headless mode?
> - Would it make sense to recommend interactive mode, instead of headless, when moving to production, given that no healthcheck is available in headless mode?

This is what doesn't feel right to me: I understand headless mode is recommended for production, but no healthcheck is available.
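
As a possible stopgap for this use case, the statements in a queries file could be posted one by one to the `/ksql` endpoint of a server running in interactive mode, which keeps `/info` and `/healthcheck` available. This is a rough Python sketch, assuming the default listener at `localhost:8088` and the `application/vnd.ksql.v1+json` content type. The helper names (`split_statements`, `build_request`, `submit_file`) are hypothetical, and the splitter is deliberately naive.

```python
import json
import urllib.request
from typing import List

KSQLDB_URL = "http://localhost:8088"  # assumption: default listener
KSQL_MEDIA_TYPE = "application/vnd.ksql.v1+json"


def split_statements(script: str) -> List[str]:
    """Split a SQL script on ';' outside single-quoted literals.

    Naive by design: whole-line '--' comments are dropped, other comment
    forms are not handled, and trailing text without ';' is ignored.
    """
    lines = [ln for ln in script.splitlines() if not ln.strip().startswith("--")]
    statements, current, in_quote = [], [], False
    for ch in "\n".join(lines):
        current.append(ch)
        if ch == "'":
            in_quote = not in_quote
        elif ch == ";" and not in_quote:
            statement = "".join(current).strip()
            if statement != ";":
                statements.append(statement)
            current = []
    return statements


def build_request(statement: str, base_url: str = KSQLDB_URL) -> urllib.request.Request:
    """Build the POST /ksql request for a single statement."""
    body = json.dumps({"ksql": statement, "streamsProperties": {}}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/ksql",
        data=body,
        headers={
            "Content-Type": KSQL_MEDIA_TYPE + "; charset=utf-8",
            "Accept": KSQL_MEDIA_TYPE,
        },
        method="POST",
    )


def submit_file(path: str, base_url: str = KSQLDB_URL) -> None:
    """Send every statement in the file to the server, in order."""
    with open(path, encoding="utf-8") as handle:
        for statement in split_statements(handle.read()):
            with urllib.request.urlopen(build_request(statement, base_url)) as response:
                print(response.status, statement.splitlines()[0])
```

Unlike headless mode, this leaves the `/ksql` endpoint open to other statements, so it does not by itself address the concern about opening up "everything".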

jeqo on 27 May 2020

> This is what doesn't feel right to me: I understand headless mode is recommended for production, but no healthcheck is available.

I think this advice is outdated, will clarify with the team.

purplefox on 27 May 2020

OK, it seems the latest advice is to use interactive mode for production :)

/cc @MichaelDrogalis

purplefox on 27 May 2020

@purplefox thanks for your feedback!

Looking forward to seeing the recommendations updated in the docs. I'll update the issue description to keep this open until the docs are fixed.

jeqo on 27 May 2020

@purplefox and others on the team:

Related to the same use case:
if running in interactive mode and wanting to block any operations other than an initial set of streams (potentially via queries-file), would a flag to disable DDL (CREATE...), DQL (SELECT...), and DML (INSERT...) make sense here (to follow up in a new issue)? Or will authorization be the way to go to restrict certain operations?
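
To make the idea concrete: until such a flag or an authorization model exists, the restriction could be approximated outside the server, for example in a thin proxy in front of `/ksql` that only forwards inspection statements. This is a minimal Python sketch of the check such a proxy might apply. The allowlist and the `is_read_only` name are hypothetical and not a ksqlDB feature.

```python
import re

# Hypothetical allowlist: statement types treated as read-only inspection.
READ_ONLY_KEYWORDS = ("SHOW", "LIST", "DESCRIBE", "EXPLAIN")


def is_read_only(statement: str) -> bool:
    """Return True when the statement's first keyword is in the allowlist."""
    match = re.match(r"\s*([A-Za-z]+)", statement)
    return bool(match) and match.group(1).upper() in READ_ONLY_KEYWORDS
```

With this check, `SHOW QUERIES;` and `DESCRIBE my_stream;` would pass, while `CREATE STREAM ...`, `INSERT INTO ...`, and `SELECT ...` would be rejected before reaching the server.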

jeqo on 27 May 2020

> if running in interactive mode and wanting to block any operations other than an initial set of streams (potentially via queries-file), would a flag to disable DDL (CREATE...), DQL (SELECT...), and DML (INSERT...) make sense here (to follow up in a new issue)? Or will authorization be the way to go to restrict certain operations?

Sounds reasonable - but could you open a new issue for this to avoid confusion?

purplefox on 27 May 2020

Ah, we had yanked this recommendation a couple of months ago, but it looks like we missed a page. Thanks for raising this - we've opened a PR for it.

@jeqo Your request makes sense. Thanks for opening a new issue for it.