Currently, when Centralized Pipeline Management is enabled in Logstash, we need to set `xpack.management.pipeline.id` to the ids of all pipelines that the Logstash node will execute.
This means that when we want to add a new pipeline, we need to update the Logstash node configuration with the id of the new pipeline.
This is not ideal in use cases where the ELK cluster is shared by multiple tenants that can create their own Logstash pipelines via Kibana.
To address this limitation, it would be nice if `xpack.management.pipeline.id` supported wildcards when specifying the pipeline ids. Other approaches are possible; for instance, instead of keeping the `xpack.management.pipeline.id` configuration in the Logstash configs, it could be moved to the `.logstash` index in Elasticsearch.
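For illustration, the difference in `logstash.yml` might look like this (the wildcard line is the proposal, not a feature that existed when this issue was filed):

```yaml
xpack.management.enabled: true
xpack.management.elasticsearch.hosts: ["http://localhost:9200"]

# Today: every pipeline id must be listed explicitly, so adding a
# pipeline means editing this list and restarting the node.
xpack.management.pipeline.id: ["apache", "tenant1-logs", "tenant2-logs"]

# Proposed (hypothetical): wildcard patterns would pick up new
# tenant pipelines automatically, with no config change or restart.
# xpack.management.pipeline.id: ["apache", "tenant*"]
```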
:+1:
Was also looking for the same functionality.
It would be nice either to be able to use a prefix for all pipelines relevant to a Logstash host (for instance, if we had pipelines called `host1.source1` and `host1.source2`, it would help to query using `host1*`),
or even to have labels on the pipelines stored within Elastic that could be used to grab all pipelines tagged with the same label.
Looking at the code and the `.logstash` index, the pipeline names are only stored in the `_id` field,
so I wasn't able to query using a wildcard and suggest a quick PR.
There are a few solutions I could think of:
Was looking at setting this up today, and was quite perplexed by this limitation. From a user perspective, I would expect I could add whatever I want - it seems very unintuitive to have a nice interface that requires a change and service restart to take advantage of every time you want a new pipeline.
Hi,
I agree with the other comments.
Having to restart all Logstash processes each time you want to add a new pipeline is very time consuming.
I think we should have the possibility to give Logstash instances one or multiple tags from the Kibana web UI (production, staging, etc.).
These tags would also be applied to pipelines, so that each Logstash instance knows which pipelines it has to execute.
If we want some Logstash instances to stop executing a pipeline, we could just delete the tag from the Kibana web UI.
Hope this idea makes sense.
Best regards
Hi,
I agree with the above, and for some reason I actually thought this was already possible. I very much hope this will be implemented.
Best regards
Hi,
I agree with the above discussion; having to restart the Logstash process every time you add a new pipeline requires manual intervention. Having the new pipeline picked up the moment you deploy it in the centralized UI is what I was looking for.
Regards,
I agree with all of the above comments. We're looking at splitting our processing into multiple pipelines and chaining them together to reduce the size of the configuration file. I would like to avoid having to restart logstash in order to define a new pipeline.
Hi,
Totally agree. Restarting Logstash every time is not a convenient way to add new pipelines...
Wildcards or labels both sound like good ideas.
👍
Regards.
We have a scenario where we have 2 logstash clusters (DMZ/internal) but 1 elasticsearch cluster. We were testing CPM with the ability to deploy certain pipelines to a specific cluster and had tried to do it via wildcards by prefixing DMZ pipelines (dmz-) and internal (internal-), then trying to use wildcards per logstash cluster in xpack.management.pipeline.id with no luck.
After coming across this I hope the capability comes soon :)
Hello,
Any news about this? :)
This seems like a massive oversight in the implementation. According to the docs:
The pipeline management feature centralizes the creation and management of Logstash configuration pipelines in Kibana.
I don't believe this is an accurate statement, since creating a pipeline in Kibana doesn't bring it into existence on any Logstash nodes until you update xpack.management.pipeline.id in the config file and restart Logstash. If we need to roll out configuration changes to every Logstash server in order to create a new pipeline, CPM adds little to no value to the "create" operation of a pipeline (in fact, it makes it slightly more complex, since I need to manage file-based configuration as well as config stored in ES). It seems that based on the current implementation, CPM is only useful for updating existing pipeline definition and config on the fly.
For the time being, I'd suggest updating the docs to make them less misleading by making it clear that CPM doesn't allow you to fully create a Logstash pipeline (i.e. configuration updates and restarts of Logstash nodes are still required).
For me, the expected behavior of this feature would be something like this:
- Logstash nodes are configured to turn on CPM with `xpack.management` settings. Within these settings you should be able to include a list of tags/groups/categories of pipelines that the Logstash node will pull down (e.g. `dev-us-east`, `prod-us-west`)
- Create a Logstash pipeline via Kibana (API or UI) and tag/group/categorize this pipeline to specify which Logstash nodes I wish to run it on (e.g. `dev-us-east`)
- When the Logstash nodes next poll for new pipelines, the new pipeline is started by the nodes matching the tags/groups/categories. This should not require a restart of Logstash.

Currently the process for creating a new pipeline with CPM looks like this from what I can see:

- Logstash nodes are configured to turn on CPM with `xpack.management` settings and a hard-coded list of pipelines
- Create the Logstash pipeline via Kibana (API or UI)
- Update `logstash.yml` to include this new pipeline in the `xpack.management.pipeline.id` array
- Roll out the changes to the Logstash nodes using configuration management and restart each Logstash node
- Restarted nodes will pull config for the new pipeline from ES

For creating a pipeline, this is actually an additional step compared to simply managing all the pipeline config with a configuration management tool. It also means you need to repeat the same steps for anything other than a pipeline definition or pipeline setting update, such as renaming or deleting the pipeline.
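The tag-based node configuration imagined above might look something like this in `logstash.yml` (purely hypothetical; `xpack.management.pipeline.tags` is an invented setting, not a real Logstash option):

```yaml
xpack.management.enabled: true
# Hypothetical, invented setting: pull down every centrally managed
# pipeline tagged with any of these labels, instead of naming each
# pipeline id explicitly in a hard-coded list.
xpack.management.pipeline.tags: ["dev-us-east"]
```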
Very disappointing.
100% agree on this. It seems like a proper design re-think is needed. Deploying files each time seems so 1980s.
Hello,
Any update on this request? Having to restart the whole Logstash cluster every time we add a new pipeline is very limiting.
Is this a significant level of effort for this change? Going on a year :(
This would be very useful
Still waiting on this. It defeats the purpose of centralized management when you have to restart every logstash instance to add a new pipeline.
Doesn't seem to be any activity from the elastic team.
I believe I'm going to try to implement this myself.
What is everyone's preference on using tags versus wildcard ids?
Looking at the source, currently Logstash uses a call to the /_mget endpoint to retrieve pipelines.
This endpoint does not support searching. It only returns documents requested by exact id.
Additionally the id of the pipeline is stored in the _id field in the .logstash index. Wildcards cannot be used to search this field, it only supports exact matches.
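For reference, that lookup is roughly of this shape (a sketch of an `_mget` request by explicit ids; the exact body Logstash builds may differ):

```
GET /.logstash/_mget
{
  "ids": ["pipeline-one", "pipeline-two"]
}
```

`_mget` simply returns the documents whose `_id` values are listed; there is no way to pass a pattern.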
If we wanted to use wildcards in the `xpack.management.pipeline.id` list, there are two options:

1. Treat the entries in the `xpack.management.pipeline.id` list as regex patterns. A second request would then be made to Elasticsearch to retrieve the matching documents. The documents would not be retrieved with the first query, as this could be a very large list of documents. I do not believe most users will have thousands of pipelines created, but I've seen people do dumb things. Additionally, unless multiple scroll requests are used, the number of pipelines returned will be limited to the `index.max_result_window` setting in Elasticsearch. I'm not sure if there is a current limit on the number of results that can be returned by the `/_mget` call.
2. Store the pipeline id in a searchable field in addition to `_id`, or filter documents on the Logstash side. However, this requires additional updates to Kibana/Elasticsearch to ensure the field is added during pipeline creation and that any existing pipeline documents are updated.

As for the tags approach, initially the tags would be set in `logstash.yml`. There would be considerably more work in order for the Logstash tags to be managed in Kibana (since Kibana/Elasticsearch doesn't track Logstash servers beyond metrics). With this option we could leave the legacy setting in place, so that pipelines may still be specified by exact id while also including pipelines with matching tags. This option would require updates to Kibana/Elasticsearch to allow adding tags to the `pipeline_metadata` field.
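The pattern-matching step of option 1 could be sketched roughly like this (in Python rather than Logstash's Ruby, with invented function and variable names; the Elasticsearch round-trip to fetch the available ids is omitted):

```python
import re

def expand_wildcard_ids(configured_ids, available_ids):
    """Split the configured pipeline ids into exact ids and wildcard
    patterns, then expand the patterns against the ids that exist in
    the .logstash index (available_ids)."""
    exact = [p for p in configured_ids if "*" not in p]
    patterns = [p for p in configured_ids if "*" in p]
    # Translate each glob-style pattern (e.g. "dmz-*") into an
    # anchored regex, as option 1 proposes.
    regexes = [re.compile("^" + re.escape(p).replace(r"\*", ".*") + "$")
               for p in patterns]
    matched = [pid for pid in available_ids
               if any(r.match(pid) for r in regexes)]
    # Preserve order and drop duplicates.
    return list(dict.fromkeys(exact + matched))

# Example: a node configured with one exact id and one wildcard,
# resolved against ids found in the .logstash index.
print(expand_wildcard_ids(["apache", "dmz-*"],
                          ["apache", "dmz-fw", "dmz-proxy", "internal-db"]))
# -> ['apache', 'dmz-fw', 'dmz-proxy']
```

The resolved list would then be fetched by exact id, e.g. via the existing `/_mget` call, so only the matching documents are retrieved.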
In my opinion, as long as there is not a large number of pipelines then wildcard option 1 is easiest to implement as it only requires changes to Logstash.
Thoughts?
This issue has now been fixed and is available in master and 7.x. This fix will be released as part of Logstash 7.11.