Elasticsuite: Installing the module with Docker : suboptimal step to define ES location

Created on 16 Jan 2018  路  24Comments  路  Source: Smile-SA/elasticsuite

While installing the module on a dockerized development environment, the ES instance is not situated in localhost but in a custom alias. We should apply this step but it would require to set an environment specific value in a global setting...

Preconditions


Magento Version : any (but here : 2.2 EE B2B)


ElasticSuite Version : any


Environment : any


Third party modules : N/A

Tests done

  • I tried to use Magento 2.2 config:set before installing the module, but it complain the configuration does not exists (and we cannot install the module before adding this configuration !!)
  • I could set this value in env.php, according to new patterns from Magento but I wanted to commit this step somewhere, for easy dev env install
  • Eventually I just load the value with INSERT INTO vr_core_config_data (path, value) VALUES ('smile_elasticsuite_core_base_settings/es_client/servers', 'elasticsearch:9200') in an install script... but it's far from awesome :(

Ideas

I know it's not a easy thing to fix, but first, the documentation is wrong. It should at least use Magento 2.2 patterns...

In a perfect world, the module should be installable easily, and this configuration should not be used in the upgrade script, but at least in a separate command, or triggered when doing the first indexation...

enhancement

All 24 comments

I am currently evaluating the ability to provide an additional parameter es-hosts to the setup:install process.

This would allow to set the es host just in the same way you are setting up the mysql or the redis host and should resolve a lot of problems. Most of the integrator knows how to deal with this since they are doing it for MySQL by example.

What do you think of the proposed method ?

That would be great ! At least for an installation procedure... I don't know what are the impacts for an existing platform to run again the setup:install command...

But sticking to Magento behavior for Redis, MySQL, ... (even when not that great) is definitely the way to go.

We also need this configuration so that we can run our integration tests.

If you try to run the Magento integration test framework with this module installed, you will get an error like this:

[Progress: 495 / 913]
Module 'Magento_CatalogSearch':
Installing data...

  [LogicException]
  Indexer handler is not available: elasticsuite


Exception trace:
 () at /var/www/current/src/vendor/magento/module-catalog-search/Model/Indexer/IndexerHandlerFactory.php:88
 Magento\CatalogSearch\Model\Indexer\IndexerHandlerFactory->create() at /var/www/current/src/vendor/magento/module-catalog-search/Model/Indexer/Fulltext.php:144
 Magento\CatalogSearch\Model\Indexer\Fulltext->executeFull() at /var/www/current/src/vendor/magento/module-indexer/Model/Indexer.php:412
 Magento\Indexer\Model\Indexer->reindexAll() at /var/www/current/src/generated/code/Magento/Indexer/Model/Indexer/Interceptor.php:323
 Magento\Indexer\Model\Indexer\Interceptor->reindexAll() at /var/www/current/src/vendor/magento/module-indexer/Model/Indexer/DependencyDecorator.php:248
 Magento\Indexer\Model\Indexer\DependencyDecorator->reindexAll() at /var/www/current/src/vendor/magento/module-catalog-search/Setup/InstallData.php:45
 Magento\CatalogSearch\Setup\InstallData->install() at /var/www/current/src/setup/src/Magento/Setup/Model/Installer.php:867
 Magento\Setup\Model\Installer->handleDBSchemaData() at /var/www/current/src/setup/src/Magento/Setup/Model/Installer.php:791
 Magento\Setup\Model\Installer->installDataFixtures() at n/a:n/a
 call_user_func_array() at /var/www/current/src/setup/src/Magento/Setup/Model/Installer.php:342
 Magento\Setup\Model\Installer->install() at /var/www/current/src/setup/src/Magento/Setup/Console/Command/InstallCommand.php:146
 Magento\Setup\Console\Command\InstallCommand->execute() at /var/www/current/src/vendor/symfony/console/Command/Command.php:242
 Symfony\Component\Console\Command\Command->run() at /var/www/current/src/vendor/symfony/console/Application.php:843
 Symfony\Component\Console\Application->doRunCommand() at /var/www/current/src/vendor/symfony/console/Application.php:193
 Symfony\Component\Console\Application->doRun() at /var/www/current/src/vendor/magento/framework/Console/Cli.php:104
 Magento\Framework\Console\Cli->doRun() at /var/www/current/src/vendor/symfony/console/Application.php:117
 Symfony\Component\Console\Application->run() at /var/www/current/src/bin/magento:23

We workaround this during regular (non-integration-test-environment) install onto new environments using an env.php.dist, like this:

<?php
return array (
  'system' => 
  array (
    'default' => 
    array (
      'smile_elasticsuite_core_base_settings' => 
      array (
        'es_client' => 
        array (
          'servers' => 'elasticsearch:9200',
        ),
      ),
    ),
  ),
);

When Magento bootstraps its integration tests, there is a very limited set of options available for configuration (there is no equivalent for env.php). One of them happens to be additional parameters to setup:install which can be added under dev/tests/integration/install-config-mysql.php.

RabbitMQ implementation has provided the same method of implementation as the above, so I would suggest it's a good one to use.

This is a high priority for us (since it's breaking our integration tests) - would you consider accepting a PR, or is this likely to be picked up for development within the next week or so at Smile?

Hi @shavounet, @careys7

Find a way to update all ElasticSuite settings using env variables and with no code.

An extract of the docker compose file i am using in my dev environment (elasticsearch and php fpm services) :

elasticsearch:
    build: ./docker/images/elasticsearch/
    volumes:
      - ./docker/data/elasticsearch:/usr/share/elasticsearch/data:delegated
    ulimits:
      memlock:
        soft: -1
        hard: -1
    environment:
      - cluster.name=docker-cluster
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - "xpack.security.enabled=false"
    ports:
      - 9220:9200

  php:
    build: ./docker/images/php
    volumes:
      - ./:/var/www/html:delegated
    environment:
      - "CONFIG__DEFAULT__SMILE_ELASTICSUITE_CORE_BASE_SETTINGS__ES_CLIENT__SERVERS=elasticsearch:9200"

All config variable can be set using this way (not only ElasticSuite). So you should be able to change the number of replicas, the auth and everything else need.
Think we should update the doc waiting for the ability to set variables during install.

Notes :

  • Non-docker install can also benefits of this if you find a way to set variables (php-cli,
  • You need to clean all corresponding records in core_config_data if you want the environment variable to be applied

  • Field that are configured using this method are disabled in the admin configuration edit. Looks more secure to me.

@afoucret I couldn't get this to work with the integration test framework, eg:

dev/tests/integration/phpunit.custom.xml

<phpunit>
   <testsuites>
     ...
   </testsuites>
   <php>
       <!-- Add this to the bottom of the default PHP configuration node -->
        <env name="CONFIG__DEFAULT__SMILE_ELASTICSUITE_CORE_BASE_SETTINGS__ES_CLIENT__SERVERS" value="elasticsearch:9200"/>
    </php>
</phpunit>

dev/tests/integration/etc/config-global.php

<?php
return [
    'catalog/search/engine' => 'elasticsuite',
];

Running as follows:

$ vendor/bin/phpunit -c dev/tests/integration/phpunit.custom.xml --testsuite Custom

I still get this output:

Module 'Magento_CatalogSearch':
Installing data...

  [LogicException]
  Indexer handler is not available: elasticsuite


Exception trace:
 () at /var/www/current/src/vendor/magento/module-catalog-search/Model/Indexer/IndexerHandlerFactory.php:88
 Magento\CatalogSearch\Model\Indexer\IndexerHandlerFactory->create() at /var/www/current/src/vendor/magento/module-catalog-search/Model/Indexer/Fulltext.php:144
 Magento\CatalogSearch\Model\Indexer\Fulltext->executeFull() at /var/www/current/src/vendor/magento/module-indexer/Model/Indexer.php:412
 Magento\Indexer\Model\Indexer->reindexAll() at /var/www/current/src/generated/code/Magento/Indexer/Model/Indexer/Interceptor.php:323
 Magento\Indexer\Model\Indexer\Interceptor->reindexAll() at /var/www/current/src/vendor/magento/module-indexer/Model/Indexer/DependencyDecorator.php:248
 Magento\Indexer\Model\Indexer\DependencyDecorator->reindexAll() at /var/www/current/src/vendor/magento/module-catalog-search/Setup/InstallData.php:45
 Magento\CatalogSearch\Setup\InstallData->install() at /var/www/current/src/setup/src/Magento/Setup/Model/Installer.php:867
 Magento\Setup\Model\Installer->handleDBSchemaData() at /var/www/current/src/setup/src/Magento/Setup/Model/Installer.php:791
 Magento\Setup\Model\Installer->installDataFixtures() at n/a:n/a
 call_user_func_array() at /var/www/current/src/setup/src/Magento/Setup/Model/Installer.php:342
 Magento\Setup\Model\Installer->install() at /var/www/current/src/setup/src/Magento/Setup/Console/Command/InstallCommand.php:146
 Magento\Setup\Console\Command\InstallCommand->execute() at /var/www/current/src/vendor/symfony/console/Command/Command.php:242
 Symfony\Component\Console\Command\Command->run() at /var/www/current/src/vendor/symfony/console/Application.php:843
 Symfony\Component\Console\Application->doRunCommand() at /var/www/current/src/vendor/symfony/console/Application.php:193
 Symfony\Component\Console\Application->doRun() at /var/www/current/src/vendor/magento/framework/Console/Cli.php:104
 Magento\Framework\Console\Cli->doRun() at /var/www/current/src/vendor/symfony/console/Application.php:117
 Symfony\Component\Console\Application->run() at /var/www/current/src/bin/magento:23

I think there's a possibility that Magento\TestFramework\Application ignores these because of its different bootstrap process.

I've submitted a PR to add the command here: #693. This does seem work and allows me to run the integration tests.

I might consider raising the ENV issue with Magento as I think this situation is likely to repeat in other modules. Either I have messed up my configuration, or it doesn't seem supported.

For me it does work. I'm building this in an automated environment with Jenkins (but not yet using Magento Test Framework...)

Hi @careys7,

I will review your PR and proposed you some modification next week.

BR,

Hi everyone,

I have just pushed a new PR which is a complete refactoring of the ES Client instantiation.

Now the following parameters can be used with the setup:install command :

  • --es-hosts : List of servers (eg : elasticsearch, elasticsearch:9200, es-node1:9200,es-node2:9200
  • --es-enable-ssl : Enable SSL (https) for ES connection
  • --es-user and --es-pass : if using an authentication to access ES.

When using this the ES client config is disabled in the backend (typical security) :

capture d ecran 2018-01-31 a 13 48 47

If you want to update one of the setting, run :

bin/magento config:set -l smile_elasticsuite_core_base_settings/es_client/servers my-new-server:9200
bin/magento app:config:import

The patch is available on the master branch and will be part of the next stable release (2.5.0).

@afoucret thank you so much for your work on it!

But I'm afraid that the only one right place to store ES config is env.php file. I see no any difference between database/cache/session/search engine settings. So why 3 of them stored in the env.php but last - inside database/xml?

Some example about using DeploymentConfig https://github.com/renatocason/magento2-module-mq-amqp/blob/master/Model/Client.php#L35

// @AntonKril, @alankent

Hi @koc,

I fully agree with you and If I had to reboot the project now, I would use only the env.php file.

Now, using configuration instead of the env file is part of the project history. We don't want to hurt installed system by loosing existing config during upgrade.

So I choose what I believe to the worst decision (except for all the others I have considered) :

  • If set during install, the config is stored in the env.php file (under system/default)
  • It become not editable in backend, so no override can exist in the DB
  • You can update it using config:set (and locking) and it is still in the env.php file then

If you find a way to use only env.php file and migrate automatically existing setup, I will be very happy to merge some code !!!

@afoucret I think this approach works for both of our use cases:

  1. Different env.php for development and staging / production

In development we use a docker container, and in staging / production we use the AWS Elastic service to host this for us.

As such we need to specify a different endpoint which we can still do via env.php (we never configured this via config.php on previous versions too).

  1. Allow setup:install to execute during the execution of Magento's integration test framework bootstrap

In this case, we can't supply any env.php to Magento because the bootstrap for integration tests wants to create this itself. The only method of supplying the configuration here is to use the installer command which is what you've provided.

Your PR looks tidier and (as expected) more oriented with the project. Thanks for looking at this! We will upgrade to 2.5 once it's released.

@careys7 did you test already that everything was working with AWS Elastic service ?

Last time we checked they were not exposing a "full" Elasticsearch engine : some index operations are not available through AWS which does not allow Elasticsuite to run properly. (see #379).

Imho this is still the case when using Elasticsearch 2.x versions, but they support a lot more of operations with ES 5.x.

Since we will support ES 5.x in the next release, I dont know if you tested it with AWS Elastic service @afoucret ?

@romainruaud I didn't test it.

But I have good hope that ES 5.x support bring us support of AWS ElasticSearch Service since the supoorted APIs are wider for version 5.5. To be more specific, the _termvectors API support was missing in version 2.x and is now supported.

@careys7 can you confirm everything work as expected using AWS managed ES servers ?

Yes that's correct - ran into the same issue with AWS. I understand we are going to use the one from ElasticSearch offered via AWS marketplace, but haven't checked where we are at with that yet.

I should have been clearer :) The main thing that concerns this issue is that I know we will need a different endpoint to what we use in development.

PS - if you are adding ES 5 support to 2.5 release I would imagine we will be keen to try this on the official AWS elasticsearch service. (Did I understand correctly?)

Yes you are correct.
You can test it out using 2.5.0 release.

Woah, that's awesome !

Btw @afoucret, I'm not sure I'm on the latest version, but I might have spotted something wrong (a space in the middle of the path) :
https://github.com/afoucret/elasticsuite/blob/322dfea1769303144c3c90c4a4ad3d180dc7bc9c/src/module-elasticsuite-core/Setup/ConfigOptionsList.php#L50

correct @shavounet

Many thanks for spotting this one.

I submit a PR immediately to fix this one.

Hi @afoucret,

Magento itself uses env.php for storing certain kind of configuration (such as enabled caches) and I would advice against using this method:

  • if you have more than one application server the env.php file is modified only on the machine that processed the request
  • env.php file should be set to R/O for production and web server user should not have write permissions
  • by using default.xml, you still can provide a custom location for ES and use configuration stored in DB
  • why we cannot use a simple defaults.xml to set the configuration? I realize that it is not the easiest solution for newbies and you would have to create a custom module but there are more scenarios where you might want to take advantage of Magento's built in multistore functionality (say using a different ES for each store for some reason)

What do you think?

Lukas

@bery check DevDocs - Deployment Configuration. I think env.php is the right place for connection settings.

env.php shouldn't be stored in VCS, but config.php (or related default.xml, other config XML files from modules) should be.

@careys7 I am not saying it should (and it shouldn't) yet consider following scenario - multiple containers handling requests and modifying env.php as it happens when you disable caches and then you will end up with inconsistent setup.

->request to modify/deploy env.php

  • Container1/AppServer1 -> env.php OK
  • Container2/AppServer2 -> env.php -> save & modify contents -> NOK -> inconsistent from other containers
  • ContainerN/AppServerN -> env.php OK

You can easily reproduce this behavior by disabling caches in Magento's backend.

@bery

IMHO, Elastic client config should be considered as an infrastructure setting, just as MySQL is.
By definition infrastructure settings are not subject to frequent change ... Changing the ES client is something that require to put a website offline since ou need to reindex all the things in the engine.

So you should be able to manage consistency between several nodes (or containers) just in the same way you are doing it for MySQL or Redis by the way. This kind of process should already be well industrialized ...

@afoucret I agree and I am in for storing the configuration in env.php. The caveat when configuration is stored in env.php is that it should not be editable from BE and/or the store config should take precedence (which is confusing and can be misleading). Would it be possible to remove the BE configuration?

I did more testing and if you provision env.php from template without having Magento installed first, even if you provide --es-hosts as a command line parameter, the module tries to look it up in the database.

Was this page helpful?
0 / 5 - 0 ratings