Generator-jhipster: Multi Tenancy (MySQL and MongoDB)

Created on 10 Feb 2015 · 37 Comments · Source: jhipster/generator-jhipster

I don't know if this is the right place for this question, but inspired by another issue posted here, I would like to convert my app (created with JHipster) into a SaaS app. How can I convert a JHipster app into a multi-tenant application?
The behaviour of the application could be the following:

one MongoDB database with one document (with the same schema) for each company
one MySQL database with one schema for each company (I'm using MySQL for the OAuth authentication and other persistent information)
I would like to choose the appropriate schema/document with a URL parameter
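The last point, picking the tenant from a URL parameter, could be sketched roughly as below. This is only an illustration: the parameter name `tenant`, the class `UrlTenantResolver`, and the allow-list of known tenants are all assumptions, not part of JHipster.

```java
import java.net.URI;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper (not a JHipster class): reads the tenant from a
// "tenant" query parameter and accepts it only if it is a known tenant.
final class UrlTenantResolver {
    private final Set<String> knownTenants;

    UrlTenantResolver(Set<String> knownTenants) {
        this.knownTenants = Set.copyOf(knownTenants);
    }

    Optional<String> resolve(String url) {
        String query = URI.create(url).getQuery();
        if (query == null) {
            return Optional.empty();
        }
        for (String pair : query.split("&")) {
            String[] kv = pair.split("=", 2);
            // Only an exact match against the allow-list is accepted.
            if (kv.length == 2 && kv[0].equals("tenant") && knownTenants.contains(kv[1])) {
                return Optional.of(kv[1]);
            }
        }
        return Optional.empty();
    }
}
```

In a real application the resolved value would presumably also be checked against the authenticated user, since a URL parameter on its own can be changed by anyone.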

Thank you in advance for any suggestions or guidelines

fontanellif

Most helpful comment

We have an application that we share with multiple clients. Those are big clients for us.
So each of them has a dedicated domain that is served by our application, similar to what Jira does (they create a subdomain for each company, e.g. https://company.atlassian.net).
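As a rough illustration of the subdomain idea, the tenant can be derived from the request host. The class name `SubdomainTenantResolver` and the rule "exactly one label in front of the base domain" are assumptions made for this sketch, not code from the PoC.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper: maps "company.atlassian.net" to the tenant "company".
final class SubdomainTenantResolver {
    private final String suffix;

    SubdomainTenantResolver(String baseDomain) {
        this.suffix = "." + baseDomain.toLowerCase(Locale.ROOT);
    }

    Optional<String> resolve(String hostHeader) {
        String host = hostHeader.toLowerCase(Locale.ROOT);
        int colon = host.indexOf(':');
        if (colon >= 0) {
            host = host.substring(0, colon); // drop an optional ":port"
        }
        if (!host.endsWith(suffix)) {
            return Optional.empty();
        }
        String label = host.substring(0, host.length() - suffix.length());
        // Reject the bare domain and nested subdomains such as "a.b.example.com".
        if (label.isEmpty() || label.contains(".")) {
            return Optional.empty();
        }
        return Optional.of(label);
    }
}
```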

I agree with Julien Dubois that having a connection pool dedicated to each tenant is not scalable.
So I made a PoC using a single connection pool where the schema of the logged-in tenant is set whenever you do a getConnection(). The tenant ID is stored in the JWT token.
Liquibase updates all the schemas.
It's based on the current JHipster 4.10.
Source can be found here : https://github.com/hcouplet/jhipster-multi-tenancy-poc2

Also, to answer Julien's question about the point of having different schemas on the same DB, I see 2 main advantages:

Some of our clients are big companies and are concerned about the European GDPR coming in 2018. They asked us whether their data was separated from other clients' data. With different schemas we can easily say that they have their own DB schema, isolated from other clients.
On the coding side this makes things very easy: once you have implemented the framework as I show in my PoC, you don't have to do anything more. No tenantId in each table or query. So coding is faster, and you avoid most of the bugs we used to have, e.g. a tenant receiving an email linked to another tenant.

hcouplet on 26 Oct 2017

👍8

All 37 comments

If you put all your clients in the same schema, then there is nothing specific to do -> our default architecture will work for you.
If you want to separate schemas per client, I don't think this will be very scalable: it will cost more resources (connection pools, database resources...), and you will need to modify our architecture.

jdubois on 10 Feb 2015

@fontanellif Have you found any solution for multitenancy ?
I'm using JHipster too and the topic of multitenancy is a must have in our architecture.

jgasmi on 12 Mar 2015

It depends on what you call multi tenancy (I don't have the same definition), but now you can easily do a many-to-one relationship to your "user" entity, and hence separate all user requests.

jdubois on 12 Mar 2015

Thank you Julien for your answer.
I need to have a Landlord Schema (common schema) for all users and a Multitenant Schema (specific tenant schema).
When a user logs in, we check whether they are a landlord user. If so, they use the landlord schema; otherwise we switch them to their specific tenant schema.
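A minimal sketch of that login-time switch, assuming the chosen schema is kept per thread for the duration of a request. `TenantContext`, the schema name `landlord`, and the convention that `null` means "landlord user" are hypothetical choices for illustration only.

```java
// Hypothetical per-thread holder for the schema selected at login.
final class TenantContext {
    static final String LANDLORD_SCHEMA = "landlord"; // assumed name of the common schema

    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private TenantContext() {}

    // tenantSchema == null is taken to mean "this is a landlord user".
    static void onLogin(String tenantSchema) {
        CURRENT.set(tenantSchema == null ? LANDLORD_SCHEMA : tenantSchema);
    }

    // Falls back to the common schema when nothing has been selected yet.
    static String currentSchema() {
        String schema = CURRENT.get();
        return schema == null ? LANDLORD_SCHEMA : schema;
    }

    // Should be called when the request ends so pooled threads do not leak a tenant.
    static void clear() {
        CURRENT.remove();
    }
}
```

Something like a tenant identifier resolver would then read `TenantContext.currentSchema()` when a connection is requested.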

In JHipster, you are using Liquibase, which could handle the multitenant DDL generation. So it would be really helpful if we could add to the command "yo jhipster:entity XXX" a parameter that indicates whether the entity is multitenant. If so, it would be handled by the MultiTenantSpringLiquibase class, which applies the changelog to all tenant data sources.

Also, a multitenancy support option could be added to the JHipster generator, and if it is enabled you could suggest a default implementation of a MultiTenantConnectionProvider.

jgasmi on 13 Mar 2015

OK, but that means we would have a different database schema for each user -> so we would need a different connection pool for each schema. This will not scale up at all with JHipster.
I will not do it, simply because it's not going to work.

jdubois on 13 Mar 2015

Thanks @jdubois for your response, I have another question:
If @jgasmi's solution will not scale up with JHipster, what type of solution can we try to implement with JHipster?

Thanks

fontanellif on 1 Apr 2015

@fontanellif I would use a specific key, for example my SQL tables would have a foreign key to a "master" table.
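To make the foreign-key idea concrete, here is a small in-memory sketch: every row carries the key of its tenant, and every read filters on it. The `Invoice` entity and `InvoiceRepository` are invented for the example; in SQL the same thing would be a `tenant_id` column referencing the "master" table plus a `WHERE tenant_id = ?` clause.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical entity: tenantId plays the role of the foreign key to the "master" table.
final class Invoice {
    final long id;
    final long tenantId;
    final String label;

    Invoice(long id, long tenantId, String label) {
        this.id = id;
        this.tenantId = tenantId;
        this.label = label;
    }
}

// In-memory stand-in for a repository over one shared table.
final class InvoiceRepository {
    private final List<Invoice> rows = new ArrayList<>();

    void save(Invoice invoice) {
        rows.add(invoice);
    }

    // SQL equivalent: SELECT * FROM invoice WHERE tenant_id = ?
    List<Invoice> findAllForTenant(long tenantId) {
        List<Invoice> result = new ArrayList<>();
        for (Invoice row : rows) {
            if (row.tenantId == tenantId) {
                result.add(row);
            }
        }
        return result;
    }
}
```

The trade-off discussed in this thread is visible here: the filter has to be present on every query, but a single schema and a single connection pool serve all tenants.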

jdubois on 1 Apr 2015

@jdubois each tenant will have a different database schema, and a tenant will have several users.
Many SaaS applications are multitenant and give you the possibility to create your own company, for example, in which you can create and manage your own users, with all your data in a separate database or schema.

I'm currently developing a multitenant application using JHipster and would like to contribute it once it's done.

jgasmi on 9 Apr 2015

@jgasmi I don't think this can work, your issue will be that multiplying datasources won't scale. People usually just do a table with a foreign key, in order to do multi-tenancy.
But if you have a working solution, of course I will be happy to be wrong!!!

jdubois on 9 Apr 2015

I agree with @jgasmi, a good architecture is based on a multi-tenant schema/database. As you can read in various articles, a table with a foreign key won't scale.

Btw: @jgasmi I'm trying to deploy the same thing, but I have some problems with Liquibase and JPA annotations. Can you share some information with us?

fontanellif on 9 Apr 2015

@fontanellif "code wins arguments", we'll see if that works!

jdubois on 9 Apr 2015

It's funny, as I worked on a couple of rewrites where one of the main issues was that the application used a separate database for each customer. And having hundreds of separate databases simply doesn't work, even with Oracle.

gzsombor on 10 Apr 2015

@gzsombor Salesforce is based on a multi-tenant architecture.
@fontanellif I have a first working multi-tenant JHipster application example. I will upload it to a new GitHub repository so you can get an idea, or even contribute if you want. If @jdubois approves it, we can include it in JHipster; otherwise it will remain a multi-tenant example.

jgasmi on 10 Apr 2015

@jgasmi: I've never worked for Salesforce, but from this description, https://developer.salesforce.com/page/Multi_Tenant_Architecture, it seems that they don't use multiple, possibly different schemas. Instead they built a metadata model, system-in-a-system style, over a very simple database schema, so they don't manage 100000 different database schemas but only one, which is heavily sharded.

gzsombor on 10 Apr 2015

@jdubois @fontanellif I just finished and pushed a multi-tenant JHipster app.
https://github.com/jgasmi/jhipster-mt

To test it:

I'm using postgreSQL with a database named "landlord"
login with admin/admin
Create a Tenant (http://localhost:8080/#/tenant)
Create a new user and assign it to the tenant using the register form (you should stay logged in)
logout and login again with the new user

Et voilà !

Your feedback is more than welcome, and if @jdubois likes it, I'm ready to implement the JHipster generator side to support multi-tenancy.

jgasmi on 24 Apr 2015

👍2

Hi,

How do you implement security? Can a user with the same role get hold of another tenant's data? Do you need to check permissions on each query?

svennela on 24 Apr 2015

@svennela spring-security-oauth2 with pure-resource-server which is by default (out of the box) in the generated JHipster stack using the JdbcTokenStore.
Each user belongs to only one tenant, so they can only access their own tenant's data.
Regarding permissions it should be in the UserDetailsService class (loadUserByUsername), and it depends on your model whether the user permissions are on the landlord db or in the tenant db, but in both cases you have to load the permissions there.

I will add an example of permissions load to hide the landlord entities for tenant users and vice versa.

jgasmi on 25 Apr 2015

Hi @jgasmi, I tested your solution one week ago, and it's very good work! I created a fork in order to use MySQL and it works well too.
I used the MultiTenantSpringLiquibase object to create/update tenants, but I'm curious to see how you create tenants at runtime.

Next week I will run some more tests and let you know.

fontanellif on 25 Apr 2015

@fontanellif glad to hear that you already tested it, but last week the dynamic tenant creation was not yet implemented.

Now it's working and you can test it. Just for your information, I'm using a Liquibase custom precondition so that only the tenant tables are created when a new tenant is created.

jgasmi on 25 Apr 2015

So @jgasmi and @fontanellif, did it really prove the concept? Does it scale? Or did the discussion end without a clear conclusion?

ReginaldoSantos on 23 Jun 2017

This doesn't solve the "one connection pool per tenant" issue, so it won't scale. I'm sorry but this is still the same argument from the beginning, and it's a well-known and common problem.
I even had the same issue with one of my clients 2 weeks ago!

jdubois on 24 Jun 2017

👍2

Very good argument. So @jdubois, do you suggest we use logical multi-tenancy in the DB instead of multiple schemas when using JHipster?

DatamunchCo on 24 Jun 2017

I'm sorry but as I said earlier in this thread "code wins arguments", and as I also said having one connection pool per tenant does not scale.
Now if you look at the code that is proposed, see https://github.com/jgasmi/jhipster-mt/blob/master/src/main/java/com/yjiky/mt/multitenancy/ConnectionProviderFactory.java#L48 there is indeed one connection pool per tenant. So that's basically what I said wasn't good from the beginning.
Now let me do some explanation, maybe I should have said that in the beginning:

If you do one thread pool per tenant, you're going to waste a few threads per tenant. You need to have some connections opened, otherwise connection pooling doesn't make sense, and performance is horrible.
Now if you have a few clients, you're quickly going to have thousands of threads eaten up by this
And please note you still need some threads available to answer HTTP requests (at least!)

Now, your CPU is going to slow down after 500 threads. OK, you might push it to 1,000 threads, but that's really the maximum.
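The arithmetic behind this argument can be made explicit with a back-of-the-envelope sketch. The numbers used here (200 tenants, 5 idle connections per pool, a shared pool of 20) are illustrative assumptions, not measurements from this thread.

```java
// Back-of-the-envelope estimate; all numbers passed in are illustrative assumptions.
final class PoolSizing {
    private PoolSizing() {}

    // Resources held open across all tenants when each tenant has its own pool
    // and every pool keeps at least minIdlePerPool connections alive.
    static long idleConnections(int tenants, int minIdlePerPool) {
        return (long) tenants * minIdlePerPool;
    }

    // How many tenants fit under a given ceiling with one pool per tenant.
    static int maxTenants(int ceiling, int minIdlePerPool) {
        return ceiling / minIdlePerPool;
    }
}
```

With 200 tenants and 5 idle connections each, `idleConnections(200, 5)` gives 1,000, which is already at the upper limit quoted above, while a single shared pool of 20 stays at 20 regardless of the number of tenants.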

So this is just not going to work.

And I've seen this "in real life" many, many times. I think the first time was 20 years ago, and the last time was last month. So yes, I know what this does.

Now I might sound a bit harsh, but I'm quite annoyed to get lots of arguments here, and spend time on this thread, just to find this simple anti-pattern again.

If you want a working solution, you just use a FK. That's going to work for sure, and scale without much issue.

Then I understand people might still not believe me, or think there is a better solution: please note this is a bug tracker for JHipster, not a Stack Overflow support forum. I don't think this solution is good for JHipster, so I will not put it into the project. But feel free to do it, or post the question on Stack Overflow, which is the right place for such questions.

jdubois on 24 Jun 2017

@jdubois I see your points and they are pretty fair. Your explanation really helped me fully understand what you were saying in the beginning. Thanks for that.

Also, I'm sorry, but I don't know how to carry all the context of this conversation over to Stack Overflow, and I would still like to have your opinion on one alternative approach.

If multiple connection pool is the issue, what if we use single connection pool and set Tenant Schema for each connection? So we will have a Tenant Agnostic Connection Pool.

Let's say, using Hibernate MultiTenantConnectionProvider, we might have something like

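    // Needs java.sql.Connection, java.sql.SQLException, java.sql.Statement,
    // java.util.Objects and org.hibernate.HibernateException.
    @Override
    public Connection getConnection(String tenantId) throws SQLException {
      Objects.requireNonNull(tenantId, "Tenant identifier cannot be null!");
      final Connection connection = getAnyConnection();
      // try-with-resources closes the Statement. tenantId is concatenated into
      // the SQL, so it must be a trusted, validated schema name.
      try (Statement statement = connection.createStatement()) {
        statement.execute("USE " + tenantId);
      }
      catch ( SQLException e ) {
        // Hand the connection back to the pool instead of leaking it.
        releaseAnyConnection(connection);
        throw new HibernateException(
          "Could not alter JDBC connection to specified schema [" + tenantId + "]", e
        );
      }
      return connection;
    }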

Could you please give your insights on that?

(Promise you I will not use this issue wrongly again)

ReginaldoSantos on 30 Jun 2017

👍1

Yes, sorry, I should have explained better at the beginning.
The issue with your new design is that you probably have a different login/password or different database hosts per tenant. Otherwise, if you have everything on the same physical host, with the same login/password, what's the point of having different tenants?

jdubois on 30 Jun 2017

👍2

Exactly, you're right. My idea is to use an Authentication/Authorization Server where users will also be stored per tenant. And the client side will probably be separated by subdomains.

For my business case, I guess it would be difficult to sell the idea of a shared schema.
So, if this is OK, it fits like a glove.

And thanks, by the way.

ReginaldoSantos on 30 Jun 2017

👍2

It's all about isolation of data and how much you want to make your application code "tenant aware". If you have a single schema with tenant discriminator columns, this bleeds into your application: any query you write needs to know to filter on the tenant ID column, and it affects your indexing, tables, etc. It's easy to get wrong unless you have some framework support that knows to add the tenant ID value to every query, insert, etc. For JPA, Hibernate doesn't even support tenant-based column discriminators yet, although it already supports multi schema and multi database, and there is an open issue to add discriminator-based/single-schema support: HHH-6054.

From an isolation standpoint it's all about the varying level of isolation your application requires and the tradeoffs. Here is an old but good article on the topic. Shared schema basically gives up isolation for increased efficiency, this may be necessary if you have many small tenants or require cross tenant data access within the same transaction. With the multi schema + single shared connection pool you basically gain more isolation but not as much isolation as you would with multi schema + multi connection pools (different database logins) with the tradeoff though that you avoid some of that multi tenant logic bleeding into your application and database. If you had SQL injection problems it is true that a user could switch to another tenant's schema although hopefully you're using a higher level abstraction (e.g. JPA) or some library that discourages raw non-parameterized SQL. The nice thing though is your tenant switching logic is isolated to that connection provider logic rather than bleeding into the application.
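Because the schema switch concatenates the tenant identifier into a statement, one way to limit the SQL injection risk mentioned above is to accept only identifiers that match a strict pattern. The sketch below is an assumption-laden illustration: `SchemaNames` is a hypothetical helper, the allowed pattern is an arbitrary conservative choice, and the backtick quoting follows MySQL syntax.

```java
import java.util.regex.Pattern;

// Hypothetical guard applied before a tenant identifier reaches "USE ...".
final class SchemaNames {
    // Conservative allow-list: a letter or underscore, then letters, digits
    // or underscores, 64 characters at most (MySQL's identifier limit).
    private static final Pattern SAFE = Pattern.compile("[A-Za-z_][A-Za-z0-9_]{0,63}");

    private SchemaNames() {}

    static String requireSafe(String tenantId) {
        if (tenantId == null || !SAFE.matcher(tenantId).matches()) {
            throw new IllegalArgumentException("Unsafe tenant identifier");
        }
        return tenantId;
    }

    // Builds the MySQL statement only from a validated identifier.
    static String useStatement(String tenantId) {
        return "USE `" + requireSafe(tenantId) + "`";
    }
}
```

Checking the identifier against the list of tenants that actually exist would be stricter still; the pattern only rules out characters that could change the meaning of the statement.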

The Microsoft article mentions trade offs like it's hard to restore a multi schema layout but that's going to be dependent on the database engine, technically for MySQL for instance schema and database basically mean the same thing and if you had your customers separated by that you can restore one in isolation. Additionally, you can do things like move a customer onto a different database cluster if you're running into scalability limitations or you have a customer that requires true data isolation.

I guess I'm not saying there's a right or wrong really as it depends on the application but I think multi schema/single connection pool is a valid strategy.

ryanrupp on 1 Jul 2017

👍2

Thanks @hcouplet - this looks like a good solution! I'm not totally sure this works well with Hibernate and the cache, but still that's already good.
And the GDPR argument is a good one.

jdubois on 26 Oct 2017

Thank you for your message @jdubois.

For the cache part, according to hibernate documentation, it should be ok :
http://docs.jboss.org/hibernate/orm/5.2/userguide/html_single/Hibernate_User_Guide.html#multitenacy

19.4.3. Caching
Multitenancy support in Hibernate works seamlessly with the Hibernate second level cache. The key used to cache data encodes the tenant identifier.

As we set the hibernate multitenancy to SCHEMA in application-dev.yml

        hibernate.tenant_identifier_resolver : eu.creativeone.poc.tenancy.hibernate.MyCurrentTenantIdentifierResolver
        hibernate.multi_tenant_connection_provider : eu.creativeone.poc.tenancy.hibernate.SchemaMultiTenantConnectionProviderImpl
        hibernate.multiTenancy : SCHEMA

I also confirm that it seems to work perfectly with the poc I made.
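The documentation sentence quoted above, "the key used to cache data encodes the tenant identifier", can be pictured with a toy cache. This is not Hibernate's implementation; `TenantAwareCache` is a made-up class that only shows why two tenants with the same entity ID do not collide.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy illustration only: the cache key is (tenant, entity name, id),
// so identical ids in different tenants map to different entries.
final class TenantAwareCache {
    private final Map<List<Object>, String> entries = new HashMap<>();

    private static List<Object> key(String tenantId, String entityName, long id) {
        return List.of(tenantId, entityName, id);
    }

    void put(String tenantId, String entityName, long id, String value) {
        entries.put(key(tenantId, entityName, id), value);
    }

    // Returns null when the entry is not cached for this tenant.
    String get(String tenantId, String entityName, long id) {
        return entries.get(key(tenantId, entityName, id));
    }
}
```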

hcouplet on 26 Oct 2017

🎉1

Great solution @hcouplet !

Does this solution have a limit on the number of tenants? Can it work with a UAA server?

micheltank on 14 Nov 2017

I guess the limit on the number of tenants will depend on your DB file system. If you ever reach that limit, you can still split the tenants across multiple DBs.

It works with UAA with microservices. You can find an example here:
https://github.com/hcouplet/jhipster-multi-tenancy-poc3-uaa
https://github.com/hcouplet/jhipster-multi-tenancy-poc3-app1

For calls between services you can either use :

@AuthorizedFeignClient as usual, this will call external service without tenant info
@AuthorizedFeignClient with "X-Tenant-ID" requestHeader. For instance see example :

@RequestMapping(value = "/api/tenants/")
List<String> getAllTenants(@RequestHeader("X-Tenant-ID") String tenantId);

@AuthorizedUserFeignClient: this will forward the JWT token, and with it the tenant information
In case your service needs to do some work on multiple tenants (for instance a multi-tenant cron), use the static method:

MyCurrentTenantIdentifierResolver.forceTenantId(String tenantId);
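For the multi-tenant cron case, the pattern is roughly "force a tenant, run the job, always reset". The sketch below does not use the PoC's `MyCurrentTenantIdentifierResolver`; `TenantJobRunner` and its thread-local are hypothetical stand-ins that show the same set/run/clear shape.

```java
import java.util.List;

// Hypothetical stand-in for a resolver with a forced tenant per thread.
final class TenantJobRunner {
    private static final ThreadLocal<String> FORCED_TENANT = new ThreadLocal<>();

    private TenantJobRunner() {}

    // What a tenant identifier resolver would read while the job runs.
    static String currentTenant() {
        return FORCED_TENANT.get();
    }

    // Runs the same job once per tenant, switching the forced tenant each time.
    static void runForEachTenant(List<String> tenants, Runnable job) {
        for (String tenant : tenants) {
            FORCED_TENANT.set(tenant);
            try {
                job.run();
            } finally {
                // Always reset, so a failure cannot leave the thread bound to a tenant.
                FORCED_TENANT.remove();
            }
        }
    }
}
```

The `finally` block matters with pooled scheduler threads: without it, a later job on the same thread could silently run against the previous tenant.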

hcouplet on 15 Nov 2017

👍2

This question was asked on gitter and poster referred to this issue, I'm adding here a link to this module https://github.com/sonalake/generator-jhipster-multitenancy for completeness.

gmarziou on 24 Nov 2017

@hcouplet @jdubois
I am looking into your https://github.com/hcouplet/jhipster-multi-tenancy-poc2. I am not able to create a user using /api/register, and I have also noticed that it doesn't print any error to the console. Is there a different way to create users, e.g. define new users in users.csv and restart the server?

systemlogic on 28 Nov 2017

@hcouplet your solution works fine for my business case, but I'm having some problems with Elasticsearch. Have you had any problems with this search engine?

fbbnenas2 on 16 May 2018

I was able to solve my issue using ElasticSearchTemplate. I built the SearchQuery and IndexQuery objects so that they specify my current tenant for each index. Thanks!
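The exact index layout was not shared in this thread. One possible reading, sketched here purely as an assumption, is one index per tenant whose name is derived from a base name and the tenant ID. `TenantIndexNames` and the `base_tenant` naming scheme are hypothetical; the lowercasing is there because Elasticsearch index names must be lowercase.

```java
import java.util.Locale;

// Hypothetical naming helper for per-tenant Elasticsearch indices.
final class TenantIndexNames {
    private TenantIndexNames() {}

    static String indexFor(String baseIndex, String tenantId) {
        if (tenantId == null || !tenantId.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("Unsupported tenant identifier");
        }
        // Elasticsearch index names must be lowercase.
        return (baseIndex + "_" + tenantId).toLowerCase(Locale.ROOT);
    }
}
```

The resulting name would then be passed wherever the search and index queries take an index name.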

fbbnenas2 on 17 May 2018

I have to revive this thread as I am running into a problem with the otherwise very interesting approach. As jhi tables are created async, MultiTenantSpringLiquibase runs into the problem that e.g. jhi_user does not exist, yet jhi_user is used to derive the schema name from it. Is there any best practice to overcome this limitation, except running liquibase synchronously?