I spend a lot of time working with software houses (Microsoft calls these ISVs – Independent Software Vendors). More and more, these companies are looking to convert their on-premises applications to cloud-based SaaS (Software as a Service) offerings.
For many of the ISVs, their on-premises applications are single-tenant ie: they are designed to support a single organization. When they are looking to cloud offerings, one of the first decisions is whether they should create a single database to hold the data for all their client organizations, or whether they should create a separate database for each client organization.
As with most things in computing, there is no one simple answer to this.
Here are the main decision points that I look at:
For me, this is the #1 item. You have to decide how important isolating one client's data from other clients is. Time and again, I hear how "the app does that" but you need to keep in mind that in most multi-tenant models, you are only one faulty WHERE clause away from showing one client, another client's data. In fact, it's usually far harder to get the code correct in multi-tenant databases.
So I think you need to start by considering what the outcome of that would be. For some ISVs, this would be deeply embarrassing but manageable. For other ISVs, this would simply be terminal for the organization.
Imagine the discussion between your CEO and the client whose data was disclosed. How would that discussion go? Would you still have a job? Would you still have an organization?
If you have even the slightest doubt about this, you should lean towards single-tenant ie: a separate database per client. You still have the chance to mess that up, but you are starting in a better place.
This one is pretty easy. In every current cloud provider, single larger databases are more economical than large numbers of smaller databases. I don't think that cost should be your primary concern for this, but if it is, you will lean towards single-database designs.
If you are working in Azure SQL Database though, and haven't looked at their elastic database pools, you should consider them before making your decision.
A key aspect of "cloudiness" is the granularity of providing resources as required, just when they are required. Having separate databases for each client is much more "cloudy". A simple example of this is that each of your clients might require a different level of performance and/or features.
One client might want to run a very low cost test, another might need good solid general performance, another might need the best performance available. If you use separate databases, even clients running the same application could use databases with different scale and/or performance.
You might also be able to add features to specific clients. For example, one client might want a read-only copy of his/her data in another location.
Having separate databases lets you decide these things on a client-by-client basis.
One other option to consider here is that you might have different requirements even for a single client organization. They might have different environments (ie: Production, UAT, Test, etc.) that require different capabilities.
Everyone who's used a single database to support a large number of clients has run into the "noisy neighbor" situation at some time.
One client gets to the point that they can't get their work done because of the overhead being placed on the single database by another tenant.
If you have a situation where the load placed by different tenants varies, you are going to find life much easier if you have separate databases.
If you use a multi-tenant design, you will need to consider how to move one tenant to another database if that's required. (I've been in ISVs where this is only considered when someone starts complaining but it needs to be part of the design from day #1).
When you have a multi-tenant database, almost every table will have a tenant ID of some type, and these columns will be involved in almost every join operation.
It's not hard to imagine that these databases simply perform worse. Note that I'm not talking about single vs multi-database on a single SQL Server. In those situations, there can also be positive performance outcomes from a single database design – but that's a topic for another day.
If you have one client that needs to roll back their data to an earlier time, this is often extremely difficult with single-database designs. It's trivial to achieve with multi-database designs.
The same applies to situations where clients have different retention policies, or where there is a need for a client to be able to export their own data (or import it again).
You need to consider who will own the data that's used by your application. When you use a single-database model with multiple tenants, it's clearly going to be a database that you own and are responsible for.
Is that an appropriate risk for your organization?
One model that I'm seeing more commonplace now is that while the ISV owns and operates the application, the end customers own their own databases (and pay for them). This might simplify your legal situation in regards to data ownership. It might also help if there are data sovereignty issues.
Take legal advice on this.
As I mentioned, this isn't a simple decision. Nowadays for cloud-based PaaS (Platform as a Service) databases to support SaaS applications though, unless there is a compelling reason not to, I'd suggest starting with a separate database for each client every time.