AWS Database Blog

Introducing role-based access control for Amazon DocumentDB (with MongoDB compatibility)

Amazon DocumentDB (with MongoDB compatibility) is a fast, scalable, highly available, and fully managed document database service that supports MongoDB 3.6, 4.0, and 5.0 workloads. You can use the same MongoDB application code, drivers, and tools as you do today to run, manage, and scale workloads on Amazon DocumentDB without worrying about the underlying infrastructure.

Amazon DocumentDB has now added support for role-based access control (RBAC). With RBAC, you can grant users one or more predefined roles (for example, read, readWrite, or dbOwner) that determine which operations they are authorized to perform on one or more databases. A common use case for RBAC is enforcing least privilege access within a single application. Another common use case is building multi-tenant applications. A multi-tenant application is a deployment of software and hardware that serves multiple customers. In the context of Amazon DocumentDB, an example of a multi-tenant application is one where each customer (or tenant) gets access to their own databases within a shared cluster.

This post introduces you to RBAC concepts and capabilities in Amazon DocumentDB, walks you through two use cases, and discusses design considerations for building multi-tenant applications with RBAC. For more information about the new RBAC capabilities, see Role-Based Access Control (Built-In Roles) in the documentation.

Concepts

Amazon DocumentDB uses the following key RBAC concepts:

  • User – A named entity that can authenticate and perform operations
  • Password – A secret word that authenticates the user
  • Role – A designation that authorizes a user to perform actions on one or more databases
  • Admin Database – A special database to authorize users against
  • Database (DB) – The namespace within a cluster that contains collections

The following code creates a user named app with the password abc123 and read permissions on the database foo. The sample user names and passwords in this post are for illustrative purposes only; always choose strong passwords.

db.createUser({user: "app", pwd: "abc123", roles: [{role: "read", db: "foo"}]})

You can list existing users and roles in the cluster with the show users command. The following code shows the output for this command for the user that was created:

{
	"_id" : "app",
	"user" : "app",
	"db" : "admin",
	"roles" : [
		{
			"db" : "foo",
			"role" : "read"
		}
	]
}

All role information is stored in a special database named admin. In this code example, you can see that the user app resides in the admin database and has the read role for the foo database.
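
To make the shape of that output concrete, here is a minimal sketch in plain JavaScript that turns a user document like the one above into a one-line summary. The describeUser helper is hypothetical (it is not part of the mongo shell or any driver), and the object literal simply mirrors the sample output.

```javascript
// Hypothetical helper: summarize a user document shaped like the
// `show users` output above. Not part of the mongo shell or any driver.
function describeUser(userDoc) {
  // Each entry in `roles` pairs a role name with the database it applies to.
  const grants = userDoc.roles.map((r) => `${r.role} on ${r.db}`);
  return `${userDoc.user} (stored in ${userDoc.db}): ${grants.join(", ")}`;
}

// Copy of the sample output for the user "app".
const appUserDoc = {
  _id: "app",
  user: "app",
  db: "admin",
  roles: [{ db: "foo", role: "read" }],
};

console.log(describeUser(appUserDoc));
```

The summary separates where the user is stored (admin) from where the role applies (foo), which is the distinction the output above illustrates.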

RBAC use cases

To show how RBAC works, this post presents the two most common use cases for RBAC: enforcing least privilege access within a single application and enabling multi-tenant applications on a single Amazon DocumentDB cluster.

Enforcing least privilege

An application often uses a single Amazon DocumentDB cluster as its datastore but has multiple users that require authorization to perform specific operations. Some users may need to read and write data, whereas other users may only require read access. The principle of least privilege is a foundational security tool; you can use RBAC to apply this principle by limiting user access to only what is required to perform their functions. This post presents a use case with three application users (appAdmin, appUser, and analytics). Each user has a different role based on the function they must perform. The following diagram summarizes the roles.

The first user (appAdmin) is the application administrator and needs to create indexes, add users, and read and write data in any database. For this user, assign the roles dbAdminAnyDatabase, readWriteAnyDatabase, and clusterAdmin. To create the user for appAdmin and grant the required roles, enter the following code (to create these users, you must authenticate to your cluster as a user with the root role):

db.createUser( { user: "appAdmin", pwd: "abc123",  roles: [{"db":"admin", "role":"dbAdminAnyDatabase" }, {"db":"admin", "role":"readWriteAnyDatabase"}, {"db":"admin", "role":"clusterAdmin"}]})

The second user (appUser) is the main application user and needs to read and write to the products database. To create the appUser user and grant the required role, enter the following code:

db.createUser( { user: "appUser", pwd: "abc124",  roles: [ { role: "readWrite", db: "products"}]})

The third user (analytics) is for an analytics application that only needs to read data from the products database. To create the analytics user and grant the required role, enter the following code:

db.createUser( { user: "analytics", pwd: "abc125",  roles: [ { role: "read", db: "products"}]})
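
The three grants above can also be written down as data. The following is a simplified, illustrative model in plain JavaScript, not how Amazon DocumentDB evaluates permissions: it tracks only read and write actions, and the ROLE_ACTIONS table, the ANY_DATABASE_ROLES set, and the isAllowed function are assumptions made for this sketch.

```javascript
// Simplified, illustrative model only: it tracks just "read" and "write"
// and ignores every other action that the built-in roles cover.
const ROLE_ACTIONS = {
  read: ["read"],
  readWrite: ["read", "write"],
  readWriteAnyDatabase: ["read", "write"],
};

// Roles whose grant is treated as applying to every database in this sketch.
const ANY_DATABASE_ROLES = new Set(["readWriteAnyDatabase"]);

// The role assignments from the three createUser calls above.
const appUsers = {
  appAdmin: [
    { db: "admin", role: "dbAdminAnyDatabase" },
    { db: "admin", role: "readWriteAnyDatabase" },
    { db: "admin", role: "clusterAdmin" },
  ],
  appUser: [{ db: "products", role: "readWrite" }],
  analytics: [{ db: "products", role: "read" }],
};

// Returns true if any grant of the user covers `action` on `dbName`.
function isAllowed(userName, action, dbName) {
  const grants = appUsers[userName] || [];
  return grants.some((g) => {
    const actions = ROLE_ACTIONS[g.role] || [];
    const inScope = ANY_DATABASE_ROLES.has(g.role) || g.db === dbName;
    return inScope && actions.includes(action);
  });
}

console.log(isAllowed("appUser", "write", "products"));   // true
console.log(isAllowed("analytics", "write", "products")); // false
```

In this model, analytics can read but not write the products database, and appUser has no access outside products, which matches the intent of the role assignments.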

You can connect and authenticate to your cluster as appUser with the following CLI command:

mongo "mongodb://appUser:abc124@<clusterName>.us-east-1.docdb.amazonaws.com:27017/"
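
If you assemble a connection string like this in application code, credentials that contain characters such as @, :, or / need to be percent-encoded so they are not mistaken for URI delimiters. The following is a minimal sketch; buildMongoUri is a hypothetical helper, and the host name is a placeholder rather than a real endpoint.

```javascript
// Hypothetical helper; the host passed in below is a placeholder,
// not a real cluster endpoint.
function buildMongoUri(user, password, host, port = 27017) {
  // Percent-encode credentials so characters such as "@", ":" or "/"
  // cannot be mistaken for URI delimiters.
  const encodedUser = encodeURIComponent(user);
  const encodedPassword = encodeURIComponent(password);
  return `mongodb://${encodedUser}:${encodedPassword}@${host}:${port}/`;
}

console.log(buildMongoUri("appUser", "abc124", "host.example"));
```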

If you don’t specify a particular database in your connection string, you authenticate the user against the test database by default. Because your application data is in the products database, you can issue the use statement to switch the context of the connection to the products database. See the following code:

use products

You can now read from and write to the product catalog. To insert a few documents into the catalog collection, enter the following code:

db.catalog.insertMany([
{ "_id":1, "name":"banana", "inventory": 10},
{ "_id":2, "name":"passion fruit", "inventory": 22},
{ "_id":3, "name":"pink lady apple", "inventory": 78},
])

You receive the following output:

{ "acknowledged" : true, "insertedIds" : [ 1, 2, 3 ] }

You can now query the data to find a specific type of fruit. For example, if you want to check your current inventory of passion fruit, enter the following code:

db.catalog.find({"name": "passion fruit"})

You receive the following output:

{ "_id" : 2, "name" : "passion fruit", "inventory" : 22 }

Next, log out appUser. See the following code:

db.logout()

You receive the following output:

{ "ok" : 1 }

After you issue the logout command, you are still connected to the Amazon DocumentDB cluster, but no longer authenticated as any user. You can now authenticate as the analytics user, who only has read permissions on the products database. See the following code:

db.auth("analytics", "abc125")

Because you are authenticated with the read-only role, you can read data. See the following code:

db.catalog.find({"name": "passion fruit"})

You receive the following output:

{ "_id" : 2, "name" : "passion fruit", "inventory" : 22 }

However, you are not authorized to write data. For example, the following code attempts to insert new data:

db.catalog.insert({"name": "lemons", "inventory": 99})

You receive the following output:

WriteResult({ "writeError" : { "code" : 13, "errmsg" : "Authorization failure" } })
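
Application code may want to tell an authorization failure apart from other errors. The following is a minimal sketch that works on plain objects shaped like the outputs in this post; isAuthorizationFailure is a hypothetical helper, and real drivers may surface the same code through their own error types instead.

```javascript
// Error code that appears in the authorization failures shown in this post.
const UNAUTHORIZED = 13;

// Hypothetical helper operating on plain objects shaped like the outputs in
// this post; it does not call any driver or shell API.
function isAuthorizationFailure(result) {
  if (!result) return false;
  // Write results report the failure in a nested writeError document.
  if (result.writeError && result.writeError.code === UNAUTHORIZED) return true;
  // Command replies report it at the top level, alongside ok: 0.
  return result.ok === 0 && result.code === UNAUTHORIZED;
}

const deniedWrite = { writeError: { code: 13, errmsg: "Authorization failure" } };
console.log(isAuthorizationFailure(deniedWrite)); // true
```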

Multi-tenant application

In this second use case, you have a database to support a gaming development application. To optimize costs, you use a multi-tenant cluster, in which each game developer has access to their own database within a cluster. Four gaming studios have signed up for your game platform: bigCow, raceCar, xQuest, and bounce. To give these customers access to their data, you create four database admin users: one for each customer. Each admin user can administrate their own database for their own game. The following diagram summarizes the user roles.

After you log in as a user who has an admin role for the entire Amazon DocumentDB cluster, you can create an admin user for each game database and scope the appropriate roles to specific databases within the cluster. See the following code:

db.createUser( { user: "bigCowAdmin", pwd: "abc123",  roles: [ { role: "dbOwner", db: "bigCow"}]})
db.createUser( { user: "raceCarAdmin", pwd: "def456",  roles: [ { role: "dbOwner", db: "raceCar"}]})
db.createUser( { user: "xQuestAdmin", pwd: "ghi789",  roles: [ { role: "dbOwner", db: "xQuest"}]})
db.createUser( { user: "bounceAdmin", pwd: "jkl012",  roles: [ { role: "dbOwner", db: "bounce"}]})
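
Because the four calls differ only in the tenant name and password, the user documents can be generated from a list of tenants. The following is a minimal sketch in plain JavaScript; tenantAdminSpec is a hypothetical helper, and the password strings are placeholders that you would replace with strong, per-tenant secrets.

```javascript
// Hypothetical helper: build the document that db.createUser() takes
// for one tenant, following the naming pattern used above.
function tenantAdminSpec(tenant, password) {
  return {
    user: `${tenant}Admin`,
    pwd: password,
    // dbOwner is scoped to the tenant's own database only.
    roles: [{ role: "dbOwner", db: tenant }],
  };
}

const tenants = ["bigCow", "raceCar", "xQuest", "bounce"];

// Placeholder passwords for illustration only.
const tenantSpecs = tenants.map((t) => tenantAdminSpec(t, `<password-for-${t}>`));

console.log(JSON.stringify(tenantSpecs[0]));
```

In the mongo shell, each generated document could then be passed to db.createUser(), which keeps the role scoping identical for every tenant.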

Although each admin user can authenticate to and administer their own database, they are not authorized to perform actions on any other database in the cluster. For example, the bigCowAdmin user cannot read from or write to the xQuest database.

To demonstrate the bigCowAdmin user attempting to write to a database they don’t have access to, first log out as the current user. See the following code:

db.logout()

You receive the following output:

{ "ok" : 1 }

You can connect and authenticate to your cluster as bigCowAdmin with the following mongo shell command:

mongo "mongodb://bigCowAdmin:abc123@<clusterName>.us-east-1.docdb.amazonaws.com:27017/"

Next, switch the context of your connection to the xQuest database. See the following code:

use xQuest

You receive the following output:

switched to db xQuest

You can attempt to write a single document to the foo collection in the database xQuest. See the following code:

db.foo.insert({'x':1})

You receive the following output:

WriteResult({ "writeError" : { "code" : 13, "errmsg" : "Authorization failure" } })

As expected, the bigCowAdmin user is not authorized to perform that command.

Even though your connection is in the context of the xQuest database, the user cannot confirm that the database exists within the cluster. The context of a connection is a client-side construct: you can switch the context to any database name you want, but that does not mean that the database exists or that you are authorized to read from or write to it. For example, the following code attempts to list the collections in the xQuest database:

db.runCommand( { listCollections: 1})

You receive the following output:

{ "ok" : 0, "errmsg" : "Authorization failure", "code" : 13 }

Because the bigCowAdmin user does not have a role that authorizes them to list collections in the xQuest database, the command results in an authorization failure. However, if you switch the context of your connection to the bigCow database, for which the bigCowAdmin user has permissions, you are authorized to perform commands on the bigCow database. See the following code:

use bigCow

You receive the following output:

switched to db bigCow

You are now authorized and can write to the foo collection in the database bigCow. See the following code:

db.foo.insert({'x':1})

You receive the following output:

WriteResult({ "nInserted" : 1 })

Similarly, because the bigCowAdmin user has the role dbOwner, you can list the collections in the bigCow database. See the following code:

db.runCommand( { listCollections: 1})

You receive the following output:

{
	"waitedMS" : NumberLong(0),
	"cursor" : {
		"firstBatch" : [
			{
				"name" : "foo",
				"type" : "collection",
				"options" : {
					"autoIndexId" : true,
					"capped" : false
				},
				"info" : {
					"readOnly" : false
				},
				"idIndex" : {
					"v" : 2,
					"key" : {
						"_id" : 1
					},
					"name" : "_id_",
					"ns" : "bigCow.foo"
				}
			}
		],
		"id" : NumberLong(0),
		"ns" : "bigCow.$cmd.listCollections"
	},
	"ok" : 1
}

Multi-tenant design considerations

Developers building software as a service (SaaS) applications are often interested in building multi-tenant applications to optimize costs. SaaS developers typically isolate tenants (or customers) through authorization security controls like RBAC on shared compute and storage resources. In Amazon DocumentDB, you can build SaaS applications by isolating tenants per cluster or per database (by using RBAC). This section discusses the design considerations and trade-offs of both cluster- and database-level isolation to help you assess which pattern is most appropriate for your application architecture. For more information about high-level approaches to achieving multi-tenant data isolation, see the whitepaper SaaS Storage Strategies: Building a Multitenant Storage Model on AWS.

Cluster isolation

In a cluster isolation model, you isolate each tenant’s data in its own Amazon DocumentDB cluster. Because each tenant has a distinct and dedicated cluster, you achieve full tenant isolation across all dimensions, including compute, storage, backups, encryption keys, and incurred costs.

Database isolation

In the database isolation model, each tenant is isolated in a shared environment through an authorization mechanism (such as RBAC). You achieve database-level isolation by associating each tenant with one or more distinct databases within a single Amazon DocumentDB cluster.

Cost optimization

One of the advantages and key motivations for building multi-tenant applications using database isolation is the ability to optimize cost through density. Depending on the application and customer guarantees, you can oversubscribe the number of tenants on a single cluster because you know that not all tenants are active at the same time. This density can reduce the cost per tenant, which makes the overall solution more cost-effective.

Performance isolation

When selecting a multi-tenant application architecture, performance isolation is an important consideration. When using cluster isolation, each tenant has a dedicated cluster, and thus dedicated compute resources. This provides a consistent, guaranteed quality of service based on that tenant’s requirements. For example, if a tenant starts on a cluster of r5.large instances, but their application throughput doubles, you can scale that cluster to r5.xlarge instances to effectively give that tenant twice the compute resources.

With database isolation, tenants share compute resources in an Amazon DocumentDB cluster. The advantage of this approach is that you can achieve a high density of tenants in a single cluster, up to hundreds of tenants. If the aggregate utilization of the tenants is relatively uniform over time, you can provide a relatively consistent experience across tenants. The downside of database isolation is that when one or more tenants use a disproportionate amount of resources in the cluster, they can negatively impact the quality of service of other tenants (typically referred to as the noisy neighbor problem). It is important to acknowledge and account for noisy neighbor risk, and to set the proper expectations with your tenants when choosing the database isolation model.

One common mitigation strategy is to implement tiering within your service. You can offer a basic tier in which each tenant has access to a database on a shared cluster (and is thus exposed to noisy neighbor risk). For tenants who require performance isolation, you can offer a higher tier in which each tenant has their own cluster.
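
The tiering approach can be sketched as a small placement function. This is only an illustration under stated assumptions: the tier names, the SHARED_CLUSTER identifier, and the per-tenant cluster naming are all hypothetical, and a real service would look these values up in its own tenant catalog.

```javascript
// Hypothetical cluster identifier; a real service would look this up.
const SHARED_CLUSTER = "shared-cluster-1";

// Decide where a tenant's data lives based on its tier (assumed values:
// "basic" or "premium").
function placementFor(tenant) {
  if (tenant.tier === "premium") {
    // Cluster isolation: a dedicated cluster for this tenant.
    return { cluster: `${tenant.name}-cluster`, database: tenant.name, shared: false };
  }
  // Database isolation: the tenant's own database on a shared cluster,
  // with access scoped through RBAC.
  return { cluster: SHARED_CLUSTER, database: tenant.name, shared: true };
}

console.log(placementFor({ name: "bigCow", tier: "basic" }));
console.log(placementFor({ name: "xQuest", tier: "premium" }));
```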

Security isolation

Another consideration when selecting a multi-tenant approach is security isolation. Amazon DocumentDB’s encryption-at-rest feature (enabled by default) uses a single customer master key (CMK) to encrypt the storage volume, backups, and snapshots in a cluster. If you have tenants who require that their storage encryption keys not be shared with other tenants, you can use cluster isolation (which uses a different CMK), or perform client-side encryption in the application and manage those keys on the client side through AWS KMS. Client-side encryption can offer key isolation in a shared cluster environment, but it adds complexity and possible performance impacts to your application.
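
As a rough illustration of client-side encryption with per-tenant keys, the following sketch uses the Node.js crypto module with AES-256-GCM. It rests on loud assumptions: the keys are generated locally as stand-ins for keys you would obtain and manage through AWS KMS, and encryptField and decryptField are hypothetical helpers rather than part of any Amazon DocumentDB or driver API.

```javascript
const crypto = require("crypto");

// Encrypt one field value with a tenant-specific 32-byte key (AES-256-GCM).
function encryptField(key, plaintext) {
  const iv = crypto.randomBytes(12); // fresh nonce for every encryption
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
  const ct = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { iv, ct, tag: cipher.getAuthTag() };
}

// Decrypt a value produced by encryptField; throws if the key is wrong
// or the data was modified.
function decryptField(key, { iv, ct, tag }) {
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ct), decipher.final()]).toString("utf8");
}

// Stand-in keys for illustration; real keys would come from a key service.
const bigCowKey = crypto.randomBytes(32);
const xQuestKey = crypto.randomBytes(32);

const stored = encryptField(bigCowKey, "player-score:42");
console.log(decryptField(bigCowKey, stored));
```

A value encrypted under one tenant's key cannot be decrypted with another tenant's key, which is the key isolation property described above. The encryption and decryption work on every read and write is also where the added complexity and performance cost come from.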

Summary

This post introduced RBAC for Amazon DocumentDB, walked through two common use cases, and discussed design considerations for building multi-tenant applications. For more information about RBAC in Amazon DocumentDB, see Role-Based Access Control (Built-In Roles) in the documentation.

For more information about Amazon DocumentDB, see Getting Started with Amazon DocumentDB, watch the video Getting Started with Amazon DocumentDB on YouTube, or read Ramping up on Amazon DocumentDB (with MongoDB compatibility). You can use the same application code, drivers, and tools that you use with MongoDB today to start developing against Amazon DocumentDB.

About AWS SaaS Factory

AWS SaaS Factory provides AWS Partner Network (APN) Partners with resources that help accelerate and guide their adoption of a SaaS delivery model. SaaS Factory includes reference architectures for building SaaS solutions on AWS; Quick Starts that automate deployments for key workloads on AWS; and exclusive training opportunities for building a SaaS business on AWS. APN Technology Partners who develop SaaS Solutions are encouraged to join the program!

Learn more about AWS SaaS Factory >>

About the Authors

Joseph Idziorek is a Principal Product Manager at Amazon Web Services.

Judah Bernstein is a Sr Partner Solutions Architect at Amazon Web Services focused on Software-as-a-Service (SaaS).

Jeff Duffy is a Sr NoSQL Specialist Solutions Architect at Amazon Web Services.