Identities and permissions management

AWS was the first cloud provider I've been working on. That's why when I did my first Azure and GCP project, I was always asking myself, "Hey, how would you implement that on AWS?". Answering that question was easy most of the time, but sometimes I got stuck. One of my most significant issues was the identity and permissions management component. I will try to share some related answers in this blog post.

Looking for a better data engineering position and skills?

You have been working as a data engineer but feel stuck? You don't have any new challenges and are still writing the same jobs all over again? You have now different options. You can try to look for a new job, now or later, or learn from the others! "Become a Better Data Engineer" initiative is one of these places where you can find online learning resources where the theory meets the practice. They will help you prepare maybe for the next job, or at least, improve your current skillset without looking for something else.

👉 I'm interested in improving my data engineering skillset

See you there, Bartosz

How to define a user?

Physical users

Let's start with the most fundamental question, how to define a new user? AWS creates them from the IAM service. Each user can have a programmatic access if the access keys exist, and an interactive access if the creation involves password definition.

Azure manages the users in the Active Directory service. A user is defined there by his name and user name. The service also auto generates the password.

Regarding GCP, it also manages the users in the IAM service but the semantic is a bit different than on AWS. The users come from the Google workspace and are invited by their email addresses. The IAM service doesn't have a command to create a new dedicated user.

Service users

What happens if we must link the user to service and hence create a service account user? Nothing different on AWS where even the users with the interactive access can be used by the services. But the same is not true for Azure and GCP.

On Azure, the idea of a service account is more complex than on AWS. The definition includes 3 different types of service accounts that can live in Azure Active Directory:

Regarding GCP, it creates the service account users with a dedicated IAM operation. As a regular user, each service account has its id and set of permissions. It can also be impersonated by another user. Impersonation allows principals and resources to act as a service account.

To sum up, service accounts have a special meaning on Azure and GCP, whereas AWS treats them as normal users associated with the services. And this summary is a good transition to the next question of resources authorization.

How to authorize a user to access a resource?

AWS provides 2 ways to authorize access to resources. Both rely on the policies. A policy defines a set of allowed and disallowed actions that users can make. There are 2 types of policies, identity-based and resource-based. The identity-based policies exist within IAM identities, such as users, groups or roles whereas the resource-based policies live with the resources like S3.

Azure manages access in the Role-Based Access Control (RBAC) system. The system defines roles, so what each security principal (user or service account) can do. For example, it can associate a Reader role to a particular Storage Account or all Storage Accounts in the subscription or resource group. The association boundary is called scope and once the scope is defined, the role is also assigned to a security principal. Azure will later use this role assignment to control access to the resources.

When it comes to GCP, IAM also implements the policies, but they can only be attached to a resource. The users only get roles which are collections of various operations allowed on each cloud service. But they don't authorize these operations directly. The grant consists of associating the role to the resource.

How to create a group?

In general, cloud providers recommend using groups to manage multiple users sharing the same permissions. The group's implementation on AWS and GCP is pretty straightforward since it can be understood as a collection of users sharing the same set of permissions. Although this definition is also valid for Azure, groups in Azure have some differences such as:

What surprised me the most while switching from AWS to GCP was the resource policies association. I couldn't understand why the resource permission cannot be directly assigned at the user/group level. But I got more confused on Azure, with the managed identities vs service principals different, Microsoft 365 groups existence, and the idea of user-based service accounts. And what about you? What were your confusion points?