User-level Git secrets policy in EMR Studio Workspaces

Set up EMR Studio’s workspace collaboration.

AWS Improves Amazon EMR Studio Security with Detailed IAM Permissions Control

Amazon EMR Studio is the integrated development environment for big data analytics on Amazon Web Services (AWS), and AWS offers fine-grained control over what users can do inside it. New documentation describes in detail how to set up user permissions with AWS Identity and Access Management (IAM), which is essential for controlling access for users who interact with Amazon EMR clusters running on Amazon EC2 or Amazon EKS. With this approach, administrators can define fine-grained permissions suited to different user roles and authentication methods.

It’s important to understand up front that the permissions discussed in this guide govern actions inside the EMR Studio environment; they do not implement data access control for input datasets. To manage access to datasets, configure permissions directly on the clusters that the Studio uses.

Building the Foundation: User Roles and Permission Policies


User roles and permissions policies are central to the EMR Studio permission architecture. If you use IAM Identity Center authentication, you must first create a dedicated EMR Studio user role. You create it the same way as any role that delegates permissions to an AWS service, following the standard AWS procedure.

The core component of this user role is its trust relationship policy, which specifies which service is allowed to assume the role. For EMR Studio, the trust policy should allow the elasticmapreduce.amazonaws.com service to assume the role through the sts:AssumeRole and sts:SetContext actions. A typical trust policy looks like this:

{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "elasticmapreduce.amazonaws.com"
      },
      "Action": [ "sts:AssumeRole", "sts:SetContext" ]
    }
  ]
}
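
As a minimal sketch, the same trust policy can be built and serialised in Python using only the standard library. The function name build_studio_user_role_trust_policy is hypothetical and chosen for illustration; the policy content is the one shown above.

```python
import json


def build_studio_user_role_trust_policy() -> dict:
    """Return the EMR Studio user role trust policy as a Python dict."""
    return {
        "Version": "2008-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The EMR service is the principal allowed to assume the role.
                "Principal": {"Service": "elasticmapreduce.amazonaws.com"},
                "Action": ["sts:AssumeRole", "sts:SetContext"],
            }
        ],
    }


trust_policy = build_studio_user_role_trust_policy()
trust_policy_json = json.dumps(trust_policy, indent=2)
print(trust_policy_json)
```

The resulting JSON string is the kind of document that the IAM CreateRole API accepts as its AssumeRolePolicyDocument parameter; the sketch stops short of making that call.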


After the role has been created, remove any default policies and permissions from it. You then attach your EMR Studio session policies to this user role before assigning users or groups to a Studio.

For the other authentication methods, direct IAM authentication or IAM federation with an external identity provider (IdP), permissions policies are attached directly to the relevant IAM identities (users, groups, or roles). For IAM federation, the policies are attached to the IAM role or roles created for the external IdP.

Permission Policy Tiers

To define exactly what a user can do in their Studio, administrators can create one or more IAM permissions policies. The documentation provides example policies for basic, intermediate, and advanced users, along with a detailed table that maps every Studio operation to the minimum IAM actions it requires.

Certain statements are required in all of these permissions policies. For example, users need permission to tag Secrets Manager secrets whose names start with the prefix emr-studio-:

{
  "Sid": "AllowAddingTagsOnSecretsWithEMRStudioPrefix",
  "Effect": "Allow",
  "Action": "secretsmanager:TagResource",
  "Resource": "arn:aws:secretsmanager:*:*:secret:emr-studio-*"
}

iam:PassRole is another required statement. It lets users pass the EMR Studio service role when creating Workspaces.

{
  "Sid": "AllowPassingServiceRoleForWorkspaceCreation",
  "Action": "iam:PassRole",
  "Resource": [ "arn:aws:iam::*:role/<your-emr-studio-service-role>" ],
  "Effect": "Allow"
}

(Note: You must replace the placeholder <your-emr-studio-service-role> so that the Resource is the actual ARN of your EMR Studio service role.)
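
The two required statements can be assembled into a complete policy document along the following lines. This is a sketch under stated assumptions: the helper names and the role name MyEMRStudioServiceRole are hypothetical, and "2012-10-17" is the current IAM policy language version rather than a value taken from this article.

```python
import json


def build_required_statements(service_role_name: str) -> list:
    """Return the two statements every EMR Studio user policy needs."""
    return [
        {
            "Sid": "AllowAddingTagsOnSecretsWithEMRStudioPrefix",
            "Effect": "Allow",
            "Action": "secretsmanager:TagResource",
            "Resource": "arn:aws:secretsmanager:*:*:secret:emr-studio-*",
        },
        {
            "Sid": "AllowPassingServiceRoleForWorkspaceCreation",
            "Effect": "Allow",
            "Action": "iam:PassRole",
            # The placeholder from the article is filled with the role name.
            "Resource": [f"arn:aws:iam::*:role/{service_role_name}"],
        },
    ]


def build_policy_document(statements: list) -> dict:
    """Wrap statements in a policy document envelope."""
    return {"Version": "2012-10-17", "Statement": statements}


# "MyEMRStudioServiceRole" is a made-up role name for illustration only.
policy = build_policy_document(build_required_statements("MyEMRStudioServiceRole"))
print(json.dumps(policy, indent=2))
```

Additional statements for a given user tier would be appended to the same Statement list.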

An Example of a Basic User Policy

A basic user policy permits most basic EMR Studio operations but does not let a user directly create new Amazon EMR clusters. It covers permissions to create, describe, list, start, stop, and delete Workspaces (elasticmapreduce:CreateEditor, DescribeEditor, ListEditors, DeleteEditor, and so on), view the Collaboration panel, access Amazon S3 for logs and bucket listings, attach and detach existing clusters (on EC2 and on EKS), debug jobs using persistent and on-cluster user interfaces, and manage Git repositories.

So that it works with the EMR Studio service role, this policy also includes tag-based access control (TBAC) conditions. In addition, it grants permission to list IAM roles (iam:ListRoles) and to describe network resources (ec2:DescribeVpcs, ec2:DescribeSubnets, ec2:DescribeSecurityGroups).

Important for IAM authentication: users who sign in with direct IAM authentication also need the elasticmapreduce:CreateStudioPresignedUrl permission, which the basic policy example does not include.
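
A quick way to spot this kind of gap is to scan a policy document for a given action. The helper below is a deliberately simplified sketch, not an IAM evaluator: it looks only at Allow statements and their Action patterns and ignores Resource, Condition, NotAction, and explicit Deny. The function names and the sample policy are hypothetical.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM allows a single value or a list; normalise to a list."""
    return value if isinstance(value, list) else [value]


def allows_action(policy: dict, action: str) -> bool:
    """Return True if any Allow statement lists or wildcards the action."""
    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        for pattern in _as_list(statement.get("Action", [])):
            # Action names are case-insensitive and may contain wildcards.
            if fnmatchcase(action.lower(), pattern.lower()):
                return True
    return False


# A cut-down stand-in for the basic policy, holding only the Workspace actions.
basic_like_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "elasticmapreduce:CreateEditor",
                "elasticmapreduce:DescribeEditor",
                "elasticmapreduce:ListEditors",
                "elasticmapreduce:DeleteEditor",
            ],
            "Resource": "*",
        }
    ],
}

needs_presigned_url = not allows_action(
    basic_like_policy, "elasticmapreduce:CreateStudioPresignedUrl"
)
print(needs_presigned_url)  # True: the permission must be added for IAM auth users
```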

Capabilities at the Intermediate and Advanced Level

The intermediate user policy builds on the basic permissions. It allows all of the basic EMR Studio actions and, most importantly, adds the permissions a user needs to create new Amazon EMR clusters from a cluster template. These include Service Catalog actions such as servicecatalog:SearchProducts, servicecatalog:DescribeProduct, and servicecatalog:ProvisionProduct, as well as CloudFormation actions such as cloudformation:DescribeStackResources. Intermediate users can also attach and detach EMR Serverless applications.

The advanced user policy provides the broadest access and allows all EMR Studio operations. In addition to everything the intermediate policy permits, it includes the elasticmapreduce:RunJobFlow action, so users can create new Amazon EMR clusters not only from templates but also by supplying a complete cluster configuration. The advanced policy also permits access to the Amazon Athena SQL editor with the related AWS Glue, AWS KMS, and Amazon S3 permissions (athena:*, glue:*, kms:*, s3:* actions for the data catalogue, queries, etc.), connections to Amazon SageMaker AI Studio for the Data Wrangler visual interface (sagemaker:* actions), and use of Amazon CodeWhisperer (codewhisperer:GenerateRecommendations).

Like the basic and intermediate examples, the advanced policy includes TBAC conditions, and the documentation notes that users who sign in with IAM authentication must also have the CreateStudioPresignedUrl permission.
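
The cumulative relationship between the three tiers can be modelled as sets that each build on the previous one. This sketch lists only actions named in this article, so it is far from the full policies, and the names TIER_ADDITIONS and actions_for_tier are hypothetical.

```python
# Actions each tier adds on top of the tier before it (partial, from the article).
TIER_ORDER = ("basic", "intermediate", "advanced")
TIER_ADDITIONS = {
    "basic": frozenset({
        "elasticmapreduce:CreateEditor",
        "elasticmapreduce:DescribeEditor",
        "elasticmapreduce:ListEditors",
        "elasticmapreduce:DeleteEditor",
    }),
    "intermediate": frozenset({
        "servicecatalog:SearchProducts",
        "servicecatalog:DescribeProduct",
        "servicecatalog:ProvisionProduct",
        "cloudformation:DescribeStackResources",
    }),
    "advanced": frozenset({
        "elasticmapreduce:RunJobFlow",
        "codewhisperer:GenerateRecommendations",
    }),
}


def actions_for_tier(tier: str) -> frozenset:
    """Return every action a tier allows, including those of lower tiers."""
    if tier not in TIER_ORDER:
        raise ValueError(f"unknown tier: {tier}")
    allowed = set()
    for name in TIER_ORDER[: TIER_ORDER.index(tier) + 1]:
        allowed |= TIER_ADDITIONS[name]
    return frozenset(allowed)


# Only the advanced tier can create clusters from a full configuration.
can_run_job_flow = {
    tier: "elasticmapreduce:RunJobFlow" in actions_for_tier(tier)
    for tier in TIER_ORDER
}
print(can_run_job_flow)
```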

The detailed table in the documentation illustrates how these policies are structured. For example, this is the row for creating and deleting Workspaces:

Action | Basic | Intermediate | Advanced | Associated actions
Create and delete Workspaces | Yes | Yes | Yes | "elasticmapreduce:CreateEditor", "elasticmapreduce:DescribeEditor", "elasticmapreduce:ListEditors", "elasticmapreduce:DeleteEditor"

Collaboration in the Workspace

EMR Studio supports Workspace collaboration, which lets multiple users work in the same Workspace at once. To view and use the Collaboration panel in the Workspace UI, a user needs the following permissions: elasticmapreduce:UpdateEditor, elasticmapreduce:PutWorkspaceAccess, elasticmapreduce:DeleteWorkspaceAccess, and elasticmapreduce:ListWorkspaceAccessIdentities. Any user who has these permissions can access the panel.

Tag-based access control can be used to limit who can set up collaboration. When a Workspace is created, EMR Studio automatically applies a default tag whose key is creatorUserId and whose value is the ID of the user who created the Workspace. This tag is applied only to Workspaces created after November 16, 2021, so older Workspaces should be tagged manually if you want to use TBAC with them.

A policy statement that compares this tag with a policy variable such as ${aws:userId} lets a user set up collaboration only for Workspaces that they created themselves.

{
  "Sid": "UserRolePermissionsForCollaboration",
  "Action": [
    "elasticmapreduce:UpdateEditor",
    "elasticmapreduce:PutWorkspaceAccess",
    "elasticmapreduce:DeleteWorkspaceAccess",
    "elasticmapreduce:ListWorkspaceAccessIdentities"
  ],
  "Resource": "*",
  "Effect": "Allow",
  "Condition": {
    "StringEquals": {
      "elasticmapreduce:ResourceTag/creatorUserId": "${aws:userId}"
    }
  }
}


Policy variables such as aws:userId are powerful because they let IAM evaluate a policy dynamically, based on the context of each request.
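
To make the idea concrete, the following sketch imitates how a StringEquals condition with a policy variable might be resolved against the request context. It is an illustration of the concept only, not IAM's actual evaluation logic; the function names and the user IDs are made up.

```python
import re

# Matches policy variables written as ${key}, for example ${aws:userId}.
_POLICY_VARIABLE = re.compile(r"\$\{([^}]+)\}")


def resolve_policy_variables(template: str, request_context: dict) -> str:
    """Replace each ${key} with its value from the request context."""
    return _POLICY_VARIABLE.sub(
        lambda match: request_context[match.group(1)], template
    )


def condition_matches(condition_value: str, tag_value: str, request_context: dict) -> bool:
    """Simplified StringEquals check between a resolved variable and a tag."""
    try:
        resolved = resolve_policy_variables(condition_value, request_context)
    except KeyError:
        # A variable missing from the context means the condition cannot match.
        return False
    return resolved == tag_value


# Hypothetical Workspace tags and callers.
workspace_tags = {"creatorUserId": "AIDAEXAMPLEOWNER"}
owner_context = {"aws:userId": "AIDAEXAMPLEOWNER"}
other_context = {"aws:userId": "AIDAEXAMPLEOTHER"}

owner_allowed = condition_matches("${aws:userId}", workspace_tags["creatorUserId"], owner_context)
other_allowed = condition_matches("${aws:userId}", workspace_tags["creatorUserId"], other_context)
print(owner_allowed, other_allowed)  # True False
```

The same statement therefore yields a different result for each caller, which is what lets one policy serve every Workspace creator.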

Managing Permissions for Git Secrets

To link Git repositories to EMR Studio, permissions are required to access the Git credentials stored as secrets in AWS Secrets Manager. To support user-level access control, EMR Studio automatically adds the for-use-with-amazon-emr-managed-user-policies tag to newly created Git secrets.

Git secret permissions can be set at the user level or at the service level. To enforce user-level control, add tag-based permissions for the secretsmanager:GetSecretValue action to the EMR Studio user role policy. The policy condition matches the for-use-with-amazon-emr-managed-user-policies tag against the value ${aws:userId}:

{
  "Sid": "AllowSecretsManagerReadOnlyActionsWithEMRTags",
  "Effect": "Allow",
  "Action": "secretsmanager:GetSecretValue",
  "Resource": "arn:aws:secretsmanager:*:*:secret:*",
  "Condition": {
    "StringEquals": {
      "secretsmanager:ResourceTag/for-use-with-amazon-emr-managed-user-policies": "${aws:userId}"
    }
  }
}

When you switch to user-level permissions, remove any existing secretsmanager:GetSecretValue permission from the EMR Studio service role policy. EMR Studio started applying the user-level tag automatically on September 1, 2023. To use user-level permissions with secrets created before that date, either add the tag to them manually or re-create them.
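
Adding the tag to an older secret could look roughly like the sketch below. It assumes that secrets_client behaves like a boto3 Secrets Manager client, whose tag_resource method takes SecretId and Tags; the helper names, the secret name, and the user ID are hypothetical, and the user ID must equal whatever ${aws:userId} resolves to for that user. A small recording stand-in replaces the real client so the example runs without AWS credentials.

```python
USER_POLICY_TAG_KEY = "for-use-with-amazon-emr-managed-user-policies"


def build_user_level_tag(user_id: str) -> list:
    """Return the tag list that ties a Git secret to a single user."""
    return [{"Key": USER_POLICY_TAG_KEY, "Value": user_id}]


def tag_secret_for_user(secrets_client, secret_id: str, user_id: str) -> None:
    """Tag an existing secret so the user-level policy condition can match it."""
    secrets_client.tag_resource(
        SecretId=secret_id, Tags=build_user_level_tag(user_id)
    )


class RecordingClient:
    """Dry-run stand-in that records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def tag_resource(self, **kwargs):
        self.calls.append(kwargs)


# Secret name and user ID below are invented for the example.
dry_run_client = RecordingClient()
tag_secret_for_user(dry_run_client, "emr-studio-example-git-secret", "AIDAEXAMPLEUSERID")
print(dry_run_client.calls)
```

In real use, the stand-in would be swapped for a client created with boto3.client("secretsmanager"), run by a principal that holds the secretsmanager:TagResource permission.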

Administrators who prefer service-level access can keep using it by leaving the GetSecretValue permission in the service role policy. However, user-level permissions with tag-based access control are recommended for finer control over access to individual secrets.

Final Thoughts on EMR Studio Permissions

Any organisation that deploys Amazon EMR Studio needs to configure these IAM permissions. By combining user roles, tailored permissions policies, and tag-based access control for features such as Git secrets and Workspace collaboration, administrators can make sure that users have only the access they need to do their jobs. This strengthens the environment’s security posture and makes it clearer what each user can do in the Studio. Although these settings offer strong control over Studio actions, remember that controlling access to the underlying data is a separate and equally important security task.
