Microsoft

Reducing over-privilege in Azure RBAC through experimentation

Team

  • 2 x Product Managers
  • 1 x UX Designer
  • 1 x Engineer
  • 1 x Data Scientist
  • 1 x Content Developer

Role

UX Designer

Year

2023

Overview

Azure role-based access control (RBAC) is a system that allows admin users to manage access to resources. A core principle of Azure RBAC is helping customers achieve and maintain “least privilege”: giving each user only the permissions they need to do their job. Applying the least privilege principle is critical for keeping an organization’s data secure.

The aim of this project was to raise awareness of the highly privileged roles that exist in a customer’s tenant and to reduce the number of highly privileged role assignments created by users.

In numbers

328,000+

Customer tenants reached during the experiment

900,000+

Total roles assigned by customers over the course of 3 weeks

    18% decrease

    In the number of highly privileged roles assigned by customers

      Problem

      Users have a difficult time knowing which Azure roles they should assign. Without the insights needed to grant precise access, they often default to assigning highly privileged roles that provide a broad range of access. This causes over-privilege and introduces additional security risks.

      Analyzing the current design

      When adding a role assignment, every role is listed in the same section regardless of privilege. Since users have a difficult time knowing which Azure roles they should assign, they often default to the ones located at the top of the list.

      Design approach

      Using the RBAC data available at the time, we formed various hypotheses about over-privilege. Documenting these early ideas gave us direction for experimentation and for working towards a north star. The hypotheses were prioritized based on the following:

      • Ability to experiment with A/B/C testing
      • Engineering cost
      • Fastest time to market
      • Minimizing throwaway work for future design updates

      Awareness → Interest → Action

      Design changes with the lowest engineering cost began at the “awareness” level. Improving the awareness of highly privileged roles was a starting point for driving our desired behavior. Focusing on small incremental changes also provided insights for future iterations to Azure RBAC.

      Hypothesis

      By separating privileged roles in the role assignment flow, customers will make fewer privileged admin role assignments and instead select more job function roles. This will reduce the number of highly privileged role assignments and result in more secure tenants.

      Designs for experimentation

      Once the hypothesis was formed, I explored various design options for experimentation. I had regular syncs with PMs and engineering to discuss design ideas, brainstorm metrics, and estimate the engineering effort involved with each design update. Engineering had less than two weeks to build all three design variations, which limited the scope of design changes.

      Questions I considered:
      • How might we drive our desired behavior with minimal disruption to the user’s workflow?
      • Does this design work with engineering's time constraint?
      • Do the designs use existing patterns that users are familiar with?
      • How will this design scale for future iterations and RBAC experiments?
      • What will we learn from this design change and what type of metrics will this update support?

      Variation 1 - Tabs added to separate privileged roles from job function roles
      Variation 2 - An additional step added for ‘assignment type’ with radio button options
      Variation 3 - A new privileged role badge and column added to existing list of roles

      Success metrics

      We finalized several metrics to generate insights and determine the impact of our design changes. These metrics were added to a scorecard for evaluation after the experiment. Working closely with data scientists to finalize the metrics and interpret the scorecard was a huge help in this project.

      An example of the metrics scorecard used in our experiment

      Testing method

      The significant amount of traffic to Azure RBAC made it possible to test multiple variations at the same time. Once the new designs were built by engineering, they were pushed to production where user traffic was split evenly amongst the four options (control vs 3 variations). The experiment ran for a total of 3 weeks.
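
An even four-way split like the one described above is often implemented with deterministic hashing, so that the same unit always sees the same design. The following is a minimal sketch of that idea, not the mechanism used in this experiment: the arm names, the experiment key, and the choice of `unit_id` (for example a tenant or user ID; the source does not say which) are all assumptions made for illustration.

```python
import hashlib

# Hypothetical arm names: the write-up only states "control vs 3 variations".
ARMS = ["control", "tabs", "radio_buttons", "badge_column"]


def assign_arm(unit_id: str, experiment: str = "rbac-privileged-roles") -> str:
    """Deterministically map a unit (assumed: a tenant or user ID) to one arm.

    Hashing the experiment key together with the ID gives a stable,
    roughly uniform bucket, so each arm receives about 25% of units.
    """
    digest = hashlib.sha256(f"{experiment}:{unit_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(ARMS)
    return ARMS[bucket]


def arm_counts(unit_ids):
    """Count how many of the given IDs land in each arm."""
    counts = {arm: 0 for arm in ARMS}
    for unit_id in unit_ids:
        counts[assign_arm(unit_id)] += 1
    return counts


if __name__ == "__main__":
    # Synthetic IDs, used only to show that the split is close to even.
    print(arm_counts(f"tenant-{i}" for i in range(4000)))
```

Because the assignment depends only on the ID and the experiment key, a returning unit stays in the same arm for the full three weeks without any stored state.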

      Impact

      Over a 3-week duration, our experiment reached over 328,000 customer tenants and over 900,000 role assignments.

      Key findings

      01

      Privileged role assignments decreased by 18% for tab and radio buttons

      The tab and radio button variations reduced privileged role assignments by more than 40,000. Each resulted in an 18% decrease in the number of privileged roles assigned.

      02

      Smallest UI changes in experiment did not affect users’ behavior

      The privileged role badge and column added for variation 3 did not have any statistically significant impact on our metrics of interest.

      03

      Radio buttons led to increased customer confusion

      Radio buttons led to a significant increase in support cases about finding roles. There was no significant increase in support cases for the other variations.
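
The findings above depend on whether a difference between the control and a variation is statistically significant. The write-up does not say which test was used, so the sketch below shows one common option, a two-proportion z-test, purely as an illustration. The function name and all counts are hypothetical and are not the experiment's data.

```python
import math


def two_proportion_z_test(x_control, n_control, x_variant, n_variant):
    """Compare the share of privileged assignments in two arms.

    x_* is the number of privileged role assignments and n_* the total
    number of role assignments in each arm. Returns the z statistic,
    the two-sided p-value, and the relative change versus control.
    """
    p_control = x_control / n_control
    p_variant = x_variant / n_variant
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (x_control + x_variant) / (n_control + n_variant)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_variant))
    z = (p_variant - p_control) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    relative_change = (p_variant - p_control) / p_control
    return z, p_value, relative_change


if __name__ == "__main__":
    # Made-up counts: 10.0% privileged in control, 8.2% in the variant.
    z, p_value, change = two_proportion_z_test(1000, 10000, 820, 10000)
    print(f"z = {z:.2f}, p = {p_value:.2g}, relative change = {change:.0%}")
```

With these made-up counts the relative change is an 18% decrease, and the p-value is far below the conventional 0.05 threshold. A smaller difference, or fewer assignments per arm, could yield a result that is not significant, as with the badge and column variation.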

      Measuring success

      We held weekly team syncs to analyze the results from our experiment. To determine the “success” of each design treatment, we considered the following questions:

      • Were the differences in the metrics statistically significant, and large enough to justify choosing one design over another?
      • Were there trade-offs between usability metrics and the number of privileged admin roles assigned?
      • What kind of feedback and support cases were we receiving from customers during the experiment?

      Balancing usability metrics with fewer privileged roles assigned, we determined the tab layout would provide an improved security posture for customers. We continued to monitor the metrics to see if these changes would persist long term.

      The tab variation was eventually rolled out to 100% of production traffic.

      Takeaways

      This was my first opportunity to incorporate A/B testing into the design process. It pushed me to think about measurable results and insights with each design decision. The experiment also reminded me not to underestimate the impact small design changes can have.

      Areas of improvement

      We used various metrics to assess the usability of each design. However, quantitative data can be difficult to interpret, and even misleading, when assessing usability. For future experiments, incorporating more qualitative feedback from customers could help measure usability more accurately.

      Next steps

      While this experiment significantly decreased privileged role assignments, our changes do not help users pick the correct role for their needs. This was only the beginning of a series of experiments with the goal of helping customers achieve and maintain least privilege. The next experiment would focus on the highly privileged roles currently assigned in the customer’s tenant.