What is user impersonation?
User impersonation is anything that allows your systems to believe the currently logged-in user is someone else. With regards to JWTs and access tokens, this means that one user obtains a JWT that contains another user's User ID. User impersonation, or logging in as a customer, can be used as a tool to help identify many issues, from user authentication and onboarding to corrupted data in complex multi-service business logic flows.
However, at first glance it should be obvious that there are major security implications with such an approach. Even if it isn't, this article will extensively review user impersonation and its security implications, as well as offer alternative suggestions for achieving a similar outcome in a software system without compromising security.
The impersonation use cases
No solution is relevant in a vacuum, so let's consider the concrete issues that you might actually have, and the reason you've arrived at this Authress Academy article. If we were to jump straight into a solution, we would definitely end up sacrificing security or, worse, our users' sensitive data in favor of suboptimal solutions.
Possible use case user stories:
- One of your users reports that they are experiencing an issue with a screen in your application portal not showing the correct information. As a support engineer, you want to review the exact display in the application UI that your user sees, so that you can verify the UI is indeed broken and something is actually going wrong.
- Similar to the above, you want to know whether the display issue is the result of a problem with the UI itself or with the data that the application UI is fetching, hence a service API issue.
- Sometimes it is a problem with a complex API server flow. A click in your application portal was expected to perform a data change, transformation, or API request to your backend services, but may not have been sent with the appropriate data. As a product engineer, you would like to know that the correct request data is being sent in the request to your service API.
- As a system admin, multiple third-party systems are interacting with each other and something™ isn't working, and because you are a great collaborator, even though it isn't your responsibility, you want to help out your customers.
Now, this list isn't exhaustive, but already you can start to see that while user impersonation might be useful for these concrete problems, none of them actually requires it to debug. The root causes often fall into at least one of these categories:
- This is a UI component display issue.
- An unexpected request is being sent or isn't sent to your service API from your application portal.
- The wrong data is being sent in the request from your application UI to your API.
- It is a `READ` permissions data issue for the user.
- It is a `WRITE` permissions data issue for the user.
- It is a multi-system problem and not an access issue, and your goal for continued debugging is a duplicated environment that exactly matches current production.
Note: none of these scenarios even gets close to needing user impersonation; each has straightforward alternatives that are both secure and frequently simpler to implement.
Supported libraries
Fundamentally, user impersonation is insecure by design; we'll see why in a moment. There are much better ways to provide insight into your specific scenario that actually take security into account. But let's assume that we do implement user impersonation. Is there help available to us from our favorite overengineered solution?
- Ruby - Rails pretender
- Python - Django hijack
- Nodejs - Express/Passport impersonate
- Insert your favorite monolithic HTTP Framework here ➤ Deprecated Solution
What's interesting is that in doing the research to actually find existing implementations, 86% of the repos and links I found:
- No longer exist, and haven't existed for quite some time
- Were archived over 5 years ago
- Have less than 10 stars on GitHub
Even if people are trying to make this happen, the tools don't even exist to ensure that we are doing it correctly and safely. The results of this search tell us something. Even more surprising is that most of the Auth SaaS solutions don't offer this either. As it turns out, either no one really cares that much, or it is next to impossible to get right, such that no solution can exist. Well, that can't be right.
Dangers of user impersonation
Let's assume for a moment that the collective wisdom is correct, and no solutions exist because it is dangerous. What exactly are those dangers? To help convey these issues, say that we managed to get one of these legacy packages above actually working with our system, the first problem that we'll run into is:
Who actually has access to perform this User Impersonation in the first place? Who are our admins?
1. Defining the admins
Of course, allowing everyone to impersonate one another basically means our authentication provides no value. We might as well let users enter whatever username they like on every post they make. Realistically, we want to restrict this list to those for whom it actually makes sense to have the ultimate `su` privilege.
Figuring out who the admins should be and maintaining access to that closely guarded endpoint that grants user impersonation is a common problem that eludes even the most sophisticated companies. The most notorious examples of getting this wrong were the Twitter 2020 admin tools hack and the Microsoft Storm-0558 breach. Attackers were able to compromise admin-level account tools and use them to steal data and impersonate actual users. Historically, one of these companies had paid significant attention to its own internal security, was, if not the first, one of the first to introduce the notion of public social logins, and was no stranger to the issues at hand; the other was Microsoft.
Challenge 1: Maintaining both the admin list, and correctly securing the endpoint to allow impersonation in the first place.
2. The implementation
The next issue regarding impersonation becomes apparent when we start to question how it can even work in practice. In theory, theory and practice are the same; in practice, they are not.
Once an admin is authorized to impersonate a user, what exactly is happening in our platform? Let's flash back to authentication. In order to secure your system, to ensure the right users have access to the right data at the right time, your users must send a session cookie or session token on every request, which your API can verify to confirm that the user is logged in. This could be a completely opaque GUID that represents some data in your database (a reference token) or a more secure JWT that is stateless. In any case, your system identifies users via your Authentication Strategy, and at the end of the day identification comes down to a single property in a single object somewhere. An example could be the JWT subject claim `sub` property:
User user_001 JWT access token:

```json
{
  "iss": "https://login.authress.io",
  "sub": "user_001",
  "iat": 1685021390,
  "exp": 1685107790,
  "scope": "openid profile email"
}
```
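To make concrete how a service typically reads that identity, here is a minimal Node.js sketch that extracts a JWT's payload and reads its `sub` claim. The function name is invented for illustration, and this is decode-only: a real service must verify the token's signature (for example against the issuer's JWKS) before trusting any claim.

```javascript
// Minimal sketch: read the identity out of a JWT payload.
// WARNING: decode-only, for illustration; real services must verify
// the signature before trusting any claim in the token.
function decodeJwtPayload(token) {
  const payloadSegment = token.split('.')[1];
  return JSON.parse(Buffer.from(payloadSegment, 'base64url').toString('utf8'));
}

// Build an unsigned stand-in token just to demonstrate the decode:
const payload = { iss: 'https://login.authress.io', sub: 'user_001' };
const fakeToken = [
  Buffer.from(JSON.stringify({ alg: 'none' })).toString('base64url'),
  Buffer.from(JSON.stringify(payload)).toString('base64url'),
  'signature'
].join('.');

console.log(decodeJwtPayload(fakeToken).sub); // → user_001
```

However the decoding is done, the key point stands: the platform's entire notion of "who is this" collapses into that one `sub` value.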
In OAuth/OpenID, the `sub` claim in a JWT represents the User ID. Thus this particular token represents a verified user with the identity `user_001`. Anyone that holds this token now has the ability to impersonate this user. Hopefully, you have some logging in place to identify when a user is being impersonated and who actually started the impersonation process. But how do we actually impersonate this user?
Well of course, I need to convert a token that represents my admin user into a token that represents the user I want to impersonate. This would be an example of the token that I have right now.
My admin user token:

```json
{
  "iss": "https://login.authress.io",
  "sub": "me_admin",
  "iat": 1685021390,
  "exp": 1685107790,
  "scope": "openid profile email"
}
```
Since our system in this scenario uses the `sub` property to determine which user is accessing the system, I of course need a token that replaces the current value of the `sub`, which is `me_admin` for me, with one that contains the `sub` of `user_001`. So when I impersonate the user, the result must be a token that looks exactly like the user's token:
User token generated by the admin:

```json
{
  "iss": "https://login.authress.io",
  "sub": "user_001",
  "iat": 1685021390,
  "exp": 1685107790,
  "scope": "openid profile email"
}
```
Some http/auth frameworks have thought a whole two seconds longer than the rest and might have decided to add an additional property to indicate that the token was created through the process of impersonation by an admin, instead of directly by the user:
User token generated by the admin with magic:

```json
{
  "iss": "https://login.authress.io",
  "sub": "user_001",
  "generated_by": "me_admin",
  "iat": 1685021390,
  "exp": 1685107790,
  "scope": "openid profile email"
}
```
And this might even seem like a good idea; however, in practice it creates a Pit of Failure. Enabling admins to create new tokens that contain the original user causes two distinct problems.
The first issue is that one admin user can impersonate another admin user. And that second admin user might be one that potentially has more access and is authorized for more sensitive information. This means that it isn't so straightforward to just add in impersonation and assume that everything will work out. Our list of admins can no longer be just a flat list; it must now also contain some hierarchical order of who can impersonate whom. If you've been following along, this looks a lot like what Authress Authorization provides. Of course you don't absolutely have to have that, but if you don't, then you've sacrificed some security.
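To make the hierarchy problem concrete, here is a hypothetical sketch of the extra check you would now be forced to build and maintain. The privilege levels and function names are invented for illustration; in reality this mapping must live somewhere authoritative and be kept current, which is exactly the maintenance burden being described.

```javascript
// Hypothetical privilege levels; keeping this mapping correct and
// up to date is itself the new security problem impersonation creates.
const privilegeLevel = { super_admin: 3, me_admin: 2, user_001: 1 };

// An admin may only impersonate strictly less-privileged identities.
function canImpersonate(adminId, targetUserId) {
  return (privilegeLevel[adminId] ?? 0) > (privilegeLevel[targetUserId] ?? 0);
}

console.log(canImpersonate('me_admin', 'user_001'));    // → true
console.log(canImpersonate('me_admin', 'super_admin')); // → false
```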
The second issue is that not every application you have might be interested in allowing users to be impersonated. Any mature system, and even most early software ventures, has some data that you are even less interested in exposing than the rest. Sensitive-by-nature or regulated data fits this picture. This could be Personally Identifiable Information (PII), credit cards (PCI-DSS), or really anything that has been regulated in your locality by governing bodies. You might breach these regulations through user impersonation if, for instance, your support engineer is in a different data residency than the user. For example, when attempting to debug issues in a UI, the Date of Birth (DOB) of the user almost never absolutely needs to be shown on the screen. Sure, it is relevant in some user use cases, but in most debugging scenarios it is not.
- If your authentication depends on the `sub` property in the JWT, then an application cannot opt out of user impersonation. Since you are changing the `sub` to be the impersonated user, every application will see the new `sub` value, even if it does not want to support user impersonation. Strike 1.
- All applications are forcibly opted in. If an application wants to opt out, then the second claim `generated_by`, or its respective implementation, is required. But then still, all applications are opted in. That means when you design a new application, you have to know in advance that you might want to stop admins from accessing user data in this application: "data is insecure by default, unless explicitly designed otherwise". This is the pit of failure; a pit of success would be opt-in: "data is secure by default, unless otherwise excluded". Strike 2.
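The forced opt-in shows up directly in code. In this hedged sketch (the function and flag names are hypothetical), a service that does not want impersonated traffic has to remember to add an explicit rejection itself, which is exactly backwards from secure-by-default:

```javascript
// The insecure default in action: every service that wants to opt out
// of impersonation has to remember to add this guard on its own.
function authorizeRequest(claims, serviceAllowsImpersonation) {
  if (claims.generated_by && !serviceAllowsImpersonation) {
    throw new Error('Impersonated tokens are not accepted by this service');
  }
  return claims.sub; // identity resolution continues as normal
}

const impersonatedClaims = { sub: 'user_001', generated_by: 'me_admin' };

console.log(authorizeRequest({ sub: 'user_001' }, false)); // → user_001
// authorizeRequest(impersonatedClaims, false) would throw; forgetting
// the guard silently accepts the impersonated token.
```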
A quick call-out is worthwhile on how to secure data like a user's DOB. UIs don't need to know this information in most cases. The screens and activities where the DOB is valuable actually care whether the user `isBornInJanuary` or `isOlderThan18`, and not the actual date of birth of the user. Unless of course this is the user's DOB entry screen, in which case this component rarely needs to be validated by a support engineer; and if you believe that user impersonation is necessary to help validate the user's DOB entry screen, this article isn't going to be of any help to you.
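One way to sketch this derived-claims idea; the function name and returned claim names here are illustrative, not any standard:

```javascript
// Compute derived booleans server-side so the raw DOB never has to
// leave the service boundary. The 365.25-day year is an approximation,
// which is fine for a boolean threshold sketch like this.
function deriveDobClaims(dobIsoString, now = new Date()) {
  const dob = new Date(dobIsoString);
  const ageYears = (now - dob) / (365.25 * 24 * 60 * 60 * 1000);
  return {
    isBornInJanuary: dob.getUTCMonth() === 0,
    isOlderThan18: ageYears >= 18,
  };
}

console.log(deriveDobClaims('2000-01-15', new Date('2024-06-01T00:00:00Z')));
// → { isBornInJanuary: true, isOlderThan18: true }
```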
3. Secondary system data leakage
Not only do we need to worry about vulnerabilities in our primary user applications, as well as leaking the data associated with them, now we also need to worry about protecting these secondary systems used to impersonate users AND leaking the data associated with them as well. Internal systems, by their very design, usually end up having worse security measures in place because fewer people use them. Fewer users and lower volume mean more hacks and less attention given to such an app. In practice, these applications are rarely changed, but frequently break, and most importantly have low priority when it comes to innovation and implementing necessary improvements. They don't end up in your OKR Objectives for this quarter, and no one is getting promoted for maintaining them.
We are so concerned that someone is abusing these tools that we ourselves leak user access tokens and data to logging systems. We log so zealously to ensure we have captured the usage of these tools that we end up logging that which we should not. And when we log, that means we've probably also exported these logs to some third-party reporting tools. It is a Catch-22: we know we need to log and report on actions taken as an admin when impersonating a user, but that means logging data we would not normally log. The goal of preventing security issues creates a new attack surface.
The result is that these systems will likely end up logging usage of user tokens. That's the introduction of a new attack surface, and due to the low priority given to fixing them, these systems are actually twice as likely to leak user data compared to our primary user applications.
4. Corrupted audit trails
Frequently we can conclude a priori that user impersonation is actually wrong. In the debugging scenarios, the last thing you want is to gain access to modify the user's data. If you actually needed to modify a user's private data, or one of your customers' account information, you definitely want a dedicated system to handle that. This means you don't actually want to be the user; you don't want to impersonate the user; you just want to be the user with the explicit caveat of read-only permissions. You only want to see what they see, not actually be able to modify their data. Accidentally modifying user data is guaranteed to happen if the only way to verify a user-facing UX problem is to completely impersonate a user and get full write access to their account.
Without even thinking hard about it, the following issues are associated with impersonating the user in this context:
- Audit trails incorrectly say the user changed data when they did not. ➤ An admin impersonating the user did it.
- The user's sessions may start to include the one generated by the admin. ➤ As a user, it would be an understatement to say they would be concerned if they saw a session in a sensitive account modifying data from a location they are not in.
- Logging data in the applications is incorrectly recorded, or may not be recorded at all. ➤ You may be tempted to hide these admin interactions.
- And lastly, in every case, now we need to alter our systems to be not only aware of how to process the data due to impersonation, but how to log it. ➤ Impersonation is a virus that starts to infect all of our systems.
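A sketch of what an impersonation-aware audit record would need to capture to avoid the corruption described above; the field and function names are invented for illustration:

```javascript
// Distinguish who really acted (actor) from whose account was touched
// (subject), so impersonation cannot silently corrupt the audit trail.
function auditEvent({ actorUserId, onBehalfOfUserId, action }) {
  return {
    timestamp: new Date().toISOString(),
    actor: actorUserId,                        // who really performed the action
    subject: onBehalfOfUserId ?? actorUserId,  // whose data was touched
    impersonated: Boolean(onBehalfOfUserId && onBehalfOfUserId !== actorUserId),
    action,
  };
}

const event = auditEvent({
  actorUserId: 'me_admin',
  onBehalfOfUserId: 'user_001',
  action: 'document:update',
});
console.log(event.impersonated); // → true
```

Notice that every system consuming these events must now understand the actor/subject distinction, which is the "virus" spreading through the platform.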
The practical-ish solutions
If generating a new token that contains the impersonated User ID is so bad, there must be better solutions out there.
Solution A: Additional token claim property
What if we don't change the subject `sub` claim, but instead add a new claim? That way, only those services that understand this claim, and actually want to use it, would choose to use it. Services that don't know about it keep using the unmodified `sub` claim. Admins would still look like admins. Only services that care about a new `adminIsImpersonatingUserId` claim property would know to use it and how to handle it. This would give you security by default, and only expose services to the danger that have already explicitly designed support for it. You would have to opt in. Success, finally!
Theoretically this is great, and while it is a bit more secure than altering the subject, in practice, we start to write code that looks like this:
Resolve User Identity:

```javascript
function resolveUserIdentity(jwtToken) {
  // Prefer the impersonation claim when present; otherwise fall back to the standard subject claim
  return jwtToken.adminIsImpersonatingUserId ?? jwtToken.sub;
}
```
Then that code ends up in a shared library which all our services implement. So while our intentions were good, the reinforcing system loops cause this to be no better than the alternatives. The reason is that we often find the need to optimize our usage across even a small number of services, and some believe any code duplication is a bad thing. So the `resolveUserIdentity` method leads us to the following pattern:
- We change our Auth solution to add the new claim to the JWT during impersonation.
- Only those services that need to care about this add support for it.
At this point we are still 100% secure. But then:
- We update some shared libraries that support JWT verification and add the method `resolveUserIdentity` to them.
- The `resolveUserIdentity` method replaces all the existing checks so they consume the new claim.
- All existing services get updated to use this shared library, and are exposed to the dangers of impersonation.
A new claim won't help us. This means that now we are back to the same problem, and arguably the situation is worse. Instead of all the services in the platform trusting the standardized `sub`, we now maintain a bespoke solution just for our system. This is especially important: the `sub` claim is an OAuth and OpenID industry standard (RFC 9068), and everyone in the industry is familiar with it. However, just for your system, there is now a new claim which ends up being treated as the canonical `sub`, but it is not standard, not self-documenting, unexpected, and unique. Complexity reduces security. Strike 3.
For more about the systemic issues with a JWT or session token based permission system, permission attenuation is discussed in depth in the token scoping academy topic.
Solution B: DOM Recording
See earlier impersonation use cases.
If we flash back to the original user stories that drove us to implement user impersonation in the first place, we might start to see a pattern emerge. Most of the time the issue is that something is wrong with the User Experience. The user is stuck in some way, the data isn't being displayed correctly, or some component is broken.
All of these are user-facing issues, purely in the UI. The source of the data, and the security therein, has near-zero value to us in validating the user experience. Attempting to use expensive full user impersonation instead of simple UI component tests is the exact same problem we see when tests are incorrectly implemented at the wrong level.
Let's use the Testing Pyramid as an analogy. The canonical testing pyramid is this:
- At the bottom are our unit tests. Those tests are cheap and easy to write, find the most issues, and ensure our system is working without much effort.
- Then come the service-level tests, or in the case of UIs, our screen tests. Multiple pieces of functionality and components are combined together in these tests. We don't want many of them; perhaps 10% max of all our tests should test full screens or services. Most of the functionality of the service or screen is already validated in the unit tests, i.e. we know that our core functions, as well as buttons, sliders, pickers, etc., all work correctly.
- Now come the 1% integration or end-to-end tests. You almost never want these; only the most critical flows of your application should be validated. When they report a failure, you have no idea what might have caused that particular failure; you just know there is a problem. In the case of an application like a social media platform, the integration test you want is making a new post. (Obviously there is no reason to test the login flow, since your auth provider already has you covered there!)
- At the top of the pyramid is manual exploratory testing. That which cannot be automated, and most importantly needs the intelligence and creativity of a human to identify potential problems in your software application. This is the most expensive and you rarely have an interest in squandering this effort.
The only difference between this and a support case is the context — the why. The services, applications, business logic, and tools that we have at our disposal are all the same. We need to trust that our tests exist to validate the problems we could have. It is always a mistake to invest effort in the top of the pyramid when we lack the assets at the bottom. Likewise, our support pyramid is this:
- At the bottom is application logs. There is no sense in attempting to tackle any of the higher layers until you have sufficient application logs that exactly report incoming requests, outgoing responses, unexpected data scenarios, edge cases that aren't completely implemented, and systemic issues.
- Just above that is documentation. This includes expected common flows, uncommon flows, and demos of the more complex-to-use aspects of our application. The biggest benefit of this documentation is that it enables us to help our users. I want to repeat that it is more for us than it is for our users. The pyramid exists to inform us what we should do, not how our users should operate.
- The next rung up is user recordings. For users that are having issues, we have concrete recorded data for their flows. The flows would include anything relevant to the application: how they used it and what actions they took. All so we can actually see what happened in context when there is a problem. No one wants to spend any time looking at recordings if they don't have to. It is also very difficult to identify the root cause of problems by reviewing a recording, but having them is indispensable to your support engineers when they need them, when a user has reported an issue. Solutions include PostHog, FullStory, and Sentry. If you don't have these recordings, then the next best alternative (which is very far away) is getting a live screencast from the user. These are less useful and more expensive to obtain. Worst of all, they can be, and have been, used to breach sensitive systems.
- At the very top, is of course the thing you never want to have to do, and the topic of this article: Full user impersonation. If everything else fails then at least we have user impersonation left in our toolkit. But this must only be used after we have significantly invested in all the other strategies.
Assuming we have tackled the bottom two rungs of the pyramid, the missing next component is the user recordings. If you have those, with the ability to sanitize the data coming from users, then you've got the solution to 99% of all support cases. Having people jump in and impersonate users is just not necessary. And most importantly, if we look at who often needs to impersonate users, it isn't even the people who should have access to do so.
Revisiting user impersonation
Do you want to see the data, or do you want to see what the user sees? In almost every case it is the former, and seeing the data can be done through an admin app. In the rare case that it is the latter, we would need the exact permissions the user has, or some safer strict subset of them. So what's the right way to handle user impersonation in the case that we just can't live without it?
The most important principle here is Secure by Default. As we've seen so far, a blanket implementation is wrong, and there are too many pits of failure with the JWT, auth session, or reference token based approaches.
Looking at the support engineer use case, our needs would be satisfied if we were to explicitly hand out to the support staff just the `read:logs` permission needed to handle that specific support case. It is quite something else to generate whole valid tokens that contain a subject different from the user requesting them and hand those out to specific people. So as long as we have a system that allows us to provide our team members with explicit permissions to only the exact resources they need, we have the capability to ensure a secure system that also solves all our use cases.
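As a sketch of that explicit-permissions alternative, here is a hypothetical in-memory grant store. The function names, permission strings, and ticket IDs are all invented for illustration; in a real system this would be backed by your authorization service rather than an array.

```javascript
// Grant a support engineer one scoped, expiring permission for a single
// ticket, instead of minting a token that pretends to be the user.
const grants = [];

function grantSupportAccess(engineerId, permission, ticketId, ttlMs) {
  grants.push({ engineerId, permission, ticketId, expiresAt: Date.now() + ttlMs });
}

function hasPermission(engineerId, permission) {
  return grants.some(g =>
    g.engineerId === engineerId &&
    g.permission === permission &&
    g.expiresAt > Date.now());
}

// One hour of read-only log access, tied to a support ticket:
grantSupportAccess('support_eng_1', 'read:logs', 'TICKET-42', 60 * 60 * 1000);

console.log(hasPermission('support_eng_1', 'read:logs'));  // → true
console.log(hasPermission('support_eng_1', 'write:logs')); // → false
```

Identity never changes here: the engineer stays the engineer, every action is attributable, and the extra access evaporates on its own.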
How Authress supports user impersonation
I want to end this article with a discussion about how Authress solves the top-of-the-pyramid user impersonation story. The caveat here is that it is sometimes a trade-off some companies really want. They absolutely want to sacrifice security, increasing their vulnerabilities as well as their attack surface, by introducing full user impersonation functionality. However, from experience, very few of our customers have anything implemented in this space at all, and those that do have hooked their process into easy-to-grant permissions through Authress, rather than full user identity impersonation.
The real solution is to actually consider your support team persona when designing features. And this is what Authress optimizes for.
The flow that we consider the most secure is to explicitly and temporarily grant your support user persona exactly one small additional set of permissions relevant to the support case. When we do this, we don't change how we determine identity; we only change the way we determine access. Authress supports this by allowing quick cloning of user-based Access Records, which represent the permissions a user has. Since cloning is dynamic, a temporary access record can be created that only contains the `READ` equivalents of the roles that the user has. In most cases, you can just directly assign your support engineers to an Authress Permission Group with `READ ✶` access, and never need to touch permissions again.
Here is an example cloned access record, where the support engineer received just the Viewer role to all organizations, so that documents and users could be read but not updated:
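As a rough illustration only, not the exact Authress API schema, such a cloned read-only access record could look something like this; the role name, resource URI, and user IDs are hypothetical:

```json
{
  "name": "Support case TICKET-42: read-only clone of user_001's access",
  "users": [
    { "userId": "support_eng_1" }
  ],
  "statements": [
    {
      "roles": ["Viewer"],
      "resources": [
        { "resourceUri": "/organizations/*" }
      ]
    }
  ]
}
```

Because the record is separate from the engineer's identity, it can be deleted or allowed to expire when the ticket closes, without touching any tokens.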
The firehose recommendations
In case you want to ignore the advice of this academy article and not use Authress permissions to drive access control as recommended, I do want to include recommendations that will help reduce the impact of security and compliance issues related to user impersonation:
- Do not hide user impersonation; it will be tempting to obscure its usage from your customers. Instead, make sure it is visible and clear for everyone, especially your customers. I know you don't want them to know, but they should know; they may even need to know, especially if something goes wrong.
- Make sure all actions are recorded in an audit trail, both by your admin who impersonated the user and by the application user. Especially the admin. There will definitely be questions related to "the last person that touched this" and of course "it was working before your team looked at it". You will need a way to be confident in your response to your customers when it wasn't an admin that touched it last.
- If you're operating in any high-security environment, FedRAMP, ITAR, or the like, always require customer user action before the support engineer has access to the account data. Some prominent cloud providers believe an email with the user agreeing is sufficient for this. I'm here to say it is not sufficient. Because often the people who can create support cases do not, and should not, have admin access to the customer account to view all the data, someone without the customer admin role should not be able to grant your support engineering staff access to sensitive data in the account. You need an admin to click a button. This is usually done through a Step-Up Authorization Request.
- Impersonation can be valuable in some environments however often completely useless in others. Especially in spaces with regulatory requirements, it's much better to diagnose issues from outside the impacted account, either through data replication or a permissions based approach.
- Ensure your impersonation logic is completely tested. There should be no better tested piece of functionality in your software system.
- Audit trails should always keep a "This was run by User X" annotation on audit records: not just the user ID, but any additional information about the admin. Our recommendation is to include both the `Admin User ID` and the `Support Ticket ID` on every log statement.
- Start with your customer expectations. What sort of transparency do they explicitly expect? Do not guess. Err on the side of overcommunicating rather than undercommunicating.
- Please revisit doing this in the first place if you don't have the capacity to have a dedicated team accountable for this functionality. Often this will involve your legal team when it doesn't go right.
- When (not if) credentials leak, who leaked those credentials? Was it your customer, or was it through your admin application or one of your support engineers? Always be able to tell where those credentials came from, so that you can respond to the compromise as effectively as possible.
- If you want to start anywhere, go back and invest in your admin/support tools so that they can expose the data that you need, rather than focusing on user impersonation. If those tools are insufficient check back at the Support Engineer Pyramid again.
For help understanding this article or how you can implement a solution like this one in your services, feel free to reach out to the [Authress development team](https://authress.io/app/#/support), or follow along in the Authress documentation and join our community.