Skip to content

Javiercn/blazor improved circuit is #11376

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 25 commits into from

Conversation

javiercn
Copy link
Member

@javiercn javiercn commented Jun 19, 2019

  • Makes the circuit id a csrf token:
    • We generate a key with a PRNG.
    • We produce a request token by hashing the key.
    • We produce a cookie token by dataprotecting the key.
      • The cookie name includes a prefix of the request token to distinguish the cookie among all possible options.
      • If this is a problem, we can attach an additional prefix to both the request token and cookie for matching purposes.
  • Uses an endpoint policy matcher that attaches an identity with a claim to the request principal.
    • SignalR service copies a limited amount of information to pass in.
    • The claims from the existing principal are among those pieces of information.
    • The policy matcher wraps the hub endpoints, and gets applied after the signalr service matcher.
    • We use the circuit id to compare it against the claim we setup in the principal.
  • Updates the bootstrapping process to limit 1 circuit per connection.
    • Prerendered and non-prerendered components render on the same circuit.
    • When there are prerendered components, we reconnect to the circuit first and then render non prerendered components.
    • When there are no prerendered components we simply create a new circuit and render the non prerendered components.

Pending

  • Cleanup stale cookies.
  • Use an ephemeral data protector.
  • Reject cross-origin requests when generating cookie.
  • Set cache headers when setting the cookie.
  • Unit tests
  • Hide the request for generating a new circuit id inside the signalr connection protocol.
    • We can't hide the call to generate a new circuit id within the hub protocol due to the fact that the client needs to get a circuit id back it can reuse for reconnection and to start the circuit in the first place.

@@ -41,6 +41,7 @@ and are generated based on the last package release.
<LatestPackageReference Include="Microsoft.AspNetCore.Razor.Language" Version="$(MicrosoftAspNetCoreRazorLanguagePackageVersion)" />
<LatestPackageReference Include="Microsoft.AspNetCore.Testing" Version="$(MicrosoftAspNetCoreTestingPackageVersion)" />
<LatestPackageReference Include="Microsoft.Azure.KeyVault" Version="$(MicrosoftAzureKeyVaultPackageVersion)" />
<LatestPackageReference Include="Microsoft.Azure.SignalR" Version="1.1.0-preview1-10426" />
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Included here so that its easier to test with the rest of the repository.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When this is merged, I'm guessing you'll remove this, and ensure the CTI scripts give us adequate coverage of the SignalR Service scenarios?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't hurt CTI, it only puts the reference in our build infrastructure so that we can do <Reference Import="Microsoft.Azure.SignalR" /> in our test projects when we want to test something against Azure SignalR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to be constantly broken while we're still making changes. I'd suggest not doing this right now and wait until we stop making big API changes that break the azure signalr SDK.

@javiercn javiercn force-pushed the javiercn/blazor-improved-circuit-id branch from ef3eabe to 5ce24a6 Compare June 19, 2019 17:34
@Eilon Eilon added the area-blazor Includes: Blazor, Razor Components label Jun 19, 2019
@javiercn javiercn force-pushed the javiercn/blazor-improved-circuit-id branch from 4bc6dba to 31a0217 Compare June 20, 2019 08:04
@javiercn javiercn changed the title Javiercn/blazor improved circuit Javiercn/blazor improved circuit is Jun 20, 2019
@javiercn javiercn marked this pull request as ready for review June 20, 2019 08:04
@javiercn javiercn force-pushed the javiercn/blazor-improved-circuit-id branch from 6e61131 to 7a35e5c Compare June 20, 2019 10:15
@@ -11,6 +11,8 @@ public static partial class ComponentEndpointRouteBuilderExtensions
{
public static Microsoft.AspNetCore.Components.Server.ComponentEndpointConventionBuilder MapBlazorHub(this Microsoft.AspNetCore.Routing.IEndpointRouteBuilder endpoints) { throw null; }
public static Microsoft.AspNetCore.Components.Server.ComponentEndpointConventionBuilder MapBlazorHub(this Microsoft.AspNetCore.Routing.IEndpointRouteBuilder endpoints, System.Action<Microsoft.AspNetCore.Http.Connections.HttpConnectionDispatcherOptions> configureOptions) { throw null; }
public static Microsoft.AspNetCore.Components.Server.ComponentEndpointConventionBuilder MapBlazorHub(this Microsoft.AspNetCore.Routing.IEndpointRouteBuilder endpoints, string path) { throw null; }
public static Microsoft.AspNetCore.Components.Server.ComponentEndpointConventionBuilder MapBlazorHub(this Microsoft.AspNetCore.Routing.IEndpointRouteBuilder endpoints, string path, System.Action<Microsoft.AspNetCore.Http.Connections.HttpConnectionDispatcherOptions> configureOptions) { throw null; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an API review point, we should consider wrapping both the path and HttpConnectionDispatcherOptions into some new type BlazorHubOptions so that we're free to keep adding more options in the future without an exponentially increasing number of overload combinations. Adding a note to #4050

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair. I would also hope that at some point this ends up being something on ComponentEndpointConventionBuilder instead of overloads.

@@ -2,6 +2,7 @@
// Licensed under the Apache License, Version 2.0. See License.txt in the project root for license information.

using System;
using Microsoft.AspNetCore.Authorization;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line probably isn't required on its own.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, this is cleanup.


const responseBody = await response.json() as { id: string };

return responseBody.id;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the TS-side code, could we rename all references to "circuit ID" to something like circuitIdRequestToken, or whatever its more accurate name now is? Otherwise it's hard to keep track of what aspect of the circuit ID we're sharing with the client.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough

if (!existingCircuit)
{
// If the circuit we are using was created during prerendering don't run
// the lifecycle events again at this time.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this a bug before? I'm surprised we need to change that behavior unless it was a bug before.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit uncertain about the meanings of these lifecycle events now. Previously I would have expected:

  • OnCircuitOpenedAsync to occur during prerendering
  • OnConnectionUpAsync to occur after the client opens the socket connection and is ready for interactivity

... but it seems with this change that there wouldn't be a way to know when the socket connection was opened. Is this right? I'm not sure we've got a formal definition of when these lifecycle events should fire, but now would be a good time to be explicit about it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way things work now:

Prerendered components or prerendered+non-prerendered

  • Prerender -> Runs on Circuit Opened
  • Connect -> Runs OnConnectionUp
  • StartCircuit -> Renders additional components

Only non prerendered components

  • StartCircuit runs OnCircuitOpenedAsyn -> OnConnectionUpAsync -> RendersAdditional components.

OnConnectionUp always marks when the socket is opened and we're ready to perform JSInterop.

return hash;
}

public bool ValidateCircuitId(string circuitId, string cookie, out CircuitId id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I find the namings of the circuitId parameter and the locals in this method hard to follow. It's not obvious to me that circuitId should mean "the token from the querystring but not the whole ID" or rawRequestId should mean "the bit we got from the querystring" or that id should mean "from the cookie". I'd find something like the following easier to interpret at a glance:

var circuitIdFromCookie = FromCookieValue(cookie);
var requestTokenFromQueryStringBytes = Base64UrlTextEncoder.Decode(requestTokenFromQueryString);
var requestTokenFromCookieBytes = Base64UrlTextEncoder.Decode(circuitIdFromCookie.RequestToken);
if (CryptographicOperations.FixedTimeEquals(requestTokenFromQueryStringBytes, requestTokenFromCookieBytes))

Otherwise, interpreting the meanings involves tracing backwards to remember how each data item was derived.

}

public bool ValidateCircuitId(string circuitId)
public bool ValidateCircuitId(string circuitId, ClaimsPrincipal user)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of interest, have we got to the point of verifying that the current user's "name" claim matches the existing one on the circuit when reconnecting? Or is that a future enhancement?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's something we can do in the future, but it's not necessary. The way it works is we add an Identity to the request principal with a claim "bcid" which is the circuit id and we verify that the current principal has a claim that matches the provided circuit id.

}
}

internal struct CircuitId
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this could be readonly for clarity

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I also plan to do some cleanup here.

@@ -105,9 +132,14 @@ public virtual Task DisconnectAsync(CircuitHost circuitHost, string connectionId
return circuitHandlerTask;
}

internal bool ContainsCircuit(string circuitId)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling it circuitId here seems really confusing. The dictionary key is now the request token, not the circuit ID, isn't it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but in all effects is the string we use to identify the circuit

using System.Text.Encodings.Web;
using Microsoft.AspNetCore.Authentication;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure the changes in this file are needed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, just baggage from a previous implementation where it was done through an authentication handler.

@@ -47,6 +49,9 @@ public static IServerSideBlazorBuilder AddServerSideBlazor(this IServiceCollecti
options.SupportedProtocols.Add(BlazorPackHubProtocol.ProtocolName);
});

// Matcher policy to configure csrf protection on the negotiate endpoint
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not really CSRF, is it? My understanding was that we're protecting against circuit ID exfiltration via XSS.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're protecting against the circuit id being used outside of the browser context, if you manage an XSS and get access to the request token you have everything you need to connect to the server and do stuff as you are effectively running in the users context. (Although you need to do it from within the page, which I'm not sure is viable).

The cookie prevents against the circuit id being used from outside the browser context and across origins (as is same-site by default, although this is something likely we will allow configuring, as its required based on conversations I had with Chris)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand - my comment was just about the use of the term csrf in the line I'm commenting on. It seems unrelated to what we're doing here.

internal const string IdClaimType = "bcid";
internal const string CookiePrefix = "Circuit.";

private const int PrefixLenght = 4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private const int PrefixLenght = 4;
private const int PrefixLength = 4;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, are we relying on non-collision of the IDs up to this prefix length for a given user? If so we might be better adding another couple of characters to the prefix length.

(e.g., if these are base64 chars, the 4-char prefix space size is 64^4=16 million)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming you have 50 open circuits on a client at the same time, 0.00000298023223876953125‬ those are your chances of running into a collision I believe. We can add a couple more and won't hurt, but I think its already within target.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. If each user has 2 circuits, then the expected number of users until you get a clash is about 12 million. It is very unlikely for any given user, but people do win lotteries :)

{
internal class CircuitMatcherPolicy : MatcherPolicy, IEndpointSelectorPolicy
{
internal const string IdClaimType = "bcid";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm being overly-cautious here, but might it be valuable to give this a more unique name? It's not hard to imagine that people might try using the claim name bcid to mean something completely different in their application.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claims need to be short, but I would accept mabcid (Microsoft AspNetCore Blazor Circuit ID) sounds good?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, sounds much less likely to clash.

@@ -0,0 +1,232 @@
// Copyright (c) .NET Foundation. All rights reserved.
// Licensed under the Apache License, Version 2.0. See License.txt in the project root for license information.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would definitely like to see @rynowak's review on all the endpoint-related stuff, as it's quite far outside my area of expertise.

var previousIdentifier = CaptureScopeIdentifier();
for (int i = 0; i < 50; i++)
{
Browser.Equal(i + 3, () => Browser.Manage().Cookies.AllCookies.Where(c => c.Name.StartsWith("Circuit.")).Count());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a call to Browser.Manage().Cookies.DeleteAllCookies(); before we get here? Otherwise it seems like this could be flaky.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, It was the only test in this class and then I ended up adding more.

var ids = Browser.FindElements(By.CssSelector(".circuit-id")).Select(e => e.Text).ToArray();

Assert.Equal(2, ids.Length);
Assert.Single(ids.Distinct());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this test is doing anything to wait until the UI updates before reading the displayed circuit IDs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same with the other two tests below.

@@ -2,6 +2,7 @@

<PropertyGroup>
<TargetFramework>netcoreapp3.0</TargetFramework>
<AspNetCoreHostingModel>OutOfProcess</AspNetCoreHostingModel>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of interest, why is this needed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to revert this one, but I wasn't able to run InProc IIS Express, (I haven't for a while) so I switched it to the old OutOfProccess


<script src="_framework/blazor.server.js" autostart="false"></script>
<script>
// Used by InteropOnInitializationComponent
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy-paste artifact?

Also it would be great not to be duplicating the boilerplate we see here from L22-L38. Could this be factored out into some static .js file shared among the E2E test cases that use it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same with file below

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair.

app.UseRouting();

app.UseEndpoints(endpoints =>
{
endpoints.MapControllers();
endpoints.MapRazorPages();
endpoints.MapBlazorHub();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is now here, do we still also need L97?

@SteveSandersonMS
Copy link
Member

A couple of more high-level questions:

  1. What am I doing wrong?

I tried running this with the Azure SignalR service, and found it was working fine only on the first connection per browser instance. After connecting once, pressing F5 in the browser to reload would fail with this:

image

To create this situation, I did the following:

  • In ComponentsApp.Server.csproj, added <Reference Include="Microsoft.Azure.SignalR" />
  • In its Index.cshtml, eliminated prerendering (i.e., reduced <app> to <app>Loading...</app>)
  • In its Startup.cs, added in ConfigureServices services.AddSignalR().AddAzureSignalR("<connection string>"); and changed the MapBlazorHub call to endpoints.MapBlazorHub<ComponentsApp.App.App>("app");

Am I doing something wrong, or is this a bug? I'm not sure where to find the underlying error since all it says in the browser is that it couldn't call StartCircuit. I tried putting a breakpoint inside that method, and it wasn't getting hit at all when this error occurs (though does get hit when the error doesn't occur when I use a fresh new browser).

  1. Can XSS code hijack a circuit using SignalR's negotiate endpoint?

I see that the protocol to connect is that we first make a POST request to _blazor/negotiate, passing the circuitId on the querystring, plus obviously any cookies go with the request too. The response gives a URL and access token for connecting to the SignalR service.

So, what's to stop XSS code from also doing a POST request to the same endpoint from your origin? It will also be sent with the cookie that includes the data protection version of the circuit identifier. Then the XSS code can read the response and exfiltrate the URL and access token.

Isn't that enough for an attacker then to connect to the SignalR service with your access token and assume control over your circuit?

// Circuits for which there isn't an already created circuit will contain an additional KeepAlive cookie
// that will be used as a filter to avoid deleting the circuit id cookie. The KeepAlive cookie for a given
// circuit will go away automatically.
private void CleanupStaleCookies(HttpContext context)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't spot any E2E (or unit) tests covering this. Did I miss this? Will some be added?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does work (I verified manually) but I chose not to add one because it takes until the circuit goes away to take effect, so I didn't want to add a test that waits for 4 minutes.

In retrospective I think I might be able to add an endpoint that pings the server and tells it to dispose a circuit out of band, and that would allow for a test this in an E2E fashion

@javiercn
Copy link
Member Author

javiercn commented Jun 24, 2019

I tried running this with the Azure SignalR service, and found it was working fine only on the first connection per browser instance. After connecting once, pressing F5 in the browser to reload would fail with this:

Haven't experienced this in the past. I did a demo last friday where I showcased it using Azure SignalR service without issues, but I'll validate again. Maybe I simply prerendered things.

@javiercn
Copy link
Member Author

  1. Can XSS code hijack a circuit using SignalR's negotiate endpoint?

I see that the protocol to connect is that we first make a POST request to _blazor/negotiate, passing the circuitId on the querystring, plus obviously any cookies go with the request too. The response gives a URL and access token for connecting to the SignalR service.

So, what's to stop XSS code from also doing a POST request to the same endpoint from your origin? It will also be sent with the cookie that includes the data protection version of the circuit identifier. Then the XSS code can read the response and exfiltrate the URL and access token.

Isn't that enough for an attacker then to connect to the SignalR service with your access token and assume control over your circuit?

I don't think there's anything that fully protects you against XSS, but this limits the impact of an XSS in the sense that every action to be taken needs to be performed within the page (as you don't have access to the cookie) which means you cannot send the circuit id to a 3rd party server to connect to the app and party on the circuit, a compromised app is easier to spot (as there's gonna be network traffic) and the attack is much more unreliable as the browser window can be closed at any time.

The other data point is that the circuit id or request token in this case, is not enough information to connect to the circuit by itself, so it puts us in a much better place when it comes to it being logged or leaked somewhere.

How the access token is handled belongs to the signalr service. If its a single use token then multiple connections could be detected and the connection dropped all together.

XSS is probably the most powerful way to PWN a site, so its hard to fully protect against it unless the user uses something like CSP.

@javiercn
Copy link
Member Author

Closing this as given that we are getting rid of stateful prerendering there's nothing to savage from the PR associated with this issue..

@javiercn javiercn closed this Jul 18, 2019
@javiercn javiercn deleted the javiercn/blazor-improved-circuit-id branch July 18, 2019 14:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-blazor Includes: Blazor, Razor Components
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants