diff --git a/docs/architecture/distributed-cloud-native-apps-containers/Aspire ebook asset creation.pptx b/docs/architecture/distributed-cloud-native-apps-containers/Aspire ebook asset creation.pptx new file mode 100644 index 0000000000000..d1268ed5ef075 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/Aspire ebook asset creation.pptx differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/gateway-patterns.md b/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/gateway-patterns.md new file mode 100644 index 0000000000000..10d329181d65f --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/gateway-patterns.md @@ -0,0 +1,168 @@ +--- +title: Gateway patterns +description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Gateway patterns +author: +ms.date: 04/25/2024 +--- + +# Gateway patterns + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +In a cloud-native system, front-end clients (mobile, web, and desktop applications) require a communication channel to interact with independent back-end microservices. + +How can you satisfy this need? + +To keep things simple, a front-end client could *directly communicate* with the back-end microservices. + +![Diagram showing direct client to service communication.](./media/direct-client-to-service-communication.png) + +**Figure 13-1**. Direct client to service communication + +With this approach, each microservice has a public endpoint that is accessible by front-end clients. In a production environment, you'd place a load balancer in front of the microservices, routing traffic proportionately. + +While simple to implement, direct client communication would be acceptable only for simple microservice applications. This pattern tightly couples front-end clients to core back-end services, opening the door for many problems, including: + +- Client susceptibility to back-end service refactoring. +- A wider attack surface as core back-end services are directly exposed. +- Duplication of cross-cutting concerns across each microservice. +- Overly complex client code. Clients must keep track of multiple endpoints and handle failures in a resilient way. + +Instead, a widely accepted cloud design pattern is to implement an [API Gateway Service](../../microservices/architect-microservice-container-applications/direct-client-to-microservice-communication-versus-the-api-gateway-pattern.md) between the front-end applications and back-end services. + +![Diagram showing the API Gateway pattern.](./media/api-gateway-pattern.png) + +**Figure 13-2**. API gateway pattern + +In the previous figure, note how the API Gateway service abstracts the back-end core microservices. Implemented as a web API, it acts as a *reverse proxy*, routing incoming traffic to the internal microservices. + +The gateway insulates the client from internal service partitioning and refactoring. If you change a back-end service, you accommodate it in the gateway without breaking the client. It's also your first line of defense for cross-cutting concerns, such as identity, caching, resiliency, metering, and throttling. Many of these cross-cutting concerns can be off-loaded from the back-end core services to the gateway, simplifying the back-end services. + +Care must be taken to keep the API Gateway simple and fast. Typically, business logic is kept out of the gateway. 
A complex gateway risks becoming a bottleneck and eventually a monolith itself. Larger systems often expose multiple API Gateways segmented by client type (mobile, web, desktop) or back-end functionality. The [Backends for Frontends](/azure/architecture/patterns/backends-for-frontends) pattern provides direction for implementing multiple gateways.
+
+![Diagram showing the backends for frontends pattern.](./media/backend-for-frontend-pattern.png)
+
+**Figure 13-3**. Backends for frontends pattern
+
+Note in the previous figure how incoming traffic is sent to a specific API gateway, based upon client type: web, mobile, or desktop app. This approach makes sense because the capabilities of each device differ significantly in form factor, performance, and display limitations. Typically, mobile applications expose less functionality than browser or desktop applications. Each gateway can be optimized to match the capabilities and functionality of the corresponding device.
+
+The backends for frontends pattern can be extended to be platform-specific. For example, separate iOS and Android gateways could be created to expose platform-specific functionality. Or, if you have specific third parties that need to access your services, you could create a gateway for each of them.
+
+## Simple gateways
+
+To start, you could build your own API Gateway service. A quick search of GitHub will provide many examples.
+
+For simple .NET cloud-native applications, you might consider the [Ocelot Gateway](https://github.com/ThreeMammals/Ocelot). Open source and created for .NET microservices, it's lightweight, fast, and scalable. Like any API Gateway, its primary functionality is to forward incoming HTTP requests to downstream services. Additionally, it supports a wide variety of capabilities that are configurable in a .NET middleware pipeline.
+
+[YARP](https://github.com/microsoft/reverse-proxy) is another open-source reverse proxy project, led by a group of Microsoft product teams. Downloadable as a NuGet package, YARP plugs into ASP.NET Core as middleware and is highly customizable. You'll find YARP [well-documented](https://microsoft.github.io/reverse-proxy/articles/getting-started.html) with various usage examples. You'll see some examples of how to use YARP in the next section.
+
+For enterprise cloud-native applications, there are several managed Azure services that can help jump-start your efforts.
+
+## Azure Application Gateway
+
+For simple gateway requirements, you may consider [Azure Application Gateway](/azure/application-gateway/overview). Available as an Azure [PaaS service](https://azure.microsoft.com/overview/what-is-paas/), it includes basic gateway features such as URL routing, SSL termination, and a Web Application Firewall. The service supports [Layer-7 load balancing](https://www.f5.com/glossary/layer-7-load-balancing) capabilities. With Layer 7, you can route requests based on the actual content of an HTTP message, not just low-level TCP network packets.
+
+Throughout this book, we recommend hosting cloud-native systems in [Kubernetes](https://www.infoworld.com/article/3268073/what-is-kubernetes-your-next-application-platform.html). A container orchestrator, Kubernetes automates the deployment, scaling, and operational concerns of containerized workloads. Azure Application Gateway can be configured as an API gateway for an [Azure Kubernetes Service](https://azure.microsoft.com/services/kubernetes-service/) cluster.
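+
+One common way to wire this up is the ingress controller add-on for AKS, described next. As a hedged sketch (the resource names and subnet range below are placeholders, not values from this book), you could enable the add-on with the Azure CLI:
+
+```Bash
+# Enable the Application Gateway Ingress Controller (AGIC) add-on on an existing AKS cluster.
+# This provisions an Application Gateway and connects it to the cluster's ingress configuration.
+az aks enable-addons \
+    --resource-group myResourceGroup \
+    --name myAksCluster \
+    --addons ingress-appgw \
+    --appgw-name myApplicationGateway \
+    --appgw-subnet-cidr "10.225.0.0/16"
+```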
+
+The [Application Gateway Ingress Controller](https://azure.github.io/application-gateway-kubernetes-ingress/) enables Azure Application Gateway to work directly with Azure Kubernetes Service.
+
+![Diagram showing the Application Gateway Ingress Controller.](./media/application-gateway-ingress-controller.png)
+
+**Figure 13-4**. Application Gateway Ingress Controller
+
+Kubernetes includes a built-in feature that supports HTTP (Layer 7) load balancing, called [Ingress](https://kubernetes.io/docs/concepts/services-networking/ingress/). Ingress defines a set of rules for how microservice instances inside AKS can be exposed to the outside world. In the previous image, the ingress controller interprets the ingress rules configured for the cluster and automatically configures the Azure Application Gateway. Based on those rules, the Application Gateway routes traffic to microservices running inside AKS. The ingress controller listens for changes to ingress rules and makes the appropriate changes to the Azure Application Gateway.
+
+## Azure API Management
+
+For moderate to large-scale cloud-native systems, you may consider [Azure API Management](https://azure.microsoft.com/services/api-management/). It's a cloud-based service that not only solves your API Gateway needs, but provides a full-featured developer and administrative experience.
+
+![Diagram showing the Azure API Management service.](./media/azure-api-management.png)
+
+**Figure 13-5**. Azure API Management
+
+To start, API Management exposes a gateway server that allows controlled access to back-end services based upon configurable rules and policies. These services can be in the Azure cloud, your on-premises data center, or other public clouds. API keys and JWT tokens determine who can do what. All traffic is logged for analytical purposes.
+
+For developers, API Management offers a developer portal that provides access to services, documentation, and sample code for invoking them. Developers can use Swagger or OpenAPI to inspect service endpoints and analyze their usage. The service works across the major development platforms: .NET, Java, Go, and more.
+
+The publisher portal includes a management dashboard where administrators expose APIs and manage their behavior. Service access can be granted, service health monitored, and service telemetry gathered. Administrators apply *policies* to each endpoint to affect behavior. [Policies](/azure/api-management/api-management-howto-policies) are prebuilt statements that execute sequentially for each service call. Policies are configured for an inbound call, outbound call, or invoked upon an error. Policies can be applied at different service scopes to enable deterministic ordering when combining policies. The product ships with a large number of prebuilt [policies](/azure/api-management/api-management-policies).
+
+For your cloud-native services, you can use Azure API Management policies to:
+
+- Restrict service access.
+- Enforce authentication.
+- Throttle calls from a single source, if necessary.
+- Enable caching.
+- Block calls from specific IP addresses.
+- Control the flow of the service.
+- Convert requests from SOAP to REST or between different data formats, such as from XML to JSON.
+
+Azure API Management can expose back-end services that are hosted anywhere – in the cloud or your data center. For legacy services that you may expose in your cloud-native systems, it supports both REST and SOAP APIs. Even other Azure services can be exposed through API Management.
You could place a managed API on top of an Azure backing service like [Azure Service Bus](https://azure.microsoft.com/services/service-bus/) or [Azure Logic Apps](https://azure.microsoft.com/services/logic-apps/). Azure API Management doesn't include built-in load-balancing support and should be used in conjunction with a load-balancing service.
+
+Azure API Management is available across [six different tiers](https://azure.microsoft.com/pricing/details/api-management/):
+
+- Consumption
+- Developer
+- Basic
+- Standard
+- Premium
+- Isolated
+
+The Developer tier is meant for non-production workloads and evaluation. The other tiers offer progressively more power, features, and higher service level agreements (SLAs).
+
+The Premium tier provides [Azure Virtual Network](/azure/virtual-network/virtual-networks-overview) and [multi-region support](/azure/api-management/api-management-howto-deploy-multi-region). Except for the Consumption tier, all tiers have a fixed price per hour.
+
+The Azure cloud also offers a [serverless tier](https://azure.microsoft.com/blog/announcing-azure-api-management-for-serverless-architectures/) for Azure API Management. Referred to as the *consumption pricing tier*, the service is a variant of API Management designed around the serverless computing model. Unlike the "pre-allocated" pricing tiers previously shown, the consumption tier provides instant provisioning and pay-per-action pricing.
+
+It enables API Gateway features for the following use cases:
+
+- Microservices implemented using serverless technologies such as [Azure Functions](/azure/azure-functions/functions-overview) and [Azure Logic Apps](https://azure.microsoft.com/services/logic-apps/).
+- Azure backing service resources such as Service Bus queues and topics, Azure Storage, and others.
+- Microservices where traffic has occasional large spikes but remains low most of the time.
+
+The consumption tier uses the same underlying service components as the other API Management tiers, but employs an entirely different architecture based on dynamically allocated resources. It aligns perfectly with the serverless computing model:
+
+- No infrastructure to manage.
+- No idle capacity.
+- High availability.
+- Automatic scaling.
+- Cost is based on actual usage.
+
+The consumption tier is a great choice for cloud-native systems that expose serverless resources as APIs.
+
+## Real-time communication
+
+Real-time, or push, communication is another option for front-end applications that communicate with back-end cloud-native systems over HTTP. Applications such as financial tickers, online education, gaming, and job-progress updates require instantaneous, real-time responses from the back-end. With normal HTTP communication, there's no way for the client to know when new data is available. The client must continually *poll* or send requests to the server. With *real-time* communication, the server can push new data to the client at any time.
+
+Real-time systems are often characterized by high-frequency data flows and large numbers of concurrent client connections. Manually implementing real-time connectivity can quickly become complex, requiring non-trivial infrastructure to ensure scalability and reliable messaging to connected clients. You could find yourself managing an instance of Azure Redis Cache and a set of load balancers configured with sticky sessions for client affinity.
+
+[Azure SignalR Service](https://azure.microsoft.com/services/signalr-service/) is a fully managed Azure service that simplifies real-time communication for your cloud-native applications. Technical implementation details like capacity provisioning, scaling, and persistent connections are abstracted away. They're handled for you with a 99.9% service-level agreement. You focus on application features, not infrastructure plumbing.
+
+Once enabled, a cloud-based HTTP service can push content updates directly to connected clients, including browser, mobile, and desktop applications. Clients are updated without the need to poll the server. Azure SignalR Service abstracts the transport technologies that create real-time connectivity, including WebSockets, Server-Sent Events, and Long Polling. Developers focus on sending messages to all or specific subsets of connected clients.
+
+![Diagram showing how HTTP clients connect to a cloud-native application with Azure SignalR.](./media/azure-signalr-service.png)
+
+**Figure 13-6**. HTTP clients connecting to a cloud-native application with Azure SignalR
+
+Another advantage of Azure SignalR Service comes with implementing serverless cloud-native services. Perhaps your code is executed on demand with Azure Functions triggers. This scenario can be tricky because your code doesn't maintain long connections with clients. Azure SignalR Service can handle this situation since the service already manages connections for you.
+
+Azure SignalR Service closely integrates with other Azure services, such as Azure SQL Database, Service Bus, or Azure Redis Cache, opening up many possibilities for your cloud-native applications.
+
+If you're using the .NET Aspire stack to build your cloud-native app, you have a built-in .NET Aspire integration that helps you call the Azure SignalR Service. This integration takes care of creating the connection and makes it easy for microservices to locate it. You can easily add connections from microservices and use them to send and receive information.
+
+## Drawbacks of the API Gateway pattern
+
+- The most important drawback is that when you implement an API Gateway, you're coupling that tier with the internal microservices. Coupling like this might introduce serious difficulties for your application.
+
+- A microservices API Gateway is an additional possible single point of failure.
+
+- An API Gateway can introduce increased response time due to the additional network call. However, this extra call usually has less impact than having a client interface that's too chatty directly calling the internal microservices.
+
+- If not scaled out properly, the API Gateway can become a bottleneck.
+
+- An API Gateway requires additional development cost and future maintenance if it includes custom logic and data aggregation. Developers must update the API Gateway in order to expose each microservice's endpoints. Moreover, implementation changes in the internal microservices might cause code changes at the API Gateway level. However, if the API Gateway is just applying security, logging, and versioning (as when using Azure API Management), this additional development cost might not apply.
+
+- If the API Gateway is developed by a single team, there can be a development bottleneck. This aspect is another reason why a better approach is to have several fine-grained API Gateways that respond to different client needs.
You could also segregate the API Gateway internally into multiple areas or layers that are owned by the different teams working on the internal microservices. + +>[!div class="step-by-step"] +>[Previous](../testing-distributed-apps/how-aspire-helps.md) +>[Next](reverse-proxies-with-yarp.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/media/api-gateway-pattern.png b/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/media/api-gateway-pattern.png new file mode 100644 index 0000000000000..077f76c09fe59 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/media/api-gateway-pattern.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/media/application-gateway-ingress-controller.png b/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/media/application-gateway-ingress-controller.png new file mode 100644 index 0000000000000..2084fc01a7f7f Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/media/application-gateway-ingress-controller.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/media/azure-api-management.png b/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/media/azure-api-management.png new file mode 100644 index 0000000000000..720ef98b4e72c Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/media/azure-api-management.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/media/azure-signalr-service.png b/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/media/azure-signalr-service.png new file mode 100644 index 0000000000000..865e58cdf9830 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/media/azure-signalr-service.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/media/backend-for-frontend-pattern.png b/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/media/backend-for-frontend-pattern.png new file mode 100644 index 0000000000000..a1700aed7c56a Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/media/backend-for-frontend-pattern.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/media/direct-client-to-service-communication.png b/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/media/direct-client-to-service-communication.png new file mode 100644 index 0000000000000..669e946a4bde6 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/media/direct-client-to-service-communication.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/reverse-proxies-with-yarp.md b/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/reverse-proxies-with-yarp.md new file mode 100644 index 0000000000000..1b64ead81bf13 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/api-gateways/reverse-proxies-with-yarp.md @@ -0,0 +1,131 @@ +--- +title: Reverse proxies with YARP +description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Reverse proxies with YARP +ms.date: 01/13/2021 +--- +# Reverse proxies with YARP + 
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+YARP is a versatile toolkit for building high-performance reverse proxy servers in .NET. Originating from the need within Microsoft to unify various teams' efforts around reverse proxy development, YARP is designed for flexibility and customization, making it suitable for a wide range of deployment scenarios. YARP is now an open-source project, allowing developers to leverage its capabilities for their own projects.
+
+## Key features of YARP
+
+YARP has several key features that make it a powerful tool for building your reverse proxy servers:
+
+- **Customizable routing**: YARP can direct incoming requests to different backend services based on URL paths, headers, or other attributes.
+
+- **Load balancing**: YARP supports various strategies to distribute load evenly across service instances.
+
+- **Seamless integration with ASP.NET Core middleware**: This allows for custom request/response handling.
+
+- **Health checks**: YARP ensures traffic is only sent to healthy service instances.
+
+- **Session affinity**: YARP can maintain user sessions with specific services when needed.
+
+- **Cross-platform support**: YARP works seamlessly across Windows, Linux, and macOS.
+
+- **Protocol support**: YARP embraces gRPC, HTTP/2, and WebSockets for modern communication needs.
+
+- **Performance**: YARP is built for speed and efficiency, ensuring low latency and high throughput.
+
+## Getting started with YARP
+
+1. **Installation**: Add YARP to your .NET project using NuGet:
+
+    ```shell
+    dotnet add package Yarp.ReverseProxy
+    ```
+
+1. **Basic Configuration**: Configure YARP in your `Startup.cs` file:
+
+    ```csharp
+    public void ConfigureServices(IServiceCollection services)
+    {
+        // Register the YARP services and load routes and clusters from
+        // the "ReverseProxy" section of the injected IConfiguration.
+        services.AddReverseProxy()
+            .LoadFromConfig(Configuration.GetSection("ReverseProxy"));
+    }
+
+    public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
+    {
+        app.UseRouting();
+        app.UseEndpoints(endpoints =>
+        {
+            // Map the proxy's routes into the endpoint routing table.
+            endpoints.MapReverseProxy();
+        });
+    }
+    ```
+
+1. **Configuration Files**: Add the YARP settings to `appsettings.json`:
+
+    ```json
+    "ReverseProxy": {
+      "Routes": {
+        "route1": {
+          "ClusterId": "cluster1",
+          "Match": {
+            "Path": "/customer/{**catch-all}"
+          }
+        }
+      },
+      "Clusters": {
+        "cluster1": {
+          "Destinations": {
+            "destination1": {
+              "Address": "https://example.com/api/customers/"
+            }
+          }
+        }
+      }
+    }
+    ```
+
+1. **Programmatic Configuration**: Customize the configuration dynamically with C# code:
+
+    ```csharp
+    builder.Services.AddReverseProxy()
+        .LoadFromMemory(GetRoutes(), GetClusters());
+
+    // ...
+
+    RouteConfig[] GetRoutes()
+    {
+        return
+        [
+            new RouteConfig()
+            {
+                RouteId = "route1",
+                ClusterId = "cluster1",
+                Match = new RouteMatch
+                {
+                    // Path or Hosts are required for each route. This catch-all pattern matches all request paths.
+ Path = "/customer/{**catch-all}" + } + } + ]; + } + + ClusterConfig[] GetClusters() + { + + return + [ + new ClusterConfig() + { + ClusterId = "cluster1", + SessionAffinity = new SessionAffinityConfig { Enabled = true, Policy = "Cookie", AffinityKeyName = ".Yarp.ReverseProxy.Affinity" }, + Destinations = { "destination1", new DestinationConfig() { Address = "https://example.com/api/customers" } + } + } + ]; + } + ``` + +## Additional resources + +- **Getting Started with YARP** \ + +>[!div class="step-by-step"] +>[Previous](gateway-patterns.md) +>[Next](../deploying-distributed-apps/how-deployment-affects-your-architecture.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/architecting-distributed-cloud-native-applications/different-distributed-architectures.md b/docs/architecture/distributed-cloud-native-apps-containers/architecting-distributed-cloud-native-applications/different-distributed-architectures.md new file mode 100644 index 0000000000000..d6442a773abd3 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/architecting-distributed-cloud-native-applications/different-distributed-architectures.md @@ -0,0 +1,75 @@ +--- +title: Different ways to architect distributed applications +description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Different ways to architect distributed applications +author: +ms.date: 04/06/2022 +--- + +# Different ways to architect distributed applications + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +In the ever-evolving landscape of software development, architects and developers are continually seeking efficient ways to design and build distributed applications. Let's explore some of the current architectural approaches. We'll delve into the characteristics, advantages, and challenges of each approach. As we navigate through these different paradigms, we'll discuss crucial aspects such as containerization, data sovereignty, scalability, and communication patterns. + +Understanding these architectural styles will allow you to make informed decisions when designing robust, scalable, and maintainable distributed systems in today's cloud-native environment. + +## Server-based architecture + +![A diagram showing a high level image of server architecture.](media/server-architecture.png) + +This is the traditional model where client machines connect to a server for processing and data storage. The server is responsible for all the processing, managing business logic, and data persistence. In this architecture, state and data management is centralized, typically using a single database. This approach simplifies data consistency but the database can become a bottleneck as the system scales. As all data is stored centrally there's no separation of data ownership among different components or services, which leads to tight coupling and makes it difficult to evolve different parts of the system independently. + +While not originally designed for containers, monolithic server applications can be containerized. This process provides benefits like consistent environments and easier deployment, but it doesn't address the underlying architectural limitations of a monolithic system. Containerization in this context primarily aids in deployment and environment consistency, rather than in scalability or modularity. + +In a server-based architecture, the logical and physical architectures often closely align, with most components residing on a single server or cluster. 
This simplicity can be an advantage for smaller applications but may limit flexibility as the application grows. The centralized nature of this architecture can simplify security because all data is stored in one place, but it also creates a single point of failure and can become a performance bottleneck under high load.
+
+## Modular monoliths
+
+![A diagram showing a modular monolith architecture.](media/modular-monolith.png)
+
+In this architecture, the application is divided into modules. Each module is responsible for a specific feature or functionality. However, unlike microservices, these modules run in the same process and communicate directly with each other. Modular monoliths can be containerized as a single unit, which helps with deployment and scaling of the entire application, but doesn't allow for independent scaling of individual modules. In short, they suffer from the same limitations as traditional monolithic server-based architectures.
+
+While modules are separate, they typically share a common database, which simplifies data management but also creates tight coupling between modules. This shared database approach makes it easier to maintain data consistency across the application but can lead to challenges as the application grows and modules become more complex. It also means modules don't have true data sovereignty, because they don't exclusively own or control their data. This can lead to interdependencies between modules and make it challenging to evolve or replace individual modules without affecting others.
+
+Although not as critical as in microservices, clearly defined boundaries between modules are important for maintainability in modular monoliths. These boundaries help in organizing code, reducing dependencies between modules, and making the system easier to understand and modify. However, because all modules run in the same process, there's still a risk that changes in one module can affect others, and the technology diversity is limited compared to more distributed architectures.
+
+## Service-oriented architecture
+
+![A diagram showing a SOA.](media/service-oriented-architecture.png)
+
+In Service-Oriented Architecture (SOA), services are the main components. Each service encompasses an entire business process or function. For example, a customer order service would include functionality to manage not only orders but also customer information, shipping, and billing. Services communicate with each other to perform their tasks, typically using a messaging broker or a central enterprise bus.
+
+Data management in SOA can be challenging. Traditionally, services often share databases, which can lead to tight coupling. Moving towards data sovereignty, where each service owns its data, improves service independence but introduces new challenges in maintaining data consistency across services. SOA implementations must balance the benefits of data sovereignty with the need for data sharing and consistency across business processes.
+
+Communication is a crucial aspect of SOA. While not as granular as microservices, SOA relies heavily on inter-service communication, often through a centralized bus. This central communication mechanism can provide benefits in terms of standardization and ease of management, but it can also become a performance bottleneck and single point of failure if not designed carefully.
+
+## Microservices
+
+![A diagram showing a microservices-based architecture.](media/microservice-architecture.png)
+
+Microservices architecture breaks down the application into smaller, independent services that perform specific tasks. These services can be developed, deployed, and scaled independently. Microservices are well suited to containerization, which reinforces this independence and aligns well with cloud-native development practices.
+
+In microservices, each service typically adheres to the principle of data sovereignty. Each microservice owns and manages its own data store, which can be a database, cache, or any other form of data storage. This sovereignty ensures that a service has complete control over its data model, storage technology, and data access patterns. It reduces coupling between services, as no service can directly access or modify another service's data. Instead, data exchange happens through well-defined APIs or messaging systems. This approach supports the independence and autonomy of each microservice, allowing teams to make decisions about data storage and management that are optimal for their specific service without affecting others. However, it also introduces challenges in maintaining data consistency across the system. Solutions to these challenges include using event-driven architectures, implementing the Saga pattern for distributed transactions, and designing for eventual consistency where appropriate.
+
+The logical architecture of a microservices system (that is, how services are conceptually organized) can be quite different from its physical architecture (how they're deployed). This separation allows for optimal resource utilization and adaptation to changing infrastructure requirements.
+
+Identifying correct domain-model boundaries for each microservice is crucial to ensure each service has a well-defined responsibility and to reduce inter-service dependencies. This process often involves careful analysis of the business domain and can significantly impact the overall system design.
+
+Communication patterns in microservices architectures are diverse. They can involve direct client-to-microservice communication or use an API Gateway pattern. Direct communication can offer lower latency but increases complexity for clients, while an API Gateway can simplify the client experience but may introduce a potential single point of failure. Additionally, microservices can communicate with each other using both synchronous (HTTP/gRPC) and asynchronous (message queues) methods, each with its own implications on system design and behavior.
+
+### .NET Aspire and microservices
+
+Although a microservices architecture is well suited to large apps with complex scaling needs and continuous development, you've seen that it's not always easy to manage. For example, sometimes it's hard to determine exactly which microservices and other components make up your entire app. Also, because each microservice might be written by a different team, common concerns like observability and resiliency may be implemented inconsistently or not at all.
+
+This is where .NET Aspire comes in: it's a stack that helps you create manageable and consistent microservices, and assemble them into large apps. You can build .NET microservices without .NET Aspire, but if you do, you'll have to work harder to solve common problems. For example, .NET Aspire:
+
+- Makes it clear which microservices and other components are part of your app.
You can see them in the App Host project code, and in the dashboard that appears whenever you debug your app. +- Implements observability for you using OpenTelemetry, so you can easily get data on the performance and behavior of all your microservices without requiring your development teams to build it. +- Makes it easy to use common backing services, such as databases and service buses, by providing out-of-the-box integrations for each one. +- Implements resiliency in these integrations to prevent and recover from failures, without requiring your development teams to build it. + +If you've already started building and deploying microservices apps, then .NET Aspire may not help you, because it's opinionated about how they should be built and that might not match your design. However, if you're new to microservices and cloud-native design, or if you're starting an app from scratch, it can help to make the project easier and reduce development time. + +>[!div class="step-by-step"] +>[Previous](why-choose-distributed-architecture.md) +>[Next](../communication-patterns/communication-patterns.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/architecting-distributed-cloud-native-applications/media/microservice-architecture.png b/docs/architecture/distributed-cloud-native-apps-containers/architecting-distributed-cloud-native-applications/media/microservice-architecture.png new file mode 100644 index 0000000000000..524789cf635cf Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/architecting-distributed-cloud-native-applications/media/microservice-architecture.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/architecting-distributed-cloud-native-applications/media/modular-monolith.png b/docs/architecture/distributed-cloud-native-apps-containers/architecting-distributed-cloud-native-applications/media/modular-monolith.png new file mode 100644 index 0000000000000..7d35e7097d6c0 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/architecting-distributed-cloud-native-applications/media/modular-monolith.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/architecting-distributed-cloud-native-applications/media/server-architecture.png b/docs/architecture/distributed-cloud-native-apps-containers/architecting-distributed-cloud-native-applications/media/server-architecture.png new file mode 100644 index 0000000000000..3036b797092bd Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/architecting-distributed-cloud-native-applications/media/server-architecture.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/architecting-distributed-cloud-native-applications/media/service-oriented-architecture.png b/docs/architecture/distributed-cloud-native-apps-containers/architecting-distributed-cloud-native-applications/media/service-oriented-architecture.png new file mode 100644 index 0000000000000..bf8dc155b072b Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/architecting-distributed-cloud-native-applications/media/service-oriented-architecture.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/architecting-distributed-cloud-native-applications/why-choose-distributed-architecture.md 
b/docs/architecture/distributed-cloud-native-apps-containers/architecting-distributed-cloud-native-applications/why-choose-distributed-architecture.md
new file mode 100644
index 0000000000000..25efdac5932ac
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/architecting-distributed-cloud-native-applications/why-choose-distributed-architecture.md
@@ -0,0 +1,36 @@
+---
+title: Architecting distributed cloud-native applications
+description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Architecting distributed cloud-native applications
+author:
+ms.date: 04/06/2022
+---
+
+# Architecting distributed cloud-native applications
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+Every application has its own unique requirements, but there are some common patterns and best practices that can help you design and build distributed cloud-native applications. In this book, we're focusing on how to architect distributed cloud-native applications by taking a microservices approach. But this approach might not be the best fit for your application. Let's look at why you might choose a distributed architecture.
+
+- **Scalability**: Distributed architectures allow applications to scale out easily, accommodating more users and handling more requests per second. This is particularly important for applications that experience variable load.
+
+- **Resilience**: In a distributed system, if one component fails, the others can continue to operate. This resilience can increase the overall uptime and reliability of your application.
+
+- **Geographical distribution**: For global applications, distributed architectures can reduce latency by locating services closer to users.
+
+- **Isolation of responsibilities**: Each service in a distributed system can be developed, deployed, scaled, and maintained independently, often by different teams. This separation can lead to increased productivity and speed of development.
+
+- **Technology diversity**: Different services in a distributed system can use different languages, databases, and other technologies. You can choose the best-performing tool for each job, and the one that your team has experience with.
+
+- **Efficient resource utilization**: Distributed architectures can make more efficient use of resources by allowing each service to scale independently based on its needs. For example, reporting services can be scaled up at the end of each month when managers build reports, while your core e-commerce services can be scaled up during peak shopping seasons.
+
+- **Ease of deployment and updates**: With a distributed architecture, you can update a single service without having to redeploy your entire application. This independence can make deployments faster and less risky, allowing for more frequent updates.
+
+- **Data partitioning**: In a distributed system, you can partition your data across different services, which can lead to improved performance and scalability. You can also geographically partition your data to comply with data residency requirements.
+
+- **Security**: By isolating different parts of your application into separate services, you can apply specific security measures to each service based on its needs.
+
+In the rest of this book, we'll focus specifically on how to design and build distributed cloud-native applications using a microservices-based architecture.
You'll see how the features built into .NET are designed to help you build and deploy microservices, and how to use containers to package and deploy your services successfully.
+
+>[!div class="step-by-step"]
+>[Previous](../introduction-dot-net-aspire/observability-and-dashboard.md)
+>[Next](different-distributed-architectures.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/container-terminology.md b/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/container-terminology.md
new file mode 100644
index 0000000000000..1a77a41296886
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/container-terminology.md
@@ -0,0 +1,78 @@
+---
+title: Container terminology
+description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Container terminology
+author:
+ms.date: 04/25/2024
+---
+
+# Containers, images, repositories, and registries
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+When using Docker, a developer creates an app or service and packages it and its dependencies into a container image. An image is a static representation of the app or service and its configuration and dependencies.
+
+To run the app or service, the app's image is instantiated to create a container, which runs on the Docker host. Containers are initially tested in a development environment or PC.
+
+Developers should store images in a registry, which acts as a library of images and is needed when deploying to production orchestrators. Docker maintains a public registry via [Docker Hub](https://hub.docker.com/); other vendors provide registries for different collections of images, including [Azure Container Registry](https://azure.microsoft.com/services/container-registry/). Alternatively, enterprises can have a private registry on-premises for their own Docker images.
+
+Figure 2-5 shows how images and registries in Docker relate to other components. It also shows the multiple registry offerings from vendors.
+
+![A diagram showing the basic taxonomy in Docker.](media/5-taxonomy-of-docker-terms-and-concepts.png)
+
+**Figure 2-5**. Taxonomy of Docker terms and concepts
+
+The registry is like a bookshelf where images are stored and available to be pulled for building containers to run services or web apps. There are private Docker registries on-premises and on the public cloud. Docker Hub is a public registry maintained by Docker, and Docker Trusted Registry is an enterprise-grade solution; Azure offers Azure Container Registry. AWS, Google, and others also have container registries.
+
+Putting images in a registry lets you store static and immutable application bits, including all their dependencies at a framework level. Those images can then be versioned and deployed in multiple environments and therefore provide a consistent deployment unit.
+
+Private image registries, either hosted on-premises or in the cloud, are recommended when:
+
+- Your images must not be shared publicly due to confidentiality.
+
+- You want to have minimum network latency between your images and your chosen deployment environment. For example, if your production environment is Azure cloud, you probably want to store your images in [Azure Container Registry](https://azure.microsoft.com/services/container-registry/) so that network latency will be minimal.
In a similar way, if your production environment is on-premises, you might want to have an on-premises Docker Trusted Registry available within the same local network.
+
+## Container terminology
+
+This section lists terms and definitions you should be familiar with before getting deeper into containers. For further definitions, see [the extensive glossary provided by Docker](https://docs.docker.com/glossary/).
+
+**Container image**: A package with all the dependencies and information needed to create a container. An image includes all the dependencies, such as frameworks, plus deployment and execution configuration to be used by a container runtime. Usually, an image derives from multiple base images that are layers stacked on top of each other to form the container's filesystem. An image is immutable once it has been created.
+
+**Dockerfile**: A text file that contains instructions for building a Docker image. It's like a batch script; the first line states the base image to begin with, and then the following instructions install required programs, copy files, and so on, until you get the working environment you need to support your code.
+
+**Build**: The action of building a container image based on the information and context provided by its Dockerfile, plus additional files in the folder where the image is built. You can build images with the following Docker command:
+
+```Bash
+# Build an image from the Dockerfile in the current directory.
+docker build .
+```
+
+**Container**: An instance of a Docker image. A container represents the execution of a single application, process, or service. It consists of the contents of a Docker image, an execution environment, and a standard set of instructions. When scaling a service, you create multiple instances of a container from the same image. Or a batch job can create multiple containers from the same image, passing different parameters to each instance.
+
+**Volumes**: Offer a writable filesystem that the container can use. Since images are read-only but most programs need to write to the filesystem, volumes add a writable layer on top of the container image so that programs have access to a writable filesystem. The program doesn't know it's accessing a layered filesystem; it's just the filesystem as usual. Volumes live in the host system and are managed by Docker.
+
+**Tag**: A mark or label you can apply to images so that different images, or different versions of the same image (depending on the version number or the target environment), can be identified.
+
+**Multi-stage Build**: This feature, available since Docker 17.05, helps to reduce the size of the final images. For example, a large base image containing the SDK can be used for compiling and publishing, and then a small runtime-only base image can be used to host the application.
+
+**Repository (repo)**: A collection of related Docker images, labeled with a tag that indicates the image version. Some repos contain multiple variants of a specific image, such as an image containing SDKs (heavier), an image containing only runtimes (lighter), and so on. Those variants can be marked with tags. A single repo can contain platform variants, such as a Linux image and a Windows image.
+
+**Registry**: A service that provides access to repositories. The default registry for most public images is Docker Hub (owned by Docker as an organization). A registry usually contains repositories from multiple teams. Companies often have private registries to store and manage images they've created. Azure Container Registry is another example.
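+
+The **Build**, **Tag**, and **Registry** concepts come together in a typical publish flow. The following is a brief sketch; the image and registry names are placeholders, not values from this book:
+
+```Bash
+# Build an image from the Dockerfile in the current directory and tag it locally.
+docker build -t catalog-api:1.0 .
+
+# Apply an additional tag that points the image at a private registry (placeholder name).
+docker tag catalog-api:1.0 myregistry.azurecr.io/catalog-api:1.0
+
+# Push the image to the registry so hosts and orchestrators can pull it.
+docker push myregistry.azurecr.io/catalog-api:1.0
+```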
+
+**Multi-arch image**: This feature simplifies the selection of the appropriate image according to the platform where Docker is running. For example, when a Dockerfile requests a base image `mcr.microsoft.com/dotnet/sdk:8.0` from the registry, it actually gets `8.0-nanoserver-ltsc2022`, `8.0-nanoserver-1809`, or `8.0-bullseye-slim`, depending on the operating system and version where Docker is running.
+
+**Docker Hub**: A public registry to upload images and work with them. Docker Hub provides Docker image hosting, public or private registries, build triggers and webhooks, and integration with GitHub and Bitbucket.
+
+**Azure Container Registry**: A managed registry service for working with Docker images and their components in Azure. This provides a registry that's close to your deployments in Azure and that gives you control over access, making it possible to use your Azure Active Directory groups and permissions.
+
+**Docker Trusted Registry (DTR)**: A Docker registry service that can be installed on-premises so it lives within the organization's datacenter and network. It's convenient for private images that should be managed within the enterprise. Docker Trusted Registry is included as part of the Docker Datacenter product.
+
+**Docker Desktop**: Development tools for Windows and macOS for building, running, and testing containers locally. Docker Desktop for Windows provides development environments for both Linux and Windows Containers. The Linux Docker host on Windows is based on a Hyper-V virtual machine. The host for Windows Containers is directly based on Windows. Docker Desktop for Mac is based on the Apple Hypervisor framework and the **xhyve** hypervisor, which provides a Linux Docker host virtual machine on macOS. Docker Desktop for Windows and for Mac replaces Docker Toolbox, which was based on Oracle VirtualBox.
+
+**Compose**: A command-line tool and YAML file format with metadata for defining and running multi-container applications. You define a single application based on multiple images with one or more .yml files that can override values depending on the environment. After you've created the definitions, you can deploy the whole multi-container application with a single command (`docker-compose up`) that creates a container per image on the Docker host.
+
+**Cluster**: A collection of Docker hosts exposed as if it were a single virtual Docker host, so that the application can scale to multiple instances of the services spread across multiple hosts within the cluster. Docker clusters can be created with Kubernetes, Azure Service Fabric, Docker Swarm, and Mesosphere DC/OS.
+
+**Orchestrator**: A tool that simplifies the management of clusters and Docker hosts. Orchestrators enable you to manage images, containers, and hosts through a command-line interface (CLI) or a graphical UI. You can manage container networking, configurations, load balancing, service discovery, high availability, Docker host configuration, and more. An orchestrator is responsible for running, distributing, scaling, and healing workloads across a collection of nodes. Typically, orchestrator products are the same products that provide cluster infrastructure, like Kubernetes and Azure Service Fabric, among other offerings in the market.
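+
+To make the **Compose** workflow above concrete, here's a minimal sketch, assuming a `docker-compose.yml` file that defines the application's services:
+
+```Bash
+# Create and start a container for each image defined in docker-compose.yml.
+docker-compose up -d
+
+# List the containers that make up the running application.
+docker-compose ps
+
+# Stop and remove the application's containers.
+docker-compose down
+```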
+ +>[!div class="step-by-step"] +>[Previous](what-is-docker.md) +>[Next](official-container-images-tooling.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/media/1-multiple-containers-single-host.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/media/1-multiple-containers-single-host.png new file mode 100644 index 0000000000000..e075263988a74 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/media/1-multiple-containers-single-host.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/media/2-multiple-containers-single-host.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/media/2-multiple-containers-single-host.png new file mode 100644 index 0000000000000..fadded84dfe55 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/media/2-multiple-containers-single-host.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/media/3-docker-container-hardware-software.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/media/3-docker-container-hardware-software.png new file mode 100644 index 0000000000000..3513aa70a03f7 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/media/3-docker-container-hardware-software.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/media/3-docker-containers-run-anywhere.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/media/3-docker-containers-run-anywhere.png new file mode 100644 index 0000000000000..5aa41d252114c Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/media/3-docker-containers-run-anywhere.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/media/3-virtual-machine-hardware-software.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/media/3-virtual-machine-hardware-software.png new file mode 100644 index 0000000000000..bf95002370de4 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/media/3-virtual-machine-hardware-software.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/media/5-taxonomy-of-docker-terms-and-concepts.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/media/5-taxonomy-of-docker-terms-and-concepts.png new file mode 100644 index 0000000000000..778cecc415309 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/media/5-taxonomy-of-docker-terms-and-concepts.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/official-container-images-tooling.md 
b/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/official-container-images-tooling.md
new file mode 100644
index 0000000000000..70935d1810fe0
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/official-container-images-tooling.md
@@ -0,0 +1,50 @@
+---
+title: Official .NET container images & SDK tooling
+description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Official .NET container images & SDK tooling
+author:
+ms.date: 04/25/2024
+---
+
+# Official .NET container images & SDK tooling
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+The official .NET Docker images are Docker images created and optimized by Microsoft. They're publicly available on [Microsoft Artifact Registry](https://mcr.microsoft.com/). You can search over the catalog to find all .NET image repositories, for example the [.NET SDK](https://mcr.microsoft.com/product/dotnet/sdk/about) repository.
+
+Each repository can contain multiple images, depending on .NET versions, and depending on the OS and versions (Linux Debian, Linux Alpine, Windows Nano Server, Windows Server Core, and so on). Image repositories provide extensive tagging to help you select not just a specific framework version, but also an OS.
+
+## .NET and Docker image optimizations for development versus production
+
+When building Docker images for developers, Microsoft focused on the following main scenarios:
+
+- Images used to *develop* and build .NET apps.
+- Images used to *run* .NET apps.
+
+Why multiple images? When developing, building, and running containerized applications, you usually have different priorities. By providing different images for these separate tasks, Microsoft helps optimize the separate processes of developing, building, and deploying apps.
+
+### During development and build
+
+During development, what matters is how fast you can iterate on changes and how easily you can debug them. The size of the image isn't as important as the ability to make changes to your code and see them quickly. Some tools and "build-agent containers" use the development .NET image (*mcr.microsoft.com/dotnet/sdk:8.0*) during the development and build process. When building inside a Docker container, the important aspects are the elements that are needed to compile your app. This includes the compiler and any other .NET dependencies.
+
+Why is this type of build image important? You don't deploy this image to production. Instead, it's an image that you use to build the content you place into a production image. This image would be used in your continuous integration (CI) environment or build environment when using Docker multi-stage builds.
+
+### In production
+
+What is important in production is how fast you can deploy and start your containers based on a production .NET image. Therefore, the runtime-only image based on *mcr.microsoft.com/dotnet/aspnet:8.0* is small so that it can travel quickly across the network from your Docker registry to your Docker hosts. The contents are ready to run, enabling the fastest time from starting the container to processing results. In the Docker model, there is no need for compilation from C\# code, as there is when you run `dotnet build` or `dotnet publish` using the build container.
+
+In this optimized image, you put only the binaries and other content needed to run the application.
+
+Although there are multiple versions of the .NET and ASP.NET Core images, they all share one or more layers, including the base layer. Therefore, the amount of disk space needed to store an image is small; it consists only of the delta between your custom image and its base image. The result is that it's quick to pull the image from your registry.
+
+When you explore the .NET image repositories at Microsoft Artifact Registry, you'll find multiple image versions classified or marked with tags. These tags help you decide which image to use, depending on the version you need, like those in the following table:
+
+| Image | Comments |
+|-------|----------|
+| mcr.microsoft.com/dotnet/aspnet:**8.0** | ASP.NET Core, with runtime only and ASP.NET Core optimizations, on Linux and Windows (multi-arch) |
+| mcr.microsoft.com/dotnet/sdk:**8.0** | .NET 8, with SDKs included, on Linux and Windows (multi-arch) |
+
+You can find all the available Docker images in the [dotnet-docker](https://github.com/dotnet/dotnet-docker) repository, and you can also use the latest preview releases through the nightly builds at `mcr.microsoft.com/dotnet/nightly/*`.
+
+>[!div class="step-by-step"]
+>[Previous](container-terminology.md)
+>[Next](../introduction-dot-net-aspire/dot-net-aspire-overview.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/what-are-containers.md b/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/what-are-containers.md
new file mode 100644
index 0000000000000..d7930ba2a9368
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/what-are-containers.md
@@ -0,0 +1,30 @@
+---
+title: What are containers?
+description: Architecture for Distributed Cloud-Native Apps with .NET & Containers | What are containers
+author:
+ms.date: 04/25/2024
+---
+
+# What are containers?
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+Containerization is an approach to software development in which an application or service, its dependencies, and its configuration (abstracted as deployment manifest files) are packaged together as a container image. Each container can be run as a unit and deployed to a container host. Docker is often used to host containers, or you can choose more scalable and flexible container orchestrators like Kubernetes.
+
+Just as shipping containers allow goods to be transported by ship, train, or truck regardless of the cargo inside, software containers act as a standard unit of software deployment that can contain different code and dependencies. Containerizing software this way enables developers and IT professionals to deploy applications across environments with little or no modification.
+
+Containers also isolate applications from each other on a shared OS. Containerized applications run on top of a container host that in turn runs on the OS, usually Linux or Windows. Containers therefore have a significantly smaller footprint than virtual machine (VM) images because they don't include an entirely separate instance of the operating system.
+
+Each container can run a whole web application or a service, as shown in Figure 2-1. In this example, Docker is the container host, and App1, App2, Service 1, and Service 2 are containerized applications or services.
+
+![Diagram showing four containers running in a VM or a server.](media/1-multiple-containers-single-host.png)
+
+**Figure 2-1**. Multiple containers running on a container host
+
+Another benefit of containerization is scalability. You can scale out quickly by creating new containers for short-term tasks. From an application point of view, creating a container is analogous to instantiating a process like a service or a web app. For reliability, when you run multiple instances of the same image across multiple host servers, you typically want each container to run in a different host server or VM, in a different fault domain.
+
+In short, containers offer the benefits of isolation, portability, agility, scalability, and control throughout the whole application lifecycle. The most important benefit is the isolation provided between development (Dev) and operations (Ops) environments.
+
+>[!div class="step-by-step"]
+>[Previous](../introduction-to-cloud-native-development/candidate-apps-for-cloud-native.md)
+>[Next](what-is-docker.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/what-is-docker.md b/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/what-is-docker.md
new file mode 100644
index 0000000000000..3cb62f9e6629e
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/chpt2-introduction-containers-docker/what-is-docker.md
@@ -0,0 +1,106 @@
+---
+title: What is Docker?
+description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | What is Docker
+author:
+ms.date: 04/25/2024
+---
+
+# What is Docker?
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+[Docker](https://www.docker.com) is an [open-source project](https://github.com/docker/docker) for automating the deployment of applications as portable, self-sufficient containers that can run on the cloud or on-premises. Docker is also a [company](https://www.docker.com) that promotes and evolves this technology, working in collaboration with cloud, Linux, and Windows vendors, including Microsoft.
+
+![Diagram showing the places Docker containers can run.](media/3-docker-containers-run-anywhere.png)
+
+**Figure 2-3**. Docker deploys containers at all layers of the hybrid cloud
+
+Docker containers can run anywhere: on-premises in the customer datacenter, in an external service provider, or in the cloud, such as on Azure. Docker containers can run natively on Linux and Windows. However, Windows images can run only on Windows hosts, while Linux images can run on Linux hosts and on Windows hosts (using a Hyper-V Linux VM), where *host* means a server or a VM.
+
+Developers can use development environments on Windows, Linux, or macOS. On the development computer, the developer runs a Docker host where Docker images are deployed, including the app and its dependencies. Developers who work on Linux or on macOS use a Docker host that is Linux based, and they can create images only for Linux containers. Developers working on macOS can edit code or run the Docker CLI from macOS, but as of the time of this writing, containers don't run directly on macOS. Developers who work on Windows can create images for either Linux or Windows Containers.
+
+To host containers in development environments and provide additional developer tools, Docker ships Docker Desktop for [Windows](https://hub.docker.com/editions/community/docker-ce-desktop-windows) or for [macOS](https://hub.docker.com/editions/community/docker-ce-desktop-mac). These products install the necessary VM (the Docker host) to host the containers.
+
+To run [Windows Containers](https://learn.microsoft.com/virtualization/windowscontainers/about/), there are two types of runtimes:
+
+- **Windows Server Containers** provide application isolation through process and namespace isolation technology. A Windows Server Container shares a kernel with the container host and with all containers running on the host.
+
+- **Hyper-V Containers** expand on the isolation provided by Windows Server Containers by running each container in a highly optimized virtual machine. In this configuration, the kernel of the container host isn't shared with the Hyper-V Containers, providing better isolation.
+
+The images for these containers are created the same way and function the same. The difference is in how the container is created from the image; running a Hyper-V container requires an extra parameter.
+
+## Comparing Docker containers with virtual machines
+
+Let's examine VMs and containers in more detail to understand their uses.
+
+### Virtual Machines
+
+![Diagram showing the hardware/software stack of a traditional VM.](media/3-virtual-machine-hardware-software.png)
+
+**Figure 2-4**. The hardware/software stack of a traditional VM
+
+Virtual machines include the application, the required libraries or binaries, and a full guest operating system. Full virtualization requires more resources than containerization.
+
+### Docker Containers
+
+![Diagram showing the hardware/software stack for Docker containers.](media/3-docker-container-hardware-software.png)
+
+**Figure 2-5**. The hardware/software stack for Docker containers
+
+Containers include the application and all its dependencies. However, they share the OS kernel with other containers, running as isolated processes in user space on the host operating system. (The exception is Hyper-V containers, where each container runs inside a special virtual machine.)
+
+### Comparison between VMs and Docker containers
+
+For VMs, there are three base layers in the host server, from the bottom up:
+
+- Infrastructure
+- Host Operating System
+- Hypervisor
+
+On top of all that, each VM has its own OS and all necessary libraries.
+
+For Docker, the host server only has:
+
+- Infrastructure
+- Host Operating System
+- Container engine, which keeps containers isolated but lets them share the base OS services.
+
+Because containers require far fewer resources (for example, they don't need a full OS), they're easy to deploy and they start fast. This allows for higher density: you can run more services on the same hardware unit, thereby reducing costs.
+
+As a side effect of running on the same kernel, you get less isolation than with VMs.
+
+The main goal of an image is that it makes the environment the same across different deployments. This means that you can debug it on your machine and then deploy it to another machine with the same environment guaranteed.
+
+A container image is a way to package an app or service and deploy it in a reliable and reproducible way. You could say that Docker isn't only a technology but also a philosophy and a process.
+
+When using Docker, you won't hear developers say, "It works on my machine, why not in production?" They can simply say, "It runs on Docker", because the packaged Docker application can be executed on any supported Docker environment, and it runs the way it was intended to on all deployment targets, such as Dev, QA, staging, and production.
+
+## A simple analogy
+
+Perhaps a simple analogy can help you grasp the core concept of Docker.
+
+Let's go back in time to the 1950s for a moment. There were no word processors, and photocopiers were used everywhere.
+
+Imagine you're responsible for quickly issuing batches of letters as required, to mail them to customers, using real paper and envelopes, to be delivered physically to each customer's address.
+
+At some point, you realize the letters are just a composition of a large set of paragraphs, which are picked and arranged as needed, according to the purpose of the letter, so you devise a system to issue letters quickly, expecting to get a hefty raise.
+
+The system is simple:
+
+1. You begin with a deck of transparent sheets containing one paragraph each.
+1. To issue a set of letters, you pick the sheets with the paragraphs you need, then you stack and align them so they look and read fine.
+1. Finally, you place the set in the photocopier and press start to produce as many letters as required.
+
+So, simplifying, that's the core idea of Docker.
+
+In Docker, each layer is the resulting set of changes that happen to the filesystem after executing a command, such as installing a program.
+
+So, when you "look" at the filesystem after the layer has been copied, you see all the files that were included in the layer when the program was installed.
+
+You can think of an image as an auxiliary read-only hard disk ready to be installed in a "computer" where the operating system is already installed.
+
+Similarly, you can think of a container as the "computer" with the image hard disk installed. The container, just like a computer, can be powered on or off.
+
+>[!div class="step-by-step"]
+>[Previous](what-are-containers.md)
+>[Next](container-terminology.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/azure-caching.md b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/azure-caching.md
new file mode 100644
index 0000000000000..f6d6b0af3d424
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/azure-caching.md
@@ -0,0 +1,105 @@
+---
+title: Caching in a cloud-native application
+description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Caching in a cloud-native application
+author:
+ms.date: 04/06/2022
+---
+
+# Caching in a cloud-native application
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+The benefits of caching are well understood. The technique works by temporarily copying frequently accessed data from a backend data store to *fast storage* that's located closer to the application. Caching is often implemented where:
+
+- Data remains relatively static.
+- Data access is slow, especially compared to the speed of the cache.
+- Data is subject to high levels of contention.
+
+## Why use a cache?
+
+As discussed in the [Microsoft caching guidance](/azure/architecture/best-practices/caching), caching can increase performance, scalability, and availability for individual microservices and the system as a whole. It reduces the latency and contention of handling large volumes of concurrent requests to a data store. As data volume and the number of users increase, the benefits of caching become greater.
+
+Caching is most effective when a client repeatedly reads data that is immutable or that changes infrequently. Examples include reference information such as product and pricing information, or shared static resources that are costly to construct.
+
+While microservices should be stateless, a distributed cache can support concurrent access to session state data when absolutely required.
+
+Also consider caching to avoid repetitive computations. If an operation transforms data or performs a complicated calculation, cache the result for subsequent requests.
+
+## Caching architecture
+
+Cloud-native applications typically implement a distributed caching architecture. The cache is hosted as a cloud-based backing service, separate from the microservices. Figure 5-15 shows the architecture.
+
+![A diagram showing how a cache is implemented in a cloud-native app.](media/caching-in-a-cloud-native-app.png)
+
+**Figure 5-15**. Caching in a cloud-native app
+
+The previous figure presents a common caching pattern known as the [cache-aside pattern](/azure/architecture/patterns/cache-aside). For an incoming request, you first query the cache (step \#1) for a response. If found, the data is returned immediately. If the data doesn't exist in the cache (known as a [cache miss](https://www.techopedia.com/definition/6308/cache-miss)), it's retrieved from a local database in a downstream service (step \#2). It's then written to the cache for future requests (step \#3), and returned to the caller. Care must be taken to periodically evict cached data so that the system remains timely and consistent.
+
+As a shared cache grows, it might prove beneficial to partition its data across multiple nodes. Doing so can help minimize contention and improve scalability. Many caching services support the ability to add and remove nodes dynamically and rebalance data across partitions. This approach typically involves clustering. Clustering exposes a collection of federated nodes as a seamless, single cache. Internally, however, the data is dispersed across the nodes following a predefined distribution strategy that balances the load evenly.
+
+## Azure Cache for Redis
+
+[Azure Cache for Redis](https://azure.microsoft.com/services/cache/) is a secure data caching and messaging broker service, fully managed by Microsoft. It's a Platform as a Service (PaaS) offering that provides high throughput and low-latency access to data. The service is accessible to any application within or outside of Azure.
+
+The Azure Cache for Redis service manages access to open-source Redis servers hosted across Azure data centers. The service acts as a facade providing management, access control, and security. The service natively supports a rich set of data structures, including strings, hashes, lists, and sets. If your application already uses Redis, it will work as-is with Azure Cache for Redis.
+
+Azure Cache for Redis is more than a simple cache server. It can support a number of scenarios to enhance a microservices architecture:
+
+- An in-memory data store
+- A distributed non-relational database
+- A message broker
+- A configuration or discovery server
+
+For advanced scenarios, a copy of the cached data can be [persisted to disk](/azure/azure-cache-for-redis/cache-how-to-premium-persistence). If a catastrophic event disables both the primary and replica caches, the cache is reconstructed from the most recent snapshot.
+
+Azure Cache for Redis is available across a number of predefined configurations and pricing tiers. The [Premium tier](/azure/azure-cache-for-redis/cache-overview#service-tiers) features many enterprise-level features such as clustering, data persistence, geo-replication, and virtual-network isolation.
+
+## Using Redis caches with .NET Aspire
+
+.NET Aspire includes several built-in integrations that can help you cache data in Redis, whether that service is running in Azure, in a container, or elsewhere. The integrations come in three types:
+
+- **Caching.** This integration stores frequently accessed data in a single instance of Redis.
+- **Distributed Caching.** Use this integration if your Redis cache may consist of multiple servers.
+- **Output Caching.** Use this integration if you want to cache complete HTTP responses, such as a full web page or a response to an API call in HTML format.
+
+In the app host, you add the Redis hosting package. In this case, we'll add a distributed cache:
+
+```dotnetcli
+dotnet add package Aspire.Hosting.Redis
+```
+
+Then, to ease service discovery, create the cache instance in the app host project's _Program.cs_ file and pass it to any microservice that needs to use it:
+
+```csharp
+var builder = DistributedApplication.CreateBuilder(args);
+
+var cache = builder.AddRedis("cache");
+
+// The project type parameter and resource name below are illustrative;
+// reference your own microservice project here.
+builder.AddProject<Projects.MyMicroservice>("mymicroservice")
+       .WithReference(cache);
+```
+
+In the microservice where you want to use the cache, start by installing the integration:
+
+```dotnetcli
+dotnet add package Aspire.StackExchange.Redis.DistributedCaching
+```
+
+Then, add the cache service to the dependency injection container by adding this code to the microservice's _Program.cs_ file:
+
+```csharp
+builder.AddRedisDistributedCache("cache");
+```
+
+Now, whenever you want to add data to the cache or retrieve it, you can get the `IDistributedCache` object and use it as normal:
+
+```csharp
+public class ExampleService(IDistributedCache cache)
+{
+    // Use the cache object to store and retrieve data.
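+
+    // A minimal usage sketch. The "product:{id}" key scheme and these method
+    // names are illustrative assumptions, not part of the integration itself;
+    // SetAsync and GetAsync are standard IDistributedCache methods.
+    public Task CacheProductAsync(string id, byte[] data) =>
+        cache.SetAsync($"product:{id}", data, new DistributedCacheEntryOptions
+        {
+            AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5)
+        });
+
+    public Task<byte[]?> GetProductAsync(string id) =>
+        cache.GetAsync($"product:{id}");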
+}
+```
+
+>[!div class="step-by-step"]
+>[Previous](relational-vs-nosql-data.md)
+>[Next](data-driven-crud-microservice.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/data-driven-crud-microservice.md b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/data-driven-crud-microservice.md
new file mode 100644
index 0000000000000..d79a0e6cf732b
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/data-driven-crud-microservice.md
@@ -0,0 +1,309 @@
+---
+title: Creating a simple data-driven CRUD microservice
+description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Creating a simple data-driven CRUD microservice
+ms.date: 03/04/2024
+---
+
+# Creating a simple data-driven CRUD microservice
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+This section outlines how to create a simple microservice that performs create, read, update, and delete (CRUD) operations on a data source.
+
+## Designing a simple CRUD microservice
+
+From a design point of view, this type of containerized microservice is very simple. Perhaps the problem to solve is simple, or perhaps the implementation is only a proof of concept.
+
+![Diagram showing a simple CRUD microservice internal design pattern.](media/internal-design-simple-crud-microservices.png)
+
+**Figure 6-4**. Internal design for simple CRUD microservices
+
+An example of this kind of simple data-driven service is the catalog microservice from the eShop Reference Architecture sample application. This type of service implements all its functionality in a single ASP.NET Core Web API project that includes classes for its data model, its business logic, and its data access code. It also stores its related data in a SQL Server database (running as another container for dev/test purposes), but the database could also be hosted on any regular SQL Server instance:
+
+![Diagram showing a data-driven/CRUD microservice container.](media/simple-data-driven-crud-microservice.png)
+
+**Figure 6-5**. Simple data-driven/CRUD microservice design
+
+This diagram shows the logical Catalog microservice, which includes its Catalog database; the database may or may not be in the same Docker host. Having the database in the same Docker host might be good for development, but not for production. When you are developing this kind of service, you only need [ASP.NET Core](/aspnet/core/) and a data-access API or ORM like [Entity Framework Core](/ef/core/index).
+
+For a production environment in Azure, it is recommended that you use Azure SQL DB or any other database technology that can provide high availability and high scalability. For example, for a NoSQL approach, you might choose Azure Cosmos DB.
+
+Finally, by editing the Dockerfile and docker-compose.yml metadata files, you can configure how the image of this container will be created: what base image it will use, plus design settings such as internal and external names and TCP ports.
+
+## Implementing a simple CRUD microservice with ASP.NET Core
+
+To implement a simple CRUD microservice using .NET and Visual Studio, you start by creating a simple ASP.NET Core Web API project (running on .NET so it can run on a Linux Docker host):
+
+![Screenshot of Visual Studio showing the setup of the project.](media/create-asp-net-core-web-api-project.png)
+
+**Figure 6-6**. Creating an ASP.NET Core Web API project in Visual Studio 2019
+
+To create an ASP.NET Core Web API Project, first select an ASP.NET Core Web Application and then select the API type. After creating the project, you can implement your MVC controllers as you would in any other Web API project, using the Entity Framework API or any other API. In a new Web API project, you can see that the only dependency you have in that microservice is on ASP.NET Core itself.
+
+![Screenshot of VS showing the NuGet dependencies of Catalog.Api](media/simple-crud-web-api-microservice-dependencies.png)
+
+**Figure 6-7**. Dependencies in a simple CRUD Web API microservice
+
+The API project includes a reference to the Microsoft.AspNetCore.App NuGet package, which includes references to all essential packages. It could include some other packages as well.
+
+### Implementing CRUD Web API services with Entity Framework Core
+
+Entity Framework (EF) Core is a lightweight, extensible, and cross-platform version of the popular Entity Framework data access technology. EF Core is an object-relational mapper (ORM) that enables .NET developers to work with a database using .NET objects.
+
+#### The data model
+
+With EF Core, data access is performed by using a model. A model is made up of (domain model) entity classes and a derived context (DbContext) that represents a session with the database, allowing you to query and save data. You can generate a model from an existing database, manually code a model to match your database, or use the EF migrations technique to create a database from your model, using the code-first approach (which makes it easy to evolve the database as your model changes over time). For the catalog microservice, the last approach has been used. You can see an example of the CatalogItem entity class in the following code example:
+
+```csharp
+public class CatalogItem
+{
+    public int Id { get; set; }
+    public string Name { get; set; }
+    public string Description { get; set; }
+    public decimal Price { get; set; }
+    public string PictureFileName { get; set; }
+    public string PictureUri { get; set; }
+    public int CatalogTypeId { get; set; }
+    public CatalogType CatalogType { get; set; }
+    public int CatalogBrandId { get; set; }
+    public CatalogBrand CatalogBrand { get; set; }
+    public int AvailableStock { get; set; }
+    public int RestockThreshold { get; set; }
+    public int MaxStockThreshold { get; set; }
+
+    public bool OnReorder { get; set; }
+    public CatalogItem() { }
+
+    // Additional code ...
+}
+```
+
+You also need a DbContext that represents a session with the database. For the catalog microservice, the CatalogContext class derives from the DbContext base class, as shown in the following example:
+
+```csharp
+public class CatalogContext : DbContext
+{
+    public CatalogContext(DbContextOptions<CatalogContext> options) : base(options)
+    { }
+    public DbSet<CatalogItem> CatalogItems { get; set; }
+    public DbSet<CatalogBrand> CatalogBrands { get; set; }
+    public DbSet<CatalogType> CatalogTypes { get; set; }
+
+    // Additional code ...
+}
+```
+
+You can have additional `DbContext` implementations. For example, in the sample Catalog.API microservice, there's a second `DbContext` named `CatalogContextSeed` that automatically populates sample data the first time the database is accessed. This method is useful for demo data and for automated testing scenarios, as well.
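+
+A minimal sketch of what such seeding logic might look like follows. (This is a simplified, hypothetical version for illustration only; the real `CatalogContextSeed` in the sample is structured differently and is more elaborate.)
+
+```csharp
+public static class CatalogContextSeed
+{
+    // Simplified seeding sketch: insert demo data only when the table is empty.
+    public static async Task SeedAsync(CatalogContext context)
+    {
+        if (!context.CatalogItems.Any())
+        {
+            context.CatalogItems.Add(new CatalogItem
+            {
+                Name = "Roslyn T-Shirt",
+                Description = "Sample catalog item",
+                Price = 12,
+                CatalogTypeId = 2,
+                CatalogBrandId = 2
+            });
+            await context.SaveChangesAsync();
+        }
+    }
+}
+```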
+
+Within the `DbContext`, you use the `OnModelCreating` method to customize object/database entity mappings and other [EF extensibility points](https://devblogs.microsoft.com/dotnet/implementing-seeding-custom-conventions-and-interceptors-in-ef-core-1-0/).
+
+##### Querying data from Web API controllers
+
+Instances of your entity classes are typically retrieved from the database using [Language-Integrated Query (LINQ)](/dotnet/csharp/linq/).
+
+##### Saving data
+
+Data is created, deleted, and modified in the database using instances of your entity classes. You could add code like the following hard-coded example (mock data, in this case) to your Web API controllers.
+
+```csharp
+var catalogItem = new CatalogItem() { CatalogTypeId = 2, CatalogBrandId = 2,
+    Name = "Roslyn T-Shirt", Price = 12 };
+_context.CatalogItems.Add(catalogItem);
+_context.SaveChanges();
+```
+
+##### Dependency Injection in ASP.NET Core and Web API controllers
+
+In ASP.NET Core, you can use Dependency Injection (DI) out of the box. You do not need to set up a third-party Inversion of Control (IoC) container.
+
+In the `CatalogController` class mentioned earlier, the `CatalogContext` type (which inherits from `DbContext`) is injected along with the other required objects in the `CatalogController()` constructor.
+
+An important configuration to set up in the Web API project is the DbContext class registration into the service's IoC container. You typically do so in the _Program.cs_ file by calling the `builder.Services.AddDbContext<CatalogContext>()` method, as shown in the following **simplified** example:
+
+```csharp
+// Additional code...
+
+builder.Services.AddDbContext<CatalogContext>(options =>
+{
+    options.UseSqlServer(builder.Configuration["ConnectionString"],
+        sqlServerOptionsAction: sqlOptions =>
+        {
+            sqlOptions.MigrationsAssembly(
+                typeof(Program).GetTypeInfo().Assembly.GetName().Name);
+
+            // Configuring connection resiliency:
+            sqlOptions.EnableRetryOnFailure(maxRetryCount: 5,
+                maxRetryDelay: TimeSpan.FromSeconds(30),
+                errorNumbersToAdd: null);
+        });
+
+    // Change the default behavior so that client evaluation throws.
+    // The default in EF Core is to log a warning when client evaluation occurs.
+    options.ConfigureWarnings(warnings => warnings.Throw(
+        RelationalEventId.QueryClientEvaluationWarning));
+});
+```
+
+## The DB connection string and environment variables used by Docker containers
+
+You can use the ASP.NET Core settings and add a ConnectionString property to your settings.json file as shown in the following example:
+
+```json
+{
+    "ConnectionString": "Server=tcp:127.0.0.1,5433;Initial Catalog=Microsoft.eShop.Services.CatalogDb;User Id=sa;Password=[PLACEHOLDER]",
+    "ExternalCatalogBaseUrl": "http://host.docker.internal:5101",
+    "Logging": {
+        "IncludeScopes": false,
+        "LogLevel": {
+            "Default": "Debug",
+            "System": "Information",
+            "Microsoft": "Information"
+        }
+    }
+}
+```
+
+The settings.json file can have default values for the ConnectionString property or for any other property. However, those properties will be overridden by the values of environment variables that you specify in the docker-compose.override.yml file when using Docker.
+
+From your docker-compose.yml or docker-compose.override.yml files, you can initialize those environment variables so that Docker will set them up as OS environment variables for you, as shown in the following docker-compose.override.yml file (the connection string and other lines wrap in this example, but it would not wrap in your own file).
+
+```yml
+# docker-compose.override.yml
+
+#
+catalog-api:
+  environment:
+    - ConnectionString=Server=sqldata;Database=Microsoft.eShop.Services.CatalogDb;User Id=sa;Password=[PLACEHOLDER]
+    # Additional environment variables for this service
+  ports:
+    - "5101:80"
+```
+
+Finally, you can get that value from your code by using `builder.Configuration["ConnectionString"]`, as shown in an earlier code example.
+
+However, for production environments, you might want to explore additional ways to store secrets such as connection strings. An excellent way to manage application secrets is using [Azure Key Vault](https://azure.microsoft.com/services/key-vault/).
+
+### Implementing versioning in ASP.NET Web APIs
+
+As business requirements change, new collections of resources may be added, the relationships between resources might change, and the structure of the data in resources might be amended. Updating a Web API to handle new requirements is a relatively straightforward process, but you must consider the effects that such changes will have on client applications consuming the Web API.
+
+Versioning enables a Web API to indicate the features and resources that it exposes. A client application can then submit requests to a specific version of a feature or resource. There are several approaches to implement versioning:
+
+- URI versioning
+- Query string versioning
+- Header versioning
+
+With URI versioning, as in the eShop sample application, each time you modify the Web API or change the schema of resources, you add a version number to the URI for each resource. Existing URIs should continue to operate as before, returning resources that conform to the schema that matches the requested version.
+
+As shown in the following code example, the version can be set by using the Route attribute in the Web API controller, which makes the version explicit in the URI (v1 in this case).
+
+```csharp
+[Route("api/v1/[controller]")]
+public class CatalogController : ControllerBase
+{
+    // Implementation ...
+```
+
+For more sophisticated versioning, and the best method when using REST, see [HATEOAS (Hypertext as the Engine of Application State)](/azure/architecture/best-practices/api-design#use-hateoas-to-enable-navigation-to-related-resources).
+
+## Generating Swagger description metadata from your ASP.NET Core Web API
+
+[Swagger](https://swagger.io/) is a commonly used open source framework backed by a large ecosystem of tools that helps you design, build, document, and consume your RESTful APIs. You should include Swagger description metadata with any kind of microservice.
+
+The heart of Swagger is the Swagger specification, which is API description metadata in a JSON or YAML file. The specification creates the RESTful contract for your API, detailing all its resources and operations in both a human- and machine-readable format for easy development, discovery, and integration.
+
+For more information, including a web editor and examples of Swagger specifications from companies like Spotify, Uber, Slack, and Microsoft, see the Swagger site (<https://swagger.io/>).
+
+### Why use Swagger?
+
+The main reasons to generate Swagger metadata for your APIs are the following.
+
+**Ability for other products to automatically consume and integrate your APIs**.
+
+Dozens of products and [commercial tools](https://swagger.io/commercial-tools/) and many [libraries and frameworks](https://swagger.io/open-source-integrations/) support Swagger. Microsoft has high-level products and tools that can automatically consume Swagger-based APIs, such as the following:
+
+- [AutoRest](https://github.com/Azure/AutoRest). You can automatically generate .NET client classes for calling a Swagger-described API. This tool can be used from the CLI and it also integrates with Visual Studio for easy use through the GUI.
+
+- [Microsoft PowerApps](https://powerapps.microsoft.com/). You can automatically consume your API from [PowerApps mobile apps](https://powerapps.microsoft.com/blog/register-and-use-custom-apis-in-powerapps/) built with [PowerApps Studio](https://powerapps.microsoft.com/build-powerapps/), with no programming skills required.
+
+**Ability to automatically generate API documentation**.
+
+There are several options to automate Swagger metadata generation for ASP.NET Core REST API applications, in the form of functional API help pages, based on *swagger-ui*.
+
+Probably the best known is [Swashbuckle](https://github.com/domaindrivendev/Swashbuckle.AspNetCore), which we'll cover in some detail in this guide.
+
+### How to automate API Swagger metadata generation with the Swashbuckle NuGet package
+
+Generating Swagger metadata manually (in a JSON or YAML file) can be tedious work. However, you can automate API discovery of ASP.NET Web API services by using the [Swashbuckle NuGet package](https://aka.ms/swashbuckledotnetcore) to dynamically generate Swagger API metadata.
+
+Swashbuckle automatically generates Swagger metadata for your ASP.NET Web API projects.
+
+Swashbuckle combines API Explorer and Swagger or [swagger-ui](https://github.com/swagger-api/swagger-ui) to provide a rich discovery and documentation experience for your API consumers.
+
+This means you can complement your API with a nice discovery UI to help developers use your API:
+
+![Screenshot of Swagger API Explorer displaying eShopOnContainers API.](media/swagger-metadata-eshoponcontainers-catalog-microservice.png)
+
+**Figure 6-8**. Swashbuckle API Explorer based on Swagger metadata—eShopOnContainers catalog microservice
+
+The Swagger UI documentation that Swashbuckle generates includes all published actions. The API explorer is not the only benefit, though. You can also use tools like [swagger-codegen](https://github.com/swagger-api/swagger-codegen), which allow automatic code generation of API client libraries, server stubs, and documentation.
+
+After you have installed the NuGet packages under the high-level metapackage [Swashbuckle.AspNetCore](https://www.nuget.org/packages/Swashbuckle.AspNetCore) in your Web API project, you configure Swagger in the _Program.cs_ file, as in the following **simplified** code:
+
+```csharp
+
+// Add framework services.
+
+builder.Services.AddSwaggerGen(options =>
+{
+    options.DescribeAllEnumsAsStrings();
+    options.SwaggerDoc("v1", new OpenApiInfo
+    {
+        Title = "eShopOnContainers - Catalog HTTP API",
+        Version = "v1",
+        Description = "The Catalog Microservice HTTP API. This is a Data-Driven/CRUD microservice sample"
+    });
+});
+
+// Other startup code...
+
+app.UseSwagger()
+   .UseSwaggerUI(c =>
+   {
+       c.SwaggerEndpoint("/swagger/v1/swagger.json", "My API V1");
+   });
+```
+
+Once this is done, you can start your application and browse the following Swagger JSON and UI endpoints using URLs like these:
+
+```console
+http://<your-root-url>/swagger/v1/swagger.json
+
+http://<your-root-url>/swagger/
+```
+
+You previously saw the generated UI created by Swashbuckle for a URL like `http://<your-root-url>/swagger`. In Figure 6-9, you can also see how to test any API method.
+
+![Screenshot of Swagger UI showing available testing tools.](media/swashbuckle-ui-testing.png)
+
+**Figure 6-9**. Swashbuckle UI testing the Catalog/Items API method
+
+The Swagger UI API detail shows a sample of the response and can be used to execute the real API, which is great for developer discovery.
+
+### Additional resources
+
+- **ASP.NET Web API Help Pages using Swagger** \
+  [https://learn.microsoft.com/aspnet/core/tutorials/web-api-help-pages-using-swagger](/aspnet/core/tutorials/web-api-help-pages-using-swagger)
+
+- **Get started with Swashbuckle and ASP.NET Core** \
+  [https://learn.microsoft.com/aspnet/core/tutorials/getting-started-with-swashbuckle](/aspnet/core/tutorials/getting-started-with-swashbuckle)
+
+- **Get started with NSwag and ASP.NET Core** \
+  [https://learn.microsoft.com/aspnet/core/tutorials/getting-started-with-nswag](/aspnet/core/tutorials/getting-started-with-nswag)
+
+> [!div class="step-by-step"]
+> [Previous](azure-caching.md)
+> [Next](../cloud-native-resiliency/cloud-native-resiliency.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/distributed-data.md b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/distributed-data.md
new file mode 100644
index 0000000000000..153b4492aaea6
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/distributed-data.md
@@ -0,0 +1,121 @@
+---
+title: Data patterns for distributed applications
+description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Data patterns for distributed applications
+author:
+ms.date: 04/06/2022
+---
+
+# Data patterns for distributed applications
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+A cloud-native approach changes the way you design, deploy, and manage applications. It also changes the way you manage and store data.
+
+Figure 5-1 contrasts the two approaches.
+
+![A diagram showing data storage in cloud-native applications.](media/distributed-data.png)
+
+**Figure 5-1**. Data management in cloud-native applications
+
+The left side of Figure 5-1 shows a *monolithic application*, in which business service components are collocated in a shared services tier and share data from a single relational database.
+
+Designing for cloud-native, we take a different approach. On the right side of Figure 5-1, note how business functionality segregates into small, independent [microservices](/azure/architecture/guide/architecture-styles/microservices). Each microservice encapsulates a specific business capability and its own data. This is a *database per microservice* design.
+
+## Why use a database per microservice?
+
+This database-per-microservice design provides many benefits, especially for systems that must evolve rapidly and support massive scale. With this model:
+
+- Domain data is encapsulated within the service.
+- Data schema can evolve without directly impacting other services.
+- Each data store can independently scale.
+- A data store failure in one service won't directly impact other services.
+
+Segregating data also enables each microservice to implement the data store type that is best optimized for its workload, storage needs, and read/write patterns. Choices include relational, document, key-value, and even graph-based data stores.
+
+Figure 5-2 presents the principle of polyglot persistence in a cloud-native system.
+
+![A diagram showing polyglot data persistence.](media/polyglot-data-persistence.png)
+
+**Figure 5-2**. Polyglot data persistence
+
+Note that in this figure, each microservice supports a different type of data store:
+
+- The product catalog microservice consumes a relational database to accommodate the rich relational structure of its underlying data.
+- The shopping cart microservice consumes a distributed cache that supports its simple, key-value data store.
+- The ordering microservice consumes both a NoSQL document database for write operations and a highly denormalized key/value store to accommodate high volumes of read operations.
+
+## Cross-service queries
+
+While microservices are independent and focus on specific functional capabilities, like inventory, shipping, or ordering, they frequently require integration with other microservices. Often the integration involves one microservice *querying* another for data. Figure 5-3 shows the scenario.
+
+![A diagram showing querying across microservices.](media/cross-service-query.png)
+
+**Figure 5-3**. Querying across microservices
+
+The figure shows a shopping basket microservice that adds an item to a user's shopping basket. While the data store for this microservice contains basket and line item data, it doesn't maintain product or pricing data. Instead, those data items are owned by the catalog and pricing microservices. This arrangement presents a problem. How can the shopping basket microservice add a product to the user's shopping basket when it has neither product nor pricing data in its database?
+
+To deal with this, we can use the [Materialized View pattern](/azure/architecture/patterns/materialized-view), shown in Figure 5-4.
+
+![A diagram showing materialized view pattern.](media/materialized-view-pattern.png)
+
+**Figure 5-4**. Materialized View pattern
+
+With this pattern, you place a local data table, known as a *read model*, in the shopping basket service. This table contains a denormalized copy of the data needed from the product and pricing microservices. Copying the data directly into the shopping basket microservice eliminates the need for expensive cross-service calls. With the data local to the service, you improve the service's response time and reliability. Additionally, having its own copy of the data makes the shopping basket service more resilient. If the catalog service becomes unavailable, it doesn't directly impact the shopping basket service.
+
+## Distributed transactions
+
+In cloud-native applications, you must manage distributed transactions programmatically. You move from a world of *immediate consistency* to that of *eventual consistency*.
+
+Figure 5-5 shows the problem.
+
+![A diagram showing transaction in saga pattern.](media/saga-transaction-operation.png)
+
+**Figure 5-5**. Implementing a transaction across microservices
+
+In the preceding figure, five independent microservices participate in a distributed transaction that creates an order. Each microservice maintains its own data store and implements a local transaction for its store. To create the order, the local transaction for *each* individual microservice must succeed, or *all* must abort and roll back the operation. While built-in transactional support is available inside each of the microservices, there's no support for a distributed transaction that would span all five services to keep data consistent.
+
+A popular pattern for adding distributed transactional support is the [Saga pattern](/azure/architecture/reference-architectures/saga/saga):
+
+![A diagram showing rollback in the saga pattern.](media/saga-rollback-operation.png)
+
+**Figure 5-6**. Rolling back a transaction
+
+In this figure, the *Update Inventory* operation has failed in the Inventory microservice. The Saga pattern invokes a set of compensating transactions (in red) to adjust the inventory counts, cancel the payment and the order, and return the data for each microservice back to a consistent state.
+
+## High volume data
+
+Large cloud-native applications often support high-volume data requirements. In these scenarios, traditional data storage techniques can cause bottlenecks. For complex systems that deploy on a large scale, Event Sourcing may improve application performance.
+
+### Event Sourcing
+
+One approach to optimizing high-volume data scenarios is [Event Sourcing](/azure/architecture/patterns/event-sourcing).
+
+A system typically stores the current state of a data entity. In high volume systems, however, overhead from transactional locking and frequent update operations can impact database performance and responsiveness, and limit scalability.
+
+Event Sourcing takes a different approach to capturing data. Each operation that affects data is persisted to an event store. Instead of updating the state of a data record, we append each change to a sequential list of past events, similar to an accountant's ledger. The event store becomes the system of record for the data. It's used to propagate various materialized views within the bounded context of a microservice. Figure 5-8 shows the pattern.
+
+![A diagram showing event sourcing.](media/event-sourcing.png)
+
+**Figure 5-8**. Event Sourcing
+
+In the previous figure, note how each entry (in blue) for a user's shopping cart is appended to an underlying event store. In the adjoining materialized view, the system projects the current state by replaying all the events associated with each shopping cart. This view, or read model, is then exposed back to the UI. For this pattern, consider a data store that directly supports event sourcing. Azure Cosmos DB, MongoDB, Cassandra, CouchDB, and RavenDB are good candidates. Additionally, while event sourcing can provide increased performance and scalability, it comes at the expense of complexity and a learning curve.
+
+### Data sharding
+
+You use [data sharding](/azure/architecture/patterns/sharding) to partition your data horizontally. Each shard has the same schema and holds its own distinct subset of data. A shard is a data store in its own right and it can contain the data for many entities of different types. The benefits of sharding include:
+
+- You can scale the system out by adding further shards running on additional storage nodes.
+- You can reduce contention and improve performance by balancing the workload across shards.
+- Shards can be located physically closer to the users that need access to the data.
+
+The main challenge with sharding is deciding how to partition your data across the shards. You need to ensure that data is evenly distributed, and that the system can handle the failure of a shard without losing data or compromising the system's availability. Three strategies are commonly used to help with these challenges:
+
+- **Lookup strategy**: a lookup service maps each entity to the shard that contains it.
+- **Range strategy**: entities are distributed across shards based on a range of values. +- **Hash strategy**: a hash function is used to map entities to shards. + +>[!div class="step-by-step"] +>[Previous](../event-based-communication-patterns/subscribe-events.md) +>[Next](relational-vs-nosql-data.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/azure-managed-databases.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/azure-managed-databases.png new file mode 100644 index 0000000000000..5bae3e9aeed95 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/azure-managed-databases.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/caching-in-a-cloud-native-app.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/caching-in-a-cloud-native-app.png new file mode 100644 index 0000000000000..03f35443629de Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/caching-in-a-cloud-native-app.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/cap-theorem.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/cap-theorem.png new file mode 100644 index 0000000000000..f7ce9fe2f5183 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/cap-theorem.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/cosmos-consistency-level-graph.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/cosmos-consistency-level-graph.png new file mode 100644 index 0000000000000..be5abf59a3f99 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/cosmos-consistency-level-graph.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/cosmos-db-overview.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/cosmos-db-overview.png new file mode 100644 index 0000000000000..bc74f642ade80 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/cosmos-db-overview.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/cosmos-db-partitioning.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/cosmos-db-partitioning.png new file mode 100644 index 0000000000000..25f2f09d79869 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/cosmos-db-partitioning.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/cqrs-implementation.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/cqrs-implementation.png new file mode 100644 index 0000000000000..558d4a5173470 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/cqrs-implementation.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/create-asp-net-core-web-api-project.png 
b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/create-asp-net-core-web-api-project.png new file mode 100644 index 0000000000000..6e24a5bb48e5d Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/create-asp-net-core-web-api-project.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/cross-service-query.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/cross-service-query.png new file mode 100644 index 0000000000000..194b189a3ca09 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/cross-service-query.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/distributed-data.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/distributed-data.png new file mode 100644 index 0000000000000..26a0d6289d12c Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/distributed-data.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/event-sourcing.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/event-sourcing.png new file mode 100644 index 0000000000000..c4fbb8e62505f Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/event-sourcing.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/internal-design-simple-crud-microservices.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/internal-design-simple-crud-microservices.png new file mode 100644 index 0000000000000..63a45002f589d Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/internal-design-simple-crud-microservices.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/materialized-view-pattern.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/materialized-view-pattern.png new file mode 100644 index 0000000000000..3aba7b6dd2d6e Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/materialized-view-pattern.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/polyglot-data-persistence.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/polyglot-data-persistence.png new file mode 100644 index 0000000000000..cdb2fb43b7f6f Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/polyglot-data-persistence.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/saga-rollback-operation.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/saga-rollback-operation.png new file mode 100644 index 0000000000000..16030b1c85c3a Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/saga-rollback-operation.png differ diff --git 
a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/saga-transaction-operation.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/saga-transaction-operation.png new file mode 100644 index 0000000000000..fa4a5b7ac1f3d Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/saga-transaction-operation.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/simple-crud-web-api-microservice-dependencies.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/simple-crud-web-api-microservice-dependencies.png new file mode 100644 index 0000000000000..8a36f190ad805 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/simple-crud-web-api-microservice-dependencies.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/simple-data-driven-crud-microservice.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/simple-data-driven-crud-microservice.png new file mode 100644 index 0000000000000..bd6beef0de0cd Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/simple-data-driven-crud-microservice.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/swagger-metadata-eshoponcontainers-catalog-microservice.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/swagger-metadata-eshoponcontainers-catalog-microservice.png new file mode 100644 index 0000000000000..c0a33b454122b Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/swagger-metadata-eshoponcontainers-catalog-microservice.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/swashbuckle-ui-testing.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/swashbuckle-ui-testing.png new file mode 100644 index 0000000000000..b41c226aac1d5 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/swashbuckle-ui-testing.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/types-of-nosql-datastores.png b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/types-of-nosql-datastores.png new file mode 100644 index 0000000000000..8773b1df21838 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/media/types-of-nosql-datastores.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/relational-vs-nosql-data.md b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/relational-vs-nosql-data.md new file mode 100644 index 0000000000000..43674db9aa75b --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/chpt8-data-patterns/relational-vs-nosql-data.md @@ -0,0 +1,239 @@ +--- +title: Relational versus NoSQL data +description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Relational versus NoSQL data +author: +ms.date: 04/06/2022 +--- + +# Relational versus NoSQL data + +[!INCLUDE 
[download-alert](../includes/download-alert.md)]
+
+Relational and NoSQL are two types of database systems commonly implemented in cloud-native apps. They're built differently, store data differently, and are accessed differently. In this section, we'll look at both and compare them.
+
+*Relational databases* have been a prevalent technology for decades. Relational databases provide tables that have a fixed schema, use SQL (Structured Query Language) to manage and query data, and support ACID guarantees.
+
+*NoSQL databases* refer to high-performance, non-relational data stores. They excel in their ease-of-use, scalability, resilience, and availability characteristics. [NoSQL](https://www.geeksforgeeks.org/introduction-to-nosql/) databases typically don't provide ACID guarantees beyond the scope of a single database partition. High-volume services that require sub-second response time favor NoSQL datastores.
+
+NoSQL databases include several different models for accessing and managing data, each suited to specific use cases:
+
+![A diagram showing NoSQL data models.](media/types-of-nosql-datastores.png)
+
+**Figure 5-9**. Data models for NoSQL databases
+
+| Model | Characteristics |
+| :-------- | :-------- |
+| Document Store | Data and metadata are stored hierarchically in JSON-based documents inside the database. |
+| Key-Value Store | The simplest of the NoSQL databases, data is represented as a collection of key-value pairs. |
+| Wide-Column Store | Related data is stored as a set of nested key-value pairs within a single column. |
+| Graph Store | Data is stored in a graph structure as node, edge, and data properties. |
+
+## The CAP theorem
+
+As a way to understand the differences between these types of databases, consider the CAP theorem, a set of principles applied to distributed systems that store state:
+
+![A diagram showing CAP theorem.](media/cap-theorem.png)
+
+**Figure 5-10**. The CAP theorem
+
+The theorem states that a distributed data system can only guarantee *two* of the following three properties:
+
+- **Consistency.** Every node in the cluster responds with the most recent data, even if the system must block the request until all replicas update. If you query a "consistent system" for an item that is currently updating, you'll wait for that response until all replicas successfully update. However, you'll receive the most current data.
+
+- **Availability.** Every node returns an immediate response, even if that response isn't the most recent data. If you query an "available system" for an item that is updating, you'll get the best possible answer the service can provide at that moment.
+
+- **Partition Tolerance.** Guarantees the system continues to operate even if a replicated data node fails or loses connectivity with other replicated data nodes.
+
+Relational databases typically provide consistency and availability, but not partition tolerance.
+
+Many relational database systems support built-in replication features where copies of the primary database can be made to other secondary server instances. Data can also be horizontally partitioned across multiple nodes, such as with [sharding](/azure/sql-database/sql-database-elastic-scale-introduction). But sharding can be costly and time-consuming to manage. Replication consistency and recovery point objectives can be tuned by configuring whether replication occurs synchronously or asynchronously.
+
+NoSQL databases typically support partition tolerance and high availability at a reduced cost.
If data replicas were to lose connectivity in a "highly available" NoSQL database cluster, you could still complete a write operation to the database.
+
+> A new type of database, called NewSQL, has emerged that extends the relational database engine to support both horizontal scalability and the scalable performance of NoSQL systems.
+
+## Considerations for relational versus NoSQL systems
+
+Based upon specific data requirements, a cloud-native microservice can implement a relational datastore, a NoSQL datastore, or both.
+
+| Consider a NoSQL datastore when: | Consider a relational database when: |
+| :-------- | :-------- |
+| You have high-volume workloads that require predictable latency at large scale (for example, latency measured in milliseconds while performing millions of transactions per second) | Your workload volume generally fits within thousands of transactions per second |
+| Your data is dynamic and frequently changes | Your data is highly structured and requires referential integrity |
+| Relationships can be expressed through de-normalized data models | Relationships are expressed through table joins on normalized data models |
+| Data retrieval is simple and expressed without table joins | You work with complex queries and reports |
+| Data is typically replicated across geographies and requires finer control over consistency, availability, and performance | Data is typically centralized, or can be replicated across regions asynchronously |
+| Your application will be deployed to commodity hardware, such as with public clouds | Your application will be deployed to large, high-end hardware |
+
+In the next sections, we'll explore the options available in the Azure cloud for storing and managing your cloud-native data.
+
+## Database as a Service
+
+Cloud-native applications favor data services exposed as a Database as a Service (DBaaS). Fully managed by a cloud vendor, these services provide built-in security, scalability, and monitoring. Instead of owning the service, you simply consume it as a backing service. The provider operates the resource at scale and bears the responsibility for performance and maintenance.
+
+They can be configured across cloud availability zones and regions to achieve high availability. They all support just-in-time capacity and a pay-as-you-go model. Azure features different kinds of managed data service options, each with specific benefits.
+
+## Azure relational databases
+
+For cloud-native microservices that require relational data, Azure offers four managed database-as-a-service (DBaaS) offerings, shown in Figure 5-11.
+
+![A diagram showing managed relational databases in Azure.](media/azure-managed-databases.png)
+
+**Figure 5-11**. Managed relational databases available in Azure
+
+The features shown in the figure are especially important to organizations that provision large numbers of databases, but have limited resources to administer them. You can provision an Azure database in minutes by selecting the number of processing cores, the amount of memory, and the underlying storage. You can scale the database on-the-fly and dynamically adjust resources with little to no downtime.
+
+## Azure SQL Database
+
+Development teams with expertise in Microsoft SQL Server should consider [Azure SQL Database](/azure/sql-database/). It's a fully managed relational DBaaS based on the Microsoft SQL Server database engine. The service shares many features found in the on-premises version of SQL Server and runs the latest stable version of the SQL Server database engine.
+
+For use with a cloud-native microservice, Azure SQL Database is available with three deployment options:
+
+- A Single Database represents a fully managed SQL Database running on an [Azure SQL Database server](/azure/sql-database/sql-database-servers) in the Azure cloud.
+
+- A [Managed Instance](/azure/sql-database/sql-database-managed-instance) is a fully managed instance of the Microsoft SQL Server database engine that provides near-100% compatibility with an on-premises SQL Server.
+
+- [Azure SQL Database serverless](/azure/sql-database/sql-database-serverless) is a compute tier for a single database that automatically scales based on workload demand. It bills only for the amount of compute used per second. The service is well suited for workloads with intermittent, unpredictable usage patterns, interspersed with periods of inactivity. The serverless compute tier also automatically pauses databases during inactive periods so that only storage charges are billed, and resumes when activity returns.
+
+## Open-source databases in Azure
+
+Open-source relational databases have become a popular choice for cloud-native applications. Open-source databases can be deployed across multiple cloud providers, helping minimize the concern of "vendor lock-in."
+
+You can easily self-host any open-source database on an Azure VM. But this means that you take care of management, monitoring, and maintenance of the database and VM. As an alternative, Microsoft provides *fully managed* DBaaS services:
+
+- [Azure Database for MySQL](https://azure.microsoft.com/services/mysql/) is a managed relational database service based on the open-source MySQL Server engine. It uses the MySQL Community edition. The Azure MySQL server is the administrative point for the service. It's the same MySQL server engine used for on-premises deployments.
+
+- [Azure Database for MariaDB](https://azure.microsoft.com/services/mariadb/) is a fully managed relational database as a service in the Azure cloud. The service is based on the MariaDB community edition server engine. It can handle mission-critical workloads with predictable performance and dynamic scalability.
+
+- [Azure Database for PostgreSQL](https://azure.microsoft.com/services/postgresql/) is a fully managed relational database service, based on the open-source PostgreSQL database engine. Azure Database for PostgreSQL is available with two deployment options:
+
+  - The [Single Server](/azure/postgresql/concepts-servers) deployment option is a central administrative point to which you can deploy many databases.
+
+  - The [Hyperscale (Citus)](https://azure.microsoft.com/blog/get-high-performance-scaling-for-your-azure-database-workloads-with-hyperscale/) option allows the engine to fit more data in memory, parallelize queries across hundreds of nodes, and index data faster.
+
+## NoSQL data in Azure
+
+Cosmos DB is a fully managed, globally distributed NoSQL database service in the Azure cloud. It has been adopted by many large companies across the world, including Coca-Cola, Skype, ExxonMobil, and Liberty Mutual.
+
+If your services require a fast response from anywhere in the world, high availability, or elastic scalability, Cosmos DB is a great choice. Figure 5-12 shows Cosmos DB.
+
+![A diagram showing overview of Cosmos DB.](media/cosmos-db-overview.png)
+
+**Figure 5-12**. Overview of Azure Cosmos DB
+
+The previous figure presents many of the built-in cloud-native capabilities available in Cosmos DB.
In this section, we'll take a closer look at them.
+
+### Global support
+
+Cloud-native applications often have a global audience and require global scale.
+
+You can distribute Cosmos databases across regions or around the world, placing data close to your users, improving response time, and reducing latency without pausing or redeploying.
+
+Cosmos DB supports [active/active](https://kemptechnologies.com/white-papers/unfog-confusion-active-passive-activeactive-load-balancing/) clustering at the global level, enabling you to configure any of your database regions to support *both writes and reads*.
+
+The [multi-region write](/azure/cosmos-db/conflict-resolution-policies) protocol is an important feature in Cosmos DB that enables functionality such as guaranteed reads and writes served in less than 10 milliseconds at the 99th percentile.
+
+With the Cosmos DB [multi-homing APIs](/azure/cosmos-db/distribute-data-globally), your microservice is automatically aware of the nearest Azure region and sends requests to it. Should a region become unavailable, the multi-homing feature will automatically route requests to the next nearest available region.
+
+### Multi-model support
+
+When replatforming monolithic applications to a cloud-native architecture, development teams sometimes have to migrate open-source NoSQL data stores. Cosmos DB can help you preserve your investment in these NoSQL datastores with its *multi-model* data platform. The following table shows the supported NoSQL [compatibility APIs](https://www.wikiwand.com/en/Cosmos_DB).
+
+| Provider | Description |
+| :-------- | :-------- |
+| NoSQL API | API for NoSQL. Stores data in document format. |
+| MongoDB API | Supports MongoDB APIs and JSON documents. |
+| Gremlin API | Supports Gremlin API with graph-based nodes and edge data representations. |
+| Cassandra API | Supports Cassandra API for wide-column data representations. |
+| Table API | Supports Azure Table Storage with premium enhancements. |
+| PostgreSQL API | Managed service for running PostgreSQL at any scale. |
+
+Development teams can migrate existing MongoDB, Gremlin, or Cassandra databases into Cosmos DB with minimal changes to data or code. For new apps, development teams can choose among open-source options or the built-in NoSQL API model.
+
+> Internally, Cosmos stores the data in a simple struct format made up of primitive data types. For each request, the database engine translates the primitive data into the model representation you've selected.
+
+In the previous table, note the [Table API](/azure/cosmos-db/table-introduction) option. This API is an evolution of Azure Table Storage. Both share the same underlying table model, but the Cosmos DB Table API adds premium enhancements not available in the Azure Storage API. See [Develop with Azure Cosmos DB for Table and Azure Table Storage](/azure/cosmos-db/table/support) for details. Microservices that consume Azure Table Storage can easily migrate to the Cosmos DB Table API. No code changes are required.
+
+### Tunable consistency
+
+Cloud-native services with distributed data rely on replication and must make a fundamental trade-off between read consistency, availability, and latency.
+
+Most distributed databases allow developers to choose between two consistency models: strong consistency and eventual consistency. **Strong consistency** is the gold standard of data programmability.
It guarantees that a query will always return the most current data, even if the system must incur latency waiting for an update to replicate across all database copies. A database configured for **eventual consistency**, by contrast, returns data immediately, even if that data isn't the most current copy. The latter option enables higher availability, greater scale, and increased performance.
+
+Azure Cosmos DB offers five well-defined [consistency models](/azure/cosmos-db/consistency-levels) shown in Figure 5-13.
+
+![A diagram showing Cosmos DB consistency graph.](media/cosmos-consistency-level-graph.png)
+
+**Figure 5-13**. Cosmos DB consistency levels
+
+| Consistency Level | Description |
+| :-------- | :-------- |
+| Eventual | No ordering guarantee for reads. Replicas will eventually converge. |
+| Consistent Prefix | Reads are still eventual, but data is returned in the ordering in which it is written. |
+| Session | Guarantees you can read any data written during the current session. It is the default consistency level. |
+| Bounded Staleness | Reads trail writes by an interval that you specify. |
+| Strong | Reads are guaranteed to return the most recent committed version of an item. A client never sees an uncommitted or partial read. |
+
+### Partitioning
+
+Azure Cosmos DB embraces automatic [partitioning](/azure/cosmos-db/partitioning-overview) to scale a database to meet the performance needs of your cloud-native services. You manage data in Cosmos DB by creating databases, containers, and items.
+
+Containers live in a Cosmos DB database and represent a schema-agnostic grouping of items. Items are the data that you add to the container. They're represented as documents, rows, nodes, or edges. All items added to a container are automatically indexed.
+
+Don't get Cosmos DB containers confused with the virtualization containers we've discussed elsewhere in this book. They are data storage entities in a database, not a code execution environment.
+
+To partition the container, items are divided into distinct subsets called logical partitions. Logical partitions are populated based on the value of a partition key that is associated with each item in a container. Figure 5-14 shows two containers, each with a logical partition based on a partition key value:
+
+![A diagram showing Cosmos DB partitioning mechanics.](media/cosmos-db-partitioning.png)
+
+**Figure 5-14**. Cosmos DB partitioning mechanics
+
+## Using databases in a .NET Aspire app
+
+One of the most important ways .NET Aspire helps cloud-native developers is by managing backing services and making it easy for microservices to discover them and communicate with them. Database services like Cosmos DB, SQL Server, or MongoDB are typical examples of backing services that support your microservices by persisting data.
+
+In .NET Aspire, there are built-in integrations, each of which supports a different backing service. The following database systems have .NET Aspire integrations available out-of-the-box:
+
+- Azure Cosmos DB
+- Azure Table Storage
+- MongoDB
+- MySQL
+- Oracle
+- PostgreSQL
+- SQL Server
+
+Other database integrations are likely to become available from Microsoft or third parties. If you use one of these integrations, you still have to code operations like read, write, and delete, but you don't have to write code that manages the database clients and enables microservices to discover them.
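+
+As a minimal sketch of the pattern, here's how the Azure Cosmos DB integration might be wired up in the app host project. It assumes the Aspire Cosmos DB hosting integration package; the resource name `cosmos` and the project name `Projects.Catalog_Api` are hypothetical, and exact method names can vary between .NET Aspire versions:
+
+```csharp
+// App host: declare the Cosmos DB backing service and reference it
+// from the microservice project that needs it.
+var builder = DistributedApplication.CreateBuilder(args);
+
+var cosmos = builder.AddAzureCosmosDB("cosmos");
+
+builder.AddProject<Projects.Catalog_Api>("catalog-api")
+       .WithReference(cosmos);
+
+builder.Build().Run();
+```
+
+In the consuming microservice, a single call such as `builder.AddAzureCosmosClient("cosmos")` (from the matching client integration package) registers a `CosmosClient` in dependency injection, with connection details resolved through service discovery rather than hard-coded configuration.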
+
+## NewSQL databases
+
+**NewSQL** is an emerging database technology that combines the distributed scalability of NoSQL with the ACID guarantees of a relational database. NewSQL databases are important for business systems that must process high volumes of data across distributed environments, with full transactional support and ACID compliance. While a NoSQL database can provide massive scalability, it does not guarantee data consistency. Intermittent problems from inconsistent data can place a burden on the development team. Developers must construct safeguards into their microservice code to manage problems caused by inconsistent data.
+
+The Cloud Native Computing Foundation (CNCF) features several NewSQL database projects, including:
+
+| Project | Characteristics |
+| :-------- | :-------- |
+| CockroachDB | An ACID-compliant, relational database that scales globally. Add a new node to a cluster and CockroachDB takes care of balancing the data across instances and geographies. It creates, manages, and distributes replicas to ensure reliability. It's open source and freely available. |
+| Vitess | Vitess is a database solution for deploying, scaling, and managing large clusters of MySQL instances. It can run in a public or private cloud architecture. Vitess combines and extends many important MySQL features and supports both vertical and horizontal sharding. Originated by YouTube, Vitess has been serving all YouTube database traffic since 2011. |
+
+CockroachDB is a full database product that includes .NET support, while Vitess is a database clustering system that horizontally scales large clusters of MySQL instances. A key design goal for NewSQL databases is to work natively in Kubernetes, taking advantage of the platform's resiliency and scalability.
+
+NewSQL databases are designed to thrive in ephemeral cloud environments where underlying VMs can be restarted or rescheduled at a moment's notice. The databases are designed to survive node failures without data loss or downtime. CockroachDB, for example, is able to survive a machine loss by maintaining three consistent replicas of any data across the nodes in a cluster.
+
+For a detailed look at the mechanics behind NewSQL databases, see [DASH: Four Properties of Kubernetes-Native Databases](https://thenewstack.io/dash-four-properties-of-kubernetes-native-databases/).
+
+## Data migration to the cloud
+
+One of the more time-consuming tasks you may face is migrating data from one data platform to another. The [Azure Database Migration Service](https://azure.microsoft.com/services/database-migration/) can make it easier. It can migrate data from several external database sources into Azure data platforms with minimal downtime. Target platforms include the following services:
+
+- Azure SQL Database
+- Azure Database for MySQL
+- Azure Database for MariaDB
+- Azure Database for PostgreSQL
+- Azure Cosmos DB
+
+The service provides recommendations to guide you through the changes required to execute a migration, both small and large.
+
+>[!div class="step-by-step"]
+>[Previous](distributed-data.md)
+>[Next](azure-caching.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/authentication-authorization.md b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/authentication-authorization.md
new file mode 100644
index 0000000000000..6c9f8b7fb05ec
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/authentication-authorization.md
@@ -0,0 +1,30 @@
+---
+title: Authentication and authorization in cloud-native apps
+description: Architecting Cloud Native .NET Apps for Azure | Authentication and authorization in cloud native apps
+ms.date: 04/06/2022
+---
+
+# Authentication and authorization in cloud-native apps
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+*Authentication* is the process of determining the identity of a security principal. *Authorization* is the act of granting an authenticated principal permission to perform an action or access a resource. Sometimes authentication is shortened to `AuthN` and authorization is shortened to `AuthZ`. Cloud-native applications need to rely on open HTTP-based protocols to authenticate security principals since both clients and applications could be running anywhere in the world on any platform or device. The only common factor is HTTP.
+
+Many organizations still rely on local authentication services like Active Directory Federation Services (ADFS). While this approach has traditionally served organizations well for on-premises authentication needs, cloud-native applications benefit from systems designed specifically for the cloud. Some benefits of moving from ADFS to Microsoft Entra ID are outlined in [this analysis](https://oxfordcomputergroup.com/resources/migrate-adfs-entra-id-benefits-best-practices/), including:
+
+- Easier single sign-on to thousands of apps, including legacy, SaaS, and third-party apps.
+- An improved security posture for your organization.
+- Improved risk management, compliance, and governance capabilities by eliminating on-premises infrastructure.
+- Reduced infrastructure costs.
+- A single control plane for identity and access management.
+- A streamlined user experience for accessing organization apps.
+
+## References
+
+- [Authentication basics](/azure/active-directory/develop/authentication-scenarios)
+- [Access tokens and claims](/azure/active-directory/develop/access-tokens)
+- [Migrating from ADFS to Microsoft Entra ID: Benefits and Best Practices](https://oxfordcomputergroup.com/resources/migrate-adfs-entra-id-benefits-best-practices/)
+
+>[!div class="step-by-step"]
+>[Previous](azure-security.md)
+>[Next](azure-entra.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/azure-entra.md b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/azure-entra.md
new file mode 100644
index 0000000000000..aeb972713c2d2
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/azure-entra.md
@@ -0,0 +1,21 @@
+---
+title: Microsoft Entra ID
+description: Architecting Cloud Native .NET Apps for Azure | Microsoft Entra ID
+ms.date: 05/28/2022
+---
+
+# Microsoft Entra ID
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+Microsoft Entra ID is a cloud-based identity and access management service offering identity as a service (IDaaS).
Customers use it to configure and maintain who users are, what information to store about them, who can access that information, who can manage it, and what apps can access it. Microsoft Entra ID can authenticate users for applications configured to use it, providing a single sign-on (SSO) experience.
+
+Microsoft Entra ID also helps customers access internal resources such as apps on their corporate intranet, and any cloud apps developed for their organization.
+
+## References
+
+- [What is Microsoft Entra ID?](https://learn.microsoft.com/en-us/entra/fundamentals/whatis)
+
+>[!div class="step-by-step"]
+>[Previous](authentication-authorization.md)
+>[Next](identity-server.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/azure-security.md b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/azure-security.md
new file mode 100644
index 0000000000000..fd9e45d91ba92
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/azure-security.md
@@ -0,0 +1,277 @@
+---
+title: Azure security for cloud-native apps
+description: Architecting Cloud Native .NET Apps for Azure | Azure security for cloud-native apps
+ms.date: 04/06/2022
+---
+
+# Azure security for cloud-native apps
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+Cloud-native applications can be both easier and more difficult to secure than traditional applications. On the downside, you need to secure more, smaller components and dedicate more energy to building out the security infrastructure. The heterogeneous nature of programming languages and styles in most service deployments also means you need to pay more attention to security bulletins from many different providers.
+
+On the flip side, microservices, each with their own data store, limit the scope of an attack. If an attacker compromises one system, it's probably more difficult for the attacker to make the jump to another system than it is in a monolithic application. Process boundaries are strong boundaries. Also, if a database backup gets exposed, then the damage is more limited, as that database contains only a subset of data and is unlikely to contain personal data.
+
+## Threat modeling
+
+Whether or not the advantages outweigh the disadvantages of cloud-native applications, the same holistic security mindset must be followed. Security and secure thinking must be part of every step of the development and operations story. When planning an application, ask questions like:
+
+- What would be the impact of this data being lost?
+- How can we limit the damage from bad data being injected into this service?
+- Who should have access to this data?
+- Are there auditing policies in place around the development and release process?
+
+All these questions are part of a process called [threat modeling](https://learn.microsoft.com/training/modules/tm-introduction-to-threat-modeling/). This process tries to answer the question of what threats there are to the system, how likely the threats are, and the potential damage from them.
+
+Once the list of threats has been established, you need to decide whether they're worth mitigating. Sometimes a threat is so unlikely and expensive to plan for that it isn't worth spending energy on it. For instance, some state-level actor could inject changes into the design of a processor that is used by millions of devices.
Now, instead of running a certain piece of code in [Ring 3](https://en.wikipedia.org/wiki/Protection_ring), that code is run in Ring 0. This change allows an exploit that can bypass the hypervisor and run the attack code on bare-metal machines, allowing attacks on all the virtual machines that are running on that hardware.
+
+The altered processors are difficult to detect without a microscope and advanced knowledge of the on-silicon design of that processor. This scenario is unlikely to happen and expensive to mitigate, so probably no threat model would recommend building exploit protection for it.
+
+More likely threats, such as broken access controls permitting `Id` incrementing attacks (replacing `Id=2` with `Id=3` in the URL) or SQL injection, are more attractive to build protections against. The mitigations for these threats are quite reasonable to build, and they prevent embarrassing security holes that could damage the company's reputation.
+
+## Principle of least privilege
+
+One of the founding ideas in computer security is the Principle of Least Privilege (POLP). It's actually a foundational idea in most forms of security, be it digital or physical. In short, the principle is that any user or process should have the smallest number of rights possible to execute its task.
+
+As an example, think of the tellers at a bank: accessing the safe is an uncommon activity. So, the average teller can't open the safe themselves. To gain access, they need to escalate their request through a bank manager, who performs additional security checks.
+
+In a computer system, an example is the rights of a user connecting to a database. In many cases, there's a single user account used to both build the database structure and run the application. Except in extreme cases, the account running the application doesn't need the ability to update schema information. There should be several accounts that provide different levels of privilege. The application should only use the permission level that grants read and write access to the data in the tables. This kind of protection would eliminate attacks that aimed to drop database tables or introduce malicious triggers.
+
+Almost every part of building a cloud-native application can benefit from remembering the principle of least privilege. You can find it at play when setting up firewalls, network security groups, roles, and scopes in Role-Based Access Control (RBAC).
+
+## Penetration testing
+
+As applications become more complicated, the number of attack vectors increases at an alarming rate. Threat modeling is flawed if it's executed only by the same people who built the system. Just as many developers struggle to envision user interactions and end up building unusable user interfaces, most developers have difficulty seeing every attack vector. It's also possible that the developers building the system aren't well versed in attack methodologies and miss something crucial.
+
+Penetration testing, or "pen testing", involves bringing in external actors to attempt to attack the system. These attackers may be an external consulting company or other developers with good security knowledge from another part of the business. They're given carte blanche to attempt to subvert the system. Frequently, they'll find extensive security holes that need to be patched. Sometimes the attack vector will be something totally unexpected, such as a phishing attack against the CEO.
+
+Azure itself is constantly undergoing attacks from a team of hackers inside Microsoft - [Inside the world of the elite hacker and those trying to stop him](https://www.microsoft.com/industry/blog/financial-services/2016/05/17/red-vs-blue/). Over the years, they've been the first to find dozens of potentially catastrophic attack vectors, closing them before they can be exploited externally. The more tempting a target, the more likely that external actors will attempt to exploit it, and there are few targets in the world more tempting than Azure.
+
+## Monitoring
+
+Should an attacker attempt to penetrate an application, there should be some warning of it. Frequently, attacks can be spotted by examining the logs from services. Attacks leave telltale signs that can be spotted before they succeed. For instance, an attacker attempting to guess a password will make many requests to a login system. Monitoring around the login system can detect unusual patterns that are out of line with the typical access pattern. This monitoring can be turned into an alert to an operations person to activate some sort of countermeasure. A highly mature monitoring system might even take action based on these deviations proactively, adding rules to block requests or throttle responses.
+
+## Securing the build
+
+One place where security is often overlooked is around the build process. Not only should the build run security checks, such as scanning for insecure code or checked-in credentials, but the build itself should be secure. If the build server is compromised, then it provides an open door for introducing arbitrary code into the product.
+
+Imagine that an attacker is looking to steal the passwords of people signing into a web application. They could introduce a build step that modifies the checked-out code to mirror any login request to another server. The next time code goes through the build, it's silently updated. The source code vulnerability scanning won't catch this vulnerability as it runs before the build. Equally, nobody will catch it in a code review because the build steps live on the build server. The exploited code will go to production where it can harvest passwords. There's probably no audit log of the build process changes, or at least nobody monitoring the audit.
+
+This scenario is a perfect example of a seemingly low-value target that can be used to break into the system. Once an attacker breaches the perimeter of the system, they can start working on finding ways to elevate their permissions to the point that they can cause real harm anywhere they like.
+
+## Building secure code
+
+.NET is designed to be secure. It avoids some of the pitfalls of unmanaged code, such as walking off the ends of arrays. Work is actively done to fix security holes as they're discovered. There's even a [bug bounty program](https://www.microsoft.com/msrc/bounty) that pays researchers to find issues in the framework and report them instead of exploiting them.
+
+There are many ways to make .NET code more secure. Following guidelines such as the [Secure coding guidelines for .NET](https://learn.microsoft.com/dotnet/standard/security/secure-coding-guidelines) article is a reasonable step to take to ensure that the code is secure from the ground up. The [OWASP top 10](https://owasp.org/www-project-top-ten/) is another invaluable guide for building secure code.
+
+The build process is a good place to put scanning tools to detect problems in source code before they make it into production.
Every project has dependencies on some other packages. A tool that can scan for outdated packages will catch problems in a nightly build. Even when building Docker images, it's useful to check and make sure that the base image doesn't have known vulnerabilities. Another thing to check is that nobody has accidentally checked in credentials.
+
+## Built-in security
+
+Azure aims to balance usability and security for most users. Different users are going to have different security requirements, so they need to fine-tune their approach to cloud security. Microsoft publishes a great deal of security information in the [Trust Center](https://www.microsoft.com/trust-center). This resource should be the first stop for those professionals interested in understanding how the built-in attack mitigation technologies work.
+
+Within the Azure portal, the [Azure Advisor](https://azure.microsoft.com/products/advisor/) is a system that is constantly scanning an environment and making recommendations. Some of these recommendations are designed to save users money, but others are designed to identify potentially insecure configurations, such as having a storage container open to the world and not protected by a virtual network.
+
+## Azure network infrastructure
+
+In an on-premises deployment environment, a great deal of energy is dedicated to setting up networking. Setting up routers, switches, and the like is complicated work. Networks allow certain resources to talk to other resources and prevent access in some cases. A frequent network rule is to restrict access to the production environment from the development environment on the off chance that a half-developed piece of code runs awry and deletes a swath of data.
+
+Out of the box, most Azure PaaS resources have only the most basic and permissive networking setup. For instance, anybody on the Internet can access an app service. New SQL Server instances typically come restricted, so that external parties can't access them, but the IP address ranges used by Azure itself are permitted through. So, while the SQL server is protected from external threats, an attacker only needs to set up an Azure bridgehead from where they can launch attacks against all SQL instances on Azure.
+
+Fortunately, most Azure resources can be placed into an Azure virtual network that allows fine-grained access control. Similar to the way that on-premises networks establish private networks that are protected from the wider world, virtual networks are islands of private IP addresses that are located within the Azure network.
+
+![Diagram showing a virtual network in Azure](./media/virtual-network.png)
+
+**Figure 11-1**. A virtual network in Azure.
+
+In the same way that on-premises networks have a firewall governing access to the network, you can establish a similar firewall at the boundary of the virtual network. By default, all the resources on a virtual network can still talk to the Internet. It's only incoming connections that require some form of explicit firewall exception.
+
+With the network established, internal resources like storage accounts can be set up to only allow for access by resources that are also on the virtual network. This firewall provides an extra level of security: should the keys for that storage account be leaked, attackers still wouldn't be able to connect to it to exploit them. This scenario is another example of the principle of least privilege.
+
+The nodes in an Azure Kubernetes cluster can participate in a virtual network just like other resources that are more native to Azure. This functionality is called [Azure Container Networking Interface](https://github.com/Azure/azure-container-networking/blob/master/docs/cni.md). In effect, it reserves a subnet within the virtual network in which virtual machines and containers are allocated.
+
+Continuing down the path of illustrating the principle of least privilege, not every resource within a virtual network needs to talk to every other resource. For instance, in an application that provides a web API over a storage account and a SQL database, it's unlikely that the database and the storage account need to talk to one another. Any data sharing between them would go through the web application. So, a [network security group (NSG)](https://learn.microsoft.com/azure/virtual-network/network-security-groups-overview) could be used to deny traffic between the two services.
+
+A policy of denying communication between resources can be annoying to implement, especially coming from a background of using Azure without traffic restrictions. On some other clouds, the concept of network security groups is much more prevalent. For instance, the default policy on AWS is that resources can't communicate among themselves until enabled by rules in an NSG. While slower to develop with, a more restrictive environment provides a more secure default. Making use of proper DevOps practices, especially using [Bicep or Terraform](https://learn.microsoft.com/azure/cloud-adoption-framework/ready/considerations/infrastructure-as-code) to manage permissions, can make controlling the rules easier.
+
+Virtual networks can also be useful when setting up communication between on-premises and cloud resources. A virtual private network can be used to attach the two networks together seamlessly. This approach allows running a virtual network without any sort of gateway for scenarios where all the users are on-site. There are a number of technologies that can be used to establish this network. The simplest is to use a [site-to-site VPN](https://learn.microsoft.com/azure/vpn-gateway/tutorial-site-to-site-portal) that can be established between many routers and Azure. Traffic is encrypted and tunneled over the Internet at the same cost per byte as any other traffic. For scenarios where more bandwidth or more security is desirable, Azure offers a service called [ExpressRoute](https://learn.microsoft.com/azure/expressroute/expressroute-introduction) that uses a private circuit between an on-premises network and Azure. It's more costly and difficult to establish but also more secure.
+
+## Role-Based Access Control (RBAC) for restricting access to Azure resources
+
+RBAC is a system that provides an identity to applications running in Azure. Applications can access resources using this identity instead of or in addition to using keys or passwords.
+
+### Security principals
+
+The first component in RBAC is a security principal. A security principal can be a user, group, service principal, or managed identity.
+
+![Diagram showing different types of security principals](./media/rbac-security-principal.png)
+
+**Figure 11-2**. Different types of security principals.
+
+- **User**: Any user who has an account in Azure Active Directory is a user.
+- **Group**: A collection of users from Azure Active Directory. As a member of a group, a user takes on the roles of that group in addition to their own.
+- **Service principal**: A security identity under which services or applications run.
+- **Managed identity**: An Azure Active Directory identity managed by Azure. Cloud applications typically use managed identities so that developers don't have to store or manage the credentials for authenticating to Azure services.
+
+The security principal can be applied to any resource. This means that it's possible to assign a security principal to a container running within Azure Kubernetes, allowing it to access secrets stored in Key Vault. An Azure Function could take on a permission allowing it to talk to an Active Directory instance to validate a JWT for a calling user. Once services are enabled with a service principal, their permissions can be managed granularly using roles and scopes.
+
+### Roles
+
+A security principal can take on many roles or, using a more sartorial analogy, wear many hats. Each role defines a series of permissions such as "Read messages from Azure Service Bus endpoint". The effective permission set of a security principal is the combination of all the permissions assigned to all the roles that the security principal has. Azure has a large number of built-in roles, and users can define their own roles.
+
+![Diagram showing RBAC role definitions](./media/rbac-role-definition.png)
+
+**Figure 11-3**. RBAC role definitions.
+
+Built into Azure are also a number of high-level roles such as **Owner**, **Contributor**, **Reader**, and **User Account Administrator**. With the Owner role, a security principal can access all resources and assign permissions to others. A Contributor has the same level of access to all resources, but they can't assign permissions. A Reader can only view existing Azure resources, and a User Account Administrator can manage access to Azure resources.
+
+More granular built-in roles such as [DNS Zone Contributor](https://learn.microsoft.com/azure/dns/dns-protect-zones-recordsets#azure-role-based-access-control) have rights limited to a single service. Security principals can take on any number of roles.
+
+### Scopes
+
+Roles can be applied to a restricted set of resources within Azure. For instance, applying scope to the previous example of reading from a Service Bus queue, you can narrow the permission to a single queue: "Read messages from Azure Service Bus endpoint `blah.servicebus.windows.net/queue1`".
+
+The scope can be as narrow as a single resource or it can be applied to an entire resource group, subscription, or even management group.
+
+When testing if a security principal has a certain permission, the combination of role and scope is taken into account. This combination provides a powerful authorization mechanism.
+
+### Deny
+
+Previously, only "allow" rules were permitted for RBAC. This behavior made some scopes complicated to build. For instance, allowing a security principal access to all storage accounts except one required granting explicit permission to a potentially endless list of storage accounts. Every time a new storage account was created, it would have to be added to this list of accounts. This added management overhead certainly wasn't desirable.
+
+Deny rules take precedence over allow rules. Now, the same "allow all but one" scope can be represented as two rules: "allow all" and "deny this specific one". Deny rules not only ease management but also allow for resources that are extra secure by denying access to everybody.
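+
+To see how roles and scopes surface in application code, here's a minimal sketch using the `Azure.Messaging.ServiceBus` and `Azure.Identity` packages. It assumes the calling identity has been granted the built-in "Azure Service Bus Data Receiver" role scoped to the hypothetical `queue1` endpoint from the example above:
+
+```csharp
+using Azure.Identity;
+using Azure.Messaging.ServiceBus;
+
+// No connection string or shared key: the client authenticates with whatever
+// identity DefaultAzureCredential resolves (managed identity, CLI login, ...).
+await using var client = new ServiceBusClient(
+    "blah.servicebus.windows.net",
+    new DefaultAzureCredential());
+
+// Receiving succeeds only if the resolved identity holds a role such as
+// "Azure Service Bus Data Receiver" at queue scope or broader.
+ServiceBusReceiver receiver = client.CreateReceiver("queue1");
+ServiceBusReceivedMessage message = await receiver.ReceiveMessageAsync();
+Console.WriteLine(message.Body);
+```
+
+If the role is assigned at a broader scope, such as the namespace or the resource group, the same code gains access to every queue that the scope covers.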
+
+### Checking access
+
+As you can imagine, having a large number of roles and scopes can make figuring out the effective permission of a service principal quite difficult. Piling deny rules on top of that only serves to increase the complexity. Fortunately, there's a permissions calculator that can show the effective permissions for any service principal. It's typically found under the IAM tab in the portal, as shown in Figure 11-4.
+
+![Screenshot showing the permission calculator for an app service.](./media/check-rbac.png)
+
+**Figure 11-4**. Permission calculator for an app service.
+
+## Securing secrets
+
+Passwords and certificates are a common attack vector. Password-cracking hardware can do a brute-force attack and try to guess billions of passwords per second. So it's important that the passwords that are used to access resources are strong, with a large variety of characters. These passwords are exactly the kind of passwords that are nearly impossible to remember. Fortunately, the passwords in Azure don't actually need to be known by any human.
+
+Many security [experts suggest](https://www.troyhunt.com/password-managers-dont-have-to-be-perfect-they-just-have-to-be-better-than-not-having-one/) that using a password manager to keep your own passwords is the best approach. While it centralizes your passwords in one location, it also allows using highly complex passwords and ensuring they're unique for each account. The same system exists within Azure: a central store for secrets.
+
+### Azure Key Vault
+
+Azure Key Vault provides a centralized location to store secrets such as database passwords, API keys, and certificates. Once a secret is entered into the vault, it's never shown again, and the commands to extract and view it are purposefully complicated. The information in the vault is protected using either software encryption or FIPS 140-2 Level 2 validated Hardware Security Modules.
+
+Access to the key vault is provided through RBAC, meaning that not just any user can access the information in the vault. Say a web application wishes to access the database connection string stored in Azure Key Vault. To gain access, applications need to run using a service principal. Under this assumed role, they can read the secrets from the vault. There are a number of different security settings that can further limit the access that an application has to the vault, so that it can't update secrets but only read them.
+
+Access to the key vault can be monitored to ensure that only the expected applications are accessing the vault. The logs can be integrated back into Azure Monitor, unlocking the ability to set up alerts when unexpected conditions are encountered.
+
+#### Azure Key Vault and .NET Aspire
+
+You don't have to use .NET Aspire to connect your cloud-native app to Azure Key Vault, but if you do, much of the work is done for you because .NET Aspire includes a built-in Key Vault integration. You can create and configure a Key Vault backing service centrally, in the .NET Aspire app host project. Then, you pass that service to every microservice that uses its secrets.
+
+In such a microservice, you use dependency injection to retrieve the backing service and then interact with it using a `SecretClient` object as normal. You can, for example, retrieve a password from the vault and use it to authenticate with another microservice, as the sketch below shows.
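+
+As a minimal sketch, assuming the .NET Aspire Key Vault hosting integration package (method names can vary between versions); the resource name `secrets` and the project name `Projects.Catalog_Api` are hypothetical:
+
+```csharp
+// App host: declare the Key Vault backing service centrally and
+// pass a reference to each microservice that needs secrets.
+var builder = DistributedApplication.CreateBuilder(args);
+
+var secrets = builder.AddAzureKeyVault("secrets");
+
+builder.AddProject<Projects.Catalog_Api>("catalog-api")
+       .WithReference(secrets);
+
+builder.Build().Run();
+```
+
+In the microservice itself, a call such as `builder.AddAzureKeyVaultClient("secrets")` (from the matching client integration package) registers a `SecretClient` in dependency injection, so a handler can accept a `SecretClient` parameter and call `GetSecret("db-password")`, where `db-password` is a hypothetical secret name.
+
+### Kubernetes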
+
+Within Kubernetes, there's a similar service for maintaining small pieces of secret information. Kubernetes Secrets can be set using the `kubectl` executable.
+
+Creating a secret is as simple as finding the base64 version of the values to be stored:
+
+```console
+echo -n 'admin' | base64
+YWRtaW4=
+echo -n '1f2d1e2e67df' | base64
+MWYyZDFlMmU2N2Rm
+```
+
+Then, add it to a secrets file, named `secret.yml` for example, that looks similar to the following:
+
+```yml
+apiVersion: v1
+kind: Secret
+metadata:
+  name: mysecret
+type: Opaque
+data:
+  username: YWRtaW4=
+  password: MWYyZDFlMmU2N2Rm
+```
+
+Finally, this file can be loaded into Kubernetes by running the following command:
+
+```console
+kubectl apply -f ./secret.yml
+```
+
+These secrets can be mounted into volumes or exposed to container processes through environment variables. The [Twelve-factor app](https://12factor.net/) approach to building applications suggests using the lowest common denominator to transmit settings to an application. Environment variables are the lowest common denominator, because they're supported no matter the operating system or application.
+
+An alternative to using the built-in Kubernetes Secrets system is to access the secrets in Azure Key Vault from within Kubernetes. The simplest way to do this is to assign an RBAC role to the container looking to load secrets. The application can then use the Azure Key Vault APIs to access the secrets. However, this approach requires modifications to the code and doesn't follow the pattern of using environment variables. Instead, it's possible to inject values into a container. This approach is actually more secure than using the Kubernetes secrets directly, as Kubernetes secrets can be accessed by users on the cluster.
+
+## Encryption in transit and at rest
+
+Keeping data safe is important whether it's on disk or transiting between various different services. The most effective way to keep data from leaking is to encrypt it into a format that can't be easily read by others. Azure supports a wide range of encryption options.
+
+### In transit
+
+There are several ways to encrypt traffic on the network in Azure. Access to Azure services is typically done over connections that use Transport Layer Security (TLS). For instance, all the connections to the Azure APIs require TLS connections. Equally, connections to endpoints in Azure storage can be restricted to work only over TLS-encrypted connections.
+
+TLS is a complicated protocol, and simply knowing that the connection is using TLS isn't sufficient to ensure security. For instance, TLS 1.0 is chronically insecure, and TLS 1.1 isn't much better. Even within the versions of TLS, there are various settings that can make the connections easier to decrypt. The best course of action is to check and see if the server connection is using up-to-date and well-configured protocols.
+
+This check can be done by an external service such as SSL Labs' **SSL Server Test**. A test run against a typical Azure endpoint, in this case a service bus endpoint, yields a near perfect score of **A**.
+
+Even services like Azure SQL databases use TLS encryption to keep data hidden. The interesting part about encrypting the data in transit using TLS is that it isn't possible, even for Microsoft, to listen in on the connection between computers running TLS. This should provide comfort for companies concerned that their data may be at risk from Microsoft or even a state actor with more resources than the standard attacker.
+
+![Screenshot showing an SSL Labs report with a score of A for a Service Bus endpoint.](./media/ssl-report.png)
+
+**Figure 11-5**. SSL Labs report showing a score of A for a Service Bus endpoint.
+
+While this level of encryption isn't going to be sufficient for all time, it should inspire confidence that Azure TLS connections are quite secure. Azure will continue to evolve its security standards as encryption improves. It's nice to know that there's somebody watching the security standards and updating Azure as they improve.
+
+### At rest
+
+In any application, there are a number of places where data rests on the disk. The application code itself is loaded from some storage mechanism. Most applications also use some kind of database such as SQL Server, Cosmos DB, or even the amazingly price-efficient Table Storage. These databases all use heavily encrypted storage to ensure that nobody other than the applications with proper permissions can read your data. Even the system operators can't read data that has been encrypted. So customers can remain confident their secret information remains secret.
+
+### Storage
+
+The underpinning of much of Azure is the Azure Storage engine. Virtual machine disks are mounted on top of Azure Storage. Azure Kubernetes Service runs on virtual machines that, themselves, are hosted on Azure Storage. Even serverless technologies, such as Azure Functions apps and Azure Container Instances, run off a disk that is part of Azure Storage.
+
+If Azure Storage is well encrypted, then it provides a foundation for everything else to also be encrypted. Azure Storage [is encrypted](https://learn.microsoft.com/azure/storage/common/storage-service-encryption) with [FIPS 140-2](https://en.wikipedia.org/wiki/FIPS_140) compliant [256-bit AES](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard). This is a well-regarded encryption technology, having been the subject of extensive academic scrutiny over the last 20 or so years. At present, there's no known practical attack that would allow someone without knowledge of the key to read data encrypted by AES.
+
+By default, the keys used for encrypting Azure Storage are managed by Microsoft. There are extensive protections in place to prevent malicious access to these keys. However, users with particular encryption requirements can also [provide their own storage keys](https://learn.microsoft.com/azure/storage/common/customer-managed-keys-overview) that are managed in Azure Key Vault. These keys can be revoked at any time, which would effectively render the contents of the storage account inaccessible.
+
+Virtual machines use encrypted storage, but it's possible to provide another layer of encryption by using technologies like BitLocker on Windows or DM-Crypt on Linux. These technologies mean that even if the disk image were leaked, it would remain nearly impossible to read it.
+
+### Azure SQL
+
+Databases hosted on Azure SQL use a technology called [Transparent Data Encryption (TDE)](https://learn.microsoft.com/sql/relational-databases/security/encryption/transparent-data-encryption) to ensure data remains encrypted. It's enabled by default on all newly created SQL databases, but must be enabled manually for legacy databases. TDE executes real-time encryption and decryption of not just the database, but also the backups and transaction logs.
+
+The encryption parameters are stored in the `master` database and, on startup, are read into memory for the remaining operations.
This means that the `master` database must remain unencrypted. The actual key is managed by Microsoft. However, users with exacting security requirements may provide their own key in Key Vault in much the same way as is done for Azure Storage. The Key Vault provides for such services as key rotation and revocation.
+
+The "Transparent" part of TDE comes from the fact that no client changes are needed to use an encrypted database. While this approach provides for good security, leaking the database password is enough for users to be able to decrypt the data. There's another approach that encrypts individual columns or tables in a database. [Always Encrypted](/azure/sql-database/sql-database-always-encrypted-azure-key-vault) ensures that at no point does the encrypted data appear in plain text inside the database.
+
+Setting up this tier of encryption requires running through a wizard in SQL Server Management Studio to select the sort of encryption and where in Key Vault to store the associated keys.
+
+![Screenshot showing how to select columns in a table to be encrypted using Always Encrypted.](./media/always-encrypted.png)
+
+**Figure 11-6**. Selecting columns in a table to be encrypted using Always Encrypted.
+
+Client applications that read information from these encrypted columns need to make special allowances to read encrypted data. Connection strings need to be updated with `Column Encryption Setting=Enabled`, and client credentials must be retrieved from the Key Vault. The SQL Server client must then be primed with the column encryption keys. Once that is done, the remaining actions use the standard interfaces to SQL Client. That is, tools like Dapper and Entity Framework, which are built on top of SQL Client, will continue to work without changes. Always Encrypted may not yet be available for every SQL Server driver in every language.
+
+The combination of TDE and Always Encrypted, both of which can be used with client-specific keys, ensures that even the most exacting encryption requirements are supported.
+
+### Cosmos DB
+
+Cosmos DB is the newest database provided by Microsoft in Azure. It has been built from the ground up with security and cryptography in mind. 256-bit AES encryption is standard for all Cosmos DB databases and can't be disabled. Coupled with the TLS 1.2 requirement for communication, the entire storage solution is encrypted.
+
+![Diagram showing the flow of data encryption within Cosmos DB.](./media/cosmos-encryption.png)
+
+**Figure 11-7**. The flow of data encryption within Cosmos DB.
+
+While Cosmos DB doesn't provide for supplying customer encryption keys, there has been significant work done by the team to ensure it remains PCI-DSS compliant without that. Cosmos DB also doesn't yet support any sort of single-column encryption similar to Azure SQL's Always Encrypted.
+
+## Keeping secure
+
+Azure has all the tools necessary to release a highly secure product. However, a chain is only as strong as its weakest link. If the applications deployed on top of Azure aren't developed with a proper security mindset and good security audits, then they become the weak link in the chain. There are many great static analysis tools, encryption libraries, and security practices that can be used to ensure that the software installed on Azure is as secure as Azure itself.
+The combination of TDE and Always Encrypted, both of which can be used with client-specific keys, ensures that even the most exacting encryption requirements are supported.
+
+### Cosmos DB
+
+Cosmos DB is one of the newer databases provided by Microsoft in Azure. It has been built from the ground up with security and cryptography in mind. 256-bit AES encryption is standard for all Cosmos DB databases and can't be disabled. Coupled with the TLS 1.2 requirement for communication, the entire storage solution is encrypted.
+
+![Diagram showing the flow of data encryption within Cosmos DB.](./media/cosmos-encryption.png)
+
+**Figure 11-7**. The flow of data encryption within Cosmos DB.
+
+While Cosmos DB doesn't provide for supplying customer encryption keys, the team has done significant work to ensure it remains PCI-DSS compliant without them. Cosmos DB also doesn't yet support any sort of single-column encryption similar to Azure SQL's Always Encrypted.
+
+## Keeping secure
+
+Azure has all the tools necessary to release a highly secure product. However, a chain is only as strong as its weakest link. If the applications deployed on top of Azure aren't developed with a proper security mindset and good security audits, then they become the weak link in the chain. There are many great [static analysis tools](https://www.mend.io/sca/), [encryption libraries](https://www.libressl.org/), and [security practices](https://www.microsoft.com/industry/blog/financial-services/2016/05/17/red-vs-blue/) that can be used to ensure that the software installed on Azure is as secure as Azure itself.
+
+>[!div class="step-by-step"]
+>[Previous](code-provenance.md)
+>[Next](authentication-authorization.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/cloud-native-security.md b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/cloud-native-security.md
new file mode 100644
index 0000000000000..b8fc04c971686
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/cloud-native-security.md
+---
+title: Cloud-native security
+description: Architecting Cloud Native .NET Apps for Azure | Cloud-native security
+ms.date: 04/06/2022
+---
+
+# Cloud-native security
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+Not a day goes by without news of a company being hacked or somehow losing its customers' data. Even countries and large jurisdictions aren't immune to the problems created by treating security as an afterthought. For years, companies have treated the security of customer data and, in fact, of their entire networks as something of a "nice to have". Windows servers were left unpatched, ancient versions of PHP kept running, and databases were left wide open to the world.
+
+However, real-world consequences are now almost inevitable for not maintaining a security mindset when building and deploying applications. Many companies learned the hard way what can happen when servers and desktops aren't patched during the 2017 outbreak of [NotPetya](https://www.wired.com/story/notpetya-cyberattack-ukraine-russia-code-crashed-the-world/). The cost of these attacks has easily reached into the billions, with some estimates putting the losses from this single attack at 10 billion US dollars.
+
+Even governments aren't immune to hacking incidents. The city of Baltimore was held ransom by [criminals](https://www.vox.com/recode/2019/5/21/18634505/baltimore-ransom-robbinhood-mayor-jack-young-hackers), making it impossible for citizens to pay their bills or use city services.
+
+There has also been an increase in legislation that mandates certain protections for personal data. In Europe, GDPR has been in effect since 2018 and, in the same year, California created its own version, called the California Consumer Privacy Act (CCPA), which came into effect January 1, 2020. The fines under GDPR can be punishing enough to put companies out of business. Google has already been fined 50 million euros for violations, but that's just a drop in the ocean compared with the potential fines.
+
+In short, security is serious business. Cloud-native apps have their own security challenges. For example, if a user is authenticated by one microservice, how do you authorize them to access other microservices?
+
+In this chapter, you'll learn about identity, authentication, and authorization; how security is provided on-premises and in the cloud; and the security products available from both Microsoft and third-party open-source providers.
+ +>[!div class="step-by-step"] +>[Previous](../monitoring-health/azure-monitor.md) +>[Next](identity.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/code-provenance.md b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/code-provenance.md new file mode 100644 index 0000000000000..31682859d70f5 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/code-provenance.md @@ -0,0 +1,31 @@ +--- +title: Code provenance +description: Architecting Cloud Native .NET Apps for Azure | Code provenance +ms.date: 06/03/2024 +--- + +# Code provenance: the lineage of code + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +At its core, code provenance refers to the origin or source of code, encompassing the understanding of where code comes from, who authored it, how it has been built, and the ways in which it has been modified over time. This intricate tapestry of information is not just a historical record; it is a vital component for maintaining the integrity and security of software systems. + +## Why does code provenance matter? + +In a digital era where software permeates every aspect of our lives, ensuring the security and reliability of code is paramount. Code provenance serves as a strategic approach to achieve this goal by providing a verifiable attestation of the origin of all code running in production environments. It acts as a root of trust, allowing teams to define and enforce policies throughout each stage of the software development process. + +## The journey of code: from creation to deployment + +The journey of code from its inception to deployment is marked by various stages, each with its own set of checks and balances. Code reviews, continuous integration (CI) testing, security unit tests, and broader analysis at the repository level are standard practices that help in identifying errors, bugs, and vulnerabilities. However, these traditional checks are not foolproof. Code provenance adds an additional layer of defense against scenarios where either malicious or non-malicious code could bypass these checks. + +## Scenarios addressed by code provenance + +Code provenance is particularly effective in mitigating risks posed by well-intentioned insiders who may lack experience, malicious insiders with legitimate access attempting to introduce harmful code, and malicious outsiders who have gained unauthorized control. By tracking the provenance of code, organizations can ensure that every line of code has undergone proper review and testing before making its way into production. + +## The provenance report: a glossary of code lineage + +A provenance report categorizes code into several buckets, such as brand new code, code less than two weeks old, and code more than two years old. These classifications help teams understand the amount of churn, the necessity of rewriting legacy code, and the efforts made to modernize and pay down technical debt. 
+ +>[!div class="step-by-step"] +>[Previous](code-security.md) +>[Next](azure-security.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/code-security.md b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/code-security.md new file mode 100644 index 0000000000000..ba4cda8a4f717 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/code-security.md @@ -0,0 +1,32 @@ +--- +title: Code security and secure supply chain +description: Architecting Cloud Native .NET Apps for Azure | Code security and secure supply chain +ms.date: 06/03/2024 +--- + +# Ensuring code security in the age of digital supply chain vulnerabilities + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +In the rapidly evolving digital landscape, code security has become a paramount concern for organizations worldwide. As businesses increasingly rely on software to drive operations, the security of their code and the integrity of their supply chains have emerged as critical factors in safeguarding against cyber threats. + +## The importance of code security + +Code security refers to the practices and measures taken to protect software code from unauthorized access and vulnerabilities that could lead to exploitation. This encompasses a range of strategies, from secure coding practices to vulnerability scanning and patch management. The goal is to ensure that the software remains robust against attacks and that any potential security flaws are identified and remedied swiftly. + +## Challenges in the supply chain + +The software supply chain is a complex network of developers, vendors, and third-party components that come together to create the final product. Each element in this chain presents an opportunity for security breaches, making it essential to establish secure supply chain practices. This includes vetting third-party components for vulnerabilities, continuously monitoring for new threats, and ensuring that all parties adhere to strict security standards. + +## Best practices for secure code and secure supply chains + +1. **Implement secure coding standards**: Developers should follow secure coding guidelines to minimize vulnerabilities from the outset. +1. **Conduct regular code reviews**: Peer reviews and automated tools can help detect issues early in the development process. +1. **Utilize static and dynamic analysis tools**: These tools can identify potential security flaws within the codebase. +1. **Stay updated on vulnerabilities**: Keeping abreast of the latest security threats allows for timely updates and patches. +1. **Secure third-party components**: Ensure that all external code and libraries are from reputable sources and are up-to-date. +1. **Educate and train staff**: A well-informed team is crucial in maintaining a secure development environment. 
+
+>[!div class="step-by-step"]
+>[Previous](security-concepts.md)
+>[Next](code-provenance.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/identity-server.md b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/identity-server.md
new file mode 100644
index 0000000000000..b3f1e486977c3
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/identity-server.md
+---
+title: IdentityServer for cloud native apps
+description: Architecting Cloud Native .NET Apps for Azure | IdentityServer for cloud native apps
+ms.date: 04/06/2022
+---
+
+# IdentityServer for cloud-native apps
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+**IdentityServer** is an authentication server that implements the OpenID Connect (OIDC) and OAuth 2.0 standards for ASP.NET Core. It's designed to provide a common way to authenticate requests to all of your applications, whether they're web, native, mobile, or API endpoints. IdentityServer can be used to implement Single Sign-On (SSO) for multiple applications and application types. It can authenticate actual users via sign-in forms and similar user interfaces, and it also supports service-based authentication that typically involves token issuance, verification, and renewal without any user interface. IdentityServer is designed to be a customizable solution: each instance is typically tailored to suit the needs of an individual organization and/or set of applications.
+
+## Common web app scenarios
+
+Typically, applications need to support some or all of the following scenarios:
+
+- Human users accessing web applications with a browser.
+- Human users accessing back-end Web APIs from browser-based apps.
+- Human users on mobile/native clients accessing back-end Web APIs.
+- Other applications accessing back-end Web APIs (without an active user or user interface).
+- Any application interacting with other Web APIs, using its own identity or delegating to the user's identity.
+
+![Diagram showing common web application types and scenarios.](./media/application-types.png)
+
+**Figure 11-8**. Application types and scenarios.
+
+In each of these scenarios, the exposed functionality needs to be secured against unauthorized use. At a minimum, this typically requires authenticating the user or principal making a request for a resource. This authentication may use one of several common protocols such as SAML2p, WS-Fed, or OpenID Connect. Communicating with APIs typically uses the OAuth2 protocol and its support for security tokens. Separating these critical cross-cutting security concerns and their implementation details from the applications themselves ensures consistency and improves security and maintainability. Outsourcing these concerns to a dedicated product like IdentityServer removes the requirement for every application to solve these problems itself.
+
+IdentityServer provides middleware that runs within an ASP.NET Core application and adds support for OpenID Connect and OAuth2 (see [supported specifications](https://docs.duendesoftware.com/identityserver/v7/overview/specs/)). Organizations create their own ASP.NET Core app that uses the IdentityServer middleware to act as the security token service (STS) for all of their token-based security protocols.
+The IdentityServer middleware exposes endpoints to support standard functionality, including:
+
+- **Authorize**: Authenticate the end user
+- **Token**: Request a token programmatically
+- **Discovery**: Obtain metadata about the server
+- **User Info**: Get user information with a valid access token
+- **Device Authorization**: Start device flow authorization
+- **Introspection**: Validate a token
+- **Revocation**: Invalidate a token
+- **End Session**: Trigger single sign-out across all apps
+
+## Getting started
+
+IdentityServer is available under a dual license:
+
+- **RPL**: Lets you use IdentityServer free of charge in open-source work
+- **Paid**: Lets you use IdentityServer in commercial scenarios
+
+For more information about pricing, see the official product's [pricing page](https://duendesoftware.com/products/identityserver).
+
+You can add it to your applications using its NuGet packages. The main package is [IdentityServer](https://www.nuget.org/packages/Duende.IdentityServer), which has been downloaded over 9 million times.
+
+## Configuration
+
+IdentityServer supports different kinds of protocols and social authentication providers that can be configured as part of each custom installation. This is typically done in the ASP.NET Core application's `Program` class (or in the `Startup` class in the `ConfigureServices` method). The configuration involves specifying the supported protocols and the paths to the servers and endpoints that will be used. Figure 11-9 shows an example configuration taken from the IdentityServer Quickstart UI project:
+
+```csharp
+public class Startup
+{
+    public void ConfigureServices(IServiceCollection services)
+    {
+        services.AddMvc();
+
+        // some details omitted
+        services.AddIdentityServer();
+
+        services.AddAuthentication()
+            .AddGoogle("Google", options =>
+            {
+                options.SignInScheme = IdentityServerConstants.ExternalCookieAuthenticationScheme;
+
+                options.ClientId = "";
+                options.ClientSecret = "";
+            })
+            .AddOpenIdConnect("demoidsrv", "IdentityServer", options =>
+            {
+                options.SignInScheme = IdentityServerConstants.ExternalCookieAuthenticationScheme;
+                options.SignOutScheme = IdentityServerConstants.SignoutScheme;
+
+                options.Authority = "https://demo.identityserver.io/";
+                options.ClientId = "implicit";
+                options.ResponseType = "id_token";
+                options.SaveTokens = true;
+                options.CallbackPath = new PathString("/signin-idsrv");
+                options.SignedOutCallbackPath = new PathString("/signout-callback-idsrv");
+                options.RemoteSignOutPath = new PathString("/signout-idsrv");
+
+                options.TokenValidationParameters = new TokenValidationParameters
+                {
+                    NameClaimType = "name",
+                    RoleClaimType = "role"
+                };
+            });
+    }
+}
+```
+
+**Figure 11-9**. Configuring IdentityServer.
+
+## JavaScript clients
+
+Many cloud-native applications use server-side APIs and rich client single-page applications (SPAs) on the front end. IdentityServer ships a [JavaScript client](https://docs.duendesoftware.com/identityserver/v7/quickstarts/js_clients/) (`oidc-client.js`) via NPM that can be added to SPAs to enable them to use IdentityServer for sign-in, sign-out, and token-based authentication of web APIs.
+
+## References
+
+- [IdentityServer documentation](https://docs.duendesoftware.com/identityserver/v7/)
+- [Application types](https://learn.microsoft.com/entra/identity-platform/v2-app-types)
+- [JavaScript OIDC client](https://docs.duendesoftware.com/identityserver/v7/quickstarts/js_clients/)
+
+>[!div class="step-by-step"]
+>[Previous](azure-entra.md)
+>[Next](keycloak.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/identity.md b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/identity.md
new file mode 100644
index 0000000000000..1e8568b195e70
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/identity.md
+---
+title: Cloud-native identity
+description: Architecting Cloud Native .NET Apps for Azure | Cloud-native identity
+ms.date: 06/03/2024
+---
+
+# Cloud-native identity
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+Most software applications need to have some knowledge of the user or process that is calling them. The user or process interacting with an application is known as a security principal, and the process of authenticating and authorizing these principals is known as identity management, or simply *identity*. Simple applications may include all of their identity management within the application, but this approach doesn't scale well with many applications and many kinds of security principals. Windows supports the use of Active Directory (AD) to provide centralized authentication and authorization.
+
+While this solution is effective within corporate networks, it isn't designed for use by users or applications that are outside of the AD domain. With the growth of Internet-based applications and the rise of cloud-native apps, security models have evolved.
+
+In today's cloud-native identity model, architecture is assumed to be distributed. Apps can be deployed anywhere and may communicate with other apps anywhere. Clients may communicate with these apps from anywhere, and in fact, clients may consist of any combination of platforms and devices. Cloud-native identity solutions use open standards to achieve secure application access from clients. These clients range from human users on PCs or phones, to other apps hosted anywhere online, to set-top boxes and IoT devices running any software platform anywhere in the world. Microsoft Entra ID is Microsoft's cloud-based identity and access management service for this model.
+
+Modern cloud-native identity solutions typically use access tokens that are issued by a security token service (STS) to a security principal once their identity is determined. The access token, typically a JSON Web Token (JWT), includes *claims* about the security principal. These claims minimally include the user's identity, but may also include additional claims that applications can use to determine the level of access to grant the principal.
+
+Typically, the STS is only responsible for authenticating the principal. Determining their level of access to resources is left to other parts of the application.
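+To make claims concrete, here's a minimal sketch that decodes an access token and lists its claims using the `System.IdentityModel.Tokens.Jwt` package. The `GetAccessTokenFromRequest` helper is hypothetical, and note that `ReadJwtToken` only decodes the token; a real service must still validate the signature, issuer, audience, and expiry:
+
+```csharp
+using System.IdentityModel.Tokens.Jwt;
+
+// Hypothetical helper that extracts the bearer token from the request.
+string accessToken = GetAccessTokenFromRequest();
+
+var handler = new JwtSecurityTokenHandler();
+JwtSecurityToken token = handler.ReadJwtToken(accessToken);
+
+// Each claim is a name/value pair describing the security principal.
+foreach (var claim in token.Claims)
+{
+    Console.WriteLine($"{claim.Type}: {claim.Value}");
+}
+```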
+
+## References
+
+- [What is the Microsoft identity platform?](https://learn.microsoft.com/entra/identity-platform/v2-overview)
+
+>[!div class="step-by-step"]
+>[Previous](cloud-native-security.md)
+>[Next](security-concepts.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/keycloak.md b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/keycloak.md
new file mode 100644
index 0000000000000..44229baa4655f
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/keycloak.md
+---
+title: Keycloak
+description: Architecting Cloud Native .NET Apps for Azure | Keycloak
+ms.date: 04/06/2022
+---
+
+# Keycloak: An open source identity and access management solution
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+Keycloak, an open-source identity and access management (IAM) solution, is a robust platform designed to simplify the complexities of modern authentication and authorization processes.
+
+## Simplifying authentication with Single Sign-On (SSO)
+
+Keycloak's SSO capability is a significant feature that streamlines the user experience. It allows users to authenticate once and gain access to multiple applications without the need to log in again for each service. This not only enhances user convenience but also reduces the potential for password fatigue and the associated security risks.
+
+## Social login and identity brokering
+
+The platform supports social login, which enables users to sign in with their existing social media accounts, such as Facebook or Google, through a straightforward configuration in the admin console. Additionally, Keycloak can act as an identity broker, interfacing with other OpenID Connect or SAML 2.0 identity providers, further centralizing and simplifying user authentication.
+
+## User federation and management
+
+Keycloak can integrate with existing LDAP or Active Directory servers, allowing organizations to leverage their current user directories. For those with users in non-standard stores, Keycloak offers the flexibility to implement custom user federation providers.
+
+## Comprehensive administration and account management
+
+Administrators benefit from Keycloak's comprehensive administration console, which provides centralized control over the server's features, including user federation, identity brokering, and the creation of fine-grained authorization policies. Users, on the other hand, can manage their accounts through the account management console, handling tasks such as profile updates, password changes, and two-factor authentication setup.
+
+## Adherence to standard protocols
+
+Keycloak adheres to standard protocols like OpenID Connect, OAuth 2.0, and SAML, ensuring compatibility and ease of integration with a wide range of applications and services.
+
+## Fine-grained authorization services
+
+Beyond role-based authorization, Keycloak offers fine-grained authorization services, allowing administrators to define precise access policies for their services directly from the administration console.
+
+## Keycloak and .NET Aspire
+
+The .NET Aspire stack includes a built-in integration to help you interact with Keycloak. As with other .NET Aspire integrations, you create a Keycloak backing service in the .NET Aspire app host project, and then pass it to the microservices that authenticate users.
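+A minimal app host sketch might look like the following. It assumes the `Aspire.Hosting.Keycloak` integration package, and the resource and project names (`keycloak`, `customers-api`) are illustrative:
+
+```csharp
+var builder = DistributedApplication.CreateBuilder(args);
+
+// Register Keycloak as a backing service; a data volume persists
+// realm configuration and users across container restarts.
+var keycloak = builder.AddKeycloak("keycloak", port: 8080)
+                      .WithDataVolume();
+
+// Pass the Keycloak resource to each microservice that authenticates users.
+builder.AddProject<Projects.Customers_Api>("customers-api")
+       .WithReference(keycloak);
+
+builder.Build().Run();
+```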
+In each of those microservices, you can add authentication types and configure their options to identify users. For example, this code adds JSON Web Token (JWT) bearer authentication:
+
+```csharp
+builder.Services.AddAuthentication()
+    .AddKeycloakJwtBearer("keycloak", realm: "eShop", options =>
+    {
+        options.Audience = "customers.api";
+    });
+```
+
+## References
+
+For more detailed information and the latest updates on Keycloak, you can visit the [official website](https://www.keycloak.org/).
+
+>[!div class="step-by-step"]
+>[Previous](identity-server.md)
+>[Next](../testing-distributed-apps/challenges-of-distributed-app-testing.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/always-encrypted.png b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/always-encrypted.png
new file mode 100644
index 0000000000000..c3f91d7df6261
Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/always-encrypted.png differ
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/application-types.png b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/application-types.png
new file mode 100644
index 0000000000000..2c2a370e8ad29
Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/application-types.png differ
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/check-rbac.png b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/check-rbac.png
new file mode 100644
index 0000000000000..a2f587fe1f0fc
Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/check-rbac.png differ
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/cosmos-encryption.png b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/cosmos-encryption.png
new file mode 100644
index 0000000000000..d0790c7f3e785
Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/cosmos-encryption.png differ
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/rbac-role-definition.png b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/rbac-role-definition.png
new file mode 100644
index 0000000000000..b923bb14f35c3
Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/rbac-role-definition.png differ
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/rbac-security-principal.png b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/rbac-security-principal.png
new file mode 100644
index 0000000000000..4c23c04dee0a4
Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/rbac-security-principal.png differ
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/ssl-report.png b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/ssl-report.png
new file mode 100644
index 0000000000000..dec9720cbe839
Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/ssl-report.png
differ
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/virtual-network.png b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/virtual-network.png
new file mode 100644
index 0000000000000..08ad5ddc61527
Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/media/virtual-network.png differ
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/security-concepts.md b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/security-concepts.md
new file mode 100644
index 0000000000000..91d2c278d71b8
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-identity/security-concepts.md
+---
+title: Cloud security concepts
+description: Architecting Cloud Native .NET Apps for Azure | Cloud security concepts
+ms.date: 06/03/2024
+---
+
+# Cloud security concepts
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+In the digital age, cloud security is paramount for protecting sensitive data as businesses and individuals increasingly rely on cloud services. This topic delves into the key concepts of cloud security, including SSL, TLS, REST, and secret management, which form the bedrock of secure cloud operations.
+
+## Identity and Access Management (IAM)
+
+IAM systems are essential for controlling users' access to cloud resources. They authenticate and authorize individuals to access specific resources, ensuring that only the right people have the right access at the right times.
+
+## SSL (Secure Sockets Layer) and TLS (Transport Layer Security)
+
+SSL and TLS are cryptographic protocols designed to provide secure communication over computer networks and the Internet. SSL, the predecessor to TLS, was widely used to secure transactions on the web, but TLS has largely replaced it due to improved security features. Both protocols use a combination of asymmetric and symmetric encryption to ensure that data transmitted between the client and server is protected. This prevents eavesdropping, tampering, and message forgery.
+
+## REST (Representational State Transfer)
+
+REST is an architectural style for designing networked applications. It relies on the stateless, client-server, cacheable communications protocol HTTP. RESTful services benefit cloud security by using standard HTTP methods that network security devices understand, so they typically require no special configuration when traversing firewalls.
+
+## Secret management
+
+Secret management refers to the tools and methods for managing digital authentication credentials (secrets), including passwords, keys, API keys, and tokens. In cloud environments, secret management is critical because it ensures that secrets are stored, transmitted, and used securely, without exposing them to unauthorized entities. Effective secret management systems help in rotating, controlling, and auditing secrets throughout their lifecycle.
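+As a minimal sketch of this principle, the following code retrieves a secret from Azure Key Vault at run time using the `Azure.Security.KeyVault.Secrets` and `Azure.Identity` packages, instead of embedding the secret in configuration. The vault URI and secret name are hypothetical:
+
+```csharp
+using Azure.Identity;
+using Azure.Security.KeyVault.Secrets;
+
+// DefaultAzureCredential resolves a managed identity in Azure,
+// or developer credentials locally, so no secret is stored in code.
+var client = new SecretClient(
+    new Uri("https://my-vault.vault.azure.net/"), // hypothetical vault
+    new DefaultAzureCredential());
+
+KeyVaultSecret secret = await client.GetSecretAsync("database-password");
+Console.WriteLine($"Retrieved secret '{secret.Name}'.");
+```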
+
+>[!div class="step-by-step"]
+>[Previous](identity.md)
+>[Next](code-security.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/application-resiliency-patterns.md b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/application-resiliency-patterns.md
new file mode 100644
index 0000000000000..1ef23d359ce5a
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/application-resiliency-patterns.md
+---
+title: Application resiliency patterns
+description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Application resiliency patterns
+author:
+ms.date: 04/17/2024
+---
+
+# Application resiliency patterns
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+The first line of defense for your distributed application is resiliency.
+
+While you could invest considerable time writing your own resiliency framework, there's no need to do so with the latest releases of .NET. The .NET platform now includes a set of libraries that provide resiliency features out of the box. These libraries are built on top of the [Polly](https://github.com/App-vNext/Polly) library. Polly defines a number of resiliency strategies that you can use to make your applications more resilient to transient faults. These strategies include:
+
+| Strategy | Description |
+| :-------- | :-------- |
+| Retry | Retries designated operations when they fail. |
+| Circuit breaker | Blocks requested operations for a predefined period when faults exceed a configured threshold. |
+| Timeout | Places a limit on the duration for which a caller can wait for a response. |
+| Bulkhead | Constrains actions to a fixed-size resource pool to prevent failing calls from swamping a resource. |
+| Cache | Stores responses automatically. |
+| Fallback | Defines structured behaviors that attempt to handle failures. |
+
+Note that these resiliency strategies apply to request messages, whether they come from an external client or from a back-end service. The goal is to compensate for a service that might be momentarily unavailable. Such short-lived interruptions typically manifest themselves with the HTTP status codes shown in the following table.
+
+| HTTP Status Code | Cause |
+| :-------- | :-------- |
+| 404 | Not Found |
+| 408 | Request timeout |
+| 429 | Too many requests (your request has likely been throttled) |
+| 502 | Bad gateway |
+| 503 | Service unavailable |
+| 504 | Gateway timeout |
+
+Not all 400 and 500 status codes should be retried. For example, a 403 status code indicates that the requested operation is forbidden. The caller isn't authorized and won't be permitted to successfully complete the operation no matter how many times they retry.
+
+Take care to retry only those operations caused by temporary failures that might succeed on subsequent attempts.
+
+Next, let's expand on the retry and circuit breaker patterns.
+
+## Retry pattern
+
+In a distributed cloud-native environment, calls to services and cloud resources can fail because of transient (short-lived) failures, which correct themselves after a brief period. Implementing a retry strategy helps a cloud-native service mitigate these scenarios.
+
+The [Retry pattern](/azure/architecture/patterns/retry) enables a service to retry a failed request operation a configurable number of times with an exponentially increasing wait time.
+
+:::image type="content" source="media/retry-pattern.png" alt-text="A diagram showing a retry pattern in action." border="false":::
+
+**Figure 9-2**. Retry pattern in action
+
+In the previous figure, a retry pattern has been implemented for a request operation. It's configured to allow up to four retries before failing, with a backoff interval (wait time) of two seconds that exponentially doubles for each subsequent attempt.
+
+- The first invocation fails and returns an HTTP status code of 500. The application waits for two seconds and retries the call.
+- The second invocation also fails and returns an HTTP status code of 500. The application now doubles the backoff interval to four seconds and retries the call.
+- Finally, the third call succeeds.
+- In this scenario, the retry operation would have attempted up to four retries while doubling the backoff duration before failing the call.
+- Had the fourth retry attempt failed, a fallback policy would be invoked to gracefully handle the problem.
+
+It's important to increase the backoff period before retrying the call to allow the service time to self-correct. It's a best practice to implement an exponentially increasing backoff. It allows adequate correction time for the fault and prevents the temporarily failed microservice from being flooded with calls when it restarts.
+
+## Circuit breaker pattern
+
+While the retry pattern can help salvage a request entangled in a partial failure, there are situations where failures are caused by unanticipated events that require longer periods of time to resolve. These faults can range in severity from a partial loss of connectivity to the complete failure of a service. In these situations, it's pointless for an application to continually retry an operation that is unlikely to succeed.
+
+Executing continual retry operations against a non-responsive service can create a denial-of-service scenario: you flood your service with calls that exhaust resources such as memory, threads, and database connections. Such a flood of requests can cause further failures in unrelated parts of the system that use the same resources.
+
+In these situations, it would be preferable for the operation to fail immediately and only attempt to invoke the service if it's likely to succeed.
+
+The [Circuit Breaker pattern](/azure/architecture/patterns/circuit-breaker) can prevent an application from repeatedly trying an operation that's likely to fail. After a pre-defined number of failed calls, it blocks all traffic to the service. Periodically, it will allow a trial call to determine whether the fault has resolved.
+
+:::image type="content" source="media/circuit-breaker-pattern.png" alt-text="A diagram showing the circuit breaker pattern in action." border="false":::
+
+**Figure 9-3**. Circuit breaker pattern in action
+
+In the previous figure, a circuit breaker pattern has been added to the original retry pattern. Note that after 100 failed requests, the circuit breaker opens and blocks further calls to the service. The CheckCircuit value, set at 30 seconds, specifies how often the library allows one request to proceed to the service. If that call succeeds, the circuit closes and the service is once again available to traffic.
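+To make these patterns concrete, here's a minimal sketch using Polly's `ResiliencePipeline` API that combines an exponential-backoff retry with a circuit breaker. The option values echo the figures above, but they're illustrative rather than recommendations:
+
+```csharp
+using Polly;
+using Polly.CircuitBreaker;
+using Polly.Retry;
+
+// Retry up to four times, doubling the delay from a two-second baseline;
+// open the circuit for 30 seconds when the failure threshold is exceeded.
+ResiliencePipeline pipeline = new ResiliencePipelineBuilder()
+    .AddRetry(new RetryStrategyOptions
+    {
+        MaxRetryAttempts = 4,
+        Delay = TimeSpan.FromSeconds(2),
+        BackoffType = DelayBackoffType.Exponential
+    })
+    .AddCircuitBreaker(new CircuitBreakerStrategyOptions
+    {
+        FailureRatio = 0.5,
+        MinimumThroughput = 10,
+        BreakDuration = TimeSpan.FromSeconds(30)
+    })
+    .Build();
+
+using var http = new HttpClient();
+
+// Every call executed through the pipeline gets both behaviors.
+var response = await pipeline.ExecuteAsync(
+    async token => await http.GetAsync("https://example.com/api/orders", token));
+```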
+Keep in mind that the point of the circuit breaker pattern is *different* from that of the retry pattern. The retry pattern enables an application to retry an operation in the expectation that it will succeed. The circuit breaker pattern prevents an application from retrying an operation that is likely to fail. Typically, an application will *combine* these two patterns by using the retry pattern to invoke an operation through a circuit breaker.
+
+## Testing for resiliency
+
+You can't always test for resiliency in the same way that you test application functionality, by running unit tests, integration tests, and so on. Instead, you must test how the end-to-end workload performs under failure conditions, which only occur intermittently. For example, you can inject failures by crashing processes, expiring certificates, making dependent services unavailable, and so on. Frameworks like [Chaos Monkey](https://github.com/Netflix/chaosmonkey) can be used for such chaos testing.
+
+Application resiliency is a must for handling problematic request operations, but it's only half of the story. Next, we cover resiliency features available in the Azure cloud.
+
+>[!div class="step-by-step"]
+>[Previous](cloud-native-resiliency.md)
+>[Next](cloud-infrastructure-resiliency-azure.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/cloud-infrastructure-resiliency-azure.md b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/cloud-infrastructure-resiliency-azure.md
new file mode 100644
index 0000000000000..371bfd3225292
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/cloud-infrastructure-resiliency-azure.md
+---
+title: Cloud infrastructure resiliency with Azure
+description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Cloud infrastructure resiliency with Azure
+author:
+ms.date: 04/06/2022
+---
+
+# Azure platform resiliency
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+Building a reliable application in the cloud is different from traditional on-premises application development. While historically you purchased higher-end hardware to scale up, in a cloud environment you scale out. Instead of trying to prevent failures, the goal is to minimize their effects and keep the system stable.
+
+That said, reliable cloud applications display distinct characteristics:
+
+- They're resilient, recover gracefully from problems, and continue to function.
+- They're highly available (HA) and run as designed in a healthy state with no significant downtime.
+
+Understanding how these characteristics work together, and how they affect cost, is essential to building a reliable cloud-native application. We'll next look at ways that you can build resiliency and availability into your cloud-native applications with features from the Azure cloud.
+
+## Design with resiliency
+
+We've said that resiliency enables your application to react to failure and still remain functional. The whitepaper, [Resilience in Azure](https://azure.microsoft.com/mediahandler/files/resourcefiles/resilience-in-azure-whitepaper/Resilience%20in%20Azure.pdf), provides guidance for achieving resilience in the Azure platform. Here are some key recommendations:
+
+- **Hardware failure.** Build redundancy into the application by deploying components across different fault domains. For example, ensure that Azure VMs are placed in different racks by using Availability Sets.
+
+- **Datacenter failure.** Build redundancy into the application with fault isolation zones across datacenters.
For example, ensure that Azure VMs are placed in different fault-isolated datacenters by using Azure Availability Zones.
+
+- **Regional failure.** Replicate the data and components into at least one other region so that applications can be quickly recovered. For example, use Azure Site Recovery to replicate Azure VMs to another Azure region.
+
+- **Heavy load.** Load balance across instances to handle spikes in usage. For example, put two or more Azure VMs behind a load balancer to distribute traffic to all VMs.
+
+- **Accidental data deletion or corruption.** Back up data so it can be restored if there's any deletion or corruption. For example, use Azure Backup to periodically back up your Azure VMs.
+
+## Design with redundancy
+
+Failures vary in the scope of their impact. A hardware failure, such as a failed disk, can affect a single node in a cluster. A failed network switch could affect an entire server rack. Less common failures, such as loss of power, could disrupt a whole datacenter. Rarely, an entire region becomes unavailable.
+
+[Redundancy](/azure/architecture/guide/design-principles/redundancy) is one way to provide application resilience. The exact level of redundancy needed depends upon your business requirements and will affect both the cost and complexity of your system. For example, a multi-region deployment has more redundancy but is more expensive and more complex to manage than a single-region deployment. You'll need operational procedures to manage failover and failback. The additional cost and complexity might be justified for some business scenarios, but not others.
+
+To architect redundancy, you need to identify the critical paths in your application, and then determine if there's redundancy at each point in the path. If a subsystem should fail, will the application fail over to something else? You also need a clear understanding of the features built into the Azure cloud platform that you can use to meet your redundancy requirements. Here are recommendations for architecting redundancy:
+
+- **Deploy multiple instances of services.** If your application depends on a single instance of a service, it creates a single point of failure. Provisioning multiple instances improves both resiliency and scalability. When hosting in Azure Kubernetes Service, you can declaratively configure redundant instances (replica sets) in the Kubernetes manifest file. The replica count value can be managed programmatically, in the portal, or through autoscaling features.
+
+- **Use a load balancer.** Load balancing distributes your application's requests to healthy service instances and automatically removes unhealthy instances from rotation. When deploying to Kubernetes, load balancing can be specified in the Kubernetes manifest file in the Services section.
+
+- **Plan for multiregion deployment.** If you deploy your application to a single region, and that region becomes unavailable, your application will also become unavailable. This may be unacceptable under the terms of your application's service level agreements. If so, consider deploying your application and its services across multiple regions. For example, an Azure Kubernetes Service (AKS) cluster is deployed to a single region. To protect your system from a regional failure, you might deploy your application to multiple AKS clusters across different regions and use the [Paired Regions](/azure/virtual-machines/regions#region-pairs) feature to coordinate platform updates and prioritize recovery efforts.
+
+- **Enable [geo-replication](/azure/sql-database/sql-database-active-geo-replication).** Geo-replication for services such as Azure SQL Database and Cosmos DB will create secondary replicas of your data across multiple regions. While both services automatically replicate data within the same region, geo-replication protects you against a regional outage by enabling you to fail over to a secondary region. Another best practice for geo-replication centers around storing container images. To deploy a service in AKS, you need to store and pull the image from a repository. Azure Container Registry integrates with AKS and can securely store container images. To improve performance and availability, consider geo-replicating your images to a registry in each region where you have an AKS cluster. Each AKS cluster then pulls container images from the local container registry in its region, as shown in Figure 9-4:
+
+  :::image type="content" source="media/replicated-resources.png" border="false" alt-text="A diagram showing replicated resources across multiple regions.":::
+
+  **Figure 9-4**. Replicated resources across multiple regions
+
+- **Implement a DNS traffic load balancer.** [Azure Traffic Manager](/azure/traffic-manager/traffic-manager-overview) provides high availability for critical applications by load-balancing at the DNS level. It can route traffic to different regions based on geography, cluster response time, and even application endpoint health. For example, Azure Traffic Manager can direct customers to the closest AKS cluster and application instance. If you have multiple AKS clusters in different regions, use Traffic Manager to control how traffic flows to the applications that run in each cluster.
+
+## Design for scalability
+
+The cloud thrives on scaling. The ability to increase or decrease system resources to address increasing or decreasing system load is a key tenet of the Azure cloud. But to scale an application effectively, you need an understanding of the scaling features of each Azure service that you include in your application. Here are recommendations for effectively implementing scaling in your system.
+
+- **Design for scaling.** An application must be designed for scaling. To start, services should be stateless so that requests can be routed to any instance. Having stateless services also means that adding or removing an instance doesn't adversely impact current users.
+
+- **Partition workloads.** Decomposing domains into independent, self-contained microservices enables each service to scale independently of others. Typically, services will have different scalability needs and requirements. Partitioning enables you to scale only what needs to be scaled without the unnecessary cost of scaling an entire application.
+
+- **Favor scale-out.** Cloud-based applications favor scaling out resources as opposed to scaling up. Scaling out (also known as horizontal scaling) involves adding more service resources to an existing system to meet and share a desired level of performance. Scaling up (also known as vertical scaling) involves replacing existing resources with more powerful hardware (more disk, memory, and processing cores). Scaling out can be invoked automatically with the autoscaling features available in some Azure cloud resources. Scaling out across multiple resources also adds redundancy to the overall system. Finally, scaling up a single resource is typically more expensive than scaling out across many smaller resources.
Figure 9-6 shows the two approaches:
+
+  :::image type="content" source="media/scale-up-scale-out.png" alt-text="A diagram showing the differences between scale up (vertical scaling) versus scale out (horizontal scaling)." border="false":::
+
+  **Figure 9-6**. Scale up versus scale out
+
+- **Scale proportionally.** When scaling a service, think in terms of *resource sets*. If you were to scale out a specific service dramatically, what impact would that have on back-end data stores, caches, and dependent services? Some resources such as Cosmos DB can scale out proportionally, while many others can't. You want to ensure that you don't scale out a resource to a point where it will exhaust other associated resources.
+
+- **Avoid affinity.** A best practice is to ensure a node doesn't require local affinity, often referred to as a *sticky session*. A request should be able to route to any instance. If you need to persist state, it should be saved to a distributed cache, such as [Azure Redis cache](https://azure.microsoft.com/services/cache/).
+
+- **Take advantage of platform autoscaling features.** Use built-in autoscaling features whenever possible, rather than custom or third-party mechanisms. Where possible, use scheduled scaling rules to ensure that resources are available without a startup delay, but add reactive autoscaling to the rules as appropriate, to cope with unexpected changes in demand. For more information, see [Autoscaling guidance](/azure/architecture/best-practices/auto-scaling).
+
+- **Scale out aggressively.** Scale out aggressively so that you can quickly meet immediate spikes in traffic without losing business. Then scale in (that is, remove unneeded instances) conservatively to keep the system stable. A simple way to implement this is to set the cool-down period, which is the time to wait between scaling operations, to five minutes for adding resources and up to 15 minutes for removing instances.
+
+## Built-in retry in services
+
+We encouraged the best practice of implementing programmatic retry operations in an earlier section. Keep in mind that many Azure services and their corresponding client SDKs also include retry mechanisms. The following list summarizes retry features in many of the Azure services that are discussed in this book:
+
+- **Azure Cosmos DB.** The client API automatically retries failed attempts. The number of retries and maximum wait time are configurable. Exceptions thrown by the client API are either requests that exceed the retry policy or non-transient errors.
+
+- **Azure Redis Cache.** The Redis StackExchange client uses a connection manager class that retries failed attempts. The number of retries, specific retry policy, and wait time are all configurable.
+
+- **Azure Service Bus.** The Service Bus client exposes a [RetryPolicy class](xref:Microsoft.ServiceBus.RetryPolicy) that can be configured with a back-off interval, a retry count, and the maximum time an operation can take. The default policy is nine maximum retry attempts with a 30-second backoff period between attempts.
+
+- **Azure SQL Database.** Retry support is provided when you use the [Entity Framework Core](/ef/core/miscellaneous/connection-resiliency) library.
+
+- **Azure Storage.** The storage client library supports retry operations. The strategies vary across Azure Storage tables, blobs, and queues. Alternate retries switch between primary and secondary storage service locations when the geo-redundancy feature is enabled.
+
+- **Azure Event Hubs.** The Event Hubs client library features a RetryPolicy property, which includes a configurable exponential backoff feature.
+
+>[!div class="step-by-step"]
+>[Previous](application-resiliency-patterns.md)
+>[Next](resilient-communication.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/cloud-native-resiliency.md b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/cloud-native-resiliency.md
new file mode 100644
index 0000000000000..a7975f41948ae
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/cloud-native-resiliency.md
+---
+title: Cloud-native resiliency
+description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Cloud-native resiliency
+author:
+ms.date: 04/17/2024
+---
+
+# Cloud-native resiliency
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+Your cloud-native applications must embrace the partial failures that will inevitably occur. Your apps should continue working when limited failures occur and recover quickly from more serious events. In cloud-native applications, where there are multiple microservices and backing services running in different containers and potentially in different locations, failures can be more common, even if you're using platforms with robust Service Level Agreements (SLAs).
+
+Resiliency is the ability of your system to react to failure and still remain functional. It's not about avoiding failures, but about accepting them and constructing your cloud-native services to recover from them. You want to return to a fully functioning state as quickly as possible.
+
+As you've learned, unlike traditional monolithic applications, where everything runs together in a single process, cloud-native systems embrace a distributed architecture:
+
+:::image type="content" source="media/distributed-cloud-native-environment.png" alt-text="Distributed cloud-native environment" border="false":::
+
+**Figure 9-1**. Distributed cloud-native environment
+
+Operating in this environment, a service must be sensitive to many different challenges:
+
+- Unexpected network latency - The time for a service request to travel to the receiver and back may become unpredictable.
+
+- [Transient faults](/azure/architecture/best-practices/transient-faults) - Short-lived network connectivity errors may arise.
+
+- Slow synchronous operations - If you call a remote function synchronously, your local service is blocked by slow responses.
+
+- Crashed hosts - A microservice host may crash and need to be restarted or moved.
+
+- Overloaded microservices - Microservices that have too much demand may become unresponsive for a time.
+
+- Orchestrator operations - Your microservice orchestrator may initiate a rolling upgrade or move a service from one node to another.
+
+- Hardware failures.
+
+Cloud platforms can detect and mitigate many of these infrastructure issues. They may restart, scale out, and even redistribute your service to a different node. However, to take full advantage of this built-in protection, you must design your services to react to it and thrive in this dynamic environment.
+
+In the following sections, we'll explore defensive techniques that your service and managed cloud resources can use to minimize downtime and disruption caused by these challenges.
+ +>[!div class="step-by-step"] +>[Previous](../chpt8-data-patterns/distributed-data.md) +>[Next](application-resiliency-patterns.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/aks-traffic-manager.png b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/aks-traffic-manager.png new file mode 100644 index 0000000000000..4dbb6168a9a5d Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/aks-traffic-manager.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/circuit-breaker-pattern.png b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/circuit-breaker-pattern.png new file mode 100644 index 0000000000000..009d30dbdf2a1 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/circuit-breaker-pattern.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/cover-thumbnail.png b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/cover-thumbnail.png new file mode 100644 index 0000000000000..e9c523d88cff0 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/cover-thumbnail.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/distributed-cloud-native-environment.png b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/distributed-cloud-native-environment.png new file mode 100644 index 0000000000000..45f1f0b503194 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/distributed-cloud-native-environment.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/istio-control-and-data-plane.png b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/istio-control-and-data-plane.png new file mode 100644 index 0000000000000..1f65453da2727 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/istio-control-and-data-plane.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/replicated-resources.png b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/replicated-resources.png new file mode 100644 index 0000000000000..a01232e4aa332 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/replicated-resources.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/retry-pattern.png b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/retry-pattern.png new file mode 100644 index 0000000000000..b073049725252 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/retry-pattern.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/scale-up-scale-out.png 
b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/scale-up-scale-out.png new file mode 100644 index 0000000000000..b62e67c821625 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/scale-up-scale-out.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/service-mesh-with-side-car.png b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/service-mesh-with-side-car.png new file mode 100644 index 0000000000000..f96cb729d5d19 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/media/service-mesh-with-side-car.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/resiliency-with-aspire.md b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/resiliency-with-aspire.md new file mode 100644 index 0000000000000..1b707b1b576a2 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/resiliency-with-aspire.md @@ -0,0 +1,80 @@
+---
+title: Resiliency with .NET
+description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Resiliency with .NET
+author:
+ms.date: 04/06/2022
+---
+
+# Add resiliency with .NET
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+You've seen how important it is to design your cloud-native applications to be resilient, and the different approaches you can take to achieve this. Now we'll explore how you can implement application resiliency using the features built into .NET.
+
+Application resiliency is provided in .NET by two packages:
+
+- `Microsoft.Extensions.Resilience`
+- `Microsoft.Extensions.Http.Resilience`
+
+These packages implement the resiliency strategies provided by Polly to make your applications more resilient to transient faults and other issues that can occur in distributed environments.
+
+For detailed information on how to use these packages, see [Introduction to resilient app development](https://learn.microsoft.com/en-us/dotnet/core/resilience).
+
+## Adding fault tolerance to microservices
+
+In a distributed cloud-native environment, HTTP requests are a common way to communicate between services. You can use the `Microsoft.Extensions.Http.Resilience` package to add fault tolerance between your services.
+
+The high-level steps are:
+
+- Add the `Microsoft.Extensions.Http.Resilience` package to the project that needs to handle failures.
+- Call the `AddStandardResilienceHandler` extension method when configuring the `HttpClient` factory.
+- Configure the resiliency policies for the `HttpClient` factory.
+
+The default `AddStandardResilienceHandler` sets sensible values for the timeout, retries, and circuit breaker policies. But you can choose to customize these policies to suit your application's specific needs.
+
+This manual process of adding the package, modifying the HttpClient, and configuring the policies can be time-consuming and error-prone. There is a better way to add resiliency to your application.
+
+## Add resiliency with .NET Aspire
+
+As you've seen throughout this book, the .NET Aspire stack scales with your application and services. One of the fundamental ideas is to have an opinionated take on how to build better cloud-native applications. This includes adding sensible defaults for resiliency.
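+
+For comparison, a minimal sketch of the manual approach from the previous section might look like this. The `catalog` client name and base address are hypothetical, and the options callback is only needed if you want to change the defaults:
+
+```csharp
+// Program.cs of a microservice that calls another service over HTTP.
+// Requires the Microsoft.Extensions.Http.Resilience NuGet package.
+var builder = WebApplication.CreateBuilder(args);
+
+builder.Services.AddHttpClient("catalog", client =>
+    {
+        // Hypothetical downstream service address.
+        client.BaseAddress = new Uri("https://catalog-service");
+    })
+    // Adds the standard pipeline: rate limiter, total-request timeout,
+    // retry, circuit breaker, and per-attempt timeout.
+    .AddStandardResilienceHandler(options =>
+    {
+        // Optionally adjust the defaults, for example the retry count.
+        options.Retry.MaxRetryAttempts = 5;
+    });
+
+var app = builder.Build();
+app.Run();
+```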
+
+If you've already added the previous .NET packages for resiliency and configured them, you can leave all that code in place. For all your microservices without any added resiliency, you can enroll them in .NET Aspire and gain the benefits of standard HTTP resiliency without any additional code.
+
+The projects you enroll in .NET Aspire have a dependency added to the service defaults project. This project contains the default configuration for resiliency, observability, telemetry, and other features. When you add a new microservice to .NET Aspire, it automatically gets the resiliency configuration.
+
+The code inside the service defaults project that adds the default resiliency configuration is:
+
+```csharp
+public static IHostApplicationBuilder AddServiceDefaults(
+    this IHostApplicationBuilder builder)
+{
+    builder.ConfigureOpenTelemetry();
+
+    builder.AddDefaultHealthChecks();
+
+    builder.Services.AddServiceDiscovery();
+
+    builder.Services.ConfigureHttpClientDefaults(http =>
+    {
+        // Turn on resilience by default
+        http.AddStandardResilienceHandler();
+
+        // Turn on service discovery by default
+        http.AddServiceDiscovery();
+    });
+
+    return builder;
+}
+```
+
+You don't need to change any of your code if the default resiliency settings meet your needs.
+
+## Additional resources
+
+- **Implement resiliency in a cloud-native .NET microservice** \
+
+
+- **Tutorial: Add .NET Aspire to an existing .NET app** \
+
+
+>[!div class="step-by-step"]
+>[Previous](resilient-communication.md)
+>[Next](../monitoring-health/observability-patterns.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/resilient-communication.md b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/resilient-communication.md new file mode 100644 index 0000000000000..526c6db08ea4c --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/cloud-native-resiliency/resilient-communication.md @@ -0,0 +1,79 @@
+---
+title: Resilient communication
+description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Resilient communication
+author:
+ms.date: 04/06/2022
+---
+
+# Resilient communication
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+Throughout this book, we've embraced a microservice-based architectural approach. While such an architecture provides important benefits, it presents many challenges:
+
+- **Out-of-process network communication.** Each microservice communicates over a network protocol that introduces network congestion, latency, and transient faults.
+
+- **Service discovery.** How do microservices discover and communicate with each other when running across a cluster of machines with their own IP addresses and ports?
+
+- **Resiliency.** How do you manage short-lived failures and keep the system stable?
+
+- **Load balancing.** How does inbound traffic get distributed across multiple instances of a microservice?
+
+- **Security.** How are security concerns such as transport-level encryption and certificate management enforced?
+
+- **Distributed monitoring.** How do you correlate and capture traceability and monitoring for a single request across multiple consuming microservices?
+
+You can address these concerns with different libraries and frameworks, but the implementation can be expensive, complex, and time-consuming. You also end up with infrastructure concerns coupled to business logic.
+
+## Service mesh
+
+A better approach is an evolving technology called a *service mesh*. A [service mesh](https://www.nginx.com/blog/what-is-a-service-mesh/) is a configurable infrastructure layer with built-in capabilities to handle service communication and the other challenges mentioned above. It decouples these concerns by moving them into a service proxy. The proxy is deployed into a separate process (called a [sidecar](/azure/architecture/patterns/sidecar)) to provide isolation from business code. However, the sidecar is linked to the service - it's created with it and shares its lifecycle.
+
+:::image type="content" source="media/service-mesh-with-side-car.png" alt-text="A diagram showing a service mesh using sidecars." border="false":::
+
+**Figure 9-7**. Service mesh with a sidecar
+
+In the previous figure, note how the proxy intercepts and manages communication among the microservices and the cluster.
+
+A service mesh is logically split into two disparate components: a [data plane](https://blog.envoyproxy.io/service-mesh-data-plane-vs-control-plane-2774e720f7fc) and a [control plane](https://blog.envoyproxy.io/service-mesh-data-plane-vs-control-plane-2774e720f7fc).
+
+:::image type="content" source="media/istio-control-and-data-plane.png" alt-text="A diagram showing a service mesh control and data plane." border="false":::
+
+**Figure 9-8**. Service mesh control and data plane
+
+Once configured, a service mesh is highly functional. It can retrieve a corresponding pool of instances from a service discovery endpoint. The mesh can then send a request to a specific instance, recording the latency and response type of the result. A mesh can choose the instance most likely to return a fast response based on many factors, including its observed latency for recent requests.
+
+If an instance is unresponsive or fails, the mesh will retry the request on another instance. If an instance repeatedly returns errors, the mesh will evict it from the load-balancing pool and reinstate it after it heals. If a request times out, the mesh can fail the request and then retry it. A mesh captures and emits metrics and distributed tracing to a centralized metrics system.
+
+## Istio and Envoy
+
+While a few service mesh options currently exist, [Istio](https://istio.io/docs/concepts/what-is-istio/) is the most popular at the time of writing. Istio is a joint venture from IBM, Google, and Lyft. It's an open-source offering that can be integrated into a new or existing distributed application. The technology provides a consistent and complete solution to secure, connect, and monitor microservices. Its features include:
+
+- Secure service-to-service communication in a cluster with strong identity-based authentication and authorization.
+- Automatic load balancing for HTTP, [gRPC](https://grpc.io/), WebSockets, and TCP traffic.
+- Fine-grained control of traffic behavior with rich routing rules, retries, failovers, and fault injection.
+- A pluggable policy layer and configuration API supporting access controls, rate limits, and quotas.
+- Automatic metrics, logs, and traces for all traffic within a cluster, including cluster ingress and egress.
+
+A key component for an Istio implementation is a proxy service called the [Envoy proxy](https://www.envoyproxy.io/docs/envoy/latest/intro/what_is_envoy). It runs alongside each service and provides a platform-agnostic foundation for the following features:
+
+- Dynamic service discovery.
+- Load balancing.
+- TLS termination.
+- HTTP and gRPC proxies.
+- Circuit breaker resiliency.
+- Health checks.
+- Rolling updates with [canary](https://martinfowler.com/bliki/CanaryRelease.html) deployments.
+
+As previously discussed, Envoy is deployed as a sidecar to each microservice in the cluster.
+
+## Integration with Azure Kubernetes Services
+
+The Azure cloud embraces Istio and provides direct support for it within Azure Kubernetes Services. The following links can help you get started:
+
+- [Installing Istio in AKS](/azure/aks/istio-install)
+- [Using AKS and Istio](/azure/aks/istio-scenario-routing)
+
+>[!div class="step-by-step"]
+>[Previous](cloud-infrastructure-resiliency-azure.md)
+>[Next](resiliency-with-aspire.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/communication-patterns/communication-patterns.md b/docs/architecture/distributed-cloud-native-apps-containers/communication-patterns/communication-patterns.md new file mode 100644 index 0000000000000..90d63ad67f243 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/communication-patterns/communication-patterns.md @@ -0,0 +1,55 @@
+---
+title: Communication patterns
+description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Communication patterns
+ms.date: 04/06/2024
+---
+
+# Communication patterns
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+When constructing a cloud-native system, communication becomes a significant design decision because the microservices and backing services are loosely coupled. How does a front-end client application communicate with a back-end microservice? How do back-end microservices communicate with each other? What are the principles, patterns, and best practices to consider when implementing communication in cloud-native applications?
+
+## Communication considerations
+
+In a monolithic application, communication is straightforward. The code modules execute together in the same process on the same server. This approach can have performance advantages as everything runs together in shared memory, but results in tightly coupled code that becomes difficult to maintain, evolve, and scale.
+
+Cloud-native systems implement a microservice-based architecture with many independent microservices. Each microservice runs inside a container deployed in a *cluster*.
+
+A cluster groups a pool of containers together to form a highly available environment. The containers are managed with an orchestration tool, which is responsible for deploying and managing the containerized microservices. Figure 5-1 shows a [Kubernetes](https://kubernetes.io) cluster. This arrangement could be deployed in any Kubernetes cluster, including the Azure cloud with the fully managed [Azure Kubernetes Services](/azure/aks/intro-kubernetes).
+
+![A Kubernetes cluster in Azure](media/kubernetes-cluster.png)
+
+**Figure 5-1**. A Kubernetes cluster
+
+Across the cluster, microservices communicate with each other through APIs and messaging technologies.
+
+While they provide many benefits, microservices are no free lunch. Local in-process method calls between components are now replaced with network calls. Each microservice must communicate over a network protocol, which adds complexity to your system:
+
+- Network congestion, latency, and transient faults are constant concerns.
+- Resiliency, in which failed requests are automatically retried, is essential.
+- Some calls must be [idempotent](https://restapitutorial.com/introduction/idempotence) to keep a consistent state.
+- Each microservice must authenticate and authorize calls.
+- Each message must be serialized and then deserialized, which can be expensive.
+- Message encryption and decryption become important.
+
+## Communication types
+
+Clients and services can use many different types of communication, each targeting a different scenario and set of goals. Initially, those types of communications can be classified along two axes.
+
+The first axis defines if the protocol is synchronous, asynchronous, or streaming:
+
+- **Synchronous protocol**. HTTP is a synchronous protocol. The client sends a request and waits for a response from the service. That's independent of the client code execution, which could be synchronous, where the thread is blocked, or asynchronous, where the thread isn't blocked and the response will reach a callback eventually. The important point here is that the network protocol (HTTP/HTTPS) is synchronous and the client code can only continue its task when it receives the HTTP server response.
+- **Asynchronous protocol**. Other protocols like Advanced Message Queuing Protocol (AMQP) use asynchronous messages. AMQP is a protocol supported by many operating systems and cloud environments. The client code or message sender usually doesn't wait for a response. It just sends the message to a queue and trusts that it will be delivered.
+- **Streaming communication**. Streaming communication is a specialized form of asynchronous communication that permits a continuous data flow between services. WebSockets aren't a streaming protocol, but they can be used to enable streaming communication between services. Streaming is useful for scenarios where services need to exchange large amounts of data, for event streaming, or when there's a need to maintain a persistent connection for real-time updates.
+
+The second axis defines if the communication has a single receiver or multiple receivers:
+
+- **Single receiver**. Each request must be processed by exactly one receiver or service. An example of this communication is the [Command pattern](https://en.wikipedia.org/wiki/Command_pattern).
+- **Multiple receivers**. Each request can be processed by zero, one, or multiple receivers. This type of communication must be asynchronous. An example is the [publish/subscribe](https://en.wikipedia.org/wiki/Publish%E2%80%93subscribe_pattern) mechanism used in patterns like [Event-driven architecture](https://microservices.io/patterns/data/event-driven-architecture.html). It's based on an event-bus interface or message broker that propagates data updates between multiple microservices through events. It's usually implemented through a service bus or similar artifact like [Azure Service Bus](https://azure.microsoft.com/services/service-bus/) by using [topics and subscriptions](https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-queues-topics-subscriptions#topics-and-subscriptions).
+
+The book [.NET Microservices: Architecture for Containerized .NET Applications](https://dotnet.microsoft.com/download/thank-you/microservices-architecture-ebook), available for free from Microsoft, provides in-depth coverage of communication patterns for microservice applications.
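+
+To make the two axes concrete, here's a hedged sketch contrasting a synchronous HTTP request with an asynchronous, queue-based send using the Azure Service Bus client library. The service address, queue name, and connection string are hypothetical:
+
+```csharp
+using System.Net.Http;
+using Azure.Messaging.ServiceBus;
+
+// Synchronous protocol: the code continues only after the
+// service's HTTP response arrives.
+using var httpClient = new HttpClient();
+string price = await httpClient.GetStringAsync(
+    "https://catalog-service/api/items/42/price");
+
+// Asynchronous protocol: the code hands the message to the broker
+// and continues without waiting for any receiver to process it.
+await using var client = new ServiceBusClient("<service-bus-connection-string>");
+ServiceBusSender sender = client.CreateSender("order-events");
+await sender.SendMessageAsync(new ServiceBusMessage("OrderPlaced:42"));
+```
+
+In the first call, delivery and processing are part of the request itself; in the second, they happen later, decoupled from the sender.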
+ +>[!div class="step-by-step"] +>[Previous](../architecting-distributed-cloud-native-applications/different-distributed-architectures.md) +>[Next](when-to-use-each-approach.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/communication-patterns/media/kubernetes-cluster.png b/docs/architecture/distributed-cloud-native-apps-containers/communication-patterns/media/kubernetes-cluster.png new file mode 100644 index 0000000000000..07c453ce6be94 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/communication-patterns/media/kubernetes-cluster.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/communication-patterns/when-to-use-each-approach.md b/docs/architecture/distributed-cloud-native-apps-containers/communication-patterns/when-to-use-each-approach.md new file mode 100644 index 0000000000000..92ea092be47a0 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/communication-patterns/when-to-use-each-approach.md @@ -0,0 +1,161 @@ +--- +title: How to choose the right communication pattern +description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | How to choose the right communication pattern +ms.date: 04/06/2024 +--- + +# How to choose the right communication pattern + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +Let's look in more detail at each of the different ways applications can handle the communication between their services. In a real cloud-native app you will often use a combination of these patterns to meet the needs of your application. The choice of communication pattern can significantly impact the performance, scalability, and reliability of your system. + +## Synchronous communication + +Synchronous communication involves a direct, real-time interaction between services, where the sender waits for a response from the receiver before proceeding. + +The general approach for synchronous communication is as follows: + +- A client sends a request to a service. +- The client waits for the service to process the request and send back a response. +- The client proceeds only after receiving the response. + +Example use cases include: + +1. **Real-time user interactions**: + + - **Use case**: A user logs into a web application. + - **How**: The authentication service validates credentials and responds immediately to indicate success or failure. + +1. **Immediate data retrieval**: + + - **Use case**: A user requests their account details. + - **How**: The client service queries the account service and waits for the data to display to the user. + +There are many ways to implement synchronous communication. Two common approaches are: + +- **HTTP/HTTPS**: The most common protocol for synchronous communication in web applications. +- **gRPC**: This protocol uses HTTP/2 and supports synchronous communication with additional benefits like multiplexing and lower latency. + +**Pros**: + +- Simple to implement and understand. +- Provides immediate feedback. + +**Cons**: + +- Creates tight coupling between services. +- Can lead to bottlenecks and reduced resilience if services are not highly available. + +## Asynchronous communication + +Asynchronous communication decouples services by allowing the sender to proceed without waiting for a response. Messages are typically queued and processed later. Some messaging systems enable single messages to be distributed to multiple receivers. 
This pattern is appropriate for many communications between microservices in cloud-native apps because a microservice doesn't stop and wait until a response arrives from another component, which may be distant or busy. The sending microservice can continue with other tasks in the meantime.
+
+The general approach for asynchronous communication is as follows:
+
+- A client publishes messages to a queue or message broker.
+- Services that need to process the messages consume them from the queue.
+
+Example use cases include:
+
+1. **Order processing**:
+
+   - **Use case**: A user places an order on an e-commerce site.
+   - **How**: The order service sends the order details to a queue. The order processing service consumes the message and processes the order.
+
+1. **Background jobs**:
+
+   - **Use case**: An administrator requests a report or performs data analysis.
+   - **How**: The microservice sends a message to a queue. A worker service processes the message, creates the report, and delivers it asynchronously.
+
+There are many ways to implement asynchronous communication. Some popular technologies are:
+
+- **RabbitMQ**: A popular message broker that supports various messaging patterns.
+- **Apache Kafka**: A scalable messaging system that can be used for asynchronous messaging.
+- **Azure Service Bus**: A versatile message queue and distribution system hosted in the cloud.
+- **Azure Storage Queues**: A simple message queue manager hosted in the cloud.
+
+**Pros**:
+
+- Microservices remain loosely coupled, which increases their fault tolerance.
+- It's easier to scale out because microservices are loosely coupled. You can invest in extra resources targeted solely at the bottleneck microservice.
+
+**Cons**:
+
+- Asynchronous communication is more complex to implement due to the need for message queues. You must also consider how to ensure eventual data consistency.
+- It may introduce latency in message processing during busy times. When this happens, consider scaling out.
+
+## Streaming communication
+
+Streaming communication involves the continuous transmission of data between services. It's suitable for real-time data processing and event streaming.
+
+The general approach for streaming communication is as follows:
+
+- A continuous flow of data is sent from one service to another.
+- The receiving service processes the data as it arrives, in real time.
+
+Example use cases include:
+
+1. **Real-time analytics**:
+   - **Use case**: Monitoring application performance.
+   - **How**: Application services send telemetry data to a real-time analytics platform, which processes and displays the data.
+
+1. **IoT data processing**:
+   - **Use case**: Sensor data from IoT devices.
+   - **How**: Devices continuously stream event data to a processing service, which analyzes the data and triggers actions.
+
+There are many ways to implement streaming communication. Some approaches are:
+
+- **gRPC**: This protocol supports streaming communication through server-side, client-side, and bidirectional streaming.
+- **WebSockets**: This protocol provides full-duplex communication channels over a single TCP connection.
+- **Apache Kafka**: Kafka includes a streaming API in addition to asynchronous communications.
+
+**Pros**:
+
+- Streaming is ideal for real-time data processing and low-latency requirements.
+- It's efficient for handling continuous data flows and large volumes of data.
+
+**Cons**:
+
+- Streaming is resource-intensive and requires careful management of continuous data streams.
+- It's complex to implement and maintain.
+
+## Choosing the right pattern
+
+When choosing among synchronous, asynchronous, and streaming communication patterns for your microservices, consider the following factors:
+
+1. **Latency requirements**
+
+   - **Low latency**: Use synchronous communication when you need immediate feedback. For example, use HTTP or gRPC unary calls.
+   - **Moderate to high latency**: Use asynchronous communication to decouple processing times. For example, use RabbitMQ, Kafka, or Azure Service Bus as a message broker.
+   - **Continuous, real-time**: Use streaming for real-time data flows. For example, use Kafka, gRPC streaming, or WebSockets.
+
+1. **Coupling and scalability**
+
+   - **Tight coupling**: Synchronous communication is simpler but couples services tightly.
+   - **Loose coupling**: Asynchronous communication offers better fault tolerance and scalability.
+   - **Real-time data**: Streaming is suitable for scenarios requiring continuous data processing and real-time updates.
+
+1. **Complexity and maintenance**
+
+   - **Simplicity**: Synchronous communication is straightforward to implement.
+   - **Decoupling**: Asynchronous communication requires message brokers and additional handling for eventual consistency.
+   - **Real-time needs**: Streaming requires managing continuous data flows and maintaining high performance.
+
+Understanding the strengths and limitations of each communication pattern is crucial for designing robust, scalable, and efficient microservice architectures. By carefully evaluating the requirements and constraints of your system, you can choose the appropriate communication pattern to meet your architectural goals.
+
+## Messaging in .NET Aspire
+
+If you choose to use the .NET Aspire stack to build your cloud-native app, synchronous communications must be implemented with HTTP, HTTPS, or gRPC calls. .NET Aspire is not involved in this communication.
+
+For asynchronous communications, .NET Aspire has integrations that help you work with queues in Azure Storage, RabbitMQ, Azure Service Bus, Apache Kafka, and NATS. You create these backing services in the app host project, and pass them to each microservice that uses them. In the microservices, you can use dependency injection to obtain objects that store and retrieve messages from queues in your preferred service. The sketch at the end of this page shows an example of this wiring.
+
+For streaming communications, use the .NET Aspire Apache Kafka integration.
+
+Using .NET Aspire integrations also helps to improve resiliency. Some integrations can automatically retry requests that have failed, and you can configure timeouts for these retries.
+
+In the next chapter, we'll explore service-to-service communication patterns in more detail.
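+
+As referenced above, here's a hedged sketch of wiring RabbitMQ into a .NET Aspire app host and a consuming microservice. The project name `Projects.OrderService`, the resource name `messaging`, and the service name `orderservice` are hypothetical:
+
+```csharp
+// App host project: declare the broker once and reference it
+// from each microservice that needs it.
+var builder = DistributedApplication.CreateBuilder(args);
+
+var messaging = builder.AddRabbitMQ("messaging");
+
+builder.AddProject<Projects.OrderService>("orderservice")
+       .WithReference(messaging);
+
+builder.Build().Run();
+```
+
+And in the consuming microservice's `Program.cs`:
+
+```csharp
+// The integration registers a RabbitMQ IConnection with dependency
+// injection, configured from the app host reference.
+builder.AddRabbitMQClient("messaging");
+```
+
+With this wiring in place, the microservice can resolve a configured connection from dependency injection instead of constructing one itself.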
+
+>[!div class="step-by-step"]
+>[Previous](communication-patterns.md)
+>[Next](../service-to-service-communication-patterns/introduction.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/deploying-distributed-apps/deploy-with-dot-net-aspire.md b/docs/architecture/distributed-cloud-native-apps-containers/deploying-distributed-apps/deploy-with-dot-net-aspire.md new file mode 100644 index 0000000000000..f79bf1bf6cf21 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/deploying-distributed-apps/deploy-with-dot-net-aspire.md @@ -0,0 +1,42 @@
+---
+title: Deployment with or without .NET Aspire
+description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Deployment with or without .NET Aspire
+author:
+ms.date: 06/12/2024
+---
+
+# Deployment with or without .NET Aspire
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+Deploying distributed applications is a complex process that involves various steps and considerations to ensure that the application runs smoothly across different environments. The introduction of .NET Aspire has streamlined this process significantly, offering a more efficient and reliable way to deploy cloud-native apps. However, it's also possible to deploy these applications without .NET Aspire, and you should continue to use conventional methods for existing applications because the configuration and testing of deployment have already been done.
+
+## Without .NET Aspire
+
+Deploying distributed applications without .NET Aspire involves a more hands-on approach. Developers can still use ASP.NET Core to create and manage distributed applications, but they miss out on the guidance and automation provided by .NET Aspire. The process typically involves manually setting up the infrastructure, configuring the environments, and ensuring that all components of the distributed system communicate effectively.
+
+Without the streamlined process of .NET Aspire, developers need to be more vigilant about each step of the deployment. This includes creating deployment manifests manually, setting up CI/CD pipelines, and handling container orchestration platforms like Kubernetes without the assistance of .NET Aspire's tools.
+
+For existing applications, you should continue to use conventional methods because the deployment setup has already been done.
+
+## With .NET Aspire
+
+.NET Aspire simplifies the deployment of distributed apps by providing a cloud-agnostic framework that supports various platforms. It offers a deployment manifest that describes the structure of applications and the necessary properties for deployment, such as environment variables. This manifest enables deployment tools from Microsoft and other cloud providers to understand and manage the application effectively.
+
+For instance, deploying to Azure Container Apps is made easier with .NET Aspire. The Azure Developer CLI (`azd.exe`) has been extended to support .NET Aspire applications, allowing developers to provision and deploy resources on Azure with ease. Additionally, .NET Aspire apps are designed to emit telemetry using OpenTelemetry, which can be directed to Azure Monitor or Application Insights for monitoring and analysis.
+
+When deploying to Kubernetes, .NET Aspire applications require mapping the JSON manifest to a Kubernetes YAML manifest file. This can be done using the **Aspir8** tool, which generates the necessary Kubernetes manifests based on the .NET Aspire app host manifest.
+
+In conclusion, while .NET Aspire offers a more automated and less error-prone deployment experience, it is still possible to deploy distributed applications without it. The choice of whether to use .NET Aspire depends on the specific needs of the project, the expertise of the development team, and the desired level of control over the deployment process.
+
+## Additional resources
+
+- [.NET Aspire deployments](https://learn.microsoft.com/en-us/dotnet/aspire/deployment/overview)
+- [Deploy a .NET Aspire app to Azure Container Apps](https://learn.microsoft.com/en-us/dotnet/aspire/deployment/azure/aca-deployment-azd-in-depth)
+- [Deploy distributed .NET apps to the cloud with .NET Aspire and Azure](https://learn.microsoft.com/en-us/shows/azure-developers/deploy-distributed-dotnet-apps-to-the-cloud-with-dotnet-aspire-and-azure-container-apps)
+- [How to deploy .NET Aspire apps to Azure Container Apps](https://devblogs.microsoft.com/dotnet/how-to-deploy-dotnet-aspire-apps-to-azure-container-apps/)
+- [Deploy apps to Azure Container Apps easily with .NET Aspire](https://techcommunity.microsoft.com/t5/apps-on-azure-blog/deploy-apps-to-azure-container-apps-easily-with-net-aspire/ba-p/4032711)
+
+>[!div class="step-by-step"]
+>[Previous](development-vs-production.md)
+>[Next](deployment-patterns.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/deploying-distributed-apps/deployment-patterns.md b/docs/architecture/distributed-cloud-native-apps-containers/deploying-distributed-apps/deployment-patterns.md new file mode 100644 index 0000000000000..d57cbceced360 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/deploying-distributed-apps/deployment-patterns.md @@ -0,0 +1,142 @@
+---
+title: Deployment patterns
+description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Deployment patterns
+author:
+ms.date: 06/12/2024
+---
+
+# Deployment patterns
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+When deploying distributed applications, choosing the right deployment pattern significantly influences system performance, scalability, and maintainability. It should be noted that the deployment strategies in this topic are not alternatives, but can complement each other.
+
+Here are some common deployment strategies and their effects:
+
+## Blue-green deployment
+
+Blue-green deployment is a software release strategy that aims to minimize downtime and reduce the risk associated with deploying new versions of an application. Here's how it works:
+
+1. **Environment setup**:
+   - In a blue-green deployment, two identical environments are set up: the **blue** environment (running the current application version) and the **green** environment (running the new application version).
+   - The blue environment serves as the stable version that users interact with, while the green environment remains idle initially.
+
+1. **Testing and verification**:
+   - The green revision (new version) undergoes thorough testing, including functional tests, performance checks, and compatibility verification.
+   - Once verified, the green revision is ready for production.
+
+1. **Traffic switch**:
+   - A controlled traffic switch directs live traffic from the blue to the green environment.
+   - Users now interact with the new version in the green environment.
+
+1. **Rollback capability**:
+   - If issues arise in the green revision, a rollback is possible.
+   - Traffic can be reverted to the stable blue revision, minimizing user impact.
+
+1. **Role change**:
+   - After a successful deployment, the roles switch: the green revision becomes the stable production environment.
+   - The blue revision is then used for the next deployment cycle.
+
+### Benefits of blue-green deployment
+
+- **Zero downtime**: Users experience no downtime during the transition.
+- **Risk mitigation**: Problems in the green revision can be easily rolled back.
+- **Continuous delivery**: Enables frequent updates without disrupting users.
+
+For more information, see [Blue-Green Deployment in Azure Container Apps](https://learn.microsoft.com/en-us/azure/container-apps/blue-green-deployment).
+
+## Deployment velocity
+
+Deployment velocity, also known as release velocity or deployment frequency, refers to the speed at which software changes are deployed into production. It directly impacts an organization's ability to deliver new features, enhancements, and bug fixes to end-users.
+
+### Why deployment velocity matters
+
+Here are key points to consider:
+
+1. **Business agility**:
+   - Rapid deployment allows organizations to respond swiftly to market demands and changing customer needs.
+   - Frequent releases enable faster feedback loops, helping teams iterate and improve their products.
+
+1. **Risk reduction**:
+   - Smaller, more frequent deployments reduce the risk associated with large, infrequent releases.
+   - Isolating changes minimizes the impact of any potential issues.
+
+1. **Continuous delivery**:
+   - High deployment velocity aligns with the principles of continuous delivery.
+   - Automated pipelines ensure consistent, reliable releases.
+
+### Strategies for increasing deployment velocity
+
+Use these approaches to accelerate your deployment cycle:
+
+1. **Automation**:
+   - Automate build, test, and deployment processes.
+   - Use tools like Jenkins, GitLab CI/CD, or GitHub Actions.
+
+1. **Microservices architecture**:
+   - Break down monolithic applications into smaller, independently deployable services.
+   - Each microservice can have its own release cycle.
+
+1. **Feature flags**:
+   - Implement feature flags (toggles) to selectively enable or disable features.
+   - Allows gradual rollout and easy rollback if issues arise.
+
+1. **Blue-green deployment**:
+   - Maintain two identical environments (blue and green).
+   - Deploy changes to the inactive environment, then switch traffic.
+
+1. **Canary releases**:
+   - Roll out changes to a subset of users.
+   - Monitor performance and gather feedback before full deployment.
+
+### Challenges and considerations
+
+Before you take steps to accelerate your deployment, consider these factors:
+
+1. **Testing**:
+   - Frequent deployments require robust automated testing.
+   - Balance speed with quality assurance.
+
+1. **Infrastructure as Code (IaC)**:
+   - Use IaC tools, such as Terraform and Ansible, to manage infrastructure changes.
+   - Infrastructure changes should be versioned and tested, just as code is.
+
+1. **Cultural shift**:
+   - Encourage collaboration between development, operations, and business teams.
+   - Foster a DevOps mindset.
+
+## Continuous integration and continuous deployment (CI/CD)
+
+In the realm of software development, Continuous Integration (CI) and Continuous Deployment (CD) are practices that automate the process from code commit to production release. These practices are crucial for teams aiming for high velocity and quality in software delivery.
Continuous deployment offers speed, quality, and agility, but it requires careful planning and robust testing to mitigate risks.
+
+### Continuous integration: A commit a day keeps the bugs away
+
+CI is the practice of merging all developers' working copies to a shared mainline several times a day. The key principles include:
+
+- **Maintain a single source repository**: Developers should commit to a single shared repository, which can be managed by tools like Git.
+- **Automate the build**: The build process should be automated to include compiling, testing, and packaging the code.
+- **Make your build self-testing**: Automated tests ensure that changes don't break the application.
+- **Every commit should build on an integration environment**: This ensures that the application builds on a clean environment.
+- **Keep the build fast**: A fast build encourages frequent commits.
+- **Test in a clone of the production environment**: This reduces the chances of environment-specific bugs.
+- **Make it easy to get the latest deliverables**: Anyone should be able to get the latest executable easily.
+- **Everyone can see what's happening**: Automated dashboards can show build status and test results.
+
+### Continuous deployment: Release early, release often
+
+CD extends CI by automatically deploying all code changes to a testing or production environment after the build stage. Best practices include:
+
+- **Automated deployments**: Automate deployments to ensure a reliable and consistent process.
+- **Environment parity**: Keep development, staging, and production as similar as possible.
+- **Feature flags**: Use feature toggles to enable or disable features without deploying new code.
+- **Branch by abstraction**: Make large-scale changes incrementally without affecting users.
+- **Decouple deployment from release**: Deploying code doesn't mean releasing it to users immediately.
+- **Monitor real-time performance**: Monitoring applications in real time helps identify issues early.
+
+By adhering to CI/CD practices, teams can reduce integration problems, deploy more frequently, and ensure high-quality releases. It's a journey towards more efficient and effective software development processes.
+
+>[!div class="step-by-step"]
+>[Previous](deploy-with-dot-net-aspire.md)
+>[Next](distribution-patterns.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/deploying-distributed-apps/development-vs-production.md b/docs/architecture/distributed-cloud-native-apps-containers/deploying-distributed-apps/development-vs-production.md new file mode 100644 index 0000000000000..4d0f0b6bfb9ce --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/deploying-distributed-apps/development-vs-production.md @@ -0,0 +1,32 @@
+---
+title: Development versus production and what .NET Aspire can do for you
+description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Development versus production and what .NET Aspire can do for you
+author:
+ms.date: 06/12/2024
+---
+
+# Development versus production and what .NET Aspire can do for you
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+When it comes to deploying distributed applications, the environments for development and production serve distinct purposes and require different approaches. The dichotomy between development and production environments is essential for the successful deployment of distributed apps.
While the development environment is designed for innovation and problem-solving, the production environment is the realm of user experience, where reliability and efficiency reign supreme. Understanding and respecting these differences is crucial for any development team aiming to deliver high-quality software to its users. Here's a closer look at the key differences between deploying apps in development and production environments. + +## Development environment: A sandbox for creativity and testing + +The development environment is the playground where developers build and test new features. It's less stable but more flexible, allowing for rapid iteration and experimentation. Here, the focus is on: + +- **Continuous Integration (CI)**: Developers frequently merge code changes into a shared repository, ensuring that new code is continuously tested. +- **Debugging tools**: Development environments are equipped with extensive logging and debugging tools to trace and fix issues. +- **Mock services**: Instead of connecting to live databases or services, developers often use mock versions to simulate interactions without affecting real data. + +## Production environment: The stage for reliability and performance + +In contrast, the production environment is where the app is available to end-users. It prioritizes stability, performance, and security. Key aspects include: + +- **Scalability**: Production systems must handle varying loads efficiently, often through load balancing and auto-scaling techniques. +- **Monitoring and alerting**: Real-time monitoring tools track the health of the app, with alerting mechanisms to notify teams of issues. +- **Data integrity**: Connections to live databases and services require strict protocols to ensure data is not compromised. + +>[!div class="step-by-step"] +>[Previous](how-deployment-affects-your-architecture.md) +>[Next](deploy-with-dot-net-aspire.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/deploying-distributed-apps/distribution-patterns.md b/docs/architecture/distributed-cloud-native-apps-containers/deploying-distributed-apps/distribution-patterns.md new file mode 100644 index 0000000000000..f1521310a9c0a --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/deploying-distributed-apps/distribution-patterns.md @@ -0,0 +1,46 @@ +--- +title: Distribution patterns +description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Distribution patterns +author: +ms.date: 06/12/2024 +--- + +# Distribution patterns + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +Both provisioners and publishers play crucial roles in designing robust and efficient distributed systems. Let’s explore the concepts of provisioners and publishers in distribution patterns: + +## Provisioners + +Provisioners are components responsible for allocating and managing resources, such as servers, virtual machines, or containers in a distributed system. + +Provisioners handle tasks like provisioning instances, configuring network settings, and ensuring proper resource allocation. + +### Use cases + +- **Dynamic Scaling**: Provisioners automatically adjust the number of instances based on demand. +- **Infrastructure as Code (IaC)**: Tools like Terraform facilitate declarative resource provisioning. + +The benefits afforded by provisioners include scalability, consistency, and efficient resource utilization. 
+
+## Publishers
+
+Publishers generate and disseminate data or events within a distributed system.
+
+They produce messages, notifications, or updates that need to be communicated to other components.
+
+### Examples
+
+- In the Publisher-Subscriber (Pub/Sub) pattern, publishers send messages to a broker, which then distributes them to subscribers.
+- In event-driven architectures, publishers emit events, such as user actions or system events, for downstream processing.
+
+The benefits of publishers include decoupling, flexibility, and efficient communication.
+
+## Resources
+
+[Provision infrastructure with Azure deployment slots using Terraform](https://learn.microsoft.com/azure/developer/terraform/provision-infrastructure-using-azure-deployment-slots)
+
+>[!div class="step-by-step"]
+>[Previous](deployment-patterns.md)
+>[Next](dot-net-aspire.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/deploying-distributed-apps/dot-net-aspire.md b/docs/architecture/distributed-cloud-native-apps-containers/deploying-distributed-apps/dot-net-aspire.md new file mode 100644 index 0000000000000..0ac776f1a5039 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/deploying-distributed-apps/dot-net-aspire.md @@ -0,0 +1,93 @@
+---
+title: Deploying distributed apps with .NET Aspire
+description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Deploying distributed apps with .NET Aspire
+author:
+ms.date: 06/12/2024
+---
+
+# Deploying distributed apps with .NET Aspire
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+In this topic, we will discuss .NET Aspire extensibility and the steps to deploy solutions to some popular cloud hosting platforms using .NET Aspire.
+
+## .NET Aspire extensibility
+
+.NET Aspire is created to be extensible so that you can configure it specifically for your needs and so that the developer community can share additional functionality. A good example of this is [Aspir8](https://prom3theu5.github.io/aspirational-manifests/getting-started.html). Soon after the initial release of .NET Aspire, a UK-based developer and chief technology officer named David Sekula created Aspir8, a tool that publishes .NET Aspire solutions to Kubernetes clusters.
+
+Developers can participate by sharing feedback, discussing ideas, and contributing code on the [.NET Aspire GitHub repo](https://github.com/dotnet/aspire).
+
+### .NET Aspire manifest file
+
+A key aspect of .NET Aspire is the manifest file. A manifest file is a JSON document that describes all of the resources required for a .NET Aspire project. A manifest file is generated for a .NET Aspire project by running the following command:
+
+```dotnetcli
+dotnet run --publisher manifest --output-path manifest.json
+```
+
+If you build tools for .NET Aspire, you can consume these manifest files, enabling you to deploy .NET Aspire projects to additional hosting platforms.
+
+For more information on the structure of a manifest file, see [.NET Aspire manifest format for deployment tool builders](https://learn.microsoft.com/dotnet/aspire/deployment/manifest-format).
+
+You can also use the extensibility of .NET Aspire to create your own reusable building blocks for your solution.
For more information on creating your own .NET Aspire resources, see [Create custom resource types for .NET Aspire](https://learn.microsoft.com/dotnet/aspire/extensibility/custom-resources).
+
+## How to deploy a .NET Aspire solution to Azure using Azure Developer CLI (azd)
+
+To deploy a .NET Aspire solution to Azure, complete the following steps:
+
+1. **Create a .NET Aspire solution**:
+   - Start by creating a .NET Aspire solution using the [.NET Aspire Starter Application template](https://learn.microsoft.com/dotnet/aspire/get-started/build-your-first-aspire-app).
+   - This template provides a solid foundation for your distributed application.
+
+1. **Install the Azure Developer CLI (`azd`)**:
+   - Install `azd` based on your operating system. It's available through `winget`, `brew`, `apt`, or directly via `curl`.
+   - Refer to the [Install Azure Developer CLI](https://learn.microsoft.com/dotnet/aspire/deployment/azure/aca-deployment) guide.
+
+1. **Initialize the template**:
+   - Open a terminal and navigate to the *app host* project directory of your .NET Aspire solution.
+   - Execute the `azd init` command to initialize your project.
+   - The CLI inspects your local directory structure and determines the app type.
+   - For more information on the `azd init` command, see the [azd init documentation](https://learn.microsoft.com/dotnet/aspire/deployment/azure/aca-deployment).
+
+   :::image type="content" source="media/azdinit.png" alt-text="A screenshot showing an azd init command execution." border="false":::
+
+   **Figure 14-1**. Executing the `azd init` command
+
+1. **Provision and deploy your solution**:
+   - Execute the `azd up` command from the *app host* project to begin the provision and deployment process.
+   - Select the subscription that you'd like to deploy to.
+   - Select the Azure region to deploy to.
+
+For more information, see [Deploy a .NET Aspire project to Azure Container Apps](https://learn.microsoft.com/dotnet/aspire/deployment/azure/aca-deployment).
+
+For an in-depth guide using GitHub Actions, check out [Deploy a .NET Aspire app using the Azure Developer CLI and GitHub Actions](https://learn.microsoft.com/dotnet/aspire/deployment/azure/aca-deployment-github-actions).
+
+## How to deploy a .NET Aspire app to Amazon Web Services (AWS)
+
+Because of the extensible nature of .NET Aspire, you can deploy to alternative web platforms, either using your own code, or by implementing community-created packages.
+
+To deploy a .NET Aspire solution to AWS, complete the following steps:
+
+1. Install the **Aspire.Hosting.AWS** package in your *app host* project.
+1. Configure the AWS SDK for .NET, specifying the AWS profile and region.
+1. Provision any required AWS application resources by using a CloudFormation template.
+1. Import any existing AWS resources using the **AddAWSCloudFormationStack** method.
+1. Provision and deploy your solution.
+
+For more information, see the [Aspire.Hosting.AWS readme](https://www.nuget.org/packages/Aspire.Hosting.AWS/8.0.1-preview.8.24267.1#readme-body-tab).
+
+## How to deploy a .NET Aspire app to Kubernetes
+
+Kubernetes is a widely used, cloud-agnostic container orchestration service, which makes it a common choice for distributed microservices solutions. Aspir8 is a community-created .NET Aspire tool that deploys solutions to Kubernetes. To deploy to a Kubernetes cluster using Aspir8, complete the following steps:
+
+1. Generate a .NET Aspire manifest file.
+1. Edit the _manifest.json_ file, as necessary.
+1.
Initialize Aspir8 by executing `aspirate init`. +1. Build the project by executing `aspirate build`. +1. Generate the Kubernetes files by executing `aspirate generate`. +1. Deploy the solution by executing `aspirate apply`. + +For more information, see [.NET 8, Aspire, & Aspir8: Deploy Microservices Into Dev Environments Effortlessly with CLI](https://medium.com/@josephsims1/aspire-aspi8-deploy-microservices-effortlessly-with-cli-no-docker-or-yaml-needed-f30b58443107) + +>[!div class="step-by-step"] +>[Previous](distribution-patterns.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/deploying-distributed-apps/how-deployment-affects-your-architecture.md b/docs/architecture/distributed-cloud-native-apps-containers/deploying-distributed-apps/how-deployment-affects-your-architecture.md new file mode 100644 index 0000000000000..65caea7cc8af8 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/deploying-distributed-apps/how-deployment-affects-your-architecture.md @@ -0,0 +1,28 @@ +--- +title: How deployment affects your architecture and vice versa +description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | How deployment affects your architecture and vice versa +author: +ms.date: 06/12/2024 +--- + +# How deployment affects your architecture and vice versa + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +In the realm of software engineering, the deployment of distributed applications is not merely a final step but a critical component that influences and is influenced by the application's architecture. This symbiotic relationship shapes the efficiency, scalability, and resilience of the services provided. + +## Deployment's impact on architecture + +The deployment strategy can significantly affect the architectural design of a distributed application. For instance, a microservices architecture might be chosen to facilitate independent deployment of service components. This allows for continuous integration and delivery practices, enabling teams to deploy updates more frequently and with less risk. + +Moreover, deployment considerations can dictate the choice of stateless over stateful components, influencing how data persistence and session management are handled. Stateless components can be easily scaled horizontally, enhancing the application's ability to handle increased load by simply adding more instances. + +## Architecture's influence on deployment + +Conversely, the architecture of a distributed application can determine the complexity and approach of its deployment. A monolithic architecture, while simpler to deploy initially, can become cumbersome to update and scale as the application grows. This often leads to the adoption of containerization and orchestration tools like Docker and Kubernetes, which manage the deployment of complex, interdependent services. + +The architecture also prescribes the deployment environment. A cloud-native architecture is designed with cloud services in mind, leveraging their scalability and managed services to reduce operational overhead. This necessitates a deployment strategy that is cloud-centric, often automated through infrastructure as code (IaC) tools such as Terraform or AWS CloudFormation. 
+
+>[!div class="step-by-step"]
+>[Previous](../api-gateways/gateway-patterns.md)
+>[Next](development-vs-production.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/deploying-distributed-apps/media/azdinit.png b/docs/architecture/distributed-cloud-native-apps-containers/deploying-distributed-apps/media/azdinit.png new file mode 100644 index 0000000000000..d7dc3bcfbe895 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/deploying-distributed-apps/media/azdinit.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/background-tasks-with-ihostedservice.md b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/background-tasks-with-ihostedservice.md new file mode 100644 index 0000000000000..12f69046ffbdd --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/background-tasks-with-ihostedservice.md @@ -0,0 +1,117 @@
+---
+title: Implement background tasks in microservices with IHostedService and the BackgroundService class
+description: .NET Microservices Architecture for Containerized .NET Applications | Understand the new ways to use IHostedService and BackgroundService to implement background tasks in microservices.
+ms.date: 01/13/2021
+---
+# Implement background tasks in microservices with IHostedService and the BackgroundService class
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+In the world of microservices, background tasks are essential for operations that need to run independently of user interactions, such as data processing, batch jobs, or periodic maintenance.
+
+Background tasks and scheduled jobs are something you might need to use in any application, whether or not it follows the microservices architecture pattern. The difference when using a microservices architecture is that you can implement the background task in a separate container so you can scale it up or down as necessary.
+
+From a generic point of view, in .NET we call these types of tasks **hosted services**, because they are services that you host within your application or microservice. Note that in this case, the hosted service simply means a class with the background task logic.
+
+Since .NET Core 2.0, the framework provides a new interface named `IHostedService`, which helps you implement hosted services easily. The basic idea is that you can register multiple background tasks (hosted services) that run in the background while your web host or host is running, as shown in Figure 7-26.
+
+![Diagram comparing ASP.NET Core IWebHost and .NET Core IHost.](./media/ihosted-service-webhost-vs-host.png)
+
+**Figure 7-26**. Using IHostedService in a WebHost or a Host
+
+ASP.NET Core 1.x and 2.x support `IWebHost` for background processes in web apps. .NET Core 2.1 and later versions support `IHost` for background processes with plain console apps. Note the difference between `WebHost` and `Host`.
+
+A `WebHost` (the base class that implements `IWebHost` in ASP.NET Core 2.0) is the infrastructure artifact you use to provide HTTP server features to your process, such as when you're implementing an MVC web app or Web API service. It provides all the new infrastructure in ASP.NET Core, enabling you to use dependency injection, insert middleware in the request pipeline, and so on. The `WebHost` uses these very same `IHostedServices` for background tasks.
+ +A `Host` (the base class that implements `IHost`) was introduced in .NET Core 2.1. Basically, a `Host` allows you to have a similar infrastructure to the `WebHost` (dependency injection, hosted services, and so on), but in this case, you have a simple and lighter process as the host, with nothing related to MVC, Web API, or HTTP server features. + +Therefore, you can either create a specialized host process with `IHost` to handle the hosted services and nothing else, or you can extend an existing ASP.NET Core `WebHost`, such as an existing ASP.NET Core Web API or MVC app. + +## Understanding IHostedService and BackgroundService + +The `IHostedService` interface is a contract for services that run in the background. It defines two methods: `StartAsync(CancellationToken)` and `StopAsync(CancellationToken)`. Implementing this interface allows you to start and stop the service, and the `CancellationToken` parameter helps handle the task's lifetime. + +The `BackgroundService` class is an abstract class that implements `IHostedService` and provides a template for creating long-running background tasks. It introduces the `ExecuteAsync(CancellationToken)` method, which is where the logic of the background task is placed. + +## Implementing a background task + +To implement a background task, you create a class that inherits from `BackgroundService` and override the `ExecuteAsync` method. Here's a simple example: + +```csharp +using Microsoft.Extensions.Hosting; +using System; +using System.Threading; +using System.Threading.Tasks; + +public class MyBackgroundTask : BackgroundService +{ + protected override async Task ExecuteAsync(CancellationToken stoppingToken) + { + while (!stoppingToken.IsCancellationRequested) + { + DoWork(); + await Task.Delay(TimeSpan.FromHours(1), stoppingToken); + } + } + + private void DoWork() + { + // Task logic here + } +} +``` + +In this example, `DoWork` is a placeholder for the actual task logic. The `ExecuteAsync` method runs in a loop, checking the `stoppingToken` to see if a stop request has been made, and waits for an hour before running again. + +## Registering the service + +After implementing the background task, you need to register it with the dependency injection (DI) container in the `Startup.cs` or `Program.cs` file, depending on your .NET version: + +```csharp +public void ConfigureServices(IServiceCollection services) +{ + services.AddHostedService<MyBackgroundTask>(); +} +``` + +## Summary class diagram + +The following image shows a visual summary of the classes and interfaces involved when implementing `IHostedService`. + +![Class diagram showing the multiple classes and interfaces related to IHostedService.](./media/class-diagram-custom-ihostedservice.png) + +**Figure 7-27**. Class diagram showing the multiple classes and interfaces related to IHostedService + +`IWebHost` and `IHost` can host many services, which inherit from `BackgroundService`, which implements `IHostedService`. + +## Deployment considerations and takeaways + +It is important to note that the way you deploy your ASP.NET Core `WebHost` or .NET `Host` might impact the final solution. For instance, if you deploy your `WebHost` on IIS or a regular Azure App Service, your host can be shut down because of application pool recycles. But if you are deploying your host as a container into an orchestrator like Kubernetes, you can control the number of live instances of your host.
In addition, you could consider other approaches in the cloud made especially for these scenarios, like Azure Functions. Finally, if you need the service to be running all the time and are deploying on a Windows Server, you could use a Windows Service. + +But even for a `WebHost` deployed into an app pool, there are scenarios that are still applicable, like repopulating or flushing an application's in-memory cache. + +The `IHostedService` interface provides a convenient way to start background tasks in an ASP.NET Core web application (in .NET Core 2.0 and later versions) or in any process/host (starting in .NET Core 2.1 with `IHost`). Its main benefit is the opportunity for graceful cancellation, so you can clean up your background task's work when the host itself is shutting down. + +### Best practices + +When implementing background tasks in microservices, consider the following best practices: + +- **Graceful shutdown**: Ensure your tasks handle cancellation tokens properly to stop gracefully. +- **Error handling**: Implement robust error handling within the task to prevent unhandled exceptions from crashing the service. +- **Logging**: Include comprehensive logging to track the task's operation and facilitate debugging. +- **Resource management**: Be mindful of resource consumption and implement efficient use of CPU and memory. + +## Additional resources + +- **Building a scheduled task in ASP.NET Core/Standard 2.0** \ + + +- **Implementing IHostedService in ASP.NET Core 2.0** \ + + +- **GenericHost Sample using ASP.NET Core 2.1** \ + + +> [!div class="step-by-step"] +> [Previous](rabbitmq-event-bus-development-test-environment.md) +> [Next](subscribe-events.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/integration-event-based-microservice-communications.md b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/integration-event-based-microservice-communications.md new file mode 100644 index 0000000000000..91f4e6282a488 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/integration-event-based-microservice-communications.md @@ -0,0 +1,221 @@ +--- +title: Implementing event-based communication between microservices +description: .NET Microservices Architecture for Containerized .NET Applications | Understand integration events to implement event-based communication between microservices. +ms.date: 01/13/2021 +--- + +# Implementing event-based communication between microservices + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +When you use [event-based communication](/azure/architecture/guide/architecture-styles/event-driven), a [microservice](/azure/architecture/microservices/) publishes an event when something notable happens, such as when a product is added to a customer's basket. Other microservices subscribe to those events. When a microservice receives an event, it can update its own business entities, which might lead to more events being published. This is the essence of the eventual consistency concept. This [publish/subscribe](/azure/architecture/patterns/publisher-subscriber) system is usually performed by using an event bus. You can design the event bus as an interface with the API needed to subscribe and unsubscribe to events and to publish events.
It can also have one or more implementations based on any inter-process or messaging communication, such as a messaging queue or a service bus that supports asynchronous communication and a publish/subscribe model. + +You can use events to implement business transactions that span multiple services, which give you eventual consistency between those services. An eventually consistent transaction consists of a series of distributed actions. At each action, the microservice updates a business entity and publishes an event that triggers the next action. Be aware that transactions don't span the underlying persistence and event bus, so [idempotence needs to be handled](/azure/architecture/reference-architectures/containers/aks-mission-critical/mission-critical-data-platform#idempotent-message-processing). Figure 7-18 shows a `PriceUpdated` event published through an event bus, so the price update is propagated to the Basket and other microservices. + +![Diagram of asynchronous event-driven communication with an event bus.](./media/event-driven-communication.png) + +**Figure 7-18**. Event-driven communication based on an event bus + +This section describes how you can implement this type of communication with .NET by using a generic event bus interface, as shown in Figure 7-18. There are multiple potential implementations, each using a different technology or infrastructure such as RabbitMQ, Azure Service Bus, or any other third-party open-source or commercial service bus. + +## Using message brokers and service buses for production systems + +As noted in the architecture section, you can choose from multiple messaging technologies for your event bus. But these technologies are at different levels. For instance, RabbitMQ, a messaging broker transport, is at a lower level than commercial products like Azure Service Bus, NServiceBus, MassTransit, or Brighter. Most of these products can work on top of either RabbitMQ or Azure Service Bus. Your choice of product depends on how many features and how much out-of-the-box scalability you need for your application. + +For a proof-of-concept or development environment, a simple implementation on [RabbitMQ](https://www.rabbitmq.com/) running as a container might be enough. But for mission-critical and production systems that need high scalability, you might want to evaluate and use [Azure Service Bus](/azure/service-bus-messaging/). + +Of course, you could always build your own service bus features on top of lower-level technologies like RabbitMQ and Docker, but the work needed to "reinvent the wheel" might be too costly for a custom enterprise application. + +Once you have decided that you want to have asynchronous and event-driven communication, you should choose the service bus product that best fits your needs for production. + +## Implementing the Azure Service Bus integration with .NET Aspire + +.NET Aspire makes it much more straightforward to implement an event bus, because it includes built-in integrations for systems like: + +- RabbitMQ +- Azure Service Bus +- Apache Kafka +- Azure Storage Queues +- NATS + +In the .NET Aspire solution, in the app host project, you can register these services and pass them to microservices that need to send or receive event messages. Service discovery is therefore straightforward. You still write code to work with client objects and to send or receive messages from queues, as usual for each system. + +As an example, let's examine how you'd implement Azure Service Bus in a .NET Aspire solution.
+ +### Prerequisites + +Before you begin, ensure you have an Azure Service Bus namespace. For more information about creating a Service Bus namespace, see [.NET Aspire Azure Service Bus integration](https://learn.microsoft.com/dotnet/aspire/messaging/azure-service-bus-component). + +### Creating the Azure Service Bus in the app host project + +The first step is to install the Azure Service Bus hosting package in the app host project: + +```shell +dotnet add package Aspire.Hosting.Azure.ServiceBus +``` + +Then, in the app host _Program.cs_ file, you can register the Service Bus integration and consume the service as follows: + +```csharp +var builder = DistributedApplication.CreateBuilder(args); +var serviceBus = builder.ExecutionContext.IsPublishMode + ? builder.AddAzureServiceBus("messaging") + : builder.AddConnectionString("messaging"); + +// Projects.ExampleProject is a placeholder for your service project. +builder.AddProject<Projects.ExampleProject>() + .WithReference(serviceBus); +``` + +### Using the Azure Service Bus in microservices + +Next, install the necessary NuGet package in each microservice project that will send or receive event messages. Use the following command in your .NET CLI: + +```shell +dotnet add package Aspire.Azure.Messaging.ServiceBus +``` + +In the microservice's `Program.cs` file, register a `ServiceBusClient` like this: + +```csharp +builder.AddAzureServiceBusClient("messaging"); +``` + +That adds a `ServiceBusClient` object to the dependency injection container. To retrieve the client when you need it, require it as a constructor parameter in your services. For example: + +```csharp +public class ExampleService +{ + private readonly ServiceBusClient _client; + + public ExampleService(ServiceBusClient client) + { + _client = client; + } + + // Your methods here +} +``` + +### Configuration + +The .NET Aspire Service Bus integration offers several configuration options to tailor the `ServiceBusClient` to your project's needs. You can use configuration providers to load settings from `appsettings.json` or other configuration files using the `Aspire:Azure:Messaging:ServiceBus` key. + +Here's an example of how you might configure the `ServiceBusClient` in your `appsettings.json`: + +```json +{ + "Aspire": { + "Azure": { + "Messaging": { + "ServiceBus": { + "DisableHealthChecks": true + } + } + } + } +} +``` + +For more information, see [.NET Aspire Azure Service Bus integration](https://learn.microsoft.com/dotnet/aspire/messaging/azure-service-bus-integration?tabs=dotnet-cli). + +## Integration events + +Integration events are used for bringing domain state into synchronization across multiple microservices or external systems. Each microservice can publish integration events. Other microservices subscribe to the integration events that they need to handle. When an event is published, the appropriate event handler in each receiving microservice handles the event. + +An integration event is basically a data-holding class, as in the following example: + +```csharp +public class ProductPriceChangedIntegrationEvent : IntegrationEvent +{ + public int ProductId { get; private set; } + public decimal NewPrice { get; private set; } + public decimal OldPrice { get; private set; } + + public ProductPriceChangedIntegrationEvent(int productId, decimal newPrice, + decimal oldPrice) + { + ProductId = productId; + NewPrice = newPrice; + OldPrice = oldPrice; + } +} +``` + +The integration events can be defined at the application level of each microservice, so they are decoupled from other microservices, in a way comparable to how `ViewModels` are defined in the server and client.
It's not recommended to share a common integration events library across multiple microservices; doing that would couple those microservices with a single event definition data library. You don't want to do that for the same reasons that you don't want to share a common domain model across multiple microservices: microservices must be completely autonomous and independent. For more information, see this blog post on [the amount of data to put in events](https://particular.net/blog/putting-your-events-on-a-diet). Be careful not to take this too far, as this other blog post describes [the problem data deficient messages can produce](https://ardalis.com/data-deficient-messages/). The design of your events should aim to be "just right" for the needs of their consumers. + +There are only a few kinds of libraries you should share across microservices. One is libraries that are final application blocks, like the [Event Bus client API](https://github.com/dotnet/eShop/tree/main/src/EventBus), as in the eShop reference architecture. Another is libraries that constitute tools that could also be shared as NuGet components, like JSON serializers. + +## The event bus + +An event bus allows publish/subscribe-style communication between microservices without requiring the components to be explicitly aware of each other, as shown in Figure 7-19. + +![A diagram showing the basic publish/subscribe pattern.](./media/publish-subscribe-basics.png) + +**Figure 7-19**. Publish/subscribe basics with an event bus + +The above diagram shows that microservice A publishes to the event bus, which distributes to subscribing microservices B and C, without the publisher needing to know the subscribers. The event bus is related to the observer pattern and the publish-subscribe pattern. + +### Observer pattern + +In the [observer pattern](https://en.wikipedia.org/wiki/Observer_pattern), your primary object, known as the observable, notifies other interested objects, known as observers, with relevant information (events). + +### Publish/subscribe pattern + +The purpose of the [publish/subscribe (pub/sub) pattern](https://learn.microsoft.com/previous-versions/msp-n-p/ff649664(v=pandp.10)) is the same as the observer pattern: you want to notify other services when certain events take place. But there is an important difference between the observer and pub/sub patterns. In the observer pattern, the broadcast is performed directly from the observable to the observers, so they "know" each other. But when using a pub/sub pattern, there is a third component, called a broker (also known as a message broker or event bus), which is known by both the publisher and the subscribers. Therefore, when using the pub/sub pattern, the publisher and the subscribers are cleanly decoupled thanks to the event bus or message broker. + +### The middleman or event bus + +How do you achieve anonymity between publisher and subscriber? An easy way is to let a middleman take care of all the communication. An event bus is one such middleman. + +An event bus is typically composed of two parts: + +- The abstraction or interface. +- One or more implementations. + +In Figure 7-19 you can see how, from an application point of view, the event bus is nothing more than a pub/sub channel. The way you implement this asynchronous communication can vary. It can have multiple implementations so that you can swap between them, depending on the environment requirements (for example, production or development environments).
+ +In Figure 7-20, you can see an abstraction of an event bus with multiple implementations based on infrastructure messaging technologies like RabbitMQ, Azure Service Bus, or another event/message broker. + +![Diagram showing the addition of an event bus abstraction layer.](./media/multiple-implementations-event-bus.png) + +**Figure 7-20.** Multiple implementations of an event bus + +It's good to have the event bus defined through an interface so it can be implemented with several technologies, like RabbitMQ, Azure Service Bus, or others. However, as mentioned previously, using your own abstractions is good only if you need basic event bus features. If you need richer service bus features, you should probably use the API and abstractions provided by your preferred commercial service bus instead. + +### Defining an event bus interface + +Let's start with some implementation code for the event bus interface and possible implementations for exploration purposes. The interface should be generic and straightforward, as in the following example. + +```csharp +public interface IEventBus +{ + void Publish(IntegrationEvent @event); + + void Subscribe<T, TH>() + where T : IntegrationEvent + where TH : IIntegrationEventHandler<T>; + + void SubscribeDynamic<TH>(string eventName) + where TH : IDynamicIntegrationEventHandler; + + void UnsubscribeDynamic<TH>(string eventName) + where TH : IDynamicIntegrationEventHandler; + + void Unsubscribe<T, TH>() + where TH : IIntegrationEventHandler<T> + where T : IntegrationEvent; +} +``` + +The `Publish` method is straightforward. The event bus broadcasts the integration event passed to it to any microservice, or even an external application, subscribed to that event. This method is used by the microservice that is publishing the event. + +The `Subscribe` methods (you can have several implementations depending on the arguments) are used by the microservices that want to receive events. This method has two type arguments. The first is the integration event to subscribe to (`T`, constrained to `IntegrationEvent`). The second is the integration event handler (or callback method), `TH`, of type `IIntegrationEventHandler<T>`, to be executed when the receiver microservice gets that integration event message.
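+ +To make the receiving side concrete, the following is a minimal sketch of an event handler that could be registered through `Subscribe<T, TH>()`. It assumes an `IIntegrationEventHandler<T>` interface with a single `Handle` method, as in the eShop reference architecture; the handler's logic is illustrative: + +```csharp +public class ProductPriceChangedIntegrationEventHandler + : IIntegrationEventHandler<ProductPriceChangedIntegrationEvent> +{ + // Invoked by the event bus when a ProductPriceChangedIntegrationEvent arrives. + public async Task Handle(ProductPriceChangedIntegrationEvent @event) + { + // Update the local read model, for example by re-pricing matching basket items. + await Task.CompletedTask; + } +} +``` + +A receiver microservice would then call `eventBus.Subscribe<ProductPriceChangedIntegrationEvent, ProductPriceChangedIntegrationEventHandler>();` at startup, as shown later in this chapter.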
+ +## Additional resources + +- **Azure Service Bus** \ + [https://learn.microsoft.com/azure/service-bus-messaging/](/azure/service-bus-messaging/) + +> [!div class="step-by-step"] +> [Previous](../service-to-service-communication-patterns/service-mesh-communication-infrastructure.md) +> [Next](rabbitmq-event-bus-development-test-environment.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/atomicity-publish-event-bus.png b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/atomicity-publish-event-bus.png new file mode 100644 index 0000000000000..0579f94b2e3c0 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/atomicity-publish-event-bus.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/atomicity-publish-worker-microservice.png b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/atomicity-publish-worker-microservice.png new file mode 100644 index 0000000000000..05a94838d672d Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/atomicity-publish-worker-microservice.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/class-diagram-custom-ihostedservice.png b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/class-diagram-custom-ihostedservice.png new file mode 100644 index 0000000000000..2af79dd1bb7cc Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/class-diagram-custom-ihostedservice.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/display-item-price-change.png b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/display-item-price-change.png new file mode 100644 index 0000000000000..3eaac7f5f4a91 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/display-item-price-change.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/event-driven-communication.png b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/event-driven-communication.png new file mode 100644 index 0000000000000..75fb3bfd1874d Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/event-driven-communication.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/ihosted-service-webhost-vs-host.png b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/ihosted-service-webhost-vs-host.png new file mode 100644 index 0000000000000..27d3d14931d5f Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/ihosted-service-webhost-vs-host.png differ diff --git 
a/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/multiple-implementations-event-bus.png b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/multiple-implementations-event-bus.png new file mode 100644 index 0000000000000..894aced743b70 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/multiple-implementations-event-bus.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/publish-subscribe-basics.png b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/publish-subscribe-basics.png new file mode 100644 index 0000000000000..071a3391190fc Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/publish-subscribe-basics.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/rabbitmq-implementation.png b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/rabbitmq-implementation.png new file mode 100644 index 0000000000000..5fef906d0d74b Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/media/rabbitmq-implementation.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/rabbitmq-event-bus-development-test-environment.md b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/rabbitmq-event-bus-development-test-environment.md new file mode 100644 index 0000000000000..ef7dc1a425a12 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/rabbitmq-event-bus-development-test-environment.md @@ -0,0 +1,173 @@ +--- +title: Implementing an event bus with RabbitMQ for the development or test environment +description: .NET Microservices Architecture for Containerized .NET Applications | Use RabbitMQ to implement event bus messaging for integration events in development or test environments. +ms.date: 01/13/2021 +--- +# Implementing an event bus with RabbitMQ for the development or test environment + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +We should start by saying that if you create your custom event bus based on [RabbitMQ](https://www.rabbitmq.com/) running in a container, it should be used only for your development and test environments. Don't use it for your production environment, unless you're building it as a part of a production-ready service bus as described in the [Additional resources section below](rabbitmq-event-bus-development-test-environment.md#additional-resources). A simple custom event bus might be missing many production-ready critical features that a commercial service bus has. + +The event bus implementation with RabbitMQ lets microservices subscribe to events, publish events, and receive events, as shown in Figure 7-21. + +![Diagram showing RabbitMQ between message sender and message receiver.](./media/rabbitmq-implementation.png) + +**Figure 7-21.** RabbitMQ implementation of an event bus + +RabbitMQ functions as an intermediary between a message publisher and subscribers, to handle distribution. 
In the code, the `EventBusRabbitMQ` class implements the generic `IEventBus` interface. This implementation is based on dependency injection so that you can swap from this development and test version to a production version. + +```csharp +public class EventBusRabbitMQ : IEventBus, IDisposable +{ + // Implementation using RabbitMQ API + //... +} +``` + +The RabbitMQ implementation of a sample dev/test event bus is boilerplate code. It has to handle the connection to the RabbitMQ server and publish a message event to the queues. It also has to implement a collection of integration event handlers for each event type. These event types can have a different instantiation and different subscriptions for each receiver microservice, as shown in Figure 7-21. + +## Implementing a simple publish method with RabbitMQ + +The following code is a **simplified** version of an event bus implementation for RabbitMQ, to showcase the whole scenario. You don't really handle the connection this way. To see the full implementation, see the actual code in the [eShop Reference Architecture](https://github.com/dotnet/eShop/blob/main/src/EventBusRabbitMQ/RabbitMQEventBus.cs) repository. + +```csharp +public class EventBusRabbitMQ : IEventBus, IDisposable +{ + // Member objects and other methods ... + // ... + + public void Publish(IntegrationEvent @event) + { + var eventName = @event.GetType().Name; + var factory = new ConnectionFactory() { HostName = _connectionString }; + using (var connection = factory.CreateConnection()) + using (var channel = connection.CreateModel()) + { + channel.ExchangeDeclare(exchange: _brokerName, + type: "direct"); + string message = JsonConvert.SerializeObject(@event); + var body = Encoding.UTF8.GetBytes(message); + channel.BasicPublish(exchange: _brokerName, + routingKey: eventName, + basicProperties: null, + body: body); + } + } +} +``` + +As mentioned earlier, there are many possible configurations in RabbitMQ, so this code should be used only for dev/test environments. For example, you could improve a RabbitMQ implementation by using a [Polly](https://github.com/App-vNext/Polly) retry policy, which retries the task a number of times in case the RabbitMQ container isn't ready. This scenario can occur when docker-compose is starting the containers. For example, the RabbitMQ container might start more slowly than the other containers. + +## Implementing the subscription code with the RabbitMQ API + +As with the publish code, the following code is a simplification of part of the event bus implementation for RabbitMQ. Again, you usually don't need to change it unless you are improving it. + +```csharp +public class EventBusRabbitMQ : IEventBus, IDisposable +{ + // Member objects and other methods ... + // ... + + public void Subscribe<T, TH>() + where T : IntegrationEvent + where TH : IIntegrationEventHandler<T> + { + var eventName = _subsManager.GetEventKey<T>(); + + var containsKey = _subsManager.HasSubscriptionsForEvent(eventName); + if (!containsKey) + { + if (!_persistentConnection.IsConnected) + { + _persistentConnection.TryConnect(); + } + + using (var channel = _persistentConnection.CreateModel()) + { + channel.QueueBind(queue: _queueName, + exchange: BROKER_NAME, + routingKey: eventName); + } + } + + _subsManager.AddSubscription<T, TH>(); + } +} +``` + +Each event type has a related channel to get events from RabbitMQ. You can then have as many event handlers per channel and event type as needed.
+ +The `Subscribe` method accepts an `IIntegrationEventHandler` object, which is like a callback method in the current microservice, plus its related `IntegrationEvent` object. The code then adds that event handler to the list of event handlers that each integration event type can have per client microservice. If the client code has not already subscribed to the event, the code creates a channel for the event type so it can receive events in a push style from RabbitMQ when that event is published from any other service. + +## Implementing RabbitMQ with .NET Aspire + +When using .NET Aspire, you can implement and configure RabbitMQ with much less custom code. Initially, you need to install the [Aspire.RabbitMQ.Client](https://www.nuget.org/packages/Aspire.RabbitMQ.Client) NuGet package. In the app host, install the [Aspire.Hosting.RabbitMQ](https://www.nuget.org/packages/Aspire.Hosting.RabbitMQ) NuGet package. + +In the _Program.cs_ file of your microservice project, call the `AddRabbitMQClient` extension method to register an `IConnection` in the dependency injection container. The method takes a connection name parameter. + +```csharp +builder.AddRabbitMQClient("messaging"); +``` + +You can then retrieve the `IConnection` instance using dependency injection: + +```csharp +public class ExampleService(IConnection connection) +{ + // Use connection... +} +``` + +In your app host project, register a RabbitMQ server and consume the connection using the `AddRabbitMQ` method: + +```csharp +var builder = DistributedApplication.CreateBuilder(args); + +var messaging = builder.AddRabbitMQ("messaging"); + +builder.AddProject<Projects.ExampleProject>() + .WithReference(messaging); +``` + +The `WithReference` method configures a connection in the `ExampleProject` project named "messaging".
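+ +With the connection registered, you use the standard RabbitMQ.Client API to publish or consume messages. The following is a minimal sketch of publishing from the injected `IConnection`; the queue name and payload are illustrative: + +```csharp +using RabbitMQ.Client; +using System.Text; + +public class ExampleService(IConnection connection) +{ + public void PublishGreeting() + { + // Open a channel on the injected connection and declare the target queue. + using var channel = connection.CreateModel(); + channel.QueueDeclare(queue: "greetings", durable: false, + exclusive: false, autoDelete: false, arguments: null); + + // Publish a UTF-8 encoded message to the default exchange. + var body = Encoding.UTF8.GetBytes("Hello from .NET Aspire!"); + channel.BasicPublish(exchange: string.Empty, routingKey: "greetings", + basicProperties: null, body: body); + } +} +```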
+ +You can choose to provide the username and password explicitly for authenticating with RabbitMQ: + +```csharp +var username = builder.AddParameter("username", secret: true); +var password = builder.AddParameter("password", secret: true); + +var messaging = builder.AddRabbitMQ("messaging", username, password); + +// Service consumption +builder.AddProject<Projects.ExampleProject>() + .WithReference(messaging); +``` + +For more information, see [.NET Aspire RabbitMQ integration](https://learn.microsoft.com/dotnet/aspire/messaging/rabbitmq-client-integration). + +## Additional resources + +- **Peregrine Connect** - Simplify your integration with efficient design, deployment, and management of apps, APIs, and workflows \ + + +- **NServiceBus** - Fully-supported commercial service bus with advanced management and monitoring tooling for .NET \ + + +- **EasyNetQ** - Open Source .NET API client for RabbitMQ \ + + +- **MassTransit** - Free, open-source distributed application framework for .NET \ + + +- **Rebus** - Open source .NET Service Bus \ + + +> [!div class="step-by-step"] +> [Previous](integration-event-based-microservice-communications.md) +> [Next](background-tasks-with-ihostedservice.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/subscribe-events.md b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/subscribe-events.md new file mode 100644 index 0000000000000..7544814fa1b21 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/event-based-communication-patterns/subscribe-events.md @@ -0,0 +1,128 @@ +--- +title: Subscribing to events +description: .NET Microservices Architecture for Containerized .NET Applications | Understand the details of publishing and subscription to integration events. +ms.date: 06/23/2021 +--- + +# Subscribing to events + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +The first step for using an event bus is to subscribe the microservices to the events they should receive. + +The following simple code shows what each receiver microservice needs to implement when starting the service so it subscribes to the events it needs. In this case, the `basket-api` microservice needs to subscribe to the `ProductPriceChangedIntegrationEvent` and `OrderStartedIntegrationEvent` messages. + +For instance, subscribing to the `ProductPriceChangedIntegrationEvent` event makes the basket microservice aware of any changes to the product price and lets it warn the user about the change if that product is in the user's basket. + +```csharp +var eventBus = app.ApplicationServices.GetRequiredService<IEventBus>(); + +eventBus.Subscribe<ProductPriceChangedIntegrationEvent, + ProductPriceChangedIntegrationEventHandler>(); + +eventBus.Subscribe<OrderStartedIntegrationEvent, + OrderStartedIntegrationEventHandler>(); +``` + +After this code runs, the subscriber microservice will be listening through RabbitMQ channels. When any message of type `ProductPriceChangedIntegrationEvent` arrives, the code invokes the event handler that is passed to it and processes the event. + +## Publishing events through the event bus + +Finally, the message sending microservice publishes the integration events with code similar to the following example. This simplified example doesn't take atomicity into account. You would implement similar code whenever an event must be propagated across multiple microservices, usually right after committing data or transactions from the sending microservice.
+ +First, the event bus implementation object, based on RabbitMQ or a service bus, would be injected at the controller constructor, as in the following code: + +```csharp +[Route("api/v1/[controller]")] +public class CatalogController : ControllerBase +{ + private readonly CatalogContext _context; + private readonly IOptionsSnapshot<CatalogSettings> _settings; + private readonly IEventBus _eventBus; + + public CatalogController(CatalogContext context, + IOptionsSnapshot<CatalogSettings> settings, + IEventBus eventBus) + { + _context = context; + _settings = settings; + _eventBus = eventBus; + } + // ... +} +``` + +Then you use it from your controller's methods, as in this `UpdateProduct` method: + +```csharp +[Route("items")] +[HttpPost] +public async Task<IActionResult> UpdateProduct([FromBody]CatalogItem product) +{ + var item = await _context.CatalogItems.SingleOrDefaultAsync( + i => i.Id == product.Id); + // ... + if (item.Price != product.Price) + { + var oldPrice = item.Price; + item.Price = product.Price; + _context.CatalogItems.Update(item); + var @event = new ProductPriceChangedIntegrationEvent(item.Id, + item.Price, + oldPrice); + // Commit changes in original transaction + await _context.SaveChangesAsync(); + // Publish integration event to the event bus + // (RabbitMQ or a service bus underneath) + _eventBus.Publish(@event); + // ... + } + // ... +} +``` + +In this case, since the origin microservice is a simple CRUD microservice, that code is placed right into a Web API controller. + +In more advanced microservices, like when using CQRS approaches, it can be implemented in the `CommandHandler` class, within the `Handle()` method. + +## Idempotency in update message events + +An important aspect of update message events is that a failure at any point in the communication should cause the message to be retried. Otherwise, a background task might try to publish an event that has already been published, creating a race condition. Make sure that the updates are either idempotent or that they provide enough information to ensure that you can detect a duplicate, discard it, and send back only one response. + +As noted earlier, idempotency means that an operation can be performed multiple times without changing the result. In a messaging environment, as when communicating events, an event is idempotent if it can be delivered multiple times without changing the result for the receiver microservice. This may be necessary because of the nature of the event itself, or because of the way the system handles the event. Message idempotency is important in any application that uses messaging, not just in applications that implement the event bus pattern. + +An example of an idempotent operation is a SQL statement that inserts data into a table only if that data is not already in the table. It does not matter how many times you run that insert SQL statement; the result will be the same: the table contains that data. Idempotency like this can also be necessary when dealing with messages if the messages could potentially be sent and therefore processed more than once. If retry logic causes a sender to send exactly the same message more than once, make sure that it is idempotent. + +It is possible to design idempotent messages. For example, you can create an event that says "set the product price to $25" instead of "add $5 to the product price." You could safely process the first message any number of times and the result will be the same. That is not true for the second message.
But even in the first case, you might not want to process the first event, because the system could also have sent a newer price-change event and you would be overwriting the new price. + +Another example might be an order-completed event that's propagated to multiple subscribers. The app has to make sure that order information is updated in other systems only once, even if there are duplicated message events for the same order-completed event. + +It's convenient to have some kind of identity for each event so that you can create logic that ensures that each event is processed only once. + +Some message processing is inherently idempotent. For example, if a system generates image thumbnails, it might not matter how many times the message about the generated thumbnail is processed; the outcome is that the thumbnails are generated and they are the same every time. On the other hand, operations such as calling a payment gateway to charge a credit card may not be idempotent at all. In these cases, you need to ensure that processing a message multiple times has the effect that you expect. + +## Deduplicating integration event messages + +You can make sure that message events are sent and processed only once for each subscriber at different levels. One way is to use a deduplication feature offered by the messaging infrastructure you are using. Another is to implement custom logic in your destination microservice. Having validations at both the transport level and the application level is your best bet. + +### Deduplicating message events at the EventHandler level + +One way to make sure that an event is processed only once by any receiver is by implementing certain logic when processing the message events in event handlers. For example, by wrapping one command within another. + +### Deduplicating messages when using RabbitMQ + +When intermittent network failures happen, messages can be duplicated, and the message receiver must be ready to handle these duplicated messages. If possible, receivers should handle messages in an idempotent way, which is better than explicitly handling them with deduplication. + +According to the [RabbitMQ documentation](https://www.rabbitmq.com/reliability.html#consumer), "If a message is delivered to a consumer and then requeued (because it was not acknowledged before the consumer connection dropped, for example) then RabbitMQ will set the redelivered flag on it when it is delivered again (whether to the same consumer or a different one)." + +If the "redelivered" flag is set, the receiver must take that into account, because the message might already have been processed. But that is not guaranteed; the message might never have reached the receiver after it left the message broker, perhaps because of network issues. On the other hand, if the "redelivered" flag is not set, it's guaranteed that the message has not been sent more than once. Therefore, the receiver needs to deduplicate messages or process messages in an idempotent way only if the "redelivered" flag is set in the message. 
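+ +As a sketch of what that check might look like with the RabbitMQ.Client API, assuming an open channel and hypothetical `AlreadyProcessed` and `ProcessMessage` helpers: + +```csharp +var consumer = new EventingBasicConsumer(channel); +consumer.Received += (sender, ea) => +{ + // Run deduplication logic only when RabbitMQ signals a possible redelivery. + if (ea.Redelivered && AlreadyProcessed(ea.BasicProperties.MessageId)) + { + // Acknowledge the duplicate without processing it again. + channel.BasicAck(ea.DeliveryTag, multiple: false); + return; + } + + ProcessMessage(ea.Body.ToArray()); + channel.BasicAck(ea.DeliveryTag, multiple: false); +}; +channel.BasicConsume(queue: "orders", autoAck: false, consumer: consumer); +```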
+ +### Additional resources + +- [Eventual Consistency](https://en.wikipedia.org/wiki/Eventual_consistency) +- [The CAP Theorem](https://en.wikipedia.org/wiki/CAP_theorem) +- [Reliability Guide (RabbitMQ documentation)](https://www.rabbitmq.com/docs/reliability#consumer-side) + +> [!div class="step-by-step"] +> [Previous](background-tasks-with-ihostedservice.md) +> [Next](../chpt8-data-patterns/distributed-data.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/includes/download-alert.md b/docs/architecture/distributed-cloud-native-apps-containers/includes/download-alert.md new file mode 100644 index 0000000000000..53dc741f9139e --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/includes/download-alert.md @@ -0,0 +1,19 @@ +--- +author: +ms.author: +ms.date: 04/23/2024 +ms.topic: include +--- + +> [!TIP] +> :::row::: +> :::column span="3"::: +> This content is an excerpt from the eBook, Architecture for Distributed Cloud-Native Apps with .NET & Containers, available on [.NET Docs](/dotnet/architecture/TODO) or as a free downloadable PDF that can be read offline. +> +> > [!div class="nextstepaction"] +> > [Download PDF TODO](https://dotnet.microsoft.com/download/e-book/microservices-architecture/pdf) +> :::column-end::: +> :::column::: +> :::image type="content" source="../media/cover-thumbnail.png" alt-text=".NET Microservices Architecture for Containerized .NET Applications eBook cover thumbnail."::: +> :::column-end::: +> :::row-end::: diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/dot-net-aspire-overview.md b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/dot-net-aspire-overview.md new file mode 100644 index 0000000000000..9eb43bf910be6 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/dot-net-aspire-overview.md @@ -0,0 +1,46 @@ +--- +title: .NET Aspire overview +description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | .NET Aspire overview +author: +ms.date: 05/30/2024 +--- + +# .NET Aspire overview + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +![A diagram showing an overview of .NET Aspire. A cloud ready stack for building observable, production ready, distributed applications.](media/what-is-aspire.png) + +**Figure 3-1**. A summary of .NET Aspire. + +.NET Aspire is a cloud-ready stack designed to accelerate development of cloud-native applications using .NET. Here's an overview: + +- **Cloud-ready stack**: It's an opinionated framework that provides a set of tools and libraries for building distributed applications that are ready for deployment in the cloud. It is designed to take full advantage of cloud features such as scalability, resilience, and manageability. + +- **Scalable**: .NET Aspire can be used by projects of all sizes. .NET Aspire can be deployed in standalone mode to provide the .NET Aspire dashboard without any other .NET Aspire functionality or the need for Internet connectivity. From that starting point, .NET Aspire can scale up to a multi-cloud distributed microservices architecture. + +- **Observable and production-ready**: .NET Aspire focuses on creating applications that are easy to monitor and ready for production environments, ensuring high reliability and performance.
+ +- **Microservices architecture**: It encourages the use of microservices, which are small, loosely coupled services that work together to form a complete application. This approach allows for easier scaling and maintenance. + +- **NuGet package delivery**: The framework is delivered as a set of NuGet packages, each addressing different aspects of cloud-native application development. + +- **Distributed application support**: .NET Aspire is ideal for applications that spread their computational workload across multiple nodes, ensuring efficient communication over network boundaries. + +- **Enhanced developer experience**: With .NET Aspire, developers get an improved experience with a consistent set of patterns and tools that simplify the process of building and running distributed apps. + +- **Orchestration and integrations**: It provides features for orchestrating multi-project applications and their dependencies, along with standardized integrations for common services like databases and caching. + +- **Integrated tooling**: .NET Aspire comes with integrated tooling for popular development environments like Visual Studio and Visual Studio Code, as well as the .NET CLI, making it easier to create and manage applications. + +![A diagram showing a summary of .NET Aspire. Cloud native observability, resiliency, scalability, and manageability.](media/aspire-cloud-native.png) + +**Figure 3-2**. Cloud-native features provided by .NET Aspire. + +In summary, .NET Aspire is all about providing a robust, opinionated framework that helps developers build cloud-native, distributed applications more efficiently and effectively. + +For more information, see [.NET Aspire overview](https://learn.microsoft.com/dotnet/aspire/get-started/aspire-overview). + +>[!div class="step-by-step"] +>[Previous](../chpt2-introduction-containers-docker/official-container-images-tooling.md) +>[Next](orchestration.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/integrations.md b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/integrations.md new file mode 100644 index 0000000000000..3cfbe557de0f6 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/integrations.md @@ -0,0 +1,48 @@ +--- +title: .NET Aspire integrations +description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | .NET Aspire integrations +author: +ms.date: 04/25/2024 +--- + +# .NET Aspire integrations: Empowering cloud-native applications + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +![A diagram showing some example .NET Aspire integrations - and the NuGet package website.](media/aspire-integrations.png) + +**Figure 3-5**. Examples of .NET Aspire integrations. + +.NET Aspire integrations are curated **NuGet packages** that handle specific cloud-native concerns. These integrations connect seamlessly with your app, ensuring consistent integration with services such as Redis, PostgreSQL, and more. + +Your choice of integration depends on the problem you want to solve. For example, when you want to use a database to persist information, you can choose from: + +- **Azure Cosmos DB Entity Framework Core**: Access Azure Cosmos DB databases with Entity Framework Core using `Aspire.Microsoft.EntityFrameworkCore.Cosmos`. There's a separate integration if you prefer not to use Entity Framework.
+- **MongoDB Driver**: Store and retrieve data from the popular NoSQL MongoDB databases using the `Aspire.MongoDB.Driver` package. +- **MySqlConnector**: Access MySQL databases with the `Aspire.MySqlConnector` library. +- **Oracle Entity Framework Core**: Access Oracle databases with Entity Framework Core using `Aspire.Oracle.EntityFrameworkCore`. +- **Azure Table Storage**: Azure Table Storage provides simple NoSQL database tables. Access the Azure Table service to store and retrieve data with the `Aspire.Azure.Data.Tables` package. + +If you want to send messages between your decoupled microservices, you can choose from: + +- **Apache Kafka**: The `Aspire.Confluent.Kafka` package allows you to produce and consume messages from an Apache Kafka broker. +- **Azure Storage Queues**: Azure Storage Queues are simple and efficient message queues for decoupled microservices. Use the `Aspire.Azure.Storage.Queues` library to send and receive messages. +- **Azure Service Bus**: Send messages to other microservices by calling Azure Service Bus through the `Aspire.Azure.Messaging.ServiceBus` package. Azure Service Bus provides a more versatile messaging solution than Azure Storage Queues. +- **NATS**: The `Aspire.NATS.Net` package provides access to NATS messaging servers. + +Other popular .NET Aspire integrations include: + +- **Azure AI OpenAI**: Use the `Aspire.Azure.AI.OpenAI` library to call generative AI functionality. +- **Azure Search Documents**: The `Aspire.Azure.Search.Documents` package provides access to Azure AI Search. +- **Azure Blob Storage**: Store files in Azure Blob Storage using the `Aspire.Azure.Storage.Blobs` library. +- **Azure Event Hubs**: Event Hubs process large volumes of events quickly and reliably. The `Aspire.Azure.Messaging.EventHubs` package enables you to access Azure Event Hubs easily from your .NET Aspire app. +- **Azure Key Vault**: Access Azure Key Vault to manage secrets using the `Aspire.Azure.Security.KeyVault` library. + +> [!NOTE] +> This isn't a full list, and new integrations will be created by both Microsoft and third parties. + +These integrations simplify connections to popular services and platforms, handle cloud-native concerns, and ensure standardized configuration patterns. Remember to use the latest versions of .NET Aspire integrations to benefit from the latest features and security updates.
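+ +Whichever integration you choose, the usage pattern is consistent: register the resource in the app host, reference it from a project, and call the integration's `Add*` method in the consuming service. As a minimal sketch, assuming the MongoDB packages and a resource named "mongodb" (the project name is a placeholder): + +```csharp +// In the app host project (Aspire.Hosting.MongoDB package): +var mongo = builder.AddMongoDB("mongodb"); + +builder.AddProject<Projects.ExampleProject>() + .WithReference(mongo); + +// In the consuming microservice (Aspire.MongoDB.Driver package), +// this registers an IMongoClient with the DI container: +builder.AddMongoDBClient("mongodb"); +```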
+ +>[!div class="step-by-step"] +>[Previous](service-discovery.md) +>[Next](observability-and-dashboard.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/media/aspire-cloud-native.png b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/media/aspire-cloud-native.png new file mode 100644 index 0000000000000..f46daca519d08 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/media/aspire-cloud-native.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/media/aspire-integrations.png b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/media/aspire-integrations.png new file mode 100644 index 0000000000000..ab84397ae376d Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/media/aspire-integrations.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/media/dashboard.png b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/media/dashboard.png new file mode 100644 index 0000000000000..5291b16dd1293 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/media/dashboard.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/media/orchestration.png b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/media/orchestration.png new file mode 100644 index 0000000000000..5204e30e72f8c Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/media/orchestration.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/media/service-discovery.png b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/media/service-discovery.png new file mode 100644 index 0000000000000..c0dfd3da02048 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/media/service-discovery.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/media/what-is-aspire.png b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/media/what-is-aspire.png new file mode 100644 index 0000000000000..dae073f494fee Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/media/what-is-aspire.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/observability-and-dashboard.md b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/observability-and-dashboard.md new file mode 100644 index 0000000000000..a960aa683a542 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/observability-and-dashboard.md @@ -0,0 +1,59 @@ +--- +title: Observability and dashboard +description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Observability and dashboard +author: +ms.date: 04/25/2024 +--- + +# Observability and dashboard + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +![A 
diagram and screenshot of the .NET Aspire dashboard.](media/dashboard.png) + +**Figure 3-6**. The .NET Aspire dashboard and its features. + +The **.NET Aspire Dashboard** is a standalone tool that developers can use with their .NET Aspire projects. Here's what you need to know: + +1. **Purpose and features**: + + - The dashboard provides a sophisticated interface for comprehensive app monitoring and inspection. + - It allows you to track various aspects of your app in real time, including logs, traces, and environment configurations. + - Purpose-built to enhance the local development experience, it offers an insightful overview of your app's state and structure. + +1. **Integration with .NET Aspire projects**: + + - The dashboard is seamlessly integrated into the .NET Aspire app host. + - During development, it automatically launches when you start your project. + - It displays the app's resources and telemetry, helping you understand your app's behavior and performance. + +1. **Standalone mode**: + + - The .NET Aspire dashboard can also be used standalone, independent of the rest of .NET Aspire. + - It ships as a Docker image, providing a great UI for viewing telemetry. + - You can use it with any application by running the following Docker command: + + ```shell + docker run --rm -it -p 18888:18888 -p 4317:18889 -d --name aspire-dashboard mcr.microsoft.com/dotnet/aspire-dashboard:8.0.0 + ``` + +1. **Configuration**: + + - The dashboard's configuration includes frontend and OTLP addresses, resource service endpoint, authentication, and telemetry limits. + - Fine-tune it to suit your needs. + +1. **Architecture**: + + - The frontend is built using Microsoft's Fluent UI Blazor component library. + - Apps communicate with the dashboard using the OpenTelemetry Protocol (OTLP). + - A resource server provides information about app resources. This information includes resource lists, console logs, and command execution. + - The dashboard communicates with the resource server via gRPC. + +1. **Security**: + + - The dashboard offers powerful insights into your apps. + - It displays sensitive data about resources, including configuration, console logs, and in-depth telemetry. + +>[!div class="step-by-step"] +>[Previous](integrations.md) +>[Next](../architecting-distributed-cloud-native-applications/why-choose-distributed-architecture.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/orchestration.md b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/orchestration.md new file mode 100644 index 0000000000000..347bc35434e1b --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/orchestration.md @@ -0,0 +1,59 @@ +--- +title: .NET Aspire orchestration +description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | .NET Aspire orchestration +author: +ms.date: 05/30/2024 +--- + +# .NET Aspire orchestration + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +## What is orchestration? + +![A diagram with the four ideas behind orchestration. App model, discovery, references, and resources.](media/orchestration.png) + +**Figure 3-3**. The four key aspects of .NET Aspire orchestration. + +In a cloud-native environment, orchestrating microservices and other components within a distributed application can be complex. **Orchestration** refers to managing the configuration and interconnections of these components.
While .NET Aspire's orchestration is not intended to replace robust production systems like Kubernetes, it significantly enhances local development.
+
+## Key aspects of .NET Aspire orchestration
+
+1. **App Model Definition**:
+   - Every .NET Aspire app has an app model that outlines the resources in your app and their relationships.
+   - Resources include the projects, executables, containers, external services, and cloud resources your app depends on.
+
+1. **App Host Project**:
+   - Every .NET Aspire app has a designated **App Host Project**.
+   - The app host project orchestrates all projects within the .NET Aspire application.
+   - It runs and manages the entire app model.
+   - By convention, the app host project is named with the `*.AppHost` suffix.
+
+1. **Resource Composition**:
+   - In the app host project, you specify the resources that make up your application.
+   - Resources can be .NET projects, containers, executables, databases, caches, or cloud services.
+   - For example, in an app with two projects and a Redis cache, this code defines those resources and connects them:
+
+   ```csharp
+   var builder = DistributedApplication.CreateBuilder(args);
+
+   var cache = builder.AddRedis("cache");
+
+   // The Projects.* classes are generated from the app host's project
+   // references; the names here match the .NET Aspire starter template.
+   var apiservice = builder.AddProject<Projects.AspireApp_ApiService>("apiservice");
+
+   builder.AddProject<Projects.AspireApp_Web>("webfrontend")
+       .WithReference(cache)
+       .WithReference(apiservice)
+       .WithExternalHttpEndpoints();
+
+   builder.Build().Run();
+   ```
+
+1. **Service Discovery and Connection String Management**:
+   - The app host project injects connection strings and service discovery information.
+   - Abstractions simplify setup by eliminating low-level implementation details.
+   - For more information, see the topic **Service discovery**.
+
+>[!div class="step-by-step"]
+>[Previous](dot-net-aspire-overview.md)
+>[Next](service-discovery.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/service-discovery.md b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/service-discovery.md
new file mode 100644
index 0000000000000..b5953758af51b
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/introduction-dot-net-aspire/service-discovery.md
+---
+title: Service discovery
+description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Service discovery
+author:
+ms.date: 04/25/2024
+---
+
+# Service discovery
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+![A diagram showing some aspects of .NET Aspire service discovery.](media/service-discovery.png)
+
+**Figure 3-4**. Service discovery in .NET Aspire.
+
+In the world of distributed applications, services often need to communicate with each other over the network. Whether you're developing locally or in the cloud, ensuring that services can find and connect to each other is crucial. **.NET Aspire** provides a powerful service discovery mechanism that simplifies this process.
+
+## Why service discovery matters
+
+Imagine a microservices-based application with multiple components: authentication, catalog, basket, and frontend services. These services need to interact with each other, but their endpoints, such as different ports or URLs, can change dynamically. Service discovery helps address this challenge by providing a way for services to locate and communicate with one another.
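+
+Before digging into how references drive this, here's a minimal sketch of the consuming side, assuming the `Microsoft.Extensions.ServiceDiscovery` package; the `CatalogClient` type and the `catalog` service name are illustrative assumptions, not part of .NET Aspire itself:
+
+```csharp
+var builder = WebApplication.CreateBuilder(args);
+
+// Register the core service discovery services.
+builder.Services.AddServiceDiscovery();
+
+// Apply service discovery to every HttpClient the app creates, so that
+// logical names such as "catalog" resolve to concrete endpoints.
+builder.Services.ConfigureHttpClientDefaults(http => http.AddServiceDiscovery());
+
+// A logical base address; service discovery resolves "catalog" at run time.
+builder.Services.AddHttpClient<CatalogClient>(client =>
+    client.BaseAddress = new Uri("https://catalog"));
+
+var app = builder.Build();
+app.Run();
+
+// A hypothetical typed client that calls the discovered endpoint.
+public sealed class CatalogClient(HttpClient http)
+{
+    public Task<string> GetItemsAsync() => http.GetStringAsync("/api/items");
+}
+```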
+
+## Implicit service discovery by reference
+
+In .NET Aspire, service discovery configuration is added only for services that are **referenced** by a given project. Let's consider an example:
+
+```csharp
+var builder = DistributedApplication.CreateBuilder(args);
+
+// The Projects.* classes are generated from the app host's project references.
+var catalog = builder.AddProject<Projects.Catalog>("catalog");
+
+var basket = builder.AddProject<Projects.Basket>("basket");
+
+var frontend = builder.AddProject<Projects.Frontend>("frontend")
+    .WithReference(basket)
+    .WithReference(catalog)
+    .WithExternalHttpEndpoints();
+```
+
+You'll find this code in the app host project, usually in the _Program.cs_ file:
+
+- The `frontend` project receives references to both the `catalog` and `basket` projects.
+- The `.WithReference()` calls instruct the .NET Aspire application to pass service discovery information for the referenced projects (in this case `catalog` and `basket`) into the `frontend` project.
+
+## Named endpoints
+
+To make it easier to locate services, services can expose multiple **named endpoints**. These endpoints can be resolved by specifying the endpoint name in the host portion of the HTTP request URI. The format is `scheme://_endpointName.serviceName`. For instance:
+
+```csharp
+// BasketServiceClient and BasketDashboardClient are typed clients
+// defined elsewhere in the app.
+builder.Services.AddHttpClient<BasketServiceClient>(client =>
+    client.BaseAddress = new Uri("https://basket"));
+builder.Services.AddHttpClient<BasketDashboardClient>(client =>
+    client.BaseAddress = new Uri("https://_dashboard.basket"));
+```
+
+In this example:
+
+- We configure two typed `HttpClient` instances: one for the core basket service and another for the basket service's dashboard.
+- The `_dashboard` endpoint is resolved from the name `https://_dashboard.basket`.
+
+## Configuration-based endpoint resolver
+
+With the configuration-based endpoint resolver, named endpoints can be specified in configuration files. For example, in `appsettings.json`:
+
+```json
+{
+  "Services": {
+    "basket": {
+      "default": ["10.2.3.4:8080"],
+      "dashboard": ["10.2.3.4:9999"]
+    }
+  }
+}
+```
+
+In this JSON:
+
+- The `default` endpoint resolves the name `https://basket` to `10.2.3.4:8080`.
+- The `dashboard` endpoint resolves the name `https://_dashboard.basket` to `10.2.3.4:9999`.
+
+>[!div class="step-by-step"]
+>[Previous](orchestration.md)
+>[Next](integrations.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/candidate-apps-for-cloud-native.md b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/candidate-apps-for-cloud-native.md
new file mode 100644
index 0000000000000..fd840d79a27f5
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/candidate-apps-for-cloud-native.md
+---
+title: Candidate apps for cloud-native
+description: Learn which types of applications benefit from a cloud-native approach
+author: robvet
+ms.date: 12/14/2023
+---
+
+# Candidate apps for cloud-native
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+Think about the apps your organization needs to build. Then, look at the existing apps in your portfolio. How many of them warrant a cloud-native architecture?
+
+Applying cost/benefit analysis, there's a good chance some wouldn't support the effort. The cost of becoming cloud-native would far exceed the business value of the application.
+
+What type of application might be a good candidate for cloud-native?
+
+- Strategic enterprise systems that need to evolve business features constantly
+- An application that requires a high release velocity, with high confidence
+- A system where individual features must release *without* a full redeployment of the entire system
+- An application developed by multiple teams with expertise in different technology stacks
+- An application with components that must scale independently
+
+Smaller, less impactful line-of-business applications might fare well with a simple monolithic architecture hosted in a cloud PaaS environment.
+
+Then there are legacy systems. While we'd all like to build new applications, we're often responsible for modernizing legacy workloads that are critical to the business.
+
+## Modernizing legacy apps
+
+The free Microsoft e-book [Modernize existing .NET applications with Azure cloud and Windows Containers](https://dotnet.microsoft.com/download/thank-you/modernizing-existing-net-apps-ebook) provides guidance about migrating on-premises workloads into the cloud. Figure 1-10 shows that there isn't a single, one-size-fits-all strategy for modernizing legacy applications.
+
+![Strategies for migrating legacy workloads](./media/strategies-for-migrating-legacy-workloads.png)
+
+**Figure 1-10**. Strategies for migrating legacy workloads
+
+Monolithic apps that are non-critical might benefit from a quick **lift-and-shift** migration. Here, the on-premises workload is moved to a cloud-based virtual machine (VM), without changes. This approach uses the [IaaS (Infrastructure as a Service) model](https://azure.microsoft.com/resources/cloud-computing-dictionary/what-is-iaas/). Azure includes several tools such as [Azure Migrate](https://azure.microsoft.com/services/azure-migrate/), [Azure Site Recovery](https://azure.microsoft.com/services/site-recovery/), and [Azure Database Migration Service](https://azure.microsoft.com/campaigns/database-migration/) to help streamline the move. While this strategy can yield some cost savings, such applications typically weren't designed to unlock and leverage the benefits of cloud computing.
+
+Legacy apps that are critical to the business often benefit from an enhanced **Cloud Optimized** migration. This approach includes deployment optimizations that enable key cloud services, without changing the core architecture of the application. For example, you might [containerize](https://learn.microsoft.com/virtualization/windowscontainers/about/) the application and deploy it to a container orchestrator, like [Azure Kubernetes Service](https://azure.microsoft.com/services/kubernetes-service/), discussed later in this book. Once in the cloud, the application can consume cloud backing services such as databases, message queues, monitoring, and distributed caching.
+
+Finally, monolithic apps that provide strategic enterprise functions might benefit most from a cloud-native approach. This approach provides agility and velocity but comes at the cost of replatforming, rearchitecting, and rewriting code. Over time, a legacy application could be decomposed into microservices, containerized, and ultimately _replatformed_ into a cloud-native architecture.
+
+If you and your team believe a cloud-native approach is appropriate, you have to rationalize the decision with your organization. What exactly is the business problem that a cloud-native approach will solve? How would it align with business needs?
+
+- Rapid releases of features with increased confidence?
+- Fine-grained scalability and more efficient usage of resources?
+- Improved system resiliency?
+- Improved system performance?
+- More visibility into operations?
+- Blend development platforms and data stores to arrive at the best tool for the job?
+- Future-proof application investment?
+
+The right migration strategy depends on organizational priorities and the systems you're targeting. For many, it may be more cost-effective to cloud-optimize a monolithic application or add coarse-grained services to an N-Tier app. In these cases, you can still make full use of cloud PaaS capabilities like the ones offered by Azure App Service.
+
+## Summary
+
+In this chapter, we introduced cloud-native computing. We provided a definition along with the key capabilities that drive a cloud-native application. We looked at the types of applications that might justify this investment and effort. We've also had an introduction to the new .NET Aspire stack, which can help ease cloud-native development for .NET 8 and later solutions.
+
+With the introduction behind us, we now dive into a much more detailed look at cloud-native systems.
+
+### References
+
+- [Cloud Native Computing Foundation](https://www.cncf.io/)
+- [.NET Microservices: Architecture for Containerized .NET applications](https://dotnet.microsoft.com/download/thank-you/microservices-architecture-ebook)
+- [Microsoft Azure Well-Architected Framework](https://learn.microsoft.com/azure/well-architected/)
+- [Modernize existing .NET applications with Azure cloud and Windows Containers](https://dotnet.microsoft.com/download/thank-you/modernizing-existing-net-apps-ebook)
+- [Cloud Native Patterns by Cornelia Davis](https://www.manning.com/books/cloud-native-patterns)
+- [Cloud native applications: Ship faster, reduce risk, and grow your business](https://tanzu.vmware.com/cloud-native)
+- [Dapr documents](https://dapr.io/)
+- [Beyond the Twelve-Factor Application](https://content.pivotal.io/blog/beyond-the-twelve-factor-app)
+- [What is Infrastructure as Code](https://learn.microsoft.com/devops/deliver/what-is-infrastructure-as-code)
+- [Uber Engineering's Micro Deploy: Deploying Daily with Confidence](https://www.uber.com/blog/micro-deploy-code/)
+- [How Netflix Deploys Code](https://www.infoq.com/news/2013/06/netflix/)
+- [Overload Control for Scaling WeChat Microservices](https://www.cs.columbia.edu/~ruigu/papers/socc18-final100.pdf)
+
+>[!div class="step-by-step"]
+>[Previous](what-is-cloud-native.md)
+>[Next](../chpt2-introduction-containers-docker/what-are-containers.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/introduction-to-cloud-native-applications.md b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/introduction-to-cloud-native-applications.md
new file mode 100644
index 0000000000000..d60d482bcbd45
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/introduction-to-cloud-native-applications.md
+---
+title: Introduction to cloud-native applications
+description: Learn about cloud-native computing
+author:
+ms.date: 05/15/2024
+---
+
+# Introduction to cloud-native applications
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+Another day at the office, diligently working on "the next big thing."
+
+Your cellphone rings. It's your recruiter, the one who regularly contacts you with enticing new opportunities.
+
+However, this time it is different: a start-up with equity and substantial funding. The mention of cloud computing, microservices, and cutting-edge technology compels you to consider the offer seriously.
+
+A few weeks later, you find yourself employed, participating in a design session, and architecting a major eCommerce application. You're going to compete with the leading eCommerce platforms.
+
+How will you approach this challenge?
+
+If you follow the guidance from the past 15 years, you'll most likely build the system shown in Figure 1-1.
+
+![Traditional monolithic design](./media/monolithic-design.png)
+
+**Figure 1-1**. Traditional monolithic design
+
+You construct a large core application containing all of your domain logic. It includes modules such as Identity, Catalog, Ordering, and more. They directly communicate with each other within a single server process. The modules share a large relational database. The core exposes functionality through an HTML user interface and a mobile app.
+
+Congratulations! You've created a monolithic application.
+
+This isn't a bad app. Monoliths offer some distinct advantages. For example, they're easy to:
+
+- build
+- test
+- deploy
+- troubleshoot
+- scale vertically
+
+Many successful apps that exist today were created as monoliths. Your app is a hit and continues to evolve, iteration after iteration, adding more functionality.
+
+At some point, however, as the app continues to grow and develop, you begin to feel uncomfortable. You find yourself losing control of the application. As time goes on, the feeling becomes more intense, and you eventually enter a state known as the *fear cycle*:
+
+- The app has become so overwhelmingly complicated that no single person understands it.
+- You fear making changes because each change has unintended and costly side effects.
+- New features or fixes become tricky, time-consuming, and expensive to implement.
+- Each release becomes as small as possible and requires a full deployment of the entire application.
+- One unstable component can crash the entire system.
+- New technologies and frameworks aren't an option.
+- It's difficult to implement agile delivery methodologies.
+- Architectural erosion sets in as the code base deteriorates with never-ending "quick fixes."
+- Finally, the _consultants_ come in and tell you to rewrite it.
+
+Does this scenario sound familiar?
+
+Many organizations have addressed this monolithic fear cycle by adopting a cloud-native approach to building systems. Figure 1-2 shows the same system built applying cloud-native techniques and practices.
+
+![Cloud-Native Design](./media/cloud-native-design.png)
+
+**Figure 1-2**. Cloud-native design
+
+Note how the application is decomposed across a set of small, isolated microservices. Each service is self-contained and encapsulates its own code, data, and dependencies. Each is deployed in a software container and managed by a container orchestrator. Instead of a large relational database, each service owns its own datastore, the type of which varies based upon the data needs. Note how some services depend on a relational database but others on NoSQL databases. One service stores its state in a distributed cache. Note how all traffic routes through an API Gateway service that's responsible for routing traffic to the core back-end services and enforcing many cross-cutting concerns. Most importantly, the application takes full advantage of the scalability, availability, and resiliency features found in modern cloud platforms.
+
+### Cloud-native computing
+
+We just used the term _cloud native_. You might be wondering if that's a meaningful term, or just another industry buzzword concocted by software vendors to sell more product.
+
+Actually, _cloud native_ is a clear and specific term that describes a flexible way to write large-scale web applications that won't evolve into a fear cycle.
+
+Within a short time, cloud native has become a driving trend in the software industry. It's a new way to construct large, complex systems. The approach takes full advantage of modern software development practices, technologies, and cloud infrastructure. Cloud native changes the way you design, implement, deploy, and operationalize systems.
+
+Unlike much of the continuous hype that drives our industry, cloud native is a serious new approach to architecture. Consider the [Cloud Native Computing Foundation](https://www.cncf.io/) (CNCF), a consortium of over 400 major corporations. Its charter is to make cloud-native computing ubiquitous across technology and cloud stacks. As one of the most influential open-source groups, it hosts many of the fastest-growing open-source projects on GitHub. These projects include [Kubernetes](https://kubernetes.io/), [Prometheus](https://prometheus.io/), [Helm](https://helm.sh/), [Envoy](https://www.envoyproxy.io/), and [gRPC](https://grpc.io/).
+
+The CNCF fosters an ecosystem of open-source code and vendor neutrality. Following that lead, this book presents cloud-native principles, patterns, and best practices that are technology agnostic. At the same time, we discuss the services and infrastructure available in the Microsoft Azure cloud for constructing cloud-native systems, and we show how Microsoft's new .NET Aspire stack eases the task of building them.
+
+So, what exactly is cloud native? Let's explore this new world.
+ +>[!div class="step-by-step"] +>[Next](what-is-cloud-native.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/build-release-run-pipeline.png b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/build-release-run-pipeline.png new file mode 100644 index 0000000000000..08d240aa37419 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/build-release-run-pipeline.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/cloud-native-design.png b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/cloud-native-design.png new file mode 100644 index 0000000000000..5fd2dc26e107d Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/cloud-native-design.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/cloud-native-foundational-pillars.png b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/cloud-native-foundational-pillars.png new file mode 100644 index 0000000000000..914886743048a Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/cloud-native-foundational-pillars.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/common-backing-services.png b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/common-backing-services.png new file mode 100644 index 0000000000000..94d679ac06715 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/common-backing-services.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/dapr-high-level.png b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/dapr-high-level.png new file mode 100644 index 0000000000000..54abd341bccc6 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/dapr-high-level.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/hosting-mulitple-containers.png b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/hosting-mulitple-containers.png new file mode 100644 index 0000000000000..1035d50d9380a Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/hosting-mulitple-containers.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/monolithic-design.png b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/monolithic-design.png new file mode 100644 index 0000000000000..9e056db9b7ead Binary files /dev/null and 
b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/monolithic-design.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/monolithic-vs-microservices.png b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/monolithic-vs-microservices.png new file mode 100644 index 0000000000000..18270534b633b Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/monolithic-vs-microservices.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/strategies-for-migrating-legacy-workloads.png b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/strategies-for-migrating-legacy-workloads.png new file mode 100644 index 0000000000000..d037d4ac970f8 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/strategies-for-migrating-legacy-workloads.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/what-container-orchestrators-do.png b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/what-container-orchestrators-do.png new file mode 100644 index 0000000000000..958e841420f47 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/media/what-container-orchestrators-do.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/what-is-cloud-native.md b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/what-is-cloud-native.md new file mode 100644 index 0000000000000..1650ee93b45b6 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/introduction-to-cloud-native-development/what-is-cloud-native.md @@ -0,0 +1,351 @@ +--- +title: What is cloud-native? +description: Learn about the foundational pillars that provide the bedrock for cloud-native systems +author: robvet +ms.date: 12/14/2023 +--- + +# What is cloud-native? + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +Stop what you're doing and ask your colleagues to define the term _cloud-native_. There's a good chance you'll get several different answers. + +Let's start with a simple definition: + +> *Cloud-native architecture and technologies are an approach to designing, constructing, and operating workloads that are built in the cloud and take full advantage of the cloud computing model.* + +The [Cloud Native Computing Foundation](https://www.cncf.io/) provides the [official definition](https://github.com/cncf/toc/blob/main/DEFINITION.md): + +> *Cloud-native practices empower organizations to develop, build, and deploy workloads in computing environments (public, private, hybrid cloud) to meet their organizational needs at scale in a programmatic and repeatable manner. It is characterized by loosely coupled systems that interoperate in a manner that is secure, resilient, manageable, sustainable, and observable.* + +Cloud-native is about *speed* and *agility*. 
Business systems are evolving from enabling business capabilities to tools for strategic transformation that accelerate business velocity and growth. It's imperative to get new ideas to market immediately.
+
+At the same time, business systems have also become increasingly complex, with users demanding more. They expect rapid responsiveness, innovative features, and zero downtime. Performance problems, recurring errors, and the inability to move fast are no longer acceptable. Your users will visit your competitor. Cloud-native systems are designed to embrace rapid change, large scale, and resilience.
+
+Here are some companies that have implemented cloud-native techniques. Think about the speed, agility, and scalability they've achieved.
+
+| Company | Experience |
+| :-------- | :-------- |
+| [Netflix](https://www.infoq.com/news/2013/06/netflix/) | Has 600+ services in production. Deploys 100 times per day. |
+| [Uber](https://www.uber.com/blog/micro-deploy-code/) | Has 1,000+ services in production. Deploys several thousand times each week. |
+| [WeChat](https://www.cs.columbia.edu/~ruigu/papers/socc18-final100.pdf) | Has 3,000+ services in production. Deploys 1,000 times a day. |
+
+As you can see, Netflix, Uber, and WeChat expose cloud-native systems that consist of many independent services. This architectural style enables them to rapidly respond to market conditions. They instantaneously update small areas of a live, complex application, without a full redeployment. They individually scale services as needed.
+
+## The pillars of cloud-native
+
+The speed and agility of cloud-native architectures derive from many factors. Foremost is *cloud infrastructure*. But there's more: five other foundational pillars, shown in Figure 1-3, also provide the bedrock for cloud-native systems.
+
+![Cloud-native foundational pillars](./media/cloud-native-foundational-pillars.png)
+
+**Figure 1-3**. Cloud-native foundational pillars
+
+Let's take some time to better understand the significance of each pillar.
+
+## The cloud
+
+Cloud-native systems take full advantage of the cloud service model.
+
+Designed to thrive in a dynamic, virtualized cloud environment, these systems make extensive use of [Platform as a Service (PaaS)](https://azure.microsoft.com/overview/what-is-paas/) compute infrastructure and managed services. They treat the underlying infrastructure as *disposable* - provisioned in minutes and resized, scaled, or destroyed on demand through automation.
+
+Consider the difference between how we treat pets and cattle. In a traditional data center, servers are treated as pets: a physical machine, given a meaningful name, and cared for. You scale by adding more resources to the same machine (scaling up). If the server becomes sick, you nurse it back to health. Should the server become unavailable, everyone notices.
+
+The cattle service model is different. You provision each instance as a virtual machine or container. They're identical and assigned a system identifier such as Service-01, Service-02, and so on. You scale by creating more instances (scaling out). Nobody notices when an instance becomes unavailable.
+
+The cattle model embraces immutable infrastructure. Servers aren't repaired or modified. If one fails or requires updating, it's destroyed and a new one is provisioned. This is all done through automation.
+
+Cloud-native systems embrace the cattle service model. They continue to run as the infrastructure scales in or out. They don't care about the machines upon which they run.
+ +The Azure cloud platform supports this type of highly elastic infrastructure with automatic scaling, self-healing, and monitoring capabilities. + +## Modern design + +How would you design a cloud-native app? What would your architecture look like? To what principles, patterns, and best practices would you adhere? What infrastructure and operational concerns would be important? + +### The Twelve-Factor Application + +A widely accepted methodology for constructing cloud-based applications is the [Twelve-Factor Application](https://12factor.net/). It describes a set of principles and practices that developers follow to construct applications optimized for modern cloud environments. It pays special attention to portability across environments and declarative automation. + +While applicable to any web-based application, many practitioners consider Twelve-Factor a solid foundation for building cloud-native apps. Systems built upon these principles can deploy and scale rapidly and add features to react quickly to market changes. + +The following table highlights the Twelve-Factor methodology: + +| Factor | Explanation | +| :-------- | :-------- | +| 1 - Code Base | A single code base for each microservice, stored in its own repository. Tracked with version control, it can deploy to multiple environments, such as quality assurance, staging, and production. | +| 2 - Dependencies | Each microservice isolates and packages its own dependencies, embracing changes without impacting the entire system. | +| 3 - Configurations | Configuration information is moved out of the microservice and externalized through a configuration management tool outside of the code. The same deployment can propagate across environments with the correct configuration applied. | +| 4 - Backing Services | Ancillary resources (data stores, caches, message brokers) should be exposed through an addressable URL. Doing so decouples the resource from the application, so it's interchangeable. | +| 5 - Build, Release, Run | Each release must enforce a strict separation across the build, release, and run stages. Each should be tagged with a unique ID and support the ability to roll back. Modern CI/CD systems help fulfill this principle. | +| 6 - Processes | Each microservice should execute in its own process, isolated from other running services. Externalize required state to a backing service such as a distributed cache or data store. | +| 7 - Port Binding | Each microservice should be self-contained with its interfaces and functionality exposed on its own port. Doing so provides isolation from other microservices. | +| 8 - Concurrency | When capacity needs to increase, scale out services horizontally across multiple identical processes (copies) as opposed to scaling-up a single large instance on the most powerful machine available. Develop the application to be concurrent so that scaling out in cloud environments is seamless. | +| 9 - Disposability | Service instances should be disposable. Favor fast startup to increase scalability opportunities and graceful shutdowns to leave the system in a correct state. Docker containers along with an orchestrator inherently satisfy this requirement. | +| 10 - Dev/Prod Parity | Keep environments across the application lifecycle as similar as possible. Here, the adoption of containers can greatly contribute by promoting the same execution environment. | +| 11 - Logging | Treat logs generated by microservices as event streams. Process them with an event aggregator. 
Propagate log data to data-mining and log management tools like Azure Monitor or Splunk and eventually to long-term archives. |
+| 12 - Admin Processes | Run administrative tasks, such as data cleanup or computing analytics, as one-off processes. Use independent tools to invoke these tasks from the production environment, but separately from the application. |
+
+In the book [Beyond the Twelve-Factor App](https://content.pivotal.io/blog/beyond-the-twelve-factor-app), author Kevin Hoffman details each of the original 12 factors. Additionally, he discusses three extra factors that reflect today's modern cloud application design:
+
+| New Factor | Explanation |
+| :-------- | :-------- |
+| 13 - API First | Make everything a service. Assume your code will be consumed by a front-end client, gateway, or another service. |
+| 14 - Telemetry | On a workstation, you have deep visibility into your application and its behavior. In the cloud, you don't. Make sure your design includes the collection of monitoring, domain-specific, and health data. |
+| 15 - Authentication and Authorization | Implement identity from the start. Consider [RBAC (role-based access control)](https://learn.microsoft.com/azure/role-based-access-control/overview) features available in public clouds. |
+
+We'll refer to many of the 12+ factors in this chapter and throughout the book.
+
+### Azure Well-Architected Framework
+
+Designing and deploying cloud-based workloads can be challenging, especially when implementing cloud-native architecture. Microsoft provides industry-standard best practices to help you and your team deliver robust cloud solutions.
+
+The [Azure Well-Architected Framework](https://learn.microsoft.com/azure/well-architected/) provides a set of guiding tenets that can be used to improve the quality of a cloud-native workload. The framework consists of five pillars of architecture excellence:
+
+| Tenets | Description |
+| :-------- | :-------- |
+| [Cost management](https://learn.microsoft.com/azure/well-architected/cost-optimization/) | Focus on generating incremental value early. Apply *Build-Measure-Learn* principles to accelerate time to market while avoiding capital-intensive solutions. Using a pay-as-you-go strategy, invest as you scale out, rather than delivering a large investment up front. |
+| [Operational excellence](https://learn.microsoft.com/azure/well-architected/operational-excellence/) | Automate the environment and operations to increase speed and reduce human error. Roll problem updates back or forward quickly. Implement monitoring and diagnostics from the start. |
+| [Performance efficiency](https://learn.microsoft.com/azure/well-architected/performance-efficiency/) | Efficiently meet demands placed on your workloads. Favor horizontal scaling (scaling out) and design it into your systems. Continually conduct performance and load testing to identify potential bottlenecks. |
+| [Reliability](https://learn.microsoft.com/azure/well-architected/reliability/) | Build workloads that are both resilient and available. Resiliency enables workloads to recover from failures and continue functioning. Availability ensures users can access your workload at all times. Design applications to expect failures and recover from them. |
+| [Security](https://learn.microsoft.com/azure/well-architected/security/) | Implement security across the entire lifecycle of an application, from design and implementation to deployment and operations.
Pay close attention to identity management, infrastructure access, application security, and data sovereignty and encryption. |
+
+To get started, Microsoft provides a set of [online assessments](https://learn.microsoft.com/assessments/azure-architecture-review/) to help you assess your current cloud workloads against the five well-architected pillars.
+
+## Microservices
+
+Cloud-native systems embrace microservices, a popular architectural style for constructing modern applications.
+
+Built as a distributed set of small, independent services that interact through a shared fabric, microservices share the following characteristics:
+
+- Each implements a specific business capability within a larger domain context.
+- Each is developed autonomously and can be deployed independently.
+- Each is self-contained and encapsulates its own data storage technology, dependencies, and programming platform.
+- Each runs in its own process and communicates with others using standard communication protocols such as HTTP/HTTPS, gRPC, WebSockets, or [AMQP](https://en.wikipedia.org/wiki/Advanced_Message_Queuing_Protocol).
+- They come together to form the complete application.
+
+Contrast these characteristics with a monolithic application approach. A monolith is composed of a layered architecture, which executes in a single process. It typically consumes a relational database. By comparison, the microservice approach segregates functionality into independent services, each with its own logic, state, and data. Each microservice hosts its own datastore.
+
+Note that microservices promote the **Processes** principle from the [Twelve-Factor Application](https://12factor.net/), discussed earlier in the chapter.
+
+> *Factor \#6 specifies "Each microservice should execute in its own process, isolated from other running services."*
+
+### Why microservices?
+
+Microservices provide agility.
+
+Earlier in the chapter, we compared an eCommerce application built as a monolith to one built with microservices. In the example, we saw some clear benefits:
+
+- Each microservice has an autonomous lifecycle and can evolve independently and deploy frequently. You don't have to wait for a quarterly release to deploy a new feature or update. You can update a small area of a live application with less risk of disrupting the entire system. You can deploy the update without a full redeployment of the entire application.
+- Each microservice can scale independently. Instead of scaling the entire application as a single unit, you scale out only those services that require more processing power to meet desired performance levels and service-level agreements. Fine-grained scaling provides for greater control of your system and helps reduce overall costs as you scale portions of your system, not everything.
+
+An excellent reference guide for understanding microservices is [.NET Microservices: Architecture for Containerized .NET Applications](https://dotnet.microsoft.com/download/thank-you/microservices-architecture-ebook). The book takes a deep dive into microservices design and architecture. It's a companion for the [eShop reference application](https://github.com/dotnet/eShop), a full-stack microservice reference architecture that's available as a free download from Microsoft.
+
+### Developing microservices
+
+You can build microservices on any modern development platform.
+
+The Microsoft .NET platform is an excellent choice. Free and open source, it has many built-in features that simplify microservice development.
.NET is cross-platform. Applications can be built and run on Windows, macOS, and most flavors of Linux.
+
+.NET is highly performant and has scored well in comparison to Node.js and other competing platforms. Interestingly, [TechEmpower](https://www.techempower.com/) conducted an extensive set of [performance benchmarks](https://www.techempower.com/benchmarks/#section=data-r22&hw=ph&test=plaintext) across many web application platforms and frameworks. .NET scored in the top 3 - well above Node.js.
+
+[.NET](https://github.com/dotnet/core) is maintained by Microsoft and the .NET community on [GitHub](https://github.com/dotnet).
+
+.NET 8 and later versions also include a technology stack that's specifically designed to help you build cloud-native apps easily and rapidly: .NET Aspire. It's not the only way to build cloud-native apps in .NET, but it does address the common challenges we're about to examine.
+
+### Cloud-native challenges
+
+While distributed cloud-native microservices can provide immense agility and speed, they present many challenges:
+
+#### *Communication*
+
+How will front-end client applications communicate with back-end core microservices? Will you allow direct communication? Or, might you abstract the back-end microservices with a gateway facade that provides flexibility, control, and security?
+
+How will back-end core microservices communicate with each other? Will you allow direct HTTP calls that can increase coupling and impact performance and agility? Or might you consider decoupled messaging with queue and topic technologies?
+
+Communication is covered in the [Cloud-native communication patterns](../communication-patterns/communication-patterns.md) chapter.
+
+#### *Resiliency*
+
+A microservices architecture moves your system from in-process to out-of-process network communication. In a distributed architecture, what happens when Service B isn't responding to a network call from Service A? Or, what happens when Service C becomes temporarily unavailable and other services calling it become blocked?
+
+Resiliency is covered in the [Cloud-native resiliency](../cloud-native-resiliency/cloud-native-resiliency.md) chapter.
+
+#### *Distributed Data*
+
+By design, each microservice encapsulates its own data, exposing operations via its public interface. Given that, how do you query data or implement a transaction across multiple services?
+
+Distributed data is covered in the [Cloud-native data patterns](../chpt8-data-patterns/distributed-data.md) chapter.
+
+#### *Secrets*
+
+How will your microservices securely store and manage secrets and sensitive configuration data?
+
+Secrets are covered in detail in [Cloud-native security](../cloud-native-identity/cloud-native-security.md).
+
+#### *Observability*
+
+In cloud-native environments, where services are ephemeral and interactions complex, traditional monitoring tools often fall short. This necessitates advanced observability solutions that can handle high-velocity data, provide actionable insights, and support the diverse technology stack typical of microservices architectures.
+
+#### *Manageability*
+
+Cloud-native apps demand robust automation for deployment, scaling, and recovery processes. The challenge lies in creating management tools that can adapt to rapid changes without human intervention, ensuring consistent performance and reliability.
+
+## Containers
+
+It's natural to hear the term *container* mentioned in any cloud-native conversation.
In the book, [Cloud Native Patterns](https://www.manning.com/books/cloud-native-patterns), author Cornelia Davis observes that, "Containers are a great enabler of cloud-native software." The Cloud Native Computing Foundation places microservice containerization as the first step in their [Cloud-Native Trail Map](https://raw.githubusercontent.com/cncf/trailmap/master/CNCF_TrailMap_latest.png) - guidance for enterprises beginning their cloud-native journey. + +Containerizing a microservice is simple and straightforward. The code, its dependencies, and runtime are packaged into a binary called a [container image](https://docs.docker.com/glossary/?term=image). Images are stored in a container registry, which acts as a repository or library for images. A registry can be located on your development computer, in your data center, or in a public cloud. Docker itself maintains a public registry at [Docker Hub](https://hub.docker.com/). The Azure cloud features a private [container registry](https://azure.microsoft.com/services/container-registry/) to store container images close to the cloud applications that will run them. + +When an application starts or scales, you transform the container image into a running container instance. The instance runs on any computer that has a [container runtime](https://kubernetes.io/docs/setup/production-environment/container-runtimes/) engine installed. You can have as many instances of the containerized service as needed. + +Figure 1-6 shows three different microservices, each in its own container, all running on a single host. + +![Multiple containers running on a container host](./media/hosting-mulitple-containers.png) + +**Figure 1-6**. Multiple containers running on a container host + +Note how each container maintains its own set of dependencies and runtime, which can be different from one another. Here, we see different versions of the **Product** microservice running on the same host. Each container shares a slice of the underlying host operating system, memory, and processor, but is isolated from the others. + +Note how well the container model embraces the **Dependencies** principle from the [Twelve-Factor Application](https://12factor.net/). + +> *Factor \#2 specifies that "Each microservice isolates and packages its own dependencies, embracing changes without impacting the entire system."* + +Containers support both Linux and Windows workloads. The Azure cloud openly embraces both. Interestingly, it's Linux, not Windows Server, that has become the more popular operating system in Azure. + +While several container vendors exist, [Docker](https://www.docker.com/) has captured the lion's share of the market. The company has been driving the software container movement. It has become the de facto standard for packaging, deploying, and running cloud-native applications. + +### Why containers? + +Containers provide portability and guarantee consistency across environments. By encapsulating everything into a single package, you *isolate* the microservice and its dependencies from the underlying infrastructure. + +You can deploy the container in any environment that hosts the Docker runtime engine. Containerized workloads also eliminate the expense of pre-configuring each environment with frameworks, software libraries, and runtime engines. + +By sharing the underlying operating system and host resources, a container has a much smaller footprint than a full virtual machine. 
The smaller size increases the *density*, or number of microservices, that a given host can run at one time.
+
+### Container orchestration
+
+While tools such as Docker create images and run containers, you also need tools to manage them. Container management is done with a special software program called a **container orchestrator**. When operating at scale with many independent running containers, orchestration is essential.
+
+Figure 1-7 shows management tasks that container orchestrators automate.
+
+![What container orchestrators do](./media/what-container-orchestrators-do.png)
+
+**Figure 1-7**. What container orchestrators do
+
+The following table describes common orchestration tasks.
+
+| Tasks | Explanation |
+| :-------- | :-------- |
+| Scheduling | Automatically provision container instances.|
+| Affinity/anti-affinity | Provision containers nearby or far apart from each other, helping availability and performance. |
+| Health monitoring | Automatically detect and correct failures.|
+| Failover | Automatically reprovision a failed instance to a healthy machine.|
+| Scaling | Automatically add or remove a container instance to meet demand.|
+| Networking | Manage a networking overlay for container communication.|
+| Service Discovery | Enable containers to locate each other.|
+| Rolling Upgrades | Coordinate incremental upgrades with zero downtime deployment. Automatically roll back problematic changes.|
+
+Note how container orchestrators embrace the **Disposability** and **Concurrency** principles from the [Twelve-Factor Application](https://12factor.net/).
+
+> *Factor \#9 specifies that "Service instances should be disposable, favoring fast startups to increase scalability opportunities and graceful shutdowns to leave the system in a correct state."* Docker containers along with an orchestrator inherently satisfy this requirement.
+
+> *Factor \#8 specifies that "Services scale out across a large number of small identical processes (copies) as opposed to scaling-up a single large instance on the most powerful machine available."*
+
+While several container orchestrators exist, [Kubernetes](https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/) has become the de facto standard for the cloud-native world. It's a portable, extensible, open-source platform for managing containerized workloads.
+
+You could host your own instance of Kubernetes, but then you'd be responsible for provisioning and managing its resources, which can be complex. The Azure cloud features Kubernetes as a managed service. Both [Azure Kubernetes Service (AKS)](https://azure.microsoft.com/services/kubernetes-service/) and [Azure Red Hat OpenShift (ARO)](https://azure.microsoft.com/services/openshift/) enable you to leverage the features and power of Kubernetes fully as a managed service, without having to install and maintain it.
+
+## Backing services
+
+Cloud-native systems depend upon many different ancillary resources, such as data stores, message brokers, monitoring, and identity services. These services are known as [backing services](https://12factor.net/backing-services).
+
+Figure 1-8 shows many common backing services that cloud-native systems consume.
+
+![Common backing services](./media/common-backing-services.png)
+
+**Figure 1-8**. Common backing services
+
+You could host your own backing services, but then you'd be responsible for licensing, provisioning, and managing those resources.
+
+Cloud providers offer a rich assortment of *managed backing services*.
Instead of owning the service, you simply consume it. The cloud provider operates the resource at scale and bears the responsibility for performance, security, and maintenance. Monitoring, redundancy, and availability are built into the service. Providers guarantee service-level performance and fully support their managed services - open a ticket and they fix your issue.
+
+Cloud-native systems favor managed backing services from cloud vendors. The savings in time and labor can be significant. The operational risk of hosting your own and experiencing trouble can get expensive fast.
+
+A best practice is to treat a backing service as an **attached resource**, dynamically bound to a microservice with configuration information (a URL and credentials) stored in an external configuration. This guidance is spelled out in the [Twelve-Factor Application](https://12factor.net/), discussed earlier in the chapter.
+
+>*Factor \#4* specifies that backing services "should be exposed via an addressable URL. Doing so decouples the resource from the application, enabling it to be interchangeable."
+
+>*Factor \#3* specifies that "Configuration information is moved out of the microservice and externalized through a configuration management tool outside of the code."
+
+With this pattern, a backing service can be attached and detached without code changes. You might promote a microservice from QA to a staging environment. You update the microservice configuration to point to the backing services in staging and inject the settings into your container through an environment variable.
+
+Cloud vendors provide client libraries and APIs for you to communicate with their proprietary backing services. These libraries encapsulate the proprietary plumbing and complexity. However, communicating directly with these APIs will tightly couple your code to that specific backing service. It's a widely accepted practice to insulate the implementation details of the vendor API. Introduce an intermediation layer, or intermediate API, exposing generic operations to your service code and wrap the vendor code inside it. This loose coupling enables you to swap out one backing service for another or move your code to a different cloud environment without having to make changes to the mainline service code. Dapr, discussed earlier, follows this model with its set of [prebuilt building blocks](https://docs.dapr.io/developing-applications/building-blocks/).
+
+As a final thought, backing services also promote the **Statelessness** principle from the [Twelve-Factor Application](https://12factor.net/), discussed earlier in the chapter.
+
+>*Factor \#6* specifies that "Each microservice should execute in its own process, isolated from other running services. Externalize required state to a backing service such as a distributed cache or data store."
+
+Backing services are discussed in [Cloud-native data patterns](../chpt8-data-patterns/distributed-data.md) and [Cloud-native communication patterns](../communication-patterns/communication-patterns.md).
+
+## Automation
+
+As you've seen, cloud-native systems embrace microservices, containers, and modern system design to achieve speed and agility. But that's only part of the story. How do you provision the cloud environments upon which these systems run? How do you rapidly deploy app features and updates? How do you round out the full picture?
+
+This is where the widely accepted practice of [Infrastructure as Code](https://learn.microsoft.com/devops/deliver/what-is-infrastructure-as-code), or IaC, comes in.
+
+With IaC, you automate platform provisioning and application deployment. You essentially apply software engineering practices such as testing and versioning to your DevOps practices. Your infrastructure and deployments are automated, consistent, and repeatable.
+
+### Automating infrastructure
+
+Tools like [Azure Resource Manager](https://learn.microsoft.com/azure/azure-resource-manager/management/overview), [Azure Bicep](https://learn.microsoft.com/azure/azure-resource-manager/bicep/overview), [Terraform](https://www.terraform.io/) from HashiCorp, and the [Azure CLI](https://learn.microsoft.com/cli/azure/) enable you to declaratively script the cloud infrastructure you require. Resource names, locations, capacities, and secrets are parameterized and dynamic. The script is versioned and checked into source control as an artifact of your project. You invoke the script to provision a consistent and repeatable infrastructure across system environments, such as QA, staging, and production.
+
+Under the hood, IaC is idempotent, meaning that you can run the same script over and over without side effects. If the team needs to make a change, they edit and rerun the script. Only the updated resources are affected.
+
+In the article [What is Infrastructure as Code](https://learn.microsoft.com/devops/deliver/what-is-infrastructure-as-code), author Sam Guckenheimer describes how "Teams who implement IaC can deliver stable environments rapidly and at scale. They avoid manual configuration of environments and enforce consistency by representing the desired state of their environments via code. Infrastructure deployments with IaC are repeatable and prevent runtime issues caused by configuration drift or missing dependencies. DevOps teams can work together with a unified set of practices and tools to deliver applications and their supporting infrastructure rapidly, reliably, and at scale."
+
+### Automating deployments
+
+The [Twelve-Factor Application](https://12factor.net/), discussed earlier, calls for separate steps when transforming completed code into a running application.
+
+> *Factor \#5* specifies that "Each release must enforce a strict separation across the build, release and run stages. Each should be tagged with a unique ID and support the ability to roll back."
+
+Modern CI/CD systems help fulfill this principle. They provide separate build and delivery steps that help ensure consistent and quality code that's readily available to users.
+
+Figure 1-9 shows the separation across the deployment process.
+
+![Deployment steps in a CI/CD pipeline](./media/build-release-run-pipeline.png)
+
+**Figure 1-9**. Deployment steps in a CI/CD Pipeline
+
+In the previous figure, pay special attention to separation of tasks:
+
+1. The developer constructs a feature in their development environment, iterating through what is called the "inner loop" of code, run, and debug.
+2. When complete, that code is *pushed* into a code repository, such as GitHub, Azure DevOps, or Bitbucket.
+3. The push triggers a build stage that transforms the code into a binary artifact. The work is implemented with a [Continuous Integration (CI)](https://martinfowler.com/articles/continuousIntegration.html) pipeline. It automatically builds, tests, and packages the application.
+4. The release stage picks up the binary artifact, applies external application and environment configuration information, and produces an immutable release. The release is deployed to a specified environment.
The work is implemented with a [Continuous Delivery (CD)](https://martinfowler.com/bliki/ContinuousDelivery.html) pipeline. Each release should be identifiable. You can say, "This deployment is running Release 2.1.1 of the application." +5. Finally, the released feature is run in the target execution environment. Releases are immutable meaning that any change must create a new release. + +Applying these practices, organizations have radically evolved how they ship software. Many have moved from quarterly releases to on-demand updates. The goal is to catch problems early in the development cycle when they're less expensive to fix. The longer the duration between integrations, the more expensive problems become to resolve. With consistency in the integration process, teams can commit code changes more frequently, leading to better collaboration and software quality. + +## How does .NET Aspire help? + +You should understand that you can develop cloud-native systems using many different languages and technology stacks. You don't have to use .NET and you don't have to use .NET Aspire. However, as we're starting to understand, a cloud-native app that uses microservices, containers, and backing services is complex to write and you have to overcome several technical challenges. + +.NET Aspire helps by solving many of the difficulties common to all cloud-native apps. + +You can either start by creating a new solution based on one of the .NET Aspire templates or add .NET Aspire orchestration to an existing solution. In such a solution: + +- Each microservice is a .NET project and will be run in a container. During development, containers are run in Docker Desktop or Podman on your local computer. You can deploy containers to Docker hosts, Kubernetes clusters, or other container orchestrators. +- Observability is built-in with support for the OpenTelemetry SDK. You can observe the behavior of your app in the .NET Aspire dashboard or send the data to other tools. +- You can connect to a range of backing services by using .NET Aspire integrations. There are integrations for common message brokers, database systems, secrets stores, and caches. Each integration includes resiliency enabled by default. + +> *You don't have to use Azure to host a .NET Aspire solution. However, Azure Container Apps (ACA) or Azure Kubernetes Service (AKS) are scalable and robust choices and .NET Aspire includes several built-in integrations that work with Azure products, such as Azure Service Bus and Azure Cosmos DB.* + +.NET Aspire is an opinionated stack - that means that it imposes a particular style of solution on you. If you disagree, there are other possible approaches but you'll have to code them yourself. A software architect who's already built cloud-native systems may find that disagreeable but for others who can adapt to the .NET Aspire way, the convenience of .NET Aspire is worth it. + +We'll learn about .NET Aspire in more detail throughout this book. 
+ +>[!div class="step-by-step"] +>[Previous](introduction-to-cloud-native-applications.md) +>[Next](candidate-apps-for-cloud-native.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/aspire-dashboard.md b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/aspire-dashboard.md new file mode 100644 index 0000000000000..4e74a3b93eb2b --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/aspire-dashboard.md @@ -0,0 +1,48 @@ +--- +title: .NET Aspire dashboard +description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | .NET Aspire dashboard +ms.date: 04/06/2022 +--- + +# .NET Aspire dashboard + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +You can use the .NET Aspire dashboard as your development and test watchdog to monitor the various aspects of your .NET cloud-native app. This includes detailed information about logs, traces, and environment configuration in real-time. + +The dashboard is available after you add .NET Aspire to your solution. The dashboard launches automatically alongside your other services when you start your solution. On initial launch, it shows the .NET Aspire app's resources: + +![A screenshot of the .NET Aspire dashboard.](media/aspire-dashboard-projects.png) + +**Figure 10-9**. The .NET Aspire dashboard. + +There are five main sections in the dashboard: + +- **Resources**: Shows the projects in your solution, their state, endpoints, logs, and more +- **Console**: Shows you any console output from each of the projects or containers +- **Structured**: Uses OpenTelemetry to show semantic logs listed by log event +- **Traces**: Show distributed traces between your projects different components +- **Metrics**: Allows you to drill down into any services metrics + +## Standalone mode + +You can also launch the dashboard in standalone mode, to use it with any .NET app that you've added OpenTelemetry instrumentation to. That's because it's made available as a Docker image. + +The standalone dashboard is meant to be used during development or as a short-term diagnostic tool. The reason is that the dashboard persists telemetry in-memory, so data isn't persisted between restarts. + +Data shown in the dashboard can potentially be sensitive. For instance, configuration can include secrets in environment variables, telemetry can show sensitive runtime information, and more. + +To block untrusted services or apps from sending unauthorized or malicious telemetry to your dashboard, your OpenTelemetry endpoint is automatically secured when you use the dashboard as part of the .NET Aspire stack. However, when using standalone mode, you'll need to configure this yourself by running the following command: + +```bash +docker run --rm -it -p 18888:18888 -p 4317:18889 -d --name aspire-dashboard \ + -e DASHBOARD__OTLP__AUTHMODE='ApiKey' \ + -e DASHBOARD__OTLP__PRIMARYAPIKEY='{Your_APIKEY}' \ + mcr.microsoft.com/dotnet/aspire-dashboard:8.0 +``` + +You'll need to replace `{Your_APIKEY}` with your real API key. The dashboard will use this key to validate any telemetry it receives. For this reason, your app must have also been configured to send telemetry using this key. 
+ +>[!div class="step-by-step"] +>[Previous](health-checks-probes.md) +>[Next](observability-platforms.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/azure-monitor.md b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/azure-monitor.md new file mode 100644 index 0000000000000..791d3675b9f19 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/azure-monitor.md @@ -0,0 +1,85 @@ +--- +title: Azure Monitor +description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Azure Monitor +ms.date: 04/06/2022 +--- + +# Azure Monitor + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +Azure Monitor is an umbrella name for a collection of cloud tools designed to provide visibility into the state of your system. It helps you understand how your cloud-native services are performing and actively identifies issues affecting them. Figure 10-10 presents a high level of view of Azure Monitor. + +![A diagram of a high-level view of Azure Monitor.](media/azure-monitor.png) + +**Figure 10-10**. High-level view of Azure Monitor. + +## Gathering logs and metrics + +The first step in any monitoring solution is to gather as much data as possible. The more data gathered, the deeper the insights. Instrumenting systems has traditionally been difficult. Simple Network Management Protocol (SNMP) was the gold standard protocol for collecting machine level information, but it required a great deal of knowledge and configuration. Fortunately, much of this hard work has been eliminated as the most common metrics are gathered automatically by Azure Monitor. + +Application level metrics and events aren't possible to instrument automatically because they're specific to the application being deployed. In order to gather these metrics, there are [SDKs and APIs available](/azure/azure-monitor/app/api-custom-events-metrics) to report such information directly, such as when a customer signs up or completes an order. Exceptions can also be captured and reported back into Azure Monitor through Application Insights. The SDKs support every language found in Cloud Native Applications including Go, Python, JavaScript, and the .NET languages. + +The ultimate goal of gathering information about the state of your application is to ensure that your end users have a good experience. What better way to tell if users are experiencing issues than doing [outside-in web tests](/azure/azure-monitor/app/monitor-web-app-availability)? These tests can be as simple as pinging your website from locations around the world or as involved as having agents log into the site and simulate user actions. + +## Reporting data + +Once the data is gathered, it can be manipulated, summarized, and plotted into charts, which allow users to see instantly when there are problems. These charts can be gathered into dashboards or into workbooks; a multi-page report designed to tell a story about some aspect of the system. + +No modern application would be complete without some artificial intelligence or machine learning. To this end, data [can be passed](https://www.youtube.com/watch?v=Cuza-I1g9tw) to the various machine learning tools in Azure to allow you to extract trends and information that would otherwise be hidden. + +Application Insights provides a powerful SQL-like query language called *Kusto* that can query records, summarize them, and even plot charts. 
For example, the following query will locate all records for the month of November 2024, group them by state, and plot the top 10 as a pie chart. + +```kusto +StormEvents +| where StartTime >= datetime(2024-11-01) and StartTime < datetime(2024-12-01) +| summarize count() by State +| top 10 by count_ +| render piechart +``` + +These are the results of the previous Application Insights Query. + +![A screenshot of Application Insights query results.](media/application_insights_example.png) + +**Figure 10-11**. Application Insights query results drawn as a pie chart. + +There is a [playground for experimenting with Kusto](https://dataexplorer.azure.com/clusters/help/databases/Samples) queries. Reading [sample queries](/azure/kusto/query/samples) can also be instructive. + +## Dashboards + +There are several different dashboard technologies that may be used to surface the information from Azure Monitor. Perhaps the simplest is to run queries in Application Insights and [plot the data into a chart](/azure/azure-monitor/learn/tutorial-app-dashboards). + +![An example screenshot of Application Insights charts embedded in the main Azure Dashboard.](media/azure_dashboard.png) + +**Figure 10-12**. An example Application Insights chart embedded in the main Azure Dashboard. + +These charts can then be embedded in the Azure portal proper through use of the dashboard feature. For users with more exacting requirements, such as being able to drill down into several tiers of data, Azure Monitor data is available to [Power BI](https://powerbi.microsoft.com/). Power BI is an industry-leading, enterprise class, business intelligence tool that can aggregate data from many different data sources. + +![A screenshot of the Power BI dashboard.](media/powerbidashboard.png) + +**Figure 10-13**. An example Power BI dashboard. + +## Alerts + +Sometimes, having data dashboards is insufficient. If nobody is awake to watch the dashboards, then it can still be many hours before a problem is addressed, or even detected. To this end, Azure Monitor also provides a top notch [alerting solution](/azure/azure-monitor/platform/alerts-overview). Alerts can be triggered by a wide range of conditions including: + +- Metric values +- Log search queries +- Activity log events +- The health of the underlying Azure platform +- Tests for web site availability + +When triggered, the alerts can perform a wide variety of tasks. On the simple side, the alerts may just send an e-mail notification to a mailing list or a text message to an individual. More involved alerts might trigger a workflow in a tool such as PagerDuty, which is aware of who is on call for a particular application. Alerts can trigger actions in [Microsoft Flow](https://flow.microsoft.com/), unlocking near limitless possibilities for workflows. + +As common causes of alerts are identified, the alerts can be enhanced with details about the common causes of the alerts and the steps to take to resolve them. Highly mature cloud-native application deployments may opt to kick off self-healing tasks, which perform actions such as removing failing nodes from a scale set or triggering an autoscaling activity. Eventually it may no longer be necessary to wake up on-call personnel at 2AM to resolve a live-site issue as the system will be able to adjust itself to compensate or at least limp along until somebody arrives at work the next morning. + +Azure Monitor automatically leverages machine learning to understand the normal operating parameters of deployed applications. 
This approach enables it to detect services that are operating outside of their normal parameters. For instance, the typical weekday traffic on the site might be 10,000 requests per minute. And then, on a given week, suddenly the number of requests hits a highly unusual 20,000 requests per minute. [Smart Detection](/azure/azure-monitor/app/proactive-diagnostics) will notice this deviation from the norm and trigger an alert. At the same time, the trend analysis is smart enough to avoid firing false positives when the traffic load is expected. + +## References + +- [Azure Monitor](/azure/azure-monitor/overview) + +>[!div class="step-by-step"] +>[Previous](observability-platforms.md) +>[Next](../cloud-native-identity/cloud-native-security.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/health-checks-probes.md b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/health-checks-probes.md new file mode 100644 index 0000000000000..6ffaa11c93a4e --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/health-checks-probes.md @@ -0,0 +1,163 @@ +--- +title: Health checks and probes +description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Health checks and probes +ms.date: 04/06/2022 +--- + +# Health checks and probes + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +Health monitoring can allow near-real-time information about the state of your containers and microservices. Health monitoring is critical to multiple aspects of operating microservices and is especially important when orchestrators perform partial application upgrades in phases. + +Microservices-based applications often use heartbeats or health checks to enable their performance monitors, schedulers, and orchestrators to keep track of the multitude of services. If services can't send some sort of "I'm alive" signal, either on demand or on a schedule, your application might face risks when you deploy updates, or it might just detect failures too late and not be able to stop cascading failures that can end up in major outages. + +## Implement health checks in ASP.NET Core services + +When developing an ASP.NET Core microservice or web application, you can use the built-in health checks feature in ASP .NET Core ([Microsoft.Extensions.Diagnostics.HealthChecks](https://www.nuget.org/packages/Microsoft.Extensions.Diagnostics.HealthChecks)). Like many ASP.NET Core features, health checks come with a set of services and a middleware. + +Health check services and middleware are easy to use and provide capabilities that let you validate if any external resource needed for your application (like a SQL Server database or a remote API) is working properly. When you use this feature, you can also define what it means for the resource to be healthy. + +To use this feature effectively, you first need to configure services in your microservices. Second, you need a front-end application that queries for the health reports. That front-end application could be a custom reporting application, or it could be an orchestrator itself that can react accordingly to the health states. 
+ +### Use the health checks feature in your back-end ASP.NET microservices + +In this section, you'll learn how to implement the HealthChecks feature in a sample ASP.NET Core Web API application when using the [Microsoft.Extensions.Diagnostics.HealthChecks](https://www.nuget.org/packages/Microsoft.Extensions.Diagnostics.HealthChecks) package. + +To begin, you need to define what constitutes a healthy status for each microservice. In the sample application, we define the microservice is healthy if its API is accessible via HTTP and its related SQL Server database is also available. + +In .NET 8, with the built-in APIs, you can configure the services, add a health check for the microservice and its dependent SQL Server database in this way: + +```csharp +// Program.cs from .NET 8 Web API sample + +//... +// Registers required services for health checks +builder.Services.AddHealthChecks() + // Add a health check for a SQL Server database + .AddCheck( + "OrderingDB-check", + new SqlConnectionHealthCheck(builder.Configuration["ConnectionString"]), + HealthStatus.Unhealthy, + new string[] { "orderingdb" }); +``` + +In the previous code, the `services.AddHealthChecks()` method configures a basic HTTP check that returns a status code **200** with "Healthy". Further, the `AddCheck()` extension method configures a custom `SqlConnectionHealthCheck` that checks the related SQL database's health. + +The `AddCheck()` method adds a new health check with a specified name and the implementation of type `IHealthCheck`. You can add multiple health checks using the `AddCheck()` method, so a microservice won't provide a "healthy" status until all its checks are healthy. + +`SqlConnectionHealthCheck` is a custom class that implements `IHealthCheck`. It takes a connection string as a constructor parameter and executes a simple query to check if the connection to the SQL database is successful. It returns `HealthCheckResult.Healthy()` if the query was executed successfully and a `FailureStatus` with the actual exception when it fails: + +```csharp +// Sample SQL Connection Health Check +public class SqlConnectionHealthCheck : IHealthCheck +{ + private const string DefaultTestQuery = "Select 1"; + + public string ConnectionString { get; } + + public string TestQuery { get; } + + public SqlConnectionHealthCheck(string connectionString) + : this(connectionString, testQuery: DefaultTestQuery) + { + } + + public SqlConnectionHealthCheck(string connectionString, string testQuery) + { + ConnectionString = connectionString ?? throw new ArgumentNullException(nameof(connectionString)); + TestQuery = testQuery; + } + + public async Task CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken = default(CancellationToken)) + { + using (var connection = new SqlConnection(ConnectionString)) + { + try + { + await connection.OpenAsync(cancellationToken); + + if (TestQuery != null) + { + var command = connection.CreateCommand(); + command.CommandText = TestQuery; + + await command.ExecuteNonQueryAsync(cancellationToken); + } + } + catch (DbException ex) + { + return new HealthCheckResult(status: context.Registration.FailureStatus, exception: ex); + } + } + + return HealthCheckResult.Healthy(); + } +} +``` + +Note that in the previous code, `Select 1` is the query used to check the health of the database. To monitor the availability of your microservices, orchestrators like Kubernetes periodically perform health checks by sending requests to test the microservices. 
It's important to keep your database queries efficient so that these operations are quick and don’t result in a higher utilization of resources. + +Finally, add a middleware that responds to the url path `/hc`: + +```csharp +// Program.cs from .NET 8 Web Api sample + +app.MapHealthChecks("/hc"); +``` + +When the endpoint `/hc` is invoked, it runs all the health checks that are configured in the `AddHealthChecks()` method in the `Startup` class and shows the result. + +### Query your microservices to report about their health status + +When you've configured health checks as described in this article and you have the microservice running in Docker, you can directly check from a browser if it's healthy. You have to publish the container port in the Docker host, so you can access the container through the external Docker host IP or through `host.docker.internal`, as shown in figure 10.6. + +![Screenshot of the JSON response returned by a health check.](media/health-check-json-response.png) + +**Figure 10-6**. Checking health status of a single service from a browser + +In that test, you can see that the `Catalog.API` microservice (running on port 5101) is healthy, returning HTTP status 200 and status information in JSON. The service also checked the health of its SQL Server database dependency and RabbitMQ, so the health check reported itself as healthy. + +### Open source enhancement to .NET Core health checks + +Instead of writing all your own health checks, you can use the [AspNetCore.Diagnostics.HealthChecks](https://github.com/Xabaril/AspNetCore.Diagnostics.HealthChecks]) open source library, which provides a set of pre-built health checks for common services like endpoint state, SQL Server, Redis, RabbitMQ, and more. + +## Use watchdogs + +A watchdog is a separate service that can watch health and load across services, and report health about the microservices by querying with the `HealthChecks` library introduced earlier. This can help prevent errors that would not be detected based on the view of a single service. Watchdogs also are a good place to host code that can perform remediation actions for known conditions without user interaction. + +Fortunately, you have many options to add such a service. For example if you have built your own health checks using [AspNetCore.Diagnostics.HealthChecks](https://github.com/Xabaril/AspNetCore.Diagnostics.HealthChecks]) there is also the [AspNetCore.HealthChecks.UI](https://www.nuget.org/packages/AspNetCore.HealthChecks.UI/) NuGet package that can be used to display the health check results from the configured URIs. + +![Screenshot of the Health Checks UI eShopOnContainers health statuses.](media/health-check-status-ui.png) + +**Figure 10-7**. Sample health check report + +With the introduction of .NET Aspire, you now get a built-in dashboard that has many of the same features as the open source package. You'll see more about the dashboard in the next section. + +![A screenshot of the .NET Aspire dashboard.](media/aspire-dashboard-projects.png) + +**Figure 10-8**. The .NET Aspire dashboard + +In summary, a watchdog service queries each microservice's endpoint. This will execute all the health checks defined within it and return an overall health state depending on all those checks. The `HealthChecksUI` is easy to consume with a few configuration entries and two lines of code that needs to be added into the *Startup.cs* of the watchdog service. 
+ +## Health checks when using orchestrators + +To monitor the availability of your microservices, orchestrators like Kubernetes and Service Fabric periodically perform health checks by sending requests to test the microservices. When an orchestrator determines that a service or container is unhealthy, it stops routing requests to that instance. It also usually creates a new instance of that container. + +For instance, most orchestrators can use health checks to manage zero-downtime deployments. Only when the status of a service or container changes to healthy will the orchestrator start routing traffic to service instances. + +Another aspect of service health is reporting metrics from the service. This is an advanced capability of the health model of some orchestrators, like Service Fabric. Metrics are important when using an orchestrator because they are used to balance resource usage. Metrics also can be an indicator of system health. For example, you might have an application that has many microservices, and each instance reports a requests-per-second (RPS) metric. If one service is using more resources (memory, processor, etc.) than another service, the orchestrator could move service instances around in the cluster to try to maintain even resource utilization. + +Note that Azure Service Fabric provides its own [Health Monitoring model](/azure/service-fabric/service-fabric-health-introduction), which is more advanced than simple health checks. + +## Advanced monitoring: visualization, analysis, and alerts + +The final part of monitoring is visualizing the event stream, reporting on service performance, and alerting when an issue is detected. You can use different solutions for this aspect of monitoring. + +You can use simple custom applications showing the state of your services, like the custom page shown when explaining the [AspNetCore.Diagnostics.HealthChecks](https://github.com/Xabaril/AspNetCore.Diagnostics.HealthChecks). Or you could use more advanced tools like [Azure Monitor](https://azure.microsoft.com/services/monitor/) to raise alerts based on the stream of events. + +Finally, if you're storing all the event streams, you can use Microsoft Power BI or other solutions like Kibana or Splunk to visualize the data. 
+ +>[!div class="step-by-step"] +>[Previous](open-telemetry-grafana-prometheus.md) +>[Next](aspire-dashboard.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/application_insights_example.png b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/application_insights_example.png new file mode 100644 index 0000000000000..0e356738b6311 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/application_insights_example.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/aspire-dashboard-projects.png b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/aspire-dashboard-projects.png new file mode 100644 index 0000000000000..df8998f1e594f Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/aspire-dashboard-projects.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/aspire-dashboard-traces.png b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/aspire-dashboard-traces.png new file mode 100644 index 0000000000000..3b171bdaa84f1 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/aspire-dashboard-traces.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/azure-monitor.png b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/azure-monitor.png new file mode 100644 index 0000000000000..93b8b61241741 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/azure-monitor.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/azure_dashboard.png b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/azure_dashboard.png new file mode 100644 index 0000000000000..585e50ffc4831 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/azure_dashboard.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/centralized-logging.png b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/centralized-logging.png new file mode 100644 index 0000000000000..0465470d1dc70 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/centralized-logging.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/grafana.png b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/grafana.png new file mode 100644 index 0000000000000..9ae45c2009c19 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/grafana.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/health-check-json-response.png b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/health-check-json-response.png new file mode 100644 index 0000000000000..f22552f8b9df6 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/health-check-json-response.png differ diff 
--git a/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/health-check-status-ui.png b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/health-check-status-ui.png new file mode 100644 index 0000000000000..716aad2548194 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/health-check-status-ui.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/local-log-file-per-service.png b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/local-log-file-per-service.png new file mode 100644 index 0000000000000..7b7b1ec0e04f2 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/local-log-file-per-service.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/multiple-node-monolith-logging.png b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/multiple-node-monolith-logging.png new file mode 100644 index 0000000000000..5c25752b1e3c3 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/multiple-node-monolith-logging.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/powerbidashboard.png b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/powerbidashboard.png new file mode 100644 index 0000000000000..aa8efe4cd68f4 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/powerbidashboard.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/prometheus.png b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/prometheus.png new file mode 100644 index 0000000000000..468234fdbc6d7 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/prometheus.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/single-monolith-logging.png b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/single-monolith-logging.png new file mode 100644 index 0000000000000..f0f44dca1d0a4 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/media/single-monolith-logging.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/observability-patterns.md b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/observability-patterns.md new file mode 100644 index 0000000000000..cbd8926c9fff3 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/observability-patterns.md @@ -0,0 +1,98 @@ +--- +title: Observability patterns +description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Observability patterns +ms.date: 04/06/2022 +--- + +# Observability patterns + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +Just as patterns have been developed to aid in the layout of code in applications, there are patterns for operating applications in a reliable way. Three useful patterns in maintaining applications have emerged: **logging**, **monitoring**, and **alerts**. 
+ +## When to use logging + +When users report problems with an application, it's useful to be able to see what the app was doing when the problem occurred. To investigate such past events, your app must record them. The process of recording events is known as logging. Anytime failures or problems occur in production, you can use thorough logs to diagnose the problem or to help you recreate the conditions under which the failures occurred in a test environment. + +### Challenges when logging with cloud-native applications + +In traditional applications, log files are typically stored on the local machine. In fact, on Unix-like operating systems, there's a folder structure defined to hold any logs, typically under `/var/log`. + +![A diagram showing logging to a file in a monolithic app.](media/single-monolith-logging.png) + +**Figure 10-1**. Logging to a file in a monolithic app. + +Logging to a flat file on a single machine is much less helpful in a cloud environment. Applications producing logs may not have access to the local disk or the local disk may be highly transient as containers are shuffled around physical machines. Even simple scaling up of monolithic applications across multiple nodes can make it challenging to locate the appropriate file-based log file. + +![A diagram showing logging to files in a scaled monolithic app.](media/multiple-node-monolith-logging.png) + +**Figure 10-2**. Logging to files in a scaled monolithic app. + +Cloud-native applications also pose some challenges for file-based loggers. User requests may now span multiple services that are run on different machines and may include serverless functions with no access to a local file system at all. It would be very challenging to correlate the logs from a user or a session across these many services and machines. + +![A diagram showing logging to local files in a microservices app.](media/local-log-file-per-service.png) + +**Figure 10-3**. Logging to local files in a microservices app. + +Finally, the number of users in some cloud-native applications is high. Imagine that each user generates a hundred lines of log messages when they log into an application. In isolation, that is manageable, but multiply that over 100,000 users and the volume of logs becomes large enough that specialized tools are needed to support effective use of the logs. + +### Logging in cloud-native applications + +Every programming language has tooling that permits writing logs and typically the overhead for writing these logs is low. Many of the logging libraries provide logging for different kinds of criticalities, which can be tuned at run time. For instance, the [Serilog library](https://serilog.net/) is a popular structured logging library for .NET that provides the following logging levels: + +* Verbose +* Debug +* Information +* Warning +* Error +* Fatal + +These different log levels provide granularity in logging. When the application is functioning properly in production, it may be configured to only log important messages. When the application is misbehaving, then the log level can be increased so more verbose logs are gathered. This balances performance against ease of debugging. + +Because of the challenges associated with using file-based logs in cloud-native apps, centralized logs are preferred. Logs are collected by the applications and shipped to a central logging application which indexes and stores the logs. This class of system can ingest tens of gigabytes of logs every day. 
+ +It's also helpful to follow some standard practices when building logging that spans many services. For instance, generating a [correlation ID](https://blog.rapid7.com/2016/12/23/the-value-of-correlation-ids/) at the start of a lengthy interaction, and then logging it in each message that is related to that interaction, makes it easier to search for all related messages. Standardization makes reading logs much easier. Figure 7-4 demonstrates how a microservices architecture can leverage centralized logging as part of its workflow. + +![A diagram showing logs from various sources are ingested into a centralized log store.](media/centralized-logging.png) + +**Figure 10-4**. Logs from various sources are ingested into a centralized log store. + +## Challenges with detecting and responding to potential app health issues + +Some applications aren't mission critical. Maybe they're only used internally, and when a problem occurs, the user can contact the team responsible and the application can be restarted. However, customers often have higher expectations for the applications they consume. You should know when problems occur with your application *before* users do, or before users notify you. Otherwise, the first you know about a problem may be when you notice an angry deluge of social media posts deriding your application or even your organization. + +Some scenarios you may need to consider include: + +- One service in your application keeps failing and restarting, resulting in intermittent slow responses. +- At some times of the day, your application's response time is slow. +- After a recent deployment, load on the database has tripled. + +Implemented properly, monitoring can let you know about conditions that will lead to problems. Then, you can address underlying conditions before they result in any significant user impact. + +### Monitoring cloud-native apps + +Some centralized logging systems collect other telemetry outside of pure logs. For example, they can collect metrics, such as database query durations, average response times from a web server, and even CPU load averages and memory pressure as reported by the operating system. In conjunction with the logs, these systems can provide a holistic view of the health of nodes in the system and the application as a whole. + +You can also add code to your application that feeds extra information to such monitoring tools. Business flows that are of particular interest. For example, when a new user signs up or an order is placed, your code can increment a counter in the central monitoring system. Such techniques unlock the monitoring tools to not only monitor the health of the application but the health of the business. + +Cloud-native monitoring tools provide real-time telemetry and insight into apps regardless of whether they're single-process monolithic applications or distributed microservice architectures. They include tools that allow collection of data from the app as well as tools for querying and displaying information about the app's health. + +## Challenges with reacting to critical problems in cloud-native apps + +If you need to react to problems with your application, you need some way to alert the right personnel. This is the third cloud-native application observability pattern and depends on logging and monitoring. Your application needs to have logging in place to allow problems to be diagnosed, and in some cases to feed into monitoring tools. 
It needs monitoring to aggregate application metrics and health data in one place. Once this has been established, rules can be created that will trigger alerts when certain metrics fall outside of acceptable levels. + +Generally, alerts are layered on top of monitoring. Certain conditions trigger appropriate alerts to notify team members of urgent problems. Some scenarios that may require alerts include: + +- One of your application's services is not responding after 1 minute of downtime. +- Your application is returning unsuccessful HTTP responses to more than 1% of requests. +- Your application's average response time for key endpoints exceeds 2000 ms. + +### Alerts in cloud-native apps + +You can craft queries against the monitoring tools to look for known failure conditions. For instance, queries could search through the incoming logs for indications of HTTP status code 500, which indicates a problem on a web server. As soon as one of these is detected, an e-mail or an SMS could be sent to the owner of the originating service, who can investigate. Typically, though, a single 500 error isn't enough to determine that a problem has occurred. + +A common mistage in alerting is to fire too many alerts for humans to investigate. Service owners will rapidly become desensitized to errors that they've previously investigated and found to be benign. Then, when true errors occur, they'll be lost in the noise of hundreds of false positives. Therefore you should take case to alert owners only when problems are serious or critical. + +>[!div class="step-by-step"] +>[Previous](../cloud-native-resiliency/cloud-native-resiliency.md) +>[Next](open-telemetry-grafana-prometheus.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/observability-platforms.md b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/observability-platforms.md new file mode 100644 index 0000000000000..8a04e456e7023 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/observability-platforms.md @@ -0,0 +1,45 @@ +--- +title: Observability platforms +description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Observability platforms +ms.date: 04/06/2022 +--- + +# Observability platforms + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +Observability platforms are products that understand the health, performance, and behavior of applications, services, and infrastructure. They make use of the gathered telemetry your services and infrastructure generate to provide insights into the state of the overall system. + +Some examples of popular observability platforms include: + +- New Relic +- Dynatrace +- Datadog +- Grafana Cloud +- Azure Monitor + +All the vendors aim to provide one place that you and your organization can go to understand the health of your systems. Some of them also allow you to change from a reactive to a proactive stance by providing alerts, notifications, and automation when things go wrong. + +## Why use an observability platform? + +As your cloud-native application grows, it becomes more complex. You may have many services, each with its own logs, metrics, and traces. You may have many users, each with their own experience of your application. You may have many dependencies, each with their own performance characteristics. 
+ +The switch from single monolithic applications that could be simply monitored to cloud-native applications that are decentralized and distributed across many services requires a different approach to monitoring. You need to be able to see the whole system, not just the parts. + +The core features you should be looking for in an observability platform are: + +- **Comprehensive Insight**: Observability platforms provide end-to-end visibility into application performance, user behavior, and system health, enabling teams to identify and resolve issues quickly. + +- **Proactive Monitoring**: With continuous monitoring, observability platforms help detect anomalies and potential problems before they impact users, ensuring higher reliability and uptime. + +- **Scalability**: Designed to handle the complexity and scale of cloud-native environments, these platforms can manage vast amounts of data from microservices architectures efficiently. + +- **Performance Optimization**: By analyzing performance data, teams can optimize resource usage and application performance, leading to cost savings and better user experiences. + +- **Improved Collaboration**: Centralized observability data fosters collaboration among development, operations, and security teams, streamlining workflows and improving overall productivity. + +- **Real-Time Data Analysis**: These platforms provide real-time insights, allowing teams to respond swiftly to changing conditions and ensure continuous delivery and deployment pipelines are running smoothly. + +>[!div class="step-by-step"] +>[Previous](aspire-dashboard.md) +>[Next](azure-monitor.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/open-telemetry-grafana-prometheus.md b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/open-telemetry-grafana-prometheus.md new file mode 100644 index 0000000000000..5288dfc5cd0c9 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/monitoring-health/open-telemetry-grafana-prometheus.md @@ -0,0 +1,160 @@ +--- +title: Using OpenTelemetry in your .NET app +description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Using OpenTelemetry in your .NET app +ms.date: 04/06/2022 +--- + +# Using OpenTelemetry in your .NET app + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +OpenTelemetry is an open-source observability framework. OpenTelemetry standardizes the way telemetry data is gathered and shared with back-end platforms. It provides a common format of instrumentation across all your microservices. You don't have to reinstrument code or install different proprietary agents every time a back-end platform changes. + +OpenTelemetry helps you to monitor all three pillars of observability: logs, metrics, and traces. + +## Add OpenTelemetry to your cloud-native app + +To add OpenTelemetry to your cloud-native app, use NuGet packages. For example, the [OpenTelemetry package](https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/src/OpenTelemetry/README.md) is the main library that provides the core OpenTelemetry capabilities. Other packages include the following: + +- [OpenTelemetry.Extensions.Hosting](https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/src/OpenTelemetry.Extensions.Hosting/README.md) - Extension methods for automatically starting and stopping OpenTelemetry tracing in ASP.NET Core hosts. 
+- [OpenTelemetry.Instrumentation.AspNetCore](https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/src/OpenTelemetry.Instrumentation.AspNetCore/README.md) - Classes that can collect many metrics about your cloud-native app, without any custom code. +- [OpenTelemetry.Exporter.Console](https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/src/OpenTelemetry.Exporter.Console/README.md) - Classes that enable your cloud-native app to write telemetry details to the console. + +There are many other [packages](/dotnet/core/diagnostics/observability-with-otel) available. Choose the most appropriate, depending on your cloud-native app's needs. + +Suppose you have a cloud-native eShop store .NET Core app. In this case, you'd add a new diagnostics project for OpenTelemetry in your app, so that any microservice in your app can access it. + +```csharp +using OpenTelemetry.Metrics; +using OpenTelemetry.Resources; +using OpenTelemetry.Trace; + +namespace Microsoft.Extensions.DependencyInjection; + +public static class DiagnosticServiceCollectionExtensions +{ + public static IServiceCollection AddObservability(this IServiceCollection services, + string serviceName, + IConfiguration configuration) + { + // create the resource that references the service name passed in + var resource = ResourceBuilder.CreateDefault().AddService(serviceName: serviceName, serviceVersion: "1.0"); + + // add the OpenTelemetry services + var otelBuilder = services.AddOpenTelemetry(); + + otelBuilder + // add the metrics providers + .WithMetrics(metrics => + { + metrics + .SetResourceBuilder(resource) + .AddRuntimeInstrumentation() + .AddAspNetCoreInstrumentation() + .AddHttpClientInstrumentation() + .AddEventCountersInstrumentation(c => + { + c.AddEventSources( + "Microsoft.AspNetCore.Hosting", + "Microsoft-AspNetCore-Server-Kestrel", + "System.Net.Http", + "System.Net.Sockets"); + }) + .AddMeter("Microsoft.AspNetCore.Hosting", "Microsoft.AspNetCore.Server.Kestrel") + .AddConsoleExporter(); + + }) + // add the tracing providers + .WithTracing(tracing => + { + tracing.SetResourceBuilder(resource) + .AddAspNetCoreInstrumentation() + .AddHttpClientInstrumentation() + .AddSqlClientInstrumentation(); + }); + + return services; + } + + // Add the Prometheus endpoints to your service, this will expose the metrics at http://service/metrics + public static void MapObservability(this IEndpointRouteBuilder routes) + { + routes.MapPrometheusScrapingEndpoint(); + } +} +``` + +The code does the following: + +1. It creates a variable `var otelBuilder = services.AddOpenTelemetry()` to store the OpenTelemetry builder. +1. It adds OpenTelemetry metrics and traces to `otelBuilder`. +1. The `.AddConsoleExporter()` line ensures the metrics are displayed in the console. +1. Tracing is added using `.WithTracing()`. + +After adding your diagnostics project, you then need to add reference to it to the service. For example, if you have a Products service in your app, you'd add the fllowing in the corresponding `Product.csproj` file: + +```xml + +``` + +In your Program.cs file, under the declaration for `builder`, you then add: + +```csharp +var builder = WebApplication.CreateBuilder(args); + +builder.Services.AddObservability("Products", builder.Configuration); +``` + +# View telemetry + +A common way to view the data collected through OpenTelemetry, in addition to Azure Monitor, is by using Prometheus and Grafana. + +![A screenshot of Prometheus.](media/prometheus.png) + +**Figure 10-5**. 
Prometheus UI + +Prometheus is an open-source monitoring tool that gets metrics from your app. To use Prometheus you complete the following steps: + +1. Add a Prometheus container. +1. Configure the container to collect data from each microservice. +1. Add the Prometheus .NET client library to collect metrics from your cloud-native app. +1. OpenTelemetry comes with an exporter for Prometheus. You add the exporter to your application by including the `OpenTelemetry.Exporter.Prometheus.AspNetCore` NuGet package. +1. Finally, you add the endpoints for your microservices. For instance: + + ```yml + global: + scrape_interval: 1s + + scrape_configs: + - job_name: 'products' + static_configs: + - targets: ['backend:8080'] + - job_name: 'store' + static_configs: + - targets: ['frontend:8080'] + ``` + +Then you can use Grafana to create dashboards and view the metrics gathered by Prometheus. + +![A screenshot of a Grafana dashboard.](media/grafana.png) + +To do configure Grafana, you: + +1. Add a Grafana container for your app, in the same way as Prometheus. +1. Add Prometheus as the data source for Grafana using YAML: + + ```yml + apiVersion: 1 + + datasources: + - name: Prometheus + type: prometheus + url: http://prometheus:9090 + isDefault: true + access: proxy + editable: true + ``` + +>[!div class="step-by-step"] +>[Previous](observability-patterns.md) +>[Next](health-checks-probes.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/grpc.md b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/grpc.md new file mode 100644 index 0000000000000..83ec2a7bc7bf8 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/grpc.md @@ -0,0 +1,123 @@ +--- +title: gRPC +description: Cloud-native service to service communication patterns | gRPC +author: +ms.date: 04/25/2024 +--- + +# gRPC + +[!INCLUDE [download-alert](../includes/download-alert.md)] + +So far in this book, we've focused on [REST-based](/azure/architecture/best-practices/api-design) communication. We've seen that REST is a flexible architectural style that defines CRUD-based operations against entity resources. Clients interact with resources across HTTP with a request/response communication model. While REST is widely implemented, a newer communication technology, gRPC, has gained tremendous momentum across the cloud-native community. + +## What is gRPC? + +gRPC is a modern, high-performance framework that evolves the age-old [remote procedure call (RPC)](https://en.wikipedia.org/wiki/Remote_procedure_call) protocol. At the application level, gRPC streamlines messaging between clients and back-end services. Originating from Google, gRPC is open source and part of the [Cloud Native Computing Foundation (CNCF)](https://www.cncf.io/) ecosystem of cloud-native offerings. CNCF considers gRPC an [incubating project](https://github.com/cncf/toc/blob/main/process/graduation_criteria.md). Incubating means end users are using the technology in production applications, and the project has a healthy number of contributors. + +A typical gRPC client app will expose a local, in-process function that implements a business operation. Under the covers, that local function invokes another function on a remote machine. What appears to be a local call essentially becomes a transparent out-of-process call to a remote service. 
The gRPC plumbing abstracts the point-to-point networking communication, serialization, and execution between computers. + +In cloud-native applications, developers often work across programming languages, frameworks, and technologies. This *interoperability* complicates message contracts and the plumbing required for cross-platform communication. gRPC provides a "uniform horizontal layer" that abstracts these concerns. Developers code in their native platform focused on business functionality, while gRPC handles communication plumbing. + +gRPC offers comprehensive support across most popular development stacks, including Java, JavaScript, C#, Go, Swift, and NodeJS. + +## gRPC Benefits + +gRPC uses HTTP/2 for its transport protocol. While compatible with HTTP 1.1, HTTP/2 features many advanced capabilities: + +- A binary framing protocol for data transport - unlike HTTP 1.1, which is text based. +- Multiplexing support for sending multiple parallel requests over the same connection - HTTP 1.1 limits processing to one request/response message at a time. +- Bidirectional full-duplex communication for sending both client requests and server responses simultaneously. +- Built-in streaming enabling requests and responses to asynchronously stream large data sets. +- Header compression that reduces network usage. + +gRPC is lightweight and highly performant. It can be up to 8x faster than JSON serialization with messages 60-80% smaller. In Microsoft [Windows Communication Foundation (WCF)](../../../framework/wcf/whats-wcf.md) parlance, gRPC performance exceeds the speed and efficiency of the highly optimized [NetTCP bindings](/dotnet/api/system.servicemodel.nettcpbinding?view=netframework-4.8&preserve-view=true). Unlike NetTCP, which favors the Microsoft stack, gRPC is cross-platform. + +## Protocol Buffers + +gRPC embraces an open-source technology called [Protocol Buffers](https://developers.google.com/protocol-buffers/docs/overview). They provide a highly efficient and platform-neutral serialization format for serializing structured messages that services send to each other. Using a cross-platform Interface Definition Language (IDL), developers define a service contract for each microservice. The contract, implemented as a text-based `.proto` file, describes the methods, inputs, and outputs for each service. The same contract file can be used for gRPC clients and services built on different development platforms. + +Using the proto file, the Protobuf compiler, `protoc`, generates both client and service code for your target platform. The code includes the following components: + +- Strongly typed objects, shared by the client and service, that represent the service operations and data elements for a message. +- A strongly typed base class with the required network plumbing that the remote gRPC service can inherit and extend. +- A client stub that contains the required plumbing to invoke the remote gRPC service. + +At run time, each message is serialized as a standard Protobuf representation and exchanged between the client and remote service. Unlike JSON or XML, Protobuf messages are serialized as compiled binary bytes. + +## gRPC support in .NET + +gRPC is integrated into .NET Core 3.0 SDK and later. The following tools support it: + +- Visual Studio 2022 with the ASP.NET and web development workload installed +- Visual Studio Code +- The `dotnet` CLI + +The SDK includes tooling for endpoint routing, built-in IoC, and logging. The open-source Kestrel web server supports HTTP/2 connections. 
Figure 6-18 shows the Visual Studio 2022 template that scaffolds a skeleton project for a gRPC service. Note how .NET fully supports Windows, Linux, and macOS.

![gRPC Support in Visual Studio 2022 diagram](./media/visual-studio-2022-grpc-template.png)

**Figure 6-18**. gRPC support in Visual Studio 2022

Figure 6-19 shows the skeleton gRPC service generated from the built-in scaffolding included in Visual Studio 2022.

![gRPC project in Visual Studio 2022 diagram](./media/grpc-project.png)

**Figure 6-19**. gRPC project in Visual Studio 2022

In the previous figure, note the proto description file and service code. As you'll see shortly, Visual Studio generates additional configuration in both the Startup class and underlying project file.

## gRPC usage

Favor gRPC for the following scenarios:

- Synchronous backend microservice-to-microservice communication where an immediate response is required to continue processing.
- Polyglot environments that need to support mixed programming platforms.
- Low latency and high throughput communication where performance is critical.
- Point-to-point real-time communication - gRPC can push messages in real time without polling and has excellent support for bi-directional streaming.
- Network-constrained environments - binary gRPC messages are always smaller than an equivalent text-based JSON message.

At the time of this writing, gRPC is primarily used with backend services. Modern browsers can't provide the level of HTTP/2 control required to support a front-end gRPC client. That said, there's support for [gRPC-Web with .NET](https://devblogs.microsoft.com/aspnet/grpc-web-for-net-now-available/) that enables gRPC communication from browser-based apps built with JavaScript or Blazor WebAssembly technologies. [gRPC-Web](https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-WEB.md) enables an ASP.NET Core gRPC app to support gRPC features in browser apps:

- Strongly typed, code-generated clients
- Compact Protobuf messages
- Server streaming

## gRPC implementation

The microservice reference architecture, [eShop Reference Application](https://github.com/dotnet/eShop), from Microsoft, shows how to implement gRPC services in .NET applications.

![Backend architecture for eShop application diagram](./media/eshop-architecture.png)

**Figure 6-20**. Backend architecture for eShop application

The eShop App Workshop adds gRPC as a worked example in the [Add shopping basket capabilities to the web site lab](https://github.com/dotnet-presentations/eshop-app-workshop/tree/main/labs/4-Add-Shopping-Basket).

In the previous figure, note how eShop embraces the [Backends for Frontends pattern](https://learn.microsoft.com/azure/architecture/patterns/backends-for-frontends) (BFF) by exposing multiple API gateways.

gRPC communication requires both client and server components. The client makes synchronous gRPC calls to backend microservices, each of which implements a gRPC server. Both the client and server take advantage of the built-in gRPC plumbing from the .NET SDK. Client-side *stubs* provide the plumbing to invoke remote gRPC calls. Server-side components provide gRPC plumbing that custom service classes can inherit and consume.

Microservices that expose both a RESTful API and gRPC communication require multiple endpoints to manage traffic. You would open an endpoint that listens for HTTP traffic for the RESTful calls and another for gRPC calls.
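For example, Kestrel can expose one endpoint per protocol. The following `appsettings.json` sketch is illustrative only - the endpoint names and ports are arbitrary placeholders:

```json
{
  "Kestrel": {
    "Endpoints": {
      "WebApi": {
        "Url": "http://localhost:5000",
        "Protocols": "Http1"
      },
      "Grpc": {
        "Url": "http://localhost:5001",
        "Protocols": "Http2"
      }
    }
  }
}
```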
The gRPC endpoint must be configured for the HTTP/2 protocol that is required for gRPC communication.

While we strive to decouple microservices with asynchronous communication patterns, some operations require direct calls. gRPC should be the primary choice for direct synchronous communication between microservices. Its high-performance communication protocol, based on HTTP/2 and protocol buffers, makes it an excellent choice.

## gRPC in .NET Aspire

If you have a gRPC server in a .NET Aspire solution and you want to call it from other microservices, register a typed gRPC client in each consuming microservice with the `AddGrpcClient` extension method, using the logical name that the App Host assigns to the gRPC service:

```csharp
// CatalogClient is the client type generated from the service's .proto contract;
// "catalogapi" is the logical name assigned to the gRPC service in the App Host.
builder.Services.AddGrpcClient<Catalog.CatalogClient>(
        static options => options.Address = new("https://catalogapi"))
    .ConfigurePrimaryHttpMessageHandler(
        () => new GrpcWebHandler(new HttpClientHandler()));
```

In the microservices, get the gRPC client from dependency injection and use it to send messages to the gRPC server microservice.

## Looking ahead

Looking ahead, gRPC will continue to gain traction for cloud-native systems. The performance benefits and ease of development are compelling. However, REST will likely be around for a long time. It excels for publicly exposed APIs and for backward compatibility reasons.

>[!div class="step-by-step"]
>[Previous](service-to-service-communication.md)
>[Next](service-mesh-communication-infrastructure.md)

diff --git a/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/introduction.md b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/introduction.md new file mode 100644 index 0000000000000..132027da7b3fe --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/introduction.md @@ -0,0 +1,16 @@

---
title: Introduction to cloud-native service-to-service communication patterns
description: Cloud-native service-to-service communication patterns | Introduction to cloud-native service-to-service communication patterns
author:
ms.date: 04/25/2024
---

# Introduction to cloud-native service-to-service communication patterns

[!INCLUDE [download-alert](../includes/download-alert.md)]

This chapter is about cloud-native communication patterns. First, we'll explore how clients and microservices locate other services in the application. Then we'll discuss how clients communicate with back-end microservices and how microservices best communicate between themselves. We'll also investigate the gRPC protocol in more detail and introduce service meshes, which can streamline microservice communication.
+ +>[!div class="step-by-step"] +>[Previous](../communication-patterns/communication-patterns.md) +>[Next](service-discovery.md) diff --git a/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/aggregator-service.png b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/aggregator-service.png new file mode 100644 index 0000000000000..1a0b3670fcc9e Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/aggregator-service.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/azure-event-hub.png b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/azure-event-hub.png new file mode 100644 index 0000000000000..ac3b293635835 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/azure-event-hub.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/chaining-http-queries.png b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/chaining-http-queries.png new file mode 100644 index 0000000000000..c8fb02034308a Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/chaining-http-queries.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/command-interaction-with-queue.png b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/command-interaction-with-queue.png new file mode 100644 index 0000000000000..a5c010a514253 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/command-interaction-with-queue.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/direct-http-communication.png b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/direct-http-communication.png new file mode 100644 index 0000000000000..d33fd94f7648a Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/direct-http-communication.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/eshop-architecture.png b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/eshop-architecture.png new file mode 100644 index 0000000000000..5fd2dc26e107d Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/eshop-architecture.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/event-driven-messaging.png b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/event-driven-messaging.png new file mode 100644 index 0000000000000..d3e9340c4aedc Binary files /dev/null and 
b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/event-driven-messaging.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/event-grid-anatomy.png b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/event-grid-anatomy.png new file mode 100644 index 0000000000000..64d70d92de429 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/event-grid-anatomy.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/event-hub-partitioning.png b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/event-hub-partitioning.png new file mode 100644 index 0000000000000..feac964c43b26 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/event-hub-partitioning.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/grpc-project.png b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/grpc-project.png new file mode 100644 index 0000000000000..4131bf62a90da Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/grpc-project.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/request-reply-pattern.png b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/request-reply-pattern.png new file mode 100644 index 0000000000000..3b19fe1c52a81 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/request-reply-pattern.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/service-bus-queue.png b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/service-bus-queue.png new file mode 100644 index 0000000000000..0f47eea572a5d Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/service-bus-queue.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/service-mesh-with-side-car.png b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/service-mesh-with-side-car.png new file mode 100644 index 0000000000000..f96cb729d5d19 Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/service-mesh-with-side-car.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/storage-queue-hierarchy.png b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/storage-queue-hierarchy.png new file mode 100644 index 0000000000000..a6935bf4155f8 Binary files /dev/null and 
b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/storage-queue-hierarchy.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/topic-architecture.png b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/topic-architecture.png new file mode 100644 index 0000000000000..879f1d893602a Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/topic-architecture.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/visual-studio-2022-grpc-template.png b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/visual-studio-2022-grpc-template.png new file mode 100644 index 0000000000000..bd90f732e64ad Binary files /dev/null and b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/media/visual-studio-2022-grpc-template.png differ diff --git a/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/service-discovery.md b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/service-discovery.md new file mode 100644 index 0000000000000..b95f360433abe --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/service-discovery.md @@ -0,0 +1,63 @@

---
title: Service discovery
description: Cloud-native service to service communication patterns | Service discovery
author:
ms.date: 04/25/2024
---

# Service discovery

[!INCLUDE [download-alert](../includes/download-alert.md)]

## What is service discovery?

Service discovery refers to the process by which services dynamically locate and connect to each other. As your application scales and evolves, the IP addresses and endpoints of components and services change dynamically. Service discovery ensures that these components can find each other without hardcoding IP addresses or relying on static configurations.

## Service discovery using a load balancer

Service discovery can be achieved by using a server-side load balancer (LB). The LB points to the services, and clients point to the LB. The downsides of using an LB for service discovery are the need to maintain addresses in the LB and the possibility that the LB becomes a single point of failure.

## Service discovery in .NET

To get started with service discovery in .NET, install the [Microsoft.Extensions.ServiceDiscovery](https://www.nuget.org/packages/Microsoft.Extensions.ServiceDiscovery) NuGet package.

In the `Program.cs` file of your project, call the `AddServiceDiscovery` extension method to add service discovery to the host, configuring default service endpoint resolvers.
```csharp
builder.Services.AddServiceDiscovery();
```

Then, you can add service discovery to client services by calling the `AddServiceDiscovery` method:

```csharp
// ProductsServiceClient is an illustrative typed client; "products" is the
// logical service name that service discovery resolves to a real endpoint.
builder.Services.AddHttpClient<ProductsServiceClient>(static client =>
    {
        client.BaseAddress = new("https://products");
    })
    .AddServiceDiscovery();
```

## Resolve service endpoints using platform-provided service discovery

Certain platforms, such as Azure Container Apps and suitably configured Kubernetes clusters, offer service discovery capabilities without requiring a service discovery client library. When an application is deployed in such an environment, using the platform's built-in functionality can be advantageous. The pass-through resolver is designed for this scenario: it lets you use alternative resolvers, such as configuration, in other environments, such as a developer's machine, without any code modifications or conditional guards.

## Service discovery in .NET Aspire

In .NET Aspire, service discovery features are built in and easy to use. In any project that you've based on the .NET Aspire solution templates, or that you've added .NET Aspire orchestration to, .NET service discovery is already set up and configured. All you have to do is create the services in the App Host project, and then pass them to the other services that use them.

For example, this code creates two microservices called `catalog` and `basket`, each of which is a project in the .NET Aspire solution. It uses the `WithReference` extension method to pass these microservices to the `frontend` microservice:

```csharp
var builder = DistributedApplication.CreateBuilder(args);

// The Projects.* types are generated by .NET Aspire for each project in the
// solution; the names here assume projects named Catalog, Basket, and Frontend.
var catalog = builder.AddProject<Projects.Catalog>("catalog");
var basket = builder.AddProject<Projects.Basket>("basket");

var frontend = builder.AddProject<Projects.Frontend>("frontend")
    .WithReference(basket)
    .WithReference(catalog);
```

>[!div class="step-by-step"]
>[Previous](introduction.md)
>[Next](service-to-service-communication.md)

diff --git a/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/service-mesh-communication-infrastructure.md b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/service-mesh-communication-infrastructure.md new file mode 100644 index 0000000000000..5abaced4e17a1 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/service-mesh-communication-infrastructure.md @@ -0,0 +1,68 @@

---
title: Service Mesh communication infrastructure
description: Cloud-native service to service communication patterns | Service Mesh communication infrastructure
author:
ms.date: 04/25/2024
---

# Service Mesh communication infrastructure

[!INCLUDE [download-alert](../includes/download-alert.md)]

Throughout this chapter, we've explored the challenges of microservice communication. We said that development teams need to be sensitive to how back-end services communicate with each other. Ideally, the less inter-service communication, the better. However, avoidance isn't always possible as back-end services often rely on one another to complete operations.

We explored different approaches for implementing synchronous HTTP communication and asynchronous messaging. In each case, the developer is burdened with implementing communication code. Communication code is complex and time intensive.
Incorrect decisions can lead to significant performance issues.

A more modern approach to microservice communication centers around a new and rapidly evolving technology called *Service Mesh*. A [service mesh](https://www.nginx.com/blog/what-is-a-service-mesh/) is a configurable infrastructure layer with built-in capabilities to handle service-to-service communication, resiliency, and many cross-cutting concerns. It moves the responsibility for these concerns out of the microservices and into the service mesh layer. Communication is abstracted away from your microservices.

A key component of a service mesh is a proxy. In a cloud-native application, an instance of a proxy is typically colocated with each microservice. While they execute in separate processes, the two are closely linked and share the same lifecycle. This pattern is known as the [Sidecar pattern](/azure/architecture/patterns/sidecar).

![Service mesh with a sidecar diagram](media/service-mesh-with-side-car.png)

**Figure 6-22**. Service mesh with a sidecar

Note in the previous figure how messages are intercepted by a proxy that runs alongside each microservice. Each proxy can be configured with traffic rules specific to the microservice. It understands messages and can route them across your services and the outside world.

Along with managing service-to-service communication, the Service Mesh provides support for service discovery and load balancing.

Once configured, a service mesh is highly functional. The mesh retrieves a corresponding pool of instances from a service discovery endpoint. It sends a request to a specific service instance, recording the latency and response type of the result. It chooses the instance most likely to return a fast response based on different factors, including the observed latency for recent requests.

A service mesh manages traffic, communication, and networking concerns at the application level. It understands messages and requests. A service mesh typically integrates with a container orchestrator. Kubernetes supports an extensible architecture in which a service mesh can be added.

Later in this book, we deep-dive into Service Mesh technologies, including a discussion of their architecture and available open-source implementations.

## Summary

In this chapter, we discussed cloud-native communication patterns. We started by examining how front-end clients communicate with back-end microservices. Along the way, we talked about API Gateway platforms and real-time communication. We then looked at how microservices communicate with other back-end services. We looked at both synchronous HTTP communication and asynchronous messaging across services. We covered gRPC, an upcoming technology in the cloud-native world. Finally, we introduced a new and rapidly evolving technology called Service Mesh that can streamline microservice communication.
We placed special emphasis on managed Azure services that can help implement communication in cloud-native systems:

- [Azure Application Gateway](https://learn.microsoft.com/azure/application-gateway/overview)
- [Azure API Management](https://azure.microsoft.com/services/api-management/)
- [Azure SignalR Service](https://azure.microsoft.com/services/signalr-service/)
- [Azure Storage Queues](https://learn.microsoft.com/azure/storage/queues/storage-queues-introduction)
- [Azure Service Bus](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-messaging-overview)
- [Azure Event Grid](https://learn.microsoft.com/azure/event-grid/overview)
- [Azure Event Hub](https://azure.microsoft.com/services/event-hubs/)

We next move to distributed data in cloud-native systems and the benefits and challenges that it presents.

### References

- [.NET Microservices: Architecture for Containerized .NET applications](https://dotnet.microsoft.com/download/thank-you/microservices-architecture-ebook)

- [Designing Interservice Communication for Microservices](https://learn.microsoft.com/azure/architecture/microservices/design/interservice-communication)

- [Azure SignalR Service, a fully managed service to add real-time functionality](https://azure.microsoft.com/blog/azure-signalr-service-a-fully-managed-service-to-add-real-time-functionality/)

- [Azure API Gateway Ingress Controller](https://azure.github.io/application-gateway-kubernetes-ingress/)

- [gRPC Documentation](https://grpc.io/docs/guides/)

- [Comparing gRPC Services with HTTP APIs](https://learn.microsoft.com/aspnet/core/grpc/comparison?view=aspnetcore-8.0)

- [Building gRPC Services with .NET video](https://learn.microsoft.com/Shows/The-Cloud-Native-Show/Building-Microservices-with-gRPC-and-NET)

>[!div class="step-by-step"]
>[Previous](grpc.md)
>[Next](../event-based-communication-patterns/integration-event-based-microservice-communications.md)

diff --git a/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/service-to-service-communication.md b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/service-to-service-communication.md new file mode 100644 index 0000000000000..115e8a500591d --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/service-to-service-communication-patterns/service-to-service-communication.md @@ -0,0 +1,237 @@

---
title: Service-to-service communication
description: Cloud-native service to service communication patterns | Service-to-service communication
author:
ms.date: 04/25/2024
---

# Service-to-service communication

[!INCLUDE [download-alert](../includes/download-alert.md)]

Moving from the front-end client, we now address how back-end microservices communicate with each other.

Ideally, the less inter-service communication, the better. However, avoidance isn't always possible as back-end services often rely on one another to complete an operation.

There are several widely accepted approaches to implementing cross-service communication. The *type of communication interaction* will often determine the best approach.

Consider the following interaction types:

- *Query* – when a calling microservice requires a response from a called microservice, such as "I need the buyer information for a given customer ID."
- *Command* – when the calling microservice needs another microservice to execute an action but doesn't require a response, such as "Ship this order."

- *Event* – when a microservice, called the publisher, raises an event to indicate that state has changed or an action has occurred, such as "An order was shipped." Other interested microservices, called subscribers, can react to the event appropriately. The publisher and the subscribers aren't aware of each other.

Microservice systems typically use a combination of these interaction types when executing operations that require cross-service interaction. Let's take a close look at each and how you might implement them.

## Queries

Many times, one microservice might need to *query* another, requiring an immediate response to complete an operation. A shopping basket microservice may need product information and a price to add an item to its basket. There are many approaches for implementing query operations.

### Request/Response Messaging

One option for implementing this scenario is for the calling back-end microservice to make direct HTTP requests to the microservices it needs to query.

> ![Direct HTTP communication diagram](media/direct-http-communication.png)

**Figure 6-7**. Direct HTTP communication

While direct HTTP calls between microservices are relatively simple to implement, care should be taken to minimize this practice. To start, these calls are always *synchronous* and will block the operation until a result is returned or the request times out. What were once self-contained, independent services, able to evolve independently and deploy frequently, now become coupled to each other. As coupling among microservices increases, their architectural benefits diminish.

Executing an infrequent request that makes a single direct HTTP call to another microservice might be acceptable for some systems. However, high-volume calls that invoke direct HTTP calls to multiple microservices aren't advisable. They can increase latency and negatively impact the performance, scalability, and availability of your system. Even worse, a long series of direct HTTP communication can lead to deep and complex chains of synchronous microservice calls:

> ![Chaining HTTP queries diagram](media/chaining-http-queries.png)

**Figure 6-8**. Chaining HTTP queries

You can certainly imagine the risk in the design shown in the previous image. What happens if Step \#3 fails? Or Step \#8 fails? How do you recover? What if Step \#6 is slow because the underlying service is busy? How do you continue? Even if all works correctly, think of the latency this call would incur, which is the sum of the latency of each step.

The large degree of coupling in the previous image suggests the services weren't optimally modeled. It would behoove the team to revisit their design.

### Materialized View pattern

A popular option for removing microservice coupling is the [Materialized View pattern](/azure/architecture/patterns/materialized-view). With this pattern, a microservice stores its own local, denormalized copy of data that's owned by other services. Instead of the Shopping Basket microservice querying the Product Catalog and Pricing microservices, it maintains its own local copy of that data. This pattern eliminates unnecessary coupling and improves reliability and response time. The entire operation executes inside a single process. We explore this pattern and other data concerns in Chapter 5.
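To make the pattern concrete, here's a minimal sketch, assuming illustrative event, entity, and store types, of how a Basket service might maintain its local, denormalized copy of catalog data:

```csharp
// Denormalized product data, owned and stored locally by the Basket service.
public record BasketProduct(string ProductId, string Name, decimal Price);

// Integration event published by the Catalog service (illustrative shape).
public record ProductPriceChangedEvent(string ProductId, decimal NewPrice);

// Abstraction over the Basket service's local store (for example, Redis or a table).
public interface IBasketProductStore
{
    Task<BasketProduct?> GetAsync(string productId);
    Task SaveAsync(BasketProduct product);
}

public class ProductPriceChangedHandler
{
    private readonly IBasketProductStore _store;

    public ProductPriceChangedHandler(IBasketProductStore store) => _store = store;

    // Keeps the local copy fresh; basket queries never call the Catalog service.
    public async Task HandleAsync(ProductPriceChangedEvent @event)
    {
        var product = await _store.GetAsync(@event.ProductId);
        if (product is not null)
        {
            await _store.SaveAsync(product with { Price = @event.NewPrice });
        }
    }
}
```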
### Service Aggregator Pattern

Another option for eliminating microservice-to-microservice coupling is an [Aggregator microservice](https://devblogs.microsoft.com/cesardelatorre/designing-and-implementing-api-gateways-with-ocelot-in-a-microservices-and-container-based-architecture/).

> ![Aggregator service diagram](media/aggregator-service.png)

**Figure 6-9**. Aggregator microservice

The pattern isolates an operation that makes calls to multiple back-end microservices, centralizing its logic into a specialized microservice. The purple checkout aggregator microservice in the previous figure orchestrates the workflow for the Checkout operation. It includes calls to several back-end microservices in a sequenced order. Data from the workflow is aggregated and returned to the caller. While it still implements direct HTTP calls, the aggregator microservice reduces direct dependencies among back-end microservices.

### Request-reply pattern

Another approach for decoupling synchronous HTTP messages is a [Request-reply pattern](https://www.enterpriseintegrationpatterns.com/patterns/messaging/RequestReply.html), which uses queuing communication. Communication using a queue is always a one-way channel, with a producer sending the message and a consumer receiving it. With this pattern, both a request queue and a response queue are implemented.

> ![Request-reply pattern diagram](media/request-reply-pattern.png)

**Figure 6-10**. Request-reply pattern

Here, the message producer creates a query-based message that contains a unique correlation ID and places it into a request queue. The consuming service dequeues the message, processes it, and places the response into the response queue with the same correlation ID. The producer service dequeues the response, matches it using the correlation ID, and continues processing. We cover queues in detail in the next section.

## Commands

Another type of communication interaction is a *command*. A microservice may need another microservice to perform an action. The Ordering microservice may need the Shipping microservice to create a shipment for an approved order. In Figure 6-11, one microservice, called a Producer, sends a message to another microservice, the Consumer, commanding it to do something.

> ![Command interaction with a queue diagram](media/command-interaction-with-queue.png)

**Figure 6-11**. Command interaction with a queue

Most often, the Producer doesn't require a response and can *fire-and-forget* the message. If a reply is needed, the Consumer sends a separate message back to the Producer on another channel. A command message is best sent asynchronously with a message queue, supported by a lightweight message broker. In the previous diagram, note how a queue separates and decouples both services.

A message queue is an intermediary construct through which a producer and consumer pass a message. Queues implement an asynchronous, point-to-point messaging pattern. The Producer knows where a command needs to be sent and routes appropriately. The queue guarantees that a message is processed by exactly one of the consumer instances that are reading from the channel. In this scenario, either the producer or consumer service can scale out without affecting the other. The technologies on each side can also differ; for example, a Java microservice could call a [Golang](https://golang.org) microservice.

In chapter 1, we talked about *backing services*.
Backing services are ancillary resources upon which cloud-native systems depend. Message queues are backing services. The Azure cloud supports two types of message queues that your cloud-native systems can consume to implement command messaging: Azure Storage Queues and Azure Service Bus Queues.

### Azure Storage Queues

Azure Storage queues offer a simple queueing infrastructure that is fast, affordable, and backed by Azure Storage accounts.

[Azure Storage Queues](/azure/storage/queues/storage-queues-introduction) feature a REST-based queuing mechanism with reliable and persistent messaging. They provide a minimal feature set, but are inexpensive and store millions of messages. Their capacity ranges up to 500 TB. A single message can be up to 64 KB in size.

You can access messages from anywhere in the world via authenticated calls using HTTP or HTTPS. Storage queues can scale out to large numbers of concurrent clients to handle traffic spikes.

There are a few limitations with the service to be aware of:

- Message order isn't guaranteed.
- A message can only persist for seven days before it's automatically removed.
- Support for state management, duplicate detection, or transactions isn't available.

> ![Storage queue hierarchy diagram](media/storage-queue-hierarchy.png)

**Figure 6-12**. The hierarchy of an Azure Storage queue

In the previous figure, note how storage queues store their messages in the underlying Azure Storage account.

For developers, Microsoft provides several client and server-side libraries for storage queue processing. Most major platforms are supported, including .NET, Java, JavaScript, Ruby, Python, and Go. Avoid having your mainline service code communicate directly with these libraries, as doing so tightly couples your microservice code to the Azure Storage Queue service. It's a better practice to insulate the implementation details of the API. Introduce an intermediation layer, or intermediate API, that exposes generic operations and encapsulates the concrete library. This loose coupling enables you to swap out one queuing service for another without having to make changes to the mainline service code.

Azure Storage queues are an economical option for implementing command messaging in your cloud-native applications, especially when queue sizes will exceed 80 GB or a simple feature set is acceptable. You only pay for the storage of the messages; there are no fixed hourly charges.

### Azure Service Bus Queues

For more complex messaging requirements, consider Azure Service Bus queues.

Sitting atop a robust message infrastructure, [Azure Service Bus](/azure/service-bus-messaging/service-bus-messaging-overview) supports a *brokered messaging model*. Messages are reliably stored in a broker (the queue) until received by the consumer. The queue guarantees First-In/First-Out (FIFO) message delivery, respecting the order in which messages were added to the queue.

The size of a message can be much larger, up to 256 KB. Messages are persisted in the queue for an unlimited period of time. Service Bus supports not only HTTP-based calls, but also provides full support for the [AMQP protocol](/azure/service-bus-messaging/service-bus-amqp-overview). AMQP is an open standard across vendors that supports a binary protocol and higher degrees of reliability.
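As a brief sketch of what command messaging looks like in code, the `Azure.Messaging.ServiceBus` library sends a message to a queue like this. The queue name, connection string, and message body are placeholders:

```csharp
using Azure.Messaging.ServiceBus;

string connectionString = "<service-bus-connection-string>";

// ServiceBusClient is expensive to create; in a real app, register it as a singleton.
await using var client = new ServiceBusClient(connectionString);
ServiceBusSender sender = client.CreateSender("ship-orders");

var message = new ServiceBusMessage("""{ "orderId": 12 }""")
{
    ContentType = "application/json"
};

await sender.SendMessageAsync(message);
```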
Service Bus provides a rich set of features, including [transaction support](/azure/service-bus-messaging/service-bus-transactions) and a [duplicate detection feature](/azure/service-bus-messaging/duplicate-detection). The queue guarantees "at most once delivery" per message. It automatically discards a message that has already been sent. If a producer is in doubt, it can resend the same message, and Service Bus guarantees that only one copy will be processed. Duplicate detection frees you from having to build additional infrastructure plumbing.

Two more enterprise features are partitioning and sessions. A conventional Service Bus queue is handled by a single message broker and stored in a single message store. But, [Service Bus Partitioning](/azure/service-bus-messaging/service-bus-partitioning) spreads the queue across multiple message brokers and message stores. The overall throughput is no longer limited by the performance of a single message broker or messaging store. A temporary outage of a messaging store doesn't render a partitioned queue unavailable.

[Service Bus Sessions](https://codingcanvas.com/azure-service-bus-sessions/) provide a way to group related messages. Imagine a workflow scenario where messages must be processed together and the operation completed at the end. To take advantage, sessions must be explicitly enabled for the queue, and each related message must contain the same session ID.

However, there are some important caveats: Service Bus queue size is limited to 80 GB, which is much smaller than what's available from Azure Storage queues. Additionally, Service Bus queues incur a base cost and charge per operation.

> ![Service Bus queue diagram](media/service-bus-queue.png)

**Figure 6-13**. The high-level architecture of a Service Bus queue

In the previous figure, note the point-to-point relationship. Two instances of the same producer are enqueuing messages into a single Service Bus queue. Each message is consumed by only one of three consumer instances on the right. Next, we discuss how to implement messaging where different consumers may all be interested in the same message.

## Events

Message queuing is an effective way to implement communication where a producer can asynchronously send a consumer a message. However, what happens when *many different consumers* are interested in the same message? A dedicated message queue for each consumer wouldn't scale well and would become difficult to manage.

To address this scenario, we move to the third type of message interaction, the *event*. One microservice announces that an action has occurred. Other microservices, if interested, react to the action, or event. This is also known as the [event-driven architectural style](/azure/architecture/guide/architecture-styles/event-driven).

Eventing is a two-step process. For a given state change, a microservice publishes an event to a message broker, making it available to any other interested microservice. The interested microservice is notified by subscribing to the event in the message broker. You use the [Publish/Subscribe](/azure/architecture/patterns/publisher-subscriber) pattern to implement [event-based communication](/dotnet/standard/microservices-architecture/multi-container-microservice-net-applications/integration-event-based-microservice-communications).

Figure 6-14 shows a shopping basket microservice publishing an event with two other microservices subscribing to it.
> ![Event-Driven messaging diagram](media/event-driven-messaging.png)

**Figure 6-14**. Event-Driven messaging

Note the *event bus* component that sits in the middle of the communication channel. It's a custom class that encapsulates the message broker and decouples it from the underlying application. The ordering and inventory microservices independently act upon the event with no knowledge of each other, nor of the shopping basket microservice. When the registered event is published to the event bus, they act upon it.

With eventing, we move from queuing technology to *topics*. A [topic](/azure/service-bus-messaging/service-bus-dotnet-how-to-use-topics-subscriptions) is similar to a queue, but supports a one-to-many messaging pattern. One microservice publishes a message. Multiple subscribing microservices can choose to receive and act upon that message. The following figure shows a topic architecture.

> [!div class="mx-imgBorder"]
> ![Topic architecture diagram](media/topic-architecture.png)

In the previous figure, publishers send messages to the topic. At the end, subscribers receive messages from subscriptions. In the middle, the topic forwards messages to subscriptions based on a set of rules, shown in dark blue boxes. Rules act as a filter that forwards specific messages to a subscription. Here, a "GetPrice" event would be sent to the price and logging subscriptions, as the logging subscription has chosen to receive all messages. A "GetInformation" event would be sent to the information and logging subscriptions.

The Azure cloud supports two different topic services: Azure Service Bus Topics and Azure Event Grid.

### Azure Service Bus Topics

Sitting on top of the same robust brokered message model of Azure Service Bus queues are [Azure Service Bus Topics](/azure/service-bus-messaging/service-bus-dotnet-how-to-use-topics-subscriptions). A topic can receive messages from multiple independent publishers and send messages to up to 2,000 subscribers. Subscriptions can be dynamically added or removed at run time without stopping the system or recreating the topic.

Many advanced features from Azure Service Bus queues are also available for topics, including [Duplicate Detection](/azure/service-bus-messaging/duplicate-detection) and [Transaction support](/azure/service-bus-messaging/service-bus-transactions). By default, Service Bus topics are handled by a single message broker and stored in a single message store. But, [Service Bus Partitioning](/azure/service-bus-messaging/service-bus-partitioning) scales a topic by spreading it across many message brokers and message stores.

[Scheduled Message Delivery](/azure/service-bus-messaging/message-sequencing) tags a message with a specific time for processing. The message won't appear in the topic before that time. [Message Deferral](/azure/service-bus-messaging/message-deferral) enables you to defer a retrieval of a message to a later time. Both are commonly used in workflow processing scenarios where operations are processed in a particular order. You can postpone processing of received messages until prior work has been completed.

Service Bus topics are a robust and proven technology for enabling publish/subscribe communication in your cloud-native systems.

### Azure Event Grid

While Azure Service Bus is a battle-tested messaging broker with a full set of enterprise features, [Azure Event Grid](/azure/event-grid/overview) is the new kid on the block.
At first glance, Event Grid may look like just another topic-based messaging system. However, it's different in many ways. Focused on event-driven workloads, it enables real-time event processing, deep Azure integration, and an open platform - all on serverless infrastructure. It's designed for contemporary cloud-native and serverless applications.

As a centralized *eventing backplane*, or pipe, Event Grid reacts to events inside Azure resources and from your own services.

Event notifications are published to an Event Grid Topic, which, in turn, routes each event to a subscription. Subscribers map to subscriptions and consume the events. Like Service Bus, Event Grid supports a *filtered subscriber model* where a subscription sets rules for the events it wishes to receive. Event Grid provides fast throughput with a guarantee of 10 million events per second, enabling near real-time delivery - far more than what Azure Service Bus can generate.

A sweet spot for Event Grid is its deep integration into the fabric of Azure infrastructure. An Azure resource, such as Cosmos DB, can publish built-in events directly to other interested Azure resources - without the need for custom code. Event Grid can publish events from an Azure Subscription, Resource Group, or Service, giving developers fine-grained control over the lifecycle of cloud resources. However, Event Grid isn't limited to Azure. It's an open platform that can consume custom HTTP events published from applications or third-party services and route events to external subscribers.

When publishing and subscribing to native events from Azure resources, no coding is required. With simple configuration, you can integrate events from one Azure resource to another leveraging built-in plumbing for Topics and Subscriptions.

> ![Event Grid anatomy diagram](media/event-grid-anatomy.png)

**Figure 6-15**. Event Grid anatomy

A major difference between Event Grid and Service Bus is the underlying *message exchange pattern*.

Service Bus implements an older-style *pull model* in which the downstream subscriber actively polls the topic subscription for new messages. On the upside, this approach gives the subscriber full control of the pace at which it processes messages. It controls when and how many messages to process at any given time. Unread messages remain in the subscription until processed. A significant shortcoming is the latency between the time the event is generated and the polling operation that pulls that message to the subscriber for processing. Also, the overhead of constant polling for the next event consumes resources and money.

Event Grid, however, is different. It implements a *push model* in which events are sent to the event handlers as they're received, giving near real-time event delivery. It also reduces cost as the service is triggered only when it's needed to consume an event - not continually as with polling. That said, an event handler must handle the incoming load and provide throttling mechanisms to protect itself from becoming overwhelmed. Many Azure services that consume these events, such as Azure Functions and Logic Apps, provide automatic scaling capabilities to handle increased loads.

Event Grid is a fully managed serverless cloud service. It dynamically scales based on your traffic and charges you only for your actual usage, not pre-purchased capacity.
The first 100,000 operations per month are free - operations being defined as event ingress (incoming event notifications), subscription delivery attempts, management calls, and filtering by subject. With 99.99% availability, Event Grid guarantees the delivery of an event within a 24-hour period, with built-in retry functionality for unsuccessful delivery. Undelivered messages can be moved to a "dead-letter" queue for resolution. Unlike Azure Service Bus, Event Grid is tuned for fast performance and doesn't support features like ordered messaging, transactions, and sessions.

### Streaming messages in the Azure cloud

Azure Service Bus and Event Grid provide great support for applications that expose single, discrete events, such as when a new document has been inserted into a Cosmos DB table. But what if your cloud-native system needs to process a *stream of related events*? [Event streams](/archive/msdn-magazine/2015/february/microsoft-azure-the-rise-of-event-stream-oriented-systems) are more complex. They're typically time-ordered, interrelated, and must be processed as a group.

[Azure Event Hub](https://azure.microsoft.com/services/event-hubs/) is a data streaming platform and event ingestion service that collects, transforms, and stores events. It's fine-tuned to capture streaming data, such as continuous event notifications emitted from a telemetry context. The service is highly scalable and can store and [process millions of events per second](/azure/event-hubs/event-hubs-about). Shown in Figure 6-16, it's often a front door for an event pipeline, decoupling the ingest stream from event consumption.

> ![Azure Event Hub diagram](media/azure-event-hub.png)

**Figure 6-16**. Azure Event Hub

Event Hub supports low latency and configurable time retention. Unlike queues and topics, Event Hubs keep event data after it's been read by a consumer. This feature enables other data analytic services, both internal and external, to replay the data for further analysis. Events stored in Event Hub are only deleted upon expiration of the retention period, which is one day by default, but configurable.

Event Hub supports common event publishing protocols including HTTPS and AMQP. It also supports Kafka 1.0. [Existing Kafka applications can communicate with Event Hub](/azure/event-hubs/event-hubs-for-kafka-ecosystem-overview) using the Kafka protocol providing an alternative to managing large Kafka clusters. Many open-source cloud-native systems embrace Kafka.

Event Hubs implements message streaming through a [partitioned consumer model](/azure/event-hubs/event-hubs-features) in which each consumer only reads a specific subset, or partition, of the message stream. This pattern enables tremendous horizontal scale for event processing and provides other stream-focused features that are unavailable in queues and topics. A partition is an ordered sequence of events that is held in an event hub. As newer events arrive, they're added to the end of this sequence.

> ![Event Hub partitioning diagram](media/event-hub-partitioning.png)

**Figure 6-17**. Event Hub partitioning

Instead of reading from the same resource, each consumer group reads across a subset, or partition, of the message stream.

For cloud-native applications that must stream large numbers of events, Azure Event Hub can be a robust and affordable solution.
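For reference, publishing a batch of events with the `Azure.Messaging.EventHubs` library takes just a few lines of code. This is a minimal sketch; the connection string, hub name, and event payload are placeholders:

```csharp
using System.Text;
using Azure.Messaging.EventHubs;
using Azure.Messaging.EventHubs.Producer;

string connectionString = "<event-hubs-connection-string>";

await using var producer = new EventHubProducerClient(connectionString, "telemetry");

// Batches respect the maximum message size; TryAdd returns false when the batch is full.
using EventDataBatch batch = await producer.CreateBatchAsync();
batch.TryAdd(new EventData(
    Encoding.UTF8.GetBytes("""{ "deviceId": "sensor-1", "reading": 21.5 }""")));

await producer.SendAsync(batch);
```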
## Service-to-service communications in .NET Aspire

When you build a cloud-native app using the .NET Aspire stack, service discovery is much easier and is managed by code in the App Host project. The stack includes many integrations that support service-to-service communications. At the time of writing, these include:

- Apache Kafka.
- Azure Storage Queues.
- Azure Event Hubs.
- Azure Service Bus.
- RabbitMQ.

.NET Aspire makes it easier to manage and discover these services for your entire solution. You create and manage them in the centralized App Host project and pass them to the microservices that use them. In the microservices, you can easily obtain objects to work with them using dependency injection. For example, if you're working with Azure Service Bus, you get an instance of the `ServiceBusClient` class to use for interacting with queues and topics.

However, the code that sends and retrieves messages will be similar, whether you use .NET Aspire or not.

>[!div class="step-by-step"]
>[Previous](service-discovery.md)
>[Next](grpc.md)

diff --git a/docs/architecture/distributed-cloud-native-apps-containers/testing-distributed-apps/challenges-of-distributed-app-testing.md b/docs/architecture/distributed-cloud-native-apps-containers/testing-distributed-apps/challenges-of-distributed-app-testing.md new file mode 100644 index 0000000000000..c8c887933246f --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/testing-distributed-apps/challenges-of-distributed-app-testing.md @@ -0,0 +1,45 @@

---
title: The challenges of distributed app testing
description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | The challenges of distributed app testing
ms.date: 06/05/2024
---

# The challenges of distributed app testing

[!INCLUDE [download-alert](../includes/download-alert.md)]

Distributed applications and cloud-native apps, which run across multiple servers, devices, or geographical locations, have become increasingly prevalent in today's software landscape. While they offer scalability, fault tolerance, and improved performance, testing such applications presents unique challenges.

Here are key obstacles faced during distributed app testing and strategies to overcome them:

1. **Network latency and communication issues**
   - **Challenge**: Distributed apps rely on network communication between components. Latency, packet loss, and network congestion can impact performance and reliability.
   - **Solution**: Simulate real-world network conditions during testing. Use tools like [JMeter](https://learn.microsoft.com/azure/load-testing/how-to-create-and-run-load-test-with-jmeter-script?tabs=portal) or [Chaos engineering](https://learn.microsoft.com/azure/chaos-studio/chaos-studio-overview) to inject latency and test edge cases.

1. **Data consistency and synchronization**
   - **Challenge**: Distributed systems often involve data replication across nodes. Ensuring consistency during updates or failures is complex.
   - **Solution**: Implement **distributed database systems**, such as **Cassandra** or **MongoDB**, and test scenarios like node failure, data replication, and eventual consistency.

1. **Scalability testing**
   - **Challenge**: Distributed apps must handle varying loads. Testing scalability involves simulating thousands of concurrent users.
   - **Solution**: Use **load testing tools**, such as [Azure Load Testing](https://azure.microsoft.com/products/load-testing/), to assess performance under heavy traffic. Monitor resource utilization and bottlenecks.

1. **Fault tolerance and recovery**
   - **Challenge**: Distributed systems encounter failures like node crashes and network partitions. Ensuring graceful recovery is critical.
   - **Solution**: Test scenarios like **node failures**, **network splits**, and **automatic failover**. Validate data integrity after recovery.

1. **Security and authorization**
   - **Challenge**: Distributed apps face security threats like **man-in-the-middle attacks** and **data breaches**. Authorization across nodes is complex.
   - **Solution**: Conduct **penetration testing**, validate **authentication mechanisms**, and assess **data encryption**.

1. **Testing across environments**
   - **Challenge**: Distributed apps run on diverse environments, such as in the cloud, on-premises, and in different container host systems. Ensuring consistency is crucial.
   - **Solution**: Use **infrastructure as code** tools, such as [Terraform](https://learn.microsoft.com/azure/developer/terraform/overview) or [Bicep](https://learn.microsoft.com/azure/azure-resource-manager/bicep/overview?tabs=bicep), to manage environments. Test across different setups.

1. **Monitoring and observability**
   - **Challenge**: Distributed systems generate vast logs and metrics. Identifying issues and bottlenecks requires effective monitoring.
   - **Solution**: Set up **centralized logging** tools, such as [ELK stack](https://learn.microsoft.com/azure/virtual-machines/linux/tutorial-elasticsearch) or [Prometheus](https://learn.microsoft.com/azure/azure-monitor/essentials/prometheus-metrics-overview). Use [distributed tracing](https://learn.microsoft.com/azure/azure-monitor/app/distributed-trace-data) to visualize request flows.

>[!div class="step-by-step"]
>[Previous](../cloud-native-identity/keycloak.md)
>[Next](test-aspnet-core-services-web-apps.md)

diff --git a/docs/architecture/distributed-cloud-native-apps-containers/testing-distributed-apps/how-aspire-helps.md b/docs/architecture/distributed-cloud-native-apps-containers/testing-distributed-apps/how-aspire-helps.md new file mode 100644 index 0000000000000..244e2be638ae5 --- /dev/null +++ b/docs/architecture/distributed-cloud-native-apps-containers/testing-distributed-apps/how-aspire-helps.md @@ -0,0 +1,36 @@

---
title: How .NET Aspire helps with the challenges of distributed app testing
description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | How .NET Aspire helps with the challenges of distributed app testing
ms.date: 06/05/2024
---

# How .NET Aspire helps with the challenges of distributed app testing

[!INCLUDE [download-alert](../includes/download-alert.md)]

Distributed application testing is fraught with complexities due to the inherent nature of these systems. However, .NET Aspire integrations include logging, tracing, and metrics configurations by default using the [.NET OpenTelemetry SDK](https://github.com/open-telemetry/opentelemetry-dotnet). From creating test projects to orchestrating resources and debugging with an integrated dashboard, .NET Aspire equips developers with the tools needed to ensure their distributed applications are robust, reliable, and ready for deployment.
+
+>[!div class="step-by-step"]
+>[Previous](../cloud-native-identity/keycloak.md)
+>[Next](test-aspnet-core-services-web-apps.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/testing-distributed-apps/how-aspire-helps.md b/docs/architecture/distributed-cloud-native-apps-containers/testing-distributed-apps/how-aspire-helps.md
new file mode 100644
index 0000000000000..244e2be638ae5
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/testing-distributed-apps/how-aspire-helps.md
+---
+title: How .NET Aspire helps with the challenges of distributed app testing
+description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | How .NET Aspire helps with the challenges of distributed app testing
+ms.date: 06/05/2024
+---
+
+# How .NET Aspire helps with the challenges of distributed app testing
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+Distributed application testing is fraught with complexities due to the inherent nature of these systems. .NET Aspire addresses many of them directly: its integrations include logging, tracing, and metrics configuration by default using the [.NET OpenTelemetry SDK](https://github.com/open-telemetry/opentelemetry-dotnet). From creating test projects to orchestrating resources and debugging with an integrated dashboard, .NET Aspire equips developers with the tools needed to ensure their distributed applications are robust, reliable, and ready for deployment.
+
+By embracing .NET Aspire, teams can streamline their testing processes, reduce time to market, and deliver high-quality software that meets the demands of modern distributed computing environments.
+
+.NET Aspire includes the following features for testing distributed apps:
+
+- **OpenTelemetry integration**: .NET Aspire uses the OpenTelemetry SDK to send data to the .NET Aspire dashboard, but you can also transmit telemetry data to other monitoring tools, such as Prometheus.
+
+- **Streamlined test creation**: .NET Aspire simplifies the creation of test projects with its **xUnit testing project template**. This template isn't just for unit tests, but also for the functional and integration tests that are crucial for distributed apps.
+
+- **Functional and integration testing**: The **DistributedApplicationTestingBuilder** in .NET Aspire allows you to create an app host and run tests that mimic real-world scenarios, such as varying network conditions and API failures (see the sketch after this list).
+
+- **Orchestration and resource management**: .NET Aspire's orchestration capabilities provide APIs for expressing resources and dependencies within your distributed application. This feature is invaluable for managing the configuration and interconnections of cloud-native apps during local development.
+
+- **Developer dashboard**: The **developer dashboard** is a pivotal tool in .NET Aspire, offering a unified view of services, logs, metrics, and traces. This dashboard is essential for debugging distributed applications and is easily accessible from Visual Studio or the command line.
+
+- **Resilience testing**: .NET Aspire works alongside tools such as **Dev Proxy** for building and testing resilient apps. Dev Proxy lets developers simulate API failures, different network conditions, and more from the local machine, which is critical for distributed app testing.
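+
+For example, a functional test built on `DistributedApplicationTestingBuilder` (from the `Aspire.Hosting.Testing` package) follows this general shape; the app host project name and the `webfrontend` resource name below are placeholders from the starter template:
+
+```csharp
+using System.Net;
+using System.Threading.Tasks;
+using Aspire.Hosting.Testing;
+using Xunit;
+
+public class WebFrontendTests
+{
+    [Fact]
+    public async Task GetWebResourceRootReturnsOkStatusCode()
+    {
+        // Arrange: build and start the whole distributed application in-process.
+        var appHost = await DistributedApplicationTestingBuilder
+            .CreateAsync<Projects.AspireApp_AppHost>();
+        await using var app = await appHost.BuildAsync();
+        await app.StartAsync();
+
+        // Act: call one of the orchestrated resources over HTTP.
+        var httpClient = app.CreateHttpClient("webfrontend");
+        var response = await httpClient.GetAsync("/");
+
+        // Assert
+        Assert.Equal(HttpStatusCode.OK, response.StatusCode);
+    }
+}
+```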
+
+### Additional resources
+
+- [Testing .NET Aspire apps](https://learn.microsoft.com/dotnet/aspire/fundamentals/testing)
+- [.NET Aspire telemetry](https://learn.microsoft.com/dotnet/aspire/fundamentals/telemetry)
+
+>[!div class="step-by-step"]
+>[Previous](challenges-of-distributed-app-testing.md)
+>[Next](../api-gateways/gateway-patterns.md)
diff --git a/docs/architecture/distributed-cloud-native-apps-containers/testing-distributed-apps/test-aspnet-core-services-web-apps.md b/docs/architecture/distributed-cloud-native-apps-containers/testing-distributed-apps/test-aspnet-core-services-web-apps.md
new file mode 100644
index 0000000000000..d006714703e66
--- /dev/null
+++ b/docs/architecture/distributed-cloud-native-apps-containers/testing-distributed-apps/test-aspnet-core-services-web-apps.md
+---
+title: Testing ASP.NET Core services and web apps
+description: Architecture for Distributed Cloud-Native Apps with .NET Aspire & Containers | Testing ASP.NET Core services and web apps
+ms.date: 01/13/2021
+---
+
+# Testing ASP.NET Core services and web apps
+
+[!INCLUDE [download-alert](../includes/download-alert.md)]
+
+Controllers are a central part of any ASP.NET Core API microservice. As such, you should have confidence that they behave as intended for your application. Automated tests can provide you with this confidence and can detect errors before they reach production.
+
+You need to test how the controller behaves based on valid or invalid inputs, and test controller responses based on the result of the business operation it performs. You should have these types of tests for your microservices:
+
+- **Unit tests**: These tests ensure that individual components of the application work as expected. Assertions test the component API.
+
+- **Integration tests**: These tests ensure that component interactions work as expected against external artifacts like databases. Assertions can test the component API, user interface, or the side effects of actions like database I/O, logging, and so on.
+
+- **Functional tests**: These tests ensure that the application works as expected from the user's perspective.
+
+- **Service tests**: These tests check that end-to-end service use cases, including testing multiple services at the same time, are working. For this type of testing, you need to prepare the environment first. In this case, it means starting the services, for example, by using the `docker-compose up` command.
+
+## Implementing unit tests for ASP.NET Core web APIs
+
+Unit testing involves testing a small part of an application in isolation from its infrastructure and dependencies. When you unit test controller logic, only the content of a single action or method is tested, not the behavior of its dependencies or of the framework itself. Unit tests don't detect issues in the interaction between components; that's the purpose of integration testing.
+
+As you unit test your controller actions, make sure you focus only on their behavior. A controller unit test avoids things like filters, routing, and model binding (the mapping of request data to a view model or DTO). Because they focus on testing just one thing, unit tests are generally simple to write and quick to run. A well-written set of unit tests can be run frequently without much overhead. If a unit test fails, it's usually quick to identify and fix the problem.
+
+Unit tests are implemented with test frameworks like **xUnit.net**, **MSTest**, or **NUnit**, often combined with a mocking library like **Moq**. In this example, we're using xUnit and Moq.
+
+When you write a unit test for a Web API controller, you instantiate the controller class directly using the `new` keyword in C\#, so that the test runs as fast as possible. The following example shows how to do this when using [xUnit](https://xunit.net/) as the test framework.
+
+```csharp
+[Fact]
+public async Task Get_order_detail_success()
+{
+    //Arrange
+    var fakeOrderId = "12";
+    var fakeOrder = GetFakeOrder();
+
+    //...
+
+    //Act
+    var orderController = new OrderController(
+        _orderServiceMock.Object,
+        _basketServiceMock.Object,
+        _identityParserMock.Object);
+
+    orderController.ControllerContext.HttpContext = _contextMock.Object;
+    var actionResult = await orderController.Detail(fakeOrderId);
+
+    //Assert
+    var viewResult = Assert.IsType<ViewResult>(actionResult);
+    Assert.IsAssignableFrom<Order>(viewResult.ViewData.Model);
+}
+```
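+
+The `//...` in the Arrange phase stands in for the mock setup, which this sample elides. Purely as a hypothetical illustration (the interface names are placeholders, not a prescribed API), that setup with Moq might look like this:
+
+```csharp
+using Microsoft.AspNetCore.Http;
+using Moq;
+
+public class OrderControllerTest
+{
+    // Hypothetical mocks; the interface names are placeholders for the
+    // abstractions your controller actually depends on.
+    private readonly Mock<IOrderService> _orderServiceMock = new();
+    private readonly Mock<IBasketService> _basketServiceMock = new();
+    private readonly Mock<IIdentityParser<ApplicationUser>> _identityParserMock = new();
+    private readonly Mock<HttpContext> _contextMock = new();
+
+    private void ArrangeGetOrderReturns(Order fakeOrder)
+    {
+        // Return the fake order for any user and any order id.
+        _orderServiceMock
+            .Setup(s => s.GetOrder(It.IsAny<ApplicationUser>(), It.IsAny<string>()))
+            .ReturnsAsync(fakeOrder);
+    }
+}
+```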
+
+## Implementing integration and functional tests for each microservice
+
+As noted, integration tests and functional tests have different purposes and goals. However, the way you implement both when testing ASP.NET Core controllers is similar, so in this section we concentrate on integration tests.
+
+Integration testing ensures that an application's components function correctly when assembled. ASP.NET Core supports integration testing using unit test frameworks and a built-in test web host that can be used to handle requests without network overhead.
+
+Unlike unit tests, integration tests frequently involve application infrastructure concerns, such as databases, file systems, network resources, or web requests and responses. Unit tests use fakes or mock objects in place of these concerns. The purpose of integration tests, by contrast, is to confirm that the system works as expected with these dependencies, so you don't use fakes or mock objects. Instead, you include the real infrastructure, such as databases or remote services.
+
+Because integration tests exercise larger segments of code than unit tests, and because they rely on infrastructure elements, they tend to be orders of magnitude slower. It's therefore a good idea to limit how many integration tests you write and run.
+
+ASP.NET Core includes a built-in test web host that can be used to handle HTTP requests without network overhead, so you can run those tests faster than when using a real web host. The test web host (TestServer) is available in the NuGet package [Microsoft.AspNetCore.TestHost](https://www.nuget.org/packages/Microsoft.AspNetCore.TestHost). It can be added to integration test projects and used to host ASP.NET Core applications.
+
+As you can see in the following code, when you create integration tests for ASP.NET Core controllers, you exercise the controllers through the test host. This is comparable to making a real HTTP request, but without the network overhead, so it runs faster.
+
+```csharp
+public class PrimeWebDefaultRequestShould
+{
+    private readonly TestServer _server;
+    private readonly HttpClient _client;
+
+    public PrimeWebDefaultRequestShould()
+    {
+        // Arrange
+        _server = new TestServer(new WebHostBuilder()
+            .UseStartup<Startup>());
+        _client = _server.CreateClient();
+    }
+
+    [Fact]
+    public async Task ReturnHelloWorld()
+    {
+        // Act
+        var response = await _client.GetAsync("/");
+        response.EnsureSuccessStatusCode();
+        var responseString = await response.Content.ReadAsStringAsync();
+        // Assert
+        Assert.Equal("Hello World!", responseString);
+    }
+}
+```
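+
+In newer ASP.NET Core versions, the same approach is usually expressed with `WebApplicationFactory<TEntryPoint>` from the [Microsoft.AspNetCore.Mvc.Testing](https://www.nuget.org/packages/Microsoft.AspNetCore.Mvc.Testing) package, which wraps TestServer for you. Here's a minimal sketch, assuming the app's `Program` class is visible to the test project (for example, via `public partial class Program { }`):
+
+```csharp
+using System.Net.Http;
+using System.Threading.Tasks;
+using Microsoft.AspNetCore.Mvc.Testing;
+using Xunit;
+
+public class ApiRootShould : IClassFixture<WebApplicationFactory<Program>>
+{
+    private readonly HttpClient _client;
+
+    public ApiRootShould(WebApplicationFactory<Program> factory)
+    {
+        // The factory bootstraps the app in memory using its real Program.
+        _client = factory.CreateClient();
+    }
+
+    [Fact]
+    public async Task ReturnSuccess()
+    {
+        var response = await _client.GetAsync("/");
+
+        // EnsureSuccessStatusCode throws if the status code isn't 2xx.
+        response.EnsureSuccessStatusCode();
+    }
+}
+```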
+
+#### Additional resources
+
+- **Steve Smith. Testing controllers** (ASP.NET Core) \
+  [https://learn.microsoft.com/aspnet/core/mvc/controllers/testing](https://learn.microsoft.com/aspnet/core/mvc/controllers/testing)
+
+- **Steve Smith. Integration testing** (ASP.NET Core) \
+  [https://learn.microsoft.com/aspnet/core/test/integration-tests](https://learn.microsoft.com/aspnet/core/test/integration-tests)
+
+- **Unit testing in .NET using dotnet test** \
+  [https://learn.microsoft.com/dotnet/core/testing/unit-testing-with-dotnet-test](https://learn.microsoft.com/dotnet/core/testing/unit-testing-with-dotnet-test)
+
+- **xUnit.net**. Official site. \
+  [https://xunit.net/](https://xunit.net/)
+
+- **Unit test basics.** \
+  [https://learn.microsoft.com/visualstudio/test/unit-test-basics](https://learn.microsoft.com/visualstudio/test/unit-test-basics)
+
+- **Moq**. GitHub repo. \
+  [https://github.com/devlooped/moq](https://github.com/devlooped/moq)
+
+- **NUnit**. Official site. \
+  [https://nunit.org/](https://nunit.org/)
+
+## Implementing service tests on a multi-container application
+
+As noted earlier, when you test multi-container applications, all the microservices need to be running within the Docker host or container cluster. End-to-end service tests that include multiple operations involving several microservices require you to deploy and start the whole application in the Docker host by running `docker-compose up`, or a comparable mechanism if you're using an orchestrator. Once the whole application and all its services are running, you can execute end-to-end integration and functional tests.
+
+There are a few approaches you can use. In the _docker-compose.yml_ file that you use to deploy the application at the solution level, you can expand the entry point to use [dotnet test](https://learn.microsoft.com/dotnet/core/tools/dotnet-test). You can also use a separate compose file that runs your tests against the image you're targeting. By using a separate compose file for integration tests that includes your microservices and databases in containers, you can make sure the related data is always reset to its original state before the tests run.
+
+Once the compose application is up and running, you can set breakpoints and catch exceptions if you're running from Visual Studio, or run the integration tests automatically in your CI pipeline in Azure DevOps Services or any other CI/CD system that supports Docker containers.
+
+> [!div class="step-by-step"]
+> [Previous](challenges-of-distributed-app-testing.md)
+> [Next](how-aspire-helps.md)