Building and supporting infrastructure in a large organization can be challenging because every deployment step has to be automated. For example, creating a cloud environment for a department means provisioning a set of virtual machines with preinstalled software, utilities, and services, each of which may have its own dependencies, such as a database, storage, or a connection to Log Analytics and other logging services. Utilities and services may have external dependencies as well. You therefore need a mechanism that checks that the required connections are established, the dependencies are installed, and the proper automation runbook is triggered. The mechanism should also alert the administrator when an error occurs and perform a backup.
To build this mechanism, I will use the following Azure resources.
Event Grid is a key component that acts as a main event processor and contains:
- Topics are Azure resources that represent the components that generate events.
- Subscriptions/endpoints are Azure resources that handle the events.
The relation between publisher components, events, topic, and subscriber/endpoint is shown in the diagram below.
Event Grid also provides:
- Dead-letter queue and retry policy: if a message cannot reach the endpoint, Event Grid retries delivery according to the retry policy you configure, and messages that still cannot be delivered go to the dead-letter queue.
- Event filtering: a rule that lets Event Grid deliver only specific event types to the endpoint. For example, when you create a new VM in the topic container (resource group or subscription), Event Grid catches the event and delivers it to the endpoint.
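To make the filtering rule concrete, here is a minimal sketch of the check it expresses, written in Python for illustration only. In Event Grid the filter is configured on the event subscription, not in code. The function name `is_vm_created_event` is hypothetical, and the sketch assumes the event carries the `eventType` and `subject` fields of the Event Grid event schema, with the subject holding the resource ID of the resource that was written.

```python
from typing import Any, Dict

# Assumption: resource write events use this event type, and the subject is the
# resource ID, which contains the resource provider and type of the new resource.
WRITE_SUCCESS = "Microsoft.Resources.ResourceWriteSuccess"
VM_RESOURCE_TYPE = "/providers/microsoft.compute/virtualmachines/"


def is_vm_created_event(event: Dict[str, Any]) -> bool:
    """Return True only for a successful write to a virtual machine resource."""
    if event.get("eventType") != WRITE_SUCCESS:
        return False
    subject = str(event.get("subject", "")).lower() + "/"
    position = subject.find(VM_RESOURCE_TYPE)
    if position == -1:
        return False
    # Child resources (for example VM extensions) have extra path segments
    # after the VM name, so exactly one segment must remain.
    remainder = subject[position + len(VM_RESOURCE_TYPE):].strip("/")
    return remainder != "" and "/" not in remainder
```

With a rule like this, a write to a storage account or to a VM extension in the same resource group never reaches the endpoint.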
Queue Storage Account
Queue Storage is used as the main event storage. When an event is generated via Event Grid, its final destination is a queue.
Functions are used as microservices that contain the logic to validate required resources and connections. Each function may also have a database for storing configuration or state management data.
After a rather high-level architecture description, I will provide more details on it below.
As you may have noticed, Event Grid is linked with a subscription that listens for events related to new virtual machine creation. It is necessary to add filtering here; otherwise, Event Grid will generate a message whenever any resource is created in the subscription or the target resource group. All events are delivered to the storage queue. In my project, I’m using three queues:
- The main queue is a destination for all messages from the event grid.
- The retry queue receives all messages that failed during the first steps of validation and were scheduled for a future retry.
- The succeeded queue is used for all successfully processed messages. In my project, I used this queue for future statistics and reporting.
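The way messages move between the three queues can be sketched as follows. This is an in-memory illustration in Python, not the project's code: `EventQueues`, `route_main_queue`, and the `validate` callback are hypothetical names, and plain `deque` objects stand in for the storage queues.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict


@dataclass
class EventQueues:
    """In-memory stand-ins for the three storage queues (illustrative only)."""
    main: Deque[Dict] = field(default_factory=deque)
    retry: Deque[Dict] = field(default_factory=deque)
    succeeded: Deque[Dict] = field(default_factory=deque)


def route_main_queue(queues: EventQueues, validate: Callable[[Dict], bool]) -> None:
    """Drain the main queue and route each message by its validation result."""
    while queues.main:
        message = queues.main.popleft()
        if validate(message):
            # Kept for later statistics and reporting.
            queues.succeeded.append(message)
        else:
            # Scheduled for a future retry.
            queues.retry.append(message)
```

Keeping failed and succeeded messages in separate queues means the retry logic and the reporting logic can each read their own queue without inspecting every message.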
Also, the storage account is linked to Azure Log Analytics to synchronize all logs and alerts. For example, if there are more messages in the retry queue than expected, Log Analytics will log this as an error and alert the administrator.
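The retry-queue rule amounts to a simple threshold check. The sketch below shows that logic in Python under stated assumptions: `check_retry_backlog`, the `expected_max` threshold, and the shape of the returned record are all hypothetical, and in the real setup the rule would be defined as an alert in Log Analytics, not in application code.

```python
from typing import Any, Dict, Optional


def check_retry_backlog(queue_length: int, expected_max: int) -> Optional[Dict[str, Any]]:
    """Return an error record when the retry queue is longer than expected."""
    if queue_length <= expected_max:
        return None  # Backlog is within the expected range, nothing to report.
    return {
        "level": "Error",
        "message": (
            f"Retry queue holds {queue_length} messages, "
            f"expected at most {expected_max}"
        ),
        "notifyAdministrator": True,
    }
```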
The next component is the Azure function app, which contains several Azure functions with validation, message processing, and the logic to trigger the runbook.
Validation Function is linked with the main event queue. When the event grid sends a message to the main queue, the function is automatically triggered.
Retry Function is based on a timer trigger and runs periodically to check for failed messages that are waiting to be retried (in the retry event queue).
API Function (HTTP trigger) is intended to trigger (re-run) the whole process from a runbook, the Admin UI, etc.
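As an illustration of what one timer tick of the Retry Function might do, here is a minimal Python sketch. It is not the project's implementation: `RetryMessage`, `run_retry_pass`, and the `MAX_ATTEMPTS` limit are assumptions, and the article does not say how many retries are allowed or what happens when they run out.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List

MAX_ATTEMPTS = 3  # Assumed limit; the article does not specify one.


@dataclass
class RetryMessage:
    vm_id: str
    attempts: int = 0


def run_retry_pass(
    retry_queue: Deque[RetryMessage],
    succeeded_queue: Deque[RetryMessage],
    validate: Callable[[RetryMessage], bool],
    max_attempts: int = MAX_ATTEMPTS,
) -> List[RetryMessage]:
    """Re-validate each waiting message once and return those that gave up."""
    exhausted: List[RetryMessage] = []
    # Iterate over a fixed count so re-queued messages wait for the next tick.
    for _ in range(len(retry_queue)):
        message = retry_queue.popleft()
        message.attempts += 1
        if validate(message):
            succeeded_queue.append(message)
        elif message.attempts >= max_attempts:
            exhausted.append(message)  # Candidate for an administrator alert.
        else:
            retry_queue.append(message)
    return exhausted
```

Capping the attempts keeps a permanently broken machine from circulating in the retry queue forever; the returned list is what would be reported to the administrator.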
The whole functions workflow is described below.
Architecture Code Base
The function app is written in PowerShell (using PowerShell classes) and follows the principles of OOP. This approach allows us to build modular, maintainable code and add or remove functions at any time.
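The modular, class-based structure can be sketched as follows. The project itself uses PowerShell classes; this Python version only illustrates the idea, and every name in it (`Validator`, `LogAgentValidator`, `DatabaseConnectionValidator`, `run_validators`, and the dictionary keys) is hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class Validator(ABC):
    """Common interface so checks can be added or removed independently."""
    name = "validator"

    @abstractmethod
    def check(self, vm: Dict[str, object]) -> bool:
        """Return True when the VM satisfies this requirement."""


class LogAgentValidator(Validator):
    name = "log-agent"

    def check(self, vm: Dict[str, object]) -> bool:
        return bool(vm.get("log_agent_installed"))


class DatabaseConnectionValidator(Validator):
    name = "database"

    def check(self, vm: Dict[str, object]) -> bool:
        return bool(vm.get("database_reachable"))


def run_validators(vm: Dict[str, object], validators: List[Validator]) -> List[str]:
    """Return the names of the checks that failed for this VM."""
    return [v.name for v in validators if not v.check(vm)]
```

Adding a new check then means writing one more subclass and appending it to the list, without touching the existing ones.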
In this article, I described how to build an event-driven architecture to manage virtual machines and their related utilities and components.
You can reuse the presented solution in the following scenarios:
- Key Vaults and SSL certificate management (check certificate expiration time, log and inform, update certificate automatically)
- Create custom logic to build cloud expense reports
- Cloud resource backup, availability checks, and logging (using Log Analytics or other tools)
- Resource cleanup management
- Container management solution
- Here you can find Source Code
- And here the complete tutorial article.
This topic is quite useful and popular in the world of cloud and software architecture, and these approaches solve a lot of business problems. Taking this into account, I’ve decided to create a practical, interactive course. It is based on my experience building event-driven architecture for enterprise customers.
You can play with code, test architecture, and run pipelines directly in the course! Below you can find a link to enroll in the course.