When large and highly regulated organizations like communications service providers shift workloads to the cloud, stakeholders are understandably concerned about how the new cloud environment will support the strict requirements of security, governance, risk and regulatory compliance.
This blog describes a field-proven Compliance Suite developed by Sourced, an Amdocs company, which ultimately provides Cloud Security, Operations, and Development teams, as well as Application leads, with a bespoke single pane of glass to monitor and assess the compliance posture of their entire cloud environment. The solution offers a bird’s-eye view of the organization’s compliance posture and supports exemptions, along with event-driven notifications to operations and infrastructure teams that trigger remediation efforts. In this way, CSPs meet requirements and mitigate risks using tools available from hyperscalers.
Why are governance and compliance rules important?
The focus of nearly every organization as they move to the cloud is to improve their experience and agility, and to increase the velocity of application release cycles so that they may be more adaptive and responsive to customer and internal business needs. Intuitively, this means that the application landscape will change at an increased pace, leading to an increased risk of sub-optimal configuration or misconfiguration – which in turn may result in additional potential vulnerabilities or open vectors of attack.
Ideally, if an organization could continually assess, audit, and evaluate cloud resources against company policies and known good configurations, and detect any unexpected configuration change to the cloud footprint, it could mitigate much of the risk of undetected attack vectors whilst maintaining a high level of application release velocity.
Through working with customers in financial services and other highly regulated industries over the past decade, Sourced has iterated and built upon various patterns and approaches to cloud adoption. We have found that building a robust foundation for compliance does not need to impinge at all on agility for application teams.
This blog walks you through how to build the solution with cloud native applications (referred to as the Compliance Suite) around AWS Config. You can also watch this video from the AWS Public Sector Online Summit.
The Compliance Suite
The Compliance Suite continually monitors the AWS cloud footprint and application workloads. The suite is a collection of cloud native microservices, referred to in this post as engines, each of which addresses one of three main areas:
- Rules Engine: For managing AWS Config rule deployments.
- Compliance Engine: For handling and managing compliance exemptions and acting as a filter for downstream applications.
- Canary Engine: For provisioning rogue data into the system to validate whether the Compliance Suite is functioning as expected.
Rules Engine: Detection
Config rules are essentially AWS Lambda functions that AWS Config uses to evaluate resource configurations. These Lambda functions can be triggered on a schedule or by a resource configuration change event; the latter is the recommended configuration option.
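As a rough illustration of what such a rule can look like, here is a minimal sketch of a change-triggered rule handler in Python. The rule itself (a required `owner` tag), the `requiredTag` parameter and the `evaluate_compliance` helper are assumptions made for this example and are not taken from the Compliance Suite. The event fields follow the payload AWS Config sends to custom rule Lambdas, and the call that reports the result back to AWS Config is left as a comment so the sketch runs without AWS credentials.

```python
import json

# Hypothetical rule: a resource is compliant when it carries a required tag.
DEFAULT_REQUIRED_TAG = "owner"


def evaluate_compliance(configuration_item, rule_parameters):
    """Return the compliance type for one configuration item."""
    # Deleted resources have nothing left to evaluate.
    if configuration_item.get("configurationItemStatus") == "ResourceDeleted":
        return "NOT_APPLICABLE"
    required_tag = rule_parameters.get("requiredTag", DEFAULT_REQUIRED_TAG)
    tags = configuration_item.get("tags") or {}
    return "COMPLIANT" if required_tag in tags else "NON_COMPLIANT"


def lambda_handler(event, context):
    """Entry point invoked by AWS Config on a configuration change."""
    # AWS Config passes both payloads as JSON-encoded strings.
    invoking_event = json.loads(event["invokingEvent"])
    rule_parameters = json.loads(event.get("ruleParameters") or "{}")
    item = invoking_event["configurationItem"]

    evaluation = {
        "ComplianceResourceType": item["resourceType"],
        "ComplianceResourceId": item["resourceId"],
        "ComplianceType": evaluate_compliance(item, rule_parameters),
        "OrderingTimestamp": item["configurationItemCaptureTime"],
    }
    # In a deployed rule the result would be reported back to AWS Config, e.g.:
    #   boto3.client("config").put_evaluations(
    #       Evaluations=[evaluation], ResultToken=event["resultToken"])
    return evaluation
```

Keeping the evaluation logic in a pure function separate from the handler makes each rule easy to unit test before it is deployed anywhere.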
Figure 1: Rules Engine high level
Designed to automate the deployment and management of these config rules across an organization’s landing zone, the Rules Engine can be broken down into three main segments: the deployment specification file, the config rule specification files, and the source code, all of which will eventually be managed as an AWS Lambda function. The engine also scales with the size of cloud adoption within an organization.
Config rules parameters are entered into the config rule specification, while the deployment specification file is a declarative map of which config rules are to be deployed and to which accounts.
During compilation, the engine will dynamically generate AWS CloudFormation templates for the config rule resources, package the source code for each config rule Lambda and generate a list of deployment actions from the deployment specification file. These items will form the deployment package, to be consumed by the deployment pipeline, henceforth referred to as the pipeline.
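The compilation step can be pictured with a small sketch. Everything specific here is an assumption for illustration: the shape of the two specifications, the rule names, the account IDs, the Lambda naming convention and the `compile_package` function. The `AWS::Config::ConfigRule` properties are standard CloudFormation ones. Packaging the Lambda source code and granting AWS Config permission to invoke it are left out to keep the example short.

```python
import json

# Hypothetical deployment specification: rule name -> target account IDs.
DEPLOYMENT_SPEC = {
    "required-tags": ["111111111111", "222222222222"],
    "s3-bucket-encryption": ["111111111111"],
}

# Hypothetical config rule specifications: scope and parameters per rule.
RULE_SPECS = {
    "required-tags": {
        "resource_types": ["AWS::EC2::Instance"],
        "parameters": {"requiredTag": "owner"},
    },
    "s3-bucket-encryption": {
        "resource_types": ["AWS::S3::Bucket"],
        "parameters": {},
    },
}


def build_template(rule_name, rule_spec, lambda_arn):
    """Generate a CloudFormation template for one custom config rule."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "ConfigRule": {
                "Type": "AWS::Config::ConfigRule",
                "Properties": {
                    "ConfigRuleName": rule_name,
                    "InputParameters": rule_spec["parameters"],
                    "Scope": {"ComplianceResourceTypes": rule_spec["resource_types"]},
                    "Source": {
                        "Owner": "CUSTOM_LAMBDA",
                        "SourceIdentifier": lambda_arn,
                        # Evaluate on every configuration change.
                        "SourceDetails": [{
                            "EventSource": "aws.config",
                            "MessageType": "ConfigurationItemChangeNotification",
                        }],
                    },
                },
            }
        },
    }


def compile_package(deployment_spec, rule_specs, region="us-east-1"):
    """Expand the specifications into an ordered list of deployment actions."""
    actions = []
    for rule_name in sorted(deployment_spec):
        for account_id in deployment_spec[rule_name]:
            # Assumed naming convention for the rule's Lambda in each account.
            lambda_arn = f"arn:aws:lambda:{region}:{account_id}:function:{rule_name}"
            template = build_template(rule_name, rule_specs[rule_name], lambda_arn)
            actions.append({
                "account_id": account_id,
                "stack_name": f"config-rule-{rule_name}",
                "template_body": json.dumps(template),
            })
    return actions
```

Calling `compile_package(DEPLOYMENT_SPEC, RULE_SPECS)` yields one action per rule and account pair, which is the kind of list a pipeline could then work through.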
During deployment, the pipeline reads the deployment actions and translates them into API calls, signaling the AWS CloudFormation service to orchestrate the creation of config rule resources and config rule Lambdas across the specified accounts within the landing zone.
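A minimal sketch of that translation, under stated assumptions, might look like this. The action fields, the `deploy` function and the `client_for_account` factory are hypothetical. In a real pipeline the factory would typically assume a role in the target account and return a boto3 CloudFormation client, whose `create_stack` call accepts the `StackName` and `TemplateBody` arguments used here. A recording stand-in is included so the sketch runs without AWS access, and updates to stacks that already exist are not handled.

```python
class RecordingClient:
    """Stand-in for a CloudFormation client that records calls instead of making them."""

    def __init__(self, account_id):
        self.account_id = account_id
        self.calls = []

    def create_stack(self, **kwargs):
        self.calls.append(kwargs)
        # Mimic the shape of the real response, which contains a StackId.
        stack_id = (
            f"arn:aws:cloudformation:us-east-1:{self.account_id}:"
            f"stack/{kwargs['StackName']}/example"
        )
        return {"StackId": stack_id}


def deploy(actions, client_for_account):
    """Translate deployment actions into CloudFormation API calls."""
    stack_ids = []
    for action in actions:
        # One client per target account (hypothetical factory).
        cloudformation = client_for_account(action["account_id"])
        response = cloudformation.create_stack(
            StackName=action["stack_name"],
            TemplateBody=action["template_body"],
        )
        stack_ids.append(response["StackId"])
    return stack_ids


if __name__ == "__main__":
    example_actions = [
        {"account_id": "111111111111", "stack_name": "config-rule-required-tags",
         "template_body": "{}"},
    ]
    print(deploy(example_actions, RecordingClient))
```

Injecting the client factory keeps the deployment logic testable and leaves the cross-account access mechanism as a separate decision.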
Compliance Engine: Monitoring & notification
Designed specifically to handle and manage compliance exemptions, the Compliance Engine, once certified, routes compliance events from the config rules and presents the multi-faceted compliance posture of an enterprise. Still, three main questions arise with regard to the deployment of the Compliance Engine:
- What happens if an application team needs a justifiable exemption for a failed compliance evaluation? How do we stop alerts going downstream to the Security Operations Centre (SOC)?
- How do we evaluate that config rules are written correctly before allowing results to flow downstream to the SOC?
- How do we present data during audit that demonstrates that our config rules and detective controls are enforced and adhered to?
These areas of concern, though valid, are easily remedied owing to the specific functions of the Compliance Engine, which are:
- Controls administration
- Exemption handling
- Monitoring and alerting
Controls administration is the process of mapping directive controls to config rules, i.e. detective controls. This relationship is maintained in a table within the compliance database, which is a relational database. The table that stores the lookup is called the controls table.
First, when the router is configured to receive consolidated compliance events from the landing zone, two main compliance events are expected for each resource change:
- The Resource Configuration Item Change Event, which represents the change: what has changed on the resource?
- The Compliance Change Notification, which represents the result: what is the compliance evaluation result of the resource configuration change?
Resource Configuration Item Change Events are stored by the router in the resource table. When a Compliance Change Notification is received, the event is cross-referenced against the controls table and, should the config rule that performed the evaluation match an entry there, tagged with the corresponding directive control ID.
The router is designed to publish only those compliance evaluations that have been tagged with an associated directive control ID. This creates a live release process for downstream alerting, whereby config rules can be evaluated before their results are published to the wider organization.
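The tagging and filtering behaviour can be sketched in a few lines. The event fields, the control IDs and the in-memory `CONTROLS_TABLE` dictionary are assumptions standing in for the real controls table. Storing every event while publishing only the tagged ones is also an assumption of this sketch.

```python
# Hypothetical controls table: config rule name -> directive control ID.
CONTROLS_TABLE = {
    "s3-bucket-encryption": "DC-001",
    "required-tags": "DC-002",
}


def tag_event(event, controls):
    """Attach the directive control ID mapped to the evaluating rule, if any."""
    control_id = controls.get(event["config_rule_name"])
    return {**event, "control_id": control_id}


def route(events, controls):
    """Return (stored, published) lists of compliance events."""
    stored, published = [], []
    for event in events:
        tagged = tag_event(event, controls)
        # Every event is kept for later analysis.
        stored.append(tagged)
        # Only events from rules mapped to a control flow downstream.
        if tagged["control_id"] is not None:
            published.append(tagged)
    return stored, published
```

Under this scheme a newly written rule can run against real resources while unmapped, and its results only reach downstream teams once an entry for it is added to the controls table.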
Compliance exemption handling is arguably the primary value-added feature of the Compliance Engine, as exemptions are time-bound and can be scoped to any combination of metadata associated with the resource.
Figure 2: Compliance Engine high level
Figure 3: Compliance Engine logic flow
Here is an example to better illustrate the versatility of the exemption system. Suppose an application owner requires an exemption for an S3 bucket (a container for objects in Amazon S3, the AWS object storage service) that has become non-compliant under a new config rule. In this instance, instead of deactivating the check entirely, they could file for a temporary exemption for that one resource, identified by its bucket ID. In the same vein, portfolio owners could apply an exemption for all S3 buckets in their production account by filing an exemption using a combination of the account ID and resource type.
Exemptions are filed and stored in the exemptions table of the compliance database. Prior to publishing and storing compliance events, the router first extracts all related resource metadata, including that held in the resource table, and checks it for matching exemptions. Should there be a valid exemption coupled with a ‘non-compliant’ evaluation result, the resource will be deemed compliant by exemption and the event stored as such.
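To make the matching logic concrete, here is a minimal sketch assuming an exemption is a record with a validity window and a scope of metadata key/value pairs. The field names, the `find_exemption` and `apply_exemptions` functions, and the choice to record the exemption ID on the event are all illustrative assumptions rather than the engine's actual schema.

```python
from datetime import datetime, timezone


def find_exemption(resource, exemptions, now):
    """Return the first exemption that is in force and matches the resource."""
    for exemption in exemptions:
        # Exemptions are time-bound: skip those not yet started or expired.
        if not (exemption["starts"] <= now < exemption["expires"]):
            continue
        scope = exemption["scope"]
        # An empty scope would match everything, so treat it as invalid.
        if not scope:
            continue
        # Every scoped metadata field must match the resource exactly.
        if all(resource.get(key) == value for key, value in scope.items()):
            return exemption
    return None


def apply_exemptions(event, exemptions, now):
    """Mark a non-compliant event as compliant when a valid exemption applies."""
    if event["compliance_type"] != "NON_COMPLIANT":
        return event
    exemption = find_exemption(event["resource"], exemptions, now)
    if exemption is None:
        return event
    return {**event, "compliance_type": "COMPLIANT", "exemption_id": exemption["id"]}


if __name__ == "__main__":
    utc = timezone.utc
    exemptions = [{
        "id": "EX-1",
        # All S3 buckets in one account, as in the portfolio owner example.
        "scope": {"account_id": "111111111111", "resource_type": "AWS::S3::Bucket"},
        "starts": datetime(2021, 1, 1, tzinfo=utc),
        "expires": datetime(2021, 7, 1, tzinfo=utc),
    }]
    event = {
        "resource": {"account_id": "111111111111",
                     "resource_type": "AWS::S3::Bucket",
                     "resource_id": "bucket-a"},
        "compliance_type": "NON_COMPLIANT",
    }
    print(apply_exemptions(event, exemptions, datetime(2021, 6, 1, tzinfo=utc)))
```

Scoping by an arbitrary set of metadata fields is what lets the same mechanism cover both a single bucket and every bucket in an account.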
There are advantages to storing resource changes and compliance events, chiefly that organizations can construct a bird’s-eye view of their compliance posture simply by tapping into the compliance database. The following mock-ups are examples of how processed compliance data sets can be extracted and rearranged to form unique dashboards.
Figure 4: Compliance Views by Application
This first diagram represents compliance by application, with this view enabling application owners to react and remediate non-compliant deployments.
Figure 5: Compliance Views by Control
This second diagram represents an alternative view of the same dataset mapping compliance by controls, useful for security and audit teams to assess the adherence to the implementation of the control guardrails of the entire organization.
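Both views can be derived from the same stored events with a simple grouping query. The sketch below uses an in-memory SQLite database purely as a stand-in for the compliance database; the `compliance_events` table, its columns and the sample rows are hypothetical.

```python
import sqlite3

# Columns a view may be grouped by (guards the dynamically built query).
ALLOWED_GROUPINGS = ("application", "control_id")


def compliance_view(conn, group_by):
    """Count compliance results per application or per control."""
    if group_by not in ALLOWED_GROUPINGS:
        raise ValueError(f"unsupported grouping: {group_by}")
    query = (
        f"SELECT {group_by}, compliance_type, COUNT(*) "
        "FROM compliance_events "
        f"GROUP BY {group_by}, compliance_type "
        f"ORDER BY {group_by}, compliance_type"
    )
    return conn.execute(query).fetchall()


def demo_database():
    """Build a small in-memory data set to run the views against."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE compliance_events ("
        "application TEXT, control_id TEXT, resource_id TEXT, compliance_type TEXT)"
    )
    conn.executemany(
        "INSERT INTO compliance_events VALUES (?, ?, ?, ?)",
        [
            ("payments", "DC-001", "bucket-a", "COMPLIANT"),
            ("payments", "DC-002", "instance-a", "NON_COMPLIANT"),
            ("billing", "DC-001", "bucket-b", "NON_COMPLIANT"),
            ("billing", "DC-001", "bucket-c", "NON_COMPLIANT"),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = demo_database()
    print(compliance_view(conn, "application"))
    print(compliance_view(conn, "control_id"))
```

The application view and the control view differ only in the grouping column, which is why one data set can serve both application owners and audit teams.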
Canary Engine: Chaos engineering
Perhaps the most interesting enabler of the exemption mechanism is the Canary Engine.
Just as Netflix sought to align their teams around the notion of infrastructure resilience by inducing failure, the Canary Engine is designed to ensure that the code of every moving piece within the compliance ecosystem is kept honest, through the introduction of rogue misconfigured resources into the system. The engine then assesses whether the resulting non-compliant events are handled as expected.
Running on a predetermined schedule, the Canary Engine applies for an exemption within the Compliance Engine for the deployment of Canary Resources. These are a composition of misconfigured resources represented by AWS CloudFormation templates and deployed to target accounts according to a test specification file, which determines the canary coverage within each account.
Once the canaries are deployed, the config rules are expected to be automatically invoked, with the resultant compliance events routed to the Compliance Engine. The Canary Engine subscribes to the Compliance Engine’s downstream events to observe the outcome. A canary run is deemed to have passed if the compliance event for its resources is evaluated as compliant with exemption.
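The pass condition can be expressed as a short check. This is one interpretation of the description above, with hypothetical event fields and a hypothetical `canary_run_passed` function. A canary is deliberately misconfigured, so a missing event suggests the rule never fired, a plain compliant result suggests the rule failed to detect the misconfiguration, and a non-compliant result suggests exemption handling broke.

```python
def canary_run_passed(canary_resource_ids, downstream_events):
    """Return True only if every canary ended up compliant with exemption."""
    # Keep the latest event seen for each resource.
    latest = {event["resource_id"]: event for event in downstream_events}
    for resource_id in canary_resource_ids:
        event = latest.get(resource_id)
        if event is None:
            # No event at all: the config rule was never invoked.
            return False
        if event["compliance_type"] != "COMPLIANT":
            # Still non-compliant: the exemption was not applied.
            return False
        if not event.get("exemption_id"):
            # Compliant without an exemption: the rule missed the misconfiguration.
            return False
    return True
```

A single check of this kind exercises the rule deployment, the event routing and the exemption handling in one pass, which is what makes the canary a useful end-to-end test.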
As a final step, prior to the expiry of the exemption, the engine will tear down the canary resources and restore the engine to its normal operating standards, ready for the next cycle of compliance measures.
Figure 6: Canary Engine high level
The Compliance Suite trio of serverless applications has already been deployed and proven in production at a large highly regulated enterprise.
Rolling out the Compliance Suite delivers considerable business value, primarily by enabling the client’s cloud program to launch successfully within the compliance mandates of the organization and the industry it operates in, leading to long-term cloud adoption and workload migration.
The Compliance Suite furnishes the client’s cloud platform with the necessary detective controls (Rules Engine), ensures their accuracy (Canary Engine), and manages coverage while allowing exemptions to rules via exemption management (Compliance Engine). This ultimately provides Cloud Security, Operations, and Development teams, as well as Application leads, with a bespoke single pane of glass to assess the compliance posture of their entire cloud environment.
This blog was initially published on sourcedgroup.com
Aaron is a Senior Consultant with Sourced with over 15 years of experience in the IT industry, designing cloud architecture and migrating workloads across multiple public cloud providers. He specializes in providing learning and enablement programs to help our clients in their adoption of cloud technology.
Somnath (Som) is a solution architect and engineer in the cloud space, predominantly focused on AWS and venturing out into GCP. He has spent 15 years in the industry and has held multiple technical roles in software engineering and DevOps. Somnath is equally comfortable in consulting and software product delivery engagements.