5 Pillars of Well-Architected Framework

Creating a software system is a lot like constructing a building. If the foundation is not solid, structural problems can undermine the integrity and function of the building. In this article, we're going to talk about the design principles we can follow to build future-proof, large-scale software. The concepts come from the AWS Well-Architected Framework whitepaper, which describes architectural best practices for designing and operating reliable, secure, efficient, and cost-effective systems in the cloud. It provides a way to consistently measure your architectures against best practices and identify areas for improvement. I'll try to summarize the whitepaper here.

So let's first quickly sum up the Guiding Design Principles:

Stop guessing capacity needs: Scale up & down as required
Automate everything: Automated systems ensure consistency & reliability
Test at scale: Test an accurate replica of production on-demand
Adapt & Evolve: Adapt the architecture as needed to meet new challenges
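To make "stop guessing capacity needs" a bit more concrete, here is a minimal sketch of a proportional scaling rule. The function name, the utilization metric, and the min/max bounds are assumptions made for illustration; this is not how any particular AWS service computes capacity.

```python
import math


def desired_capacity(current_instances: int,
                     observed_utilization: float,
                     target_utilization: float,
                     min_instances: int = 1,
                     max_instances: int = 20) -> int:
    """Return how many instances would bring utilization back to the target.

    Proportional rule: if 4 instances run at 90% and the target is 60%,
    the fleet needs ceil(4 * 90 / 60) = 6 instances.
    """
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")
    needed = math.ceil(current_instances * observed_utilization / target_utilization)
    # Clamp so a bad metric can neither empty the fleet nor grow it without bound.
    return max(min_instances, min(max_instances, needed))


if __name__ == "__main__":
    print(desired_capacity(4, 90.0, 60.0))  # busy fleet: scale out
    print(desired_capacity(4, 15.0, 60.0))  # idle fleet: scale in
```

The same rule handles both directions, so capacity follows measured demand instead of an up-front guess.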

The framework is based on 5 pillars:

1. Operational Excellence
2. Cost Optimization
3. Reliability
4. Performance Efficiency
5. Security

Operational Excellence

The main emphasis of this pillar is: Does your architecture work? Will it continue to? Let's look at this pillar's specific principles:

All operations are code
Documentation is updated automatically
Make smaller changes you can roll back
Iterate...a lot
Expect things to go sideways
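"Make smaller changes you can roll back" can be sketched in a few lines. The helper below is hypothetical: `apply_change`, `health_check`, and `roll_back` are placeholders for whatever your deployment tooling actually provides.

```python
from typing import Callable


def deploy_with_rollback(apply_change: Callable[[], None],
                         health_check: Callable[[], bool],
                         roll_back: Callable[[], None]) -> bool:
    """Apply one small change and undo it if the health check fails.

    Returns True when the change is kept, False when it was rolled back.
    """
    try:
        apply_change()
        healthy = health_check()
    except Exception:
        # A crash while applying or checking counts as an unhealthy deploy.
        healthy = False
    if not healthy:
        roll_back()
    return healthy


if __name__ == "__main__":
    config = {"version": 1}
    kept = deploy_with_rollback(
        apply_change=lambda: config.update(version=2),
        health_check=lambda: False,  # simulate a failing check
        roll_back=lambda: config.update(version=1),
    )
    print(kept, config)
```

The smaller the change, the simpler `roll_back` stays, which is the point of the principle.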

Cost Optimization

Emphasis: Spend only what you have to

Pillar specific principles:

Consumption based pricing
Measure efficiency constantly

Reliability

Emphasis: Will this system work consistently & recover quickly?

Pillar specific principles:

Recover from issues automatically
Scale horizontally first for resilience
Reduce idle resources
Manage change through automation
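One common way to "recover from issues automatically" is to retry transient failures with exponential backoff and jitter. This is a minimal sketch; the function name and default values are assumptions, and a real system would usually retry only errors it knows to be transient.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(operation: Callable[[], T],
                      max_attempts: int = 5,
                      base_delay: float = 0.5,
                      max_delay: float = 8.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation, retrying failed attempts with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # The delay cap doubles each attempt: base, 2*base, 4*base, up to max_delay.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: wait a random time up to the cap so clients do not retry in lockstep.
            sleep(random.uniform(0, cap))
```

Passing `sleep` as a parameter is a design choice that keeps the helper easy to test without real waiting.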

Performance Efficiency

Emphasis: Remove bottlenecks, reduce waste

Pillar specific principles:

Reduce latency
Use serverless architectures

Security

Emphasis: Does this system work only as intended?

Pillar specific principles:

Automate security tasks
Encrypt data in transit and at rest
Know who did what when
Identities have the least privileges required
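The last two principles can be illustrated together with a tiny in-memory model: access is denied unless it was granted explicitly, and every decision is recorded. The class, the identity `report-job`, and the resource `sales-data` are all hypothetical; a real system would rely on the cloud provider's identity and audit services instead.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, FrozenSet, List, Tuple


@dataclass
class AccessController:
    """Deny-by-default permission check that records every decision."""
    # Maps an identity to the exact (action, resource) pairs it may perform.
    grants: Dict[str, FrozenSet[Tuple[str, str]]]
    audit_log: List[dict] = field(default_factory=list)

    def is_allowed(self, identity: str, action: str, resource: str) -> bool:
        allowed = (action, resource) in self.grants.get(identity, frozenset())
        # Record who did what, when, and whether it was permitted.
        self.audit_log.append({
            "who": identity,
            "what": f"{action} {resource}",
            "when": datetime.now(timezone.utc).isoformat(),
            "allowed": allowed,
        })
        return allowed


if __name__ == "__main__":
    controller = AccessController(grants={
        "report-job": frozenset({("read", "sales-data")}),
    })
    print(controller.is_allowed("report-job", "read", "sales-data"))    # granted explicitly
    print(controller.is_allowed("report-job", "delete", "sales-data"))  # never granted
    print(len(controller.audit_log))                                    # both attempts logged
```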

Operational Excellence In Depth

Operational excellence is the ability to run systems and gain insights into their operations in order to deliver business value, and to continuously improve supporting processes and procedures.

The 3 Phases of Operational Excellence

Prepare-Prioritize: Align operational priorities with business priorities

What is the business goal?
What are the critical pieces needed to meet that goal?
Are there any compliance restrictions or requirements?
What are the dependencies between services?

Design your architecture to support business priorities

Is the design observable?
Are your logs & observations actionable?

Is your workload ready to go live?

Are your processes consistent?
Is operational code properly managed?
Are tests in place?
Is failure anticipated?
Have you verified that your workload is actually working?

Shit happens. Be ready.

Anticipate planned & unplanned events
Respond in code
Connect observations with 3rd party tools as needed

Evolve

Learn from success & failure
Post-event, have runbooks changed?
Test assumptions
Experiment early and often to find better solutions

Cost Optimization

Use the appropriate resources & configurations
Provision to current needs with an eye to the future
Right-size to the smallest resource that meets your needs
Use data to choose purchase options
Optimize by geography
Optimize data transfer
Know how much you're spending and where
Continuously work to maximize value delivered
Align utilization with requirements
Report and validate findings
Evaluate new services for value

Awareness of spend is key to maximizing value
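Right-sizing from the list above can be sketched as "pick the cheapest option that still covers the measured need plus some headroom". The catalog entries, names, and prices in the usage below are made up for illustration and do not correspond to real instance types.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    memory_gib: float
    hourly_cost: float  # made-up price, for illustration only


def right_size(catalog: List[InstanceType],
               needed_vcpus: float,
               needed_memory_gib: float,
               headroom: float = 0.2) -> Optional[InstanceType]:
    """Return the cheapest instance type that covers the need plus headroom."""
    cpu_target = needed_vcpus * (1 + headroom)
    mem_target = needed_memory_gib * (1 + headroom)
    candidates = [t for t in catalog
                  if t.vcpus >= cpu_target and t.memory_gib >= mem_target]
    # None means nothing in the catalog is big enough for this workload.
    return min(candidates, key=lambda t: t.hourly_cost, default=None)


if __name__ == "__main__":
    catalog = [
        InstanceType("small", 2, 4.0, 0.05),
        InstanceType("medium", 4, 8.0, 0.10),
        InstanceType("large", 8, 16.0, 0.20),
    ]
    choice = right_size(catalog, needed_vcpus=3, needed_memory_gib=6.0)
    print(choice.name if choice else "nothing fits")
```

The `headroom` parameter is one way to provision for current needs while keeping an eye on the future; feeding it measured usage rather than estimates is what keeps the choice honest.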

Reliability

Reliability is the ability of a system to recover from infrastructure or service disruptions, dynamically acquire computing resources to meet demand, and mitigate disruptions.

Scale horizontally first for resilience
Reduce idle resources
Manage change through automation

Limits: Understand default & requested resource limits
Networking: Understand topology, bandwidth & latency
Availability: Ensure your application is ready for business use

Ensure your application is ready for business use

Can users access your application?
Can you deploy without issue?
Can you push issues to planned downtime?
Can your application withstand partial outages?
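Some quick arithmetic helps when answering the availability questions above. The sketch below assumes components fail independently, which real systems only approximate, and the function names are my own.

```python
from math import prod
from typing import Iterable


def serial_availability(components: Iterable[float]) -> float:
    """Every component must be up, so the availabilities multiply."""
    return prod(components)


def redundant_availability(replica_availability: float, replicas: int) -> float:
    """At least one of N independent replicas must be up."""
    return 1 - (1 - replica_availability) ** replicas


def yearly_downtime_hours(availability: float) -> float:
    """Expected downtime per (365-day) year for a given availability."""
    return (1 - availability) * 365 * 24


if __name__ == "__main__":
    # Two 99.9% services in a chain are less available than either one alone.
    print(serial_availability([0.999, 0.999]))
    # Two 99% replicas side by side are more available than either one alone.
    print(redundant_availability(0.99, 2))
    print(yearly_downtime_hours(0.999))
```

This is why hard dependencies chained in series lower availability, while redundant replicas help a system withstand partial outages.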

Performance Efficiency

Selection:

Is this the optimal solution for this workload?
What type of compute best suits it?
Which data store is ideal for this workload?
Does your network design complement compute & data store choices?

Review:

Continuously ensure choices work for your workload
Is infrastructure stored as code?
Are deployments simple & automated?
Can benchmarks be taken automatically?

Monitoring:

Use active & passive monitoring where appropriate
Understand the five phases of monitoring (Generation, Aggregation, Real-time Processing, Storage, Analysis)
Create actionable metrics
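The five monitoring phases can be walked through with a toy latency pipeline. Everything here is an assumption for illustration: the sample data, the 60-second window, and the 250 ms threshold. A real setup would use a monitoring service rather than in-memory lists.

```python
from statistics import mean
from typing import Dict, List, Tuple

Sample = Tuple[int, float]  # (timestamp in seconds, latency in ms)


def aggregate(samples: List[Sample], window_seconds: int = 60) -> Dict[int, float]:
    """Aggregation: average the samples that fall into each time window."""
    buckets: Dict[int, List[float]] = {}
    for timestamp, value in samples:
        window_start = timestamp - timestamp % window_seconds
        buckets.setdefault(window_start, []).append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


def alarms(windows: Dict[int, float], threshold_ms: float) -> List[int]:
    """Real-time processing: flag windows whose average breaches the threshold."""
    return [start for start, avg in windows.items() if avg > threshold_ms]


if __name__ == "__main__":
    # Generation: in a real system these samples come from the workload itself.
    samples = [(0, 100.0), (30, 140.0), (60, 400.0), (90, 600.0), (120, 110.0)]
    windows = aggregate(samples)
    history: List[Dict[int, float]] = [windows]   # Storage: keep results for later
    print(alarms(windows, threshold_ms=250.0))    # which windows need attention
    print(max(windows, key=windows.get))          # Analysis: the worst window overall
```

A metric becomes actionable once it is tied to a threshold and a time window, so someone knows what to look at and when.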

Trade-offs: You can't have it all
