Amazon Web Services Tutorial (2019)
I will build upon the foundation of knowledge that you’ve obtained from the past two hosting scenarios and explain concepts and functionality of AWS services and products that will enable you to deliver a website and services to your customers using the power and flexibility of the AWS platform.
In addition to the discussion around the e-commerce hosting scenario, you’ll also start looking at the features of the AWS platform that will appeal to those hosting enterprise architecture in data centers who are looking to extend their service catalog or infrastructure into the cloud.
Before I describe the architecture of the next hosting scenario, let’s touch base on the concepts and AWS services that will be covered in this section of the blog.
AWS Services Introduced/Used
AWS S3 Amazon Web Services Simple Storage Service will be revisited a third time as your main storage location, although this time you’ll also start managing a storage lifecycle for your S3 assets. You’ll also use the AWS Glacier Storage Solution as a long-term archiving solution.
AWS EC2 Amazon Web Services EC2 (Elastic Compute Cloud) Service will be used in this scenario. Your e-commerce website will be fully redundant and will support Auto Scaling to adapt your infrastructure based on your website traffic needs.
This means that you’ll have multiple EC2 virtual server instances running your web server application and available to serve your content.
You’ll learn AWS networking concepts including VPC (virtual private cloud), Elastic IPs and their use, and more, including a deeper look at web traffic load balancing using Elastic Load Balancers and Application Load Balancers.
AWS CloudFront You’ll use Amazon Web Services CloudFront, the CDN (content delivery network) solution for efficiently delivering your website content to customers across the globe in a low-latency, geographically optimized way.
AWS RDS Amazon Web Services Relational Database Services, the fully managed service that enables you to leverage AWS infrastructure to host your database services and workloads, will be used again to host the databases needed for your e-commerce website.
You will use third-party applications that need database resources to hold information about the products that you’re selling as well as your customer database.
AWS Workflow Services You’ll spend some time learning about the ability to extend your efficiency and interaction with your new infrastructure by using AWS Workflow Services including SNS (Simple Notification Service), SQS (Simple Queue Service), and code deployment options including Elastic Beanstalk, OpsWorks, and CodeDeploy.
AWS Security Services You’ll learn about the various services and products that are at your disposal to help with infrastructure security, reporting, and compliance. Topics will include implementing SSL certificates on your website, as well as introductions to AWS Inspector, AWS CloudTrail, and AWS Trusted Advisor.
AWS Enterprise Solutions Lastly, you’ll learn about AWS platform services, which enable you to extend your business into the cloud and deliver desktop workstations, enterprise mail, directory services, and document management to your employees. You’ll also see how these services can be used to fill short-term needs for trainers or instructors.
Overview of Services
In addition to the information covered above, you’ll explore the current service list offered by AWS at the time of writing this blog and you’ll learn about the release rate of new products and services from this cloud vendor.
Enterprise Website Scenario
I will be describing the resources required to deliver a fault tolerant, highly available solution that leverages AWS services to scale up and down based on website traffic requirements.
The solutions and website described in this section of the blog will be based around a hypothetical business that has implemented an e-commerce website that sells products and services online.
This business also uses its website as a way to drive new business, manage and measure customer engagement, and interact with customers across social media platforms.
The focus in this hosting scenario is to deliver a great experience for website visitors and to ensure that transactions that occur are secure and that you protect your customer data and privacy.
I’ll also discuss the need for this new business to launch a new product online and how you can manage the uncertainty of being able to accurately forecast your customer interest with an eye to understanding how AWS can help in this area.
Let’s start diving into this scenario as we have in previous ones, by describing the content, the assets or resources needed, and, in this case, an architecture design diagram that will help you understand the scenario further.
Website Content Overview
This hosting scenario will be focused on a hypothetical enterprise-level website that has e-commerce capabilities plus the requirement to support a small staff operating in a physical location. In your scenario, the website will have the following resources and requirements:
Home The home page will be the landing page of the website. This is the place where the majority of the advertising and “call to action” activity will occur for this website. It is also the front door to the e-commerce sections of the website content.
Products This section will have the sales information on the products that you are selling from your online store. As you’ll see from the folder structure breakdown later in this blog, there will be a few different product types. In this example, the business is selling various artworks on the site.
Promotions This area of the site will hold information pertaining to monthly promotions and discounts that the business will offer to its customers.
Static Content This section of the website will hold your static assets, such as images and documents.
Store This e-commerce website scenario will most definitely have a shopping cart/online merchant component. Some may choose to implement a simple way to process payments and orders from a site such as PayPal, and others may choose to use a more complex online store solution like Magento. In your scenario, you will assume and plan for the latter.
The Magento application will be installed in the Store directory within the site structure, and you’ll need to plan for not only database resources for your website but also for this component, which will hold your product inventory and all the details about it.
Contact The traditional contact page will also have a feedback form, your social media presence information, and physical store location information.
Website Architecture Design
This section covers the architecture design for the website scenario. This design will deliver a highly available, fault-tolerant website hosted in AWS and leveraging the built-in redundancy of having multiple availability zones (AZ) within a given AWS region.
When deploying resources such as EC2 instances or RDS database instances in AWS, you can select whether you’d like them to be placed in a specific AZ.
In most cases, when you’re standing up resources for testing or development, fault tolerance is not an issue at the top of your mind, so you usually just deploy a single resource to a single AZ within a single region.
However, production-level websites and applications require fault tolerance, so for this website hosting scenario, you will deploy multiple resources in multiple availability zones.
You will front all of your web traffic using AWS Route53, which you’ll remember is a highly available DNS service with a 100% SLA. From here, you’ll have Route53 route your web traffic to a pair of redundant EC2 Elastic Load Balancers.
The load balancers will pass the traffic on to your EC2 instances, which will be part of an Auto Scaling group configuration so that you can configure a minimum and maximum setting for how many web servers you want and allow AWS to handle scaling up the infrastructure for you on-demand as your resources become more utilized from web traffic or processing requests.
Each EC2 Auto Scaling group will be inside an availability zone with an RDS database instance for that availability zone. You’ll have the same infrastructure configured in a second availability zone for an additional level of fault tolerance.
Think of availability zones as isolated data centers within a region. This setup allows you to have EC2 and RDS resources in two locations so that if the infrastructure in one of them fails, you can route traffic to the second.
Elastic Load Balancing
In the infrastructure of this scenario, you’ve deployed two Elastic Load Balancers. Elastic Load Balancers are endpoint devices that handle web traffic and can balance that traffic between multiple EC2 virtual server instances.
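To make the balancing idea concrete, here is a minimal round-robin sketch in Python. This is a conceptual illustration only, not how Elastic Load Balancers are actually implemented, and the instance IDs are placeholders:

```python
from itertools import cycle

class RoundRobinBalancer:
    """Toy model of what a load balancer does: distribute incoming
    requests across a pool of backend instances in rotation."""

    def __init__(self, instances):
        self.pool = cycle(instances)

    def route(self):
        # Pick the next instance in rotation for this request.
        return next(self.pool)

lb = RoundRobinBalancer(["i-0aaa (us-west-2a)", "i-0bbb (us-west-2c)"])
# Four successive requests alternate between the two instances.
print([lb.route() for _ in range(4)])
```

Real ELBs also track instance health and can weight traffic, but the core idea is the same: the client talks to one stable endpoint and the balancer fans the traffic out.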
In 2016, AWS introduced a new type of load balancer with enhanced features and benefits called an Application Load Balancer.
At this point, AWS started referring to the Elastic Load Balancer offering as a “Classic Load Balancer.” A great comparison of the features of each load balancing type can be found at https://docs.aws.amazon.com/elasticloadbalancing/. One of the main benefits of the Application Load Balancer is the support for multiple ports.
For your example, you are going to deploy two Classic Load Balancers as a way to offer redundancy and you’ll place your EC2 resources in auto-scaling group configurations in web server pools attached to each load balancer. In this section, you’ll set up your two load balancers in a single region.
To set up your load balancers, log into the AWS Console and browse to the EC2 Dashboard. From here you’ll select the Load Balancers link under the Load Balancing section in the left-hand navigation.
Click the Create Load Balancer button and choose the Classic Load Balancer option. This will launch the wizard to walk you through creating the resource.
On the first screen, you will define the load balancer name. You will choose which network (VPC) you would like to deploy the load balancer within. VPCs are network resources that are scoped at the region level; a single VPC spans the availability zones within its region, and you can have multiple VPCs in each region.
In your example, you will deploy your resources to the same VPC so that they will live in the same public IP address space. Lastly, you’ll configure listener ports on the load balancer.
Since your load balancers will only handle web traffic, you’ll leave the default settings of Port 80 listening on the load balancer and forwarding to Port 80 on the EC2 virtual server instances.
In the next step of the wizard, you can choose to apply a security group configuration to the load balancer. I’ve chosen not to associate a security group with the load balancer.
In the third step of the wizard, you will be shown a warning that your load balancer does not have a “secure” listener, such as one listening for traffic over HTTPS using SSL (Port 443). Since this example serves traffic over HTTP only, you can acknowledge the warning and proceed to the next step in the wizard.
In this step, you configure health check settings for EC2 virtual server instances that will receive traffic from the load balancer. Configuring these settings allows the load balancer to tell whether an instance is in a healthy state and ready to receive traffic.
Configuration changes to the health checks can be made after the creation of the load balancer in the Settings tab, which means that you have the opportunity to adjust them at a later time if fine-tuning is needed.
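The threshold behavior these settings control can be sketched as a small state machine. This is a simplified model of ELB-style health checking; the default threshold values below are assumptions for illustration, not your load balancer’s actual configuration:

```python
class HealthCheck:
    """Sketch of ELB-style health checking: an instance becomes
    healthy after `healthy_threshold` consecutive successful probes
    and unhealthy after `unhealthy_threshold` consecutive failures."""

    def __init__(self, healthy_threshold=2, unhealthy_threshold=2):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.successes = 0
        self.failures = 0
        self.healthy = False

    def record(self, probe_ok):
        if probe_ok:
            self.successes += 1
            self.failures = 0
            if self.successes >= self.healthy_threshold:
                self.healthy = True
        else:
            self.failures += 1
            self.successes = 0
            if self.failures >= self.unhealthy_threshold:
                self.healthy = False
        return self.healthy

hc = HealthCheck()
hc.record(True)
print(hc.record(True))   # True: two consecutive successes
hc.record(False)
print(hc.record(False))  # False: two consecutive failures
```

Consecutive-probe thresholds are why fine-tuning matters: thresholds that are too aggressive flap instances in and out of service, while thresholds that are too lax route traffic to dead backends.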
In the next step of the Load Balancer Creation Wizard, you will add the EC2 virtual server instances across which you want this load balancer to balance traffic. So if AWS had a failure of services in the “us-west-2a” availability zone, the EC2 server in the “us-west-2c” availability zone would still be available to serve traffic.
By checking the “Enable cross-zone load balancing” checkbox, this load balancer will be able to send traffic between these two availability zones.
In this example, the web traffic will be load balanced between only two instances, but you can set up as many as you’d like to have the traffic balanced across. You could also set up load balancers in multiple regions, rather than just in multiple availability zones, if you want to add an additional layer of fault tolerance.
Setting up the infrastructure in this manner will protect against a regional failure rather than just an availability zone failure. However, for this example, let’s keep it simple and have two load balancers set up in the same region, which will load balance across instances in multiple availability zones.
In the final step of the wizard, you can add any tags that you’d like to create for the load balancer resource. Similar to tagging other resources, it makes sense to add any tagging information that would make it easier from a reporting perspective.
Tagging resources as internal or external, or production or development, can help as the deployed infrastructure grows in size.
After you’ve added any tags that you want, click the “Review and Create” button to review your load balancer configuration. After reviewing the information for accuracy, click the Create button to create your new load balancing resource.
It’s worth calling out that while there are other ways to load balance web traffic between instances, AWS load balancers are easy to set up and an effective way to manage and balance the traffic.
They offer additional benefits such as supporting multiple protocols and SSL. They do come at a significant cost, however. Each load balancer deployed will cost approximately $15 per month, so be sure that you’re prepared for the cost that this level of redundancy will bring.
Now that you know how to place a load balancer in front of multiple EC2 virtual server instances to balance web traffic between them, let’s look at how you can use another feature of AWS to scale the number of EC2 virtual server instances automatically up or down based on how busy the resource is in terms of activity.
Auto Scaling Introduction
AWS Auto Scaling is a feature that allows you to define a set of conditions that will scale up or down your EC2 resources to match the instance requirements for those conditions.
Setting up Auto Scaling is a multiple-step process that has you first create a launch configuration and then define an Auto Scaling group. I’ll describe the process in detail below.
Since Auto Scaling is a feature of AWS EC2, you can access the setup of the groups and launch configuration settings from the EC2 Dashboard in the AWS Console.
The first step is to create a launch configuration. The launch configuration of an Auto Scaling group is very similar to launching a new EC2 virtual server instance.
In the Launch Configuration Wizard, you will choose the AWS AMI to be used, the virtual server instance size, and the configuration details including storage and security groups to be used.
You’ll also name the launch configuration. To start the wizard, click the Launch Configurations link in the left-hand navigation of the EC2 Dashboard and then click the Create Auto Scaling Group button.
This will bring you to the welcome screen of the wizard, which defines the two parts of the process mentioned above and that a launch configuration will first need to be created. Click the Create Launch Configuration button to move to the first step of the Launch Configuration Creation Wizard.
In the first step, you will choose which AMI to use when launching new instances within your Auto Scaling group. For this example, you’ll choose the Amazon Linux AMI that is free tier eligible.
Click the Select button next to the instance to move on to the next step. In the second step, you will choose the instance size of the instances to be deployed in your Auto Scaling group.
For this example, you’ll again stay with the free tier eligible option and choose t2.micro. After selecting the radio button next to the instance size choice, click the “Next Configure Details” button to move on to the third step in the wizard.
The third step asks you for a launch configuration name and has an Advanced Details section, as when launching new instances, where you can put user data configuration and select IP configuration options.
For this example, you can use the UserData.txt source code included in this blog’s resources to simply load the Apache Web Server and PHP application. Once you’ve entered this data, click the “Next Add Storage” button to move on to the next step.
In the fourth step, you define storage configuration options for your new instances. Leave the default setting and click the “Next Configure Security Group” button.
In the fifth step, you can choose to have a new security group created or apply the settings of an existing security group to the instances launched by your Auto Scaling group.
In this example, choose the web server security group that already exists because it has the ports that you want open (web traffic on Port 80 from anywhere and administrative access via Port 22 from only your IP address).
Once you’ve chosen this existing security group, click the Review button to get to the last screen of the Launch Configuration Creation Wizard.
In the final screen, review the information that you’ve entered through the wizard. When satisfied, click the “Create Launch Configuration” button. You will be asked to confirm that you have an EC2 key pair to access the instances that are created before the configuration will be saved.
Now that the Launch Configuration Wizard has been completed, the next part of the process is to walk through the Auto Scaling Group Creation Wizard.
In the first step of the Auto Scaling Group Creation Wizard, you’re asked to define a name for your group and a minimum size for the group. Leaving this option set at “1” means that you’ll always want to have at least one instance in the group.
This is the preferred setting. You also have to choose which subnet you want to use for your Auto Scaling group, or you can create a new one.
In addition, under Advanced Details, you can choose to receive traffic from one or more load balancers. Since you set up load balancers earlier in the blog, you can select this option. The Health Check Grace Period tells your Auto Scaling group how long to wait after launching an instance before it begins checking that instance’s health.
Setting this too low could result in instances being marked unhealthy and replaced before they have finished initializing. The default setting is acceptable for this example. Click the “Next Configure Scaling Policies” button to proceed to the next step.
In step two, you can define whether you want the group to be set to be static at the size you define, or to scale based on definitions that you can configure. Choose the radio button for “Use scaling policies to adjust the capacity of this group.”
In the scaling policies settings, the most important option is the “Scale between” configuration at the top of the screen. This sets the minimum and maximum number of instances that can exist within your Auto Scaling group. For your example, you will limit this to a maximum of three instances.
This means that no matter the performance of the instances in the Auto Scaling group, it will not grow past three instances without manual intervention.
Although this may seem counterintuitive, the cap protects against the group growing indefinitely and running up significant resource costs without your knowledge.
In this part of the wizard, you can set up CloudWatch Alarms that will be used to evaluate whether EC2 instances need to be added or removed from the auto scaling group. Any instances that are added will use the launch configuration that you set up in the previous wizard.
You can define a similar policy to scale down when utilization falls below a defined threshold. To do this, you create a new policy by clicking the “Add new alarm” link.
In that policy’s definition, you set the number of instances to remove in the section of your configuration where you define the action to take when the alarm triggers.
For example, you could do the opposite of the scale-up policy and remove instances from the group when all of the instances have an average CPU utilization of 20% or lower.
Scale up parameters are defined in the “Increase Group Size” section and scale down parameters are defined in the “Decrease Group Size” section. Once you have defined your scaling policies and any alarms that you’d like to create, you can click the “Next Configure Notifications” button.
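The combined effect of these policies can be sketched as a small decision function. The 70% scale-up threshold here is an assumed value for illustration; the 20% scale-down threshold and the one-to-three instance range come from the example above:

```python
def desired_capacity(current, avg_cpu, min_size=1, max_size=3,
                     scale_up_at=70.0, scale_down_at=20.0):
    """Sketch of Auto Scaling policy logic: add an instance when
    average CPU crosses the upper alarm threshold, remove one when
    it drops below the lower threshold, and always clamp the result
    between the group's minimum and maximum size."""
    if avg_cpu >= scale_up_at:
        current += 1
    elif avg_cpu <= scale_down_at:
        current -= 1
    return max(min_size, min(max_size, current))

print(desired_capacity(1, 85.0))  # 2: scale up under load
print(desired_capacity(3, 90.0))  # 3: capped at the group maximum
print(desired_capacity(2, 10.0))  # 1: scale down when idle
print(desired_capacity(1, 5.0))   # 1: never drops below the minimum
```

The clamping step is the part the “Scale between” setting controls: no matter what the alarms say, the group never leaves the configured range without manual intervention.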
In the next step of the wizard, you set up how you are notified of Auto Scaling group activity. This step is important because you will want to know whether your hosted environment has arrived at a point where resources are being scaled up.
Most times you will be aware of when this may happen, such as a marketing campaign launch or some other activity that you’ve been involved in that is driving additional traffic to your website.
In this step, you can use existing or set up new SNS topics for messages to be delivered to when Auto Scaling activities occur. As a reminder, when you set up a new topic, you need to subscribe to that topic to receive notifications.
Once you have set up your notification details, you can click the “Next Configure Tags” button to move on to the next step, which allows you to add any tags to the Auto Scaling group. When you are satisfied with the tags defined, click the Review button to move to the last screen of the Auto Scaling Group Creation Wizard.
The last step allows you to review your configuration and to create the Auto Scaling group with your defined parameters and launch configuration. Since you have defined a minimum of one instance in the group, an EC2 virtual server instance will be launched with your launch configuration details immediately.
From there, the Auto Scaling group will monitor your CloudWatch metrics defined to watch the CPU utilization to know when it should add more resources to the group. Each EC2 resource will use the same AMI, instance size, and user data as defined in your launch configuration settings.
Content Lifecycles, Management, and Backup
In this blog, you’ll revisit your friend AWS S3 and look at backup strategies. You’ll also take a closer look at other storage options on the AWS platform for storing your data.
As your hosting presence, site, and files grow over months and years, you’ll need a plan for backing up data that is important, moving data that is less frequently accessed or not as relevant as it once was to less costly storage, and evaluating expiring data that is no longer needed.
The process of doing these steps is referred to as the content lifecycle and the AWS platform has a way for you to manage your data and content from the start of the lifecycle to its end of usefulness.
I’ll also introduce AWS Glacier and AWS Snowball as storage solutions that can help with storing and importing large data sets into the AWS Platform.
Managing Content Lifecycles in S3
Throughout this blog, you’ve been introduced to and revisited features within AWS Simple Storage Solution (S3).
As discussed at the beginning of the blog, the goal is to introduce you to what I feel are the most useful basic features of each service and to give you the confidence to explore the advanced features more in-depth as you start using the platform for your environments.
In this third hosting scenario, I presented the enterprise website: a business that not only supports and hosts a highly available, fault-tolerant e-commerce website in AWS but also wants to use other services on the platform in order to be as effective as possible.
Content lifecycles refer to the effective lifespan of a piece of data. The start of the lifecycle is when the data is created and the end of the lifecycle is when the data is deemed no longer of value. Lifecycle management features are accessed through your S3 Dashboard and specifically at the bucket properties level.
By default, objects (including buckets and folders) created in S3 use the Standard storage class. You’ve seen the other types mentioned earlier in this blog, but I didn’t spend much time talking about them. Let’s take a look at the three storage classes now.
The Standard storage class is the default storage type for all S3 objects. This class offers the highest level of durability at “eleven nines” or 99.999999999% durability, with “four nines” or 99.99% availability. This storage class is used for general storage where items are accessed frequently.
Behind the scenes, AWS replicates your data across its own infrastructure to make sure that your data is there when you need it.
As it relates to your promotions folder, this class is the default setting for all objects (bucket, folder, and object) contained within it, so there are no changes to be considered until you learn more about the other class types.
Standard - Infrequent Access is the next level of storage class available for S3 objects. This class offers the same level of durability and one less “nine” or 99.9% availability.
This class is best for data that is still needed in terms of accessibility, but less frequently. This class has a lower per-GB fee for storage of objects but also has a per-GB fee for retrieval of objects of this class.
In terms of your promotions folder, it would make sense to have your current month promotions set as Standard, but you could change the storage class to Standard - Infrequent Access once the current month promotions have passed.
Reduced Redundancy Storage (RRS) is the third storage class type for S3 objects. This class offers less durability, at “four nines” or 99.99%.
It has the lowest per-GB cost of the three storage classes and may be a good option for files that are stored in multiple locations and are easily replaced.
As noted in the AWS documentation, the level of durability is still approximately 400 times more durable than that of a typical disk drive. There is no retrieval fee for objects that use this class.
The amount of replication that happens behind the scenes is not the same level as with Standard class object types, and this class is designed to survive the failure of a single location or availability zone.
There is another storage class, but I look at it as more of an AWS product/service, and that is AWS Glacier. Glacier is a very low-cost storage solution for objects that do not need to be accessed frequently and is an excellent option for long-term backup.
There is a retrieval fee per GB for data stored in Glacier and there is a longer expectation in terms of how quickly the object will be restored and available for retrieval. You’ll see how Glacier fits into your lifecycle management options shortly.
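A quick back-of-the-envelope comparison shows why moving objects between classes matters. The Glacier price below is the figure quoted later in this post; the Standard and Standard - Infrequent Access per-GB prices are approximate 2019 us-east-1 rates and will vary by region (retrieval and request fees are ignored here):

```python
def monthly_storage_cost(gib, price_per_gib):
    """Storage-only monthly cost for `gib` gigabytes at a per-GB rate."""
    return round(gib * price_per_gib, 2)

# Assumed per-GB monthly prices, storage only.
prices = {"STANDARD": 0.023, "STANDARD_IA": 0.0125, "GLACIER": 0.004}

for storage_class, price in prices.items():
    print(storage_class, monthly_storage_cost(500, price))
# 500 GB costs roughly $11.50 in Standard, $6.25 in Standard-IA,
# and $2.00 in Glacier per month under these assumptions.
```

The trade-off is that the cheaper classes charge per-GB retrieval fees, so frequently accessed data can end up costing more in a “cheaper” class.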
At any time you can change the storage class of your object in S3 via the Object Properties screen. Manually moving data between classes is something that is quite effective in terms of cost savings; however, if you want to perform this process in a more automated fashion, you can use content lifecycle rules to accomplish the same over time.
If you take a look at the Object Properties settings, you’ll note that with no lifecycle policy applied to it, the expiration date and expiration rule are blank.
Lifecycle management policies, referred to as rules, are accessed from within the Properties menu of the S3 bucket on which you’d like to set the policy. If you browse to the S3 bucket properties you will see a drop-down section within the properties options called Lifecycle. Expand that section to view the available options for this feature.
From within this section, you can add a new lifecycle policy for your bucket by clicking the Add Rule button. This will start the Lifecycle Rules Wizard that will guide you through the process. There are three major steps in the Lifecycle Rules Wizard. The first step asks you to choose a rule target.
You have the option of having all the content in the bucket have the policy applied to it, or you can enter a prefix to apply the policy against.
In your example, you want to apply this against the promotions folder, so you’ll enter the prefix promotions/. Entering this will mean that this lifecycle rule will apply to all objects that have that prefix, which can also be expressed as any objects within this S3 folder.
Click the Configure Rule button to move to the second step, which asks you for details about what you’d like to do in terms of lifecycle management of the target objects.
For your example, you’ll keep it simple and just check the three checkboxes next to “Transition to Standard - Infrequent Access,” “Archive to the Glacier storage class,” and “Permanently Delete.” When you select the checkboxes, a default value will be added to each of the steps in the lifecycle.
The default settings move objects from the Standard class to the Standard - Infrequent Access storage class 30 days after the object is created. Thirty days after the object has been moved to Standard - Infrequent Access storage it will then be moved to the AWS Glacier storage class.
Lastly, 425 days after the object was created it will be permanently deleted. You can choose to have all of these steps as a natural progression for deleting unused content or a subset of them, such as moving to Standard - Infrequent Access, as a way to save on storage costs.
On the last step of the wizard, you will be asked to name the lifecycle rule and will have the opportunity to review the lifecycle management of the objects to which it will be applied.
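The rule the wizard builds can also be expressed in code. Below is the same 30/60/425-day lifecycle as the structure that boto3’s `put_bucket_lifecycle_configuration()` accepts; the bucket name and rule ID are placeholders, and the actual API call is shown only in comments since it requires credentials:

```python
# Lifecycle rule matching the wizard example: objects under
# "promotions/" transition to Standard-IA at 30 days, Glacier at
# 60 days, and are deleted 425 days after creation.
lifecycle_rule = {
    "ID": "promotions-lifecycle",            # placeholder rule name
    "Filter": {"Prefix": "promotions/"},
    "Status": "Enabled",
    "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 60, "StorageClass": "GLACIER"},
    ],
    "Expiration": {"Days": 425},
}

# Applying it would look roughly like this (not run here):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-example-bucket",
#     LifecycleConfiguration={"Rules": [lifecycle_rule]},
# )

print(len(lifecycle_rule["Transitions"]))  # 2
```

Note that the `Days` values are all counted from object creation, which is why the Glacier transition is expressed as 60 rather than 30.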
Using AWS as a Backup Strategy
We all know how important it is to have a current backup of your data. Bad things do happen and it is important to plan so that you can return to being productive as soon as possible after an incident where data restoration is needed. Amazon Web Services offers backups and fault tolerance in all of its platform services.
You had some exposure to this earlier in the blog when you configured AWS RDS. As part of the configuration process within RDS, you can specify your preferred backup preferences.
RDS automated backups are set for a three-day period. Each day during the configured maintenance window a backup (also referred to as a snapshot) will be taken of the RDS database instance state.
This snapshot can be used to restore the database instance to a point in time, or used as a baseline for setting up a new database instance.
The concept of a snapshot is used widely across the platform services in AWS as a method for providing infrastructure backups. A list of available RDS snapshots is available for viewing from the RDS Dashboard, under the Snapshots link in the left-hand navigation.
Backup options exist for all platform infrastructure services. Copies of CloudFormation templates are stored on S3. S3 buckets can be copied across regions to provide redundancy and data protection.
EC2 uses the concept of snapshots for backing up data volumes attached to EC2 Virtual Server instances. To access this, you can log into the EC2 Dashboard and under the Elastic Block Store section, click the Snapshots link.
From here you can click the Create Snapshot button to create a new snapshot manually, which will capture information about the snapshot you would like to create and for which volume you would like to create a snapshot.
In addition to doing this manually, you can also choose to script this process using the AWS command line interface or other development and administration tools. Taking a snapshot of an EBS volume is a simple process and is an incremental backup based on the last successful snapshot.
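As a sketch of what such a script might look like, the following builds one `aws ec2 create-snapshot` invocation per EBS volume. The volume IDs are placeholders, and a real script would execute each command (for example with `subprocess.run`) rather than just print it:

```python
def snapshot_commands(volume_ids, description="nightly-backup"):
    """Build AWS CLI invocations for snapshotting a list of EBS
    volumes: one `aws ec2 create-snapshot` call per volume."""
    return [
        ["aws", "ec2", "create-snapshot",
         "--volume-id", vol,
         "--description", description]
        for vol in volume_ids
    ]

# Placeholder volume IDs for illustration.
for cmd in snapshot_commands(["vol-0aaa", "vol-0bbb"]):
    print(" ".join(cmd))
```

Wired into cron or a scheduled task, this gives you regular incremental backups without touching the console.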
This allows you to do point-in-time restores of the data if needed. AWS’s online documentation includes a tutorial on how to set up automatic snapshots of EBS volumes using CloudWatch.
EC2 Virtual Server instances can also be backed up for later use/restoration. In this case, the backup process is referred to as taking an image of the server instance state.
You can take an image of an instance through the Actions menu on the EC2 Dashboard; under the Image menu option, you'll find the Create Image link.
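The same operation is available from the CLI. A minimal sketch follows; the instance ID and image name are placeholders.

```shell
# Create an AMI from a running instance (instance ID and name are placeholders).
# --no-reboot skips restarting the instance, at the cost of guaranteed
# file-system consistency in the resulting image.
aws ec2 create-image \
  --instance-id i-0123456789abcdef0 \
  --name "web-server-image-2019-01-15" \
  --no-reboot
```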
Now that I’ve talked about some of the infrastructure backup options that AWS provides for your infrastructure services in the cloud, I should mention that AWS can also be used as a platform to back up important information from your home or business.
I showed you how to create S3 buckets and how to use the AWS CLI to sync local folder content with an S3 bucket. This same methodology, combined with scheduled tasks, can be used to back up network share locations on your business network.
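That sync workflow can be sketched as follows; the bucket name and share path are placeholders for this example.

```shell
# Sync a local folder (or mounted network share) up to an S3 bucket.
# Only new or changed files are transferred on each run.
aws s3 sync /mnt/shares/marketing s3://nadonhosting-backups/marketing

# Add --delete if you also want files removed from the share to be
# removed from the bucket, keeping the two locations identical.
aws s3 sync /mnt/shares/marketing s3://nadonhosting-backups/marketing --delete
```

Scheduling either command with cron (Linux) or Task Scheduler (Windows) turns it into a hands-off nightly backup.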
A contextual use case for this type of backup is backing up customer orders, shared documentation, and company marketing assets or training material. AWS S3 is an excellent, low-cost, highly available option for your backup storage, but if you’re looking for something that is even lower cost, AWS Glacier may be a better fit for your storage needs.
AWS Glacier is described by Amazon Web Services as a “secure, durable, extremely low-cost cloud storage” and at the time of this writing was priced at $0.004 per GB per month.
Although the storage cost is very low compared to other providers or the option to store the data on disk, there are other considerations to ponder before using this service.
First, there is no user interface for the service available from the AWS Console. If a UI is something that you prefer, there are options available, such as CloudBerry Explorer from CloudBerry Lab, which I introduced in the first section of the blog. Second, there is a retrieval fee for accessing and retrieving data from Glacier.
Third, when you want to access specific data, you initiate a request to retrieve it; based on the type of request, it could take as little as 5 minutes or as long as 12 hours to be accessed.
The price of this service makes it an excellent option for long-term data archival of important information; however, the considerations mentioned above may make this option unattractive to some people.
Earlier in this blog, I explained how you can add the use of AWS Glacier into the lifecycle management of objects that you’re storing. It is worth talking a bit about the options for retrieving the data if you choose to use the service.
In terms of setting up AWS Glacier, object storage lifecycle management is one way to move data into Glacier. You can also access AWS Glacier from the AWS Console, but there you will only be able to set up vaults and notifications.
Vaults are collections of archives stored in AWS Glacier. An archive is a collection of objects, usually packaged in a compressed format such as ZIP or TAR.
There are three types of retrieval requests that can be initiated through the AWS Glacier service, all of which are initiated through a third-party provider application, the AWS SDKs, or the AWS CLI.
The three available types are Expedited, Standard, and Bulk. Expedited requests will be retrieved in the shortest amount of time, but are meant for datasets that are no larger than 250MB in size and they should be available within 1-5 minutes.
Within expedited requests, you can choose On-Demand retrievals, which are fulfilled on a best-effort basis, or Provisioned capacity, which dedicates retrieval resources to you so that expedited retrievals are available whenever you need them.
The difference between these two is really in how you are charged for the resources that serve the data you are asking to be retrieved. Yes, you must pay for the capacity that holds the data you're asking to be restored.
Standard requests are the most typical use case and you will be notified when the requested data is available for access. You can access the archive data via third-party tools, the AWS SDK, AWS CLI, or even stand up an EC2 instance to access the data.
The last type of request is the Bulk request and these are reserved for the largest datasets. Retrieval time for these archives can take up to 12 hours, but seeing as this could cover up to petabytes of data, this timeframe is quite acceptable in most cases.
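A hedged CLI sketch of initiating a retrieval follows; the vault name and archive ID are placeholders, and `--account-id -` tells the CLI to use the account that owns the credentials. The `Tier` field accepts `Expedited`, `Standard`, or `Bulk`.

```shell
# Initiate an archive retrieval job from a Glacier vault
# (vault name and archive ID are placeholders).
aws glacier initiate-job \
  --account-id - \
  --vault-name my-archive-vault \
  --job-parameters '{"Type": "archive-retrieval", "ArchiveId": "EXAMPLE_ARCHIVE_ID", "Tier": "Standard"}'

# Check on the status of retrieval jobs for the vault.
aws glacier list-jobs --account-id - --vault-name my-archive-vault
```

Once the job completes, the retrieved data is fetched with a follow-up `aws glacier get-job-output` call.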
More information about AWS Glacier and the product features and usage can be found on the AWS product documentation website at https://aws.amazon.com/glacier/.
Getting Large Datasets into AWS
There may be times when you need to get a large amount of data or resources migrated from your location into Amazon Web Services. There are a couple key services that may assist with these tasks.
AWS offers the Database Migration Service (DMS), which helps you migrate and replicate data from your on-premises database system into the AWS platform. The process requires, at a minimum, a source database, a replication instance, and a target database.
Database conversion tools are available to help you migrate from one platform to another, including modifying the database's schema. More information on the service can be found at https://aws.amazon.com/documentation/dms/.
For migrating server instance resources, the AWS platform offers the AWS Server Migration Service. This service helps you move your virtual server infrastructure from your location to the cloud.
The most popular Windows and Linux operating systems are supported by the service, and this is a great option when the goal is to move a server from being hosted on-premise to AWS.
More information about this service can be found in the AWS documentation at http://docs.aws.amazon.com/server-migration-service/latest/userguide/server-migration.html.
If the resource that you need to move into AWS is a large amount of raw data, the AWS Snowball service may be the best fit for you. Snowball is an ultra-secure option for migrating large amounts of data from your site into AWS.
The device will be shipped to your location, ready to be plugged in and accessed by your network and computing resources. Snowball devices come in 50TB and 80TB sizes, allowing you to transfer data to them locally and then ship them back to AWS.
Then AWS will migrate that data to its storage resources and make it available to you. For data sets larger than 10TB, this process is often much more cost/time effective than trying to transfer the data via the Internet, and it is definitely more reliable and secure.
More information about AWS Snowball can be found at https://aws.amazon.com/documentation/snowball/.
Monitoring the Health of Your Services
At this point in the blog, you’ve come a long way in terms of knowledge about the many AWS services that the Amazon Web Services platform offers. You’ve set up infrastructure resources using services like EC2, S3, and RDS that are core components of your website hosting.
You’ve learned how CloudWatch can be used to gather basic metrics for RDS and EC2 and you’ve seen how well the CloudWatch service is integrated into the dashboard screens for each service.
You had a sneak peek at how powerful this tool can be in assisting you with reacting to changes in your infrastructure needs when you configured your first Auto Scaling group earlier in this section of the blog.
CloudWatch was the conduit that monitored the CPU utilization of your EC2 instances and, based on thresholds set in a CloudWatch alarm, would take action to either scale up and launch additional EC2 virtual server resources or scale down and remove EC2 virtual server resources that were no longer needed based on server load.
CloudWatch alarms can tell you when to take action on a given situation based on measured metrics. CloudWatch can be used in other ways to monitor and gather information from your AWS resources.
For example, CloudWatch can be used to gather log data from your EC2 instances. By default, the data collected about an EC2 virtual server instance relates to the performance of that infrastructure within the AWS platform and is available through the EC2 Dashboard, under the Monitoring tab.
CloudWatch Monitoring Options
CloudWatch offers multiple forms of monitoring that you can choose to implement to help with the monitoring of your infrastructure and services. The first was discussed in the previous section and is the basic monitoring option.
This is included in some fashion in most of the AWS platform services dashboard screens to give a glance at the performance metrics of those services implemented.
You can also access all of the metrics collected by the basic CloudWatch monitoring through the CloudWatch dashboard in the AWS Console under the Metrics link in the left-hand navigation.
By default, CloudWatch basic monitoring metrics are collected on a five-minute interval. For most resources, this is sufficient; however, there are services within the platform that allow for more granular monitoring via CloudWatch.
EC2, for example, is one of the services that allows detailed monitoring to be enabled on a virtual server instance. This can be specified while setting up the instance or can be enabled after the instance has been created.
To enable detailed monitoring for an instance, browse to the EC2 Dashboard, click the Instances link in the left-hand navigation, select the instance for which you would like to enable detailed monitoring, and click the Actions button to open the Options menu. From here, choose CloudWatch Monitoring and then Enable Detailed Monitoring.
Enabling CloudWatch detailed monitoring this way could be quite time consuming if you need to enable it on multiple instances. Thankfully, you can use the following command to enable detailed monitoring on an instance using the CLI:
aws ec2 monitor-instances --instance-ids <instance-id>
This command will make a call to your given instance to enable detailed monitoring. It will return a status of “pending” while it is changing the instance from basic to detailed monitoring. Issuing the command a second time will show a status of “enabled” when complete.
If you want to disable detailed monitoring on that instance, you can use the following command:
aws ec2 unmonitor-instances --instance-ids <instance-id>
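As a small time-saver, `monitor-instances` also accepts several instance IDs in one call, so enabling detailed monitoring across a fleet can look like the sketch below (the instance IDs are placeholders):

```shell
# Enable detailed monitoring on multiple instances in a single call
# (instance IDs are placeholders for this sketch).
aws ec2 monitor-instances \
  --instance-ids i-0123456789abcdef0 i-0fedcba9876543210
```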
To verify that detailed monitoring is enabled, you can click one of the graphs presented in the Monitoring tab. When the graph is displayed, you will note that monitoring data is now available at 1-minute rather than 5-minute intervals. EC2 is not the only AWS service that offers this enhanced level of monitoring.
A full list of all available CloudWatch metrics and the interval at which collected data can be delivered is available at http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CW_Support_For_AWS.html.
Now that you have enabled detailed monitoring for your EC2 instance, you can use CloudWatch to set alarms to report or take action on metric data values and thresholds you feel are worthy of being notified about. You can create graphs to view a graphical representation of your CloudWatch metrics over time.
You can collect these graphs into collections known as CloudWatch dashboards for easier viewing and aggregation of your data.
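A hedged sketch of creating such an alarm from the CLI follows; the alarm name, instance ID, and SNS topic ARN are placeholders. The 60-second period assumes detailed monitoring is enabled on the instance.

```shell
# Alarm when average CPU stays at or above 80% for five consecutive
# one-minute periods; instance ID and topic ARN are placeholders.
aws cloudwatch put-metric-alarm \
  --alarm-name web-server-high-cpu \
  --namespace AWS/EC2 \
  --metric-name CPUUtilization \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --statistic Average \
  --period 60 \
  --evaluation-periods 5 \
  --threshold 80 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --alarm-actions arn:aws:sns:us-east-1:111122223333:ops-alerts
```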
In addition to collecting and viewing data in the above method from CloudWatch, you can also gather additional metrics and data from CloudWatch on EC2 instances by deploying the CloudWatch Logs Agent to your EC2 instance.
Although the process of installing the Logs Agent is a bit outside of the scope of this blog, I feel that doing so to monitor and capture events written to web application logs such as Apache Server may be relevant.
So I'm including a link to the Quick Start guides, which walk through setting up the CloudWatch Logs Agent on existing instances or including it on new EC2 instances at launch time: http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_GettingStarted.html.
As you now know, CloudWatch alarms can be used to notify you or take action when a metric measurement has crossed a threshold set within the alarm. There is another way for CloudWatch to take action on resources and this is through the use of CloudWatch rules.
CloudWatch rules can be accessed through the CloudWatch dashboard under the Rules link on the left-hand navigation. Rules can be created to match an event pattern for a given resource or can be scheduled to occur on a timeframe, like that of a Linux-based cron schedule.
In this configuration, you use a pattern-match event on EC2 resources. The event type is set to "EC2 Instance State-change Notification" and will watch for a change in state on any of your EC2 resources, sending a Simple Notification Service (SNS) message to a topic that you configure.
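A hedged CLI sketch of that rule might look like the following; the rule name and SNS topic ARN are placeholders, and the topic is assumed to already exist.

```shell
# Create a rule that matches EC2 instance state-change events.
aws events put-rule \
  --name ec2-state-change \
  --event-pattern '{"source": ["aws.ec2"], "detail-type": ["EC2 Instance State-change Notification"]}'

# Point the rule at an SNS topic (topic ARN is a placeholder).
aws events put-targets \
  --rule ec2-state-change \
  --targets Id=sns-target,Arn=arn:aws:sns:us-east-1:111122223333:ops-alerts
```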
SNS is a topic/subscription-based service provided by AWS to handle communication and notification between services.
It can serve as an endpoint for communication from services such as EC2, CloudWatch, Lambda, and many others across the platform. Not only can it be used within your AWS account to receive notifications from services that you are using, but it can also be used as a platform service itself, creating an endpoint for messaging from applications, SMS, and more. The full scope of SNS and how it can be used is outside of this blog, but AWS's Getting Started Guide for SNS is a good place to learn more.
SNS can be used as your endpoint for all of your monitoring notifications, enabling you to set up topics by resource, region, or any other grouping you prefer and then subscribing to the topic to be instantly notified.
Since SNS is also an AWS service, there are CloudWatch metrics that can be monitored and data that can be analyzed, such as messages per topic, notifications delivered, and more.
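Setting up such a topic and subscribing to it takes just two CLI calls; the topic name, ARN, and email address below are placeholders.

```shell
# Create a topic for monitoring notifications (name is a placeholder).
aws sns create-topic --name ops-alerts

# Subscribe an email address to the topic; SNS sends a confirmation
# message the recipient must accept before notifications are delivered.
aws sns subscribe \
  --topic-arn arn:aws:sns:us-east-1:111122223333:ops-alerts \
  --protocol email \
  --notification-endpoint admin@nadonhosting.com
```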
External/Third-Party Monitoring Options
The previous section discussed how you can use CloudWatch as a monitoring solution for your services and infrastructure in AWS. CloudWatch is definitely not the only option for enterprise monitoring of your resources. There are many third-party applications that offer cloud monitoring services.
A few examples that come to mind are DataDog and AppDynamics. DataDog is available from www.datadoghq.com/ and has a nice list of application integrations to help you receive notifications about infrastructure health in applications you may already use frequently, such as Slack.
AppDynamics is available from www.appdynamics.com/ and is well designed for monitoring application stacks and the health of the application components. Both of these providers have free trials that can be used to test their services and reporting on your infrastructure before committing to a long-term engagement with them.
The two examples above are third-party service providers that can help with your monitoring; however, if you want to host your enterprise monitoring solution yourself, you can spin up AWS resources within your AWS account to monitor infrastructure. There are many open-source options available for this method of implementing enterprise monitoring.
Two that come to mind are Zabbix, which is available at www.zabbix.com/, and Nagios, which is available at www.nagios.org/. Both offer enterprise-level monitoring for your infrastructure and applications and can be set up with minimal resources to monitor your environment.
Choosing self-hosted, third-party, or the AWS-included monitoring solution is most definitely a choice based on preference, time, and cost-based commitment.
Some may choose the flexibility and lower cost of self-hosting, while others will prefer the time-saving option of using a third-party service provider. While looking at either of these options, you can use AWS CloudWatch and evaluate whether or not it fits your needs.
AWS Security and Securing Website Communication
You'll start by using Secure Sockets Layer (SSL) to secure your e-commerce website traffic and then you'll move into other topics that focus on the security and best practice reporting available for your infrastructure resources.
Securing Your Website Communication
Hosting an enterprise level e-commerce website on the Internet will certainly have you giving thought to ways to secure the communication that happens between your customers, employees, and the website resources that you have available.
The most popular web transport protocol for securing web-based traffic is Secure Sockets Layer (SSL). This protocol secures the communication between a client computer and your website by encrypting the data passed between them.
To support this secure connection and encryption between the client browser and your website, the client must have a browser that supports SSL communication and the website must be able to host an SSL endpoint.
These two points in the connection are referred to as the termination points, and when communicating via SSL, any traffic passed between these points will be encrypted, adding a layer of security to the communication that doesn’t exist when communicating over a non-secure connection.
Non-secure web traffic connections can happen over many protocols, but the one that most will be familiar with is the application layer protocol HyperText Transfer Protocol (HTTP). This is the protocol that is used by default for web server applications and by default communicates over port 80.
This connection is referred to as a non-secure connection because data that is passed between the client browser and the web server is not encrypted and is sent in plain text.
This means that if someone analyzed the packets of data passed between these two points they would be able to interpret the data sent very easily. This could include non-sensitive data, but could also include things like usernames and passwords.
Most enterprise level websites, including the one in this hosting scenario, have username/password logins to gain access to certain parts of the website and will likely be selling products as well.
In both of these situations, it is recommended to secure the web traffic between client and server to minimize the risk of sensitive data being accessed by anyone that shouldn’t have access to this information.
AWS Certificate Manager
AWS Certificate Manager (ACM) is an Amazon Web Services platform resource that allows you to procure and manage SSL certificates to be used on your infrastructure. For resources hosted within the AWS platform, such as Elastic Load Balancers and CloudFront distributions, there is no charge for procuring the SSL certificate.
To access ACM, browse to the Services menu in the AWS Console and under the “Security, Identity and Compliance” section heading, click Certificate Manager and click the Get Started button if this is your first visit to this AWS resource.
When creating your first SSL certificate, you will need to provide the name of the domain to be secured. In this case, you specify www.nadonhosting.com and then click "Add another domain to this certificate" to also add the root of the domain, nadonhosting.com.
The next step is to review the information and then complete the submission request by clicking the “Confirm and Request” button. By default, AWS will email the domain registrant and administrative contacts for approval of the issuance of this SSL certificate for this domain.
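The equivalent request can also be made from the CLI; this sketch uses the scenario's domain names and assumes email validation to mirror the console flow described above.

```shell
# Request a certificate covering both the www host and the root domain.
# EMAIL validation sends approval requests to the domain contacts.
aws acm request-certificate \
  --domain-name www.nadonhosting.com \
  --subject-alternative-names nadonhosting.com \
  --validation-method EMAIL
```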
If you are not sure who the contacts are for your domain, you can find this information using a website such as www.whois.net. This tool will allow you to enter any domain name and find out information that is publicly available through the domain's registration record.
AWS Security and Best Practices Resources
Now that you have a way to secure your sensitive website communication data such as logins, passwords, and e-commerce processes, let’s discuss some of the other tools and resources that AWS has to help with security and best practices.
AWS Identity and Access Management should be the first place you think of when you think about AWS security. This is the resource I introduced at the beginning of the blog, and it is used to manage access to your AWS account and resources within it.
Early in the first hosting scenario, I talked about the best practices to secure your root account and I want to mention it here again because I feel that it is important enough to do so.
If your AWS root account is compromised, the owner of it controls all the keys to the kingdom and can cause a lot of havoc and lost revenue, or even incur operating costs by launching a bunch of AWS resources on your behalf.
CloudTrail is an auditing resource that keeps track of all API calls that happen within your AWS account and holds metadata about each of these changes including what resources were affected and who affected them.
CloudTrail is not on by default; it must be turned on through the AWS Console by configuring the first trail. Browse to the AWS Services menu and under the Management Tools section, click the CloudTrail link.
Fill out information to configure your first trail and to store the trail information in S3. There is no charge for the first trail, which can be configured to monitor your entire account; however, additional trails do come at a charge. Also, the S3 storage that will be used to store your trail data will fall under normal S3 pricing.
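The same trail can be configured from the CLI; the trail and bucket names below are placeholders, and the S3 bucket is assumed to already exist with a policy that permits CloudTrail to write to it.

```shell
# Create a trail that covers all regions and logs to an existing S3 bucket
# (trail and bucket names are placeholders).
aws cloudtrail create-trail \
  --name account-audit-trail \
  --s3-bucket-name nadonhosting-cloudtrail-logs \
  --is-multi-region-trail

# Trails do not record events until logging is explicitly started.
aws cloudtrail start-logging --name account-audit-trail
```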
Once created, CloudTrail will start tracking all API calls within your account. These calls are tracked whether they are done from within the AWS Console or done via the CLI.
This gives you a full audit trail to find out what resources have been accessed, created, changed, removed, and by whom. It is a powerful tool that many forget to enable but is highly recommended as a relevant tool for hosting on the AWS platform.
AWS Config is a management tool that, similar to CloudTrail, gives you a record of changes that have been performed against your AWS resources. AWS Config is focused around configuration management and change control, so the information presented is more directed from an asset perspective rather than all API calls, as with CloudTrail.
In terms of an enterprise hosting scenario, AWS Config can be a useful resource to provide rulesets around change management for your AWS resources to show compliance.
AWS Trusted Advisor
AWS Trusted Advisor is a best practices analyzer that runs at the account level, analyzing your AWS account and its resources and reporting on best practices that should be implemented.
Examples include EC2 security groups that are open to anyone and an AWS root account that is not using multi-factor authentication.
In addition to security best practices as the previous two examples illustrate, the tool will also report on performance optimization that can be achieved, such as applications that are running on underutilized EC2 instances.
There is a summary at the top of the Trusted Advisor dashboard that shows checks that are OK or have warnings and issues. This resource is a very valuable one and should be used at the beginning of your AWS hosting journey and at regular intervals thereafter.
AWS Inspector is an agent-based security tool that can be used to run against EC2 resources to report back on security vulnerabilities and give information on how to remediate them.
The process for using AWS Inspector is to load the agent on resources that you would like scanned and then to set up scheduled assessment scans to run against those resources.
You can limit not only the scheduled time but how long the assessment scans run and analyze the resource. This fine-tuning means that you can gather information about the resources with minimal overhead and impact on your operation of them.