Amazon Relational Database Service (AWS RDS)
A relational database relies on tables to organize data, with rows holding individual records and columns holding individual data items. The tables focus on specific topics, such as employee names and dates of birth. They relate to other tables, which is what gives relational databases their name.
For example, a single employee can have multiple telephone numbers, so the telephone numbers would appear in a separate table related to the employee table using a common element, such as the employee ID.
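As a minimal sketch of this idea, the following uses Python's built-in SQLite engine as a stand-in for any RDBMS (the table and column names are invented for illustration):

```python
import sqlite3

# In-memory database as a stand-in for any relational engine.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# One row per employee.
conn.execute("""CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL)""")

# Many phone numbers can point at one employee through the
# common employee_id element.
conn.execute("""CREATE TABLE telephone (
    phone_id    INTEGER PRIMARY KEY,
    employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
    number      TEXT NOT NULL)""")

conn.execute("INSERT INTO employee VALUES (1, 'Ann Jones')")
conn.executemany("INSERT INTO telephone VALUES (?, ?, ?)",
                 [(1, 1, '555-0100'), (2, 1, '555-0101')])

# Join the two tables through the shared employee_id.
rows = conn.execute("""SELECT e.name, t.number
                       FROM employee e
                       JOIN telephone t ON e.employee_id = t.employee_id
                       ORDER BY t.phone_id""").fetchall()
print(rows)  # one row per telephone number
```

The single employee row appears twice in the join, once for each related telephone number, which is the relationship described above.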
If you aren’t familiar with relational databases, you can find more about them at http://www.tutorialspoint.com/sql/sql-rdbms-concepts.htm.
The point is that the Amazon Relational Database Service (RDS) provides you with access to a relational database that you can use to store data that lends itself to organized storage techniques. Most business data, such as sales to individual customers, falls into this category. The first section of this blog helps you understand RDS from the Amazon Web Services (AWS) perspective.
After you get an idea of how RDS works in AWS, you want to begin working with the RDS Management Console, which is the topic of the second section of the blog. Relational databases can become quite complex, so the overview provided in this blog focuses on the most common and easiest-to-use features. AWS can help you create a relational database of nearly any complexity to manage your business needs.
The third section of the blog looks at the steps for creating a simple relational database using RDS. Of course, the real-world relational databases that you create are likely more complex and contain a great deal more information. This section of the blog helps you get a feel for how you can perform significantly more complex tasks with RDS.
Creating the database doesn’t fill it with data, manage the data in any way, or make the data accessible to the end user. To make databases useful, you must connect them to an application of some type (where an application could provide services to users, machines, or a combination of both).
The application may not even include a user interface in the common sense of the word, but could simply be an API accessed by yet other applications. The point is that you need software to make your database useful, which is the topic of the fourth section.
Finally, you need to consider the requirement to make the database scale well. A database with a single user isn’t particularly useful. Most databases support the needs of multiple users and some thousands (or possibly millions) of users.
As the number of users increases, the need to balance the load among multiple servers and the requirement to provide scaling so that the database continues to deliver data at an acceptable rate both increase as well.
Considering the Relational Database Service (RDS) Features
The main purpose of a relational database is to organize and manage consistent pieces of data using tables that relate to each other through key fields.
For example, an employee table may have a relation to a telephone number table connected through the employee ID. Because an employee can have multiple telephone numbers, every single entry in the employee table can have multiple connections to the telephone number table. This is, of course, a gross simplification of how a Relational Database Management System (RDBMS) works.
To perform management tasks correctly, you must have a reliable Database Management System (DBMS) built upon a specific engine. The database engine you choose determines the characteristics and flexibility of the management environment. In addition, the database engine can also affect how well the RDBMS scales when you increase load, data size, or other factors.
Also important is to have the means to create a copy of your database using either replication (the copying of individual data elements) or cloning (the copying of the entire database). The following sections describe how RDS helps you achieve all these goals.
Choosing a database engine
AWS RDS supports a number of database engines. Of course, supporting a single RDBMS might at first seem to do the trick because they all essentially do the same thing. However, you must consider a number of factors when choosing a database engine. These factors include
The RDBMS currently used for most of your existing projects
Security concerns that may override other needs for data storage
Data storage size or type requirements
Interoperability needs, especially when working with other organizations
Coding needs, such as the capability to execute scripts in specific ways
Automation needs, such as the capability to execute scripts in response to events or at a specific time
Given that the number of RDBMS engines available today is huge, RDS is unlikely to ever support them all. As of this writing, RDS supports six database engines, each of which has characteristics in its favor, as explained in the following list:
Amazon Aurora: This product is essentially a MySQL clone. If you like MySQL, you probably like Amazon Aurora as well. However, according to a number of sites, Amazon has managed to make Aurora faster, more scalable, and inclusive of a number of interesting additional features.
Of course, you pay a higher price for Amazon Aurora as well, so if you don’t need the extra features, then using MySQL is probably a better choice. The articles at http://2ndwatch.com/blog/deeper-look-aws-aurora/ provide a more detailed comparison of Amazon Aurora to MySQL.
MariaDB: This is another MySQL clone, but it also has a significant number of additional features that you can read about at https://mariadb.com/kb/en/mariadb/mariadb-vs-mysql-features/. You need to consider a few major differences when choosing this product.
For one thing, MariaDB is purely open source, which means that it uses a single license that is easier to manage. However, because of the licensing, enterprise customers will deal with equivalent open source implementations in MariaDB (such as thread pool), instead of the original MySQL implementations, which could result in compatibility issues.
MariaDB is also currently locked at the MySQL 5.5 level, so you may not have access to the latest MySQL features needed to make your application work.
MySQL: This product isn’t quite as old as some of the other RDBMS offerings that Amazon supports, but it does serve as the standard to which other products are judged.
The problem with being the leader is that everyone takes pot shots at you and tries to unsettle your customers, which is precisely what is happening to MySQL. You can read about some of the pros and cons of choosing MySQL at http://www.myhostsupport.com/index.php?/News/NewsItem/View/58.
The fact of the matter is that MySQL sets the standard, so it likely provides the most stable and reliable platform that you can choose when these issues are the main concern.
Oracle: This product has been around for years, so it has a long history of providing great support and significant flexibility. What sets Oracle apart from a few other products, such as MySQL and SQL Server, is that Linux administrators and developers tend to prefer it.
As with MySQL, Oracle is a standard setter that everyone likes to compare with other products, even when those comparisons aren’t a good match.
Unlike the other products in this list, Oracle Cloud is essential to view as a separate product from the enterprise setup: the two products aren’t completely compatible and have different feature sets. You can find some pros and cons of using Oracle Cloud at http://www.socialerp.com/oracle-private-cloud.php.
PostgreSQL: This is a combination product in that most people view it as an open source version of Oracle but also go to great lengths to compare it with MySQL.
Developers like PostgreSQL because it provides a significant number of features that MySQL tends not to support. In addition, the transition for developers from Oracle or SQL Server is relatively easy because PostgreSQL tends to follow their lead.
However, MySQL tends to provide better ease of use and is somewhat faster than PostgreSQL. You can find some interesting pros and cons about this product at http://www.anchor.com.au/hosting/dedicated/mysql_vs_postgres.
SQL Server: This product provides essential RDBMS functionality with a considerable number of add-ons. The important thing to remember about SQL Server is that Microsoft created it for Windows, and everything about this product reflects that beginning. In general, administrators find that working with SQL Server is relatively easy unless they need to use a broad range of those add-ons. Developers like SQL Server because it integrates well with the Microsoft language products.
Even with this short overview of the various choices, you can see the need to research your RDS choice completely before committing to a particular option.
In some cases, you may need to configure a dummy setup and perform tests to see which option will work best for your particular application. After you begin to fill the RDBMS with real-world data, moving to another database engine is usually an expensive, error-prone, and time-consuming task.
The smart administrator takes additional time to make a good choice at the outset, rather than discover that a particular choice is a mistake after the application moves into the development (or, worse yet, production) stage.
Understanding the need to scale efficiently
The capability of your application to scale depends on its access to resources. AWS provides consistent access to its resources by using auto-scaling, which is a combination of automation and scaling. Monitors generate events that tell services when an application requires additional resources, such as servers, to maintain a constant level of output so that the user doesn’t see any difference between a light and a heavy load.
Even though the real-world performance of auto-scaling may not provide precisely this level of consistency, from an AWS perspective the automation works well enough that most users won’t complain.
A problem with RDS, or any other database service for that matter, is that resources include data. No matter what you do, throwing additional resources at data management issues will only go so far.
At some point, the sheer weight of the data becomes an encumbrance. Searching through several million records to find the one record you need is going to take time, no matter how many servers you employ and how much memory you provide. With this time factor in mind, you need to consider these issues when working with AWS to create an application that scales well when large amounts of data are involved:
Use the right RDBMS.
Amazon makes a number of database managers available, as described in the previous section of this blog, “Choosing a database engine.” Even though your first inclination is to use the database engine that you use most commonly in your organization now, speed considerations may trump consistency in this case. If you want your application to scale well, you may need to choose an RDBMS that provides optimal speed in a cloud environment.
Organize the data using best practices.
This blog doesn’t address DBMS-specific concerns, such as the use of normalization. The use of best practices provides you with a good starting point to ensure that your application scales well. A best practice comes into play when experimentation shows that it provides good results in general.
Experiment to find good RDBMS optimizations. Knowledge resources usually focus on the general case because no one can possibly know about your specific needs.
However, trade-offs occur when you use various general organizational and optimization techniques, and you need to consider the price of each trade-off when compared to application speed and the application’s capability to scale well under load. In some cases, relying on the best practice that works well, in general, may not produce the desired result in your specific case.
Play with AWS to determine whether additional resources will help. AWS may really be able to help you overcome some speed and scaling issues by allowing you access to resources that you wouldn’t normally have.
The AWS documentation provides some clues as to when allocating additional resources (and spending more to do it) will yield a desired result. Unfortunately, the only way to verify that using additional AWS resources will provide acceptable gain for the price paid is to experiment and monitor the results of testing carefully.
Defining data replication
Data replication is often associated with data availability. When a failure occurs, RDS uses the replica instead so that users don’t see much, if any, reduction in application speed.
Amazon recommends that you place your replica in a different availability zone from your main database to ensure that the replica also addresses regional issues, such as a natural disaster. When a failure occurs because of a tornado or other natural disaster in one area, the replica in a region that has good conditions can take over until RDS makes repairs to the main database.
Amazon relies on SQL Server Mirroring to provide data replication when you choose SQL Server as your RDBMS. You can also choose a Multi-AZ deployment (http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/) for replication when using any of the other supported RDBMSs.
Another use of data replication is as a means to help data scale better when working with large datasets or a large number of users. A Read Replica has a copy of the data in the main database, but you can’t change it.
Applications connect to the Read Replica version of the data, rather than the main copy, to reduce the load on the main database when performing read-only tasks such as queries and data analysis.
This feature is available only to MySQL, MariaDB, and PostgreSQL RDBMS users. The main advantage is that your application gains a considerable scaling feature. The main disadvantage is that Read Replica updates occur asynchronously, which means that the read-only data may contain old information at times.
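The trade-off is easy to see in a sketch. The following Python code simulates read/write splitting with two SQLite in-memory databases standing in for the primary instance and its Read Replica; the `ReplicaRouter` class and its routing rule are invented for illustration and are not part of any AWS API:

```python
import sqlite3

class ReplicaRouter:
    """Sketch of read/write splitting: writes go to the primary,
    read-only queries go to the Read Replica endpoint. Two sqlite3
    connections stand in for real RDS endpoints."""

    def __init__(self, primary, replica):
        self.primary = primary
        self.replica = replica

    def execute(self, sql, params=()):
        # Route by statement type; a real router might also check
        # replica lag before trusting the asynchronous copy.
        is_read = sql.lstrip().upper().startswith("SELECT")
        target = self.replica if is_read else self.primary
        return target.execute(sql, params)

primary = sqlite3.connect(":memory:")
replica = sqlite3.connect(":memory:")
for db in (primary, replica):
    db.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")

router = ReplicaRouter(primary, replica)
router.execute("INSERT INTO item VALUES (1, 'widget')")  # primary only

# Because replication is asynchronous, the replica can lag behind:
stale = router.execute("SELECT name FROM item").fetchall()
print(stale)  # [] -- the replica hasn't received the new row yet
```

The empty read result illustrates the disadvantage mentioned above: until the asynchronous update arrives, the Read Replica serves old information.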
Cloning your database
Replication is the data-based copying of individual elements: you ask AWS to create a copy of your data, but not necessarily of the entire database. Cloning focuses on copying the entire database, including the data. AWS supports cloning by using database snapshots, a sort of picture of the database at a specific instant in time. Database snapshots are used in multiple scenarios:
Backup: Restoring a snapshot helps a failed RDS instance recover to a known state.
Testing: Placing a snapshot on a test system provides real-world data that a developer or other party uses to test applications or processes.
Cloning: Copying a snapshot from one RDS instance to another creates a clone of the source RDS instance.
Creating the snapshot means telling AWS where to copy the database and providing credentials for encrypted databases. You can create a database snapshot in a number of ways:
Manually by using the RDS Management Console
Automatically by scheduling the snapshot using the RDS Management Console
Programmatically by using the RDS API
When you use automation to create the snapshot, AWS automatically deletes the snapshot at the end of its retention period, when you disable automated database snapshots for an RDS instance, or when you delete an RDS instance. You can keep manually generated database snapshots as long as needed.
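As a rough illustration of retention-based cleanup, the following plain-Python sketch mimics the rule that automated snapshots expire after a retention period while manual snapshots are kept as long as needed. The record layout and function name are invented for illustration; this is not the actual RDS API:

```python
from datetime import datetime, timedelta

def expired_snapshots(snapshots, retention_days, now=None):
    """Return the automated snapshots older than the retention
    period. Manual snapshots are kept regardless of age, mirroring
    how RDS treats the two kinds differently."""
    now = now or datetime.utcnow()
    cutoff = now - timedelta(days=retention_days)
    return [s for s in snapshots
            if s["type"] == "automated" and s["created"] < cutoff]

now = datetime(2024, 6, 30)
snaps = [
    {"id": "auto-1", "type": "automated", "created": datetime(2024, 6, 1)},
    {"id": "auto-2", "type": "automated", "created": datetime(2024, 6, 29)},
    {"id": "manual-1", "type": "manual", "created": datetime(2024, 1, 1)},
]
old = expired_snapshots(snaps, retention_days=7, now=now)
print([s["id"] for s in old])  # ['auto-1']
```

Only the automated snapshot outside the seven-day window qualifies for deletion; the manual snapshot survives even though it is much older.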
Copying a database snapshot from one region to another incurs data transfer charges in addition to any charges that you incur creating the snapshot or using other service features. You should consider the cost of performing this task in advance because the charges can quickly mount for a large database (see http://aws.amazon.com/rds/pricing/ for pricing details).
In addition, Amazon places limitations on copying database snapshots from certain sources. For example, you can’t copy a database snapshot to or from the AWS GovCloud (US) region.
Accessing the RDS Management Console
As with every other part of AWS, you use a special management console to work with RDS. The RDS Management Console enables you to choose an RDBMS, create a database, add tables and other objects to the database, and make the database accessible to an application. You also use the RDS Management Console to perform administrative tasks, such as to configure security. Use the following steps to access the RDS Management Console:
1. Sign into AWS using your administrator account.
2. Navigate to the RDS Management Console at https://console.aws.amazon.com/rds.
You see a Welcome page that contains interesting information about RDS and what it can do for you. However, you don’t see the actual console at this point. Notice the Navigation pane at the left. You can click the left-pointing arrow to hide it as needed. Many of the RDS Dashboard options are the same as those used by EC2, which is no surprise, given that you use EC2 to support the database.
3. Click Get Started Now.
You see the Select Engine page. Notice that you can select a major vendor and then a specific version of that vendor’s product. For example, the page lists several versions of SQL Server.
The examples in this blog rely on MySQL because you can also download a free local copy from https://www.mysql.com/downloads/. The MySQL Community Edition is free, and you can obtain trial versions of the other editions. Most vendors do provide a free version of their product for testing and learning purposes. In addition, MySQL works on most of the platforms that readers of this blog will use.
4. Click Select next to the MySQL Community Edition entry.
The wizard asks how you plan to use MySQL. You only see this step when working with certain DBMSs. For example, you don’t see it when working with SQL Server Express Edition. Notice that you can use MySQL for development and test work, as well as for the RDS Free Usage Tier. (Amazon Aurora doesn’t offer a free usage tier.) Because this installation is for example purposes, you want to use MySQL, not Amazon Aurora.
5. Select the MySQL entry in the Dev/Test group and then click Next Step.
You see the Specify DB Details page. Notice that the Navigation page specifies that this DBMS is free-tier eligible. The right pane contains all the details about the DBMS instance. You can specify that you want to see only free-tier-eligible options, which is always a good idea to reduce potential settings errors.
6. Select the Only Show Options that are Eligible for RDS Free Tier check box.
7. Choose db.t2.micro in the DB Instance Class field.
To retain free-tier compatibility, you must choose this particular class. It pays to review the free-tier requirements found at https://aws.amazon.com/rds/free/. This informational page provides details about free-tier usage, such as the instance type, the kinds of database product you can choose, memory requirements, and so on.
Because the free-tier requirements can change at any time, you must review the free-tier materials before making choices about the database you want to work with. You may need to modify the selections used in this blog to ensure that you maintain free-tier support and don’t incur any expenses.
8. Choose General Purpose (SSD) in the Storage Type field and type 20 in the Allocated Storage field.
When working with MySQL Community Edition, you must allocate at least 5GB of storage. However, the free tier allows you to allocate up to 20GB, which is the maximum amount that the MySQL Community Edition can use. To get the maximum performance from your experimental setup, you want to allocate as much storage as you can.
Depending on the DBMS you choose, the wizard may warn you that choosing less than 100GB of storage can cause your application to run slowly when working with high throughput loads. This warning isn’t a concern when creating an experimental setup, such as the one defined for this blog. However, you do need to keep the storage recommendations in mind when creating a production setup.
9. Type MyDatabase in the DB Instance Identifier field.
The instance identifier provides the means for uniquely identifying the database for access purposes. Usually, you choose a name that is descriptive of the database’s purpose and is easy for everyone to remember.
10. Type a username in the Master Username field.
The master user is the administrator who manages the database and will receive full access to it. A specific person should have this responsibility, rather than assigning it to a group (where responsibility for issues can shift between people).
11. Type a password in the Master Password field, repeat it in the Confirm Password field, and then click Next Step.
You see the Configure Advanced Settings page. This page lets you choose the VPC security group used to identify incoming requests (before they arrive at the DBMS); the authentication directory used to authenticate database users who rely on Windows Authentication; the networking options used to access the DBMS (such as the port number); the backup plan; the monitoring plan; and the maintenance plan. You do need to set the VPC security group to ensure that you can access the database.
12. Choose the Default-Launch security group created as part of defining the EC2 setup.
Depending on the DBMS you choose, you may find other database options that you can set. For example, MySQL lets you provide the name of an initial database. It pays to go through the settings carefully to ensure that you make maximum use of wizard functionality.
13. Type First Database in the Database Name field and then click Launch DB Instance.
14. Click View Your DB Instances.
Creating a Database Server
Creating a cloud-based database server works much like creating a local server except that you’re performing all the tasks remotely on someone else’s system. The change in venue means that you may find that some processes take longer to complete, that you may not have quite the same flexibility as you have when working locally, or that some features work differently.
However, the overall workflow is the same. The following sections demonstrate how to work with a Microsoft SQL Server setup, but the techniques used with other RDBMSs are similar.
Installing a database access product
To use your new database, you need an application that can access it. For example, when working with SQL Server Express, you use the SQL Server Management Studio (https://msdn.microsoft.com/library/mt238290.aspx).
Likewise, when you want to work with MySQL, you use the MySQL Workbench. No matter which DBMS you choose, you use an application outside of the RDS Management Console to manage it.
You use the RDS Management Console only to control how the DBMS works with AWS. Because this blog relies on MySQL as the DBMS, you need to download and install a copy of MySQL Workbench before proceeding with any of the other activities.
Accessing the instance
The database instance you created earlier will eventually become available. This means that you can interact with it. However, to interact with the database instance, you need to know its endpoint, which is essentially an address where applications can find it. When you select an instance in the RDS Management Console, a detailed view of that instance becomes visible and you can see the endpoint information.
In this case, the endpoint is mydatabase.cempzgtjl38f.us-west-2.rds.amazonaws.com:3306, which includes the instance name, a randomized set of letters and numbers, the instance location, and the port used to access the instance. Every endpoint is unique. If the endpoints weren’t unique, you’d experience confusion trying to access them.
When setting up a new connection, you need to supply the entire endpoint, except for the port, as the hostname. In this case, that means supplying mydatabase.cempzgtjl38f.us-west-2.rds.amazonaws.com as the hostname. You must also supply the port, which is 3306, along with your username and password to access the instance.
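Splitting the endpoint into the hostname and port that a client such as MySQL Workbench asks for separately takes one line of code. This sketch uses the invented example endpoint shown above:

```python
def split_endpoint(endpoint):
    """Split an RDS-style endpoint string into the hostname and
    numeric port that a database client asks for separately."""
    # rpartition splits on the LAST colon, so a hostname that
    # happens to contain colons elsewhere would still work.
    host, _, port = endpoint.rpartition(":")
    return host, int(port)

endpoint = "mydatabase.cempzgtjl38f.us-west-2.rds.amazonaws.com:3306"
host, port = split_endpoint(endpoint)
print(host, port)
```

You would then hand `host` and `port` to whatever connection dialog or client library you use.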
MySQL Workbench provides a Test Connection button that you can use to determine whether the connection information will work. Most database management products provide such a button, and testing your connection before you move to the next step is a great idea. Otherwise, you can’t be sure whether an error occurs because of a problem with the database or the connection to the database.
You work with your AWS database as you would any other database that you can access using the management tool of your choice. Everything works the same; you simply perform tasks in the cloud rather than on your local network or on your machine. Consider a typical example using MySQL Workbench.
Notice that First Database appears in the Navigator pane, just as you’d expect, after making the connection to the RDS database. Right-clicking the Tables entry produces a context menu in which you can choose Create Table.
The creation process works as it normally does. The only difference you might note is that some tasks will require more time to complete because of the latency of the connection. Remember that you’re accessing the database through a number of additional layers and that your connection speed also acts as a determining factor.
Working with other features
You must make a significant separation between the database instance and the database itself. An administrator may address the needs of the database instance, such as by ensuring that the database is backed up or by resetting the instance when it crashes using the RDS Management Console.
On the other hand, a Database Administrator (DBA) is likely to interact with the actual database using a completely different tool. Because of the nature of cloud-based databases, you must consider how various administrators access tools and who can access them. The following sections detail the use of various RDS Management Console tools.
Monitoring the database instance
When working with RDS, an administrator works with the database at two levels. The first level is monitoring. The detailed view always provides you with some metrics about the database. You see a log of alarms and recent events. In addition, you see the current CPU, memory, and storage use. However, these indicators are in real time, and you often need historical data to make a determination about a particular course of action.
To perform monitoring tasks, you select the database and then choose one of the monitoring options from the Show Monitoring menu of the RDS Management Console.
For example, you can display a multigraph view of the server data. In this case, the graph shows the addition of a couple of connections to the database and the effect of adding a table and performing some other tasks.
You’d need to look carefully at the CPU Utilization graph to detect any activity at all. Fortunately, you can click any of the graphs to expand it and get a better look. When you finish performing a monitoring task, click Hide Monitoring to see the detailed view again.
Rebooting after a crash
The Instance Actions menu lets you interact with the database instance at an administrator level. For example, when an instance does crash, you can restore it by choosing Instance Actions ➪ Reboot.
Creating and deleting a snapshot
You may choose to create a read-only static view of the database (a snapshot) for archival reasons. In this case, you choose Instance Actions ➪ Take Snapshot. The RDS Management Console displays the Take DB Snapshot page.
The note about the InnoDB storage engine doesn’t apply to the MySQL Community Edition because RDS supports only the InnoDB storage engine in this case. To create the snapshot, type a name in the Snapshot Name field and then click Take Snapshot.
After you create the snapshot, it appears in the RDS Management Console as another snapshot (choose Snapshots in the Navigation pane) that someone can access as needed.
Of course, you don’t want to keep snapshots around that you’re not using. Otherwise, you accumulate charges for objects you don’t need. To remove a snapshot, select its entry and click Delete Snapshot.
Restoring a snapshot
The main reason to have a snapshot is for use as a backup. To restore a snapshot, select its entry in the snapshot list and then click Restore Snapshot. You see the Restore DB Instance page. This page works much the same as the details page that you interacted with when creating the initial database.
However, the new database instance will contain everything found in the snapshot. It will have its own endpoint as well. Of course, you want to give the database a different instance name than the current database until you verify the snapshot’s content.
Verification is an important part of the process of working with any database in any situation, but especially so in the cloud. After you verify that the restored snapshot database instance contains the data you need, you can exchange it with the original database that needs repair by following these steps:
1. Select the original database. You see the details for that database.
2. Choose Instance Actions ➪ Modify.
You see the Modify DB Instance page. This page contains all the settings you used to create the database instance initially. You can modify any of the settings as needed.
3. Change the DB Instance Identifier field content to something new.
4. Check Apply Immediately and then click Continue.
If you don’t apply the change immediately, it won’t take place until the next maintenance cycle. You see a summary page that shows the modifications that you want to apply.
5. Click Modify DB Instance.
AWS applies the changes you requested. You must wait until AWS reboots the instance (you see Rebooting in the Status field) before the changes become permanent. When the Status field reads Available, you can move on to Step 6.
6. Perform Steps 1 through 5 for the restored snapshot database instance, except give this instance the name of the original database.
You have now swapped the two databases and are using the restored snapshot database instance as your current database instance for applications.
Performing other modifications
The Modify DB Instance page (available by choosing Instance Actions ➪ Modify) also gives you access to a wealth of other database instance settings.
For example, you can choose when backups occur and the level of monitoring provided. You can also change the database instance security settings. Anything you defined as part of the original creation process is available for modification in the Modify DB Instance page.
Adding Database Support to an Application
After you have a database server created and configured, you can use an application to access it. The data doesn’t serve any purpose until you provide access to it. The purpose of the application, in this case, is to provide Create, Read, Update, and Delete (CRUD) support for the data. Users are interested in data and what it represents; the application used to perform the task is secondary.
In fact, common practice today is to provide multiple applications to perform database tasks because user needs differ so widely across usage environments, devices, and personal preferences. The sources of these database applications can also vary. A database vendor could provide a generic application, corporate developers could provide something specific, and a third party might provide a feature-rich version of the application.
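As a minimal sketch of the four CRUD operations, the following uses Python's built-in SQLite engine as a stand-in for a remote RDS endpoint (the table and data are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for an RDS endpoint
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")

# Create: add a record.
conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
# Read: fetch the record back.
name = conn.execute("SELECT name FROM customer WHERE id = 1").fetchone()[0]
# Update: change the record in place.
conn.execute("UPDATE customer SET name = 'Acme Inc.' WHERE id = 1")
# Delete: remove the record.
conn.execute("DELETE FROM customer WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(name, remaining)  # Acme 0
```

The same four statements, issued through a MySQL client library against the RDS endpoint instead of SQLite, form the core of any application that fronts the database.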
When working with cloud data, accessing the data requires an endpoint, just as it does for your local network or drive. As shown in the “Accessing the instance” section, earlier in this blog, nothing really changes from a procedural perspective, except that you must now provide a different endpoint than normal.
From a developer perspective, the endpoint that RDS provides for a database instance is nothing more than a URL, which means that you can use the same techniques that you use for any online data.
This consideration also applies to any administrator tools used for private data. Administrators must consider the following issues as part of the application migration:
Verify that a connection works before attempting to use it to perform tasks on the data.
Assume that the connection will go down at some point, so make sure to verify that the connection is still present before each task.
Assume that someone will hack your data, no matter what security precautions you take because the data is now available in a public venue (so have a recovery plan in place).
Ensure that security measures work as anticipated so that every user group can access the data within the boundaries set by company policy.
Define security policies for working with data in a public venue that address social hacking issues.
Consider legal and privacy requirements before moving the data.
Develop a plan for dealing with sensitive data that inadvertently makes it to your hosted database, rather than staying on the local network or on a specific machine.
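The first two precautions, verifying a connection before use and assuming it will drop, can be captured in a simple retry pattern. This is a conceptual sketch with a fake connection function standing in for whatever driver your application uses:

```python
import time

def call_with_reconnect(connect, task, retries=3, delay=0.0):
    """Run a database task, reconnecting if the connection has dropped.

    `connect` returns a live connection object (or raises ConnectionError);
    `task` receives the connection and performs one unit of work. Both are
    placeholders for whatever database driver your application relies on.
    """
    last_error = None
    for attempt in range(retries):
        try:
            conn = connect()          # verify the connection works first
            return task(conn)         # then perform the task
        except ConnectionError as err:
            last_error = err
            time.sleep(delay)         # brief pause before retrying
    raise last_error

# Demonstration with a fake connection that fails twice, then succeeds.
attempts = {"count": 0}

def flaky_connect():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("connection dropped")
    return "connection"

result = call_with_reconnect(flaky_connect, lambda conn: "rows fetched")
print(result)
```

Checking the connection before each task costs little and prevents the application from failing mid-operation when the cloud link inevitably hiccups.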
These precautions are in addition to the precautions you normally take when connecting an application to a database. The actual coding that you use may not change much (except for the addition of checks to address online access requirements), but the focus of how the application makes connections and performs required tasks does need to change. Otherwise, your organization might make front-page news after getting hacked and losing a lot of data to someone in another country.
Configuring Load Balancing and Scaling
The precise levels of load balancing and scaling that you receive with a particular RDBMS instance depend on how you configure the instance and which RDBMS you choose to use. It also depends partly on the application support you provide, how many users are accessing the database (and from where they access it), and many other factors too numerous to discuss in a single section of a blog (or possibly in a whole shelf of books).
With this in mind, the following sections discuss load balancing and scaling issues in a generic way that works with all the RDBMSs that AWS supports. These discussions help you get started with both load balancing and scaling, but you may need to augment the information for your particular RDBMS to obtain a full solution to specific management needs.
Defining the purpose of load balancing
When your application gets large enough, you need multiple servers to handle the load. Of course, you don’t want to configure each application instance to use a specific server; rather, you want to send the request to a general location and have it go to the server with the least load at any given time. The purpose of a load-balancing server is to
Act as a centralized request handler
Monitor the servers used to handle requests
Route responses to clients from the various servers
Determine the need for additional servers to handle increasing loads
Not all load-balancing scenarios perform all these activities, but most of them do. The point is that you use a single request point to allow access to multiple servers in order to hide the fact that a single server can’t handle the load for the number of requests that users make. Using this approach enables you to scale your application across multiple servers in a transparent manner.
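At its core, the routing decision is simple: send each incoming request to the server currently handling the least work. The following toy model illustrates that decision; the server names are invented for the example:

```python
def route_request(servers):
    """Pick the server with the least current load.

    `servers` maps a server name to its current number of active
    requests. This is a toy model of the core routing decision a
    load balancer makes for every request.
    """
    name = min(servers, key=servers.get)
    servers[name] += 1  # the routed request adds to that server's load
    return name

# Three hypothetical EC2 instances with different current loads.
servers = {"ec2-a": 4, "ec2-b": 1, "ec2-c": 3}
first = route_request(servers)   # ec2-b has the least load
second = route_request(servers)  # ec2-b still has the least load
print(first, second)
```

Real load balancers add health monitoring, connection draining, and smarter metrics on top of this, but the transparent "single request point, many servers" idea is the same.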
When you’re working with AWS, load balancing always occurs across multiple EC2 instances. Even though Amazon makes a point of telling you about the fault-tolerance features added through load balancing, the main focus is on the additional processing power that load balancing provides. However, if an EC2 instance does freeze or become otherwise unusable, you can substitute another EC2 instance without any problem. The application user will never see the difference.
Working with Elastic Load Balancing
When you first configure your EC2 instances, you won’t have any Elastic Load Balancers configured — so you must create one. The Elastic Load Balancer must appear in the same region as the EC2 instances that it serves. The following steps help you create an Elastic Load Balancer:
1. Sign into AWS using your administrator account.
2. Navigate to the EC2 Management Console at https://console.aws.amazon.com/ec2. You see the EC2 Management Console.
3. Verify that you have the correct region selected by choosing it in the region drop-down list at the top of the EC2 Management Console.
4. Select Load Balancing/Load Balancers in the Navigation pane.
You see the Load Balancer page. Notice that the message specifies that you don’t have any load balancers configured for the selected region; it doesn’t say that you lack access to any load balancers.
5. Click Create Load Balancer.
The load balancer creation wizard appears. Notice that you can add protocols for accessing the load balancer.
You use the same protocols that your EC2 instances normally require. Remember that users will send requests to the load balancer instead of the EC2 instance.
The load balancer will then send the request to the EC2 instance best able to handle it. If you don’t provide any secure ports for your load balancer, the wizard will ask you to reconsider during the security setup step. Whether you use a secure port depends on how you’re using your EC2 instances.
If you don’t need a secure connection for your EC2 instances, you aren’t likely to need a secure connection for the load balancer.
6. Type MyLoadBalancer in the Load Balancer Name field.
7. Select the Default-Launch security group and then click Next: Configure Security Settings.
You see a message regarding the load balancer’s security. If you did select one of the secure options, the same screen asks you to provide an SSL certificate or allow AWS to generate an SSL certificate for you.
8. Click Next: Configure Health Check.
You see the Step 4: Configure Health Check page. This step is especially important because it ensures that the Elastic Load Balancer sends requests only to EC2 instances that are able to respond. Using this approach adds a level of reliability to your setup. The default options normally work quite well, but you can change them if desired.
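The health check logic behind this page is threshold based: an instance is marked unhealthy after several consecutive failed checks and healthy again after several consecutive successes. A minimal sketch of that state machine follows; the threshold defaults here are illustrative, not AWS's actual defaults:

```python
class HealthCheck:
    """Track instance health the way a load balancer health check does.

    An instance is marked unhealthy after `unhealthy_threshold`
    consecutive failed checks and healthy again only after
    `healthy_threshold` consecutive successful checks.
    """

    def __init__(self, unhealthy_threshold=2, healthy_threshold=3):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.healthy = True
        self._streak = 0  # consecutive checks contradicting current state

    def record(self, passed):
        if passed == self.healthy:
            self._streak = 0      # result matches current state; reset
            return self.healthy
        self._streak += 1
        needed = (self.healthy_threshold if not self.healthy
                  else self.unhealthy_threshold)
        if self._streak >= needed:
            self.healthy = not self.healthy  # threshold reached: flip state
            self._streak = 0
        return self.healthy

check = HealthCheck()
check.record(False)          # one failure: still considered healthy
state = check.record(False)  # second failure: now marked unhealthy
print(state)
```

Requiring several consecutive failures before removing an instance keeps a single dropped ping from needlessly pulling a healthy server out of rotation.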
9. Click Next: Add EC2 Instances.
The wizard presents you with a list of running instances. You likely have only one such instance running now if you worked through the examples in the blog. Normally, you choose as many instances as you can to help support load balancing.
10. Select each of the EC2 instances you want to use and then click Next: Add Tags.
The tags provide information that you can use for various organizational needs. You don’t need to define any unless you use them as part of an application-programming requirement or some other need.
11. Click Review and Create.
The wizard presents you with a screen showing the selections you made. Make sure to check the information carefully.
12. Click Create.
AWS starts the Elastic Load Balancer for you and shows you the Load Balancer page.
Defining the purpose of scaling
Load balancing generally refers to server farms, groups of servers connected through a central request point. Scaling refers to the capability to control all resources used to handle application request loads in an automated manner. When the load increases, the scaling functionality automatically increases the required resources.
Likewise, a decrease in load makes the scaling functionality reduce the number of resources in use. The resources appear as part of a pool so that other applications can rely on the resources as needed.
When you’re working with Amazon, the resources may seem limitless, but they do truly have an end. Even so, it’s unlikely that most applications will ever scratch the surface of the resources that Amazon makes available, so scaling doesn’t become a problem.
AWS makes a distinct difference between load balancing and scaling. The Elastic Load Balancing service is completely separate from the Auto Scaling service, even though you can coordinate the efforts of the two services to provide a robust end-user experience.
Both services also deal with EC2 instances, but in different ways, so the outcomes can be different. The important difference for this blog is that scaling provides a means of automatically adjusting available resources to meet specific application demands.
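The core of any scaling decision is comparing current load against thresholds and adjusting the resource count within fixed bounds. A conceptual sketch, with made-up threshold values rather than any AWS default:

```python
def desired_capacity(current, load_per_instance,
                     scale_out_above=70, scale_in_below=30,
                     minimum=1, maximum=10):
    """Decide how many instances a pool should have, given average load.

    Above 70% average utilization, add an instance; below 30%, remove
    one; always stay within the configured minimum and maximum. Real
    Auto Scaling policies are richer, but the core decision looks
    like this.
    """
    if load_per_instance > scale_out_above:
        current += 1
    elif load_per_instance < scale_in_below:
        current -= 1
    return max(minimum, min(maximum, current))

print(desired_capacity(3, 85))  # heavy load: grow the pool
print(desired_capacity(3, 20))  # light load: shrink the pool
print(desired_capacity(1, 10))  # already at the minimum: stay put
```

The minimum and maximum bounds are what "within limits that you specify" means in practice: scaling never runs away in either direction.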
You can adjust the functionality and performance of Auto Scaling in a number of ways. The following methods are those that you most commonly use when working with Auto Scaling to provide database services to an application:
Auto Scaling configuration:
The method that you use to configure Auto Scaling determines how the service reacts to EC2 events. For example, Auto Scaling automatically detects unhealthy EC2 instances and replaces them with healthy instances.
Schedules:
When you know in advance that your application will have a heavy load placed on it, you can create a schedule to ramp up the number of EC2 instances. This proactive approach may cost slightly more to use, but it always results in better application speed as long as you schedule the increased capacity at the right time.
Amazon CloudWatch events:
You can create Amazon CloudWatch events that automatically react to and handle application-scaling events. This reactive approach provides adjustments as needed, but you may see a delay between the time when the event occurs and the additional resources arrive. Generally, using Amazon CloudWatch does provide faster response times than humans can provide.
Elastic Load Balancing monitoring:
Combining Auto Scaling with Elastic Load Balancing helps you maintain a balanced server load, which uses resources more efficiently. You use a single set of servers to interact with a number of Auto Scaling groups to ensure that each group receives the resources it needs, but at a lower cost than when you manage each Auto Scaling group individually.
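The scheduled, proactive approach described above amounts to mapping time windows to capacities. A small sketch of that idea, with an invented sample schedule:

```python
def scheduled_capacity(hour, schedule, default=2):
    """Return the instance count scheduled for a given hour of the day.

    `schedule` maps (start_hour, end_hour) ranges to capacities,
    modeling the proactive approach of ramping up before a known busy
    period. The sample schedule below is made up for illustration.
    """
    for (start, end), capacity in schedule.items():
        if start <= hour < end:
            return capacity
    return default

# Ramp up ahead of a known busy window from 9:00 to 17:00.
schedule = {(9, 17): 6}
print(scheduled_capacity(11, schedule))  # busy hours: 6 instances
print(scheduled_capacity(22, schedule))  # off hours: the default of 2
```

The trade-off is exactly the one noted above: capacity scheduled for a window you don't end up needing still costs money, so the timing matters.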
Working with the Auto Scaling feature
Earlier in this blog, you read about autoscaling, a built-in feature that automatically adjusts how your setup reacts to loads. This section discusses Auto Scaling, the service you use to make your RDS setup autoscale within limits that you specify.
When you see the term autoscaling, think of the generic use of a feature (not necessarily a service) to make applications, services, and other AWS features add and remove resources as needed to make applications scale better and provide a consistent user experience.
When you see Auto Scaling, think about the service that you specifically use to make autoscaling feasible with certain AWS services. The Auto Scaling feature enables your EC2 instances to handle loads without a lot of human intervention. The following sections tell you how to use Auto Scaling to make your AWS services provide autoscaling functionality.
Applying Auto Scaling
You have several options for applying Auto Scaling to your EC2 instance, but the easiest method is to select one or more EC2 instances in the Instances page and then choose Actions ➪ Instance Settings ➪ Attach to Auto Scaling Group.
You see the Attach to Auto Scaling Group dialog box. Type a name for the group in the Auto Scaling Group Name field and then click Attach. AWS then automatically creates an Auto Scaling group for you that uses precisely the same settings as the selected EC2 instances.
Removing Auto Scaling
Unfortunately, you can’t remove an EC2 instance from an Auto Scaling Group in the Instances page. Use the following steps to remove an EC2 instance from an Auto Scaling Group.
1. Choose Auto Scaling/Auto Scaling Groups in the Navigation pane. You see the Auto Scaling Group page. You may need to click an Auto Scaling Groups link to see the list of groups.
2. Select the Auto Scaling Group for the EC2 instance.
3. Select the Instances tab of that group. You see a listing of EC2 instances attached to that group.
4. Select the EC2 instance that you want to remove and then choose Actions ➪ Detach in the Instances Panel.
Make sure that you choose the lower of the two Actions buttons. You see a Detach Instance dialog box.
5. Click Detach Instance.
AWS removes the EC2 instance from the Auto Scaling Group.
Simply deleting the Auto Scaling Group terminates the attached EC2 instance. After an EC2 instance is terminated, you can’t recover it and must re-create the instance from scratch.
The best way to avoid this problem is to provide your EC2 instance with termination protection by choosing Instance Settings ➪ Change Termination Protection in the Instances page for the selected EC2 instance. You see a dialog box in which you confirm that you want to enable termination protection.