What is Amazon S3 Bucket (Best Tutorial 2019)


An Amazon S3 bucket is more than simple storage, despite the name Amazon attached to it, because you can use it to interact with other services and to store objects other than files. This tutorial explains what an Amazon S3 bucket is and how it works, with examples, and also discusses the various Amazon cloud storage types used in 2019.

 

The heart of every enterprise — the most important asset it owns — is data. Everything else is replaceable, but getting data back can be difficult, if not impossible. 

 

To make online storage useful, AWS has to support objects, which include files but also many other object types: sounds, graphics, security settings, application settings, and so on.

 

This blog helps you focus on the kinds of objects that S3 supports. By supporting a large number of objects, S3 enables you to perform the most important task of all: reliable backups.

 

Later in this blog, you also see how to use S3 with Lambda to perform certain tasks using code. The capability to perform these tasks makes S3 a powerful service that helps you accomplish common tasks with a minimum of effort in many cases.

 

The final section of the blog discusses long-term storage, or archiving. Everyone needs to archive data, individuals and organizations alike. Data often becomes outdated and, some might feel, not useful. However, data has a habit of becoming useful when you least expect it, so saving it in long-term storage is important.

 

Considering the Simple Storage Service (S3) Features


S3 does so much more than just store information. The procedure you see in the “Performing a Few Simple Tasks” section of blog 2 is a teaser of a sort — but it doesn’t even begin to scratch the surface of what S3 can do for you.

 

The following sections help you understand S3 more fully so that you can grasp what it can do for you in performing everyday tasks. These sections are a kind of overview of common features; S3 actually does more than described.

 

Introducing AWS S3


The idea behind S3 is straightforward. You use it to store objects of any sort in a directory structure you choose, without actually worrying about how the objects are stored.

 

The goal is to store objects without considering the underlying infrastructure so that you can focus on application needs with a seemingly infinite hard drive.

 

S3 focuses on objects, which can be of any type, rather than on a directory structure consisting of folders that hold files. Of course, you can still organize your data into folders, and S3 can hold files.

 

It’s just that S3 doesn’t care what sort of object you put into the buckets you create. As with a real-world bucket, an S3 bucket can hold any sort of object you choose to place in it.

 

The bucket automatically grows and shrinks to accommodate the objects you place in it, so most of the normal limits placed on storage don’t exist with S3.

 

Even though you see a single bucket in a single region when you work with S3, Amazon actually stores the data in multiple locations at multiple sites within the region. The automatic replication of your data reduces the risk of losing it as a result of any of a number of natural or other kinds of disasters.

 

No storage technology is foolproof, however, and your own applications can just as easily destroy your data store as a natural disaster can. Consequently, you need to follow all the usual best practices to protect your data.

 

The Amazon additions act as a second tier of protection. Also, versioning (see the “Employing versioning” section, later in this blog, for details) provides you with a safety net in that you can recover a previous version of an object when a user deletes it or an application damages it in some way.

 

Amazon provides a number of storage-solution types, which are mentioned as the blog progresses. S3 is a new approach that you can use in the following ways:


Data storage: The most common reason to use S3 is to store data. The data might originally appear on your local systems or you might create new applications that generate and use data in the cloud.

 

A big reason to use S3 is to make data available anywhere to any device using any application you deem fit to interact with it.

 

However, you still want to maintain local storage when security or privacy is the primary concern, or when your own infrastructure provides everything needed to store, manage, and access the data. When you think about S3 as a storage method, think about data with these characteristics:

 

  • It’s in constant flux.
  • Applications require high accessibility to it.
  • The cost of a data breach (where hackers steal the data and you must compensate the subjects of the data in some way) is relatively low.

 

Backup: 


Localized backups have all sorts of problems, including the fact that a natural disaster can wipe them out. Using S3 as part of your data backup strategy makes sense when the need to keep the data available for quick access is high.

 

The storage costs are acceptable, and the risk from a data breach is low. Of course, backups imply disaster recovery. You make a backup on the assumption that at some point, you need the backup to recover from a major incident.

 

As with health or car insurance, you hope that the reason you’re keeping it never occurs, but you don’t want to take the risk of not having it, either. Amazon provides all the best-practice storage techniques that your organization would normally employ, such as periodic data integrity checks to ensure that the data is still readable.

 

In addition, Amazon performs some steps that most organizations don’t, such as performing checksums on all network data to catch data corruption problems before the data is stored (making it less likely that you see any data corruption later).

 

Data analysis: 


The use of an object-storage technique makes S3 an excellent way to store data for many types of data analysis.

 

Organizations use data analysis today to study trends of all sorts, such as people’s buying habits or the effects of various strategies on organizational processes. The point is that data analysis has become an essential tool for business.

 

Static website hosting:


 Static websites may seem quaint because they rely on pages that don’t change often. Most people are used to seeing dynamic sites whose content changes on an almost daily basis (or right before their eyes, in the case of ads).

 

However, static sites still have a place for data that doesn’t change much, such as providing access to organizational forms or policies. Such a site can also be useful for product manuals or other kinds of consumer data that doesn’t change much after a product comes on the market.

 

Working with buckets


The bucket is the cornerstone of S3 storage. You use buckets to hold objects. A bucket can include organizational features, such as folders, but you should view a bucket as the means for holding related objects.

 

For example, you might use a single bucket to hold all the data, no matter what its source might be, for a particular application. Amazon provides a number of ways to move objects into a bucket, including the following:

 

Amazon Kinesis: A method of streaming data continuously to your S3 storage. You use Kinesis to address real-time storage needs, such as data capture from a device like a data logger. Because the majority of the Kinesis features require programming, this blog doesn’t cover Kinesis extensively.

 

Physical media: You can send physical media to Amazon to put into your S3 bucket if your data is so huge that moving it across the Internet proves too time-consuming.

 

You can also speed up data transfers into and out of S3 by using Transfer Acceleration. You can read more about Transfer Acceleration at http://docs.aws.amazon.com/AmazonS3/latest/dev/transfer-acceleration.html.

 

Essentially, Transfer Acceleration doesn’t change how a transfer occurs, but rather how fast it occurs. In some cases, using Transfer Acceleration can make data transfers up to six times faster, so it can have a huge impact on how fast S3 is ready to use with your application or how fast it can restore a backup onto a local system.

 

To use Transfer Acceleration, you simply select a box in the S3 Management Console when configuring your bucket.
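
If you prefer to script the change, the same setting can be applied with the CLI. This is a minimal sketch in which the bucket name is a placeholder:

# Enable Transfer Acceleration on an existing bucket (the bucket name is a placeholder)
aws s3api put-bucket-accelerate-configuration \
    --bucket my-example-bucket \
    --accelerate-configuration Status=Enabled

# Later transfers can then target the accelerate endpoint
aws s3 cp large-archive.zip s3://my-example-bucket/ \
    --endpoint-url https://s3-accelerate.amazonaws.com

Keep in mind that buckets used with Transfer Acceleration must have DNS-compliant names without periods.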

 

The best way to reduce costs when working with S3 is to ensure that you create a good archiving strategy so that AWS automatically moves objects you don’t use very often to Glacier.
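
One way to implement such an archiving strategy is a lifecycle rule. The following sketch (the bucket name, the archive/ prefix, and the 90-day threshold are all assumptions) transitions rarely used objects to Glacier automatically:

# lifecycle.json - transition objects under archive/ to Glacier after 90 days
{
  "Rules": [
    {
      "ID": "ArchiveOldObjects",
      "Filter": { "Prefix": "archive/" },
      "Status": "Enabled",
      "Transitions": [ { "Days": 90, "StorageClass": "GLACIER" } ]
    }
  ]
}

# Apply the rule to the bucket
aws s3api put-bucket-lifecycle-configuration \
    --bucket my-example-bucket \
    --lifecycle-configuration file://lifecycle.json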

 

The important thing to remember is that S3 works best as storage for objects that you’re currently using, rather than as long-term storage for objects you may use in the future.

 

Managing objects using buckets


You can manage objects in various ways and at various levels in S3. The management strategy you use determines how much time you spend administering, rather than using, the data. The following list describes the management levels:

 

Bucket: 

The settings you provide for the bucket affect every object placed in the bucket. Consequently, this coarse management option creates the general environment for all objects stored in a particular S3 bucket.

 

However, it also points to the need for creating buckets as needed, rather than using a single catchall bucket for every need. Use buckets to define an environment for a group of related objects so that you can reduce the need for micromanagement later.

 

Folder: 


It’s possible to add folders to buckets. As with folders (directories) on your local hard drive, you use folders to better categorize objects. In addition, you can use folder hierarchies to organize objects in specific ways, just as you do on your local hard drive.

 

Configuring higher-level folders to provide the best possible environment for all objects that the folder contains is the best way to reduce the time spent managing objects.

 

Object:

Configuring individual objects is an option of last resort because individual object settings tend to create error scenarios when the administrator who performed the configuration changes jobs or simply forgets about the settings.

 

The object tends to behave differently from the other objects in the folder, creating confusion. Even so, individual object settings are sometimes necessary, and S3 provides the support needed to use them.

 

Setting bucket security


AWS gives you several levels of security for the data you store in S3. However, the main security features are similar to those that you use with your local operating system (even though Amazon uses different terms to identify these features).

 

The basic security is user-based through Identity and Access Management (IAM). You can also create Access Control Lists (ACLs) similar to those used with other operating systems.

 

In addition to standard security, you can configure bucket policies that determine actions that requestors can perform against the objects in a bucket. You can also require that requestors provide authentication as part of query strings (so that every action passes through security before S3 performs it).
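
Query-string authentication is what a presigned URL provides. As a minimal sketch (the bucket and key are placeholders), the CLI can generate a link that expires after an hour:

# Generate a time-limited, signed URL for a single object
aws s3 presign s3://my-example-bucket/reports/q1.pdf --expires-in 3600

Anyone holding that URL can read that one object until the hour is up, without needing AWS credentials of their own.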

 

Even though the Amazon documentation mentions support for Payment Card Industry (PCI) and Health Insurance Portability and Accountability Act (HIPAA) compliance, you need to exercise extreme care when using S3 for this purpose. Your organization, not Amazon, is still responsible for any breaches or other problems associated with cloud storage.

 

Be sure to create a compliant configuration by employing all the AWS security measures correctly, before you store any data. In fact, you may want to work with a third-party vendor, such as Connectria (http://www.connectria.com/cloud/hipaa_aws.php), to ensure that you have a compliant setup.

 


 

Employing encryption


Making data unreadable to a third party while continuing to be able to read it yourself is what encryption is all about. Encryption employs cryptography as a basis for making the message unreadable, and cryptography is actually a relatively old science (see the article at https://access.redhat.com/blogs/766093/posts/1976023). Encryption occurs at two levels when working with AWS:

 

Data transfer: As with any Internet access, you can encrypt your data using Secure Sockets Layer (SSL) encryption to ensure that no one can read the data as it moves between S3 and your local system.

 

Storage: 


Keeping data encrypted while stored is the only way to maintain the protection provided during the data transfer. Otherwise, someone can simply wait until you store the data and then read it from the storage device.

 

Theoretically, encryption can work on other levels as well. For example, certain exploits (hacker techniques used to access your system surreptitiously) can attack machine memory and read the data while you have it loaded in an application.

 

Because your data is stored and processed on Amazon’s servers, the potential for using these other exploits is limited and therefore not discussed in this blog. However, you do need to consider these exploits for any systems on your own network.

 

Modern operating systems assist you in protecting your systems. For example, Windows includes built-in memory protection that’s effective against many exploits (see the article at https://msdn.microsoft.com/library/windows/desktop/aa366553.aspx).

 

Storage is one of the main encryption concerns because data spends more time in storage than being moved from one place to another. AWS provides the following methods for encrypting data for storage purposes:

 

Server-side encryption:


You can request that S3 encrypt data before storing it on Amazon’s servers and then decrypt it before sending it to you. The advantage of this approach is that you can often obtain higher transfer speeds, encryption occurs automatically, and the amount of work you need to perform is less.

 

However, a third party could gain access to the data during the short time it remains unencrypted between decryption and transfer to your system, so this option isn’t as safe as the other options in this list.

 

Client-side encryption:

You encrypt the data before sending it to S3. Because you encrypt the data prior to transfer, the chance that someone will read it is reduced. Remember that you also encrypt the data as part of the transfer process.


The potential problems with this approach include a requirement that the client actually performs the encryption (it’s less safe than relying on the automation that a server can provide) and the application will likely see a speed penalty as well.

 

Both:

The Amazon documentation doesn’t mention that you can doubly encrypt the data, both on the client and on the server, but this option adds a level of safety because a hacker would have to undo two levels of encryption (probably relying on different algorithms).

 

This option is the slowest because you must pay for two levels of encryption, so the application suffers a definite speed loss. However, when working with sensitive data, this method presents the safest choice.
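
If you settle on server-side encryption, you can also make it the bucket default so that every new object is encrypted without extra effort. This sketch assumes S3-managed keys (SSE-S3) and a placeholder bucket name:

# Encrypt a single upload with S3-managed keys
aws s3 cp confidential.csv s3://my-example-bucket/data/ --sse AES256

# Or make SSE-S3 the default for everything written to the bucket
aws s3api put-bucket-encryption \
    --bucket my-example-bucket \
    --server-side-encryption-configuration '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}'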

 

Using S3 events and notifications


In looking at some of the S3 screens and reviewing the documentation, you might initially get the idea that S3 provides event notifications — treating events and notifications as a single entity. Events and notifications are separate entities, however. An event occurs when something happens with S3.

 

For example, adding a new object to a bucket generates an event because something has happened to the bucket. The event always occurs. A notification is a response to the event.

 

When you tell S3 that you want to know about an event, S3 tells you every time that the event happens (or you can provide automation to do something about the event in your absence). It pays to keep the concepts of event and notification separate because combining the two leads to confusion.

 

You don’t have to subscribe to every notification that S3 provides. In fact, you can create subscriptions to events in a detailed way. S3 supports the following events, each of which you can subscribe to separately:

 

An object created: Occurs whenever the bucket receives a new object. You use the * (asterisk) wildcard or All option to select all object-creation methods. A bucket can receive objects in the following ways:

 

Put: Writes an object directly to the bucket using direct means.

 

Post: Writes an object to the bucket using HTML forms, which means obtaining information needed to upload the object from the HTTP headers.

Copy: Writes an object to the bucket using an existing object as a data source.

 

Complete multipart upload: Writes multiple objects to the bucket as individual pieces (parts) and then reassembles those parts into a cohesive unit. This notification signifies the completion of the process.

 

Object deleted: Occurs whenever a bucket has an object removed or marked for removal. You use the * (asterisk) wildcard or All option to select all object-deletion methods. A bucket can have objects removed in the following ways:

 

Delete: The object is actually removed from the bucket (making it inaccessible). This form of deletion applies to a nonversioned object or complete versioned-object removal.

 

Delete marker created: The object is only marked for removal from the bucket and becomes inaccessible as a result. This form of deletion normally applies to objects with multiple versions.

 

Object lost: Occurs whenever S3 loses a Reduced Redundancy Storage (RRS) object. An RRS object is one that you mark as being noncritical. By marking an object as RRS, you save money by storing it with the potential problem of losing the object at some point.

 

You can read more about RRS objects at https://aws.amazon.com/about-aws/whats-new/2010/05/19/announcing-amazon-s3-reduced-redundancy-storage/.

Notifications are useless unless you provide a destination for them. AWS supports the following notification destinations (you can use any or all of them as needed):

 

Amazon Simple Notification Service (Amazon SNS) topic: 


When working with SNS, AWS publishes notifications as a topic that anyone with the proper credentials can subscribe to in order to receive information about that topic.

 

Using this approach lets multiple parties easily gain access to information at the same time. This approach represents a push strategy through which AWS pushes the content to the subscribers.

 

Amazon Simple Queue Service (Amazon SQS) queue:

When working with Simple Queue Service, AWS places event notifications in a queue. One or more parties can query the queue to obtain the necessary information. Each party must separately query the queue, and a single party can remove the notification from the queue. This pull strategy works much like email and often is used by single parties or groups.

 

AWS Lambda function: Defines a method of providing an automated response to any given event.
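
Outside the console, the same wiring is a single bucket-level configuration. The sketch below uses a placeholder bucket name and queue ARN, and sends object-created and object-removed events to an SQS queue; the queue’s own access policy must separately allow S3 to send messages to it:

# notification.json - route selected events to an existing SQS queue
{
  "QueueConfigurations": [
    {
      "QueueArn": "arn:aws:sqs:us-east-1:123456789012:my-s3-events",
      "Events": ["s3:ObjectCreated:Put", "s3:ObjectRemoved:*"]
    }
  ]
}

# Apply the notification configuration to the bucket
aws s3api put-bucket-notification-configuration \
    --bucket my-example-bucket \
    --notification-configuration file://notification.json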

 

Employing versioning


During the early days of computing, the undo feature for applications was a major innovation that saved people countless hours of fixing errors. Everyone makes mistakes, so being able to undo a mistake, rather than fix it, is nice.

 

It makes everyone happier. Versioning is a kind of undo for objects. Amazon can automatically store every version of every object you place in a bucket, which makes it possible to

 

  • Return to a previous version of an object when you make a mistake in modifying the current version
  • Recover an object that you deleted accidentally

 

  • Compare different iterations of an object to see patterns in changes or to detect trends
  • Provide historical information when needed

 

Seemingly, you should make all your buckets versioned. After all, who wouldn’t want all the wonderful benefits that versioning can provide? However, versioning also comes with some costs that you need to consider (a short CLI sketch after this list shows how to turn versioning on):

  • Every version of an object consumes all the space required for that object, rather than just the changes, so the storage requirements for that object increase quickly.

 

  • After you configure a bucket to provide versioning, you can never remove versioning from it; you can only suspend versioning as needed.
  • Deleting an object doesn’t remove it. The object is marked as deleted, but it remains in place, which means that it continues to occupy storage space.

 

  • Overwriting an object results in a new object version rather than a deletion of the old object.
  • Adding versioning to a bucket doesn’t affect existing objects until you interact with those objects. Consequently, the existing objects don’t have a version number.
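
As promised, here is a short CLI sketch for enabling versioning and inspecting the versions it keeps; the bucket name and prefix are placeholders:

# Turn versioning on (it can later be suspended, but never removed)
aws s3api put-bucket-versioning \
    --bucket my-example-bucket \
    --versioning-configuration Status=Enabled

# List every stored version (and delete marker) under a prefix
aws s3api list-object-versions --bucket my-example-bucket --prefix reports/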

 

Working with Objects


The term object provides a generic reference to any sort of data, no matter what form the data may take. As you discover in blog 2, S3 stores the objects you upload in buckets.

 

These buckets provide the means to interact with the data. After you create a bucket, you can use any of a number of organizational methods to interact with the objects you upload to the bucket.

Next, use these steps to access the S3 Management Console:

 

1. Sign into AWS using your administrator account.

 

2. Navigate to the S3 Management Console at https://console.aws.amazon.com/s3.

You see a listing of any buckets you have created, along with any buckets that Amazon created automatically on your behalf. S3 bucket names are unique, which means that the bucket names you see in this blog won’t match the name of the bucket you see on your system, but the procedures apply to any bucket you create.

 

3. Click the link for the bucket you created from the instructions.

After you sign in, you can begin interacting with the S3 bucket you created. The following sections help you understand the details of working with objects in S3.

 

Creating folders


As with your local hard drive, you use folders to help organize data within buckets. And just as with local folders, you can create hierarchies of folders to store related objects more efficiently. Use the following steps to create new folders.

 

1. Navigate to the location that will hold the folder. When creating folders for the first time, you always start in the bucket.

2. Click Create Folder.

3. Type the folder name and press Enter.

The example uses a folder name of My Folder 1. You see the folder added to the list of objects in the bucket. At this point, you can access the folder if desired.

 

Simply click the folder’s link. When you open the folder, Amazon tells you that the folder is empty. Folders can hold the same objects that a bucket can. For example, if you want to add another folder, simply click Create Folder again to repeat the process you just performed.

 

When you open a folder, the navigation path shows the entire path to the folder. To move to another area of your S3 setup, simply click that entry (such as All Buckets) in the navigation path.

 

Setting bucket, folder, and object properties


After you create a bucket, a folder, or object, you can set properties for it. The properties vary by type.

 

For example, when working with a bucket, you can determine how the bucket supports permissions, static website hosting, logging, events, versioning, life-cycle management, cross-region replication, tags, requester pays, and transfer acceleration. This blog discusses most of these properties because you set them to perform specific tasks.

 

Likewise, folders let you set details about how the folder reacts. For example, you can set the storage class, which consists of Standard, Standard – Infrequent Access, or Reduced Redundancy. The storage class determines how the folder stores data.

 

Using the Standard – Infrequent Access option places the folder content in a semi-archived state. With the content in that state, you pay less money to store the data but also have longer retrieval times.

 

(The article at https://aws.amazon.com/s3/storage-classes/ describes the storage classes in more detail.) You can also set the kind of encryption used to protect folder content.

 

Objects offer the same configuration options as folders and add the capability to set permissions and change the object metadata. The metadata describes the object type and provides information that an application might use to find out more about it.

 

The standard metadata entry defines a content type, but you can add any metadata needed to identify the object fully. To set properties for an individual item, check that item’s entry in the management console and choose Actions ➪ Properties. Make any changes you want and click Save to save the changes.

 

If you select multiple items, the Properties pane shows the properties that are common to all the items you selected. Any changes you make affect all the items. When you finish setting properties, you can close the Properties pane by clicking the X in the corner of the pane.

 

Deleting folders and objects


At some point, you may find that you need to delete a folder or object because it no longer contains pertinent information. (Remember, when you delete folders or objects in buckets with versioning enabled, you don’t actually remove the folder or object. The item is simply marked for deletion.) Use the following steps to delete folders and objects:

 

1. Click the box next to each folder or object you want to delete. The box changes color and becomes solid to show that you have selected it. (Clicking the box a second time deselects the folder or object.)

2. Choose Actions ➪ Delete. You see a dialog box asking whether you want to delete the selected items.

3. Click OK.

The Transfers pane opens to display the status of the operation. When the task is complete, you see Done as the status. Selecting the Automatically Clear Finished Transfers option automatically clears messages for tasks that complete successfully. Any tasks that end with an error remain.

 

You can also manually clear a transfer by right-clicking its entry and choosing Clear from the context menu.

 

4. Click the X in the upper-right corner of the Transfers pane to close it. You see the full entry information as you normally do.

 

Uploading objects


The only method that you can use to upload an object from the console is the same basic technique you saw earlier: clicking Upload, selecting the files that you want to send to S3, and then starting the upload process.

 

That basic process, however, leaves out some useful additional settings changes you can make. After you select one or more objects to upload, click Set Details to display the Set Details dialog box, which lets you choose a storage class and determine whether you want to rely on server-side encryption.

 

The next step is to click Set Permissions. You use the Set Permissions dialog box, to define the permissions for the objects you want to upload. Amazon automatically assumes that you want full control over the objects that you upload.

 

The Make Everything Public checkbox lets everyone have access to the objects you upload. You can also create custom permissions that affect access to all the objects you upload at a given time.

 

Finally, you can click Set Metadata to display the Set Metadata dialog box. In general, you want Amazon to determine the content types automatically. However, you can choose to set the content type manually. This dialog box also enables you to add custom metadata that affects all the objects that you upload at a given time. 
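
For comparison, the CLI lets you supply the same choices as flags on a single upload. This sketch uses placeholder bucket, key, metadata, and permission values:

# Upload with an explicit storage class, server-side encryption, custom metadata, and a public-read ACL
aws s3 cp manual.pdf s3://my-example-bucket/docs/manual.pdf \
    --storage-class STANDARD_IA \
    --sse AES256 \
    --metadata department=support,language=en \
    --acl public-read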

 

To perform other kinds of uploads to S3, you must either resort to writing application code using one of the AWS Application Programming Interfaces (APIs) or rely on another service.

 

For example, you can use Amazon Kinesis Firehose (https://aws.amazon.com/kinesis/firehose/) to stream data to S3. The techniques used to perform these advanced sorts of data transfers are outside the scope of this blog, but you need to know that they exist.

 

Retrieving objects


At some point, you need to retrieve your data in order to use it. Any applications you build can use S3 directly online. You can also create a static website to serve up the data as you would any other website.

 

The option discussed in this section of the blog is to download the data directly to your local drive. The following steps tell how to perform this task.

 

1. Click the box next to each object you want to download. The box changes color and becomes solid to show that you have selected it.

(Clicking the box a second time deselects the object.) You can’t download an entire folder from the console. You must select individual objects to download (even if you want to download every object in a folder). In addition, if you want to preserve the folder hierarchy, you must rebuild it on your local drive. (The CLI sketch after these steps avoids both limitations.)

 

2. Choose Actions ➪ Download.

You see a dialog box that provides a link to download the selected objects. Clicking the Download link performs the default action for your browser with the objects. For example, if you select a .jpg file and click Download, your browser will likely display it in a new tab of the current window.

 

Right-click the Download link to see all the actions that your browser provides for interacting with the selected objects. Of course, this list varies by browser and the kinds of extensions you add to it.

 

3. Right-click the Download link and choose an action to perform with the selected objects.

Your browser accomplishes the action as it normally would when right-clicking a link and choosing an option. Some third-party products may experience permissions problems when you attempt the download.

 

4. Click OK. You see the full entry information as it normally appears.
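
The CLI sketch mentioned in step 1 (bucket and paths are placeholders) downloads a single object, or an entire prefix with its hierarchy intact:

# Download a single object
aws s3 cp s3://my-example-bucket/docs/manual.pdf ./manual.pdf

# Download a whole "folder" (prefix), recreating the hierarchy locally
aws s3 cp s3://my-example-bucket/docs/ ./docs/ --recursive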

 

Performing AWS Backups


Data is subject to all sorts of calamities — some localized, some not. In days gone by, individuals and organizations alike made localized backups of their data. They figured out before long that multiple backups worked better, and storing them in a data safe worked better still. 

 

However, major tragedies, such as flooding, proved that localized backups didn’t work well, either. Unfortunately, storing a backup in a single outside location was just as prone to trouble.

 

That’s where using cloud backups comes into play. Using S3 to create a backup means that the backups appear in more than one location and that Amazon creates more than one copy for you, reducing the risk that a single catastrophic event could result in a loss of data. The following sections explore using S3 as a backup solution.

 

Performing a manual backup


Amazon’s backup suggestions don’t actually perform a backup in the traditional sense. You don’t see an interface where you select files to add to the backup. You don’t use any configuration methods for whole drive, image, incremental, or differential backups.

 

After you get through all the hoopla, what it comes down to is that you create an archive (such as a .zip file) of whatever you want in the backup and upload it to an S3 bucket. Yes, it’s a backup, but it’s not what most people would term a full-backup solution.

Certainly, it isn’t an enterprise solution. However, it can still work for an individual, home office, or small business. The “Working with Objects” section, earlier in this blog, tells you how to perform all the required tasks using the console.

 

Automating backups


You can provide some level of backup automation when working with S3. However, you need to use the initial Command Line Interface (CLI) instructions found at https://aws.amazon.com/getting-started/tutorials/backup-to-s3-cli/ to make this task possible.

 

After you have CLI installed on your system, you can use your platform’s batch-processing capability to perform backups.

 

For example, a Windows system comes with Task Scheduler to automate the use of batch files at specific times. Of course, now you’re talking about performing a lot of programming, rather than using a nice, off-the-shelf package.
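
As a rough illustration of how little code is involved, the whole job can be an aws s3 sync call wrapped in a script and scheduled with cron (or an equivalent Task Scheduler job on Windows). The paths and bucket name here are placeholders:

#!/bin/bash
# backup.sh - mirror a local directory into the backup bucket
# --delete removes objects that no longer exist locally, keeping the copy exact
aws s3 sync /home/data s3://my-backup-bucket/daily/ \
    --storage-class STANDARD_IA \
    --delete

# Example crontab entry: run the script every night at 01:30
# 30 1 * * * /usr/local/bin/backup.sh >> /var/log/s3-backup.log 2>&1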

 

Unless your organization just happens to have the right person in place, the Amazon solution to automation is woefully inadequate.

 

You do have other alternatives.

For example, if you have a Python developer in-house, you can use the pip install awscli command to install AWS CLI support on your system.

The process is straightforward and incredibly easy compared to the Amazon alternative. You likely need administrator privileges to get the installation to work as intended.

 

After you have the required Python support in place, you can rely on S3-specific Python backup packages, such as the one at https://pypi.python.org/pypi/s3-backups, to make your job easier. However, you still need programming knowledge to make this solution work. 

 

There are currently no optimal solutions for the administrator who has no programming experience whatsoever, but some off-the-shelf packages do look promising.

 

For example, S3cmd (http://s3tools.org/s3cmd) and S3Express (on the same page) offer a packaged solution for handling backups from the Linux, Mac, and Windows command line. Look for more solutions to arrive as S3 continues to gain in popularity as a backup option.

 

Developing a virtual tape library


Most businesses likely have a backup application that works well. The problem is making the backup application see online storage, such as that provided by S3, as a destination.

 

The AWS Storage Gateway (https://aws.amazon.com/storagegateway/) provides a solution in the form of the Gateway-Virtual Tape Library (Gateway-VTL) (https://aws.amazon.com/about-aws/whats-new/2013/11/05/aws-storage-gateway-announces-gateway-virtual-tape-library/).

 

This solution lets you create the appearance of virtual tapes that your backup application uses in place of real tapes. Consequently, you get to use your existing backup product, but you send the output to the cloud instead of having to maintain backups in-house.

 

Amazon does provide you with a trial period to test how well a Gateway-VTL solution works for you. Just how well it works depends on a number of factors, including the speed of your Internet connection.

 

The most important consideration is that a cloud-based solution is unlikely to work as fast as dedicated high-speed local connectivity to a dedicated storage device.

 

However, the cost of using Gateway-VTL may be considerably lower than working with local devices. You can check out the pricing data for this solution at https://aws.amazon.com/storagegateway/pricing/.

 

Using S3 to Host a Static Website


Another way to use S3 is to host a static website. A static website is one in which you deliver the content to the viewer precisely as stored.

 

Most modern websites, especially those used by business, rely on dynamic techniques that create custom content from various sources.  However, a static website can still serve useful purposes for content that doesn’t change often.

 

For example, you can potentially use it for service manuals, corporate forms, or simply as a welcome page for a much larger site. The point is to be aware at the outset that static sites can contain all sorts of content, but the content always appears precisely as you put it together.

 

Creating a static website from your S3 bucket requires a few small changes. The following steps show how to make these changes and then help you view the static site on a browser.

 

1. Select the file in the bucket that you want to use for the index page and click Properties.

You can use any file for this example. In fact, the example uses a .jpg file just for demonstration purposes. S3 shows a list of properties for the file.

 

2. Open the Permissions section. S3 shows the permissions associated with this file.

3. Click Add More Permissions. S3 adds a new permissions entry to the list.

4. Configure the property to allow Everyone the Open/Download permission.

5. Click Save. S3 makes the change to the file permissions.

 

6.  Select the bucket that you want to use for the static website; then click Properties. The Properties pane opens on the right side of the page.

 

7. Open the Static Website Hosting entry. S3 shows the static website configuration options.

 

8. Choose Enable Website Hosting.


S3 asks you to provide the name of a file to use for an index page and another file to use for an error page. You need to provide only an index page entry; the error page entry is optional. Normally, you use HTML-coded pages for this task, but any file will do, as shown previously, as long as the required permissions are in place. (A CLI alternative appears after these steps.)

 

9. Click the Endpoint link. Your browser opens the static site at the bucket’s website endpoint URL.
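
As the CLI alternative mentioned earlier, the same result can be scripted. The bucket name, documents, and policy are placeholders, and the public-read policy applies to every object in the bucket:

# Enable static website hosting with an index and error document
aws s3 website s3://my-example-bucket/ \
    --index-document index.html --error-document error.html

# Grant public read access to all objects in the bucket
aws s3api put-bucket-policy --bucket my-example-bucket --policy '{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "PublicReadGetObject",
    "Effect": "Allow",
    "Principal": "*",
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::my-example-bucket/*"
  }]
}'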

Combining S3 with Lambda

Lambda enables you to run code on AWS without provisioning or managing a server. AWS provides many other solutions for running code, but this particular solution will appeal to administrators who need to perform a task quickly and without a lot of fuss.

 

The Lambda function that you want to use must exist before you try to create a connection between it and S3. You use the following steps to connect S3 with Lambda:

 

1. Select the bucket that you want to connect to Lambda and then click Properties. The Properties pane opens on the right side of the page.

 

2. Open the Events entry.


S3 shows the event configuration options. Even though this section discusses Lambda, note that you can create other sorts of actions based on S3 events, including sending to an SNS topic or a Simple Queue Service queue, as described in the “Using S3 events and notifications” section, earlier in this blog.

 

3. Type a name for the event. The example uses MyEvent.

 

4. Click the Events field.

You see a list of the potential events that you can use to initiate an action. The “Using S3 events and notifications” section, earlier in this blog, describes the various events.

 

5. Choose one or more of the events.

You can choose as many events as desired. Every time one of these events occurs, S3 will call your Lambda function.

 

6. Optionally type a value in the Prefix field to limit events to objects that start with a particular set of characters. The prefix can include path information so that events occur only when something changes in a particular folder or folder hierarchy.

 

7. Optionally, type a value in the Suffix field to limit events to objects that end with a particular set of characters.

8. Select the Lambda Function in the Send To field.

9. Select a Lambda function to execute when the events occur.

You can choose an existing Lambda function or choose the Add Lambda Function ARN option to access a Lambda function by its Amazon Resource Name (ARN).

 

10. Click Save. The events you selected for the S3 bucket will now invoke the Lambda function. (See the permission note after these steps.)
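
One permission note: S3 must also be allowed to invoke the function. If you script the setup instead of using the console, you grant that permission explicitly; the function name, bucket, and account ID below are placeholders:

# Allow S3 (for the named bucket and account only) to invoke the Lambda function
aws lambda add-permission \
    --function-name MyFunction \
    --statement-id s3-invoke \
    --action lambda:InvokeFunction \
    --principal s3.amazonaws.com \
    --source-arn arn:aws:s3:::my-example-bucket \
    --source-account 123456789012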

 

Introducing AWS CloudTrail


As we learned in the previous blog, AWS provides a wide variety of tools and managed services which allow you to safeguard your applications running on the cloud, such as AWS WAF and AWS Shield.

 

AWS CloudTrail provides you with the ability to log every single action taken by a user, service, role, or even API, from within your AWS account.

 

Each action recorded is treated as an event which can then be analyzed for enhancing the security of your AWS environment. The following are some of the key benefits that you can obtain by enabling CloudTrail for your AWS accounts:

 

In-depth visibility: Using CloudTrail, you can easily gain better insights into your account's usage by recording each user's activities, such as which user initiated a new resource creation, from which IP address the request was initiated, which resources were created and at what time, and much more!

 

Easier compliance monitoring: With CloudTrail, you can easily record and log events occurring within your AWS account, whether they may originate from the Management Console, or the AWS CLI, or even from other AWS tools and services.

 

The best thing about this is that you can integrate CloudTrail with another AWS service, such as Amazon CloudWatch, to alert and respond to out-of-compliance events.

 

Security automation:


As we saw in the previous blog, automating responses to security threats not only enables you to mitigate the potential threats faster, but also provides you with a mechanism to stop all further attacks.

 

The same can be applied to AWS CloudTrail as well! With its easy integration with Amazon CloudWatch events, you can now create corresponding Lambda functions that trigger automatically each time compliance is not met, all in a matter of seconds!

 

With these key points in mind, let's have a quick look at some of CloudTrail's essential concepts and terminologies:

 

Events: Events are the basic unit of measurement in CloudTrail. Essentially, an event is nothing more than a record of a particular activity either initiated by the AWS services, roles, or even an AWS user. These activities are all logged as API calls that can originate from the Management Console, the AWS SDK, or even the AWS CLI as well.

 

By default, events are retained in CloudTrail's event history for a period of 7 days. You can view, search, and even download these events by leveraging the event history feature provided by CloudTrail.

 

Trails: Trails are essentially the delivery mechanism, using which events are dumped to S3 buckets. You can use these trails to log specific events within specific buckets, as well as to filter events and encrypt the transmitted log files. By default, you can have a maximum of five trails created per AWS region, and this limit cannot be increased.

 

CloudTrail Logs: Once your CloudTrail starts capturing events, it sends these events to an S3 bucket in the form of a CloudTrail Log file. The log files are JSON text files that are compressed using gzip (.gz). Each file can contain one or more events within itself.

 

Here is a simple representation of what a CloudTrail Log looks like. In this case, the event was created when I tried to add an existing user by the name of Mike to an administrator group using the AWS Management Console:

{"Records": [{
"eventVersion": "1.0",
"userIdentity": {
"type": "IAMUser",
"principalId": "12345678",
"arn": "arn:aws:iam::012345678910:user/yohan",
"accountId": "012345678910",
"accessKeyId": "AA34FG67GH89",
"userName": "Alice",
"sessionContext": {"attributes": {
"mfaAuthenticated": "false",
"creationDate": "2017-11-08T13:01:44Z"
}}
},
"eventTime": "2017-11-08T13:09:44Z",
"eventSource": "http://iam.amazonaws.com",
"eventName": "AddUserToGroup",
"awsRegion": "us-east-1",
"sourceIPAddress": "127.0.0.1",
"userAgent": "AWSConsole",
"requestParameters": {
"userName": "Mike",
"groupName": "administrator"
},
"responseElements": null
}]}

 

You can view your own CloudTrail Log files by visiting the S3 bucket that you specify during the trail's creation. Each log file is named uniquely using the following format:

 

AccountID_CloudTrail_RegionName_YYYYMMDDTHHmmZ_UniqueString.json.gz

Where:

  • AccountID: Your AWS account ID.
  • RegionName: The AWS region where the event was captured, for example, us-east-1.
  • YYYYMMDDTHHmmZ: Specifies the year, month, day, hour (in 24-hour format), and minutes. The Z indicates time in UTC.
  • UniqueString: A randomly generated 16-character string used so that log files are never overwritten.

With the basics in mind, let's quickly have a look at how you can get started with CloudTrail for your own AWS environments!

 

Working with AWS CloudTrail


AWS CloudTrail is a fairly simple and easy-to-use service that you can get started with in a couple of minutes. In this section, we will walk through a simple setup of a CloudTrail Trail using the AWS Management Console itself.

 

Creating your first CloudTrail Trail

To get started, log in to your AWS Management Console and filter the CloudTrail service from the AWS services filter. On the CloudTrail dashboard, select the Create Trail option to get started:

 

This will bring up the Create Trail wizard. Using this wizard, you can create a maximum of five trails per region. To begin with, type a suitable name for the Trail into the Trail name field.

 

Next, you can either opt to Apply trail to all regions or only to the region out of which you are currently operating. Selecting all regions enables CloudTrail to record events from each region and dump the corresponding log files into an S3 bucket that you specify.

 

Alternatively, selecting to record out of one region will only capture the events that occur from the region out of which you are currently operating. In my case, I have opted to enable the Trail only for the region I'm currently working out of. In the subsequent sections, we will learn how to change this value using the AWS CLI:

 

Next, in the Management events section, select the type of events you wish to capture from your AWS environment. By default, CloudTrail records all management events that occur within your AWS account.

 

These events can be API operations, such as events caused due to the invocation of an EC2 RunInstances or TerminateInstances operation, or even non-API based events, such as a user logging into the AWS Management Console, and so on. For this particular use case, I've opted to record All management events.

 

Selecting the Read-only option will capture all the GET API operations, whereas the Write-only option will capture only the PUT API operations that occur within your AWS environment.

 

Moving on, in the Storage location section, provide a suitable name for the S3 bucket that will store your CloudTrail Log files. This bucket will store all your CloudTrail Log files, irrespective of the regions the logs originated from. You can alternatively select an existing bucket from the S3 bucket selection field:

 

Next, from the Advanced section, you can optionally configure a Log file prefix. By default, the logs will automatically get stored under a folder-like hierarchy that is usually of the form AWSLogs/ACCOUNT_ID/CloudTrail/REGION.

 

You can also opt to Encrypt log files with the help of an AWS KMS key. Enabling this feature is highly recommended for production use. Selecting Yes in the Enable log file validation field enables you to verify the integrity of the delivered log files once they are delivered to the S3 bucket.

 

Finally, you can even enable CloudTrail to send you notifications each time a new log file is delivered to your S3 bucket by selecting Yes against the Send SNS notification for every log file delivery option.

 

This will provide you with an additional option to either select a predefined SNS topic or alternatively create a new one specifically for this particular CloudTrail. Once all the required fields are filled in, click on Create to continue.
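
If you prefer to script the same setup, a roughly equivalent trail takes two CLI calls. The S3 bucket name is a placeholder, and that bucket must already carry a bucket policy that lets CloudTrail write to it:

# Create a single-region trail that delivers logs to an existing bucket
aws cloudtrail create-trail \
    --name useast-prod-CloudTrail-01 \
    --s3-bucket-name my-cloudtrail-logs-bucket

# Trails do not record anything until logging is started
aws cloudtrail start-logging --name useast-prod-CloudTrail-01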

 

With this, you should be able to see the newly created Trail by selecting the Trails option from the CloudTrail dashboard's navigation pane, as shown in the following screenshot:

 

Viewing and filtering captured CloudTrail Logs and Events


With the Trail created, you can now view the captured events and filter them using the event history option from the CloudTrail dashboard's navigation pane. Here, you can view the last 7 days of captured events, and even filter specific ones by using one or more supporting filter attributes.

 

Here's a quick look at the Filter attributes that you can use in conjunction with the Time range to extract the required events and logs:

  • Event ID: Each event captured by CloudTrail has a unique ID that you can filter and view.
  • Event name: The name of the event, for example, the EC2 events RunInstances and DescribeInstances.
  • Event source: The AWS service to which the request was made, for example, iam.amazonaws.com or ec2.amazonaws.com.

 

Resource name: The name or ID of the resource referenced by the event. For example, a bucket named least-prod-WordPress-code or an instance ID i-1234567 for an EC2 instance.

 

Resource type: The type of resource referenced by the event. For example, a resource type can be a Bucket for S3, an Instance for EC2, and so on.

 

User name: The name of the user that created or performed an action on the said event. For example, an IAM user logging into the AWS Management Console, and so on:

 

Once you have selected a particular filter and provided its associated attribute value, you can use the Time range to narrow your search results based on a predefined time window.
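
The same filtering is available from the CLI through the lookup-events command. For example, the following sketch pulls recent CreateBucket events from the event history:

# List up to five recent CreateBucket events from the event history
aws cloudtrail lookup-events \
    --lookup-attributes AttributeKey=EventName,AttributeValue=CreateBucket \
    --max-results 5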

 

To analyze further, you can select the View event option present in the details pane of an Event as well. Selecting this option displays the event in JSON format, as shown in the following code:

{
    "eventVersion": "1.05",
    "userIdentity": {
        "type": "IAMUser",
        "principalId": "AIDAIZZ25SDDZAQTF2K3I",
        "arn": "arn:aws:iam::01234567890:user/yohan",
        "accountId": "01234567890",
        "accessKeyId": "ASDF56HJERW9PQRST",
        "userName": "yohan",
        "sessionContext": {
            "attributes": {
                "mfaAuthenticated": "false",
                "creationDate": "2017-11-07T08:13:26Z"
            }
        },
        "invokedBy": "signin.amazonaws.com"
    },
    "eventTime": "2017-11-07T08:25:32Z",
    "eventSource": "s3.amazonaws.com",
    "eventName": "CreateBucket",
    "awsRegion": "us-east-1",
    "sourceIPAddress": "80.82.129.191",
    "userAgent": "signin.amazonaws.com",
    "requestParameters": {
        "bucketName": "sometempbucketname"
    },
    "responseElements": null,
    "requestID": "163A30A312B21AB2",
    "eventID": "e7b7dff6-f196-4358-be64-aae1f5e7fed6",
    "eventType": "AwsApiCall",
    "recipientAccountId": "01234567890"
}

You can additionally select the Download icon and select whether you wish to export all the logs using the Export to CSV or Export to JSON option. You can alternatively even download the log files by accessing your CloudTrail S3 bucket and downloading the individual compressed JSON files, as per your requirements.

 

With this, we come towards the end of this section. You can use these same steps and create different Trails for capturing data as well as management activities. In the next section, we will see how we can leverage the AWS CLI and update our newly-created Trail.

 

Modifying a CloudTrail Trail using the AWS CLI


With the Trail in place, you can now use either the AWS Management Console or the AWS CLI to modify its settings. In this case, we will look at how to perform simple changes to the newly created Trail using the AWS CLI itself.

 

Before proceeding with this section, however, it is important that you have installed and configured the AWS CLI on your desktop/laptop, based on the guide provided at http://docs.aws.amazon.com/cli/latest/userguide/installing.html.

 

Once the CLI is installed and configured, we can run some simple commands to verify that it is working correctly. To start off, let's first check the status of our newly-created Trail by using the describe-trails command, as shown here:

 

# aws cloudtrail describe-trails

This will display the essential properties of your CloudTrail Trails, such as the Name, the Trail ARN, whether the log file validation is enabled or not, and whether the Trail is a multi-regional Trail or it belongs to a single region.

 

In this case, the IsMultiRegionTrail value is set to false, which means that the Trail will only record events for its current region, that is, us-east-1. Let's go ahead and modify this using the AWS CLI.

 

To do so, we will be using the update-trail command:

aws cloudtrail update-trail \
    --name useast-prod-CloudTrail-01 \
    --is-multi-region-trail

 

The preceding command simply changes the IsMultiRegionTrail value from false to true. You can verify the change by running the describe-trails command again.

 

Similarly, you can use the update-trail command to change other settings for your CloudTrail Trail, such as enabling the log file validation feature, as described in the following command:

 

# aws cloudtrail update-trail \

--name useast-prod-CloudTrail-01 \

--enable-log-file-validation

Finally, you can even use the AWS CLI to check the current status of your Trail by executing the get-trail-status command, as shown in the following command:

aws cloudtrail get-trail-status \
    --name useast-prod-CloudTrail-01

 

Apart from these values, the get-trail-status command will additionally show two more fields (LatestNotificationError and LatestDeliveryError) in case an Amazon SNS subscription fails or if a CloudTrail Trail was unsuccessful at writing the events to an S3 bucket.

 

Monitoring CloudTrail Logs using CloudWatch


One of the best features of using CloudTrail is that you can easily integrate it with other AWS services for an enhanced security auditing and governance experience. One such service that we are going to use and explore here with CloudTrail is Amazon CloudWatch.

 

Using CloudWatch, you can easily set up custom metric filters and an array of alarms that can send notifications to the right set of people in case a specific security or governance issue occurs in your AWS environment.

 

To get started with CloudWatch using CloudTrail, you will first need to configure your Trail to send the captured log events to CloudWatch Logs. This can be easily configured using both the AWS Management Console and the AWS CLI.

 

Next, once this is done, you will be required to define custom CloudWatch metric filters to evaluate the log events for specific matches. Once a match is made, you can then additionally configure CloudWatch to trigger corresponding alarms, send notifications, and even perform a remediation action based on the type of alarm generated.

 

Here is a diagrammatic representation of CloudTrail's integration with CloudWatch:

In this section, we will be using the AWS CLI to integrate the Trail's logs with Amazon CloudWatch Logs. First, we will need to create a new CloudWatch Log Group using the following command:

aws logs create-log-group --log-group-name useast-prod-CloudTrail-LG-01

 

Next, you will need to extract and maintain the newly created Log Group's ARN for the forthcoming steps. To do so, type in the following command and make a note of the Log Group's ARN, as shown here:

aws logs describe-log-groups

 

With the Log Group successfully created, we will now need to create a new IAM Role that will essentially enable CloudTrail to send its logs over to the CloudWatch Log Group. To do so, we first need to create a policy document that assigns the AssumeRole permission to our CloudTrail Trail. Create a new file and paste the following contents into that file. Remember to create the file with a .json extension:

# vi policy.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "",
            "Effect": "Allow",
            "Principal": {
                "Service": "cloudtrail.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

With the file created, use the create-role command to create the role with the required permissions for CloudTrail:

aws iam create-role --role-name useast-prod-CloudTrail-Role-01 \
    --assume-role-policy-document file://policy.json

 

Once this command has executed, make a note of the newly created role's ARN. Next, copy and paste the following role policy document into a new file.

 

This policy document grants CloudTrail the necessary permissions to create a CloudWatch Logs log stream in the Log Group that you created a while back, so as to deliver the CloudTrail events to that particular log stream:

# vi permissions.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "CloudTrailCreateLogStream",
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogStream"
            ],
            "Resource": [
                "<YOUR_LOG_GROUP_ARN>"
            ]
        },
        {
            "Sid": "CloudTrailPutLogEventsToCloudWatch",
            "Effect": "Allow",
            "Action": [
                "logs:PutLogEvents"
            ],
            "Resource": [
                "<YOUR_LOG_GROUP_ARN>"
            ]
        }
    ]
}
Next, run the following command to apply the permissions to the role. Remember to provide the name of the policy document that we created during the earlier steps here:

aws iam put-role-policy --role-name useast-prod-CloudTrail-Role-01 \
    --policy-name cloudtrail-policy \
    --policy-document file://permissions.json

 

The final step is to update the Trail with the Log Group ARN as well as the CloudWatch Logs role ARN, using the following command snippet:

aws cloudtrail update-trail --name useast-prod-CloudTrail-01 \
    --cloud-watch-logs-log-group-arn <YOUR_LOG_GROUP_ARN> \
    --cloud-watch-logs-role-arn <YOUR_ROLE_ARN>

 

With this, you have now integrated your CloudTrail Logs to seamlessly flow into the CloudWatch Log Group that we created. You can verify this by viewing the Log Groups provided under the CloudWatch Logs section of your CloudWatch dashboard.
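
You can also confirm the association from the CLI itself; a quick check, assuming the same Trail name used throughout this section:

# Should print the Log Group ARN you attached to the Trail
aws cloudtrail describe-trails --trail-name-list useast-prod-CloudTrail-01 \
    --query 'trailList[0].CloudWatchLogsLogGroupArn'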

 

In the next section, we will leverage this newly created Log Group and assign a custom metric as well as an alarm to it for monitoring and alerting purposes.

 

Creating custom metric filters and alarms for monitoring CloudTrail Logs

With the Log Group created and integrated with the CloudTrail Trail, we can now continue to create and assign custom metric filters as well as alarms. These alarms can be leveraged to trigger notifications whenever a particular compliance or governance issue is identified by CloudTrail.

 

To begin with, let's first create a custom metric filter using CloudWatch Logs. In this case, we will be creating a simple filter that triggers a CloudWatch alarm each time an S3 bucket API call is made. This API call can be either a simple PUT or DELETE operation on the bucket's policies, life cycle, and so on:

 

Log in to your Amazon CloudWatch dashboard at https://console.aws.amazon.com/cloudwatch/. Once logged in, select the Logs option from the navigation pane. Select the Log Group that we created a while back, and opt for the Create Metric Filter option.

 

Here, in the Create Metric Filter and Assign a Metric page, start off by providing a suitable Filter Name for the new metric, followed by populating the Filter Pattern option with the following snippet:

 

{ ($.eventSource = s3.amazonaws.com) && (($.eventName = PutBucketAcl) || ($.eventName = PutBucketPolicy) || ($.eventName = PutBucketLifecycle) || ($.eventName = DeleteBucketPolicy) || ($.eventName = DeleteBucketLifecycle)) }

 

Once done, type in a suitable Metric Namespace value followed by a Metric Name as well. Leave the rest of the values to their defaults, and select the option Create a filter to complete the process.
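
The same filter can also be created from the CLI with put-metric-filter; a minimal sketch, where the metric name S3BucketConfigChanges and the namespace CloudTrailMetrics are hypothetical values of my own choosing:

aws logs put-metric-filter \
    --log-group-name useast-prod-CloudTrail-LG-01 \
    --filter-name S3BucketConfigChanges \
    --filter-pattern '{ ($.eventSource = s3.amazonaws.com) && (($.eventName = PutBucketAcl) || ($.eventName = PutBucketPolicy) || ($.eventName = PutBucketLifecycle) || ($.eventName = DeleteBucketPolicy) || ($.eventName = DeleteBucketLifecycle)) }' \
    --metric-transformations metricName=S3BucketConfigChanges,metricNamespace=CloudTrailMetrics,metricValue=1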

 

With this step completed, you now have a working CloudWatch filter up and running. In order to assign this particular filter an alarm, simply select the Create Alarm option adjacent to the filter, as depicted in the following screenshot:

 

Creating an alarm is a fairly straightforward process, and I'm sure you will have no trouble setting it up. Start off by providing a Name and an optional Description for your alarm, followed by configuring the trigger by setting the event count as >= 1 for 1 consecutive period.

 

Also remember to set up the Actions section by selecting an SNS Notification List or, alternatively, creating a new one. With all the settings configured, select the Create Alarm option to complete the process.
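
The alarm itself can be created from the CLI as well; a sketch that reuses the hypothetical metric name and namespace from the previous snippet (the SNS topic ARN is a placeholder):

aws cloudwatch put-metric-alarm \
    --alarm-name useast-prod-S3BucketConfigChanges-Alarm \
    --metric-name S3BucketConfigChanges \
    --namespace CloudTrailMetrics \
    --statistic Sum \
    --period 300 \
    --evaluation-periods 1 \
    --threshold 1 \
    --comparison-operator GreaterThanOrEqualToThreshold \
    --alarm-actions <YOUR_SNS_TOPIC_ARN>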

 

With this step completed, the only thing remaining is to give the filter a try! Log in to your S3 dashboard and create a new bucket, or alternatively, update the bucket policy of an existing one.

 

The CloudTrail Trail will pick up this change and send the logs to your CloudWatch Log Group, where our newly created metric filter triggers an alarm by notifying the respective cloud administrator!

 

Simply awesome, isn't it? You can create more custom filters and alarms to configure CloudWatch's notifications as per your requirements.

 

In the next section, we will be looking at a fairly simple and automated method for creating and deploying multiple CloudWatch alarms using a single CloudFormation template.

 

Automating deployment of CloudWatch alarms for AWS CloudTrail

 CloudWatch alarms

As discussed in the previous section, you can easily create different CloudWatch metrics and alarms for monitoring your CloudTrail Log files. Luckily for us, AWS provides a really simple and easy to use CloudFormation template, which allows you to get up and running with a few essential alarms in a matter of minutes!

 

The best part of this template is that you can extend the same by adding your own custom alarms and notifications as well. So without any further ado, let's get started with it. The template itself is fairly simple and easy to work with.

 

You can download a version at https://s3-us-west-2.amazonaws.com/awscloudtrail/cloudwatch-alarms-for-cloudtrail-api-activity/CloudWatch_Alarms_for_CloudTrail_API_Activity.json.

 

At the time of writing this blog, this template supports the creation of metric filters for the following set of AWS resources:

  • Amazon EC2 instances
  • IAM policies
  • Internet gateways
  • Network ACLs
  • Security groups

 

1. To create and launch this CloudFormation stack, head over to the CloudFormation dashboard by navigating to https://console.aws.amazon.com/cloudformation.

2. Next, select the option Create Stack to bring up the CloudFormation template selector page.

Paste https://s3-us-west-2.amazonaws.com/awscloudtrail/cloudwatch-alarms-for-cloudtrail-api-activity/CloudWatch_Alarms_for_CloudTrail_API_Activity.json in the Specify an Amazon S3 template URL field, and click on Next to continue. In the Specify Details page, provide a suitable Stack name and fill out the following required parameters:

Email: A valid email address that will receive all SNS notifications. You will have to confirm this email subscription once the template is successfully deployed.

 

LogGroupName: The name of the Log Group that we created earlier in this blog.

 

Once the required values are filled in, click on Next to proceed. Review the settings of the template on the Review page and finally select the Create option to complete the process. The template takes a few minutes to finish creating and configuring the required alarms.
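
If you prefer to deploy the same template from the CLI, a create-stack call along these lines should do it; the stack name and email address below are placeholders of my own:

aws cloudformation create-stack \
    --stack-name cloudtrail-api-activity-alarms \
    --template-url https://s3-us-west-2.amazonaws.com/awscloudtrail/cloudwatch-alarms-for-cloudtrail-api-activity/CloudWatch_Alarms_for_CloudTrail_API_Activity.json \
    --parameters ParameterKey=Email,ParameterValue=you@example.com \
                 ParameterKey=LogGroupName,ParameterValue=useast-prod-CloudTrail-LG-01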

 

So far, we have seen how to integrate CloudTrail's Log files with CloudWatch Log Groups for configuring custom metrics as well as alarms for notifications. But how do you effectively analyze and manage these logs, especially if you have extremely large volumes to deal with?

 

This is exactly what we will be learning about in the next section, along with the help of yet another awesome AWS service called Amazon Elasticsearch!

 

Analyzing CloudTrail Logs using Amazon Elasticsearch

Amazon Elasticsearch

Log management and analysis for many organizations starts and ends with just three letters: E, L, and K, which stand for Elasticsearch, Logstash, and Kibana. These three open source products are essentially used together to aggregate, parse, search, and visualize logs at an enterprise scale:

 

Logstash: Logstash is primarily used as a log collection tool. It is designed to collect, parse, and store logs originating from multiple sources, such as applications, infrastructure, operating systems, tools, services, and so on.

 

Elasticsearch: With all the logs collected in one place, you now need a query engine to filter and search through these logs for particular events. That's exactly where Elasticsearch comes into play.

 

Elasticsearch is basically a search server based on the popular information retrieval software library, Lucene. It provides a distributed, full-text search engine along with a RESTful web interface for querying your logs.

 

Kibana: Kibana is an open source data visualization plugin, used in conjunction with Elasticsearch. It provides you with the ability to create and export your logs into various visual graphs, such as bar charts, scatter graphs, pie charts, and so on.
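
As a quick illustration of that RESTful interface, once a cluster is up you can search the indexed CloudTrail events with a plain HTTP call; this is only a sketch, where <YOUR_ES_ENDPOINT> is a placeholder for your Elasticsearch (or proxy) endpoint:

# Find indexed CloudTrail events where an S3 bucket policy was changed
curl "http://<YOUR_ES_ENDPOINT>/_search?q=eventName:PutBucketPolicy&pretty"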

 

You can easily download and install each of these components in your AWS environment and get up and running with your very own ELK stack in a matter of hours! Alternatively, you can also leverage AWS's own Elasticsearch service!

 

Amazon Elasticsearch is a managed ELK service that enables you to quickly deploy, operate, and scale an ELK stack as per your requirements. Using Amazon Elasticsearch, you eliminate the need to install and manage the ELK stack's components on your own, which in the long run can be a painful experience.

 

For this particular use case, we will leverage a simple CloudFormation template that will essentially set up an Amazon Elasticsearch domain to filter and visualize the captured CloudTrail Log files.

CloudFormation

1. To get started, log in to the CloudFormation dashboard, at https://console.aws.amazon.com/cloudformation.

 

2. Next, select the option Create Stack to bring up the CloudFormation template selector page.

Paste http://s3.amazonaws.com/concurrencylabs-cfn-templates/cloudtrail-es-cluster/cloudtrail-es-cluster.json in the Specify an Amazon S3 template URL field, and click on Next to continue.

 

In the Specify Details page, provide a suitable Stack name and fill out the following required parameters:

 

AllowedIPForEsCluster: Provide the IP address that will have access to the nginx proxy and, in turn, have access to your Elasticsearch cluster. In my case, I've provided my laptop's IP.

 

Note that you can change this IP at a later stage, by visiting the security group of the nginx proxy once it has been created by the CloudFormation template.

 

CloudTrailName: Name of the CloudTrail that we set up at the beginning of this blog.

KeyName: You can select a key pair for obtaining SSH access to your nginx proxy instance.

LogGroupName: The name of the CloudWatch Log Group that will act as the input to our Elasticsearch cluster.

ProxyInstanceTypeParameter: The EC2 instance type for your proxy instance. Since this is a demonstration, I've opted for the t2.micro instance type. Alternatively, you can select a different instance type as well.

 

Once done, click on Next to continue. Review the settings of your stack and hit Create to complete the process.

 

The stack takes a good few minutes to deploy as a new Elasticsearch domain is created. You can monitor the progress of the deployment by either viewing the CloudFormation's Output tab or, alternatively, by viewing the Elasticsearch dashboard.
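
You can also check the stack's status from the CLI while you wait; for example (substitute the Stack name you provided earlier):

aws cloudformation describe-stacks --stack-name <YOUR_STACK_NAME> \
    --query 'Stacks[0].StackStatus' --output text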

 

Note that, for this deployment, a default t2.micro.elasticsearch instance type is selected for deploying Elasticsearch. You should change this value to a larger instance type before deploying the stack for production use.

 

You can view information on Elasticsearch Supported Instance Types at http://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/aes-supported-instance-types.html.

 

With the stack deployed successfully, copy the Kibana URL from the CloudFormation Output tab: "KibanaProxyEndpoint": "http://<NGINX_PROXY>/_plugin/kibana/"

 

The Kibana UI may take a few minutes to load. Once it is up and running, you will need to configure a few essential parameters before you can actually proceed. Select Settings and hit the Indices option. Here, fill in the following details:

 

  • Index contains time-based events: Enable this checkbox to index time-based events
  • Use event times to create index names: Enable this checkbox as well
  • Index pattern interval: Set the Index pattern interval to Daily from the drop-down list
  • Index name or pattern: Type [cwl-]YYYY.MM.DD into this field
  • Time-field name: Select the @timestamp value from the drop-down list

 

Once completed, hit Create to complete the process. With this, you should now start seeing logs populate on to Kibana's dashboard. Feel free to have a look around and try out the various options and filters provided by Kibana:

 

Phew! That was definitely a lot to cover! But wait, there's more! AWS provides yet another extremely useful governance and configuration management service that we need to learn about as well, so without any further ado, here's introducing AWS Config!

 

Introducing AWS Config

AWS Config

AWS Config is yet another managed service under the security and governance wing of AWS services, and it provides a detailed view of the configuration settings of each of your AWS resources.

 

Configuration settings here can be anything, from simple settings made to your EC2 instances or VPC subnets, to how one resource is related to another, such as how an EC2 instance is related to an EBS volume, an ENI, and so on.

 

Using AWS Config, you can view and compare configuration changes that were made to your resources in the past, and take the necessary preventive actions if needed.

 

Here's a list of things that you can basically achieve by using AWS Config:

  • Evaluate your AWS resource configurations against the desired setting
  • Retrieve and view historical configurations of one or more resources
  • Send notifications whenever a particular resource is created, modified, or deleted
  • Obtain a configuration snapshot of your resource that you can later use as a blueprint or template
  • View relationships and hierarchies between resources, such as all the instances that are part of a particular network subnet, and so on

 

Using AWS Config enables you to manage your resources more effectively by setting governing policies and standardizing configurations for your resources. Each time a governing policy is violated by a configuration change, you can trigger notifications or even perform remediation against the change.

 

Furthermore, AWS Config also provides out-of-the-box integration with the likes of AWS CloudTrail, to provide you with a complete end-to-end auditing and compliance monitoring solution for your AWS environment.

 

Before we get started by setting up AWS Config for our own scenario, let's first take a quick look at some of its important concepts and terminologies.

 

Concepts and terminologies

Concepts

The following are some of the key concepts and terminologies that you ought to keep in mind when working with AWS Config:

 

Config rules: Config rules form the heart of operations at AWS Config. These are essentially rules that represent the desired configuration settings for a particular AWS resource.

 

While the service monitors your resources for any changes, these changes get mapped to one or more config rules, which in turn flag the resource for any non-compliance.

 

AWS Config provides you with some rules out of the box that you can use as-is or even customize as per your requirements. Alternatively, you can also create custom rules completely from scratch.

 

Configuration items: Configuration items are basically a point-in-time representation of a particular AWS resource's configuration. The item can include various metadata about your resource, such as its current configuration attributes, its relationships with other AWS resources (if any), and its events, such as when it was created or last updated.

 

Configuration items are created by AWS Config automatically each time it detects a change in a particular resource's configuration.

 

Configuration history: A collection of configuration items of a resource over a particular period of time is called its configuration history.

 

You can use this feature to compare the changes that a resource may undergo over time, and then decide to take necessary actions. Configuration history is stored in an Amazon S3 bucket that you specify.

 

Configuration snapshot: A configuration snapshot is also a collection of configuration items of a particular resource over time. This snapshot acts as a template or benchmark that can then be used to compare and validate your resource's current configurational settings. 

 

With this in mind, let's look at some simple steps which allow you to get started with your own AWS Config setup in a matter of minutes!

 

Getting started with AWS Config

Getting started with AWS Config is a very simple process, and it usually takes about a minute or two to complete.

 

Overall, you start off by specifying the resources that you want AWS Config to record, configure an Amazon SNS topic and an Amazon S3 bucket for notifications and for storing the configuration history, and, finally, add some config rules to evaluate your resources:

 

To begin, access the AWS Config dashboard by filtering the service from the AWS Management Console or by navigating to https://console.aws.amazon.com/config/.

Since this is our first time configuring this, select the Get Started option to commence the Config's creation process.

 

In the Resource types to record section, select the type of AWS resource that you wish config to monitor. By default, config will record the activities of all supported AWS resources.

 

You can optionally specify only the services which you want to monitor by typing in the Specific types field, as shown in the following screenshot. In this case, I've opted to go for the default values: Record all resources supported in this region and Include global resources:

 

Next, select a location to store your configuration history as well as your configuration snapshots. In this case, I've opted to create a new S3 bucket for AWS Config by providing a unique Bucket name.

 

Moving on, in the Amazon SNS topic section, you can choose to create a new SNS topic that will send email notifications to your specified mailbox, or choose a pre-existing topic from your account.

 

Finally, you will need to provide config with a Read-only access role so that it can record the particular configuration information as well as send that over to S3 and SNS.

 

Based on your requirements, you can either Create a role or, alternatively, Choose a role from your account. Click Save to complete the basic configuration for your AWS Config.
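
If you ever need to script this initial setup, the configservice commands below provide a rough equivalent; this is a sketch that assumes the IAM role, S3 bucket, and SNS topic already exist (all names in angle brackets are placeholders):

# Recorder that tracks all supported resource types, including global ones
aws configservice put-configuration-recorder \
    --configuration-recorder name=default,roleARN=<YOUR_CONFIG_ROLE_ARN> \
    --recording-group allSupported=true,includeGlobalResourceTypes=true

# Delivery channel pointing at your S3 bucket and SNS topic
aws configservice put-delivery-channel \
    --delivery-channel name=default,s3BucketName=<YOUR_CONFIG_BUCKET>,snsTopicARN=<YOUR_SNS_TOPIC_ARN>

# Start recording configuration changes
aws configservice start-configuration-recorder --configuration-recorder-name default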

 

With this step completed, we can now go ahead and add Config rules to our setup. To do so, from the AWS Config dashboard's navigation pane, select Rules and click on the Add rule option.

 

In the AWS Config rules page, you can filter and view predefined rules using the filter provided. For this particular scenario, let's go ahead and add two rules that check whether any of the account's S3 buckets allow public read or public write access.

 

To do so, simply type in S3-bucket in the filter and select either of the two config rules.

 

Each rule's Scope of changes determines when the rule is evaluated, and can be set to one of the following:

  • Resources: When any resource that matches the evaluation criteria is created, modified, or deleted
  • Tags: When any resource with the specified tag is created, modified, or deleted
  • All changes: When any resource recorded by AWS Config is created, modified, or deleted

 

Selecting a particular rule will pop up that rule's configuration page, where you can define the rule's trigger as well as its scope. Let's pick the s3-bucket-public-read-prohibited rule for starters and work with that.

 

In the Configure rule page, provide a suitable Name and Description for your new rule. Now, since this is a managed rule, you will not be provided with an option to change the Trigger type.

 

However, when you create your own custom rules, you can specify whether you wish to trigger the rule based on a Configuration change event or using a Periodic check approach that uses a time-frequency that you specify to evaluate the rules.

 

Next, you can also specify when you want the rule's evaluations to occur by selecting the appropriate options provided under the Scope of changes section. In this case, I've opted for the Resources scope and selected S3: Bucket as the resource, as depicted in the following screenshot:

 

Optionally, you can also provide the ARN of the resource that you wish config to monitor using the Resource identifier field. Click on Save once done. 

 

Similarly, using the aforementioned steps, create another managed config rule called s3-bucket-public-write-prohibited. With the rules in place, select the Resources option from the config's navigation pane to view the current set of resources that have been evaluated against the set compliance.
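
Both managed rules can also be added from the CLI with put-config-rule; here is a sketch for the first one (the second follows the same shape, with S3_BUCKET_PUBLIC_WRITE_PROHIBITED as the SourceIdentifier):

aws configservice put-config-rule --config-rule '{
  "ConfigRuleName": "s3-bucket-public-read-prohibited",
  "Source": {
    "Owner": "AWS",
    "SourceIdentifier": "S3_BUCKET_PUBLIC_READ_PROHIBITED"
  },
  "Scope": {
    "ComplianceResourceTypes": ["AWS::S3::Bucket"]
  }
}'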

 

In my case, I have two S3 buckets present in my AWS environment: one that has public read enabled on it while the other doesn't. Here's what the Resources evaluated dashboard should look like for you:

 

Here, you can view the evaluated resources against a Config timeline by simply selecting the name of the resource from the column with the same name. This will bring up a time series of your particular resource's configuration state.

 

You can choose between the different time series options to view the state changes, as well as toggle between the time periods using the Calendar icon.

 

The best part of using this feature is that you can change your resource's configuration right from here by selecting the Manage resource option. Doing so will automatically open the S3 bucket's configuration page, as in this case.

 

You can alternatively select the Dashboard option from the AWS Config navigation pane and obtain a visual summary of the current status of your overall compliance, as depicted in the following screenshot:

 

You can use the same concepts to create more such managed config rules for a variety of other AWS services, including EC2, EBS, Auto Scaling, DynamoDB, RDS, Redshift, CloudWatch, IAM, and much more!

For a complete list of managed rules, check out http://docs.aws.amazon.com/config/latest/developerguide/managed-rules-by-aws-config.html.

 

With the managed config rules done, the last thing left to do is create a customized config rule, which is exactly what we will be covering in the next section.

 

Creating custom config rules

The process for creating a custom config rule remains more or less similar to the earlier process, apart from a few changes here and there.

 

In this section, we will be exploring how to create a simple compliance rule that will essentially trigger a config compliance alert if a user launches an EC2 instance other than the t2.micro instance type:

 

To get started, select the Rules option from the AWS Config navigation pane, then select the Add custom rule button present on the Add rule page. The creation of the custom rule starts off like any other, by providing a suitable Name and Description for the rule. Now, here's where the actual change occurs.

 

Custom config rules rely on AWS Lambda to monitor and trigger the compliance checks. This is actually a great fit, as Lambda functions are event-driven and well suited to hosting the business logic for our custom rules.

 

Select the Create AWS Lambda function option to get things started. Here, I'm going to make use of a predefined Lambda blueprint that was essentially created to work in conjunction with AWS Config.

 

Alternatively, you can create your config rule's business logic from scratch and deploy it in a fresh function. For now, type config-rule-change-triggered into the Blueprints filter, as shown in the following screenshot:

 

Ensure that the blueprint is selected, and click on Next to continue.

In the function's Basic Information page, provide a Name for your function followed by selecting the Create new role from template(s) option from the Role drop-down list.

 

The role will essentially provide the Lambda function with the necessary permissions to read from EC2 and write the output back to AWS Config as well as to Amazon CloudWatch.

 

Type in a suitable Role name and select the Create function option to complete the process. Once the function is deployed, make a note of its ARN, as we will be requiring the same in the next step.

 

Return to the AWS Config Add custom rule page and paste the newly created function's ARN into the AWS Lambda function ARN field, as shown in the following screenshot:

 

With the function's ARN pasted, the rest of the configuration for the custom rule remains the same. Unlike the managed rules, you can opt to change the Trigger type between Configuration changes or Periodic, as per your requirements.

 

In this case, I've opted to go for Configuration changes as my trigger mechanism, followed by EC2: Instance as the Resource type.

 

Last, but not least, we also need to specify the Rule parameters, which are basically key-value pairs that define an attribute against which your resources will be validated. In this case, the desired instance type is the Key and t2.micro is the Value. Click Save to complete the setup process:

 

With the rule in place, all you need to do now is take it for a small test run! Go ahead and launch a new EC2 instance of any type other than t2.micro. Remember that the instance has to be launched in the same region as your Lambda function! Sure enough, once the instance is launched, the change gets immediately reflected in AWS Config's dashboard:
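
Once the non-compliant resource shows up, you can also pull the evaluation results from the CLI; a sketch, assuming you named the custom rule ec2-desired-instance-type (a hypothetical name of my own):

# List resources that the custom rule has flagged as non-compliant
aws configservice get-compliance-details-by-config-rule \
    --config-rule-name ec2-desired-instance-type \
    --compliance-types NON_COMPLIANT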

 

With this, we come towards the end of this section as well as the blog! However, before we conclude, here's a quick look at some interesting best practices and next steps that you ought to keep in mind when working with AWS CloudTrail and AWS Config!

 

Tips and best practices

Here's a list of a few essential tips and best practices that you ought to keep in mind when working with AWS CloudTrail, AWS Config, and security in general:

 

Analyze and audit security configurations periodically: Although AWS provides a variety of services for safeguarding your cloud environment, it is the organization's mandate to ensure that the security rules are enforced and periodically verified against any potential misconfigurations.

 

Complete audit trail for all users: Ensure that all resource creation, modifications, and terminations are tracked minutely for each user, including root, IAM, and federated users.

 

Enable CloudTrail globally: By enabling logging at a global level, CloudTrail can essentially capture logs for all AWS services, including the global ones such as IAM, CloudFront, and so on.

 

Enable CloudTrail Log file validation: Although this is an optional setting, it is always recommended that you enable CloudTrail Log file validation for an added layer of integrity and security.
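
A minimal sketch of turning this on and then verifying the delivered files with the CLI; the Trail name, ARN, and time window are examples only:

# Enable log file integrity validation on the Trail
aws cloudtrail update-trail --name useast-prod-CloudTrail-01 --enable-log-file-validation

# Validate the log files delivered since the given start time
aws cloudtrail validate-logs --trail-arn <YOUR_TRAIL_ARN> \
    --start-time 2019-01-01T00:00:00Z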

 

Enable access logging for CloudTrail and config buckets: Since both CloudTrail and config leverage S3 buckets to store the captured logs, it is always recommended that you enable access tracking for them to log unwarranted and unauthorized access. Alternatively, you can also restrict access to the logs and buckets to a specialized group of users as well.

 

Encrypt log files at rest: Encrypting the log files at rest provides an additional layer of protection from unauthorized viewing or editing of the logged data.
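
CloudTrail encrypts delivered log files with Amazon S3 server-side encryption by default; switching a Trail to a KMS key of your own is a one-liner (the key ID below is a placeholder):

aws cloudtrail update-trail --name useast-prod-CloudTrail-01 --kms-key-id <YOUR_KMS_KEY_ID>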
