Filter in Google Analytics (2019)


Google Analytics and Google Tag Manager include a number of tools that help you clean up data to make it as useful as possible, including filters and view settings in Google Analytics and blocking triggers and overriding default values in Google Tag Manager.

 

Like all changes in Google Analytics and Google Tag Manager, it’s important to test before applying these tools to the data you use for reporting and analysis. In Google Analytics, because changes only affect the data going forward, a careful use of test and unfiltered views to try out and compare results is important.

 

The most common needs for cleaning up data include removing internal traffic and standardizing and organizing URLs. There are a number of approaches using the tools in Google Tag Manager and Google Analytics appropriate for different situations, as well as some specialized tools for dealing with specific kinds of content (such as URL query parameters & fragments and site search results URLs).

 

Tools for Cleaning Up Data


You have two opportunities to remove or change data that ultimately end up in Google Analytics reports:

  • In Google Tag Manager, blocking or changing data at the point of collection using triggers and variables.
  • In Google Analytics, excluding or changing data in a view after it’s received, using filters and other settings.

Let’s take a look at the options available.

 

Google Tag Manager: Blocking Triggers and Overriding Default Values

First, let’s talk about the ways that Google Tag Manager can remove or change data.

 

In Google Tag Manager, you can prevent data from being sent at all to Google Analytics by an appropriate use of triggers. You also have the opportunity to change or override the default values collected by Google Analytics tags or other tags in Google Tag Manager by using variables.

 

Blocking Triggers

Google Tag Manager uses triggers to say “fire this tag when” some set of criteria occurs. It could be a page view, an auto-event like a click or form submission, or a custom event pushed to the data layer.

 

You can also specify blocking triggers. The combination of firing triggers and blocking triggers says “fire this tag when X occurs, except when Y occurs”; that is, you can set exceptions that prevent the tag from firing. You add blocking triggers to a tag in the same way that you set firing triggers, by choosing Create Exceptions in the tag setup flow.

 

A trigger must always include an event pushed to the data layer. If you select the trigger type for pageviews, the built-in gtm.js event is used. If you select click, form, or one of the other auto-event options, the auto-event labels (gtm.click, etc.) are used. If you’re specifying custom criteria, you need to specify which event or events the trigger should block on.

 

Tip  In many cases, the firing trigger and blocking trigger will operate on the same event: “Fire on all pageviews, except pageviews where the URL contains /blog”, for example. However, sometimes you may want to block based on some variable for all events (pageviews, clicks, etc.). “Fire on X (any trigger), except when Google Tag Manager is in preview mode” might be one example.

 

In this case, you can use a custom event trigger and specify a regular expression of “.*” (match anything) to block when any event occurs. Then you can reuse this trigger as a blocking trigger across many different types of firing triggers for tags. You’ll see several examples throughout this blog.
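As a rough illustration of the “fire on X, except when Y” logic, the sketch below models a tag with one firing trigger and one reusable blocking trigger. It is a hypothetical model, not Google Tag Manager’s actual implementation: shouldFire, the trigger functions, and the debugMode field are names made up for this example.

```javascript
// Hypothetical model of how firing and blocking triggers combine.
// A tag fires when at least one firing trigger matches the event
// and no blocking trigger matches it.
function shouldFire(tag, event) {
  const fires = tag.firingTriggers.some((trigger) => trigger(event));
  const blocked = tag.blockingTriggers.some((trigger) => trigger(event));
  return fires && !blocked;
}

// Firing trigger: any pageview (the built-in gtm.js event).
const allPages = (e) => e.event === "gtm.js";

// Blocking trigger: a custom event whose name matches ".*" (any event),
// plus a condition on a variable (here, an assumed debugMode flag).
const blockInPreview = (e) => /.*/.test(e.event) && e.debugMode === true;

const pageviewTag = {
  firingTriggers: [allPages],
  blockingTriggers: [blockInPreview],
};
```

Under this model, shouldFire(pageviewTag, { event: "gtm.js", debugMode: true }) is false: the blocking trigger matches every event name, so only its condition decides whether the tag is blocked.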

 

Google Analytics: Filters and Views


Next, let’s take a look at how you can change data with Google Analytics. Each property can have one or more views, which are different buckets of data about that site.

 

Creating a view is as easy as simply selecting Create New View from the bottom of the view drop-down menu in the rightmost column of the Admin tab in Google Analytics. Keep in mind that the view begins accumulating data from the property from the time it’s created, but is not filled with data retroactively.

 

Filters allow you to include or exclude specific data from a view (include just a subsection of a site, for example, or exclude your own employees’ interactions on the site). They can also enable you to make changes to data (to clean up URL formats or other information).

 

In a view’s settings (the rightmost column) you can see Filters listed as one of the options. Each view can have different filters applied, resulting in a different set of data. You can see any existing filters applied to the view in this list. Selecting a filter from this list allows you to edit it or remove it from the view.

Filters can be added to a view in one of two ways:

  • Apply an existing filter from the account to the view.
  • Create a new filter to apply to the view.

 

Once created, filters are stored at the account level. This means that you can create a filter once and apply it to as many views within the account as you like. (It also means that a user needs Edit access at the account level to create new filters.) You can see a list of all filters in the account listed in the Admin area under All Filters in the account column (the leftmost column), where you can also quickly apply a filter to multiple views.

 

To apply an existing filter or create a new one, select the button at the top of the list of filters. You have the option to apply one or more existing filters or to create a new filter.

 

Types of Filters


When creating a new filter, you have two main choices:

  • Predefined filters cover a handful of common use cases, with easy-to-use drop-down menus to define the filter options.
  • Custom filters allow more flexibility and the use of regular expressions to match patterns.

 

Predefined Filters

The predefined filters use a series of drop-down menus to specify which data to include in or exclude from the view. They cover a few common use cases, such as excluding an IP address or including only a specific subdirectory of the site. (You’ll see both of those examples later in the blog.)

 

Custom Filters


For more complex needs, custom filters offer a variety of options. First, there are a number of different types of custom filters:

  • Exclude: exclude data that matches a regular expression
  • Include: include only data that matches a regular expression
  • Lowercase: convert a field’s alphabetic characters into lower case

 

  • Uppercase: convert a field’s alphabetic characters into upper case
  • Search and Replace: find a regular expression match in a field and replace it with text
  • Advanced: find regular expression matches in one or two fields and rearrange the text from those fields

 

You’ll see each of these filter types in action in applications later in the blog. When you choose one of these filter types, you’ll see that Google Analytics provides a very long drop-down list of fields to filter on.

 

This list includes all the fields you might expect, but there’s a little challenge: the names used for fields in this list don’t always match up very well with how corresponding dimensions are named in the reports in Google Analytics.

 

For example, this menu includes Request URI, which is the same as the dimension simply referred to as Page in GA’s reports. As you look at examples in this blog, you’ll see several of the most commonly used fields from this list that may be useful, and there’s a full accounting in the Google Analytics documentation if you can’t quite decide which field is the right one.

 

All of the filter types which allow you to match values in a field (Include, Exclude, Search and Replace, Advanced) use regular expressions as input. This allows you to be as specific (one exact value) or as broad (multiple values, or some common pattern across values) as you need to be.

 

Caution   In regular expressions, characters like the dot (.) and question mark (?) have special meaning. Sometimes these characters appear in the strings you are matching, however (consider a URL like /alice.html?potion=drink-me). To use a special regular expression character as a literal character, preface it with a backslash (\).
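To see why escaping matters, here is a minimal sketch that tests the example URL as a pattern with and without backslashes. The helper name escapeForRegex is hypothetical, and the character set it escapes is the common JavaScript one, which is an assumption and slightly broader than what a URL strictly needs.

```javascript
// Prefix regular expression metacharacters with a backslash
// so that they match as literal characters.
function escapeForRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const url = "/alice.html?potion=drink-me";

// Unescaped: "." matches any character and "l?" makes the "l" optional,
// so the literal "?" in the URL has nothing to match against.
const unescaped = new RegExp(url);

// Escaped: "." and "?" are treated as ordinary characters.
const escaped = new RegExp(escapeForRegex(url));

console.log(unescaped.test(url)); // false
console.log(escaped.test(url));   // true
```

The unescaped pattern fails to match the very URL it was copied from, which is the kind of silent mistake that is hard to spot in a filter.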

 

Characters that require a backslash to be treated as ordinary characters are: \.?+*[(^$

Testing and Troubleshooting Views and Filters in Google Analytics

 

For the changes you make in Google Tag Manager, you can use GTM’s preview mode and other browser tools to inspect the data being sent to Google Analytics. But what about the changes you make with views and filters in Google Analytics? These are applied after the data is received, so browser-based testing tools don’t help you.

 


The critical characteristic of any changes you make in GA’s Admin interface is that they take effect only from the time you make the change going forward. Changes don’t retroactively affect the data in the reports in a view. This has important repercussions for testing changes.

 

Suppose you add a filter to a view, and later (an hour, a day, a week) you find there’s a typo, and you accidentally filtered out something you didn’t mean to. There’s no way to undo this retroactively in the view’s data. You can correct the typo in the filter, which fixes data going forward, but for the intervening hour or day or week where the data was wrong, it stays wrong.

 

With this constraint in mind, to help with testing and troubleshooting filters and other view configuration settings, you’ll also want to create the following views:

 

An unfiltered view for troubleshooting and backup. If you’re filtering traffic out, how do you know what you’ve removed without an unfiltered set of data to compare it to?

 

An unfiltered view gives you a point of comparison. Create this view and don’t apply any filters to it. The other advantage of an unfiltered view is as a backup: if something goes wrong and you accidentally lose some data, you can always refer to the unfiltered view.

 

One or more test views to test changes to filters and other view configuration settings and ensure that they work properly. In this way, you have a place to try changes with no adverse consequences if you make a mistake (since you’re not actually using the data in these test views). These views would, of course, be in addition to any views you actually use for reporting and analysis of your site.

 

Tip  If you’re setting up a site from scratch, create these views at the same time you set up the site in Google Analytics so that the data goes back to the very beginning. If you’re inheriting a site that has been previously set up, create these views right away so that they start gathering data to aid you as you make changes and updates going forward.

Once you’ve created a test view, you can use it to try out configuration changes in the Admin settings, and then look at the data in the reports of the test view to make sure they match your expectations.

 

Because the data in Google Analytics is typically reported on by day, and the changes only take effect going forward, changes made in the middle of a day can result in a mixed set of data (both before and after the change) for the day when the change was made. To get a clear view of whether a filter is working correctly and the data is “clean,” you generally need to wait until the next day. The process typically goes something like this:

 

  • Make a change to a filter or other view configuration setting in the test view.
  • Wait until sometime in the next calendar day.
  • Check the relevant report(s) in Google Analytics, changing the date range to the current day. Did you get it right?
  • If everything looks good, go ahead and apply the same filter or change the same setting in the reporting view(s). If not, repeat this process from the first step.

 

Note  For certain types of data, you may be able to use the Real-Time reports to see the effect of a newly applied filter, but those reports don’t have full coverage of the types of data in Google Analytics and may be difficult to use in comparing two sets of data.

 

Partitioning Internal Traffic


One of the most common desires is to separate website traffic by internal users from that of the general public, the customers and prospects whose website behavior is of interest to you. Internal users have very different behavior from customers, are excluded from the targets of your marketing activities, and are often engaged in testing website functionality.

 

Although data about these activities may be useful to you, it’s useful in ways different from the activities of external users, so you’d like to be able to separate them.

 

The following sections explore two main topics connected to this challenge:

  • Removing (or separating out) traffic from internal users to avoid “polluting” your customer data.
  • Dealing with multiple versions of a site for development, testing, and staging to keep testing activity separate from your live website.

 

Removing Internal Traffic

“Internal traffic” includes your organization’s employees, as well as third parties who work on your website on your behalf, like advertising agencies or web developers. Usually, you’re not interested in their usage of your website in the same context as external users (your actual audience), so you’d like to eliminate or separate them out.

 

There are several different ways to identify and block this internal traffic (by IP address in Google Tag Manager or in Google Analytics, or by Service Provider in GA), which you’ll explore in the following sections. Different organizations define and treat internal traffic differently, so assess all of the options available to understand what will work best for your situation.

Note  None of these methods can be absolutely perfect: your employees sometimes travel or work from home, and in other situations it can be hard to identify them as “internal.” Do your best, but expect that some data will leak through in any case.

Using IP Addresses


If you’re targeting particular IP addresses to exclude internal traffic, you have two potential options:

  • Apply a blocking trigger to your Google Analytics tags in Google Tag Manager. This prevents your site from sending any data about the internal IP addresses.

 

  • Apply a filter to a view in Google Analytics. This allows you to exclude internal traffic, while still collecting internal traffic data in an unfiltered view in Google Analytics.

 

If your only goal is to exclude internal traffic, either method works just fine. However, in many cases, you might be interested in including only internal traffic in some views in Google Analytics.

 

For example, imagine you have a university website, and you wish to create views for both on-campus and off-campus traffic based on IP addresses. In this case, you need to collect data for all IP addresses using Google Tag Manager and filter the data in views in Google Analytics (the latter method).

 

BLOCKING INTERNAL TRAFFIC IN NON-GA TAGS

Whether you separate internal traffic in Google Analytics or in Google Tag Manager, first you need to know what IP addresses to filter out. Although you can check your own IP address (Googling “what is my IP address” does the trick), what you won’t know from that is whether everyone in your office network shares a single IP or a range, whether there are additional IP addresses for other office locations, and other details about your internal network.

 

You’ll need to get a comprehensive list or range of IP addresses from a networking guru in the IT department to reliably filter out your internal traffic.

Now let’s take a look at the two different methods of removing traffic by IP address, starting with blocking triggers in Google Tag Manager. Google Tag Manager doesn’t have a built-in variable for IP address, and it’s not a value accessible via JavaScript alone, so you need to provide Google Tag Manager a way to access the value. The natural way to do this is simply to add it to the data layer declaration:

 

dataLayer = [{ 'ipAddress': '93.184.216.34' }];

This is quite straightforward to do with any server-side programming language.
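As one possible sketch of the server-side step, the function below builds the data layer declaration for a given visitor IP address. The function name renderDataLayerSnippet is hypothetical, and how you obtain the visitor’s IP address depends on your server framework, so it is simply passed in as an argument here.

```javascript
// Hypothetical server-side helper: returns the HTML for a data layer
// declaration containing the visitor's IP address.
// JSON.stringify turns the array into a valid JavaScript literal.
function renderDataLayerSnippet(ipAddress) {
  const payload = JSON.stringify([{ ipAddress: ipAddress }]);
  return "<script>dataLayer = " + payload + ";</script>";
}

// Example with the IP address used in the text.
console.log(renderDataLayerSnippet("93.184.216.34"));
// <script>dataLayer = [{"ipAddress":"93.184.216.34"}];</script>
```

The key name ipAddress is the label the Data Layer Variable in Google Tag Manager has to reference, so the two must match exactly.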

 

PREVENT SENDING TRAFFIC BY IP ADDRESS IN Google Tag Manager


First, you need a variable for IP address. Let’s assume it’s already in the data layer, as shown earlier.

  • In Google Tag Manager, create a new variable.
  • For the variable type, choose Data Layer Variable.

 

  • For the data layer variable name, enter ipAddress (or whatever label you used for the IP address in the data layer).
  • Save the variable, giving it a name: “IP Address”.

 

Now you can create a trigger based on the {{IP Address}} variable. For the trigger type, you’ll need to use a regular expression to match any event, since you want to match pageviews, auto-event triggers, and any other events that occur.

 

  • In Google Tag Manager, create a new trigger.
  • For the event, choose Custom Event.

 

  • For the event name, select the Use Regex checkbox and enter .*
  • Select Add Filters to apply a condition.

 

  • For the condition, choose the {{IP Address}} variable and an appropriate condition and value for the specific IP address(es) you’d like to exclude.
  • Save the trigger, giving it a name: “Blocking – Internal IPs”.

 

You can repeat the preceding steps to create multiple triggers if there’s a list of IP addresses you’d like to block. Now you’re finally ready to apply this trigger to block a tag.

 

 

  • Edit the tag you’d like to block (your Google Analytics tags, in this case).
  • Select Create Exceptions.
  • Choose the Blocking – Internal IPs trigger you created earlier. If you created multiple blocking triggers (for multiple IP addresses), you can choose multiple triggers.

 

  • Save the changes to the tag.
  • Don’t forget to preview and publish the changes to the container so the new blocking trigger takes effect.

 

Next, let’s take a look at the second method of using IP address, with a filter in Google Analytics. Google Analytics automatically captures the IP address, so there’s no need for a variable for that value in the data layer. Additionally, Google Analytics will not only allow you to exclude one or more IP addresses but also include only internal traffic in a view.

 

FILTER TRAFFIC BY IP ADDRESS IN Google Analytics


You’ll create a filter on a single view, but once it’s been created, you can apply it to as many views as you like. In the Admin area of Google Analytics, with the appropriate account, property, and view selected, choose Filters in the third column to see the filters for the view. Select the New button to add a new filter.

 

  • Choose to create a new filter (rather than apply an existing one).
  • Give the filter a name: “Exclude Internal IPs”.

 

  • You can use a predefined filter for simple cases (an exact IP, IPs that begin with a particular value), or a custom filter to use regular expressions for multiple values or ranges:

 

  • Select the Predefined tab, and use the drop-down menus to exclude traffic from an IP address. “Equals” or “starts with” can both be useful options (for a single IP address and an entire block of IP addresses, respectively).
  • Alternatively, select the Custom tab. Choose Exclude as the filter type and IP Address as the field. Then, enter a regular expression for the IP address(es) you wish to exclude.
  • Save the filter.

 

Remember that this filter will only affect data going forward. Past data collected from this IP address will not be affected. If you instead want to create a view with only internal traffic, you can simply change the selection of “exclude” to “include” in this filter.
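Before saving a custom filter, it can help to try the regular expression against a few sample addresses. The sketch below does that for a made-up internal range, 203.0.113.10 through 203.0.113.25, taken from an address block reserved for documentation. The range and the names INTERNAL_IPS and isInternal are assumptions for illustration only.

```javascript
// Hypothetical internal range: 203.0.113.10 through 203.0.113.25.
// The dots are escaped so they match literally, and the ^ and $ anchors
// prevent partial matches such as 203.0.113.100.
const INTERNAL_IPS = /^203\.0\.113\.(1[0-9]|2[0-5])$/;

function isInternal(ip) {
  return INTERNAL_IPS.test(ip);
}

console.log(isInternal("203.0.113.10"));  // true
console.log(isInternal("203.0.113.26"));  // false
console.log(isInternal("203.0.113.100")); // false
```

The same pattern text (without the surrounding slashes) is what you would paste into the filter’s pattern field, after checking it against your real address list.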

 

Tip  Google Analytics starts with the entire set of data for the property, then applies the filters to find what is left over. This means that it can make sense to apply multiple exclude filters (get rid of two different IP addresses, for example). Conversely, however, it does not make sense to apply multiple include filters on the same field—include means include only.

 

So if you say, include only IP address A and then include only IP address B, there's nothing left in the resulting data. Instead, you must use regular expressions to match multiple values in a single filter. Now that you’ve created this filter, you can apply it to as many views as you need.
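The effect of stacking include filters can be sketched with a small model in which each include filter keeps only the hits that match it. This is a simplified, hypothetical model of the behavior described above, not Google Analytics code. The sample addresses and the name applyIncludeFilters are made up.

```javascript
// Hypothetical model: filters run in sequence, and each include filter
// keeps only the hits whose IP address matches its pattern.
function applyIncludeFilters(hits, patterns) {
  return patterns.reduce(
    (remaining, pattern) => remaining.filter((hit) => pattern.test(hit.ip)),
    hits
  );
}

const hits = [{ ip: "198.51.100.7" }, { ip: "203.0.113.9" }, { ip: "192.0.2.1" }];

// Two separate include filters: nothing can match both, so nothing is left.
const twoFilters = applyIncludeFilters(hits, [
  /^198\.51\.100\.7$/,
  /^203\.0\.113\.9$/,
]);

// One include filter with both addresses in a single regular expression.
const oneFilter = applyIncludeFilters(hits, [
  /^(198\.51\.100\.7|203\.0\.113\.9)$/,
]);

console.log(twoFilters.length); // 0
console.log(oneFilter.length);  // 2
```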

 

Using Service Providers


Google Analytics contains another label that is sometimes useful for filtering internal traffic: a dimension called Service Provider. The Service Provider dimension contains a label for the user’s internet service provider, based on their IP.

 

For many consumers and some businesses, the internet service provider listed will be the telecommunications company through which they have internet access (a phone or cable company). Larger businesses, institutions, and government agencies, however, may be listed as their own service provider for the range of IP addresses they are assigned.

 

In these cases, you can use a filter in Google Analytics for the service provider label. This approach has the advantage that you don’t need to know the IP address ranges or express them in regular expressions.

 

You can check if your organization is listed as its own service provider in the Audience ➤ Technology ➤ Network report in Google Analytics.

 

Notice that many of the entries are telecommunications providers, but you may also see your own company listed here. Use the search function at the top of the report to search for your organization’s name to see if you can find yourself. (Also try various abbreviations for your organization’s name, and keep in mind that if you have multiple office locations or subsidiaries, they may be listed in multiple entries.)

 

If the name of your organization appears here, you can create a filter in Google Analytics to exclude your internal traffic. (No such luck for you? See the previous section on excluding based on IP addresses.)

 

EXCLUDE TRAFFIC BY SERVICE PROVIDER WITH A FILTER IN Google Analytics

In the Admin area of Google Analytics, with the appropriate account, property, and view selected, choose Filters in the third column to see the filters for the view. Select the New button to add a new filter.

  • Choose to create a new filter (rather than apply an existing one).
  • Give the filter a name: “Exclude Internal Service Provider”.

 

  • Choose the Custom tab.
  • Choose Exclude as the filter type.

 

  • Choose ISP Organization as the field. (This is an example of poor naming in the filter field drop-down compared to the dimension names in Google Analytics: here, ISP Organization, in Google Analytics reports, Service Provider.)

 

  • Enter a regular expression for the service provider value to match.
  • Save the filter.

 

Remember that this filter will only affect data going forward. Past data collected from this service provider will not be affected. If you instead want to create a view with only internal traffic, you can simply change the selection of “exclude” to “include” in this filter. Now that you’ve created this filter, you can apply it to as many views as you need.

 

Separating Test and Production Environments


In many cases, you may already have taken care of excluding internal traffic in Google Analytics (see the previous section in this blog), which would typically include your employees’ or agency’s testing of your website. However, you might want to be able to separately view such testing traffic to ensure that everything is working correctly—collecting test data about new site content or features before they go live on the website to ensure they’re working and recording data as expected.

 

Your exact setup will depend on your environments and how you’d like to be able to view the data. Let’s assume for example purposes that you have the following environments:

  • dev.ThesisScientist.com
  • staging.ThesisScientist.com
  • www.ThesisScientist.com

 

You’d like to keep the messy, internal testing data from the development and staging sites separate from the good, clean, customer-oriented data you use to analyze your site’s content and marketing on the production site.

 

You can accomplish this by creating additional properties in Google Analytics for the development and staging sites and using Google Tag Manager to route data to the correct property. The next section will walk through this process.

 

There’s another kind of testing traffic you should consider: when you use GTM’s preview mode for testing changes to tags and triggers on the live production website (www.ThesisScientist.com, in this example). You might also want to separate this data into another property. A later section will look at that case as well.

 

USE A LOOKUP TABLE VARIABLE TO PARTITION PREVIEW MODE

First, you’ll create an additional lookup table that separates preview mode. If it’s not already enabled, enable the built-in variable {{Debug Mode}}. This variable is true if Google Tag Manager is in preview mode.

 

  • Create a new variable. Choose Lookup Table as the type.
  • As the Input Variable, choose the {{Debug Mode}} variable.

 

  • Add a row to the table.
  • For the input, enter true.

 

  • For the output, enter a Google Analytics Property ID where you’d like data for that site to be routed when Google Tag Manager is in preview mode.

 

  • Set a default value by checking the box and entering a Google Analytics Property ID where you’d like data to be routed when Google Tag Manager is not in preview mode.

 

  • Save the variable, giving it a name: “GA Property ID – Preview Mode”.

Now you can use this value in the overall {{GA Property ID}} variable, which you changed to a lookup table in the previous section.

 

  • Edit the {{GA Property ID}} variable.
  • Select Configure Variable to edit the table.

 

  • In the row containing the live/production website (in this example, www.ThesisScientist.com), replace the output with the {{GA Property ID – Preview Mode}} variable.
  • Save the changes to the variable.

 

After publishing this container, Google Tag Manager will continue to route traffic for the development and staging websites as before. Now, for the live website, if it’s in preview mode, it will route to a test property, while the live production data continues to route to the main property.
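The resulting routing can be pictured as two nested lookups. The sketch below is only a model of the configuration described above, written as plain functions. The hostnames and property IDs are placeholders, and the function names are hypothetical.

```javascript
// Model of the {{GA Property ID – Preview Mode}} lookup table:
// input "true" routes to a test property, the default to the live one.
// The property IDs are placeholders.
function previewModePropertyId(debugMode) {
  return debugMode ? "UA-000000-4" : "UA-000000-1";
}

// Model of the overall {{GA Property ID}} lookup table keyed on hostname.
// The production row uses the preview-mode lookup as its output.
function gaPropertyId(hostname, debugMode) {
  const table = {
    "dev.example.com": "UA-000000-2",
    "staging.example.com": "UA-000000-3",
    "www.example.com": previewModePropertyId(debugMode),
  };
  return table[hostname];
}

console.log(gaPropertyId("www.example.com", false)); // UA-000000-1
console.log(gaPropertyId("www.example.com", true));  // UA-000000-4
console.log(gaPropertyId("dev.example.com", true));  // UA-000000-2
```

Only the production row depends on preview mode, so development and staging traffic keeps going to the same properties as before.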

 

Cleaning Up and Grouping Content


URLs are just one of many pieces of information captured in Google Analytics, but they’re often one of the messiest. In an ideal world, you have a site with URLs that:

  • are easy to interpret and grouped together in a logical hierarchy,
  • contain no extraneous or unnecessary information, and
  • uniquely represent a single page.

 

Often you don’t have complete control over the structure and format of URLs on your site (because of particular software being used or historical reasons, for example).

 

In many cases, you need to work with URLs on your website that are made for machines, not for people. By default, Google Analytics captures the URL that you see in the browser’s location bar, exactly as it appears. In GA’s reports, these URLs are broken up into two dimensions:

 

  • Hostname, which contains the domain name (which does not include the protocol and punctuation, such as http:// or https://)
  • Page, which contains the rest of the URL, including the path and query parameters, but not the fragment

 

So for example, in the following URL, Google Analytics shows these values for the Hostname and Page dimensions:

Hostname: www.ThesisScientist.com
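As a rough sketch of this split, the standard URL class can break an address into the same two pieces. The example URL and the function name splitForAnalytics are made up for illustration. The function imitates the Hostname and Page dimensions described above rather than reproducing Google Analytics’ own code.

```javascript
// Split a full URL into the two pieces described above:
// the hostname, and the page (path plus query string, without the fragment).
function splitForAnalytics(href) {
  const url = new URL(href);
  return {
    hostname: url.hostname,
    page: url.pathname + url.search,
  };
}

const parts = splitForAnalytics(
  "https://www.example.com/rooms/list.php?sort=az#reviews"
);
console.log(parts.hostname); // www.example.com
console.log(parts.page);     // /rooms/list.php?sort=az
```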

 

Since Google Analytics captures these pieces from the URL exactly as they appear, you can encounter inconsistencies in your data that aren’t ideal, where there are several variations of a URL that all correspond to what is essentially the same page on your site. 

 

These can include inconsistencies in capitalization, query parameters, and more. This section will take a look at each of these issues and some best practices for dealing with them, as well as methods for gathering URLs into groups, such as by topic categories or by internal search results pages.

 

Important  Changes to URLs can also affect goal conversions based on those URLs. Make sure to update your goal configurations if necessary after you have made changes to URLs through filters.

 

Enforcing Case in URLs

Technically speaking, URLs are case sensitive. That is, http://example.com/jabberwock and http://example.com/JabberWock don’t have to go to the same page. However, in practice this is almost never how websites use URLs; usually, your web server or content management system will deliver the same page regardless of capitalization.

 

However, since Google Analytics simply captures the URL as it appears, it will potentially show the same URL capitalized in different ways in your reports.

 

This isn’t very helpful—if you just want to know how many people viewed the page, you have to do the arithmetic yourself. Instead, you’d rather have one, consistent entry for each page in your reports, regardless of capitalization. To do that, you can apply a filter to this data in Google Analytics.

 

USE A FILTER TO LOWERCASE URLS


In the Admin area of Google Analytics, with the appropriate account, property, and view selected, choose Filters in the third column to see the filters for the view. Select the New button to add a new filter.

  • Choose to create a new filter (rather than apply an existing one).
  • Give the filter a name: “Lowercase URLs”.
  • Choose the Custom tab.

 

  • Choose Lowercase as the filter type.
  • Choose Request URI as the field. (Another example of the mismatch between the filter field drop-down and the dimension names in Google Analytics: here, Request URI, in Google Analytics reports, Page.)
  • Save the filter.

 

Remember that this filter will only affect data going forward. Past data collected with the mixed case in URLs will not be affected.
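The effect of the lowercase filter on a report can be imitated with a small counting sketch. The page list and the function name countPageviews are hypothetical. The sketch only shows why differently capitalized URLs end up as separate rows unless they are normalized first.

```javascript
// Count pageviews per URL, optionally lowercasing each URL first
// (which is what the Lowercase filter does to the Request URI field).
function countPageviews(pages, lowercase) {
  const counts = {};
  for (const page of pages) {
    const key = lowercase ? page.toLowerCase() : page;
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

const pages = ["/jabberwock", "/JabberWock", "/jabberwock"];

console.log(countPageviews(pages, false)); // two rows: 2 and 1 pageviews
console.log(countPageviews(pages, true));  // one row: 3 pageviews
```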

 

Tip  The lowercase filter can be useful for any field that potentially involves human error in the entry at some point. This includes both parts of the URL (Request URI and Hostname) as well as the Campaign fields. You should look at these values in your Google Analytics reports and consider applying lowercase filters for increased consistency.

 

Default URLs

Another issue with consistency in URLs involves default pages. For your homepage or the default page of subdirectories on a site, a browser can often access the same page via multiple URLs. For example, your home page might be accessible through either of these URLs:

  • http://www.ThesisScientist.com/
  • http://www.ThesisScientist.com/index.html

 

Note  The ending might differ depending on your content management system and web server: index.html, index.php, default.aspx, and so forth. Check your site to see what it uses.

Now you again have a problem in your reporting.

 

Similarly to the capitalization problems described in the previous section, you’d have to add up those rows to understand how many times this particular page was actually viewed. Fortunately, you can fix this in Google Analytics as well. There are two options, depending on which version of the URL you’d like to consolidate on:

 

  • There’s a view setting that will append index.html (or whatever is appropriate) to the end of URLs that end in a trailing slash.
  • You can use a filter to remove index.html (or whatever is appropriate), leaving only the trailing slash.

Which you prefer is mostly a stylistic choice in the way you’d prefer to see the pages listed in your reports.

 

USE THE VIEW SETTING TO APPEND DEFAULT TO URLS

Google Analytics provides a setting that will automatically append text to the end of any URL that ends in a trailing slash.

In the Admin area of Google Analytics, with the appropriate account, property, and view selected, choose View Settings in the third column to edit the settings for the view.

 

  • Scroll down to the setting Default Page.
  • Enter the text you’d like to append to the end of any URL that ends in a trailing slash. (Do not include the slash.)
  • Save the changes to the settings.

Remember that this setting will only affect data going forward. Past data collected for URLs will not be affected.

 

USE A FILTER TO REMOVE DEFAULT FROM URLS

 

In the Admin area of Google Analytics, with the appropriate account, property, and view selected, choose Filters in the third column to see the filters for the view. Select the New button to add a new filter.

  • Choose to create a new filter (rather than apply an existing one).
  • Give the filter a name: “Remove index.php” (for example).
  • Choose the Custom tab.

 

  • Choose Search and Replace as the filter type.
  • Choose Request URI as the field.

 

  • For the search string, enter a regular expression for the pattern you want to remove. This might be as simple as index\.html, or a more complex pattern if there are multiple possibilities, for example: (index|default)\.(php|html?)

 

  • For the replace string, leave it blank (to replace with nothing).
  • Save the filter.

Remember that this filter will only affect data going forward. Past data collected with index.php in URLs will not be affected.
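To see what this Search and Replace filter does to a Request URI, the replacement can be sketched in JavaScript. This is only an illustration of the pattern; removeDefaultPage is a hypothetical helper, and Google Analytics applies the real filter on its side, not in your code.

```javascript
// The search string from the filter: a default page name such as
// index.php, index.html, default.htm, and so on.
const DEFAULT_PAGE = /(index|default)\.(php|html?)/;

// Hypothetical helper that mimics "replace with nothing" on the Request URI.
function removeDefaultPage(requestUri) {
  return requestUri.replace(DEFAULT_PAGE, '');
}

console.log(removeDefaultPage('/index.html'));
console.log(removeDefaultPage('/hearts/index.php?sort=az'));
console.log(removeDefaultPage('/hearts/queen/'));
```

Note that the pattern is not anchored, so it would also match part of a page such as /reindex.html. Tighten it if your site has URLs like that.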

 

Query Parameters

The “query string” or “query parameter” portion of the URL is the part after the question mark:

http://www.ThesisScientist.com.com/reservation.php?status=completed&sort=az&sessionid=123456

 

The query parameters consist of name-value pairs joined by equal signs, with multiple parameters separated by ampersands. Sometimes, they tell you something really valuable about the content of the page or the choices of the user (status=completed) but other times they’re unimportant (sort=az) or even detrimental to your data (sessionid=123456).

 

To see how query parameters can muck up your page data, consider that last example, sessionid. Web servers sometimes use a session ID or visitor ID to keep track of the user’s state in a process.

 

This number is used internally by the web server, but you don’t care about it, and furthermore, it’s different for every single browser session. This is like your capitalization problem from before, but instead of two ways to capitalize the URL, now there are hundreds or thousands!

 

Fortunately, you can remove query parameters you’d like to get rid of. There’s a view setting in Google Analytics that allows you to remove query parameters from the URL, so that you can see one consistent URL with just the query parameters that you want to keep around. Or, if you want to do away with all query parameters, you could use a simple filter.

 

Tip  Sometimes it might make sense to have multiple views for the site with different settings or filters on the query parameters. For example, one view could keep more detailed query parameter information for more specific reporting (when you really want to know if a user sorted from A to Z or Z to A), while another could remove those parameters for better high-level information about pages.

 

Before getting started, you’re going to need a list of the query parameters you’d like to eliminate. The part of the query parameter you are interested in is the name (that is, the part before the equals sign)—for example, status, sort, or sessionid in the example URL. How do you go about compiling this list?

 

If you have existing data in Google Analytics, that’s one place to begin. (It can be helpful to export the pages into a spreadsheet and use formulas to break apart the query parameters based on delimiters like ?, &, and =.) Looking into the documentation of your content management system or other software used to manage your website may also give insights into query parameters and how they are used. Once you’ve established a list of query parameters you’d like to eliminate, you can remove them.
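If you would rather script this than use spreadsheet formulas, the same breakdown can be sketched in a few lines of JavaScript. The page paths below are made-up sample data standing in for an export of the Page dimension, and countParameterNames is a hypothetical helper.

```javascript
// Count how often each query parameter name appears in a list of page paths.
function countParameterNames(pages) {
  const counts = new Map();
  for (const page of pages) {
    const queryStart = page.indexOf('?');
    if (queryStart === -1) continue; // this page has no query string
    const params = new URLSearchParams(page.slice(queryStart + 1));
    for (const name of params.keys()) {
      counts.set(name, (counts.get(name) || 0) + 1);
    }
  }
  return counts;
}

// Sample data only; replace with the pages exported from your own reports.
const samplePages = [
  '/reservation.php?status=completed&sort=az&sessionid=123456',
  '/reservation.php?status=pending&sessionid=654321',
  '/index.html',
];

const parameterCounts = countParameterNames(samplePages);
console.log(Object.fromEntries(parameterCounts));
```

Sorting the resulting names by count is a quick way to spot parameters such as session IDs that appear on nearly every URL.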

 

Tip  One type of query parameter you might encounter relates to internal site search. For example, in http://thesisscientist.com/searchresults.php?q=hats, the query parameter q=hats could indicate that the user searched for the term “hats” using your site’s search.

 

Don’t remove these with filters or the view setting—they’re valuable information, and Google Analytics has a special set of reports for dealing with them (and then you can strip them out of the URLs, too). You’ll see how in a later section in this blog.

 

USE THE VIEW SETTING TO REMOVE SPECIFIC QUERY PARAMETERS


Google Analytics provides a view setting that lets you selectively remove query parameters from URLs.

 

  • In the Admin area of Google Analytics, with the appropriate account, property, and view selected, choose View Settings in the third column to edit the settings for the view.

 

  • Scroll down to the setting Exclude URL Query Parameters.

 

  • Enter the names of query parameters you’d like to exclude, separated by commas. Note that you should not enter ampersands, equals signs, or other delimiters, merely the name of the query parameter.

 

  • Save the changes to the settings.

 

Remember that this setting will only affect data going forward. Past data collected with query parameters in URLs will not be affected.
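Before changing the setting, it can help to preview what your URLs would look like with a given exclusion list. The following is a rough sketch of the effect, not Google Analytics’ actual implementation; excludeQueryParameters is a hypothetical helper.

```javascript
// Drop the named query parameters from a Request URI and keep the rest.
function excludeQueryParameters(requestUri, excluded) {
  const queryStart = requestUri.indexOf('?');
  if (queryStart === -1) return requestUri; // nothing to remove
  const path = requestUri.slice(0, queryStart);
  const kept = requestUri
    .slice(queryStart + 1)
    .split('&')
    .filter((pair) => !excluded.includes(pair.split('=')[0]));
  // Only keep the question mark if at least one parameter survives.
  return kept.length > 0 ? path + '?' + kept.join('&') : path;
}

const cleaned = excludeQueryParameters(
  '/reservation.php?status=completed&sort=az&sessionid=123456',
  ['sort', 'sessionid']
);
console.log(cleaned);
```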

 

USE A FILTER TO REMOVE ALL QUERY PARAMETERS

This filter removes everything after the question mark character (?) in your URLs.

 

In the Admin area of Google Analytics, with the appropriate account, property, and view selected, choose Filters in the third column to see the filters for the view. Select the New button to add a new filter.

 

  • Choose to create a new filter (rather than apply an existing one).
  • Give the filter a name: “Remove All Query Parameters”.
  • Choose the Custom tab.

 

  • Choose Search and Replace as the filter type.
  • Choose Request URI as the field.

 

  • For the search string, enter the following regular expression: \?.*
  • For the replace string, leave it blank (to replace with nothing).
  • Save the filter.

 

Caution  This filter removes everything in a URL after the question mark character. Ensure that you have at least one other view with full, unfiltered URLs should you ever need them. Remember that this filter will only affect data going forward. Past data collected with query parameters in URLs will not be affected.
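As a quick check of the \?.* pattern, here is the same replacement in JavaScript. The helper name removeAllQueryParameters is hypothetical and exists only for this illustration.

```javascript
// \? matches the literal question mark; .* matches everything after it.
function removeAllQueryParameters(requestUri) {
  return requestUri.replace(/\?.*/, '');
}

console.log(removeAllQueryParameters('/reservation.php?status=completed&sort=az'));
console.log(removeAllQueryParameters('/index.html'));
```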

 

Capturing the URL Fragment

You may have noticed in previous examples, or in your Google Analytics data, that the Page dimension in Google Analytics includes the path and query string of the URL (/tea-party/index.html?location=wonderland in this example), but not the fragment (the part after the hash mark):

http://www.ThesisScientist.com.com/tea-party/index.html?location=wonderland#agenda

 

The fragment (#agenda, in this example) is often used to link to a particular section of a page. Google Analytics discards the fragment by default. If you’d like to capture it, you can grab it with a variable in Google Tag Manager and override the default page URL with the fragment appended. Here’s how.

 

Note  If URL fragment changes result from an AJAX application, you will need history listeners to trigger tags that capture that information. The example described here merely captures the fragment in the page’s URL to be recorded when an existing tag is triggered.

 

USE A Google Tag Manager VARIABLE TO CAPTURE THE URL FRAGMENT

First you’ll need to set up a variable to get the URL fragment.

  • Create a new variable in Google Tag Manager.
  • Choose URL as the variable type.

 

  • Choose Fragment as the component.
  • Save the variable, giving it a name: “Page URL Fragment”.

 

Now you can alter the Google Analytics pageview tag to include the fragment in the URL.

  • In Google Tag Manager, edit the Google Analytics – Pageview tag.

 

  • Select Configure Tag to make changes to the Google Analytics tag settings.
  • In the Fields to Set section, add a new field.

 

  • Select page from the drop-down as the field name.

 

  • For the field value, you’re going to construct a new URL that combines the path from the built-in {{Page Path}} variable with the {{Page URL Fragment}} variable you just created: {{Page Path}}#{{Page URL Fragment}}

 

Note  If there’s no fragment, note that this tag would still append # to the end of the URL. You can easily use a Search and Replace filter in Google Analytics to remove the trailing # if there’s no fragment. You could also make a clever use of variables in Google Tag Manager to handle this.

 

Save the changes to the tag. After publishing these changes, Google Tag Manager will include the fragment with each page URL sent to Google Analytics.
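One way to avoid the trailing # when there is no fragment is to build the value conditionally. The sketch below shows the logic as a plain function; pageWithFragment is a hypothetical name, and the two arguments stand in for the values of {{Page Path}} and {{Page URL Fragment}}.

```javascript
// Append "#fragment" only when a fragment is actually present.
function pageWithFragment(pagePath, fragment) {
  return fragment ? pagePath + '#' + fragment : pagePath;
}

console.log(pageWithFragment('/tea-party/index.html', 'agenda'));
console.log(pageWithFragment('/tea-party/index.html', ''));
```

In Google Tag Manager, a Custom JavaScript variable is written as an anonymous function that returns a value, so the same conditional would be wrapped in that form and would reference the two variables directly instead of taking arguments.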

 

Viewing Hostnames for Subdomains and Cross Domains

You can use filters in Google Analytics to prepend the hostname value to the Page dimension so that these reports become much more easily digestible.

 

You’ll do this using the Advanced filter type, which you haven’t seen so far in the examples in this blog. The Advanced Filter lets you take two different fields in Google Analytics (Field A and Field B), extract information from them, and output that information to a field (Output).

 

Field A and Field B each take a regular expression. Any part of the regular expression enclosed in parentheses captures that part of the field to use in the output.

 

The Output takes text as well as variables that refer back to the information captured in Fields A and B. The variables look something like $A1, where $ is the signal for a variable, A refers to field A, and 1 refers to the first set of parentheses in the regular expression.

 

In this way, you can combine data from two fields into one, which is exactly what you’d like to do with the URL’s hostname and path. Let’s walk through it.

 

USE A Google Analytics FILTER TO PREPEND HOSTNAME TO URL

In the Admin area of Google Analytics, with the appropriate account, property, and view selected, choose Filters in the third column to see the filters for the view. Select the New button to add a new filter.

 

  • Choose to create a new filter (rather than apply an existing one).
  • Give the filter a name: “Prepend hostnames to URL”.

 

  • Choose the Custom tab.
  • Choose Advanced as the filter type. (See the example earlier in the blog for cross-domain tracking for more on how the Advanced filter works.)

 

  • For Field A, select Request URI in the drop-down menu. Enter the regular expression (.*). This will select the entire contents of the Request URI field, and enclosing it in parentheses lets you use it later in the output.

 

  • For Field B, select Hostname in the drop-down. Enter the regular expression (.*). This will select the entire contents of the Hostname field, and enclosing it in parentheses lets you use it later in the output.

 

  • For the Output, select Request URI (since you want to overwrite the existing URL with your version). Enter $B1$A1. $B1$A1 calls back to the previous fields. $B1 selects the first set of parentheses in Field B (the hostname), and $A1 selects the first set of parentheses in Field A (the URL path). Since you’ve run them right together with no spaces or punctuation, that’s how they’ll look in GA’s reports, for example, www.ThesisScientist.com.com/page.html.

 

  • Check the boxes for Field A Required, Field B Required, and Override Output Field.
  • Save the filter.

 

Remember that this filter will only affect data going forward. Past URLs will not be changed in reports.
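The way the Advanced filter combines its fields can be mimicked with ordinary capture groups. This is a simplified model for illustration only: the hit object and the applyAdvancedFilter function are hypothetical and are not part of any Google Analytics API.

```javascript
// Model of the "Prepend hostnames to URL" filter.
function applyAdvancedFilter(hit) {
  const a = /(.*)/.exec(hit.requestUri); // Field A: Request URI
  const b = /(.*)/.exec(hit.hostname);   // Field B: Hostname
  if (!a || !b) return hit;              // both fields are required
  // Output: $B1$A1 written back to Request URI (override output field).
  return { ...hit, requestUri: b[1] + a[1] };
}

const filteredHit = applyAdvancedFilter({
  hostname: 'www.ThesisScientist.com.com',
  requestUri: '/page.html',
});
console.log(filteredHit.requestUri);
```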

 

Site Search


Almost every site has a way for users to search for content within the site. Depending on the type of site, this search might be used to find articles, products, or whatever types of information are available.

 

Google Analytics has a subset of reports in Behavior ➤ Site Search devoted to measuring searches that users perform and how successful they are in finding answers.

 

To enable these reports, you need to tell Google Analytics where to find the information about what a user searched for. There are three possible scenarios a site’s search may fall into:

 

  • The search term is contained within a query parameter. For example, a search for “hats” might result in a URL like http://www.ThesisScientist.com.com/searchresults.php?q=hats. This is GA’s default assumption and is true for many site search engines. It is easily set up with a view setting in Google Analytics.

  • The search term is contained in the URL, but not in a query parameter: http://www.ThesisScientist.com.com/search/hats. Using a filter in Google Analytics, you can pull the term “hats” out of the URL and place it in the appropriate field (the Search Term dimension).

  • The search term is not contained in the URL at all: http://www.ThesisScientist.com.com/searchresults.php. In this situation, you need to capture the term in the tag in Google Tag Manager, since it’s not already part of the URL.

 

In addition to the search term, Google Analytics can also optionally capture a category, which typically represents a way of limiting or restricting the search to a subset of content on the site. Like the search term, this may be represented in a query parameter, other information in the URL, or not in the URL. In the URL, a category might be something like the color parameter in this URL:

http://www.ThesisScientist.com.com/searchresults.php?q=hats&color=violet

 

SET UP SITE SEARCH WITH A QUERY PARAMETER

Suppose your search results URLs look like /searchresults.php?q=hats&color=violet, where “hats” is the search term.

 

In the Admin area of Google Analytics, with the appropriate account, property, and view selected, choose View Settings in the third column to see the settings for the view.

 

  • At the bottom, in the section labeled Site Search Settings, turn site search tracking on.

 

  • Specify where to find the site search term in the URL:
  • Enter the name of the query parameter. In this example, it is q. Note that you do not need to enter ampersands, equals signs, or other delimiters, merely the name of the query parameter.

 

  • (Recommended) Check the box to strip this query parameter out of URLs. Once this data is in the Site Search reports, you don’t need to see it in the URLs.

 

  • (Optional) Specify where to find the site search category in the URL by turning site search categories on and entering the query parameter. In the example URL, the category was color.
  • Save the changes to the settings.

 

Remember that this setting will only affect data going forward. Past data collected with site search parameters will not backfill the Site Search reports.

 

SET UP SITE SEARCH WITH INFORMATION IN THE URL

Suppose your search results URLs look like /search/hats, where “hats” is the search term. Here’s how you can use a filter to pull the value out of the URL and into the appropriate field.

 

In the Admin area of Google Analytics, with the appropriate account, property, and view selected, choose Filters in the third column to see the filters for the view. Select the New button to add a new filter.

  • Choose to create a new filter (rather than apply an existing one).
  • Give the filter a name: “Site Search Term”.

 

  • Choose the Custom tab.
  • Choose Advanced as the filter type. (See the example earlier in the blog for cross-domain tracking for more on how the Advanced filter works.)

 

  • For Field A, select Request URI in the drop-down. Enter a regular expression to extract the term portion of the URL (enclosing the desired part of the URL in parentheses). In this example, you might use the following: ^/search/(.*)
  • Leave Field B blank. You don’t need a second field for this filter.

 

  • For the Output, select Search Term. Enter the variable from Field A that corresponds to the search term; in this example: $A1
  • Check the boxes for Field A Required and Override Output Field.
  • Save the filter.

 

Remember that this filter will only affect data going forward. Past data will not be backfilled into the site search reports. Like the setup using query parameters in the previous section, you can also capture a category with this method (if applicable to your site search). Create another filter and select Site Search Category as the field, and alter the regular expression to match the relevant portion of the URL.
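It is worth testing the extraction pattern against a few of your own URLs before saving the filter. A minimal JavaScript check, using a hypothetical extractSearchTerm helper and the ^/search/(.*) pattern from the steps above, might look like this:

```javascript
// Field A pattern: capture everything after /search/ as the search term.
function extractSearchTerm(requestUri) {
  const match = /^\/search\/(.*)/.exec(requestUri);
  return match ? match[1] : null; // null models "Field A did not match"
}

console.log(extractSearchTerm('/search/hats'));
console.log(extractSearchTerm('/products/hats'));
```

Because (.*) is greedy, anything after the term (such as a trailing query string) is captured too. If your search URLs carry extra segments, a narrower pattern such as ([^/?]*) may be a better fit.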

 

CAPTURE SITE SEARCH INFORMATION USING Google Tag Manager


In the final scenario, you have a search results page URL like /searchresults.php, where the site search information isn’t in the URL at all, so you’ll have to supply it in some alternative way. You can include this information in the data layer declaration:

 

dataLayer = [{ 'searchTerm': 'hats' }];

 

First, you’ll create a data layer variable for this information. Then, you can override the default URL captured in the Google Analytics pageview tag to include this information.

 

Note  Alternatively, the search term could be extracted from the page’s content using a DOM Element variable. However, variables based on page content can be subject to changes in layout and appearance, so a data layer variable is preferred when possible.

 

  • In Google Tag Manager, create a new variable.
  • For the variable type, choose Data Layer Variable.

 

  • For the data layer variable name, enter searchTerm (or whatever label you used for the search term in the data layer).
  • Save the variable, giving it a name: “Search Term”.

 

  • Now you can create a tag that uses this variable to augment the URLs that you send to Google Analytics for search results pages. In Google Tag Manager, make a copy of the Google Analytics – Pageview tag. You can copy a tag by selecting it to edit, then select the Copy button at the bottom right.

 

  • Select Configure Tag to make changes to the Google Analytics tag settings.
  • In the Fields to Set section, add a new field.

 

  • Select page from the drop-down as the field name.
  • For the field value, you’re going to construct a new URL that includes the site search term from the built-in {{Page Path}} variable and the {{Search Term}} variable you just created: {{Page Path}}?q={{Search Term}}

 

  • Select Fire On to make changes to the tag’s triggers.
  • Remove the All Pages trigger. You only want this tag to fire on search results pages.
  • Select Some Pages.

 

  • Create a new trigger called “Search Results Pages Only”, where the page path matches your search results page(s). For this example: {{Page Path}} – equals – /searchresults.php
  • Save the tag, giving it a name: “GA – Pageview – Search Results”.

 

Note   Since you’ve created a specific pageview tag with a trigger only for search results pages, you’ll also want to add this trigger as a blocking trigger for the general Google Analytics pageview tag so that you don’t double-count.

 

After you publish this change to the Google Tag Manager container, your search results pages will send URLs to Google Analytics that look like /searchresults.php?q=hats. From here, you can proceed with the same setup described earlier in this section for query parameters, using q as the query parameter.
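The URL that the tag ends up sending can be sketched as follows. The dataLayer array mirrors the declaration shown earlier; searchResultsPage is a hypothetical helper, and the call to encodeURIComponent is an added assumption (the field value in the steps above concatenates the term as-is) so that terms containing spaces or ampersands do not break the query string.

```javascript
// Stand-in for the page's data layer declaration.
const dataLayer = [{ searchTerm: 'hats' }];

// Mirrors the field value {{Page Path}}?q={{Search Term}}.
function searchResultsPage(pagePath, searchTerm) {
  if (!searchTerm) return pagePath; // nothing to append without a term
  // Encoding is an assumption added for this sketch.
  return pagePath + '?q=' + encodeURIComponent(searchTerm);
}

const page = searchResultsPage('/searchresults.php', dataLayer[0].searchTerm);
console.log(page);
```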

 

Grouping Content

In the previous section, you looked at a variety of ways to clean up URLs so that each one corresponds precisely to a page, without needing to aggregate across variations of a URL. Sometimes you also have a need to group pages at higher levels: by category, page type, or other classifications. Google Analytics has a number of tools to address this. Sometimes groupings may be obvious in the structure of URLs:

 

  • http://thesisscientist.com/hearts/queen/ 
  • http://thesisscientist.com/hearts/king/ 
  • http://thesisscientist.com/spades/ace/
  • http://thesisscientist.com/clubs/deuce/

 

In this case, the Content Drilldown report (located in Behavior ➤ Site Content) groups URLs together based on these subdirectories. You can drill down through up to four levels of subdirectory in this report.

 

In other cases, you may wish to make groupings based on other information in the URL (such as a query parameter) or information not present in the URL at all. You can use a feature in Google Analytics called Content Grouping to accomplish this.

 

Content Groupings are created in a view. Each view can have up to five Content Groupings—that is to say, five different sets of categories for grouping pages (with an unlimited number of categories within each grouping). Useful ways of grouping content will depend on your site and what kinds of content it contains, but common types of content groupings include the following:

 

  • By topic or product category
  • By content type or page template (landing pages, article pages, product pages, etc.)
  • By author, publication date, or other content metadata
  • By qualities of the page content (word count, contains a video, etc.)

 

Once the Content Groupings have been created, they are available to group pages in the Behavior ➤ Site Content ➤ All Pages report (and also other reports with Page as the primary dimension) as well as in the navigational reports such as the Behavior Flow report.

 

Content Groupings can be defined in three ways: by extraction, by rule definitions, and by tracking code. Extraction and rule definitions allow you to use patterns based on the page’s URL or title, while the tracking code option allows grouping based on categories you supply (often provided in the data layer). The following sections explain each type of content grouping with examples of the situations in which it can be used.

 

Grouping by Extraction

Extraction uses regular expressions to extract group names from the page’s URL or title (or from custom dimensions). So, for example, in the following URLs, you might wish to extract the product type (hats, teacups, etc.):

 

  • http://thesisscientist.com/products?type=hats&id=123 
  • http://thesisscientist.com/products?type=teacups&id=456 
  • http://thesisscientist.com/products?type=croquet&id=789

 

The trick is to write a regular expression to match this pattern and to capture only the part of the URL you want. The portion to be captured is expressed in parentheses. In this case, the following regular expression might do what you need: /products\?type=([^&]*) (The question mark is escaped with a backslash because an unescaped ? has a special meaning in regular expressions.)

 

Recall that GA’s Page dimension does not contain the hostname, so you begin with the URL path starting with /products. The [^&] matches all characters that aren’t ampersands, and the * says 0 or more of them. However, you should be careful: what happens if the type parameter isn’t always the first one?

 

Consider a URL like this: /products?id=123&type=hats

 

You want to be as specific as you need to be, and no more. A better choice for the regular expression, to match either possibility, would be the following: type=([^&]*). Now you’re able to capture the type parameter’s value no matter where it appears in the URL.

 

Notice that, with the extraction method, you are specifying the location in the URL where the content group label appears, but you never have to specify the labels themselves. They’re already there in the URL, and this pattern will match as many different category names as appear in the URLs.

 

Tip  The regular expression pattern shown earlier is especially useful in creating content groups by extraction. It’s known as “character class negation”: [^something], most often used with ampersands, slashes, and other common delimiters in URLs to match all characters except those listed.
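The difference between the stricter and the looser pattern is easy to verify with a short script. This is just a sketch for experimenting with the expressions; extractGroup is a hypothetical helper, and null stands for a page where no group could be extracted.

```javascript
// Stricter pattern: type must directly follow /products?
const STRICT = /\/products\?type=([^&]*)/;
// Looser pattern: type may appear anywhere in the URL.
const FLEXIBLE = /type=([^&]*)/;

function extractGroup(pattern, page) {
  const match = pattern.exec(page);
  return match ? match[1] : null;
}

const typeFirst = '/products?type=hats&id=123';
const typeLast = '/products?id=123&type=hats';

console.log(extractGroup(STRICT, typeFirst), extractGroup(STRICT, typeLast));
console.log(extractGroup(FLEXIBLE, typeFirst), extractGroup(FLEXIBLE, typeLast));
```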

 

SET UP CONTENT GROUPING BY EXTRACTION

Let’s set up a content grouping by extraction, based on the preceding example URLs. 

  • In the Admin area of Google Analytics, with the appropriate account, property, and view selected, choose Content Grouping in the third column to see the content groupings for the view.

 

  • Select the button to add a new content grouping. 
  • Enter a name for the content grouping (the label that will appear for this set of categories in the report menus): “Product Category”.

 

  • Choose to group by extraction.
  • For this example, select Page as the dimension.

 

  • Enter the regular expression: type=([^&]*)
  • Notice that the part you’d like to extract should be in parentheses.

 

  • Choose Save to save the content grouping.

 

The new content grouping will be available immediately as a choice in the drop-down in reports, but the categories will only be filled in for data collected from the point the content grouping was created going forward.

 

SET UP CONTENT GROUPING BY RULE DEFINITIONS

Let’s set up a content grouping by rule definitions, based on the /hearts/, /spades/, and /clubs/ example URLs from the start of this section.

  • In the Admin area of Google Analytics, with the appropriate account, property, and view selected, choose Content Grouping in the third column to see the content groupings for the view.

 

  • Select the button to add a new content grouping.
  • Enter a name for the content grouping (the label that will appear for this set of categories in the report menus): “Suit Color”.

 

  • Choose to group by rule definition, selecting the option to add a new rule set.
  • Name the rule set “Red” for this example.

 

  • For the first criterion, select Page as the dimension.
  • Select contains as the matching option.

 

  • Enter hearts as the text to match.
  • Choose to add the second criterion by selecting OR, then repeat the previous steps, choosing “Page contains diamonds”.

 

  • Select Done to finish the first rule set.
  • Add another rule set named “Black” and repeat the previous steps for “Page contains spades” OR “Page contains clubs”.
  • Choose Save to save the content grouping.

 

The new content grouping will be available immediately as a choice in the drop-down in reports, but the categories will only be filled in for pages from the point the content grouping was created going forward.
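The rule sets defined in these steps behave roughly like the following lookup. This is a simplified model for illustration, assuming that the first matching rule set wins; suitColorRules and suitColor are hypothetical names.

```javascript
// Each rule set: a group name and the "Page contains" criteria joined by OR.
const suitColorRules = [
  { group: 'Red', contains: ['hearts', 'diamonds'] },
  { group: 'Black', contains: ['spades', 'clubs'] },
];

// Return the first group whose criteria match, or null if none do.
function suitColor(page) {
  for (const rule of suitColorRules) {
    if (rule.contains.some((text) => page.includes(text))) {
      return rule.group;
    }
  }
  return null;
}

console.log(suitColor('/hearts/queen/'));
console.log(suitColor('/clubs/deuce/'));
console.log(suitColor('/blog/'));
```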

 

Grouping by Tracking Code

In both of the previous methods, content groupings were based on information contained in the URL or other data already in Google Analytics. However, in some cases, you’d like to group by other information. Consider the following URLs:

 

  • http://thesisscientist.com/blog/2015/09/22/guide-rules-flamingo-croquet/ 
  • http://thesisscientist.com/blog/2015/10/25/wheres-that-cat/ 
  • http://thesisscientist.com/blog/2015/11/13/red-queen-proclamation/

 

This structure is pretty common on blog or news articles. If you were looking for the publication date, that would be easy enough to extract with a regular expression (see the earlier section). What if you wanted to group based on, say, the article’s author? Or what about article length?

 

Grouping based on tracking code allows you to supply the label in the page’s tags, by having a Google Tag Manager variable with the correct label. The source of this variable would typically be one of two options:

 

Data layer variable: For information that comes from the website’s content management system or another source. In the preceding example, the content management system knows the author of the article, and so you could include that in the data layer declaration for the page:

dataLayer = [{ 'author': 'Alice' }];

 

This approach can be used for any kind of content metadata that is available from the content management system or another source when the page is rendered.

 

Custom JavaScript variable: To check for a condition or make a calculation. In the preceding example, if you wanted to create a content grouping based on article length, you could have some JavaScript code to find the article section of the page, split it by spaces and other punctuation, measure the length, and bucket into categories (like 0–100, 101–200, etc.).

 

Similarly, custom JavaScript variables could check for the presence or number of images, videos, or other kinds of media in the article, or any other information that is present in or can be calculated from the page’s content.
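As a sketch of the article-length idea, the bucketing step could look like the function below. It takes the article text as a string; locating the article element on the page is left out because it depends on your templates. The name wordCountBucket and the default bucket size of 100 are assumptions for this example.

```javascript
// Turn a block of text into a label such as "0-100" or "101-200".
function wordCountBucket(text, bucketSize = 100) {
  // Split on anything that is not a letter, digit, or apostrophe.
  const words = text.split(/[^A-Za-z0-9']+/).filter(Boolean);
  const upper = Math.max(1, Math.ceil(words.length / bucketSize)) * bucketSize;
  // The first bucket starts at 0; later buckets start one past the previous upper bound.
  const lower = upper === bucketSize ? 0 : upper - bucketSize + 1;
  return lower + '-' + upper;
}

console.log(wordCountBucket('Off with their heads!'));
console.log(wordCountBucket(Array(150).fill('word').join(' ')));
```

In Google Tag Manager, this logic would sit inside a Custom JavaScript variable that reads the article text from the page and returns the label.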

 

Caution  Although Content Groupings are set up in each view in Google Analytics, Content Grouping by tracking code would apply across all views in the property (since the data all comes from the same set of tracking code for the property, you can only supply one value in that slot). Map out how you will be using each of the five Content Groupings across the views within a property to ensure you’re not stepping on your own toes anywhere.

 

SET UP CONTENT GROUPING BY TRACKING CODE


First, you’ll set up the Content Grouping in Google Analytics. In the Admin area of Google Analytics, with the appropriate account, property, and view selected, choose Content Grouping in the third column to see the content groupings for the view. Select the button to add a new content grouping.

 

  • Enter a name for the content grouping (the label that will appear for this set of categories in the report menus): “Author”.

  • Choose to group by tracking code.
  • Set the Enable setting to On.
  • Choose the slot you’d like to use (1–5) from the index drop-down.
  • Save the content grouping.

 

Next, you’ll need a variable in Google Tag Manager that contains the category name for the content grouping. This example assumes you are using a data layer variable, but you could also use a custom JavaScript variable.

 

  • Create a new variable in Google Tag Manager.
  • Select Data Layer Variable as the type.

 

  • Enter author as the name (in this example; use whatever name you are using in your data layer).
  • (Optionally) Set a default value by checking the box and entering a value. Otherwise, if the data layer doesn’t specify a category, the value will be blank and Google Analytics reports will show “(not set)”.

 

  • Save the variable, giving it a name: “Author”.
  • Finally, you’re ready to alter the Google Analytics pageview tag to include the content grouping.

 

  • In Google Tag Manager, edit the Google Analytics – Pageview tag.
  • Select Configure Tag to make changes to the Google Analytics tag settings.

 

  • In the Content Groups section, add a new content group.
  • For the index, enter the index number you selected when you created the grouping in Google Analytics (1–5).

 

  • For the Content Group, enter the {{Author}} variable you just created.
  • Save the tag.

 

When you publish the changes to the Google Tag Manager container, content group data will be passed for each page to Google Analytics. The new content grouping will be available immediately as a choice in the drop-down in reports in Google Analytics, but the categories will only be filled in for data collected from the point the content grouping was created going forward.

 

Other Applications for Filters

The previous section looked at a number of the most common applications of filters and other settings for cleaning up data in Google Analytics, as well as a bevy of workarounds using Google Tag Manager to capture additional information. Filters have many potential uses, but the previous examples capture most of the common use cases.

 

The other most common purpose for filters is to create views with only a subset of the site’s data. This is useful especially when the parts of a site are operated or marketed semi-independently from one another.

 

For example, if there are separate marketing teams for different product lines or brands, each team may desire to have a view with reports that only contain the parts of the site under their influence. Then they are also able to filter and group the data for the view in the ways that are most useful to them and set up conversion goals that are relevant.

 

Note  Keep in mind that the base limit for the number of views per property is 25. This limit can be raised for Google Analytics Premium subscribers and in some cases for other users of Google Analytics.

 

The most common ways to divide a site into these kinds of functional areas are by subdirectories or (sub)domains:

 

Subdirectories: You want to divide www.ThesisScientist.com/hotels/ and www.ThesisScientist.com/amusement-parks/ into separate views.

 

(Sub)domains: You want to divide hotels.ThesisScientist.com and parks.ThesisScientist.com into separate views.

If more complex criteria are needed to specify the pages or hostnames in the view, the custom Include filter can be used with regular expressions.
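Because filters only affect data going forward, it can help to try a pattern against sample values before adding it to a view. The sketch below does that in TypeScript under several assumptions: the `/resorts/` directory, the sample hits, and the helper `applyIncludeFilter` are made up for illustration, and JavaScript's regular expression engine is only a stand-in for the one Google Analytics uses, so the result should still be confirmed in a test view.

```typescript
// Hypothetical hits; hostnames and paths are illustrative only.
interface Hit {
  hostname: string;
  requestUri: string;
}

// Assumed pattern for a "lodging" view: keep pages under /hotels/ or /resorts/.
// "/resorts/" is a made-up directory for this example.
const requestUriPattern = /^\/(hotels|resorts)\//i;

// Assumed pattern for a view limited to two subdomains.
const hostnamePattern = /^(hotels|parks)\.thesisscientist\.com$/i;

// An Include filter keeps a hit only when the chosen field matches the pattern.
function applyIncludeFilter(hits: Hit[], field: keyof Hit, pattern: RegExp): Hit[] {
  return hits.filter((hit) => pattern.test(hit[field]));
}

const sampleHits: Hit[] = [
  { hostname: "www.thesisscientist.com", requestUri: "/hotels/paris/" },
  { hostname: "www.thesisscientist.com", requestUri: "/resorts/alps/" },
  { hostname: "www.thesisscientist.com", requestUri: "/amusement-parks/rides/" },
  { hostname: "hotels.thesisscientist.com", requestUri: "/" },
  { hostname: "blog.thesisscientist.com", requestUri: "/hotels-review/" },
];

// Hits that would remain in a view filtered on Request URI.
const keptByUri = applyIncludeFilter(sampleHits, "requestUri", requestUriPattern);

// Hits that would remain in a view filtered on Hostname.
const keptByHost = applyIncludeFilter(sampleHits, "hostname", hostnamePattern);
```

Anchoring the pattern with `^` and a trailing `/` keeps a page such as `/hotels-review/` out of the lodging view, which is the kind of edge case worth checking against real URLs from your reports.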

 

Filtered Views vs. Segments

GA’s reporting interface includes a feature called Segments. On its face, applying a segment to a report seems to do much the same thing as filtering a view: you can include or exclude sessions in the segment based on a long list of dimension and metric criteria. When should you use a segment on a report, and when should you use a filter on a view?

 

When possible, prefer segments over filters. They have an easy-to-understand interface built into the reports in Google Analytics. No knowledge of regular expressions is necessary, the dimension and metric names match up to those in reports, and no administrative access to the view is needed to create and apply segments. And most usefully, as an in-report feature, segments can be applied to historical data that already resides in your view.

 

However, filters have advantages in some situations and capabilities that are not possible via segments (such as rewriting field values), as you’ve seen in this blog. If there’s data you always want to exclude (like internal traffic), if there’s data you need to change for purposes of cleanup, or if there are functional reasons to create separate sets of data in separate views (as discussed earlier), filters are the appropriate tool.