It shouldn’t really need saying, common sense as it is, but I was recently reminded of just how relevant this mantra is.
On Monday last, a weekly automated client dashboard report from Google Analytics showed an alarming drop in conversion. It had hit its lowest weekly ebb in the several months since we’d started pushing it up via a structured programme of testing and refinement.
First order of the day was to go into GA and have a root around, followed by a quick call to the client to ask whether any changes had been made to the site over the last week.
Fortunately we have all major steps in the customer journey set up as individual goals in GA (as opposed to steps in a single funnel) so by individually trending all goals over the past month the “quick root around” soon revealed which step in the journey was suffering.
The subsequent call to the client was therefore that much more productive as we were able to identify exactly where the problem seemed to have occurred.
Follow-up investigations revealed that a change had been implemented in the name of providing a uniform customer experience across the site. An element of content was present in one area of the site but not in another, and in order to reduce customer confusion it was removed from the area where it appeared, without testing first.
As a result, conversion to the main objective (in this case, filling out a form) took a sharp nosedive.
The change was made for all the right reasons: provide a more uniform on-site experience, reduce confusion and, hopefully, push up conversion as a result. But the change should have been implemented as a test first and not outright. This should have been the case for a few pretty good reasons:
1.) Testing allows changes to be trialled using not just one variant but several, thereby increasing the chances of success.
2.) Testing allows changes to be implemented whilst limiting downside risk, because the sample size of the test is controlled. Starting with a small sample means that if the hypothesis doesn’t hold water, the potential loss of revenue is considerably muted.
3.) Assumption is the mother of all mistakes. In this case, just because something is absent from one part of the site doesn’t mean it should be absent elsewhere for the sake of uniformity. This is a bit of a sticky wicket as, by and large, I’d be inclined to agree with the uniformity hypothesis, but this situation made me realise that even my own “best practice” assumptions need to be challenged from time to time, and here was a perfect example of why.
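The sample-size point in (2) can be made concrete with a quick calculation. The sketch below uses the standard two-proportion normal approximation (a textbook formula, not anything specific to GA or the tools mentioned here) to estimate how many visitors each variant needs before a given uplift becomes detectable:

```python
import math
from statistics import NormalDist

def sample_size_per_variant(p_base, min_effect, alpha=0.05, power=0.8):
    """Approximate visitors needed per variant to detect an absolute
    uplift of `min_effect` over baseline conversion rate `p_base`,
    at two-sided significance `alpha` with the given power."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_var = p_base + min_effect
    p_bar = (p_base + p_var) / 2
    n = ((z_a * math.sqrt(2 * p_bar * (1 - p_bar))
          + z_b * math.sqrt(p_base * (1 - p_base) + p_var * (1 - p_var))) ** 2
         / min_effect ** 2)
    return math.ceil(n)

# Detecting a lift from a 5% to a 6% conversion rate takes thousands of
# visitors per variant; a bigger lift is far cheaper to confirm.
n_small_lift = sample_size_per_variant(0.05, 0.01)
n_big_lift = sample_size_per_variant(0.05, 0.02)
```

Starting with a small sample, as in (2), bounds the downside while numbers like these accumulate.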
In truth, the two comparable areas of content were themselves slightly different in other ways besides the one that was the subject of the change, so it is possible that users of the site were behaving differently in each area anyway, which is why the original hypothesis may not have been correct.
The change was quickly reversed and conversion recovered.
On the upside, some learning did come out of this situation, but at a greater cost than necessary, and for that reason, wherever possible, it is best to test everything.
Well done to The Watch Shop for doing something so obvious and yet so helpful to the customer (and their own online selling efforts).
In recognition of Christmas, which is presumably a peak trading period for them, they have made a decent effort to break down an obvious barrier in the gift buyer’s decision-making process: “What happens if s/he doesn’t like this present? Will I have wasted my money? Can it be returned?”
Happily for the punter, yes it can, and the details of the returns policy are nice and easy to see, at just the point where that thought is likely to crop up in the purchase decision-making cycle.
Google is in the process of rolling out demographic profiling in its ubiquitous analytics tool and, as part of this, is serving up reports that allow users to view site performance and segment data based on age, gender and interests.
The reports are available in the ‘Audience’ menu under the ‘Demographics’ and ‘Interests’ sub-menus but to see the data you first have to put in a little effort.
1. You will need to go into the GA property settings for your site and “enable” the reports from there.
2. You will need to add a line of code to your GA tag or, if you’re using Google Tag Manager, select “Display Advertiser Support”.
3. You’ll need to enable the reports from the GA reporting interface.
How does Google Analytics know this stuff?
Step two above also alludes to some interesting and important points to keep in mind about this data and how it’s collected. One might be forgiven for wondering what display advertising has to do with the collection of demographic data for Google Analytics, but in fact Google gathers this information via its display advertising network, the same network on which it delivers Google Analytics and AdWords retargeted advertising. The point is that Google uses the dreaded and much-maligned third-party cookie, which it places on browsers and which gathers the age, gender and interest data based on the sites that have been visited. In other words, Google is making assumptions about these demographic profiles based on visitor behaviour; it doesn’t actually know this information except where it is specifically supplied by a site, in particular social networking sites (read Google+).
In its explanation of how the data is gathered, Google also makes it clear that, due to the nature of the collection method, it is in fact assessing the age, gender and interests of web browsers rather than of specific people.
So there are three things to keep in mind when using this data in Google Analytics:
- It’s performance data against browser software, not people, that you’re actually looking at in Google Analytics (nothing new there really).
- The profiles are inferred from browsing behaviour rather than declared by users, so treat them as estimates.
- People have some pretty bizarre interests and fantasies which they are more inclined to live out on the web, where they think they can’t be personally identified.
The news from the latest survey by Econsultancy and Lynchpin is that marketeers are baffled, worried, confused and put off by the term “big data”. What’s really telling is that 8% thought it just a “pointless marketing term”.
Any dataset exceeding more than a few MB could be considered “big data”. It would be more helpful, when presented with a dataset or combination of datasets, to ignore whether or not it qualifies as big data and simply focus on where it came from, how it might be used and by whom.
For example, web analytics data + stock control data + marketing spend data = an opportunity to better understand how digital acquisition channels and the on-site customer journey can affect availability, stock control, cost of acquisition, margin and profit levels. It might also be described as “Big Data” if one were so inclined.
The Obama campaign’s effective use of email testing is now legendary and this post highlights some of the more interesting factoids.
In particular it’s the counter intuitive findings that are worth keeping in mind. For example:
- “Best practice” doesn’t always work, so keep an open mind.
- “Ugly” emails might actually do better than ones with lovely designs and graphics.
- Unsubscribes and good outcomes both went up linearly. Positive and negative results can coexist side by side, so don’t be distracted by secondary negative results. From a web analytics standpoint, I equate this to an increasing bounce rate being accompanied by an increase in conversion and/or revenue. I’ve seen it happen many times.
I’d say the key message is that you have to keep testing, even counter-intuitive ideas, and that the testing structure and programme seem almost more important than the content itself.
Google Analytics offers default mobile segments which most people who use it will be aware of.
What they might not be aware of (in my experience), especially if they are Brits, is that the default ‘Mobile’ segment covers BOTH mobile phone and tablet traffic; in other words, the default ‘Tablet’ segment is a subset of ‘Mobile’. I suspect the confusion for Brits lies in the language: we call mobile phones mobiles, whereas in America they are of course referred to as cell phones, so in this case ‘mobile’ is an umbrella term covering both cell phones (to use the American parlance) and tablets.
This may be an important distinction to be aware of when looking at visits that view a mobile site or perhaps even building the case for a mobile site.
The real mobile phone traffic
For people / Brits who want to know how mobile phone traffic looks on their site, there is a simple solution involving a Custom Segment.
Go into Advanced Segments, create a new custom segment and look for the ‘Device Category’ dimension. Type “mobile” into the filter field and select include. Give the segment an appropriate name, save, and you’re done.
To be sure you are looking at the right data, apply your new mobile phone custom segment, the default tablet segment and the default mobile segment to a standard report showing visits. You should find that the visits in your custom mobile phone segment and the default tablet segment tally to match the total count of visits in the default mobile segment.
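As a sanity check, the tally can be expressed as a one-line assertion (the visit counts below are made up for illustration, and the segment names are illustrative labels, not GA field names):

```python
# Hypothetical visit counts pulled from the three segments for the
# same date range.
visits = {
    "default_mobile": 12450,  # GA 'Mobile' segment (includes tablets)
    "default_tablet": 4820,   # GA 'Tablet' segment
    "custom_phone": 7630,     # custom 'Device Category = mobile' segment
}

# The custom phone segment plus the default tablet segment should
# account for every visit in the default mobile segment.
assert visits["custom_phone"] + visits["default_tablet"] == visits["default_mobile"]
```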
I recently came across an odd thing in Google Analytics which, although only a minor frustration, took me a moment to figure out… it wasn’t immediately obvious.
I was comparing year on year data using four advanced segments and looking at the headline change in share of visits for each segment.
When segments are applied, Google Analytics clearly shows the corresponding splits as a percentage of all site visits, something one would naturally want to see when segmenting data… you’re bound to want to know the weight of each segment and whether it’s worth worrying about.
The sample data above demonstrates how clearly this info is displayed. So far so good.
Then came the moment when I wanted to see if the weight of each segment had changed from one year to the next.
Looking at the second sample data set above, all seems fine: for example, UK visits as a percentage of all site traffic appear to have decreased by 10.72% and Europe excluding UK to have increased by 9.38%. Not so.
At the time I was checking this data I happened to have already done these calculations in a dashboard for a client, and my YoY variation data looked quite different. After a bit of brow rubbing followed by two minutes in Excel, the obvious finally occurred to me: Google Analytics is not showing the percentage change as it normally does, by expressing one figure as a percentage variance of the other; it is simply subtracting one from the other. In other words, it’s giving the change in percentage points. Something very different.
A change in percentage points can look comparatively slight next to the true relative change, and if you were using this method (i.e. advanced segments) to look at year-on-year change in something like organic traffic landing on a particular page you had spent time optimising, you might think your effort (and money) had been wasted and give up, or worse still, change what you are doing.
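The two calculations are easy to confuse, so here is the difference in a few lines of Python (the 43.50%/32.78% shares are hypothetical figures chosen to match a 10.72-point drop):

```python
def point_change(share_now, share_then):
    """What GA shows for segment shares: a simple subtraction,
    i.e. the change in percentage points."""
    return share_now - share_then

def percent_change(share_now, share_then):
    """What one would normally expect: the relative change, one share
    expressed as a percentage variance of the other."""
    return (share_now - share_then) / share_then * 100

# A segment falling from 43.50% of all visits to 32.78%:
pts = round(point_change(32.78, 43.50), 2)   # change in percentage points
rel = round(percent_change(32.78, 43.50), 1) # true relative change, far larger
```

The point change works out at -10.72, while the relative change is roughly -24.6%: the same data, two very different headlines.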
The lesson to me… it always pays to take a second look at your GA data when you’re using advanced segments.
Recently I’ve been experimenting with the attribution modelling feature that is steadily being rolled out across the free version of Google Analytics. I’ve been doing this with a particular task in mind, and as I’ve worked with it some doubt has arisen in my mind about how it allocates conversions in certain specific situations.
Attribution modelling is of course all about understanding the true contribution of each referring source of traffic to a website in the context of a specific goal or objective.
The first thing to say is that by and large I think the Attribution Modelling tool in Google Analytics and its reports are helpful in lots of situations.
A little background
GA’s attribution modelling tool provides ‘out of the box’ models based on first click, last click, linear, time decay, last non-direct click, last AdWords click and so on. It also allows users to create their own custom models based on any of the above. These models can be used to compare sources of traffic to see which deliver the most conversions, based on the type and strength of the credit assigned to them.
A word on credit and value
GA’s various attribution models assign value to each channel according to the amount of credit due based on the model in use. So if you have a multi channel conversion funnel like this…
Paid search > organic search > email > affiliate (conversion step)
…and you were comparing the last click model with the first click and linear models, then last click would assign 100% of the credit for the conversion to the affiliate channel, first click would assign 100% to paid search, and linear would assign an even 25% to each of the four channels.
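For illustration, the credit rules for three of these stock models can be sketched in a few lines of Python (a toy model, not GA’s implementation):

```python
from collections import defaultdict

def attribute(path, model):
    """Split one conversion's credit (summing to 1.0) across an ordered
    list of touch-point channels, for three of the stock models."""
    credit = defaultdict(float)
    if model == "first_click":
        credit[path[0]] += 1.0
    elif model == "last_click":
        credit[path[-1]] += 1.0
    elif model == "linear":
        for channel in path:
            credit[channel] += 1.0 / len(path)
    else:
        raise ValueError(f"unknown model: {model}")
    return dict(credit)

funnel = ["paid search", "organic search", "email", "affiliate"]
attribute(funnel, "last_click")   # all credit to affiliate
attribute(funnel, "first_click")  # all credit to paid search
attribute(funnel, "linear")       # 0.25 to each channel
```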
The role of credit issuance (for want of a better way to put it) is important because it can also be applied to custom models in two different ways:
- Using the position based model and assigning a percentage of credit across the first, last and middle positions in the conversion journey
- Assigning a credit level to a custom rule.
A sample problem
Say your marketing mix comprises both paid search and affiliate channels; you want to understand the true value of your affiliate marketing, but at the same time you’re concerned about double-paying for sales leads whose acquisition funnels include both sources. In theory, attribution modelling should help shed light on that.
By comparing data from up to three different models against your preferred dimension, let’s say by medium, you can see how each source performs against the base model. By default the base model is Last Click but it can be changed to other models, in this example First Click.
By looking at the two comparative models it becomes possible to understand which model delivers best in terms of conversion volume compared to the base.
The data below shows three models being compared side by side. Reading from left to right there is the base model, which is first click, followed by two comparative models: the standard linear model and a custom linear model in which affiliate traffic has been given a value of zero, i.e. it has been completely devalued.
The data above indicates that for the model in which the affiliate channel has been completely devalued the fall in conversions attributed to paid search (CPC) when compared to the First Click base model is less severe (-8.86%) than it is when compared to the standard linear model (-12.03%) in which all referring sources including affiliate have an equal share of the credit.
In a sense it’s saying that without affiliate traffic in the marketing mix we can expect paid search to do a better job than it would if affiliate is included in the marketing mix and given an equal share of the credit.
The fishy part
The trouble is, this assumes that while completely devaluing the contribution from the affiliate source we will still harvest exactly the same volume of conversions, and that all that happens is that the credit once ascribed to affiliate traffic is reallocated to other sources.
In reality, if we completely devalue the contribution from affiliate we might expect the actual volume of conversions to fall, on the grounds that affiliate must be responsible for at least some incremental conversions, at the very least in referral funnels where affiliate was the ONLY referring channel.
While the actual numbers in the example data below are blurred out, that doesn’t matter. What matters is that it shows there are instances in which a particular referrer might be the only channel in a funnel.
This is where the problem seems to be. A quick tally of the total volume of conversions across the two comparison models (linear with equal attribution and linear with affiliate completely devalued) shows that the numbers are exactly the same. That logic doesn’t make sense.
Where the affiliate channel was the only channel responsible for the referral (and arguably even in situations where it was the first touch channel) the associated conversions should be subsequently excluded from the dataset. How else could they have occurred?
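The exclusion argued for here can be sketched with a toy renormalising linear model (the funnels below are hypothetical, and this shows the behaviour one might expect, not what GA actually does):

```python
def custom_linear(path, devalued=frozenset()):
    """Linear credit with the channels in `devalued` given zero weight
    and their share redistributed across the remaining channels."""
    kept = [ch for ch in path if ch not in devalued]
    if not kept:
        # Every touch point was devalued (e.g. an affiliate-only funnel):
        # arguably the conversion should drop out of the model entirely,
        # rather than remain credited elsewhere.
        return {}
    return {ch: kept.count(ch) / len(kept) for ch in set(kept)}

# Hypothetical funnels from the conversion-path report:
funnels = [["affiliate"],                      # affiliate-only
           ["cpc", "affiliate"],
           ["cpc", "organic", "affiliate"]]

# Conversions still credited once affiliate is devalued: the
# affiliate-only funnel drops out, so the total falls from 3 to 2.
credited = sum(1 for f in funnels if custom_linear(f, {"affiliate"}))
```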
Additionally, what we cannot tell from attribution modelling is sentiment, i.e. which channel, source, medium or combination thereof was responsible for tipping the consumer into making a purchasing decision. It’s the age-old problem of analytics data: we have the ‘what’ but not the ‘why’. Was it the initial tag line in the paid search advertisement, the meta description copy in the organic listing, or the discount voucher offer from the affiliate link?
Currently there doesn’t seem to be a clear answer but as things stand I would caution against using insight from GA’s attribution modelling tool to completely overhaul the marketing mix.
Google’s switch from Website Optimiser (GWO) to Content Experiments (GCE) has made the testing process significantly easier and, in so doing, the effort of site optimisation more fun and, importantly, more productive.
Until now, the level of effort required to set up an experiment using Website Optimiser was a barrier to action for many. By contrast, Content Experiments makes use of the standard Google Analytics tag that is already in place and, as a result, cuts out much of the configuration hassle associated with GWO.
However, there are differences between the two systems. One which has drawn some attention is the way in which GCE manages tests as they progress. In the example data below, the split of visits between the control page and the variation page is roughly 74/26 in favour of the control page; when the test started out it was a straight 50/50 split. There has been some anguish from other GCE users who have experienced the same issue.
Why is this happening?
Google has evolved its testing methodology from a straight A/B test, in which each of the two variants receives an equal share of the sample traffic throughout the test, to a methodology referred to as a multi-armed bandit approach. Without going into the origins of the name (read about it here), suffice it to say this methodology effectively optimises for performance as the experiment progresses, based on accumulated results.
GCE will take a snapshot of test performance every day and adjust the sample traffic based on performance against the designated goal. When one or other of the pages starts to show itself as the better performer GCE will tweak the split of sample visits to favour the better performing page.
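Google hasn’t published the exact algorithm, but the daily reallocation can be illustrated with Thompson sampling, the classic bandit technique (the visit and conversion totals below are hypothetical):

```python
import random

def bandit_split(stats, draws=10_000):
    """Propose the next day's traffic split: sample a plausible
    conversion rate for each variant from its Beta posterior and
    count how often each variant wins the draw."""
    wins = {name: 0 for name in stats}
    for _ in range(draws):
        sampled = {name: random.betavariate(conversions + 1,
                                            visits - conversions + 1)
                   for name, (visits, conversions) in stats.items()}
        wins[max(sampled, key=sampled.get)] += 1
    return {name: count / draws for name, count in wins.items()}

# Hypothetical running totals of (visits, conversions) per variant:
split = bandit_split({"control": (2000, 100), "variation": (2000, 130)})
# The better-performing variation ends up with the lion's share of
# tomorrow's sample traffic.
```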
The trended sample data above shows how this appears to work. On the more extreme days, highlighted with black arrows, where conversion falls for one variant and rises for the other, it is easier to see how the following day brings a corresponding shift in apportioned visits. In this example there is still the obvious shift ten days in, where the sample split changed radically. There seem to be two likely reasons for this:
- GCE needs to accumulate a statistically significant set of test data before making any changes to sample split.
- In this particular case the sample size was only 5% at the start of the test, but it was increased to 10% on the second day and then to 25% on day seven. Each increase in sample size will itself have affected the statistical validity of the accumulated data.
Accepting (for now) that the approach has been correctly implemented by Google, the benefit is that downside potential is reduced, as sample traffic is diverted away from the under-performing variant and over to the better-performing one.
In most cases a “full version” website looks basically the same on an iPad as it does on a desktop, notebook or netbook PC, Flash notwithstanding. Evidence that the iPad screen format is not really an issue comes from the fact that conversion rates for iPad tend to be the same as, or sometimes even a little better than, the site average.
Navigation is one of the most important aspects of site design and conversion rate optimisation, and it is here that a site can let down iPad users. Take www.forbes.com: it works well on a PC and also defaults nicely to a mobile site for browsers on a phone, but the iPad sits somewhere between the two and, as with many other sites, the Forbes.com default for iPad browsers is the main site, not the mobile site.
Forbes also uses drop-down menus in the global navigation bar that reveal when a cursor hovers over them; visitors then click on the link they want. iPad users do not have a cursor, and they will be waiting a long time if they go for the finger-hovering option. They must instead tap on the area of the screen where they want to take action. Because, in the case of the Forbes site, the main menu only reveals on hover, and because a tap on the relevant section of the menu bar only serves to refresh the page (see below), users with an iPad will be left frustrated.
This is not to say that the hover option is generally bad, just that, for the benefit of visitors using an iPad, a navigation option that leads to an expanded menu should do just that when tapped, i.e. expand the menu rather than refresh the page.