Google Analytics (Not Provided) SSL Solution

Customer privacy is a big consideration when devising a solution for the Google Analytics (Not Provided) problem.

Already fully aware of this issue? Skip straight to the solution.


Google decided to encrypt search query data for logged-in users so that their queries cannot be tracked in Google Analytics. They did this by making SSL Search the default for all logged-in Google users. Their official blog post on October 18, 2011 explained their goal of making search more secure. I’ll spare everyone the conspiracy theories behind the reasons for this major change and stick to the facts.

The end result of this change is that website analytics, whether through Google or another analytics package, have become less accurate. These encrypted results show as (not provided) rather than an actual keyword in Google Analytics. Estimates are all over the place but seem to hover around 10% of search queries being completely anonymous and skewing results. Merchants are no longer able to associate keywords with conversions. SEO providers are no longer able to provide accurate reporting data to their clients. Besides being unable to report specific keyword traffic results, we are even unable to differentiate between longtail results and keywords related to a client’s brand name. Simply put, there is no way to measure ROI for organic search campaigns.

There are a number of hacks to estimate keyword traffic: cross-referencing Google Webmaster Tools, setting up advanced segments, using filters to associate a website’s entry pages with the keywords visitors are expected to use to find them, and estimating traffic sources based on known values. But ultimately, the only true solution needs to come from Google.

Protecting the Privacy of the End User

If we assume that any solution to this reporting problem requires that the user’s personal information, including the search query he or she used to find a webpage, remain private, then it stands to reason that any information used for reporting needs to be aggregated in the background. This means that Google would probably need to collect browsing data anonymously, then send that data back to Google Analytics in such a way that a merchant will not be able to associate specific keywords with a specific user or transaction. This would be nearly impossible for websites with very little traffic or few conversions. However, a reasonable level of anonymity may be reached when there is ample data. For example, if Google Analytics were to report a keyword/transaction pair that led to a sale on a particular day to a website that averages one sale per day, the merchant could very easily associate that keyword with the customer who placed the order. However, if Google were to provide this data for 50 transactions to a website that generates 50 orders per day, the likelihood that an ecommerce store owner would be able to make this association diminishes significantly.
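To put rough numbers on that example, here is a minimal sketch in Python. It rests on a deliberately simple assumption that is mine, not Google's: keyword data arrives as an unordered daily batch, so a merchant who tries to match one keyword to one order by guessing is right about 1 time in n for n orders. The function names and the 5% cutoff are hypothetical, picked only for illustration.

```python
def linkage_chance(orders_in_batch: int) -> float:
    """Chance of matching one keyword to one order by pure guessing.

    Assumes (simplification) that the batch is unordered and every order
    is an equally likely match for the keyword.
    """
    if orders_in_batch < 1:
        raise ValueError("a batch needs at least one order")
    return 1 / orders_in_batch


def safe_to_report(orders_in_batch: int, max_chance: float = 0.05) -> bool:
    """Hypothetical rule: report only when the guessing chance is low enough."""
    return linkage_chance(orders_in_batch) <= max_chance


# The two cases from the paragraph above: 1 order per day vs. 50 per day.
for orders in (1, 50):
    print(orders, linkage_chance(orders), safe_to_report(orders))
```

With one order per day the match is certain, so nothing should be reported; with 50 orders the chance of a lucky guess drops to 2%. A real system would also have to account for clues such as order timing or product-specific keywords, which this sketch ignores.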

The Solution

First, please note that although I know a fair amount of PHP and MySQL, I am by no means what one would consider a developer. I’m just a guy who understands SEO and website analytics, with something of a logical mind to form a solution. This initial plan is far from perfect, and may contain flaws. It is simply meant as a starting point for a discussion to find a real solution that Google, if compelled to do so, may take note of and turn into a well-thought-out plan.

First the diagram, then the explanation:

[Diagram: Solution to Google Analytics Encrypted Search Reporting]

Explanation

Here’s how it works:

  1. The user performs a search query on Google and clicks a link.
  2. Rather than sending that traffic directly to the merchant website, Google runs a small redirect script to capture anonymous data before passing that traffic to the website in question.
  3. Google looks up the UA-XXXXXXXX-X ID associated with the website, if it exists. This ID is stored in a new database table along with all search data (keywords, date, etc.), and a globally unique identifier (GUID) is assigned to that record.
  4. After writing this information to the database, Google passes the GUID along to the website, sans keyword and other data.
  5. In the event that this visitor ends up making a purchase, the website sends the GUID back to Google Analytics along with all purchase data.
  6. Google associates this conversion and all relevant data with the original source keyword and demographic information and stores it in the database.
  7. On a daily basis, Google runs a script that checks whether enough data has been collected to maintain a high degree of privacy. If enough records exist in the database, that data is sent to Google Analytics to replace the (Not Provided) token with the appropriate keywords.
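
The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Google would build it: an in-memory dictionary stands in for the database table, and the names `KeywordStore`, `record_click`, `record_purchase`, `daily_report` and the `min_conversions` threshold are all hypothetical. I am also assuming that "enough data" in step 7 means a minimum number of conversions per website.

```python
import uuid
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ClickRecord:
    property_id: str                  # the site's UA-XXXXXXXX-X ID
    keyword: str                      # search query, never sent to the site
    purchase: Optional[dict] = None   # filled in at step 5 if the visitor buys


@dataclass
class KeywordStore:
    min_conversions: int = 50         # hypothetical privacy threshold
    records: dict = field(default_factory=dict)

    def record_click(self, property_id: str, keyword: str) -> str:
        """Steps 2-4: store the search data, hand back only an opaque GUID."""
        guid = str(uuid.uuid4())
        self.records[guid] = ClickRecord(property_id, keyword)
        return guid

    def record_purchase(self, guid: str, purchase: dict) -> bool:
        """Steps 5-6: attach purchase data to the original click record."""
        record = self.records.get(guid)
        if record is None:
            return False              # unknown GUID, nothing to associate
        record.purchase = purchase
        return True

    def daily_report(self, property_id: str) -> Optional[Counter]:
        """Step 7: release keyword counts only above the threshold."""
        converted = [
            r.keyword
            for r in self.records.values()
            if r.property_id == property_id and r.purchase is not None
        ]
        if len(converted) < self.min_conversions:
            return None               # too little data to stay anonymous
        return Counter(converted)


# Usage with a tiny threshold so the example is easy to follow.
store = KeywordStore(min_conversions=2)
first = store.record_click("UA-XXXXXXXX-X", "blue widgets")
second = store.record_click("UA-XXXXXXXX-X", "blue widgets")
store.record_purchase(first, {"total": 20.0})
print(store.daily_report("UA-XXXXXXXX-X"))   # None: only one conversion so far
store.record_purchase(second, {"total": 35.0})
print(store.daily_report("UA-XXXXXXXX-X"))   # Counter({'blue widgets': 2})
```

The key design point is that the website only ever sees the GUID. The keyword stays on Google's side until the daily check decides there is enough data to report it in aggregate.
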
So, this is pretty straightforward. I believe that Google’s engineers are very talented and may in fact be working on another solution in-house. Hopefully this article and methodology will shed light on this issue.
Like this article? If so, please link to it, reference it, or otherwise send a little love back to Consorte Marketing.

Want to make an impact? Send a link to this or another solution to analytics-feedback@google.com – your voice will be heard!

  • By Dennis Consorte
  • Published on December 13th, 2011
  • Posted in Google.