Clean data isn’t a feature of the platform; it’s something you have to work towards.
Bad data is one of the things that keeps Analytics consultants in business. With GA3, unless you track down and resolve bad data issues, the bad data ends up in your system, where it affects all of your reporting. As a consultant, you need to stop bad data in its tracks, or you will quickly lose your clients’ trust.
Fortunately, GA4 gives you more options for keeping your data clean.
Getting Clean Data in GA4
GA4 has many features both for cleaning the data that is already there and for improving the quality of the data at the point of collection, so that less cleanup is required.
The debug view is one of the key features of GA4 that can help you keep your data clean.
To use it, add the Google Analytics Debugger extension to Chrome or use the Preview mode in GTM every time you access your websites. This way, you can check things are working without having your own data polluting your real customer data.
Google Analytics Debugger Extension
Install the extension in your Chrome browser, then turn it on whenever you’re browsing your own sites.
GTM Preview mode
Click on Preview in GTM to activate the preview mode.
Either method will automatically activate debug mode in your Analytics account.
With GA3, you had to remove your own data by filtering on your IP address, so this is a much more efficient way of keeping your data clean.
The Measurement Protocol in GA4 requires an API secret. This means that random people cannot insert data into your property, which helps keep your data clean from the outset.
In your Analytics account, open the Web stream details and scroll down until you find Measurement Protocol API secrets.
This is where you can find the secret value, which is required to send data through the API. You control who can send data to your property by sharing the secret only with authorized users.
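To make the role of the secret concrete, here is a minimal Python sketch that builds a Measurement Protocol request. The endpoint and the `measurement_id` and `api_secret` query parameters follow Google's published Measurement Protocol documentation; the IDs, the secret, the event, and the `build_mp_request` helper are placeholders invented for this example. Nothing is sent unless you pass the request to `urllib.request.urlopen`.

```python
import json
import urllib.parse
import urllib.request

MP_ENDPOINT = "https://www.google-analytics.com/mp/collect"
# Validation endpoint: checks the payload without recording it in reports.
MP_DEBUG_ENDPOINT = "https://www.google-analytics.com/debug/mp/collect"


def build_mp_request(measurement_id, api_secret, client_id, events, validate_only=False):
    """Build (but do not send) a Measurement Protocol POST request."""
    base = MP_DEBUG_ENDPOINT if validate_only else MP_ENDPOINT
    # The API secret travels in the query string alongside the measurement ID.
    query = urllib.parse.urlencode(
        {"measurement_id": measurement_id, "api_secret": api_secret}
    )
    body = json.dumps({"client_id": client_id, "events": events}).encode("utf-8")
    return urllib.request.Request(
        f"{base}?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values: substitute your own measurement ID and API secret.
request = build_mp_request(
    measurement_id="G-XXXXXXXXXX",
    api_secret="YOUR_API_SECRET",
    client_id="555.1234567890",
    events=[{"name": "tutorial_begin", "params": {"origin": "backend"}}],
    validate_only=True,
)
# To actually send it: urllib.request.urlopen(request)
```

Building the request with `validate_only=True` first is a cheap way to check a payload before it goes anywhere near your real data.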
Junk data from spammers should be far less of a problem. However, Google has not yet added the setting for excluding data from bots and spiders that was available in GA3.
Filters are simpler in GA4, but also more limited. They help with cleaning data because you can try a filter out in test mode without creating a whole new view.
In GA3, you would create different views using different filters, depending on what you wanted to view.
With the filters in GA4, you can switch between three different states: Testing, Active and Inactive, meaning you don’t need to create all these different views.
The limitation of GA4 filters is that only two types come already set up: Developer Traffic and Internal Traffic. If you want to use other filters, you need to create them from scratch.
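The three filter states are easier to reason about with a small model. The Python below is only a simplified illustration, not how GA4 is implemented: the `DataFilter` class and its field names are hypothetical. It assumes the Internal Traffic filter matches events whose `traffic_type` parameter is `internal`, and that a filter in the Testing state keeps matching events but labels them with the filter's name instead of dropping them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataFilter:
    name: str
    parameter: str  # event parameter the filter inspects
    value: str      # parameter value that marks an event as matching
    state: str      # "testing", "active" or "inactive"

    def apply(self, event: dict) -> Optional[dict]:
        """Return the event (possibly labeled), or None if it is excluded."""
        matches = event.get("params", {}).get(self.parameter) == self.value
        if not matches or self.state == "inactive":
            return event
        if self.state == "testing":
            # Keep the event, but label it so it can be reviewed separately.
            return {**event, "test_data_filter_name": self.name}
        if self.state == "active":
            # Matching events are dropped before they reach the reports.
            return None
        raise ValueError(f"unknown filter state: {self.state}")


internal_hit = {"name": "page_view", "params": {"traffic_type": "internal"}}
internal_filter = DataFilter("Internal Traffic", "traffic_type", "internal", "testing")
labeled = internal_filter.apply(internal_hit)
```

Switching `state` to `"active"` makes `apply` return `None` for the same hit, which is the point of testing first: you can see exactly what a filter would remove before it removes anything.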
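With GA4, you can correct event data from within the interface by modifying events, without having to change your tags. Be aware that modifications are not retroactive: they apply to data collected after you create them, not to data that is already in your reports. Go to the Events list and click Modify Event.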
Give the modification a name and then add the parameters or conditions that you want to apply to the event.
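Conceptually, a modification is a rule made of matching conditions plus parameter changes. The Python sketch below is a simplified model of that idea, not GA4's implementation: `modify_event` is a hypothetical helper, conditions are limited to exact matches, and the misspelled `sing_up` event is an invented example.

```python
def modify_event(event, conditions, modifications):
    """Return a modified copy of the event if every condition matches,
    otherwise return the event unchanged."""
    # Treat the event name as just another field that conditions can test.
    fields = {**event.get("params", {}), "event_name": event["name"]}
    if any(fields.get(key) != expected for key, expected in conditions.items()):
        return event

    changed = {"name": event["name"], "params": dict(event.get("params", {}))}
    for key, value in modifications.items():
        if key == "event_name":
            changed["name"] = value        # rename the event
        else:
            changed["params"][key] = value  # add or overwrite a parameter
    return changed


# Example rule: fix a misspelled event name and record the signup method.
raw = {"name": "sing_up", "params": {"page": "/join"}}
fixed = modify_event(
    raw,
    conditions={"event_name": "sing_up"},
    modifications={"event_name": "sign_up", "method": "email"},
)
```

Events that do not meet every condition pass through untouched, so a narrowly scoped rule cannot accidentally rewrite unrelated events.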
With GA4, you can turn conversions on and off at the click of a button. Note that marking an event as a conversion applies to data collected from that point on; it does not change historical data.
The Studio Dashboard
You can easily compare GA3 and GA4 data from the studio dashboard. In the Data Driven Insiders resources, you can find a template that will help you do this.
Data deletion is much easier in GA4, as it comes “baked into” the product from the ground up. It is one of the features Google has added to improve user privacy and security.
You can schedule data for deletion from the Analytics interface. Go to Data Deletion Requests and click Schedule Deletion.
Choose the type of data you want to schedule for deletion, then click Schedule request.
… and the bad news
Although overall GA4 has some great features for improving your data, it is not without its flaws.
There was a recent issue with source/medium attribution which skewed everyone’s results. Google fixed it, but it is worrying that such things can happen and that they might happen again.
On top of that, the graphs GA4 offers are pretty basic and not yet at the level of the graphs we were used to in GA3.
Jeff has written articles about some of the issues related to getting clean data in Google Analytics:
- Is Google Analytics newest data quality issue the most challenging?
- How to identify and remove bot traffic in Google Analytics
- 8 steps for eliminating bad data