GA4 RegEx Conversions Available: Why You Need to Be Careful
Universal Analytics (UA or GA3) has destination goals that allow marketers to create conversion reporting based on complex page path conditions. For example, if you want to send a conversion event when a user views multiple thankyou pages, UA allows this tracking – but Google Analytics 4 (GA4) does not… until now.
Google recently introduced a recent update to GA4 to support Regular Expression (RegEx) for event creation. This makes GA4 closer to its old Universal Analytics counterpart, which helps you migrate goals from UA to GA4 more easily.
However, after some testing on our end, we found some issues that you should be aware of.
A refresher: how do we send events to GA4?
Before anything else, we need to remind you that there are 3 ways to add conversion events: via Google Tag Manager, via inline gtag event code and via GA4-created events in the GA4 interface. If you are using Google Tag Manager you should use RegEx triggers, as these allow tags to be sent to both GA4 and other vendors. If you are not using GTM then continue reading…
New method: creating new events directly in the GA4 interface
Users can create artifical events from existing parameters & URLs sent to GA4. For example, if there is no inline event on a thank you page, then GA4 allows one to be created. To see this feature, you need to go to:
Admin > Events > Create event > Create
New GA4 RegEx opererator condition
You can now see a new option for matching regular expressions in the list. This can be used to generate these artificial events.
Issues with regular expressions in event creation
We did some testing and were able to discover two potential problems.
1. Incomplete GA4 RegEx expression syntax recognition
When creating a new event using page_location field GA4 forces the input to start with “http://” or “https://”.
This is problematic as this means you cannot use Regex characters like parenthesis () at the beginning of the regex string.
For example: the expression ^(http://.*|https://.*)$ or ^(https?://.*)$ is not allowed, even though it is valid expression.
To solve the above issue we recommend:
a) Edit the page_location input to replace all HTTP to HTTPS using a JS function such as myURLString.replace(“http://”,”https://”);
b) Ask your developer to edit your .htaccess file to redirect all non-SSL traffic to SSL pages.
c) Ask your developer to enable Strict-Transport-Security which will cause SSL pages to become the default.
You will then be able to enter the expression https://(.+)(thankyou1\.html|thankyou2\.html)(.*)$ in GA4 created events and thus ignore the “http only form validation” input bug.
2. Potential misconfiguration that will break a website
With great power, comes great responsibility. The same is true of GA4 RegEx, and if used incorrectly such as Malformed GA4 RegEx, then it will lead to site performance issues. This is because unlike Universal Analytics; GA4 RegEx executes in the browser and there is an unlimited number of characters allowed and no RegEx validation. Hence it’s possible to accidentally invoke catastrophic backtracking.
In Google support, it’s stated:
In effect, your website will be disabled. This scenario is similar to, but not the same as DDos (distributed denial-of-service) attack, wherein the website is disabled due to a high number of server requests.
What does runaway GA4 RegEx look like in practice?
Let’s add the event parameter page_location as a matching condition with 40,000 characters! Amazingly, GA4 allows it.
The browser will render this super long value and this could cause performance issues and even break the website.
This could be done accidentally or maliciously.
How to stop this from happening
1. Validate your RegEx before implementing it
Use tools such as regex101.com or use GA3 regex table filters, to validate your RegEx before saving them.
2. Create a separate GA4 account and web stream for DEV
If you have a clone of the GA4 conversion rules on DEV, then typos such as this should be detected in QA testing.
3. Peer review & get expert help
It’s always a good idea to ask another in-house analyst or external Agency, to check your work before it is published, to detect possible RegEx typos. The exclude referral string also needs to be checked, as this supports regex, but has no native validation protection.
4. Limit GA4 edit access to trusted users
Another important precaution is to limit the number of trusted users who have edit access to your GA4. This is kind of a given, but always make sure that the people who are editing the setup of GA4 know what they are doing.
What to do now
It’s possible to import Universal Analytics RegEx goals into GA4 which is great news! However, this import could accidentally break your site, hence why we suggest caution.
We notified Google of this interface bug 30 days ago, on 13th April 2023. Hopefully, they will update the User Interface to prevent invalid regex from being entered and reduce the maximum string length. They haven’t fixed it as of writing this article, but hopefully, they will.
If you’re feeling unsure about this or any other bugs, we as a Google Analytics agency are here to help. Please get in touch or comment below.