User-Data Sessionisation
On a beautiful sunny day, I was working on a client’s case. It was related to user ID implementation. I was about to start, yet decided to scroll through my LinkedIn feed for a moment and saw multiple posts popping up that discussed the new Google update. Ouch.
The new update: Collecting user-provided data
Out of the blue, Google rolled it out, saying that you do not have to send a user ID if you collect user-provided data. They can take this data and create a hashed pseudonymized user ID instead.
The data you send is then matched with other Google data in a privacy-safe way – according to Google. *One eyebrow raised*. Ideally, this should help improve your data accuracy.
Here’s a look at the official update from Google:
The benefits of user-provided data collection
Well, I understand the reasons for introducing it. Since we are saying goodbye to third-party cookies, Google wants to future-proof your setup by using first-party data. This update means that you can connect your users’ behaviour across different sessions and on various devices and platforms without the need for a user ID.
Additionally, this new way of user-provided data collection activates enhanced conversions support for GA4 conversions (*cough*, Key Events) as well as provides demographic & interest reporting based on first-party data.
Things to consider if you decide to use this feature
This all sounds like sunshine & rainbows, but there are some caveats here.
The first thing to remember is that you MUST link your GA4 & Google Ads accounts to collect user-provided data.
According to their documentation, acknowledging the feature policy is permanent. Uff, I can feel the pressure here.
You cannot use it with your app data streams – at least for now.
After enabling it, user IDs will NOT be available in the event-level and user-level data exported to BigQuery. Google claims this will be supported later on in open beta.
The process of user data sessionisation
I would say I need more time to process this update. Another sip of coffee. What concerns me most is user data sessionisation. So, I could theoretically set up a user ID along with user-provided data collection, but how would my data be stitched? If you provide multiple user-data types, Analytics will prioritise them in the following order:
- email
- phone
- name
- address
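To make the priority order concrete, here is a minimal sketch of a "first available wins" lookup. The function name and the shape of the input object are hypothetical, and this only illustrates the order listed above; how Analytics applies it internally is not something I can confirm.

```javascript
// Priority order as listed above. How Analytics applies it internally
// is not public; this is only an illustration.
const IDENTIFIER_PRIORITY = ["email", "phone", "name", "address"];

// Returns the highest-priority identifier type that has a non-empty value,
// or null when none was provided. (Hypothetical helper.)
function pickIdentifierType(userData) {
  for (const type of IDENTIFIER_PRIORITY) {
    const value = userData[type];
    if (value !== undefined && value !== null && String(value).trim() !== "") {
      return type;
    }
  }
  return null;
}

// No email here, so the phone number is the highest-priority value present.
console.log(pickIdentifierType({ phone: "+44771234569", name: "Jane Doe" }));
```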
Moreover, if a user visits your website and you’ve decided to use this new feature to collect their data, and then later you send a user ID together with this data, Analytics will recognise these as 2 DIFFERENT users. Looks like a mess.
It would be better if Google just added a new backend sessionisation process so that there are three identifiers:
- sha256_email
- user_id
- device_id
Rather than two:
- sha256_email OR user_id
- device_id
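A toy model may help show the difference. It is not Google's implementation: it ASSUMES that in the two-identifier setup the user ID takes precedence over the hashed email whenever both are sent, and that a three-identifier setup would link any hits sharing at least one identifier. All function names are hypothetical.

```javascript
// Two-slot model: one key per hit. ASSUMPTION: user_id wins over
// sha256_email when both are present; device_id is the fallback.
function twoSlotKey(hit) {
  if (hit.user_id) return "uid:" + hit.user_id;
  if (hit.sha256_email) return "email:" + hit.sha256_email;
  return "device:" + hit.device_id;
}

function countUsersTwoSlots(hits) {
  return new Set(hits.map(twoSlotKey)).size;
}

// Three-slot model (the wished-for behaviour): hits belong to the same
// user when they share ANY of the three identifiers.
function countUsersThreeSlots(hits) {
  const groups = []; // each group is a Set of identifiers for one user
  for (const hit of hits) {
    const ids = [];
    if (hit.user_id) ids.push("uid:" + hit.user_id);
    if (hit.sha256_email) ids.push("email:" + hit.sha256_email);
    if (hit.device_id) ids.push("device:" + hit.device_id);
    const merged = new Set(ids);
    // Fold every existing group that shares an identifier into this one.
    for (let i = groups.length - 1; i >= 0; i--) {
      if (ids.some((id) => groups[i].has(id))) {
        for (const id of groups[i]) merged.add(id);
        groups.splice(i, 1);
      }
    }
    groups.push(merged);
  }
  return groups.length;
}

// Visit 1: form submitted on a laptop, only the hashed email is known.
// Visit 2: the same person logs in on a phone, so a user ID is sent too.
const hits = [
  { device_id: "laptop-1", sha256_email: "abc123" },
  { device_id: "phone-1", sha256_email: "abc123", user_id: "9999999" },
];

console.log(countUsersTwoSlots(hits));   // 2 (the split described above)
console.log(countUsersThreeSlots(hits)); // 1 (linked via the shared email)
```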
The solution
So, the team united their superpowers to find the most efficient solution in this case. As you can see, this new feature does not solve a problem but creates one. Additionally, sending a user ID with this new feature will BREAK your sessions and will KILL the flow of user ID data into BigQuery.
The team came up with 2 possible workarounds:
1. Set up user_id_2 and add it as a custom definition in GA4. In this case, you send the additional information you need without calling it a user ID.
2. Save the hashed email as a cookie and then send BOTH the user ID & the hashed email as identifiers; you must always send them TOGETHER.
The user ID is usually available only when a user is logged in. This is why we need to save the hashed email (let’s say a user submitted a form before) as a cookie so that we could access it when a user logs in.
To make the solution as robust as possible for cross-device tracking, the team also recommends asking your developer to update the login pages in your CMS (such as WordPress) or your app to expose the user_data.email value. This is because the cookie method mentioned above won’t work if the user is on 2 different devices, or is switching between mobile web and a mobile app. We therefore recommend exposing these 2 lines as well, to maximise data quality and matching rates:
dataLayer.push({
  user_data: {
    email: "jane@gmail.com", // Add this line
    phone_number: "+44771234569" // Add this line
  },
  user_id: "9999999"
});
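The cookie workaround (number 2 above) could be sketched roughly like this. Treat it as a starting point under assumptions rather than a finished implementation: the cookie name `hashed_email`, the helper names and the `login` event name are all hypothetical choices, and the cookie should only be set once the user has given the relevant consent.

```javascript
// Hypothetical cookie name; pick whatever fits your consent setup.
const HASHED_EMAIL_COOKIE = "hashed_email";

// Builds a string suitable for assigning to document.cookie.
function buildCookie(name, value, maxAgeDays) {
  const maxAge = Math.round(maxAgeDays * 24 * 60 * 60);
  return `${name}=${encodeURIComponent(value)}; Max-Age=${maxAge}; Path=/; SameSite=Lax; Secure`;
}

// Reads one cookie value out of a document.cookie style string.
function readCookie(cookieString, name) {
  for (const part of cookieString.split(";")) {
    const [key, ...rest] = part.trim().split("=");
    if (key === name) return decodeURIComponent(rest.join("="));
  }
  return null;
}

// Builds the login payload. Returns null unless BOTH identifiers are
// available, because they must always be sent together.
function buildLoginPayload(userId, cookieString) {
  const hashedEmail = readCookie(cookieString, HASHED_EMAIL_COOKIE);
  if (!userId || !hashedEmail) return null;
  return {
    event: "login", // assumed event name
    user_id: userId,
    user_data: { sha256_email_address: hashedEmail },
  };
}

// Browser wiring: call after a form submit that captured the email.
function rememberHashedEmail(hashedEmail) {
  document.cookie = buildCookie(HASHED_EMAIL_COOKIE, hashedEmail, 365);
}

// Browser wiring: call once the user has logged in.
function pushLogin(userId) {
  const payload = buildLoginPayload(userId, document.cookie);
  if (payload) {
    window.dataLayer = window.dataLayer || [];
    window.dataLayer.push(payload);
  }
}
```

Keeping the cookie parsing and payload building separate from the two browser functions means the logic can be checked outside a browser before it goes anywhere near a live container.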
Searching for some additional info, I came across this warning:
Well, fun times. Just another day in the life of a technical marketer.
I will keep you updated after running some tests. Let me know your thoughts (I believe in the superpower of community).
Bonus tip for banking websites that want additional security: we recommend exposing ONLY the hashed values in the JS dataLayer to prevent other trackers (e.g. TikTok) from reading the unhashed dataLayer.user_data.email value without permission:
dataLayer.push({
  "user_data": {
    "sha256_email_address": "XXXXXXXXXXXXXXXXXXXX",
    "sha256_phone_number": "XXXXXXXXXXXXXXXXXXXX",
    "address": {
      "sha256_first_name": "XXXXXXXXXXXXXXXXXXXX",
      "sha256_last_name": "XXXXXXXXXXXXXXXXXXXX",
      "sha256_postal_code": "XXXXXXXXXXXXXXXXXXXX"
    },
    "new_customer": true,
    "customer_lifetime_value": 100.00
  },
  "user_id": "9999999",
  "event": "user_provided_data"
});
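If your developer is wondering how to produce those hashed values, here is a minimal server-side sketch using Node's built-in crypto module (in the browser, the Web Crypto API would play the same role). The helper names are hypothetical, and the normalisation shown (trimming and lowercasing emails, reducing phone numbers to a plus sign and digits) is an assumption on my side, so check it against Google's current formatting requirements before relying on it.

```javascript
const { createHash } = require("node:crypto");

// Hex-encoded SHA-256 of a UTF-8 string.
function sha256Hex(text) {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// ASSUMED normalisation: trim surrounding whitespace and lowercase.
function hashEmail(email) {
  return sha256Hex(email.trim().toLowerCase());
}

// ASSUMED normalisation: the number already includes its country code;
// spaces, dashes and brackets are stripped, leaving "+" followed by digits.
function hashPhone(phone) {
  const digits = phone.replace(/[^\d]/g, "");
  return sha256Hex("+" + digits);
}

// Only the hashed values end up in the object the page exposes.
const hashedUserData = {
  sha256_email_address: hashEmail(" Jane@Gmail.com "),
  sha256_phone_number: hashPhone("+44 77 1234 569"),
};

console.log(hashedUserData);
```

Hashing on the server means the raw email never has to appear in the page's JavaScript at all, which is the point of the tip above.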
- User-Data Sessionisation - 05/04/2024