Split Strings in Google Data Studio Using RegEx
Here’s a quick and dirty tip for Google Data Studio. We are going to be splitting strings using regular expressions and turning one custom dimension into two different values.
Let’s say you are sending a string from an ad platform to GA custom dimension. It looks like this:
It’s easier to send those dimensions to GA as separate custom dimensions, so why bother with this? Well, sometimes you don’t have the luxury of owning GA 360, and in the free version custom dimensions are scarce.
You want to separate values of lineItem and bannerDomain into two different fields. Google Data Studio will serve as a visualisation platform. Your tool of choice to crack this nut is going to be RegEx. Some of you are thinking of the quote by Jamie Zawinski:
Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.
You are absolutely right.
There are way too many ways to approach this problem, but here’s the one I used.
1) Create two new fields in GDS
One will serve as a placeholder for Line Item and the other for Banner Domain values. That’s the easy part.
2) Write up formulas
Let’s start with the Line Item field.
REGEXP_EXTRACT(Custom Dimension 1, '([^=]*)\\|')
This extracts the portion of the string between the “=” character and the “|” character. It is possible that the pipe at the end of the extracted string will remain as a part of the new string. Don’t worry, GDS will most likely cut it and not visualise it.
You can test the formula at regex101, a tool I highly recommend using.
On to the second formula that goes to the Banner Domain field. Here it gets tricky, because you need to extract the back part of the string. In the end, I went for a combination of RegEx and some modifications on the GDS side.
REGEXP_REPLACE( REGEXP_EXTRACT(Custom Dimension 1, 'bannerDomain=(.+)'), 'bannerDomain=', '' )
Almost too easy, right? We take every character that supersedes the pipe character, including the “bannerDomain=” string. That’s what the
REGEXP_EXTRACT is for. We then wrap this code in a replacement function
REGEX_REPLACE that finds the string “bannerDomain=” and replaces it with a string as empty as my life’s purpose. Done.
Check and test your formula again at regex101.
3) Visualise the data
You have newly acquired a treasure trove of data, so you can run different analyses and help your PPC team to optimise campaigns in a much more efficient fashion. I’m sure they’ll be thrilled about that.
You might have noticed that some characters are escaped using a double backslash instead of the single backslash. That’s because Google Data Studio uses RE2 (perfect bedtime reading).
Another caveat is that you can’t perform more complex calculations inside of a
CASE function. This makes Power BI or Tableau look better and better, doesn’t it?
We occasionally send out a newsletter for specialists and business decision makers alike. Interested?
No spam, no annoying sales pitches.
Honza Felt is a performance marketing specialist who turned into a marketing consultant after a mysterious accident. He spends his days leading a bunch of misfits at CF Agency. Drinks rum and knows things.