Bottom Line: Learn how to effectively use Excel's new REGEX functions to validate, extract, and replace data patterns, enhancing data accuracy and consistency with the help of AI tools.
Skill Level: Intermediate
Watch the Tutorial
Download the Excel File
The workbook that I use in the video can be accessed here:
New REGEX Functions
Microsoft recently introduced three new REGEX functions in Excel that simplify finding and extracting complex patterns in your data. This tutorial will walk you through using these functions to prevent data entry errors with data validation and conditional formatting. You'll also learn how to leverage AI tools like Copilot or ChatGPT to make writing formulas easier.
Understanding REGEX Functions
REGEX Test
The REGEXTEST function allows you to check if a specific pattern exists in your data. For example, to determine if account codes contain any numbers:
- Insert REGEX Test Function
- Type
=REGEXTEST(B6,"[0-9]")
in the cell where you want to test.This pattern[0-9]
checks for any numeric values.
- Type
- Drag to Apply
- Then drag the formula down to apply it to the rest of the column. You’ll see
TRUE
for cells containing numbers andFALSE
otherwise.
- Then drag the formula down to apply it to the rest of the column. You’ll see
- For Complex Patterns
- For more complex patterns, such as detecting phone numbers in scraped data, use a more detailed REGEX pattern.
Fortunately, we can use AI tools to help write these complex patterns for us. More on that in a minute.
REGEX Extract
The REGEXTEXTRACT function is used to pull data matching a pattern from a text string.
- Insert REGEX Extract Function
- Type
=REGEXEXTRACT(B6, "your_phone_number_pattern")
to extract phone numbers.
- Type
- Return All Matches
- Modify the return to get all matches by using
=REGEXEXTRACT(B6, "your_phone_number_pattern", 1)
. This will spill the results into adjacent cells.
- Modify the return to get all matches by using
- Transpose Results
- To avoid spill errors, use
TRANSPOSE(REGEXEXTRACT(B6, "your_phone_number_pattern", 1))
.
- To avoid spill errors, use
REGEX Replace
The REGEXREPLACE function lets you find a pattern and replace it with specified text or another pattern.
Insert REGEX Replace Function
- Use
=REGEXREPLACE(B6, "your_phone_number_pattern", "replacement_format")
to standardize phone number formats.
Apply to Dates
- Similarly, format dates consistently using REGEXREPLACE.
Enhancing Data Validation with REGEX
A great way to put REGEX to use is with data validation. You can ensure data integrity by incorporating REGEX into your data validation rules.
- Copy REGEX Test Formula
- Copy your
REGEX.TEST
formula, e.g.,=REGEXTEST(C7, "your_sku_pattern")
.
- Copy your
- Apply Data Validation
- Then select the cells where you want the data validation applied, go to the Data tab, click Data Validation, choose Custom, and paste the formula. Adjust cell references as necessary.
Using Conditional Formatting with REGEX
A great way to make noncompliant entries stand out is to use conditional formatting with REGEX. You can highlight cells that do not match the specified pattern.
- Copy Formula
- Copy the
REGEXTEST
formula.
- Copy the
- Apply Conditional Formatting
- Select the cells, go to the Home tab, click Conditional Formatting, choose New Rule, use a formula, paste the formula, and format the cells to highlight mismatches.
Once you hit OK, the entries that are out of compliance with your pattern will be highlighted.
Leveraging AI Tools for Complex Patterns
Creating complex REGEX patterns can be challenging. Use AI tools like ChatGPT or Copilot to assist you. For example, request help from ChatGPT by providing the pattern requirements with a prompt such as, “Please help create a REGEX expression for the following pattern…”
ChatGPT not only gives you the expression, but breaks down its components for explanation. Copy the generated pattern into your Excel formula.
Another website that you can use is regex101.com, where you can test your patterns and get help learning the language for them.
Conclusion
These new REGEX functions in Excel enhance your ability to manage and validate data efficiently. Share how you might use these functions in the comments below. Of course, any questions are welcome there as well.
Thanks for taking the time to read this post!
Using option 2 for return_mode with REGEXEXTRACT sounds like it will only return a submatch array for the first match (as in only getting the group details for one out of potentially several matches and not really offering an easy way to get those details for a second, third, etc.). But your goal might be to extract the first and third submatch for every match, meaning it would be ideal if there were an option to return an array of all submatch groups so that CHOOSECOLS and other array handlers could isolate the desired pieces.
I mean if the pattern includes 4 submatch groups, then every match will have 4 submatches even if some of them are null (that the function could default to “” or #N/A as an optional pad_with argument). So an array of all submatches would always be a nice, neat, predictable rectangular array, which seems like it would be a very, very nice and relatively easy option to offer.
I’ll be trying to submit this as an idea/feedback directly to Microsoft, but perhaps you have an inside track with them as an MVP and could pass this along if you agree it would be a nice feature.
Hi, I’ve the Microsoft 365 Excel and these functions doesn’t appears