Patterns are regular expressions (regex) that Orka uses to detect sensitive data in your databases. When you create a custom data protection rule, you define patterns that match column names, data values, or both.

Pattern types

Data protection rules use two types of patterns to detect sensitive data:
  • Column name: use when column names indicate sensitive data (for example, a column named email or ssn).
  • Value: use to detect the actual data within columns.
For better accuracy, we recommend using column and value patterns together.

Example: Email detection
Column pattern: (?i)(email|e_mail|emailaddress|mail)
Value pattern: ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
Together, these patterns catch columns whose names are variations of “email” and check that the data is formatted like an email address:
  • Column pattern: case-insensitive match for “email”, “e_mail”, “emailaddress”, or “mail”
  • Value pattern: validates a common email format (local part, “@”, domain, and a top-level domain of at least two letters)
  • Combined: detects email columns by name or by content
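To see how the two patterns behave, here is a minimal Python sketch that applies them with the standard `re` module. It is an illustration only: Orka's regex engine and matching semantics are not described here, so the search-based column matching, the helper names, and the "name matches or every sample value matches" rule in `looks_like_email_column` are all assumptions.

```python
import re

# Patterns copied from the email example above.
COLUMN_PATTERN = re.compile(r"(?i)(email|e_mail|emailaddress|mail)")
VALUE_PATTERN = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")


def column_name_matches(column_name: str) -> bool:
    """True if the column name contains one of the email-like names."""
    return COLUMN_PATTERN.search(column_name) is not None


def value_matches(value: str) -> bool:
    """True if the value as a whole looks like an email address."""
    return VALUE_PATTERN.match(value) is not None


def looks_like_email_column(column_name: str, sample_values: list[str]) -> bool:
    """Hypothetical combination rule (an assumption, not Orka's logic):
    flag the column if its name matches, or if there is at least one
    sample value and every sample value matches the value pattern."""
    values_match = bool(sample_values) and all(value_matches(v) for v in sample_values)
    return column_name_matches(column_name) or values_match
```

Because the column pattern is unanchored, a substring search also matches names such as “mailing_list”. If that is too broad for your data, consider anchoring the pattern or adding word boundaries.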

Regex basics

If you’re new to regular expressions, here are the essential elements:
Element  Meaning                   Example  Matches
.        Any single character      a.c      “abc”, “a1c”, “a c”
*        Zero or more of previous  ab*c     “ac”, “abc”, “abbc”
+        One or more of previous   ab+c     “abc”, “abbc” (not “ac”)
?        Zero or one of previous   ab?c     “ac”, “abc” (not “abbc”)
^        Start of string           ^hello   “hello world” (not “say hello”)
$        End of string             world$   “hello world” (not “world peace”)
|        OR operator               cat|dog  “cat”, “dog”
[]       Character class           [abc]    “a”, “b”, “c”
[^]      Negated character class   [^abc]   Any character except a, b, c
()       Grouping                  (ab)+    “ab”, “abab”, “ababab”
\b       Word boundary             \bcat\b  “cat” (not “category”)
\d       Any digit                 \d{3}    “123”, “456”
\w       Word character            \w+      “hello”, “user123”
\s       Whitespace                \s+      “ ”, “  ”, tab
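If you want to experiment with these elements outside Orka, the following sketch checks several rows of the table using Python's `re` module. It assumes Python's regex dialect, which may differ in detail from the engine Orka uses. The `EXAMPLES` list and the `mismatches` helper are illustrative names, and some of the "should not match" strings were added for this example.

```python
import re

# Each entry: (pattern, strings that should match, strings that should not).
EXAMPLES = [
    (r"a.c", ["abc", "a1c", "a c"], ["ac"]),
    (r"ab*c", ["ac", "abc", "abbc"], ["adc"]),
    (r"ab+c", ["abc", "abbc"], ["ac"]),
    (r"ab?c", ["ac", "abc"], ["abbc"]),
    (r"^hello", ["hello world"], ["say hello"]),
    (r"world$", ["hello world"], ["world peace"]),
    (r"\bcat\b", ["cat"], ["category"]),
    (r"\d{3}", ["123", "456"], ["12"]),
]


def mismatches(examples):
    """Return (pattern, text) pairs whose search result differs from the table."""
    failures = []
    for pattern, should_match, should_not_match in examples:
        for text in should_match:
            if re.search(pattern, text) is None:
                failures.append((pattern, text))
        for text in should_not_match:
            if re.search(pattern, text) is not None:
                failures.append((pattern, text))
    return failures


# An empty list means every row behaved as the table describes.
print(mismatches(EXAMPLES))
```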

Escape special characters

To match special characters literally, escape them with a backslash:
\.  matches a period
\(  matches an opening parenthesis
\$  matches a dollar sign
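The short sketch below shows why escaping matters, again using Python's `re` module as a stand-in (an assumption, since Orka's engine is not specified here). The `matches` helper is a hypothetical name. `re.escape` is a Python convenience for escaping a whole literal string. In Orka you would write the backslashes yourself.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


# Unescaped: "." matches any character, so "3x14" slips through.
loose = matches(r"3.14", "3x14")

# Escaped: "\." matches only a literal period, so "3x14" is rejected.
strict = matches(r"3\.14", "3x14")

# re.escape() adds the backslashes needed to match a string literally.
literal = "$5.00 (USD)"
escaped_ok = matches(re.escape(literal), literal)
```

Here `loose` is `True`, `strict` is `False`, and `escaped_ok` is `True`.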

Best practices

Use Orka’s rule testing interface with actual sample data from your databases before you apply patterns to production.
Before you create custom patterns, check if Orka’s prebuilt rules already cover your use case.
Add clear descriptions to custom rules that explain what the pattern detects and why. This helps other members of the Data protection admin group understand and maintain your rules.
Account for inconsistencies in real-world data: different separators, optional formatting, case variations and common misspellings.
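As one way to account for different separators, the sketch below defines a value pattern for nine-digit identifiers in 3-2-4 groups and runs it against made-up sample values. The pattern, the samples, and the `matching_samples` helper are illustrative assumptions, not an Orka prebuilt rule. It uses Python's `re` module for a quick local check and does not replace Orka's rule testing interface.

```python
import re

# Illustrative pattern: three digits, two digits, four digits, with an
# optional hyphen, whitespace character, or period between the groups.
SSN_LIKE = re.compile(r"^\d{3}[-\s.]?\d{2}[-\s.]?\d{4}$")

# Made-up samples covering different separators and some non-matching noise.
SAMPLES = [
    "123-45-6789",
    "123 45 6789",
    "123.45.6789",
    "123456789",
    "12-345-6789",
    "n/a",
]


def matching_samples(pattern, samples):
    """Return the samples that the pattern matches, in their original order."""
    return [s for s in samples if pattern.match(s)]


print(matching_samples(SSN_LIKE, SAMPLES))
```

The first four samples match. “12-345-6789” and “n/a” do not, because the digit groups must be exactly three, two, and four digits long.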