Regular Expressions
cargs validates option values against regular expressions using PCRE2 (Perl Compatible Regular Expressions), with both predefined and custom patterns.
Overview
This guide covers advanced usage of regular expressions in cargs:
- Using predefined patterns
- Creating custom patterns
- Pattern syntax and techniques
- Best practices for regex validation
For a complete list of predefined patterns, see the Regular Expressions API Reference.
Using Predefined Patterns
cargs provides a wide range of predefined patterns in cargs/regex.h that cover common validation scenarios.
#include "cargs/regex.h"

CARGS_OPTIONS(
    options,
    HELP_OPTION(FLAGS(FLAG_EXIT)),

    // Email validation
    OPTION_STRING('e', "email", HELP("Email address"),
                  REGEX(CARGS_RE_EMAIL)),

    // IP address validation
    OPTION_STRING('i', "ip", HELP("IP address"),
                  REGEX(CARGS_RE_IPV4)),

    // URL validation
    OPTION_STRING('u', "url", HELP("URL"),
                  REGEX(CARGS_RE_URL))
)
// Options with different types of validation
CARGS_OPTIONS(
    options,
    HELP_OPTION(FLAGS(FLAG_EXIT)),

    // Date in ISO format (YYYY-MM-DD)
    OPTION_STRING('d', "date", HELP("Date (YYYY-MM-DD)"),
                  REGEX(CARGS_RE_ISO_DATE),
                  HINT("YYYY-MM-DD")),

    // Password with strong validation
    OPTION_STRING('p', "password", HELP("Password (8+ chars, mixed case, numbers, symbols)"),
                  REGEX(CARGS_RE_PASSWD_STRONG),
                  HINT("StrongP@ss1")),

    // Semantic version number
    OPTION_STRING('v', "version", HELP("Semantic version"),
                  REGEX(CARGS_RE_SEMVER),
                  HINT("X.Y.Z"),
                  FLAGS(FLAG_REQUIRED))
)
Creating Custom Patterns
While predefined patterns cover many common needs, you can also create custom patterns for specific requirements.
Using MAKE_REGEX
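The MAKE_REGEX macro creates a new regex pattern with an error hint. Wrapping it in a #define gives the pattern a reusable name, as the Complete Example does for RE_PRODUCT_ID:
#define RE_PRODUCT_ID MAKE_REGEX("^[A-Z]{2}\\d{4}-[A-Z0-9]{6}$", "Format: XX0000-XXXXXX")
OPTION_STRING('p', "product", HELP("Product ID (format: XX0000-XXXXXX)"),
              REGEX(RE_PRODUCT_ID))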
Inline Patterns
You can also define patterns inline without creating a constant:
OPTION_STRING('z', "zipcode", HELP("US ZIP code"),
REGEX(MAKE_REGEX("^\\d{5}(-\\d{4})?$", "Format: 12345 or 12345-6789")))
MAKE_REGEX Structure
The MAKE_REGEX macro takes two parameters:
- The regular expression pattern
- An explanation text that will be displayed in case of validation failure
Combining Patterns
For more complex validation requirements, you can create composite patterns:
// Pattern for valid US phone numbers (multiple formats)
// A #define ends at the first newline, so each continued line needs a trailing backslash.
#define RE_US_PHONE_COMPLEX MAKE_REGEX( \
    "^(\\+1[-\\s]?)?(\\([0-9]{3}\\)|[0-9]{3})[-\\s]?[0-9]{3}[-\\s]?[0-9]{4}$", \
    "US phone format: 123-456-7890, (123) 456-7890, or +1 123 456 7890" \
)
Error Messages
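A key advantage of cargs' regex implementation is the ability to provide helpful error messages: when a value fails validation, the hint passed to MAKE_REGEX is reported alongside the rejected value.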
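To make the link between a pattern's hint and the error text concrete, the sketch below pairs a pattern with its hint and builds a message in the "Invalid value '%s': %s" shape used by the validator shown under Implementation Details. The demo_regex_t type, the DEMO_ZIP constant, and the format_regex_error function are hypothetical stand-ins for illustration and are not part of the cargs API.

```c
#include <stdio.h>

/* Hypothetical stand-in for a pattern/hint pair; not the cargs type. */
typedef struct {
    const char *pattern; /* regular expression source */
    const char *hint;    /* text shown when validation fails */
} demo_regex_t;

/* Example pair, reusing the ZIP code pattern from this guide. */
static const demo_regex_t DEMO_ZIP = {
    "^\\d{5}(-\\d{4})?$",
    "Format: 12345 or 12345-6789"
};

/*
 * Build the failure message for a rejected value.
 * Returns what snprintf returns: the length the full message needs.
 */
static int format_regex_error(char *buf, size_t size,
                              const char *value, const demo_regex_t *re)
{
    return snprintf(buf, size, "Invalid value '%s': %s", value, re->hint);
}
```

Calling format_regex_error with the value 1234 and DEMO_ZIP fills the buffer with: Invalid value '1234': Format: 12345 or 12345-6789. Any text cargs prints around this message is not modeled here.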
Pattern Syntax
Since cargs uses PCRE2, you have access to powerful pattern matching features:
Common Pattern Elements
| Element | Description | Example |
|---|---|---|
| ^ | Start of string | ^abc - String starts with "abc" |
| $ | End of string | xyz$ - String ends with "xyz" |
| [] | Character class | [abc] - Matches a, b, or c |
| [^] | Negated character class | [^abc] - Matches any character except a, b, or c |
| \d | Digit | \d{3} - Three digits |
| \w | Word character | \w+ - One or more word characters |
| \s | Whitespace | \s* - Zero or more whitespace characters |
| * | Zero or more | a* - Zero or more "a" characters |
| + | One or more | a+ - One or more "a" characters |
| ? | Zero or one | a? - Zero or one "a" character |
| {n} | Exactly n times | a{3} - Exactly three "a" characters |
| {n,m} | Between n and m times | a{2,4} - Between 2 and 4 "a" characters |
| (...) | Capturing group | (abc)+ - One or more occurrences of "abc" |
| \1, \2, ... | Backreferences | (a)\1 - "a" followed by another "a" |
| a\|b | Alternation | cat\|dog - Either "cat" or "dog" |
Escaping in C Strings
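When writing regex patterns in C string literals, double every backslash: the regex \d must be written as "\\d", because the C compiler consumes one level of escaping before the pattern reaches the regex engine.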
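The doubling can be checked directly, since the compiler strips one level of backslashes and the regex engine receives what is left. The sketch below is plain C with no cargs dependency; the constant names and the count_backslashes helper are illustrative only.

```c
#include <stddef.h>

/* C source literal (left) and what the regex engine receives (right). */
static const char *const RE_ZIP     = "^\\d{5}(-\\d{4})?$"; /* ^\d{5}(-\d{4})?$ */
static const char *const RE_DECIMAL = "^\\d+\\.\\d+$";      /* ^\d+\.\d+$ */
static const char *const RE_BSLASH  = "^C:\\\\";            /* ^C:\\ (matches one literal backslash) */

/* Count the backslashes that remain in the string after compilation. */
static size_t count_backslashes(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (*s == '\\')
            ++n;
    }
    return n;
}
```

Each pair of backslashes in the source becomes one in the stored string, so RE_ZIP holds two backslashes and RE_BSLASH, written with four, also holds two. Matching a single literal backslash therefore takes four backslashes in C source.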
Advanced Pattern Elements
PCRE2 also supports advanced features for complex validation needs:
| Element | Description | Example |
|---|---|---|
| (?:...) | Non-capturing group | (?:abc)+ - One or more occurrences of "abc" without capturing |
| (?=...) | Positive lookahead | a(?=b) - "a" only if followed by "b" |
| (?!...) | Negative lookahead | a(?!b) - "a" only if not followed by "b" |
| (?<=...) | Positive lookbehind | (?<=a)b - "b" only if preceded by "a" |
| (?<!...) | Negative lookbehind | (?<!a)b - "b" only if not preceded by "a" |
| (*UTF) | Unicode mode | (*UTF)\p{L}+ - One or more Unicode letters |
| \p{...} | Unicode property | \p{Lu} - Uppercase letter |
Performance Considerations
Regex validation is powerful, but complex patterns can slow down argument parsing.
Performance Best Practices
- Keep patterns simple when possible
- Avoid excessive backtracking (.* followed by specific matches)
- Use anchors (^ and $) to prevent unnecessary scanning
- Prefer non-capturing groups ((?:...)) when you don't need captures
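The effect of anchors can be seen with a small experiment. An anchored pattern can typically be rejected without trying every start offset, and for validation it also changes which inputs are accepted. The sketch below uses POSIX extended regular expressions from regex.h as a stand-in, not PCRE2 and not the cargs validator; posix_matches is a hypothetical helper name.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Return 1 if `input` matches `pattern`, 0 if it does not,
 * and -1 if the pattern fails to compile.
 * Uses POSIX extended regular expressions, not PCRE2.
 */
static int posix_matches(const char *pattern, const char *input)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    int rc = regexec(&re, input, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

With this helper, the unanchored pattern [0-9]{5} accepts abc12345xyz because it finds five digits somewhere inside the input, while ^[0-9]{5}$ rejects it and accepts only 12345.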
Implementation Details
cargs implements regex validation through the PCRE2 library:
int regex_validator(cargs_t *cargs, const char *value, validator_data_t data)
{
    const char *pattern = data.regex.pattern;

    // Compile the regular expression
    pcre2_code *re = pcre2_compile(...);

    // Execute the regex against the input string
    int rc = pcre2_match(...);

    // Return validation result
    if (rc < 0) {
        CARGS_REPORT_ERROR(cargs, CARGS_ERROR_INVALID_VALUE,
                           "Invalid value '%s': %s", value, data.regex.hint);
    }
    return CARGS_SUCCESS;
}
Best Practices
When using regex validation with cargs:
- Use predefined patterns when possible for common validations
- Create descriptive error messages that help the user understand what's required
- Keep patterns focused on a single validation concern
- Document complex patterns with comments explaining their purpose
- Test thoroughly with valid and invalid inputs
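One way to act on the last point is a small table of inputs with expected outcomes, run before a pattern goes into an option definition. The sketch below checks the HH:MM pattern from the Complete Example. It uses POSIX extended regular expressions from regex.h as a stand-in for PCRE2, which works here only because this pattern uses no PCRE-specific syntax such as \d. The regex_case_t type, the constants, and count_failures are hypothetical names for illustration.

```c
#include <regex.h>
#include <stddef.h>

/* One test case: an input and whether the pattern should accept it. */
typedef struct {
    const char *input;
    int should_match; /* 1 = valid, 0 = invalid */
} regex_case_t;

/* HH:MM pattern taken from the Complete Example. */
static const char *const RE_TIME_SRC = "^([01]?[0-9]|2[0-3]):[0-5][0-9]$";

static const regex_case_t TIME_CASES[] = {
    {"09:30", 1}, {"23:59", 1}, {"7:05", 1},
    {"24:00", 0}, {"12:60", 0}, {"noon", 0}, {"12:30 ", 0},
};

/*
 * Run every case and return how many disagree with the expectation,
 * or -1 if the pattern does not compile.
 */
static int count_failures(const char *pattern, const regex_case_t *cases, size_t n)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    int failures = 0;
    for (size_t i = 0; i < n; ++i) {
        int matched = regexec(&re, cases[i].input, 0, NULL, 0) == 0;
        if (matched != cases[i].should_match)
            ++failures;
    }
    regfree(&re);
    return failures;
}
```

A return value of 0 means every valid input was accepted and every invalid one rejected. Patterns that rely on PCRE2-only features would need the same table driven through PCRE2 instead.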
Complete Example
Here's a comprehensive example demonstrating multiple regex validation techniques:
#include "cargs.h"
#include "cargs/regex.h"
#include <stdio.h>

/* Custom regex patterns */
#define RE_PRODUCT_ID MAKE_REGEX("^[A-Z]{2}\\d{4}-[A-Z0-9]{6}$", "Format: XX0000-XXXXXX")
#define RE_SIMPLE_NAME MAKE_REGEX("^[a-zA-Z][a-zA-Z0-9_-]{2,29}$", "Letters, numbers, underscore, dash")

CARGS_OPTIONS(
    options,
    HELP_OPTION(FLAGS(FLAG_EXIT)),

    // Using predefined patterns
    GROUP_START("Network and Communication", GROUP_DESC("Network-related options")),
        OPTION_STRING('i', "ip", HELP("IPv4 address"),
                      REGEX(CARGS_RE_IPV4)),
        OPTION_STRING('e', "email", HELP("Email address"),
                      REGEX(CARGS_RE_EMAIL)),
        OPTION_STRING('u', "url", HELP("URL with any protocol"),
                      REGEX(CARGS_RE_URL)),
    GROUP_END(),

    // Custom patterns defined above
    GROUP_START("Custom Formats", GROUP_DESC("Options with custom regex patterns")),
        OPTION_STRING('p', "product", HELP("Product ID (format: XX0000-XXXXXX)"),
                      REGEX(RE_PRODUCT_ID)),
        OPTION_STRING('n', "name", HELP("Username (letters, numbers, underscore, dash)"),
                      REGEX(RE_SIMPLE_NAME)),
    GROUP_END(),

    // Inline patterns
    GROUP_START("Inline Patterns", GROUP_DESC("Options with inline regex patterns")),
        OPTION_STRING('z', "zipcode", HELP("US Zip code"),
                      REGEX(MAKE_REGEX("^\\d{5}(-\\d{4})?$", "Format: 12345 or 12345-6789"))),
        OPTION_STRING('t', "time", HELP("Time (format: HH:MM)"),
                      REGEX(MAKE_REGEX("^([01]?[0-9]|2[0-3]):[0-5][0-9]$", "Format: HH:MM"))),
    GROUP_END()
)

int main(int argc, char **argv)
{
    cargs_t cargs = cargs_init(options, "regex_example", "1.0.0");
    cargs.description = "Example of using regex validation with both predefined and custom patterns";

    int status = cargs_parse(&cargs, argc, argv);
    if (status != CARGS_SUCCESS)
        return status;

    printf("Validation successful! All provided values match the expected patterns.\n\n");

    // Display validated values
    printf("Network & Communication:\n");
    printf(" IP Address: %s\n", cargs_is_set(cargs, "ip") ?
           cargs_get(cargs, "ip").as_string : "(not provided)");
    printf(" Email: %s\n", cargs_is_set(cargs, "email") ?
           cargs_get(cargs, "email").as_string : "(not provided)");
    printf(" URL: %s\n", cargs_is_set(cargs, "url") ?
           cargs_get(cargs, "url").as_string : "(not provided)");

    printf("\nCustom Formats:\n");
    printf(" Product ID: %s\n", cargs_is_set(cargs, "product") ?
           cargs_get(cargs, "product").as_string : "(not provided)");
    printf(" Username: %s\n", cargs_is_set(cargs, "name") ?
           cargs_get(cargs, "name").as_string : "(not provided)");

    printf("\nInline Patterns:\n");
    printf(" Zip Code: %s\n", cargs_is_set(cargs, "zipcode") ?
           cargs_get(cargs, "zipcode").as_string : "(not provided)");
    printf(" Time: %s\n", cargs_is_set(cargs, "time") ?
           cargs_get(cargs, "time").as_string : "(not provided)");

    cargs_free(&cargs);
    return 0;
}
Resources
For more information on regular expressions:
Learning Materials
Related Documentation
- Validation Guide - Basic validation concepts including regex
- Regular Expressions API Reference - Complete list of predefined patterns
- Custom Validators - Creating custom validators including pre-validators