pattern_matching
genlm.eval.domains.pattern_matching
PatternMatchingInstance
Bases: Instance
Schema for pattern matching instance.
Source code in genlm/eval/domains/pattern_matching.py
PatternMatchingDataset
Bases: Dataset[PatternMatchingInstance]
Dataset for pattern matching evaluation.
Source code in genlm/eval/domains/pattern_matching.py
__init__(patterns)
Initialize the dataset with a list of regex patterns.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
patterns
|
list[str]
|
List of regex patterns to evaluate. |
required |
from_csv(csv_path, pattern_column)
classmethod
Load patterns from a CSV file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
csv_path
|
str
|
Path to the CSV file. |
required |
pattern_column
|
str
|
Name of the column containing regex patterns. |
required |
Returns:
Type | Description |
---|---|
PatternMatchingDataset
|
Dataset initialized with patterns from the CSV. |
Source code in genlm/eval/domains/pattern_matching.py
__iter__()
Iterate over regex patterns.
Returns:
Type | Description |
---|---|
Iterator[PatternMatchingInstance]
|
Iterator over regex instances. |
Source code in genlm/eval/domains/pattern_matching.py
schema
property
Get the schema class for this dataset.
Returns:
Type | Description |
---|---|
type[PatternMatchingInstance]
|
The Pydantic model class for pattern matching instances. |
PatternMatchingEvaluator
Bases: Evaluator[PatternMatchingInstance]
Evaluator for pattern matching.
Source code in genlm/eval/domains/pattern_matching.py
evaluate_sample(instance, response)
Evaluate if a response matches the regex pattern.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
instance
|
PatternMatchingInstance
|
The pattern matching instance being evaluated. |
required |
response
|
str
|
The model's response text. |
required |
Returns:
Type | Description |
---|---|
EvaluationResult
|
Evaluation result for whether the response matches the pattern. |
Source code in genlm/eval/domains/pattern_matching.py
PatternPotential
Bases: Potential
Potential function for regex pattern matching.
Source code in genlm/eval/domains/pattern_matching.py
default_prompt_formatter(tokenizer, instance, use_chat_format=False, system_prompt=SYSTEM_PROMPT, few_shot_examples=FEW_SHOT_EXAMPLES)
Default prompt formatter for pattern matching.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tokenizer
|
Tokenizer
|
The tokenizer to use. |
required |
instance
|
PatternMatchingInstance
|
The instance to format. |
required |
use_chat_format
|
bool
|
Whether to use chat format. |
False
|
system_prompt
|
str
|
The system prompt to use. |
SYSTEM_PROMPT
|
few_shot_examples
|
list[tuple[str, str]]
|
The few shot examples to use. Each example is a tuple of (pattern, response). |
FEW_SHOT_EXAMPLES
|
Returns:
Type | Description |
---|---|
list[int]
|
The prompt ids. |