Performance Optimizations

The genlm-control library offers two key performance optimizations for instances of the Potential class:

  • Auto-batching: Automatically batches concurrent requests to a potential's instance methods
  • Multiprocessing: Runs multiple instances of a Potential in parallel across CPU cores

Auto-batching

Auto-batching improves performance when a Potential class's batch methods (batch_complete, batch_prefix, batch_logw_next, batch_score) are more efficient than sequentially running individual instance methods.

Usage

To enable auto-batching, use the to_autobatched() method:

import asyncio

autobatched_potential = potential.to_autobatched()
# Use it exactly like a regular potential - batching happens automatically
results = await asyncio.gather(
    *(autobatched_potential.complete(seq) for seq in sequences)  # These will be batched and processed by batch_complete
)

This creates a new potential that is a wrapper (AutoBatchedPotential) around the original potential. The wrapper automatically collects concurrent requests in the background and processes them together using the potential's batch methods. This happens transparently without requiring changes to your code structure.
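
To make the collection step concrete, here is a minimal sketch of one way a wrapper could gather concurrent requests and hand them to a batch method. It uses only asyncio and is not the AutoBatchedPotential implementation; the names AutoBatcher, call, and batch_length are hypothetical stand-ins (batch_length plays the role of a batch method such as batch_complete).

```python
import asyncio


class AutoBatcher:
    """Toy auto-batching wrapper (illustrative only, not genlm-control code)."""

    def __init__(self, batch_fn):
        self.batch_fn = batch_fn   # async callable: list of inputs -> list of outputs
        self.pending = []          # (input, future) pairs waiting for the next flush
        self.flush_task = None     # reference to the scheduled flush, if any
        self.batch_sizes = []      # recorded only so the batching is observable

    async def call(self, item):
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        self.pending.append((item, future))
        if self.flush_task is None:
            # Schedule a single flush; it runs after the other callers
            # that are already queued have had a chance to enqueue.
            self.flush_task = loop.create_task(self._flush())
        return await future

    async def _flush(self):
        await asyncio.sleep(0)  # yield once more so late callers can join
        batch, self.pending = self.pending, []
        self.flush_task = None
        self.batch_sizes.append(len(batch))
        try:
            results = await self.batch_fn([item for item, _ in batch])
        except Exception as exc:
            # Propagate a batch failure to every waiting caller.
            for _, future in batch:
                future.set_exception(exc)
            return
        for (_, future), result in zip(batch, results):
            future.set_result(result)


async def batch_length(sequences):
    # Hypothetical batch method: one call handles every queued request.
    return [len(seq) for seq in sequences]


async def demo():
    batcher = AutoBatcher(batch_length)
    results = await asyncio.gather(
        *(batcher.call(seq) for seq in [b"a", b"bb", b"ccc"])
    )
    return results, batcher.batch_sizes
```

Callers still await one result per request, which is why the calling code looks the same as in the unbatched case; only the wrapper knows that the requests were served by a single batch call.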

Multiprocessing

CPU parallelization can significantly improve performance for compute-intensive Potential classes. This is particularly useful when methods like complete, prefix, or logw_next involve heavy computation.

Usage

To enable multiprocessing, use the to_multiprocess() method:

import asyncio

# Create a multiprocess wrapper with desired number of workers
mp_potential = potential.to_multiprocess(num_workers=2)
# Use it like a regular potential - requests are distributed across workers
results = await asyncio.gather(
    *(mp_potential.complete(seq) for seq in sequences) # These will be distributed across workers
)

This creates a new potential that is a wrapper (MultiProcPotential) around the original potential. The wrapper distributes requests across multiple processes asynchronously, without blocking. This allows you to scale your computations across multiple cores without changing your code structure.

Requirements

For multiprocessing to work, the potential must implement a picklable spawn() method that creates a new instance of the potential. Only some built-in Potential classes support this by default. Custom potentials need to implement their own spawn() method.
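
As a rough illustration of the picklability requirement, the sketch below defines a hypothetical stand-in class (ToyPotential is not part of genlm-control and does not subclass Potential) whose spawn() builds a fresh instance from plain constructor arguments, then checks that the bound method survives a pickle round trip. How the library itself transfers spawn() to worker processes is not described here, so treat the round trip only as a quick sanity check.

```python
import pickle


class ToyPotential:
    """Hypothetical stand-in for a custom potential (illustrative only)."""

    def __init__(self, vocabulary, max_length):
        self.vocabulary = vocabulary
        self.max_length = max_length

    def spawn(self):
        # Rebuild from plain, picklable arguments rather than sharing
        # state (open files, locks, loaded models) with the original.
        return ToyPotential(list(self.vocabulary), self.max_length)


original = ToyPotential([b"a", b"b"], max_length=5)

# A bound method pickles as (instance, method name), so the instance's
# attributes must themselves be picklable for this to succeed.
restored_spawn = pickle.loads(pickle.dumps(original.spawn))
clone = restored_spawn()
```

Defining the class at module level matters for this check: pickle stores classes by reference, so a class defined inside a function or a lambda-based factory would fail the round trip.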

Performance Benefits

Multiprocessing improves performance for both batched methods (batch_complete, batch_prefix, batch_logw_next) and unbatched methods (complete, prefix, logw_next).

In the batched case, requests within a batch are processed in parallel across workers. For individual method calls, requests are distributed to available worker processes and are executed asynchronously.
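
The unbatched case can be sketched with the standard library alone. This is an assumption-laden illustration of distributing independent calls across worker processes from async code, not how MultiProcPotential is implemented; distribute is a hypothetical helper, and math.factorial stands in for a compute-intensive method.

```python
import asyncio
import math
from concurrent.futures import ProcessPoolExecutor


async def distribute(values, num_workers=2):
    """Run one CPU-bound call per value across worker processes."""
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor(max_workers=num_workers) as pool:
        # Each call is handed to whichever worker is free; the event
        # loop stays responsive while the workers compute.
        futures = [
            loop.run_in_executor(pool, math.factorial, value)
            for value in values
        ]
        # gather returns results in the order the requests were made.
        return await asyncio.gather(*futures)


if __name__ == "__main__":
    # The guard keeps worker processes from re-running the demo
    # when they import this module.
    print(asyncio.run(distribute([5, 10, 20])))
```

Starting the pool and shipping arguments and results between processes has a fixed cost, so a sketch like this only pays off when each call does substantial work.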

When to use each optimization

Note: Built-in Potential classes that benefit from auto-batching (e.g., PromptedLLM) have it enabled by default.

  • Use auto-batching when the potential's batch operations are more efficient than sequential operations
  • Use multiprocessing when the potential's operations are compute-intensive and can benefit from parallel processing
  • Consider the overhead of each optimization when deciding which to use. Multiprocessing in particular incurs a significant overhead when the potential's operations are not compute-intensive.