Pipeline Lifecycle
The Aris RAG framework exposes three synchronous hooks for customizing the retrieval and generation process. These hooks support query transformation, result filtering, and prompt safety checks.
Hook Definitions
onBeforeRetrieve(query: str, context: dict) -> str
Purpose: Modify the user’s raw query before it reaches the vector database.
Use Cases:
- Expanding acronyms (e.g., “RFP” -> “Request for Proposal”).
- Correcting domain-specific spelling errors.
- Injecting metadata filters based on user role.
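The acronym-expansion use case above can be sketched as follows. This is a minimal illustration that assumes only the hook signature shown above; the `ACRONYMS` table and its entries are hypothetical, and the `context` argument is accepted but unused here.

```python
import re

# Hypothetical acronym table; a real deployment would load its own domain terms.
ACRONYMS = {
    "RFP": "Request for Proposal",
    "SLA": "Service Level Agreement",
}


def onBeforeRetrieve(query: str, context: dict) -> str:
    """Expand known acronyms that appear as whole uppercase words."""

    def expand(match: re.Match) -> str:
        word = match.group(0)
        # Unknown acronyms are returned unchanged.
        return ACRONYMS.get(word, word)

    # \b keeps the match on word boundaries so "RFPs" or "xRFP" are not touched.
    return re.sub(r"\b[A-Z]{2,}\b", expand, query)
```

Matching on whole uppercase words keeps the expansion conservative: lowercase text and acronyms missing from the table pass through untouched.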
onAfterRetrieve(chunks: list[Chunk]) -> list[Chunk]
Purpose: Filter or re-rank the raw results from the vector store.
Use Cases:
- Removing chunks with low confidence scores (< 0.7).
- Deduplicating identical content from different sources.
- Redacting PII (Personally Identifiable Information) before context injection.
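A sketch of the first two use cases above (score filtering and deduplication) might look like this. The `Chunk` dataclass is a hypothetical stand-in with assumed `text`, `score`, and `source` fields; the framework's real `Chunk` type may expose different attributes. The 0.7 threshold is taken from the list above.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    # Hypothetical stand-in; the framework's actual Chunk type may differ.
    text: str
    score: float
    source: str = ""


MIN_SCORE = 0.7  # threshold from the use case above


def onAfterRetrieve(chunks: list[Chunk]) -> list[Chunk]:
    """Drop low-scoring chunks and duplicates, keeping the best-scoring copy."""
    seen: set[str] = set()
    kept: list[Chunk] = []
    # Visit chunks from highest to lowest score so the first copy seen is the best one.
    for chunk in sorted(chunks, key=lambda c: c.score, reverse=True):
        # Normalize whitespace and case so trivially different copies compare equal.
        key = " ".join(chunk.text.split()).lower()
        if chunk.score < MIN_SCORE or key in seen:
            continue
        seen.add(key)
        kept.append(chunk)
    return kept
```

Because the chunks are sorted before filtering, the returned list is also re-ranked by descending score. Note that the normalization treats text differing only in case or whitespace as identical, which is slightly looser than exact matching.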
onBeforeGenerate(prompt: str, chunks: list[Chunk]) -> str
Purpose: Final inspection of the full prompt sent to the LLM.
Use Cases:
- Checking for prompt injection attempts.
- Formatting the context block with specific XML tags.
- Truncating the prompt to fit strict token limits.
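The injection check and truncation use cases for this hook could be sketched as below. Everything beyond the hook signature is an assumption: the phrase list is illustrative only (real injection detection needs far more than pattern matching), `MAX_CHARS` is a hypothetical character budget used as a crude proxy for a token limit, and raising `ValueError` is just one possible way to reject a prompt, since how the pipeline handles exceptions from hooks is not specified here.

```python
import re

# Illustrative phrases only; not a complete or reliable injection detector.
SUSPICIOUS = re.compile(
    r"ignore (all )?(previous|prior) instructions|reveal the system prompt",
    re.IGNORECASE,
)

# Hypothetical character budget standing in for a real token limit.
MAX_CHARS = 12_000


def onBeforeGenerate(prompt: str, chunks: list) -> str:
    """Reject prompts containing known injection phrases, then cap the length."""
    if SUSPICIOUS.search(prompt):
        raise ValueError("possible prompt injection detected")
    # Slicing is a no-op when the prompt is already within budget.
    return prompt[:MAX_CHARS]
```

A character cap preserves the prompt's formatting but can cut a chunk mid-sentence; counting tokens with the tokenizer of the target model would be more accurate.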
Implementation Example
Register hooks in your rag_pipeline.py configuration. All hooks must be synchronous to avoid event loop blocking.
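The framework's actual registration call is not shown in this section, so the following is only a stand-in that illustrates the order in which the three hooks run and one way to enforce the synchronous requirement. `register_hook`, `run_pipeline`, the `hooks` dictionary, and the `retrieve` and `generate` callables are all hypothetical and are not part of the Aris API; chunks are plain strings here for brevity, and an empty dict is passed as the `context` argument.

```python
import inspect
from typing import Callable


def register_hook(registry: dict, name: str, hook: Callable) -> None:
    """Store a hook under its lifecycle name, rejecting coroutine functions."""
    if inspect.iscoroutinefunction(hook):
        raise TypeError(f"{name} must be a plain function, not async def")
    registry[name] = hook


def run_pipeline(query: str, registry: dict, retrieve: Callable, generate: Callable) -> str:
    """Call the three hooks in lifecycle order around retrieval and generation."""
    query = registry["onBeforeRetrieve"](query, {})
    chunks = registry["onAfterRetrieve"](retrieve(query))
    # Stand-in prompt assembly: context chunks first, then the question.
    prompt = "\n".join(chunks) + "\n\nQuestion: " + query
    prompt = registry["onBeforeGenerate"](prompt, chunks)
    return generate(prompt)


hooks: dict = {}
register_hook(hooks, "onBeforeRetrieve", lambda q, ctx: q.strip())
register_hook(hooks, "onAfterRetrieve", lambda chunks: chunks[:2])
register_hook(hooks, "onBeforeGenerate", lambda p, chunks: p)

# Fake retriever and generator so the sketch runs without any external service.
answer = run_pipeline(
    "  what is an RFP? ",
    hooks,
    retrieve=lambda q: ["c1", "c2", "c3"],
    generate=lambda p: p.upper(),
)
```

Checking with `inspect.iscoroutinefunction` at registration time surfaces an accidental `async def` hook immediately rather than when the first query runs.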