Agent

The FinanceAgent is the main entry point for processing receipts. Internally it runs a 4-agent pipeline:

  1. Metadata agent — receipt date, number, type, category

  2. Counterparty agent — vendor / client name, address, VAT ID, tax number

  3. Amounts agent — total, VAT amount, VAT rate, net amount, currency

  4. Line items agent — individual purchased items with per-item VAT

finamt.agents.agent

Main entry point for receipt processing.

Pipeline (see agents/pipeline.py for details):
  1. OCR text extraction (PaddleOCR)

  2. Duplicate detection (SHA-256 content hash)

  3. Agent 1 — receipt number, date, category

  4. Agent 2 — counterparty (vendor or client)

  5. Agent 3 — amounts (total, VAT %, VAT amount)

  6. Agent 4 — line items

  7. Python merge of the 4 results → ReceiptData

  8. Validate + auto-save to SQLite

All 4 agents use the same local LLM (configured via FINAMT_AGENT_MODEL). They run sequentially — not in parallel — for local model compatibility. Debug output is saved to ~/.finamt/debug/<receipt_id>/.
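Step 2 of the pipeline (duplicate detection) can be illustrated with a minimal sketch. It assumes the content hash is computed over the raw PDF bytes and that known hashes are kept in an in-memory set; the source does not say what exactly is hashed or where hashes are stored. `content_hash` and `is_duplicate` are hypothetical names, not finamt functions.

```python
from __future__ import annotations

import hashlib


def content_hash(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def is_duplicate(data: bytes, seen: set[str]) -> bool:
    """Report whether this content was seen before; record it if not."""
    digest = content_hash(data)
    if digest in seen:
        return True
    seen.add(digest)
    return False


seen_hashes: set[str] = set()
first = is_duplicate(b"%PDF-1.7 example receipt", seen_hashes)   # new content
second = is_duplicate(b"%PDF-1.7 example receipt", seen_hashes)  # same bytes again
```

Hashing the content rather than the file name means a renamed copy of the same file would still be recognised under this assumption.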

class finamt.agents.agent.FinanceAgent(config: Config | None = None, project: str | None = None, db_path: str | Path | None = <object object>, agents_cfg: AgentsConfig | None = None)

Bases: object

Orchestrates OCR + multi-agent extraction and auto-persists every result.

Parameters:
  • config – Optional Config instance (reads .env by default).

  • project – Project name — determines ~/.finamt/<project>/ layout. Default: “default” (or FINAMT_PROJECT env var).

  • db_path – Explicit SQLite path — overrides project layout. Pass None to disable persistence entirely.

  • agents_cfg – Optional AgentsConfig — controls which models are used.

process_receipt(pdf_path: str | Path | bytes, receipt_type: str = 'purchase', taxpayer_info: dict | None = None) → ExtractionResult

Process a receipt or invoice PDF through the full pipeline.

Parameters:
  • pdf_path – Filesystem path or raw PDF bytes.

  • receipt_type – “purchase” (default) or “sale”.

  • taxpayer_info – Optional dict with taxpayer’s own data so Agent 2 does not confuse the taxpayer with the counterparty. Keys: name, vat_id, tax_number, address.

Returns an ExtractionResult. It is always populated; the success flag indicates whether the result passed validation and was saved.
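A minimal usage sketch for a sale receipt, based only on the signature and parameters documented above. The helper `process_sale` and all values in `TAXPAYER_INFO` are made up for illustration; the agent is passed in as an argument so the sketch does not depend on how the agent is constructed.

```python
from __future__ import annotations

from pathlib import Path
from typing import Any

# The taxpayer's own data, using the four documented keys (values are made up).
TAXPAYER_INFO = {
    "name": "Example GmbH",
    "vat_id": "DE000000000",
    "tax_number": "00/000/00000",
    "address": "Musterstrasse 1, 10115 Berlin",
}


def process_sale(agent: Any, pdf_path: str | Path) -> bool:
    """Run one outgoing invoice through the agent and report success."""
    result = agent.process_receipt(
        pdf_path,
        receipt_type="sale",
        taxpayer_info=TAXPAYER_INFO,
    )
    # The result is always populated; only `success` tells whether it
    # passed validation and was saved.
    return bool(result.success)


# With finamt installed, a call would presumably look like this:
#   from finamt.agents.agent import FinanceAgent
#   ok = process_sale(FinanceAgent(project="default"), "invoice.pdf")
```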

batch_process(pdf_paths: list[str | Path], receipt_type: str = 'purchase') → dict[str, ExtractionResult]

Process multiple receipts sequentially.
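Since the return value is a dict of ExtractionResult objects, a caller will usually want to separate successes from failures. The sketch below assumes the dict keys identify the input files (most likely the path as a string; the source does not say) and relies only on the documented success flag. `split_by_success` is a hypothetical helper.

```python
from __future__ import annotations

from typing import Any


def split_by_success(results: dict[str, Any]) -> tuple[list[str], list[str]]:
    """Split batch results into keys that succeeded and keys that failed."""
    succeeded = [key for key, result in results.items() if result.success]
    failed = [key for key, result in results.items() if not result.success]
    return succeeded, failed


# With finamt installed, `results` would presumably come from:
#   from finamt.agents.agent import FinanceAgent
#   results = FinanceAgent().batch_process(["a.pdf", "b.pdf"])
#   ok, failed = split_by_success(results)
```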

finamt.agents.config

All configuration for finamt in one place.

  • Config – OCR and general settings (env prefix: FINAMT_)

  • AgentsConfig – LLM model settings for the 4-agent extraction pipeline

  • ModelConfig – immutable snapshot returned by Config.get_model_config()

  • AgentModelConfig – immutable snapshot returned by AgentsConfig.get_agent_config()

Override via environment variables or a .env file:

FINAMT_AGENT_MODEL=mistral:7b

Recommended models (text-only, no vision required):

  • mistral:7b (works well, recommended default)

  • qwen2.5:7b-instruct-q4_K_M
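The override precedence can be illustrated with a stdlib-only sketch. This is not how finamt reads its settings (the classes below derive from pydantic's BaseSettings); it only shows the intended behaviour that FINAMT_AGENT_MODEL, when set, wins over the built-in default. `resolve_agent_model` is a hypothetical helper.

```python
import os

DEFAULT_AGENT_MODEL = "mistral:7b"  # documented default for agent_model


def resolve_agent_model(environ=None) -> str:
    """Return FINAMT_AGENT_MODEL if set, otherwise the default."""
    env = os.environ if environ is None else environ
    return env.get("FINAMT_AGENT_MODEL", DEFAULT_AGENT_MODEL)


# No variable set: the default applies.
default_model = resolve_agent_model({})
# Variable set: the override applies.
overridden = resolve_agent_model({"FINAMT_AGENT_MODEL": "qwen2.5:7b-instruct-q4_K_M"})
```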

class finamt.agents.config.Config(*, model: str = 'mistral:7b', temperature: Annotated[float, Ge(ge=0.0), Le(le=2.0)] = 0.1, top_p: Annotated[float, Ge(ge=0.0), Le(le=1.0)] = 0.9, num_ctx: Annotated[int, Ge(ge=512)] = 8192, tesseract_cmd: str = 'tesseract', ocr_language: str = 'german', ocr_preprocess: bool = True, ocr_timeout: Annotated[int, Ge(ge=5)] = 60, pdf_dpi: Annotated[int, Ge(ge=72), Le(le=1200)] = 150, max_retries: Annotated[int, Ge(ge=0), Le(le=10)] = 3, request_timeout: Annotated[int, Ge(ge=1)] = 30)

Bases: BaseSettings

model_config = {'arbitrary_types_allowed': True, 'case_sensitive': False, 'cli_avoid_json': False, 'cli_enforce_required': False, 'cli_exit_on_error': True, 'cli_flag_prefix_char': '-', 'cli_hide_none_type': False, 'cli_ignore_unknown_args': False, 'cli_implicit_flags': False, 'cli_kebab_case': False, 'cli_parse_args': None, 'cli_parse_none_str': None, 'cli_prefix': '', 'cli_prog_name': None, 'cli_shortcuts': None, 'cli_use_class_docs_for_groups': False, 'enable_decoding': True, 'env_file': '.env', 'env_file_encoding': 'utf-8', 'env_ignore_empty': False, 'env_nested_delimiter': None, 'env_nested_max_split': None, 'env_parse_enums': None, 'env_parse_none_str': None, 'env_prefix': 'FINAMT_', 'env_prefix_target': 'variable', 'extra': 'ignore', 'json_file': None, 'json_file_encoding': None, 'nested_model_default_partial_update': False, 'protected_namespaces': ('model_validate', 'model_dump', 'settings_customise_sources'), 'secrets_dir': None, 'toml_file': None, 'validate_default': True, 'yaml_config_section': None, 'yaml_file': None, 'yaml_file_encoding': None}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model: str
temperature: float
top_p: float
num_ctx: int
tesseract_cmd: str
ocr_language: str
ocr_preprocess: bool
ocr_timeout: int
pdf_dpi: int
max_retries: int
request_timeout: int
get_model_config() → ModelConfig
property DEFAULT_MODEL: str
property TESSERACT_CMD: str
property OCR_LANGUAGE: str
property OCR_PREPROCESS: bool
property PDF_DPI: int
property MAX_RETRIES: int
property REQUEST_TIMEOUT: int

class finamt.agents.config.ModelConfig(model: str, temperature: float, top_p: float, num_ctx: int, max_retries: int, timeout: int)

Bases: object

Snapshot of general LLM settings (used by OCR pipeline).

model: str
temperature: float
top_p: float
num_ctx: int
max_retries: int
timeout: int
class finamt.agents.config.AgentsConfig(*, agent_model: str = 'mistral:7b', agent_timeout: int = 60, agent_num_ctx: int = 4096, agent_max_retries: int = 2, temperature: float = 0.0, top_p: float = 1.0)

Bases: BaseSettings

LLM settings for the 4-agent sequential extraction pipeline. All agents use the same model — override with FINAMT_AGENT_MODEL. Temperature is 0.0 for deterministic JSON output.

model_config = {'arbitrary_types_allowed': True, 'case_sensitive': False, 'cli_avoid_json': False, 'cli_enforce_required': False, 'cli_exit_on_error': True, 'cli_flag_prefix_char': '-', 'cli_hide_none_type': False, 'cli_ignore_unknown_args': False, 'cli_implicit_flags': False, 'cli_kebab_case': False, 'cli_parse_args': None, 'cli_parse_none_str': None, 'cli_prefix': '', 'cli_prog_name': None, 'cli_shortcuts': None, 'cli_use_class_docs_for_groups': False, 'enable_decoding': True, 'env_file': '.env', 'env_file_encoding': 'utf-8', 'env_ignore_empty': False, 'env_nested_delimiter': None, 'env_nested_max_split': None, 'env_parse_enums': None, 'env_parse_none_str': None, 'env_prefix': 'FINAMT_', 'env_prefix_target': 'variable', 'extra': 'ignore', 'json_file': None, 'json_file_encoding': None, 'nested_model_default_partial_update': False, 'protected_namespaces': ('model_validate', 'model_dump', 'settings_customise_sources'), 'secrets_dir': None, 'toml_file': None, 'validate_default': True, 'yaml_config_section': None, 'yaml_file': None, 'yaml_file_encoding': None}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

agent_model: str
agent_timeout: int
agent_num_ctx: int
agent_max_retries: int
temperature: float
top_p: float
get_agent_config() → AgentModelConfig

class finamt.agents.config.AgentModelConfig(model: str, temperature: float, top_p: float, num_ctx: int, timeout: int, max_retries: int)

Bases: object

Snapshot of extraction agent LLM settings.

model: str
temperature: float
top_p: float
num_ctx: int
timeout: int
max_retries: int
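The idea of an immutable settings snapshot can be sketched with a frozen dataclass. Whether AgentModelConfig is actually implemented this way is an assumption, as is the mapping from the agent_* settings to the snapshot fields, which is inferred from the field names alone. `AgentSnapshot` and `snapshot_from_settings` are hypothetical stand-ins.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class AgentSnapshot:
    """Hypothetical stand-in mirroring the AgentModelConfig fields."""
    model: str
    temperature: float
    top_p: float
    num_ctx: int
    timeout: int
    max_retries: int


def snapshot_from_settings(settings: dict) -> AgentSnapshot:
    """Map agent_* settings onto the snapshot field names (assumed mapping)."""
    return AgentSnapshot(
        model=settings["agent_model"],
        temperature=settings["temperature"],
        top_p=settings["top_p"],
        num_ctx=settings["agent_num_ctx"],
        timeout=settings["agent_timeout"],
        max_retries=settings["agent_max_retries"],
    )


# Default values as documented for AgentsConfig.
snap = snapshot_from_settings({
    "agent_model": "mistral:7b",
    "agent_timeout": 60,
    "agent_num_ctx": 4096,
    "agent_max_retries": 2,
    "temperature": 0.0,
    "top_p": 1.0,
})

try:
    snap.model = "other"  # frozen dataclasses reject attribute assignment
    mutated = True
except FrozenInstanceError:
    mutated = False
```

A frozen snapshot lets each agent call receive settings that cannot change halfway through a pipeline run.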

finamt.agents.pipeline

4-agent sequential extraction pipeline.

  Agent 1 → receipt number, date, category

  Agent 2 → counterparty (vendor or client depending on receipt_type)

  Agent 3 → amounts (total, vat_percentage, vat_amount)

  Agent 4 → line items

Each agent runs sequentially (not parallel) for compatibility with local models. After all 4 finish, results are merged in Python (no LLM validator step).
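The sequential run followed by a plain Python merge might look like the sketch below. The stub agents stand in for the four LLM calls, the dict-update merge is an assumption about how the results are combined, and every field name other than vat_percentage and vat_amount is made up for illustration.

```python
from __future__ import annotations

from typing import Callable

AgentFn = Callable[[str], dict]


def run_sequentially(text: str, agents: list[AgentFn]) -> dict:
    """Call each agent in order and merge the partial results in Python."""
    merged: dict = {}
    for agent in agents:             # one call at a time, never concurrent
        partial = agent(text) or {}  # an agent that returns None adds nothing
        merged.update(partial)
    return merged


# Stub agents standing in for the four LLM calls (values are made up).
stub_agents: list[AgentFn] = [
    lambda t: {"receipt_number": "R-1", "receipt_date": "2024-01-31"},
    lambda t: {"counterparty": {"name": "ACME"}},
    lambda t: {"total_amount": 119.0, "vat_percentage": 19.0, "vat_amount": 19.0},
    lambda t: {"items": []},
]
merged = run_sequentially("ocr text", stub_agents)
```

Because the merge is ordinary Python, no further model call is needed to reconcile the four partial results.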

Debug output saved to ~/.finamt/debug/<receipt_id>/:

  agent1_prompt.txt / agent1_raw.txt / agent1_parsed.json
  agent2_prompt.txt / agent2_raw.txt / agent2_parsed.json
  agent3_prompt.txt / agent3_raw.txt / agent3_parsed.json
  agent4_prompt.txt / agent4_raw.txt / agent4_parsed.json
  final.json
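When a receipt fails to extract, it can help to see which of these files were actually written. The following sketch builds the documented file names and reports the ones missing from a given debug directory; both helpers are hypothetical and not part of finamt.

```python
from __future__ import annotations

from pathlib import Path


def expected_debug_files() -> list[str]:
    """File names documented for one receipt's debug directory."""
    names = []
    for n in range(1, 5):
        for suffix in ("prompt.txt", "raw.txt", "parsed.json"):
            names.append(f"agent{n}_{suffix}")
    names.append("final.json")
    return names


def missing_debug_files(debug_dir: Path) -> list[str]:
    """Return the documented file names that are absent from debug_dir."""
    return [name for name in expected_debug_files() if not (debug_dir / name).exists()]


# Example call, with receipt_id supplied by the caller:
#   missing_debug_files(Path.home() / ".finamt" / "debug" / receipt_id)
```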

finamt.agents.pipeline.run_pipeline(raw_text: str, pdf_path: str | Path | None, receipt_type: str, cfg: AgentsConfig | None = None, receipt_id: str | None = None, debug_root: Path | None = PosixPath('/home/docs/.finamt/debug'), taxpayer_info: dict | None = None) → ReceiptData

finamt.agents.prompts

Short, focused prompts for the 4-agent sequential extraction pipeline. Pattern: instruction → schema → text → output reminder (sandwich).
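The sandwich pattern can be sketched as follows. The wording of each part, the JSON rendering of the schema, and the helper name `build_sandwich_prompt` are assumptions for illustration; the actual prompt text used by finamt is not shown in this reference.

```python
import json


def build_sandwich_prompt(instruction: str, schema: dict, text: str) -> str:
    """Assemble instruction, schema, text, and a closing output reminder."""
    parts = [
        instruction.strip(),
        "Schema:\n" + json.dumps(schema, indent=2),
        "Text:\n" + text.strip(),
        "Return only a JSON object matching the schema above.",
    ]
    return "\n\n".join(parts)


prompt = build_sandwich_prompt(
    "Extract the receipt number and date.",
    {"receipt_number": "string or null", "receipt_date": "YYYY-MM-DD or null"},
    "Rechnung Nr. 2024-001 vom 31.01.2024",
)
```

Repeating the output requirement after the receipt text is the point of the sandwich: the last thing the model reads is the format it should answer in.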

finamt.agents.prompts.build_agent1_prompt(text: str) → str
finamt.agents.prompts.build_agent2_prompt(text: str, receipt_type: str, taxpayer_info: dict | None = None) → str
finamt.agents.prompts.build_agent3_prompt(text: str) → str
finamt.agents.prompts.build_agent4_prompt(text: str) → str

finamt.agents.llm_caller

Local LLM caller used by all 4 extraction agents. Handles debug output and JSON parsing with fallback. Inference is delegated to llm_backend (mlx-lm on Apple Silicon, transformers elsewhere).

finamt.agents.llm_caller.call_llm(prompt: str, cfg: AgentModelConfig, agent_name: str, expected_keys: list[str], debug_dir: Path | None = None) → dict | None

Send prompt to the local LLM backend, parse JSON response, return dict or None.

Saves to debug_dir:

  {agent_name}_prompt.txt
  {agent_name}_raw.txt
  {agent_name}_parsed.json
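JSON parsing with a fallback could work roughly as in the sketch below. The specific fallback (taking the outermost {...} span from the raw reply) and the handling of expected_keys (filling missing keys with None) are assumptions; the source states only that parsing has a fallback and that the result is a dict or None. `parse_llm_json` is a hypothetical helper, not the finamt implementation.

```python
from __future__ import annotations

import json
import re


def parse_llm_json(raw: str, expected_keys: list[str]) -> dict | None:
    """Parse a JSON object from raw model output, or return None."""
    candidates = [raw.strip()]
    # Fallback: the outermost {...} span, for replies wrapped in extra prose.
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match:
        candidates.append(match.group(0))
    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            # Assumption: missing expected keys are filled with None.
            return {**{key: None for key in expected_keys}, **parsed}
    return None


clean = parse_llm_json('{"total": 119.0}', ["total", "currency"])
wrapped = parse_llm_json('Here you go: {"total": 5} Hope that helps.', ["total"])
broken = parse_llm_json("no json here", ["total"])
```

Returning None instead of raising keeps the decision about retries or skipping with the caller.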