Project Structure

Every bank_statement_parser project follows a standard directory layout. The project folder contains all configuration, data, and export files for a set of bank statements.

Directory Layout

<project_root>/
  config/              # TOML configuration files
    HSBC_UK/           #   Bank-specific config subfolder
    TSB_UK/            #   (one per bank)
    account_types.toml #   Shared account type registry
    standard_fields.toml # Shared standard field mappings
    anonymise.toml     #   Anonymisation exclusion rules
  parquet/             # Parquet data files (permanent + temporary)
  database/
    project.db         # SQLite database (star-schema mart)
  export/
    csv/               # CSV report exports
    excel/             # Excel workbook exports
    json/              # JSON exports
  reporting/
    data/
      simple/          # Flat transactions CSV feed
      full/            # Star-schema CSV feeds
  statements/          # Source PDF copies (flat, one file per statement)
  log/
    debug/             # Per-statement debug output
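
As an illustration of the layout above, the following is a minimal sketch that builds the same directory tree with pathlib. It does not use bank_statement_parser itself. The names LAYOUT_DIRS and create_layout are hypothetical, and the bank-specific config subfolders are left out because they vary per project.

```python
from pathlib import Path

# Directories taken from the layout listed above. Bank-specific config
# subfolders (HSBC_UK/, TSB_UK/, ...) are omitted on purpose.
LAYOUT_DIRS = [
    "config",
    "parquet",
    "database",
    "export/csv",
    "export/excel",
    "export/json",
    "reporting/data/simple",
    "reporting/data/full",
    "statements",
    "log/debug",
]


def create_layout(root: Path) -> list[Path]:
    """Create every directory in LAYOUT_DIRS under root and return the paths."""
    created = []
    for rel in LAYOUT_DIRS:
        target = root / rel
        # parents=True also creates export/, reporting/data/ and log/.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created
```

Calling create_layout(Path("my_project")) produces only the empty folders. The TOML files and project.db shown in the layout are not created by this sketch.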

Project Lifecycle

When you run bsp process, the project is automatically validated or initialised via validate_or_initialise_project(). The decision table is:

| Condition | Action |
|---|---|
| project_path does not exist | raise ProjectFolderNotFound |
| Root exists; no config .toml files and no database/project.db | scaffold new project (create all dirs, copy default TOMLs, create empty project.db) |
| Root exists; config present; DB absent; root is the default bundled project | create project.db only (config is committed to source control; DB is excluded) |
| Root exists; config present; DB absent (custom project path) | raise ProjectDatabaseMissing |
| Root exists; DB present; config absent | raise ProjectConfigMissing |
| Root exists; both config and DB present | no-op (valid project) |
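
The decision table can be read as a pure function of four facts about the project path. The sketch below restates it that way. It is not the library's implementation: the Action enum and the decide function are hypothetical names, and the four boolean inputs are assumed to have been worked out by the caller.

```python
from enum import Enum


class Action(Enum):
    """Outcomes from the decision table (labels copied from the table)."""
    RAISE_FOLDER_NOT_FOUND = "raise ProjectFolderNotFound"
    SCAFFOLD = "scaffold new project"
    CREATE_DB_ONLY = "create project.db only"
    RAISE_DATABASE_MISSING = "raise ProjectDatabaseMissing"
    RAISE_CONFIG_MISSING = "raise ProjectConfigMissing"
    NO_OP = "no-op (valid project)"


def decide(root_exists: bool, has_config: bool, has_db: bool,
           is_default_project: bool) -> Action:
    """Evaluate the table rows top to bottom and return the matching action."""
    if not root_exists:
        return Action.RAISE_FOLDER_NOT_FOUND
    if not has_config and not has_db:
        return Action.SCAFFOLD
    if has_config and not has_db:
        # Only the default bundled project may have its database recreated.
        if is_default_project:
            return Action.CREATE_DB_ONLY
        return Action.RAISE_DATABASE_MISSING
    if has_db and not has_config:
        return Action.RAISE_CONFIG_MISSING
    return Action.NO_OP
```

For example, decide(True, True, False, False) corresponds to the "custom project path" row and yields Action.RAISE_DATABASE_MISSING.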

Python API

copy_project_folders()

copy_project_folders(destination: Path) -> list[Path]

Copy the project folder structure (directories only) to a destination.

Recursively copies every sub-directory of the default project folder into destination, creating the destination and any parents as needed. No files are included in the copy — only the folder hierarchy is reproduced.

Args:

  • destination — Root directory to create the project folder structure in. The directory (and any missing parents) will be created if it does not already exist.

Returns:

List of Path objects for every directory that was created.

Raises:

  • NotADirectoryError — If destination exists but is a file, not a directory.

Example:

    import bank_statement_parser as bsp
    from pathlib import Path

    bsp.copy_project_folders(Path("~/my_project").expanduser())
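
To make the "directories only" behaviour concrete, here is a minimal sketch of a copy that reproduces a folder hierarchy without any files. It is an approximation written with pathlib, not the body of copy_project_folders(). The function name copy_dirs_only and its explicit source argument are hypothetical, and whether the real function also reports the destination root in its return value is not stated in the documentation.

```python
from pathlib import Path


def copy_dirs_only(source: Path, destination: Path) -> list[Path]:
    """Recreate the sub-directory tree of source under destination, skipping files."""
    if destination.exists() and not destination.is_dir():
        raise NotADirectoryError(f"{destination} exists and is not a directory")
    destination.mkdir(parents=True, exist_ok=True)

    created = []
    # Sorting puts each parent directory before its children.
    for src_dir in sorted(p for p in source.rglob("*") if p.is_dir()):
        target = destination / src_dir.relative_to(source)
        if not target.exists():
            target.mkdir(parents=True)
            created.append(target)
    return created
```

In this sketch the returned list contains only the sub-directories that were newly created, so running it a second time against the same destination returns an empty list.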

validate_or_initialise_project()

validate_or_initialise_project(project_path: Path) -> None

Validate an existing project or initialise a new one at project_path.

This is the single gatekeeper for project-path correctness. It must be called once, at the top of Statement and StatementBatch (both in bank_statement_parser.modules.statements), before any downstream code touches files within the project. All other functions and classes that accept project_path rely on this guarantee: if required files are absent, they raise specific errors rather than trying to create them.

Decision table (evaluated top-to-bottom):

+------------------------------------------+-----------------------------------+
| Condition                                | Action                            |
+==========================================+===================================+
| project_path does not exist              | raise ProjectFolderNotFound       |
+------------------------------------------+-----------------------------------+
| Root exists; no config .toml files       | scaffold new project (create all  |
| and no database/project.db               | dirs, copy default TOMLs, create  |
|                                          | empty project.db)                 |
+------------------------------------------+-----------------------------------+
| Root exists; config present; DB absent;  | create project.db only            |
| root is the default bundled project      | (config is committed to source    |
|                                          | control; DB is excluded)          |
+------------------------------------------+-----------------------------------+
| Root exists; config present; DB absent   | raise ProjectDatabaseMissing      |
| (custom project path)                    |                                   |
+------------------------------------------+-----------------------------------+
| Root exists; DB present; config absent   | raise ProjectConfigMissing        |
+------------------------------------------+-----------------------------------+
| Root exists; both config and DB present  | no-op (valid project)             |
+------------------------------------------+-----------------------------------+

Args:

  • project_path — The project root directory to validate or initialise.

Raises:

  • ProjectFolderNotFound — If project_path does not exist.
  • ProjectDatabaseMissing — If the project looks like an existing project (config present) but database/project.db is absent and the project is not the default bundled project.
  • ProjectConfigMissing — If the project looks like an existing project (database present) but config/ contains no .toml files.
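
The conditions behind these exceptions can be checked by hand before processing. The sketch below probes a folder for the three facts the descriptions above rely on. ProjectState and probe_project are hypothetical names, not part of bank_statement_parser, and searching config/ recursively for .toml files is an assumption, since the documentation does not say whether bank subfolders count.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ProjectState:
    root_exists: bool
    has_config: bool  # at least one .toml file under config/
    has_db: bool      # database/project.db is present


def probe_project(project_path: Path) -> ProjectState:
    """Inspect project_path without creating or modifying anything."""
    config_dir = project_path / "config"
    # Assumption: TOML files in bank-specific subfolders also count as config.
    has_config = config_dir.is_dir() and any(config_dir.rglob("*.toml"))
    has_db = (project_path / "database" / "project.db").is_file()
    return ProjectState(project_path.exists(), has_config, has_db)
```

A state with has_config true and has_db false matches the situation described for ProjectDatabaseMissing, and the reverse matches ProjectConfigMissing.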

CLI Integration

The bsp process command handles project creation automatically:

    # Creates ./bsp_project/ if it doesn't exist
    bsp process --pdfs ~/statements

    # Use a custom project path
    bsp process --pdfs ~/statements --project ~/my_project

See the CLI Reference for all project-related options.