Export Options¶
bank_statement_parser supports exporting report data in CSV, Excel, and JSON formats,
with two export presets and a dedicated reporting feed for external BI tools. Exports are
generated automatically by bsp process or can be triggered manually via the Python API.
Export Presets¶
| Preset | Description |
|---|---|
| simple (default) | A single flat transactions table joining all dimensions. Best for spreadsheet analysis. |
| full | Separate star-schema tables (accounts, calendar, statements, transactions, balances, gaps) for loading into an external database or BI tool. |
CLI Usage¶
# Default: export simple preset in both CSV and Excel from database
bsp process --pdfs ~/statements
# Export full star-schema tables as CSV only
bsp process --pdfs ~/statements --export-type full --export-format csv
# Skip export entirely
bsp process --pdfs ~/statements --no-export
| Option | Choices | Default | Description |
|---|---|---|---|
| --export-type | simple, full | simple | Export preset |
| --export-format | excel, csv, json, all, reporting | all | Output file format |
| --no-export | - | off | Skip the export step entirely |
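When the CLI is driven from a script, the options in the table above can be assembled programmatically. The following is a minimal sketch and not part of bank_statement_parser: `build_process_command` is a hypothetical helper that uses only the flags and choices listed above and returns the argument list without running it.

```python
from pathlib import Path

# Choices copied from the options table above.
EXPORT_TYPES = ("simple", "full")
EXPORT_FORMATS = ("excel", "csv", "json", "all", "reporting")


def build_process_command(
    pdfs: Path,
    export_type: str = "simple",
    export_format: str = "all",
    no_export: bool = False,
) -> list[str]:
    """Return the argv list for `bsp process` (hypothetical helper)."""
    if export_type not in EXPORT_TYPES:
        raise ValueError(f"unknown export type: {export_type!r}")
    if export_format not in EXPORT_FORMATS:
        raise ValueError(f"unknown export format: {export_format!r}")

    cmd = ["bsp", "process", "--pdfs", str(pdfs)]
    if no_export:
        # --no-export skips the export step, so the other flags are left out.
        cmd.append("--no-export")
    else:
        cmd += ["--export-type", export_type, "--export-format", export_format]
    return cmd


print(build_process_command(Path("statements"), "full", "csv"))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once `bsp` is available on the PATH.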
Python API¶
Export functions¶
bsp.db provides export_csv(), export_excel(), export_json(), and export_reporting_data():
export_csv()¶
export_csv(folder: Path | None = None, type: Literal['single', 'multi'] = 'single', project_path: Path | None = None, batch_id: str | None = None, filename_timestamp: bool = False) -> None
Write report data to CSV files in folder.
Each table is written as a separate .csv file named after its logical
table name (e.g. transactions.csv, or statement_dimension.csv,
account_dimension.csv, etc. for type="multi").
When filename_timestamp is True:
- type="single": the timestamp is appended to the filename, e.g. transactions_20250331143022.csv.
- type="multi": files are written into a multi_20250331143022/ sub-folder inside folder with their original names.
When filename_timestamp is False:
- type="single": files are written directly to folder with their original names, e.g. transactions.csv.
- type="multi": files are written into a multi/ sub-folder inside folder with their original names.
Args:
- folder: Directory to write CSV files into. When None, the project's export/csv/ directory (resolved via project_path) is used and created automatically if absent.
- type: Export preset, either "single" (flat transactions table) or "multi" (separate star-schema tables for loading into a database). Defaults to "single".
- project_path: Optional project root used to resolve the default export folder and data sources. Falls back to the bundled default project when None.
- batch_id: Optional batch identifier to filter report data to a single batch. When None, all rows are exported.
- filename_timestamp: When True, append a human-readable timestamp (yyyymmddHHMMSS) to the filename (single) or create a timestamped sub-folder (multi). Defaults to False.
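The naming rules for export_csv() can be restated as a small path calculation, which is handy when a downstream script needs to locate a file after an export. This is a sketch only: `csv_export_path` is a hypothetical helper that mirrors the rules described above and does not call the library.

```python
from pathlib import Path
from typing import Optional


def csv_export_path(
    folder: Path,
    table: str,
    preset: str = "single",
    timestamp: Optional[str] = None,
) -> Path:
    """Predict where a table's CSV file lands (hypothetical helper)."""
    if preset == "single":
        # The timestamp, if any, is appended to the filename itself.
        stem = f"{table}_{timestamp}" if timestamp else table
        return folder / f"{stem}.csv"
    if preset == "multi":
        # Files keep their names; the timestamp goes on the sub-folder.
        sub = f"multi_{timestamp}" if timestamp else "multi"
        return folder / sub / f"{table}.csv"
    raise ValueError(f"unknown preset: {preset!r}")


base = Path("export/csv")
print(csv_export_path(base, "transactions", "single", "20250331143022").as_posix())
print(csv_export_path(base, "account_dimension", "multi").as_posix())
```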
export_excel()¶
export_excel(path: Path | None = None, type: Literal['single', 'multi'] = 'single', project_path: Path | None = None, batch_id: str | None = None, filename_timestamp: bool = False) -> None
Write report data to an Excel workbook at path.
Each table is written as a separate worksheet. For type="single" a
single transactions sheet is written; for type="multi" six sheets
are written (statement_dimension, account_dimension,
calendar_dimension, transaction_measures,
daily_account_balances, missing_statement_report).
Filename conventions:
- type="single", no timestamp: transactions.xlsx
- type="single", with timestamp: transactions_20250331143022.xlsx
- type="multi", no timestamp: transactions_multi.xlsx
- type="multi", with timestamp: transactions_multi_20250331143022.xlsx
Worksheet names are never modified by the timestamp or type logic.
Args:
- path: Full file path for the output .xlsx workbook. When None, the file is written to export/excel/transactions.xlsx inside the project directory resolved via project_path.
- type: Export preset, either "single" (flat transactions table) or "multi" (separate star-schema sheets for loading into a database). Defaults to "single".
- project_path: Optional project root used to resolve the default export folder and data sources. Falls back to the bundled default project when None.
- batch_id: Optional batch identifier to filter report data to a single batch. When None, all rows are exported.
- filename_timestamp: When True, append a human-readable timestamp (yyyymmddHHMMSS) to the workbook filename. Worksheet names are unaffected. Defaults to False.
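The yyyymmddHHMMSS timestamp corresponds to the `strftime` pattern `%Y%m%d%H%M%S`. The sketch below reproduces the workbook filename conventions listed above. `workbook_filename` is a hypothetical helper, and whether the library stamps local time or UTC is not stated here, so the caller supplies the datetime.

```python
from datetime import datetime
from typing import Optional


def workbook_filename(preset: str = "single", when: Optional[datetime] = None) -> str:
    """Build the workbook filename for a preset and optional timestamp."""
    parts = ["transactions"]
    if preset == "multi":
        parts.append("multi")
    elif preset != "single":
        raise ValueError(f"unknown preset: {preset!r}")
    if when is not None:
        # yyyymmddHHMMSS, e.g. 20250331143022
        parts.append(when.strftime("%Y%m%d%H%M%S"))
    return "_".join(parts) + ".xlsx"


print(workbook_filename("multi", datetime(2025, 3, 31, 14, 30, 22)))
```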
export_json()¶
export_json(folder: Path | None = None, type: Literal['single', 'multi'] = 'single', project_path: Path | None = None, batch_id: str | None = None, filename_timestamp: bool = False) -> None
Write report data to JSON files in folder.
Each table is written as a separate .json file containing a JSON array
of row objects, named after its logical table name (e.g.
transactions.json, or statement_dimension.json,
account_dimension.json, etc. for type="multi").
When filename_timestamp is True:
- type="single": the timestamp is appended to the filename, e.g. transactions_20250331143022.json.
- type="multi": files are written into a multi_20250331143022/ sub-folder inside folder with their original names.
When filename_timestamp is False:
- type="single": files are written directly to folder with their original names, e.g. transactions.json.
- type="multi": files are written into a multi/ sub-folder inside folder with their original names.
Args:
- folder: Directory to write JSON files into. When None, the project's export/json/ directory (resolved via project_path) is used and created automatically if absent.
- type: Export preset, either "single" (flat transactions table) or "multi" (separate star-schema tables for loading into a database). Defaults to "single".
- project_path: Optional project root used to resolve the default export folder and data sources. Falls back to the bundled default project when None.
- batch_id: Optional batch identifier to filter report data to a single batch. When None, all rows are exported.
- filename_timestamp: When True, append a human-readable timestamp (yyyymmddHHMMSS) to the filename (single) or create a timestamped sub-folder (multi). Defaults to False.
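Because each exported file holds a single JSON array of row objects, it can be consumed with the standard library alone. The sketch below writes a stand-in transactions.json to a temporary folder and totals values per account. The column names `account` and `value` are assumptions made for illustration; the real keys depend on the exported schema.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical rows; real column names come from the library's export.
sample_rows = [
    {"account": "current", "value": -12.5},
    {"account": "current", "value": 100.0},
    {"account": "savings", "value": 40.0},
]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "transactions.json"
    path.write_text(json.dumps(sample_rows), encoding="utf-8")

    # One file holds one JSON array; each element is one row object.
    rows = json.loads(path.read_text(encoding="utf-8"))

# Sum the value column per account.
totals: dict[str, float] = {}
for row in rows:
    totals[row["account"]] = totals.get(row["account"], 0.0) + row["value"]

print(totals)
```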
export_reporting_data()¶
Write CSV reporting feeds to the project's reporting/data/ sub-directories.
Calls export_csv() twice: once with type="single" writing to
reporting/data/single/ and once with type="multi" writing to
reporting/data/multi/ (created as a sub-folder of reporting/data/
by the multi-export logic). Both directories are created automatically if
absent.
This produces a stable set of CSV files that external reporting tools (e.g. Power BI, Tableau, Excel) can point at directly without needing to know about the full export machinery.
Args:
- project_path: Optional project root directory. Falls back to the bundled default project when None.
Example:
import bank_statement_parser as bsp
from pathlib import Path
bsp.db.export_reporting_data(project_path=Path("/my/project"))
# Writes:
# /my/project/reporting/data/single/transactions.csv
# /my/project/reporting/data/multi/statement_dimension.csv
# /my/project/reporting/data/multi/account_dimension.csv
# /my/project/reporting/data/multi/calendar_dimension.csv
# /my/project/reporting/data/multi/transaction_measures.csv
# /my/project/reporting/data/multi/daily_account_balances.csv
# /my/project/reporting/data/multi/missing_statement_report.csv
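Before pointing a BI tool at the feed, it can be useful to confirm that every expected file is present. The following sketch checks for the seven files listed in the example above. `missing_feed_files` is a hypothetical helper, not a library function.

```python
from pathlib import Path

# Relative paths under reporting/data/, taken from the example above.
FEED_FILES = [
    "single/transactions.csv",
    "multi/statement_dimension.csv",
    "multi/account_dimension.csv",
    "multi/calendar_dimension.csv",
    "multi/transaction_measures.csv",
    "multi/daily_account_balances.csv",
    "multi/missing_statement_report.csv",
]


def missing_feed_files(project_path: Path) -> list[str]:
    """Return the expected feed files that do not exist yet."""
    data_dir = project_path / "reporting" / "data"
    return [name for name in FEED_FILES if not (data_dir / name).is_file()]


# An empty list means the feed is complete for this project.
print(missing_feed_files(Path("/my/project")))
```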
Usage examples¶
from pathlib import Path
import bank_statement_parser as bsp
# Export the flat transactions table to CSV (default project, type='single')
bsp.db.export_csv()
# Export the star-schema tables to Excel
bsp.db.export_excel(type='multi')
# Export JSON
bsp.db.export_json()
# Write stable CSV feeds for BI tools (both presets)
bsp.db.export_reporting_data()
# Export to a custom directory; expanduser() resolves "~", which Path does not expand on its own
bsp.db.export_csv(folder=Path('~/exports').expanduser())
bsp.db.export_excel(path=Path('~/exports/report.xlsx').expanduser())
Report Classes¶
All report classes expose a .all attribute containing a pl.LazyFrame.
Call .collect() to materialise the data.
import bank_statement_parser as bsp
# Read from the DB backend
df = bsp.db.FlatTransaction().all.collect()
Available classes¶
FlatTransaction¶
Denormalised transaction view joining all dimensions. One row per transaction with date, account, statement, and value columns.
FactBalance¶
Daily balance series per account. Forward-filled from statement data to cover every calendar date.
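Forward-filling means that a date without its own balance inherits the most recent known one. The sketch below shows the idea in plain Python for a single account. It is an illustration of the technique only, and how the library derives the known balances or bounds the date range is not covered here.

```python
from datetime import date, timedelta


def forward_fill_balances(known: dict[date, float]) -> dict[date, float]:
    """Fill every calendar date between the first and last known balance."""
    if not known:
        return {}
    day, last_day = min(known), max(known)
    filled: dict[date, float] = {}
    balance = known[day]
    while day <= last_day:
        # Use the day's own balance if present, else carry the previous one.
        balance = known.get(day, balance)
        filled[day] = balance
        day += timedelta(days=1)
    return filled


known = {date(2025, 1, 1): 100.0, date(2025, 1, 4): 80.0}
for day, balance in forward_fill_balances(known).items():
    print(day, balance)
```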
DimTime¶
Calendar dimension. One row per date spanning the full transaction date range, with year, quarter, month, week, and day attributes.
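A calendar dimension of this shape can be generated by walking the date range one day at a time. The sketch below is illustrative: `calendar_rows` and its key names are hypothetical, and the use of ISO week numbers is an assumption rather than a documented property of DimTime.

```python
from datetime import date, timedelta


def calendar_rows(start: date, end: date) -> list[dict]:
    """Build one row per date from start to end, inclusive."""
    rows = []
    day = start
    while day <= end:
        rows.append({
            "date": day,
            "year": day.year,
            "quarter": (day.month - 1) // 3 + 1,
            "month": day.month,
            "week": day.isocalendar()[1],  # ISO week number (assumption)
            "day": day.day,
        })
        day += timedelta(days=1)
    return rows


for row in calendar_rows(date(2025, 3, 31), date(2025, 4, 1)):
    print(row)
```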
DimStatement¶
Statement dimension. One row per parsed PDF statement with statement date, filename, and batch timestamp.
DimAccount¶
Account dimension. One row per unique account with company, type, number, sort code, and holder.
FactTransaction¶
Transaction fact table. One row per transaction with foreign keys to dimension tables.
GapReport¶
Gap detection report. Flags periods where the closing balance of one statement does not match the opening balance of the next. Access .gaps for filtered gap rows only.
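The balance-chaining check described above can be sketched in a few lines. This is an illustration of the idea, not the library's implementation: the `Statement` tuple and `find_gaps` are hypothetical, and the statements are assumed to belong to one account and to be sorted by date.

```python
from decimal import Decimal
from typing import NamedTuple


class Statement(NamedTuple):
    name: str
    opening: Decimal
    closing: Decimal


def find_gaps(statements: list[Statement]) -> list[tuple[str, str]]:
    """Return (previous, next) name pairs whose balances do not chain."""
    gaps = []
    for prev, nxt in zip(statements, statements[1:]):
        # A mismatch suggests at least one statement is missing in between.
        if prev.closing != nxt.opening:
            gaps.append((prev.name, nxt.name))
    return gaps


statements = [
    Statement("jan", Decimal("0.00"), Decimal("150.00")),
    Statement("feb", Decimal("150.00"), Decimal("90.25")),
    Statement("apr", Decimal("75.00"), Decimal("60.00")),  # march is absent
]
print(find_gaps(statements))
```

Decimal is used so that balances compare exactly, which floating-point values would not guarantee.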
Output Files¶
Simple preset¶
| Format | File | Contents |
|---|---|---|
| CSV | export/csv/transactions_table.csv | Flat transaction table |
| Excel | export/excel/transactions.xlsx | Sheet: transactions_table |
| JSON | export/json/transactions_table.json | Flat transaction table (JSON array) |
Full preset¶
| Format | Files / Sheets | Contents |
|---|---|---|
| CSV | export/csv/ with files: statement.csv, account.csv, calendar.csv, transactions.csv, balances.csv, gaps.csv | Separate star-schema tables |
| Excel | export/excel/transactions.xlsx with sheets: statement, account, calendar, transactions, balances, gaps | Star-schema workbook |
| JSON | export/json/ with files: statement.json, account.json, calendar.json, transactions.json, balances.json, gaps.json | Separate star-schema tables (JSON arrays) |
Reporting feed (--export-format reporting)¶
| Preset | Path | Contents |
|---|---|---|
| simple | reporting/data/simple/transactions_table.csv | Flat transaction table |
| full | reporting/data/full/ with files: statement.csv, account.csv, calendar.csv, transactions.csv, balances.csv, gaps.csv | Star-schema CSV feeds |