How to clean scanned bank statements for bookkeeping
How to handle scanned or low-quality bank statement PDFs before using extracted transactions for bookkeeping, reconciliation, and review.
Who this is for
Bookkeepers and business owners working with scanned statement PDFs.
Scanned statements need extra review
A scanned PDF is usually a set of page images wrapped in a PDF file. There is often no selectable text layer, so extraction depends on OCR and layout cleanup.
That makes review more important. Treat the output as a starting point, not a final accounting record.
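One way to decide up front whether a file needs OCR is to look at how much text comes back per page. The sketch below assumes you have already pulled each page's text into a list of strings with whatever PDF tool you use. The function name and the 50-character threshold are illustrative assumptions, not settings from any particular product.

```python
def pages_needing_ocr(page_texts, min_chars=50):
    """Return 1-based page numbers whose extracted text is too short to trust.

    page_texts: one string (or None) per page, from your PDF text extractor.
    min_chars:  hypothetical threshold; a real statement page normally
                contains far more non-space characters than this.
    """
    flagged = []
    for number, text in enumerate(page_texts, start=1):
        visible = "".join((text or "").split())  # drop all whitespace
        if len(visible) < min_chars:
            flagged.append(number)
    return flagged
```

Pages returned by this check are likely image-only and are candidates for OCR rather than plain text extraction.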
Improve the file before extraction
Better scans produce better rows. If you can control the source file, scan straight, avoid shadows, include the full page, and keep resolution high enough for small transaction text.
- Use clear black-and-white or grayscale scans
- Avoid cropped dates and balances
- Keep pages in order
- Avoid phone photos with perspective distortion
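Resolution can be checked with simple arithmetic: effective DPI is the pixel width of the page image divided by the paper width in inches. The sketch below assumes US Letter paper (8.5 inches wide) and uses 300 DPI as a target. That figure is a common rule of thumb for OCR on small print, not a requirement stated by any extraction tool, so adjust both values to your situation.

```python
def effective_dpi(pixel_width, page_width_inches=8.5):
    """Effective horizontal resolution of a scanned page image."""
    if page_width_inches <= 0:
        raise ValueError("page width must be positive")
    return pixel_width / page_width_inches


def is_sharp_enough(pixel_width, page_width_inches=8.5, target_dpi=300):
    """True when the scan meets the (assumed) target resolution."""
    return effective_dpi(pixel_width, page_width_inches) >= target_dpi
```

For example, a Letter page scanned at 2550 pixels wide works out to 300 DPI, while the same page at 1275 pixels is only 150 DPI and may blur small transaction text.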
Use confidence to decide next steps
If standard extraction reports low confidence or cannot read the rows cleanly, advanced extraction can attempt OCR or AI cleanup. After you download the results, compare row counts and ending balances against the original statement.
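The decision can be written down as a small rule. This is a minimal sketch, assuming each extracted row carries a hypothetical `confidence` value between 0 and 1. The key name, the thresholds, and the step labels are all illustrative assumptions, not the behavior of any specific tool.

```python
def choose_next_step(rows, low=0.80, max_low_share=0.10):
    """Suggest a next step from per-row confidence values.

    rows: list of dicts with a hypothetical 'confidence' key in [0, 1].
    Returns 'reconcile', 'review_flagged_rows', or 'advanced'.
    """
    if not rows:
        return "advanced"  # nothing was read, so standard extraction failed
    low_rows = [r for r in rows if r.get("confidence", 0.0) < low]
    share = len(low_rows) / len(rows)
    if share == 0:
        return "reconcile"
    if share <= max_low_share:
        return "review_flagged_rows"
    return "advanced"
```

Even when the rule returns "reconcile", the row counts and ending balance still need to be compared against the statement.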
Step-by-step workflow
Upload the cleanest available PDF
Use the original bank PDF if possible. If not, use the highest-quality scan you have.
Run standard extraction
Scans often come back with low confidence, but the preview still shows whether the file has a usable text layer.
Run advanced extraction when needed
Use advanced extraction for scanned pages, low-confidence rows, or missing transaction tables.
Reconcile before import
Compare totals, row counts, and ending balances before importing into accounting software.
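The reconciliation itself is arithmetic: the opening balance plus every signed transaction amount should equal the ending balance printed on the statement. The sketch below assumes amounts are signed strings (deposits positive, withdrawals negative). The function name and the returned keys are illustrative. It uses `Decimal` so that cents are not lost to floating-point rounding.

```python
from decimal import Decimal


def reconcile(opening_balance, amounts, stated_ending_balance, expected_rows=None):
    """Compare extracted rows with the figures printed on the statement.

    amounts: list of signed amount strings, e.g. ["-45.10", "250.00"].
    expected_rows: transaction count from the statement, if you counted it.
    """
    opening = Decimal(opening_balance)
    ending = Decimal(stated_ending_balance)
    computed = opening + sum((Decimal(a) for a in amounts), Decimal("0"))
    return {
        "computed_ending": computed,
        "difference": computed - ending,
        "balance_ok": computed == ending,
        "row_count_ok": expected_rows is None or len(amounts) == expected_rows,
    }
```

A nonzero `difference` is a useful clue: if it equals one transaction amount exactly, a row was probably dropped or duplicated during extraction.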
Review checklist
- Pages are in statement order
- Dates, descriptions, and amounts are readable in the source scan
- OCR did not confuse 0/O, 1/I, or decimal points
- Row count is reasonable for the statement period
- Ending balance reconciles after cleanup
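The character-confusion item on this checklist can be partly automated. The sketch below flags amount strings that are not in a plain `1,234.56` style format and suggests a repair when swapping common look-alike characters would fix them. The accepted format, the look-alike table, and the status labels are assumptions, and currency symbols are not handled. Treat any suggestion as a prompt to check the scan, not as a correction.

```python
import re

# A clean amount: optional minus sign, digits (with or without thousands
# commas), then exactly two decimal places.
CLEAN_AMOUNT = re.compile(r"^-?\d{1,3}(,\d{3})*\.\d{2}$|^-?\d+\.\d{2}$")

# Letters that OCR commonly substitutes for digits.
LOOKALIKES = str.maketrans({"O": "0", "o": "0", "I": "1", "l": "1"})


def check_amount(raw):
    """Return (status, suggestion) for one amount string from OCR output."""
    text = raw.strip()
    if CLEAN_AMOUNT.match(text):
        return ("ok", text)
    repaired = text.translate(LOOKALIKES)
    if CLEAN_AMOUNT.match(repaired):
        return ("lookalike", repaired)  # suggestion only; confirm against the scan
    return ("unreadable", None)
```

An amount such as `12499` with no decimal point comes back as unreadable, which is the right outcome: the check cannot tell whether the value was 124.99 or 12,499.00.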
Frequently asked questions
Why does a scanned PDF say no selectable text?
The file likely contains page images instead of embedded text, so there is nothing to select. OCR or advanced extraction is needed to read the transaction rows.
Are scanned statement exports safe to import directly?
You should review them first. Scans can produce OCR mistakes, especially on small decimals, dates, and wrapped descriptions.
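Date mistakes are often easy to catch because a misread date either fails to parse or lands outside the statement period. The sketch below assumes dates appear as MM/DD/YYYY. The function name and the default format are illustrative, so change `fmt` to match your bank's layout.

```python
from datetime import date, datetime


def dates_outside_period(date_strings, period_start, period_end, fmt="%m/%d/%Y"):
    """Return (index, value) pairs for dates that do not parse or fall outside the period."""
    problems = []
    for i, value in enumerate(date_strings):
        try:
            parsed = datetime.strptime(value.strip(), fmt).date()
        except ValueError:
            problems.append((i, value))  # unparseable, e.g. a letter read as a digit
            continue
        if not (period_start <= parsed <= period_end):
            problems.append((i, value))
    return problems
```

Rows flagged here should be compared against the source scan before import. A date that parses and falls inside the period can still be wrong, so this check narrows the review rather than replacing it.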