OCR (Optical Character Recognition)

Optical character recognition (OCR) converts scanned PDFs and image-based documents into searchable, indexable text. OCR is essential in virtual data rooms (VDRs) because much of the source documentation in European M&A, particularly older real-estate, legal, and HR documents, exists only as scans.

Modern VDRs run OCR automatically on upload. Once a document has been OCR'd, its text becomes searchable across the whole room and can be processed by AI redaction tools.
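
To illustrate why OCR output makes a room searchable, here is a minimal Python sketch of an inverted index built over extracted text. It is not how any particular VDR implements search. It assumes the OCR step has already run and produced plain text per document; the file names, the sample text, and the function names `tokenize`, `build_index`, and `search` are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def build_index(ocr_text_by_doc):
    """Map each token to the set of document names whose OCR text contains it."""
    index = defaultdict(set)
    for doc_name, text in ocr_text_by_doc.items():
        for token in tokenize(text):
            index[token].add(doc_name)
    return index


def search(index, query):
    """Return the sorted names of documents that contain every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    # index.get avoids creating empty entries for unknown tokens.
    matches = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(matches)


# Hypothetical OCR output, keyed by made-up file names (assumption, not real data).
ocr_text_by_doc = {
    "lease_1987.pdf": "Lease agreement between the landlord and the tenant",
    "employment_contract.pdf": "Employment agreement between the company and the employee",
    "land_register.pdf": "Extract from the land register for the leased property",
}

index = build_index(ocr_text_by_doc)

print(search(index, "agreement"))        # ['employment_contract.pdf', 'lease_1987.pdf']
print(search(index, "lease agreement"))  # ['lease_1987.pdf']
```

A scan that has not been OCR'd has no text to tokenize, so it never enters the index and cannot be found by any query. The sketch matches whole words only ("leased" does not match "lease"); real search engines typically add stemming, phrase matching, and ranking.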

Last updated: May 2026.