OCR (Optical Character Recognition)
Optical character recognition (OCR) is the conversion of scanned PDFs and image-based documents into searchable, indexable text. OCR is essential in virtual data rooms because much of the source documentation in European M&A, particularly older real-estate, legal, and HR documents, exists only as scans.
Modern VDRs run OCR automatically on upload. Once OCR'd, documents become searchable across the room and can be redacted by AI tools.
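As a rough illustration of the "OCR on upload, then search across the room" flow, here is a minimal Python sketch. Everything in it is an assumption made for illustration: `RoomIndex`, `upload`, `search`, and `fake_ocr` are hypothetical names, not any VDR vendor's API, and `fake_ocr` merely stands in for a real OCR engine by treating the uploaded bytes as if they were already the recognized text.

```python
import re
from collections import defaultdict
from typing import Callable, Dict, Set


class RoomIndex:
    """Toy full-text index for a data room (hypothetical, illustration only)."""

    def __init__(self, ocr: Callable[[bytes], str]) -> None:
        self._ocr = ocr  # pluggable OCR step; a real engine would go here
        self._postings: Dict[str, Set[str]] = defaultdict(set)

    def upload(self, doc_id: str, scan: bytes) -> None:
        """Run OCR on the uploaded scan and index every word it yields."""
        text = self._ocr(scan)
        for token in re.findall(r"\w+", text.lower()):
            self._postings[token].add(doc_id)

    def search(self, query: str) -> Set[str]:
        """Return the documents that contain every word in the query."""
        tokens = re.findall(r"\w+", query.lower())
        if not tokens:
            return set()
        result = set(self._postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self._postings.get(token, set())
        return result


def fake_ocr(scan: bytes) -> str:
    # Assumption: stand-in for a real OCR engine. It simply decodes the
    # bytes as UTF-8 text instead of recognizing characters in an image.
    return scan.decode("utf-8")


index = RoomIndex(ocr=fake_ocr)
index.upload("lease-1998.pdf", b"Lease agreement for the Munich office")
index.upload("hr-policy.pdf", b"Employment agreement template")
print(sorted(index.search("agreement")))
print(index.search("munich lease"))
```

The OCR step is passed in as a function so that the indexing logic stays independent of whichever OCR engine a given room might use. A production system would also need to handle recognition errors, multiple languages, and page-level positions for redaction, none of which this sketch attempts.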
Published: May 2026. Updated: 18 June 2026.