MC966632 - Microsoft Purview | Information Protection: Enhanced OCR support for embedded images for Microsoft Exchange Online

Service

Microsoft 365 suite
Microsoft Purview

Published

Dec 27, 2024

Tag

New feature
User impact
Admin impact

Platforms

Web

Summary

Microsoft Purview | Information Protection's OCR capabilities will be enhanced to scan embedded images in files for sensitive content in Exchange Online, with a rollout starting mid-January 2025 for Public Preview and mid-February 2025 for General Availability. This update is associated with Microsoft 365 Roadmap ID 323899.

More information

Optical Character Recognition (OCR) scanning enables Microsoft Purview | Information Protection to scan images for sensitive information. With this rollout, we will enhance the OCR capabilities to scan for sensitive content in images embedded in other files (such as a .png image file in a Microsoft Word .docx file) with a limit of 20 embedded images scanned per file for Microsoft Exchange Online. Before this rollout, OCR was able to scan standalone images only.

This message is associated with Microsoft 365 Roadmap ID 323899.

When this will happen:

Public Preview: We will begin rolling out mid-January 2025 and expect to complete by late January 2025.

General Availability (Worldwide): We will begin rolling out mid-February 2025 and expect to complete by late February 2025.

How this will affect your organization:

Optical character recognition (OCR) supports images embedded in Microsoft Office files or archive files in Exchange Online.

OCR will scan images embedded in these Microsoft Office file types:

  • Microsoft Word (.docx)
  • Microsoft PowerPoint (.pptx)
  • Microsoft Excel (.xlsx)
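
Admins who want a rough idea of which attachments could exceed the limit of 20 embedded images scanned per file can count the images inside an Office file before it is sent. The following is a minimal sketch, not part of Microsoft Purview. It assumes that Office Open XML packages are ZIP archives that store embedded media under word/media/, ppt/media/, or xl/media/, and the function names and constants are hypothetical. How the service itself counts images, or which 20 it scans, is not described in this message.

```python
import os
import zipfile

# Limit stated in this message: 20 embedded images scanned per file.
EMBEDDED_IMAGE_LIMIT = 20

# Assumption: each Office Open XML package keeps embedded media in this folder.
MEDIA_FOLDERS = {
    ".docx": "word/media/",
    ".pptx": "ppt/media/",
    ".xlsx": "xl/media/",
}

# Image extensions from the standalone list in this message;
# .jpeg and .tif are included as assumed aliases.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def count_embedded_images(path):
    """Count image parts stored in the media folder of an Office file."""
    suffix = os.path.splitext(path)[1].lower()
    folder = MEDIA_FOLDERS.get(suffix)
    if folder is None:
        raise ValueError(f"not a supported Office file type: {suffix!r}")
    with zipfile.ZipFile(path) as package:
        return sum(
            1
            for name in package.namelist()
            if name.startswith(folder)
            and os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS
        )


def exceeds_scan_limit(path, limit=EMBEDDED_IMAGE_LIMIT):
    """Return True when the file holds more images than the stated limit."""
    return count_embedded_images(path) > limit
```

For example, exceeds_scan_limit("report.docx") returns True when the hypothetical file report.docx contains more than 20 embedded images of the listed types. Such a file would still be scanned, but per this message only up to 20 of its embedded images are covered.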

OCR will also scan images embedded in .pdf files that contain both searchable text and images.

OCR will scan images embedded in these container and archive file types:

  • .rar
  • .tar
  • .zip
  • .7z

After a policy matches, you will see the matches in Alerts, Activity explorer, email incident reports, and Content explorer, just as you would for any other matched file.

This change will be available by default for admins to configure. There are no expected changes to the user experience.

What you need to do to prepare:

This rollout will happen automatically by the specified dates with no admin action required before the rollout. For existing customers using OCR for the Exchange workload, images embedded in files will also be scanned by OCR by default, and customers will be billed per usage. New customers not yet using OCR can turn it on to start scanning both standalone and embedded images in Exchange.

Standalone images supported by OCR:

  • .jpg
  • .png
  • .bmp
  • .tiff
  • .pdf (images only)

Learn more: Learn about optical character recognition in Microsoft Purview | Microsoft Learn (will be updated before rollout)