Quantifying User Mailbox Sizes in Microsoft 365’s Exchange Online

 

Planning for Mailbox Crawling in Microsoft 365

In many initial deployment scenarios for DataCove, an organization will want to import all of its existing emails into the system to get a running start on its archiving and compliance journey. One of the fastest ways to achieve this is DataCove’s Exchange Crawler, which imports all emails currently live within a user’s mailbox from on-premises Exchange, Microsoft 365’s Exchange Online, or both.

Identifying how much data will be brought in is an important factor, however: it could be anywhere from a few dozen gigabytes to several terabytes, which can have an enormous impact on how much of the DataCove’s capacity is consumed right off the bat. Since these numbers are so variable, it’s best to understand how much data will be coming in so that the DataCove system can be sized appropriately in advance and the potential impact of Retention Policies on the imported data can be modeled.

This article briefly reviews how to quantify the total amount of live data inside the M365 Exchange tenant’s mailboxes. The process discussed below produces a “total” size across all mailboxes and also allows individual mailboxes to be inspected for their size in the event only select personnel are being Crawled. These totals are the best numbers to use when estimating how much data is expected to populate the DataCove once Crawled.

There are a couple of caveats to obtaining the Mailbox sizes in Exchange Online:

The counts shown include neither the Archive Mailboxes (if enabled) of the user mailboxes nor any data in the Recoverable Items Folder.

  1. The Recoverable Items Folder is the equivalent of the Recycle Bin, but for Exchange mailboxes. Emails that are deleted from the Deleted Items folder are moved to this hidden folder and purged from the mailbox entirely after 14 days by default (calendar items are kept for 120 days).

  2. Archive Mailboxes are Exchange’s method of retaining additional data in a separate mailbox that is associated with the user’s primary mailbox. This allows users to find and retrieve emails in much the same way as they would within their current mailbox, but without the risk of a single mailbox growing too large and unwieldy to be handled by M365’s backend Exchange servers.

Both of these locations are optional for the DataCove Exchange Crawler to fetch from and are not always used, but importing them is generally recommended whenever a large-scale Crawl is being performed.

  • The Recoverable Items Folder is generally a small folder containing a couple of weeks’ worth of mail flow for the user and is negligible in the grand scheme of data imports.

  • The Archive Mailboxes can be significant in size, ranging from a few gigabytes to 50GB+ depending on utilization and licensing levels. Identifying the size of the Archive Mailbox can be performed individually within the Exchange Admin Center of M365, but obtaining a list en masse is best performed using PowerShell commands, with Elora Krizel’s excellent article and script covering the process linked below.

Note that O365Reports is a third-party website; Tangent has no control over its content or any changes made to it.

Please read more about this process linked here at the O365Reports website.

 

Running the Mailbox Size Report

Begin by logging into the Microsoft 365 admin center: open a web browser, visit Portal.Office.Com and log in with administrative credentials.

Once logged into the Admin Center, open the left-hand side menu (sometimes hidden in the “hamburger” stack) and click on the Reports menu.

The Reports menu will then expand to offer some additional options; select Usage from the submenu.

Now in the Usage Admin Center, select Exchange on the left-hand side menu to narrow the list of Usage statistics down to just Exchange Online.

Select Mailbox Usage in the top menu.

This will now present two different sections of data that will allow for quick estimation of data to import.

  1. The middle “Storage” graph will show the total amount of Exchange data inside the tenant itself. This is the number to use when ascertaining an approximate amount of data that will be imported to the DataCove when Crawling the entire tenant and all user mailboxes.

  2. Below the data graphs are “Storage Used” counts for individual users, showing the size of each user’s mailbox. If looking to import data for just a handful of users, rather than all users, these numbers are the best ones to pull from for a quick idea of how much data will be coming in.
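When only a handful of users are being Crawled, adding up their individual “Storage Used” values by hand can get tedious. The following is a minimal Python sketch of that arithmetic, assuming the per-user table has been exported to a CSV file. The column names, the sample addresses and the function name total_storage_gib are all illustrative assumptions rather than a documented format, so adjust them to match the actual export.

```python
import csv
import io

# Hypothetical sample in the shape of an exported Mailbox Usage report.
# The column names are assumptions; change them to match the real export.
SAMPLE_EXPORT = """User Principal Name,Storage Used (Byte)
alice@example.com,5368709120
bob@example.com,16106127360
carol@example.com,1073741824
"""


def total_storage_gib(report_file, users=None,
                      user_col="User Principal Name",
                      size_col="Storage Used (Byte)"):
    """Sum mailbox storage in GiB, optionally limited to selected users."""
    # Compare addresses case-insensitively; None means "all mailboxes".
    wanted = {u.lower() for u in users} if users else None
    total_bytes = 0
    for row in csv.DictReader(report_file):
        if wanted is None or row[user_col].strip().lower() in wanted:
            total_bytes += int(row[size_col])
    return total_bytes / 1024 ** 3


# Whole tenant: every row in the export is counted.
tenant_total = total_storage_gib(io.StringIO(SAMPLE_EXPORT))

# Selected personnel only.
selected_total = total_storage_gib(
    io.StringIO(SAMPLE_EXPORT),
    users=["alice@example.com", "carol@example.com"],
)

print(f"Whole tenant:   {tenant_total:.1f} GiB")
print(f"Selected users: {selected_total:.1f} GiB")
```

To run this against a real export, replace io.StringIO(SAMPLE_EXPORT) with an opened file, for example open("MailboxUsage.csv", newline="", encoding="utf-8-sig"), where the file name is again only a placeholder. As with the report itself, the result excludes Archive Mailboxes and the Recoverable Items Folder.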

With these numbers in hand, a good estimate of how much storage DataCove will consume right off the bat can be calculated. Exchange Online uses similar deduplication to DataCove itself, so the size reported here is often close to a 1:1 match for the space the imported data will occupy.
