You’re probably well aware of the Office 365 Import Service and the way that it can import PSTs into mailboxes, whether those PSTs arrive on drives shipped to a Microsoft datacenter or are uploaded over the network. Well, as it turns out, PSTs are just the first type of data that the Import Service can ingest into Office 365.
Recently, Microsoft quietly published a TechNet page to tell readers that the service can now process packages created from data extracted from file shares or SharePoint lists, sites, or libraries, and ingest the data those packages contain into SharePoint Online. In contrast to the way it publicizes most developments inside Office 365, Microsoft has been peculiarly reticent in informing the world that this capability exists.
Perhaps the reason is that Microsoft is tweaking the new SharePoint import capability to make sure that it runs effectively and efficiently. But maybe it’s because they’re working on something even bigger, a hint of which was given on June 30 when they announced partnerships with Actiance and Globanet to archive non-Microsoft data in Office 365.
The idea is that companies like Globanet and Actiance have software that can extract information from various repositories where corporations store data. These include:
- Social media: Think of the corporate Facebook page and Twitter account that serve as the public face for many companies.
- Instant Messaging: Yahoo Messenger, Google Talk, and Cisco Jabber
- Document storage: Dropbox, Box, and similar sites
- Vertical applications: Salesforce Chatter and the IM applications run by companies like Thomson Reuters and Bloomberg that are often found in the financial sector
- SMS/Text messaging: BlackBerry IM, MobileGuard, and so on
Obviously, Microsoft has the ability to extract information from data sources that it “owns”, such as PSTs, SharePoint sites, and file shares. Although it could determine how to extract interesting data from repositories like Jabber and Yammer, other companies already have that capability. The partnerships that Microsoft has announced (and some others that they are working on) are based on a simple strategy:
- Partners understand how to mine various sources of corporate data. They understand how to harvest the information and how to manage that harvesting on an ongoing basis.
- Microsoft provides those partners with a specification to build “import packages” that hold the information along with the metadata (owner, dates, authors, etc.) that can be used for compliance purposes. The specification is not yet public, but I understand that Microsoft plans to make it available soon after some of the initial glitches are worked out in conjunction with Globanet and Actiance. When the specification is released, anyone will be able to use it to create import packages.
- The import packages are created by the partners and provided to the Office 365 Import Service, which processes the packages in much the same way as it imports PSTs. The data is uploaded into Azure Data Services and then read using a workflow process based on the Exchange Mailbox Replication Service (MRS) to end up in archive mailboxes. The new form of expandable archive mailbox allows any reasonable volume of data to be imported.
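Because the import package specification is not yet public, the real format is unknown. Purely to illustrate the idea behind the steps above, here is a minimal sketch of how a partner's harvesting tool might bundle captured items with their compliance metadata and group them by the mailbox they are destined for. Everything in it (the `HarvestedItem` fields, the `build_import_package` function, the JSON layout, and the sample addresses) is an assumption made for illustration and not Microsoft's format.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class HarvestedItem:
    """One item pulled from a third-party source (hypothetical shape)."""
    source: str        # where the item came from, e.g. "Twitter"
    owner: str         # mailbox owner the item should be archived under
    author: str        # who wrote the item
    created: datetime  # when the item was originally created
    body: str          # the captured content


def build_import_package(items):
    """Group harvested items by owner so each group targets one archive mailbox."""
    package = {}
    for item in items:
        record = asdict(item)
        # Store timestamps as ISO 8601 strings so the package can be written as JSON.
        record["created"] = item.created.isoformat()
        package.setdefault(item.owner, []).append(record)
    return package


# Hypothetical sample data: two social media posts and one IM conversation.
items = [
    HarvestedItem("Twitter", "pr@contoso.com", "pr@contoso.com",
                  datetime(2015, 7, 1, 9, 30, tzinfo=timezone.utc),
                  "Product launch today"),
    HarvestedItem("Facebook", "pr@contoso.com", "pr@contoso.com",
                  datetime(2015, 7, 1, 10, 0, tzinfo=timezone.utc),
                  "Launch event photos"),
    HarvestedItem("Cisco Jabber", "trader@contoso.com", "analyst@contoso.com",
                  datetime(2015, 7, 2, 14, 15, tzinfo=timezone.utc),
                  "Can we talk about the Q3 numbers?"),
]

package = build_import_package(items)
print(json.dumps(package, indent=2))
```

Grouping by owner mirrors the point that the data ends up in archive mailboxes, where the owner, author, and date metadata are what make the items useful for compliance searches. A real package would follow whatever layout Microsoft's specification defines once it is released.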
The net result is that all of the data gathered from the various sources ends up in archive mailboxes, where it becomes visible for compliance and eDiscovery purposes using the suite of tools available in Office 365. This helps to solve a compliance problem for large companies that consider the interactions occurring in places like Facebook and Twitter to be part of the corporate data store, to be managed in the same way as email, documents, and other communications.
This seems like a good deal from a partner perspective. They can sell customers the software necessary to harvest data and build the import packages whilst also providing paid-for services to manage the inevitable complexities that will crop up within large companies. And they don’t have to build large datacenters to store and manage the data because that’s the function of Office 365.
The Office 365 Import Service started off with PSTs, progressed to handle file shares and SharePoint, and is now heading in a direction where almost any form of corporate communications could be captured and ingested into Office 365. Not every Office 365 tenant will be interested in such a capability, but it removes another barrier that might stop large corporations from moving workloads to the cloud. It will be interesting to see how many other partners join the party after Microsoft releases the import package specification.