I posted here a year ago laying out a detailed methodology for collection and preservation of the contents of a Gmail account in the static form of a standard Outlook PST.  Try as I might to make it foolproof, downloading Gmail using IMAP and Outlook is tricky.  Happily, since my post, the geniuses at Google introduced a truly simple, no-cost way to collect Gmail and other Google content for preservation and portability.  It sets a top-flight example for online service providers, and presages how we may use the speed, power and flexibility of Google search as a culling mechanism before exporting for e-discovery.

I’m excited about this because, like millions, I’ve depended on Google apps such as Gmail and Google Calendar for as long as they’ve been around.  As an expert witness, I collect and produce messages and attachments in response to subpoenas duces tecum.  Gmail made it easy to find responsive content, but hard to get that content out in forms that preserved utility and integrity.  In the past, I’d printed the items to searchable PDFs; but, printing to PDF is tedious and runs counter to my penchant for functional and complete forms.

Lately, I’ve taken to using the IMAP protocol to download Gmail to Outlook, creating .PST container files and processing these with e-discovery tools.  Getting a complete, compact PST is no picnic.  It can take days to grab all message headers, message bodies and attachments in a big collection, and the level of replication is appalling because, when they are downloaded, foldered (i.e., labeled) messages generate duplicate messages and attachments for each label applied.  The upshot is that anything in, say, the Inbox or Sent Mail folders also shows up in the All Mail group.  This is a convenience online; but, it radically increases the time and volume of a collection pulled out with IMAP.  Message threading is also a casualty when converting Gmail to Outlook content.
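To make the replication problem concrete, here is a minimal sketch of how the same message pulled from several IMAP folders might be collapsed to one copy while remembering every folder it came from.  The function name, the sample folder names and the sample messages are all my own illustrations, and keying on Message-ID is an assumption: that header is usually unique but is not guaranteed to be, so forensic work should confirm matches with a hash of the full message.

```python
import hashlib
from email import message_from_string


def dedupe_by_message_id(folder_messages):
    """Collapse (folder, raw_message) pairs to one entry per Message-ID.

    Hypothetical helper: returns {key: {"message": Message, "folders": [...]}}.
    """
    unique = {}
    for folder, raw in folder_messages:
        msg = message_from_string(raw)
        key = (msg.get("Message-ID") or "").strip()
        if not key:
            # No Message-ID header: fall back to a hash of the raw text.
            key = "sha256:" + hashlib.sha256(raw.encode("utf-8")).hexdigest()
        entry = unique.setdefault(key, {"message": msg, "folders": []})
        if folder not in entry["folders"]:
            entry["folders"].append(folder)
    return unique


# Made-up sample: one message seen in two folders, plus one sent item.
_DRAFT = "Message-ID: <a1@example.com>\nSubject: Contract draft\n\nBody\n"
_REPLY = "Message-ID: <b2@example.com>\nSubject: Reply\n\nBody\n"
SAMPLE_PULL = [
    ("INBOX", _DRAFT),
    ("[Gmail]/All Mail", _DRAFT),
    ("[Gmail]/Sent Mail", _REPLY),
]

if __name__ == "__main__":
    for key, entry in dedupe_by_message_id(SAMPLE_PULL).items():
        print(key, entry["folders"])
```

Keeping the folder list alongside each unique message preserves the labeling information that would otherwise be lost when the duplicates are discarded.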

Even if you’re a lawyer who couldn’t care less about IMAP, this is a development worth cheering because until now, you had two choices when it came to putting Gmail on legal hold: Either you’d instruct your client not to delete anything (and cross your fingers they’d comply) or you had to hire someone to download the data.  Now, Google does the Gmail collection gratis and puts it in a standard MBOX container format that can be downloaded and sequestered.  Google even incorporates custom metadata values that reflect labeling and threading.  You won’t see these unique metadata tags if you pull the messages into an e-mail client; but, e-discovery software will pick them up.  I tested this using Nuix and the $100 marvel, Prooffinder.  Both parsed the Gmail metadata handily, enabling the messages to be threaded and paired with their Gmail labels.
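For readers who want to peek at that metadata without an e-discovery platform, here is a rough sketch using Python’s standard mailbox module.  It assumes the labels and thread identifiers travel in headers named X-Gmail-Labels and X-GM-THRID, which is what I understand Google’s exports to use; check a message from your own archive before relying on those names.  The function names and the tiny sample archive are made up for illustration, and the simple comma split will mishandle any label that itself contains a comma.

```python
import mailbox
import os
import tempfile
from collections import defaultdict

LABEL_HEADER = "X-Gmail-Labels"  # assumed header name; confirm in your export
THREAD_HEADER = "X-GM-THRID"     # assumed header name; confirm in your export

# Tiny made-up archive so the sketch runs without a real export.
SAMPLE_MBOX = (
    "From 1@xxx Mon Jan  1 00:00:00 2024\n"
    "X-GM-THRID: 111\nX-Gmail-Labels: Inbox,Legal Hold\n"
    "Subject: Contract draft\n\nFirst message.\n\n"
    "From 2@xxx Mon Jan  1 00:05:00 2024\n"
    "X-GM-THRID: 111\nX-Gmail-Labels: Sent\n"
    "Subject: Re: Contract draft\n\nSecond message.\n\n"
    "From 3@xxx Mon Jan  1 00:10:00 2024\n"
    "X-GM-THRID: 222\nX-Gmail-Labels: Inbox\n"
    "Subject: Lunch\n\nThird message.\n"
)


def read_gmail_metadata(mbox_path):
    """Return one dict per message: subject, list of labels, thread id."""
    rows = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            raw_labels = msg.get(LABEL_HEADER, "")
            labels = [part.strip() for part in raw_labels.split(",") if part.strip()]
            rows.append({
                "subject": msg.get("Subject", ""),
                "labels": labels,
                "thread": msg.get(THREAD_HEADER, ""),
            })
    finally:
        box.close()
    return rows


def group_by_thread(rows):
    """Map each thread id to the subjects of its messages, in file order."""
    threads = defaultdict(list)
    for row in rows:
        threads[row["thread"]].append(row["subject"])
    return dict(threads)


def demo():
    """Write the sample archive to a temp file and parse it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.mbox")
        with open(path, "w", newline="\n") as handle:
            handle.write(SAMPLE_MBOX)
        rows = read_gmail_metadata(path)
    return rows, group_by_thread(rows)


if __name__ == "__main__":
    for row in demo()[0]:
        print(row)
```

To try it on a real export, call read_gmail_metadata with the path to the downloaded .mbox file in place of the sample.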

MBOX might not have been everyone’s choice for a Gmail container file; but, it’s inspired.  MBOX stores the messages in their original Internet message format called RFC 2822 (now RFC 5322).  Regular readers may recognize that I’ve been a vocal proponent of this format as a superior form for e-discovery preservation and production.  I had no hand in Google’s decision; but, it’s nice to have Google on my side!
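Because each message inside the container is already in Internet message format, splitting an MBOX into individual .eml files takes only a few lines.  What follows is a minimal sketch, not a forensic tool: the function name, file-naming scheme and sample archive are my own inventions, and body lines beginning with "From " may still carry the ">From " escaping that some MBOX writers apply.  Keep the original MBOX as the preserved copy and treat the split files as working copies.

```python
import mailbox
import os
import tempfile
from pathlib import Path

# Made-up two-message archive so the sketch is runnable as-is.
SAMPLE_MBOX = (
    "From 1@xxx Mon Jan  1 00:00:00 2024\n"
    "Subject: Hello\nFrom: a@example.com\n\nHi there.\n\n"
    "From 2@xxx Mon Jan  1 00:05:00 2024\n"
    "Subject: Goodbye\nFrom: b@example.com\n\nSee you.\n"
)


def mbox_to_eml(mbox_path, out_dir):
    """Write each message in the MBOX to its own .eml file; return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    box = mailbox.mbox(str(mbox_path), create=False)
    try:
        for index, msg in enumerate(box, start=1):
            target = out / f"message_{index:05d}.eml"
            # unixfrom=False drops the MBOX "From " separator line,
            # leaving only the message headers and body.
            target.write_bytes(msg.as_bytes(unixfrom=False))
            written.append(target)
    finally:
        box.close()
    return written


def demo():
    """Split the sample archive; return the file names and the first file's bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "sample.mbox")
        with open(source, "w", newline="\n") as handle:
            handle.write(SAMPLE_MBOX)
        paths = mbox_to_eml(source, os.path.join(tmp, "eml"))
        return [p.name for p in paths], paths[0].read_bytes()


if __name__ == "__main__":
    names, first = demo()
    print(names)
    print(first.decode("ascii"))
```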

So, let me introduce you to Google Data Tools.

The only hard part of archiving Gmail is navigating to the right page.  You get there from the Google Account Settings page by selecting “Data Tools” and looking for the “Download your Data” option on the lower right.  When you click on “Create New Archive,” you’ll see a menu like that below where you choose whether to download all mail or just items bearing the labels you select.

The ability to label content within Gmail and archive only messages bearing those labels means that Gmail’s powerful search capabilities can be used to identify and label potentially responsive messages, obviating the need to archive everything.  It’s not a workflow suited to every case; yet, it’s a promising capability for keeping costs down in the majority of cases involving just a handful of custodians with Gmail.

A lot of discoverable data is moving to Google–to Gmail, Drive, Calendar, YouTube–you name it.  Kudos to Google for turning a task that’s been hard into something so simple anyone can do it well.  And it costs nothing at all.  What more can I say?  Thank you, Google!