In my last post, I addressed why search terms used to cull data sets in discovery should not be protected as attorney work product.  Today, I want to distinguish an attorney’s “investigative queries” (for case assessment, to hone searches or to identify privileged content) from “culling queries” (to generate data sets meeting a legal obligation, whether conceived by an attorney, client, vendor or expert).  I contend culling queries warrant no work product protection from disclosure.

Let’s assume a producing party has a sizable collection of potentially responsive electronic information.  The producing party concludes that it would be too costly, slow or unreliable to segregate the ESI by reading everything and, instead, decides to examine just those items that contain particular words or phrases.  Keyword queries thus serve to divide the ESI into two piles: one that will be reviewed by counsel and another that no one and nothing will qualitatively review.  The latter is the “discard pile.”  Culling queries may be applied iteratively, first to collect data from the enterprise and later to cull the collection for review.  The reductive process may entail the successive use of a client’s local and enterprise search capabilities and/or a law firm’s or vendor’s search tools.

The common thread is that each lexical search mechanism serves to exclude ESI lacking certain terms from substantive review.  No one ever assesses the discards for relevance or responsiveness.
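For readers who like to see the mechanics, here is a minimal sketch of a keyword cull.  It is an illustration only: the function name `cull`, the sample documents and the whole-word, case-insensitive matching rule are my assumptions, and real search tools differ in how they tokenize, stem and parse queries.

```python
import re

def cull(documents, terms):
    """Split documents into a review pile and a discard pile by keyword hit."""
    # Assumed matching rule: whole-word, case-insensitive match on any term.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    review, discard = [], []
    for doc_id, text in documents.items():
        # A hit sends the item to review; a miss sends it to the discard pile,
        # where no one ever assesses it for relevance or responsiveness.
        (review if pattern.search(text) else discard).append(doc_id)
    return review, discard

# Hypothetical three-document collection.
documents = {
    "doc1": "The car skidded on the wet road.",
    "doc2": "Her Taurus would not start that morning.",
    "doc3": "Lunch order for the third floor.",
}

review, discard = cull(documents, ["car"])
print("review:", review)    # review: ['doc1']
print("discard:", discard)  # discard: ['doc2', 'doc3']
```

Note that doc2 lands in the discard pile though it plainly concerns a car.  Nothing in the output tells anyone that happened.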

Now, if we could be confident that keyword culling worked reasonably well and that the persons who came up with keywords were lexical magicians, there’d be no need to worry over the discard pile.  We could trust that what we don’t know doesn’t hurt us.

But we do know that a hefty slug of responsive items ends up in that discard pile.  We know this because studies and experience have established that keyword search is a crude, mechanical filter.  It leaves most of what we seek behind.

Whether we are leaving behind an endurable or unendurable volume of responsive items depends on just how poorly those keywords performed.  To gauge that, we’ve got to know what queries were run.

When we demand disclosure of queries used to collect or cull, our opponents say, “I won’t tell you because if you know the words I looked for, you’ll know how I think.  You’ll know my mental impressions, my work product.  Plus, if my client suggested keywords to me, you’ll know the contents of privileged attorney-client communications.”

Opposing counsel may also argue that, while it’s important to know just how well or poorly keywords performed, “That’s not something anyone but I need to know.”  It’s somehow work product.  Opponents are just supposed to trust one another to make sure the keywords were sufficient and that no more than an endurable volume of responsive material will go unscrutinized and unproduced.  The customary justification for this is, “that’s how we did it in the good, ol’ paper days.”

Not so.  In the “old days,” we did not exclude great swaths of responsive data from review by haphazard guesses about lexical content.  We managed paper records, and that management afforded us greater confidence that things would be found where they were kept.  It was a very different world.

Now, let’s put the shoe on the other foot.  Rewind to 30 years ago, and the producing party invites you to their document repository and says, “Put a Post-It® on what you want us to copy.”  Could you reply, “No, that reveals my mental impressions?”  Fast forward to 2013 and the producing party insists that you propose search terms.  Can you say, “Okay, but I don’t have to reveal them to you.  You must run my searches without knowing what they are because if you know my preferred search terms, you know what I think is important?”

See how the ‘keywords as work product’ notion falls apart like a two-bit suitcase in the rain?

In the fog of war, we must not forget that the law favors disclosure.  A core principle of Anglo-American jurisprudence is that the public has a right to every man’s evidence.  Privileges, explained the great evidence scholar John Henry Wigmore, are “distinctly exceptional.” And the U.S. Supreme Court put it masterfully in United States v. Nixon: “[E]xceptions to the demand for every man’s evidence are not lightly created nor expansively construed, for they are in derogation of the search for truth.”

Contrary to what some lawyers presume, privileges are to be narrowly construed, and there must be a compelling public good demonstrated to justify concealing the search terms run through machines to cull ESI.  Moreover, the burden of proof falls squarely on the proponent of privilege.  So it’s not for the requesting party to show a particularized need to know—it’s for the responding party to show a compelling justification to suppress.

It’s long been clear that existing documents do not become protected as work product by virtue of being examined by counsel.  “An attorney may not bring a document within the scope of the work product rule simply by reviewing it if it was not originally prepared in anticipation of litigation.”  Brown v. Hart, Schaffner & Marx, 96 F.R.D. 64, 68 (N.D. Ill. 1982).  If the entire document is not protected, how can a few search terms it contains be protected?

Moreover, disclosure of search terms need not reveal who chose the terms or why. An attorney can protect his or her mental impressions by simply keeping his or her own counsel as to whether the terms sprang from the attorney’s noggin or (my preferred method) descended from Heaven on wings of white doves.  Absent voluntary disclosure of origins, the queries are just a bag of words.

Finally, work product cannot be interposed to protect from disclosure the underlying facts. What queries were run to exclude potentially responsive ESI from review is an inquiry into underlying facts.  Why those queries were selected is a far different question, and one that may be out of bounds.

This is where a crucial distinction should be made between keyword searches executed by counsel for case assessment, what I’ve dubbed “investigative queries,” in contrast to keyword culling for collection, review or production, which I’ve called “culling queries.”

If counsel runs keyword searches against a collection for the purpose of better understanding the case, formulating strategies or refining searches, it seems to me that counsel can (and should) do so without obligation to disclose the details of that effort absent exceptional circumstances (such as when counsel’s competency, honesty or diligence is in issue).  Investigative queries are benign when they do not operate to expand or contract the corpus of the collection subject to review.

But when queries are used to collect, filter and cull the collection to exclude information from collection, review or production, such culling queries should be scrutinized and freely discoverable, warranting no privilege or work product protection.  Otherwise, undisclosed queries may be so insufficient, whether by design or innocent flaw, that they operate to exclude responsive information from all further inspection, i.e., they operate in derogation of the search for truth.

No keyword cull is perfect.  None is even close to perfect.  And “perfect” is not the standard.  But when an opponent hands you a little pile of data and says, “That’s all there is,” shouldn’t you be able to ask, “Did you just search for car or did you search for car, auto, automobile, vehicle, sedan, Ford, Taurus and Tawrus?”
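To make the car question concrete, here is a rough sketch comparing the two term lists against a tiny, made-up collection.  The helper `hits`, the four sample documents and the whole-word matching rule are all hypothetical; the point is only that you cannot run this comparison without knowing which list the other side used.

```python
import re

def hits(documents, terms):
    """Return the IDs of documents containing any term (whole word, any case)."""
    pattern = re.compile(
        r"\b(?:" + "|".join(map(re.escape, terms)) + r")\b",
        re.IGNORECASE,
    )
    return {doc_id for doc_id, text in documents.items() if pattern.search(text)}

# Hypothetical collection; "Tawrus" is a deliberate misspelling.
documents = {
    "a": "He left the car at the dealership.",
    "b": "The Ford was towed on Friday.",
    "c": "Claim notes: insured's Tawrus was totaled.",
    "d": "Reminder: staff meeting moved to 3 pm.",
}

narrow = ["car"]
broad = ["car", "auto", "automobile", "vehicle", "sedan", "Ford", "Taurus", "Tawrus"]

# Items the broader list would have sent to review but the narrow list discards.
missed_by_narrow = sorted(hits(documents, broad) - hits(documents, narrow))
print(missed_by_narrow)  # ['b', 'c']
```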

The litmus test for me is this: Are search terms deployed so as to exclude potentially responsive ESI from review and production?  If so, the search terms are not protected as work product and may be discovered.  If the search terms were purely investigative and did not serve to cull or exclude ESI from a collection, those search terms may be protected from disclosure if they can be shown to be attorney work product or otherwise privileged.

This sort of dichotomy is supported by the leading cases addressing whether an attorney’s selection of documents is protected from discovery as work product. See, e.g., In re San Juan Dupont Plaza Hotel Fire Litigation, 859 F.2d 1007 (1st Cir. 1988) and Sporck v. Peil, 759 F.2d 312 (3d Cir. 1985), cert. denied, 474 U.S. 903 (1985). Courts generally deny disclosure of which documents were chosen by counsel in the assembly of documents for a client’s review but decline to protect selections of documents for review by others.  That is, you can discover the selections, though you may not be able to establish that counsel made the selections. “Like requiring pleadings, answers to contention interrogatories, pretrial exhibit and witness lists, and trial memoranda, the district court’s [order requiring disclosure of the documents that may be used at deposition] merely adjusts the timing of disclosure. The situation is not remotely analogous to the situation where a party seeks an attorney’s personal notes and memoranda which contain his confidential assessments of the testimony of prospective witnesses.”  In re San Juan Dupont Plaza Hotel Fire Litigation, at 1017.

Much of the fight about keywords and work product strikes me as more about lawyer hubris than substance.  While requesting parties certainly care about the precision of searches lest they get data dumps, requesting parties tend to care more that the search terms prompt high recall of responsive documents. Accordingly, requesting parties want disclosure of search terms to ensure that responding parties have not failed to include a term or variant they think likely to strike gold.  The responding party who says, “I’ll disclose the queries we used, but there are a couple I need to keep to myself” is likely to meet with a shrug and a nod from requesting counsel because the savviest requesters care more about the searches the other side overlooked than the ones piled on.

I’m trying to get to the right formulation here, so don’t hesitate to draw my attention to any cases on point or otherwise mix it up.  I’m all ears, and I won’t bite. Leastwise, not hard enough to break the skin.