I’ve been thinking about how we implement technology-assisted review tools and particularly how to hang onto the on-again/off-again benefits of keyword search while steering clear of its ugliness. The rusty flivver that is my brain got a kickstart from many insightful comments made at the recent CVEDR e-discovery retreat in Monterey, California. As is often the case when the subject is technology-assisted review (by whatever name you prefer, dear reader: predictive coding, CAR, automated document classification, Francis), some of those kicks came from lawyer Maura Grossman and computer scientist Gordon Cormack. So, if you like where I go with this post, credit them. If not, blame me for misunderstanding.
Maura and Gordon are the power couple of predictive coding, thanks to their thoughtful papers and presentations transmogrifying the metrics of NIST TREC into coherent observations concerning the efficacy of automated document classification. While they're busy spinning straw into gold, I'm still studying it all; but from where I stand, they make a lot of sense.
Maura expressed the view that technology-assisted review tools shouldn’t be run against subset collections culled by keywords but should be turned loose on the larger collection of ESI (i.e., the collection/sources against which keyword search might ordinarily have been deployed). The gist was, ‘use the tools against as much information as possible, and don’t hamstring the effort by putting old tools out in front of new ones.’ [I'm not quoting here, but relating what I gleaned from the comment.]
At the same Monterey conference, Judge Andrew Peck reminded us of the perils of GIGO (Garbage In : Garbage Out) when computers are mismanaged. The devil is very much in the details of any search effort, but never more so than when one deploys predictive coding in e-discovery. Methodology matters.
If technology-assisted review were the automobile, we’d still be at the stage where drivers asked, “Where do I hook up my mules?” Our “mules” are keyword search.
When you position keyword search in front of predictive coding, that is, when you use keyword search to create the collection that predictive coding “sees,” the view doesn’t change much from the old ways. You’re still looking at the ass end of a mule. Breathe deep the funky fragrance of keyword search. Put axiomatically, no search technology can find a responsive document that’s not in the collection searched, and keyword search leaves most of the responsive documents out of the collection.
Keyword search can be very precise, but at the expense of recall. It can achieve splendid recall scores, but with abysmal precision. How, then, do we avail ourselves of the sometimes laser-like precision of keyword search without those awful recall in-laws coming to visit? Time and again, research proves that keyword search performs far less effectively than we hope or expect. It misses 30-80% of the truly responsive documents and sucks in scads of non-responsive junk, hiding what it finds in a blizzard of blather.
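To make the tradeoff concrete, here is a toy sketch of the arithmetic behind precision and recall. The numbers are hypothetical, chosen only to fall within the research range cited above; they don’t come from any particular study.

```python
# Hypothetical search results, for illustration only.
hits = 10_000             # documents the keyword search returned
responsive_hits = 2_000   # hits that turn out to be responsive
responsive_total = 5_000  # responsive documents in the whole collection

# Precision: what fraction of what you retrieved is actually responsive?
precision = responsive_hits / hits             # 0.20, i.e., 80% junk

# Recall: what fraction of all responsive documents did you retrieve?
recall = responsive_hits / responsive_total    # 0.40, i.e., 60% missed

print(f"precision={precision:.0%}, recall={recall:.0%}, "
      f"missed={1 - recall:.0%}")
```

In this made-up scenario, four of every five retrieved documents are junk, and three of every five responsive documents are never seen at all; tightening the query to improve the first number typically worsens the second, which is the in-law problem in a nutshell.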
To be clear, that’s an established metric based on everyone else in the world. It doesn’t apply to YOU. YOU have the unique ability to frame fantastically precise and effective keyword searches like no one else. Likewise, all the findings about the laughably poor performance of human reviewers apply only to other reviewers, not to YOU. Tragically, not everyone has the immense good sense to employ YOU; so, let’s take YOU and what YOU can do out of the equation until human cloning is commonplace, okay?
For all their shortcomings, mules are handy. When your Model-T gets stuck in the mud, a mule team can pull you out. Likewise, keyword search is a useful tool to pull us out of the sampling swamp and generate training sets. Using keywords, you’re more likely to rapidly identify some responsive documents than using random sampling alone. These, in turn, increase the likelihood that predictive coding tools will find other responsive documents in the broader collection of ESI sources. Good stuff in : good stuff out.
With that in mind, I made the following slide to depict how I think keyword search should be incorporated into TAR and how it shouldn’t. (George Socha is so much better at this sort of thing, so forgive my crude effort). This is a work-in-progress, and I’d very much like your comments on the merits. My mind is still open on all of this, especially to remarks by those who can offer evidence to make their case.
I hope you’ll agree that the interposition of keyword search to cull the collection before it’s exposed to an automated document classification tool is wrong. But, in fairness, doing it the right way could come at a cost depending upon how you approach the assembly and processing of potentially responsive ESI. If you have to pay significantly more to let the tool “see” significantly more data, then quality will be sacrificed on the altar of savings. How it shakes out in your case hinges on how you handle keyword search and what you’re charged for ingestion and hosting. Currently, many use keyword search via entirely separate tools and workflows to reduce the volume of information collected, processed and hosted. Garbage In.
Another caution I think important in using keywords to train automated classification tools is the need to elevate precision over recall in framing searches, to ensure that you don’t end up training your predictive classification tool to replicate the shortcomings of keyword search. If only 20% of the documents returned by keyword search are responsive, then you don’t want to train the tool to find more documents like the 80% that are junk. So when, in the illustration above, I depict keyword search as a means to train technology-assisted review tools, please don’t read the line leading from keyword search to TAR as suggesting the usual guesswork approach, where you’d simply dump keyword results into the tool. That’s like routing the exhaust pipe into the passenger compartment. The searches required need to be narrow, precise, surgical. They must jettison recall to secure precision…and may even benefit from a soupçon of human review.
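The workflow described above, narrow searches plus a soupçon of human review before anything reaches the classifier, can be sketched in a few lines. This is a conceptual illustration only; the function names, documents, and the reviewer callback are all hypothetical, not any vendor’s API.

```python
# Sketch: turn keyword hits into a high-precision training set.
# Only reviewer-confirmed documents become positive training examples,
# so the classifier learns from good exemplars, not keyword noise.

def build_training_seeds(keyword_hits, reviewer_confirms):
    """Keep only the hits a human reviewer confirms as responsive."""
    seeds = []
    for doc in keyword_hits:
        if reviewer_confirms(doc):  # the soupçon of human review
            seeds.append((doc, "responsive"))
    return seeds

# Usage with stand-in data: three keyword hits, two confirmed.
hits = ["contract_v2.docx", "lunch_plans.msg", "merger_memo.pdf"]
confirmed = {"contract_v2.docx", "merger_memo.pdf"}
seeds = build_training_seeds(hits, lambda d: d in confirmed)
# seeds now holds only the reviewer-vetted documents for training
```

The design point is the filter between the search and the tool: recall doesn’t matter at this stage because the classifier will hunt the broader collection for more documents like the seeds; what matters is that every seed is genuinely responsive.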
For the promise of predictive coding to be fulfilled, workflows and pricing must better balance the quality vs. cost equation. Yes, a technology that is less costly when introduced at nearly any stage of the review process is great and arguably superior only by being no worse than alternatives. But if that is all we seek when quality is also within easy reach, we do a disservice to justice. The societal and psychic benefits of a more trusted and accurate outcome to disputes cannot be overvalued. “Perfect” is not the standard, but neither is “screw it.”