Proteus Blog | eDiscovery & Managed Review

Predictive Coding for Savvy Legal Teams

Written by Sarah Barth | Mar 21, 2024 2:12:49 PM

Litigators and eDiscovery practitioners often face the same document review challenge when preparing for litigation: too many documents and too little time to get through them. Linear review, that is, reviewing every document individually, is the most appropriate option in certain situations, but it can often be complemented by predictive coding. Used properly, predictive coding is a defensible way to save significant time and cost, preserving budget for merits counsel rather than for eDiscovery and document review partners.

A Quick Refresher: What is Predictive Coding?

Predictive coding is an umbrella term that primarily encompasses Technology Assisted Review (TAR) and Continuous Active Learning (CAL). These tools use machine learning to learn from reviewers' manual coding decisions and predict whether the remaining documents are likely to be responsive, privileged, etc. As reviewers code documents, the software analyzes those decisions and predicts which uncoded documents share similar traits.
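
To make the idea concrete, here is a minimal sketch in Python of how a model might learn from reviewer decisions and rank the remaining documents. It is an illustration only: the word-weighting approach, the function names (train, score, rank), and the sample documents are all assumptions made for this example, not how any particular review platform works.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace; real tools use richer text features."""
    return text.lower().split()

def train(coded_docs):
    """Learn a per-word weight from reviewer decisions (hypothetical helper).

    coded_docs: list of (text, is_responsive) pairs coded by human reviewers.
    """
    resp, nonresp = Counter(), Counter()
    for text, is_responsive in coded_docs:
        (resp if is_responsive else nonresp).update(set(tokenize(text)))
    n_resp = sum(1 for _, r in coded_docs if r)
    n_non = len(coded_docs) - n_resp
    vocab = set(resp) | set(nonresp)
    # Add-one smoothing keeps unseen word/class pairs from producing log(0).
    return {
        w: math.log((resp[w] + 1) / (n_resp + 2))
           - math.log((nonresp[w] + 1) / (n_non + 2))
        for w in vocab
    }

def score(weights, text):
    """Sum the learned weights of a document's words (higher = more likely responsive)."""
    return sum(weights.get(w, 0.0) for w in set(tokenize(text)))

def rank(weights, uncoded_docs):
    """Order uncoded documents from most to least likely responsive."""
    return sorted(uncoded_docs, key=lambda d: score(weights, d), reverse=True)

# Made-up reviewer decisions and uncoded documents, for illustration only.
coded = [
    ("pricing agreement with competitor", True),
    ("meeting to discuss pricing strategy", True),
    ("lunch menu for friday", False),
    ("office holiday party schedule", False),
]
uncoded = [
    "friday lunch order",
    "draft pricing agreement attached",
    "holiday schedule update",
]

weights = train(coded)
ranked = rank(weights, uncoded)
print(ranked[0])
```

In this toy example, the document about a pricing agreement rises to the top of the queue because it shares words with documents the reviewers already marked responsive. Production tools rely on far more sophisticated models and features, but the feedback loop is the same in spirit.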

Why Embrace this Technology?

Predictive coding is cost-efficient and saves considerable time when it is used correctly.

Benefits include:

  • Efficiency: Depending on your needs, various forms of predictive coding can prioritize the documents most likely to be responsive or serve up random documents to enable you to stabilize the model and produce quickly. Faster reviews translate to quicker insights for the legal team and lower costs for the client.

  • Consistency: Predictive coding is, at the end of the day, predictable: the choices it makes (based on human input) are consistent. When an error occurs (and errors do happen), that consistency makes it much easier to identify and correct, which is why validating results is so important.

  • Engagement: Predictive coding queues critical documents for the reviewer, whether you are searching for responsiveness or privilege. This keeps reviewers engaged vs. linear review, where reviewers might have to mark hundreds of irrelevant and spam emails as non-responsive.
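
On the validation point above, one common way to check results is to review a random sample of the documents the model left unreviewed and use it to estimate how many responsive documents were missed. The sketch below shows that arithmetic under stated assumptions: the function name estimate_recall and every number in the example are hypothetical, and it produces a simple point estimate without a confidence interval.

```python
def estimate_recall(found_responsive, unreviewed_count, sample_size, sample_responsive):
    """Estimate recall from a random sample of the unreviewed set (hypothetical helper).

    found_responsive: responsive documents identified during review.
    unreviewed_count: documents left unreviewed.
    sample_size: documents randomly sampled from the unreviewed set and reviewed.
    sample_responsive: how many of those sampled documents were responsive.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    # Share of the sampled, unreviewed documents that turned out to be responsive.
    elusion_rate = sample_responsive / sample_size
    # Project that rate onto the whole unreviewed set.
    estimated_missed = elusion_rate * unreviewed_count
    total = found_responsive + estimated_missed
    recall = found_responsive / total if total else 0.0
    return elusion_rate, recall

# Made-up numbers for illustration only.
elusion_rate, recall = estimate_recall(
    found_responsive=9_000,
    unreviewed_count=50_000,
    sample_size=1_500,
    sample_responsive=15,
)
print(f"Elusion rate: {elusion_rate:.1%}, estimated recall: {recall:.1%}")
```

What counts as an acceptable recall figure depends on the matter and on any agreement between the parties, so treat the output as an input to the conversation with your project management team, not a verdict.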

Implementing Predictive Coding

You wouldn’t hop out of your Toyota Camry and into a Formula One car for a 220-mph lap without a pit crew, and you shouldn’t leave manual review for predictive coding without appropriate assistance, guidance, or training either. Here are a few high-level considerations when thinking about adopting a predictive coding workflow:

  1. Predictive coding is primarily intended for large, complex matters. In smaller matters, the predictive model might not have time to stabilize, meaning linear review would be the preferred workflow. “Big” and “small” are subjective terms, so talk to your software or services partner if you’re not sure of the best route.

  2. You need a solid data set as well. PDFs with hundreds or thousands of pages are not appropriate for a predictive coding model, and neither are documents that contain only a few words (e.g., text messages).

  3. Understand the goal: are you trying to quickly determine a cutoff of what is and is not responsive so you can stop review and make a production? Or are you trying to find hot documents? Different goals require different workflows, so communicate your goals to the project management team accordingly.
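
If the goal is a cutoff, one simple heuristic is to watch how many responsive documents each new batch yields and consider stopping once that rate stays low. The sketch below is an assumption-laden illustration: the function should_stop, the three-batch window, and the 5% threshold are hypothetical choices, not a recommended standard, and any stopping decision should still be validated.

```python
def should_stop(batch_results, window=3, min_rate=0.05):
    """Return True once the last `window` batches all fall below `min_rate`.

    batch_results: list of (responsive_count, batch_size) per review batch, in order.
    window and min_rate are illustrative defaults, not recommended values.
    """
    if len(batch_results) < window:
        return False
    recent = batch_results[-window:]
    return all(
        size > 0 and responsive / size < min_rate
        for responsive, size in recent
    )

# Made-up batch results: responsive documents found per 100 reviewed.
batches = [(80, 100), (55, 100), (20, 100), (4, 100), (3, 100), (1, 100)]

print(should_stop(batches[:4]))  # recent batches still productive
print(should_stop(batches))      # last three batches all under 5%
```

A hunt for hot documents would call for a different workflow entirely, which is why the goal needs to be settled before the review starts.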


As Always, QC The Work

Predictive coding is a powerful tool, not a magic wand. Proper planning, workflows, and Quality Control (QC) are critical. A solid partner can help you prepare the data, select the optimal review workflow, and provide the oversight and judgment needed to maintain defensibility. Used properly, the technology can transform your eDiscovery approach, helping you reach the crucial evidence with efficiency, accuracy, and confidence.

For more info, see the eBook we just published, "AI in Document Review. Skeptical? We are Too."