The eDiscovery Definition of Complete Processing is … Complete Processing

The Real Issue – Your Software and Workflow Must Get all of your Data.

Along with the rest of the Ediscovery Twitter and Blog worlds last week, I saw the blog post, the denial, the retraction, and the apology surrounding an attempted take on a recent ruling by Judge Shira Scheindlin.

I’m referring to Judge Scheindlin’s ruling that chastised an agency of the US Government for not producing various types of ESI correctly pursuant to a FOIA request.

The issue here is not the “wrong” post, the retraction or the apology. The issue was, is and always will be the defensibility of the process.

  1. The completeness of the processing affects the accuracy of the data.
  2. The accuracy of the data affects the quality of the review.
  3. The quality of the review can affect the outcome of the case.

It’s a domino effect, as I commented in my blog about the recent NY Times article.

While numerous court decisions have sanctioned parties for preservation failures, only recently has the Judiciary begun taking litigants to task for failing to properly PROCESS the information. This ruling emphasizes how important it is for end users to understand the limitations of the products they are using, as well as their legal obligations to produce the requested data in the correct formats.

Unfortunately, the key issue of that ruling (i.e., the failure to correctly process all of the requested ESI) was drowned out by a misleading headline and by the attribution to the Judge of statements criticizing the processing software used by the Government.

The case, “National Day Laborer Organizing Network v. US Immigration and Customs Enforcement” 2011 WL 381625 (SD NY Feb. 7, 2011) (“ICE”), calls attention to the potentially serious problems that can arise when end users do not fully understand the data processing systems they are using. As a result, the information produced in this case did not meet the professional standards expected by the Judiciary.

Judge Scheindlin took ICE to task for not producing specific types of metadata and for not following other basic production formats (e.g., maintaining parent-child relationships of emails). ICE responded that it could not perform these functions and that complying with the Court’s ruling would require enormous additional resources, at a cost that could not be estimated in advance.

I’ve had similar conversations with clients who said that their vendor never advised them that not all attachments, PDFs and image-only files were actually searchable. Well into one particular case, a client lost confidence in the reliability of their vendor and turned to us. We reprocessed their data, and the client was taken by surprise when 80,000 attachments turned up in the email collection and had to be re-reviewed. On a positive note, at least they were located before production was completed!

Another client stated “It was brought to our attention after several conference calls with the litigation team that one of the popular E-discovery & ECA tools used for this high profile case does not extract and generate full text conversion of potentially relevant images files that were embeddings and/or attachments to emails. It was a deeply embarrassing experience for our department to recommend this solution and potential harm that this caused to our clients’ case”. Luckily, the issues were addressed and corrected without judicial involvement.

Here, the resulting pain for the defendants was caused by a failure to adequately perform their due diligence. It is critical before adopting a processing software solution to assess whether it is capable of performing the task in a complete and defensible manner.

We advise potential clients to explore in detail the following issues before making such a decision:

Make sure the software’s document extraction process allows them to meet the requirements of their case. As seen in the ICE case, although the software extracted all of the information and made it searchable, the data was treated as one big record, so ICE could not break it apart to search email parent-child relationships or create attachment ranges. As a result, these records would need to be re-processed in order to properly review, redact and produce the correct information. To do that, ICE would have had to export the data to an external system and process it further (at additional cost).
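To make the parent-child point concrete, here is a minimal sketch of what "breaking the record apart" means. It is only an illustration, not the method used by ICE's software or by any particular eDiscovery product. It uses Python's standard `email` package to treat an email as a parent record and each attachment as a separate child record, then stamps every member of the family with the same attachment range. The `Record` fields, the `DOC-000001` numbering scheme, and the sample message are all assumptions made up for this example.

```python
from dataclasses import dataclass
from email import message_from_bytes, policy
from email.message import EmailMessage
from typing import List, Optional


@dataclass
class Record:
    doc_id: str                # hypothetical control number, e.g. "DOC-000001"
    parent_id: Optional[str]   # None for the parent email itself
    name: str                  # subject line (parent) or file name (child)
    attach_begin: str          # first control number in the family
    attach_end: str            # last control number in the family


def split_family(raw: bytes, start: int, prefix: str = "DOC") -> List[Record]:
    """Return one parent record plus one child record per attachment."""
    msg = message_from_bytes(raw, policy=policy.default)
    attachments = list(msg.iter_attachments())

    # One control number for the parent, then one per attachment.
    ids = [f"{prefix}-{start + i:06d}" for i in range(len(attachments) + 1)]
    begin, end = ids[0], ids[-1]

    subject = str(msg["Subject"] or "(no subject)")
    records = [Record(ids[0], None, subject, begin, end)]
    for doc_id, part in zip(ids[1:], attachments):
        name = part.get_filename() or "(unnamed)"
        records.append(Record(doc_id, ids[0], name, begin, end))
    return records


def build_sample() -> bytes:
    """Build a small made-up email with two attachments for the demo."""
    msg = EmailMessage()
    msg["Subject"] = "Q3 report"
    msg["From"] = "a@example.com"
    msg["To"] = "b@example.com"
    msg.set_content("See attached.")
    msg.add_attachment(b"%PDF-1.4 stub", maintype="application",
                       subtype="pdf", filename="report.pdf")
    msg.add_attachment("a,b\n1,2\n", subtype="csv", filename="data.csv")
    return msg.as_bytes()


if __name__ == "__main__":
    for rec in split_family(build_sample(), start=1):
        print(rec)
```

Once each attachment is its own record with a pointer back to its parent, a reviewer can tag, redact or withhold a single family member, and the attachment range can be delivered as an ordinary load-file field. A system that stores the whole family as one record has nothing to hang those fields on.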

Another critical issue is the handling of embedded objects, which are increasingly common in email and MS Office documents (Word, PowerPoint, Excel, and so on). Other document types, such as PDFs and OpenOffice files, also commonly employ them. In a recent EDiscoveryJournal.com article, Greg Buckles states “We have always known that searching for text within spreadsheets is problematic, but this new generation of compound Office 2007 XML documents and email formats have dramatically expanded the potential for false negative search results.” See “Proximity Search Challenges in eDiscovery”, EdiscoveryJournal.com, January 10, 2011 (EDJ.com). The methods used to handle embedded objects are crucial because they can affect traditional searching and conceptual searching in unexpected ways. “In-line” or “in-context” text extraction allows the extracted text to match what you visually see in the original document. The embedded object is also extracted as a “Child-Document” where it is 100% searchable, so nothing is lost or missed. This approach provides the user with highly accurate Proximity and Conceptual search results, virtually eliminating false positives. Very few eDiscovery software packages or vendors process data this way.
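For readers who want to see why Office 2007 XML documents hide so much, the sketch below may help. These files are ZIP packages, and embedded objects and pictures conventionally sit in their own folders inside the package (for example `word/embeddings/` and `word/media/`). The snippet simply lists those parts so they can be pulled out as child documents. It is an assumption-laden illustration and not how any specific processing tool works: the folder list reflects common convention rather than a guarantee, the function names are hypothetical, and the sample package is a made-up stand-in, not a valid Word file.

```python
import io
import zipfile
from typing import List

# Folders where Office Open XML packages conventionally keep embedded
# objects and images. Assumption: a thorough tool would follow the
# package's relationship files instead of relying on folder names.
CHILD_DIRS = (
    "word/embeddings/", "word/media/",
    "xl/embeddings/", "xl/media/",
    "ppt/embeddings/", "ppt/media/",
)


def list_embedded_parts(package: bytes) -> List[str]:
    """Return the names of parts that should become child documents."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return sorted(
            name for name in zf.namelist()
            if name.startswith(CHILD_DIRS) and not name.endswith("/")
        )


def build_sample_package() -> bytes:
    """Build a tiny stand-in package (not a valid .docx) for the demo."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", "<w:document/>")
        zf.writestr("word/embeddings/Microsoft_Excel_Worksheet1.xlsx", b"stub")
        zf.writestr("word/media/image1.png", b"stub")
    return buf.getvalue()


if __name__ == "__main__":
    for part in list_embedded_parts(build_sample_package()):
        print(part)
```

In this sketch the embedded spreadsheet would still need its own text extraction, and the image would need OCR, before either is searchable. If a tool indexes only the main document part, the text inside both children is simply absent from the index.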

A second major limitation that ICE encountered was its inability to produce a “highlighted” or “red-redacted” copy of the production. In a Declaration filed with the government’s Motion to Stay the Court’s ruling until an Appeal could be filed, the Director of the ICE FOIA office stated that the Agency would have to redact the text twice, incurring significant additional time and expense.

Because it could not create separate records for attachments and embedded objects, ICE could not deliver back only the responsive information. Instead, it would be forced to manually redact the information that was not producible, at great expense in both time and money. Additionally, it could not fulfill the court-mandated requirement of delivering attachment ranges without reprocessing this data in a separate system.

With a “black box approach”, all of these technical and security issues are owned by the end user rather than by a professional data services company. The end user is limited to the capabilities of the black box, with no direct support to resolve specific problems quickly. Subsequent issues such as those described above will likely be disclaimed by the software manufacturer, leaving the end user with the costs, aggravation and potential legal consequences (e.g., sanctions, privilege waiver) resulting from these problems. Would it not be more prudent to entrust these issues to companies that are better equipped to professionally manage these solutions?

Black boxes can become very expensive very quickly, and a seemingly low entry price can turn into a “high priced” solution. Sizing and scaling a system in advance is very difficult, and getting it wrong can be costly. The best pricing is usually the “introductory” pricing, and vendors will frequently charge “premium pricing” for add-on services and hardware. Additionally, added services such as OCR, decryption and redaction can have major impacts on timelines as well as costs.

As an attorney or litigation support specialist, you are not expected to have all of the answers. Performing due diligence on your vendor, their people and products is very important – since they are the experts who must support you. However, you are responsible for the completeness and integrity of your discovery. In Federal court, attorneys are attesting pursuant to FRCP 26(g) that they made a reasonable inquiry to obtain the information and that the information is true and correct. Can such a statement be made without risk if counsel is on notice that the data processing software utilized may not be appropriate for the task?
