From Data to Decision: 5 Questions with Mike DeCesaris on the Role of Large Language Models in Driving Legal Innovation

In this interview, Mike DeCesaris discusses how the Data Science Center has led the tactical adoption of LLMs for casework at Cornerstone Research and built the infrastructure to use these tools safely and securely.

Cornerstone Research’s Data Science Center is a proven industry leader in delivering efficient and secure large-scale data analytics to clients. This innovation hub has made noteworthy and consequential strides in using large language models (LLMs) to support the firm’s casework. These capabilities have enabled our firm to analyze vast quantities of information on a scale that, a few years ago, would have been nearly impossible.

Mike DeCesaris founded the firm’s Data Science Center in 2014 to meet client needs for efficient, secure big data solutions. Over the past decade, our interdisciplinary team of data scientists and data engineers has grown substantially. The team has developed bespoke, right-sized analytic approaches to complex big data, text, geospatial information, social media, blockchain, and software/code matters, and also supports the analytical needs of clients in data-intensive industries such as high-frequency financial markets, e-commerce, and healthcare. Mr. DeCesaris discusses this work in the interview below.

Given your more than 20 years of experience with data science uses in litigation, how are you seeing LLMs, a relatively new technology, benefiting the work of Cornerstone Research?

The Data Science Center team has already spearheaded LLM-assisted analyses on dozens of litigation cases. At the same time, the team trains our consulting professionals in these tools, creating a growing group of power users. As those professionals become skilled in deploying LLM solutions, incorporating them into casework becomes ever more seamless and scalable. We also work closely with the firm’s Applied Research Center to consider how LLMs can support the deep technical expertise they bring to cases, enhancing efficiency and effectiveness.

As our consulting professionals become power users of LLM tools, integrating them into casework becomes increasingly seamless—enabling us to tackle more complex research and reasoning tasks.

We are now working to expand LLM use across the firm while developing capabilities for a growing set of use cases, including increasingly complex research and reasoning tasks.

With the center’s experience in developing bespoke analytic approaches for complex data sets, how has the adoption of LLMs affected the firm’s ability to analyze vast quantities of information at scale, a capability that would have been infeasible just a few years ago?

Our data scientists were quick to realize the transformative impact of LLMs and have been conducting large-scale experiments since early 2023. The firm has since iterated rapidly and adapted in parallel with the advancing technology. The Data Science Center is intensely focused on exploring and developing practical applications that demonstrably improve the quality of the firm’s client work by making possible analyses that would be prohibitively costly with manual work alone.

Off-the-shelf LLM tools are powerful but often fall short of our clients’ needs. To ensure reproducibility and defensibility, we built the infrastructure to deploy cutting-edge, open-weight models on premises.

The team’s initial experiments made clear that although off-the-shelf tools are powerful, they are insufficient for our clients’ needs in many cases. Reproducibility of results is essential, but challenging to achieve when relying on a model over which we lack full visibility and control. To understand how an LLM performs—and to support our experts in confidently defending its output—we needed to develop the infrastructure to deploy cutting-edge, open-weight models on premises.
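As a rough illustration of what this looks like in practice (the model name, revision, and helper function below are placeholders rather than a description of our actual stack), an open-weight model can be loaded onto local GPUs with a pinned version and queried with greedy decoding so that the same inputs yield the same outputs:

```python
# Minimal sketch: load an open-weight model onto on-premises GPUs with a pinned
# version, and generate deterministically (greedy decoding, no sampling).
# The model name and revision are illustrative placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "Qwen/Qwen2.5-7B-Instruct"  # hypothetical open-weight model choice
REVISION = "main"                        # in practice, pin a specific commit hash

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, revision=REVISION)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    revision=REVISION,
    device_map="auto",  # weights stay on local GPUs; documents never leave the firm
)

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Run one prompt through the local model and return only the newly generated text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Pinning the model version and disabling sampling are what allow an analysis to be rerun later and produce the same results, which is the property our experts need when defending output in court.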

Can you give an example of how the Data Science Center has used LLMs to enhance casework, such as in a specific case or project where the team was able to provide more value to clients using these new tools?

Cornerstone Research’s substantial investment in LLM technology is already paying material dividends in our work for clients, principally in relation to document analysis and content analysis. We have vast experience with document review analyses that pair sophisticated keyword searching and text analytics with human review. We were therefore eager to investigate how LLMs could improve the quality and comprehensiveness of our traditional processes.

In one instructive comparison, we tested our traditional baseline workflow against an LLM-assisted alternative. The corpus analyzed comprised close to a hundred thousand publicly filed corporate proxy statements. The LLM-assisted approach performed comparably to human review alone in identifying “true positive” hits, correctly identifying more than 95% of the examples found via the traditional approach. Compared to traditional keyword search, the LLM-assisted approach was substantially more effective at filtering out “false positive” hits, reducing the pool of documents requiring manual review by 85%. This enabled the team to broaden their search terms and generate an initial document set that would normally exceed what human review alone could feasibly cover. The result was a more than 67% increase in the number of relevant examples found.
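As a simplified sketch of how such a filtering pass can work in principle (the prompt wording, relevance question, and generate helper are illustrative assumptions, not our production workflow), each keyword hit is put to the model as a yes/no relevance question, and only the documents flagged as relevant move on to human review:

```python
# Illustrative only: filter a broad set of keyword-search hits with a yes/no
# relevance question, passing only likely true positives to human reviewers.

PROMPT_TEMPLATE = (
    "You are reviewing an excerpt from a corporate proxy statement.\n"
    "Question: does this excerpt discuss {topic}? Answer YES or NO.\n\n"
    "Excerpt:\n{text}\n\nAnswer:"
)

def filter_keyword_hits(hits, topic, generate):
    """Return the IDs of keyword hits the model judges relevant.

    hits     -- list of (doc_id, text) pairs from a deliberately broad keyword search
    topic    -- plain-language description of what counts as relevant
    generate -- callable that sends a prompt to the locally hosted model (see sketch above)
    """
    relevant = []
    for doc_id, text in hits:
        prompt = PROMPT_TEMPLATE.format(topic=topic, text=text[:4000])  # truncate long excerpts
        answer = generate(prompt).strip().upper()
        if answer.startswith("YES"):
            relevant.append(doc_id)  # forwarded to human review
        # "NO" answers are treated as likely false positives and filtered out
    return relevant
```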

The LLM-assisted approach reduced false positives by 85% and increased relevant findings by over 67%—demonstrating clear advantages over traditional keyword search.

A second case study assessed LLM capabilities for comprehensive document review and coding, comparing the results against the original team’s manual review. The LLM’s accuracy ranged from 82% to 96%. Although an iterative manual review process remains essential for refining questions and ensuring quality, these findings indicate that an LLM pass after the initial manual review improves the consistency of the results.

What are some of the biggest technical or logistical challenges Cornerstone Research has faced in integrating LLMs into its litigation consulting work, given the firm’s commitment to security, the sensitive nature of the data it handles, and the fact that courts are often slow to adopt and adapt to new technology?

The remarkably rapid development of LLM technology is exciting, but with new models being released on a near-weekly basis, you are chasing a constantly moving target. Moreover, an off-the-shelf tool may not be secure or tailored enough to support our work or our clients. A big part of this project has been developing the foundational infrastructure and software to deploy LLMs and LLM-based applications on premises. This approach is well suited to the precision and sophistication, as well as the confidentiality and security, required in our work.

Another challenge is that LLMs have an inherent degree of randomness that works at cross-purposes with our need for reproducible results. We have had notable success in combating that issue, but ensuring reproducibility has required us to trade off some efficiency. We are now working to boost efficiency, but the broader point is that striking the right balance in this work is not easy.
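One common way to trade some efficiency for stability, sketched here only as a general technique rather than our specific method, is to repeat each query several times under deterministic settings and keep the majority answer:

```python
# Illustrative sketch: absorb residual run-to-run variation by repeating the
# same query and taking the most common answer. The extra model calls are the
# efficiency cost paid for more reproducible results.
from collections import Counter

def stable_answer(prompt: str, generate, runs: int = 5) -> str:
    """Return the majority answer across repeated calls to the local model."""
    answers = [generate(prompt).strip() for _ in range(runs)]
    return Counter(answers).most_common(1)[0][0]
```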

Our on-premises GPU servers enable us to scale to exponentially larger projects while maintaining the high security and precision our clients require—an essential foundation for credible, reproducible LLM solutions.

Hardware investment was another important hurdle. Cornerstone Research has made considerable investments in cutting-edge equipment. Notably, at the end of 2023, we procured GPU servers in an environment of intensely competitive demand. These on-premises GPU servers have reduced our reliance on third-party hosted models and enabled our team to scale to exponentially larger projects, all while maintaining appropriately high security. We believe this infrastructure was an essential precursor to supporting our consultants, experts, and clients at the level they demand.

Lastly, we are all well aware of the headlines about how LLM use can go wrong. This highlights the importance of educating others about how and why our approach is tested and credible. We continually evaluate the potential risks and limitations of LLM use specifically and of broader technological advancements more generally. We also work closely with our clients to discuss our proposed methods and applications and to ensure we use only technologies with which they are comfortable.

As the founder of the Data Science Center and a leader in the legal industry, what does “legal innovation” mean to you, particularly in an era where new technologies like LLMs are disrupting traditional approaches to data analysis in litigation consulting?

Innovation in the legal industry is particularly challenging because the expectation of accuracy is simply higher than in most sectors. “Move fast and break things” is not an option. Even so, in a period where virtually every large business is investigating the potential of AI to improve efficiency and enable innovation, the legal industry must do the same. For those of us at the forefront of bringing AI-driven innovation to the sector, the imperative is to balance our enthusiasm for integrating new and exciting technology against the baseline expectation of utmost accuracy and precision.

Legal innovation means mastering the balance between embracing powerful new technology and upholding the industry’s unwavering standards of accuracy and precision.

As we in the Data Science Center have seen firsthand, LLMs are poised to catalyze a significant shift in litigation, with expert witnesses and consultants leveraging these powerful tools to enhance the quality, scope, and speed of their analysis. We are driving innovation without sacrificing rigor and continuing to deliver defensible, best-in-class work. To me, this is the true meaning of legal innovation: mastering the balance between these two imperatives.


The views expressed herein are solely those of the author and do not necessarily represent the views of Cornerstone Research.

Interviewee

Mike DeCesaris
Vice President, Data Science Center
San Francisco