Applying AI in contract review and drafting

Amit Sharma
November 2, 2024
10 min read

ContractKen's AI Advancements in 2025: Revolutionizing Contract Drafting and Review

We have been at the forefront of integrating artificial intelligence (AI) and machine learning (ML) into legal services, particularly in contract review and drafting. Since our last update in 2023 (available here), the landscape of AI has evolved significantly, with the advent of more powerful large language models (LLMs) such as GPT-4o and GPT-4.5, and of models with advanced reasoning capabilities such as o3.

This article provides an overview of our latest AI deployments, focusing on carefully orchestrated use of leading LLMs via API, our Retrieval Augmented Generation (RAG) pipeline, and fine-tuned models tailored for law firms.

Let's dive deeper into these major engineering capabilities:

  • RAG (Retrieval Augmented Generation) and Knowledge Layer

This is now almost a standard mechanism for building customized solutions on top of LLMs while reducing hallucinations. Retrieval Augmented Generation (RAG) is a technique that combines the power of LLMs with a knowledge base to provide more accurate and contextually relevant outputs. At ContractKen, we have implemented a robust RAG pipeline that includes a Knowledge Layer tailored for specific contract types, industry domains, or clients. This integration ensures that ContractKen's AI solutions are both generalizable and specifically tuned, addressing the challenges of unstructured contracts and interlinked clauses highlighted in our earlier work.

    What is RAG? RAG enhances LLMs by retrieving relevant information from a knowledge base before generating a response. This approach mitigates the issue of hallucinations and ensures that the output is grounded in accurate data, which is crucial for legal applications where errors can be costly.

    Implementation at ContractKen: We curate and maintain a comprehensive database of legal documents, case law, and industry-specific regulations. Using advanced embedding techniques and similarity search, we retrieve the most relevant information for any given query; the LLM then generates a response based on both the query and the retrieved data (a minimal sketch of this retrieve-then-generate step appears after this list).

    Role of the Knowledge Layer: This layer is customized for different contract types (e.g., NDAs, service agreements), industry domains (e.g., tech, healthcare), or specific clients. It allows the RAG pipeline to adapt to unique requirements, enhancing accuracy and relevance, especially for niche legal scenarios.

  • Fine-tuning best-in-class closed LLMs for contract review: We have fine-tuned the gpt-3.5-turbo model on contract review and analysis tasks. The broad approach is to use a more powerful model (such as gpt-4) as a 'Teacher' to create a dataset, which is then used to fine-tune gpt-3.5-turbo's output. We have realized significant gains in certain areas and continue to work on this approach (an illustrative data-format sketch appears after this list).
  • Prompt Engineering: There is still a lot of alpha in prompting the models the right way. Because each model is tuned differently (for example via RLHF), prompts do not transfer reliably across context window sizes or across models. There is some tremendous material out there on using the right prompts to improve output.
  • LLMOps: Managing an LLM pipeline requires integration with an LLMOps tool. We're exploring partnerships with industry leading startups in this space like Portkey, Dreamboat, etc.
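To make the RAG bullet above concrete, here is a minimal sketch of the retrieve-then-generate step. It is illustrative only: the collection name, embedding model, and prompt wording are assumptions rather than a description of ContractKen's production pipeline, and it uses the OpenAI and ChromaDB Python clients as stand-ins for whichever vector store and LLM are actually deployed.

```python
# Minimal RAG sketch: embed the query, retrieve similar material from the knowledge
# layer, and ground the LLM's answer in it. Names and models are illustrative only.
import chromadb
from openai import OpenAI

client = OpenAI()            # assumes OPENAI_API_KEY is set in the environment
store = chromadb.Client()
# Hypothetical knowledge layer, assumed to be already populated with clause documents.
collection = store.get_or_create_collection("clause_knowledge_base")

def embed(text: str) -> list[float]:
    # Turn text into a vector so it can be compared against the knowledge base.
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return resp.data[0].embedding

def answer_with_rag(query: str, n_results: int = 3) -> str:
    # 1. Retrieve the documents most similar to the query.
    hits = collection.query(query_embeddings=[embed(query)], n_results=n_results)
    context = "\n\n".join(hits["documents"][0])
    # 2. Generate an answer grounded in the retrieved context.
    chat = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Answer using only the provided contract context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
        ],
    )
    return chat.choices[0].message.content

print(answer_with_rag("What is the record-retention obligation in this supplier agreement?"))
```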
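Similarly, for the fine-tuning bullet, the teacher-student approach amounts to having the stronger model label contract-review examples and writing them out in the chat-format JSONL that OpenAI's fine-tuning endpoint expects. The clause text, system prompt, and file name below are made up for illustration.

```python
# Sketch of building a 'teacher' dataset for fine-tuning gpt-3.5-turbo on clause review.
# The example clause, instructions, and file name are hypothetical.
import json
from openai import OpenAI

client = OpenAI()
SYSTEM = "You are a contract reviewer. Flag the risks in the clause you are given."

def teacher_label(clause: str) -> str:
    # The stronger 'teacher' model produces the target output for a training example.
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "system", "content": SYSTEM},
                  {"role": "user", "content": clause}],
    )
    return resp.choices[0].message.content

clauses = ["The Supplier shall keep full, true and accurate Records ..."]  # training inputs
with open("contract_review_finetune.jsonl", "w") as f:
    for clause in clauses:
        example = {"messages": [
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": clause},
            {"role": "assistant", "content": teacher_label(clause)},
        ]}
        f.write(json.dumps(example) + "\n")

# The JSONL file is then uploaded and used to fine-tune the student model, e.g.:
# client.files.create(file=open("contract_review_finetune.jsonl", "rb"), purpose="fine-tune")
# client.fine_tuning.jobs.create(training_file="<uploaded file id>", model="gpt-3.5-turbo")
```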

Older post:

Let us start by opening a sample contract in MS Word:

Here is a Merger Agreement between two media companies, focused on a variety of issues, transactions, etc. This is a massive document spanning 82+ pages, not including a large number of exhibits & schedules.

Typically, the execution of such a contract is the result of months, if not years of contracting work between all parties involved. Easy to imagine the amount of drafting, review, and iterations that such a large agreement would take.

Reviewing such a large contract is surely not for the weak-hearted or the impatient! This is where an area of AI called Natural Language Processing (NLP) steps in.

Challenges of using NLP for contract reviews

  • Contracts are unstructured, unstandardized, and use nuanced legal language. Take a look at the example below of two clauses with very similar language but diametrically opposite meanings / implications:

Clause 1:

During the Term and for a period of two years thereafter, or for a period of seven years from the date of creation of the Records (whichever is longer) the Supplier shall keep full, true and accurate Records to show compliance with its obligations under this Agreement together with any other records that are required by any professional rules of any regulatory body which apply to the activities of the Supplier or as may from time to time be agreed in writing between the Company and the Supplier.

Clause 2:

During the Term and for a period of two years thereafter, or for a period of seven years from the date of creation of the Records (whichever is longer) the Supplier shall keep full, true and accurate Records to show compliance with its obligations under this Agreement together with such other records as may from time to time be agreed in writing between the Company and the Supplier.

  • Contracts exist to guard against rare and potentially catastrophic occurrences, so tolerance for false negatives and false positives is almost nil
  • A contract document is not a simple collection of paragraphs of text whose individual inferences can be summed up to overall understanding. Instead, it is a carefully constructed instrument of risk management, where implications of concepts, terms & clauses are dependent on other concepts, terms & clauses, or even other contracts.
  • Experts have a way of reviewing pieces of a contract, cross-referencing, triaging, and then concluding what risks a clause presents. Traditional ML algorithms process a document (or a piece of it) in isolation and are not suited to this iterative, interlinked way of assessing risk.

How ContractKen's AI-assisted process helps speed up contract review by up to 50%, with zero errors or oversights

  1. Identifying the key clauses present in the contract, so you can focus your attention on the language of those clauses instead of spending time searching for keywords or phrases
  2. Alerting the user to missing clauses or key terms in the document
  3. Similarity scoring for each detected clause on a scale of 1-10, compared against your organization's standard for that clause (see the sketch after this list)
  4. Enabling use of a contract review playbook within MS Word - this is the coolest part of our tech, which lets organizations customize their contract playbook and use it right within Word
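To illustrate the similarity scoring in point 3, one common approach is to embed both the detected clause and the organization's standard clause and rescale their cosine similarity onto a 1-10 scale. The model choice and the linear rescaling below are assumptions for illustration, not ContractKen's internal scorer.

```python
# Illustrative clause similarity scoring: cosine similarity rescaled to a 1-10 scale.
# The embedding model and the rescaling are assumptions, not ContractKen's exact method.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

def similarity_score(detected_clause: str, standard_clause: str) -> float:
    embeddings = model.encode([detected_clause, standard_clause], convert_to_tensor=True)
    cosine = util.cos_sim(embeddings[0], embeddings[1]).item()  # roughly in [-1, 1]
    return round(1 + 9 * max(cosine, 0.0), 1)                   # map [0, 1] onto 1-10

score = similarity_score(
    "The Supplier shall keep full, true and accurate Records ...",
    "Supplier will maintain complete and accurate records of its performance ...",
)
print(score)  # closer to 10 means closer to the organization's standard wording
```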

All of this functionality has multiple NLP models working in unison in the background. However, there are two broad types of algorithms deployed - Pattern Recognition & Deep Learning.

Let's take a look under each one's hood.

Pattern Recognition

We use algorithms like K-Nearest Neighbors (KNN) to recognize patterns in training data. The following diagrams (oversimplified to 3 dimensions) show how a pattern recognition algorithm solves the (relatively) easier problem of identifying contract metadata.

In this step the machine automatically detects patterns of similarity or dissimilarity (across n dimensions) and sorts the data points into various 'clusters' - a form of unsupervised learning. In this example, after our data pipelines pre-process and tokenize the training documents and feed them into the algorithm, the model creates 3 distinct clusters, corresponding to key terms such as 'Governing Law', 'Effective Date', and 'Expiry Date'.

[Figure: Model training - pattern recognition clusters formed from the training data]


When a new data point is fed into the system (in production use), the model calculates the distance of the new data point from the center (in an n-dimensional space) of each of the clusters that the model has identified. The model will assign this new data point to the nearest cluster.

[Figure: Model inference in production - a new data point assigned to the nearest cluster]

This is an oversimplified example of how basic pattern recognition algorithms can be deployed to detect contract terms on the basis of their meaning, rather than through a keyword-search type of approach.
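As a toy version of this cluster-and-assign idea, the sketch below groups a few made-up 3-dimensional vectors with K-Means (used here as a stand-in clustering algorithm) and then assigns a new point to the nearest cluster centre, mirroring the training and inference diagrams above.

```python
# Toy version of the pattern-recognition example: cluster training vectors, then
# assign a new (production) data point to the nearest cluster centre.
# The 3-D vectors stand in for real tokenized-contract features and are made up.
import numpy as np
from sklearn.cluster import KMeans

training_vectors = np.array([
    [0.9, 0.1, 0.0], [0.8, 0.2, 0.1],   # points resembling 'Governing Law'
    [0.1, 0.9, 0.1], [0.2, 0.8, 0.0],   # points resembling 'Effective Date'
    [0.1, 0.1, 0.9], [0.0, 0.2, 0.8],   # points resembling 'Expiry Date'
])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(training_vectors)

new_point = np.array([[0.85, 0.15, 0.05]])   # a data point seen in production
cluster = kmeans.predict(new_point)[0]       # assigned to the nearest cluster centre
print("assigned to cluster", cluster)
```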

Deep Learning

There are broadly 2 types of models being used here:

Q&A

This is to detect the presence of key contract clauses and identify their location in the document. We are leveraging the SQuAD approach to fine-tune several pre-trained language models using the HuggingFace Transformers library. Because the prediction task is similar to extractive question-answering tasks, we use the QuestionAnswering models in the Transformers library. Each 'Question' identifies the label category (clause) under consideration. This reuse of pre-trained models is called 'Transfer Learning' in ML.

Take these two sentences, for example: (1) "I like to play football" and (2) "I am watching the Julius Caesar play". The word 'play' has a different meaning in each. These models use neural networks as their foundation and take the semantics of the text into account.

The model returns the precise location of each detected clause in the document (starting position and length), which our Word add-in then uses to highlight the relevant text.
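Here is a minimal sketch of this extractive question-answering step using the HuggingFace pipeline API. The public SQuAD-tuned checkpoint and the question wording are illustrative stand-ins for the clause-specific fine-tuned models described above.

```python
# Extractive QA over contract text: the model returns the answer span's text and its
# character positions, which an add-in could use to highlight the clause.
# The checkpoint and question are illustrative, not ContractKen's fine-tuned models.
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

contract_text = (
    "This Agreement shall be governed by and construed in accordance with the laws of "
    "the State of New York. The term of this Agreement commences on the Effective Date."
)

result = qa(question="Which law governs this agreement?", context=contract_text)
print(result["answer"])                 # the extracted clause text
print(result["start"], result["end"])   # character offsets used for highlighting
```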

We’re using a Transformers-based DL algorithm to detect the presence and location of many key commercial clauses and terms. To understand more about Transformers, the following article is perhaps the best out there: https://jalammar.github.io/illustrated-transformer/

Primary task formulation: The model should predict which substrings of the contract document are related to each clause label category. The model learns the start and end token positions of the substring. This formulation is built on the SQuAD 2.0 setup.

The algorithm that we're using is BERT (short for Bidirectional Encoder Representations from Transformers); the original BERT paper was published by Google Research. We have used multiple variations of BERT to optimize the overall precision and recall scores, and we continue to test variations of simple algorithms, new data, and model parameters to get higher coverage (i.e., more terms/clauses predicted), better accuracy, and superior inference performance.

Named Entity Recognition (NER)

This is to identify key business entities in the contract document, for example party names, financial values, etc. At ContractKen, we've deployed multiple variants of the NER algorithm for specific commercial entities.
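As an illustration of this entity extraction, the snippet below runs a publicly available NER pipeline over a made-up contract sentence. The checkpoint shown is a general-purpose model, not one of ContractKen's variants tuned for specific commercial entities (which would also cover things like monetary values).

```python
# Illustrative named entity recognition over a contract sentence.
# The public checkpoint is a stand-in for contract-specific NER models.
from transformers import pipeline

ner = pipeline("ner", model="dslim/bert-base-NER", aggregation_strategy="simple")

sentence = "This Agreement is entered into between Acme Corp and Globex Inc on 1 January 2024."
for entity in ner(sentence):
    print(entity["entity_group"], entity["word"], round(float(entity["score"]), 2))
# Expected output is a list of detected organizations (and similar entities) with scores.
```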


This is a fast-changing domain with ever larger and better language models coming into the open source domain every month. At ContractKen, we’re excited and committed to deploying the best-in-class technology to solve a wide variety of challenging problems with the document review process.

Check out how you can simplify contract drafting, review & negotiations and amplify your expertise
Try ContractKen for free
