
How AI Agents Are Changing the Way Lawyers Do Legal Work

See how legal teams are using AI agents to draft, analyze, research, and deliver review-ready work while keeping lawyers focused on strategy and judgment.

by Harvey Team•Jun 8, 2026

For most of legal AI's first decade, the tools sat next to the work. A lawyer would highlight a clause, ask a question, paste the answer into a draft, repeat. The work was the lawyer's. The tool was a faster reference. That arrangement is ending. AI agents for legal work now do what reference tools never could: taking a counterparty's underwriting agreement and returning a marked-up draft with an issues list, comparing an ESG disclosure against current regulatory requirements and flagging every gap, or drafting a witness examination outline from the case file before the partner walks into prep. The agent does the work. The lawyer decides what to keep.

Agents plan, act, and deliver finished work product rather than answering one question at a time. For legal leaders, the practical question is no longer whether agents can do meaningful legal work. It's where to deploy them first, how to govern them, and what the rollout actually looks like in the first six months.

This shift matters because it changes the unit of legal AI from the prompt to the task. A lawyer no longer asks an assistant a series of questions and assembles the answers into a deliverable. The agent receives the goal, maps the steps, gathers the sources, does the work, and returns something review-ready. The lawyer's role moves up the stack to scope, strategy, and judgment.

For legal leaders trying to understand where this is going, the rest of the article works through what these agents actually do and how they differ from earlier legal AI. It walks through the five stages of an agent's work, where agents create the most leverage, and the governance questions that matter. It closes with what credible platforms look like, the adoption pattern that works, and where the technology is headed next.

What Legal Work Looks Like When an Agent Owns the Task

The mental model legal teams should hold for any agent worth deploying is simple. The agent receives a goal, plans the work, executes the steps, and returns a finished deliverable. The lawyer governs the goal and the review. The agent governs the execution. Every step is visible, because invisible work is unreviewable work.

This is where agents diverge from earlier legal AI. A chat-style assistant answers one question at a time. A document-extraction tool pulls fields from contracts but doesn't do anything with them. An agent owns the task. It plans the steps, executes the work, and returns a deliverable the lawyer can sign off on.

In production legal use, three categories of agents have emerged, and Harvey ships agents in all three. Ad hoc agents take a described goal and plan from there, useful when the work is one-off or doesn't fit an existing template. Pre-built agents come from vetted libraries built by lawyers for tasks that recur across matters, such as drafting a witness examination outline or comparing an exhibit list against a pretrial order. Harvey's library includes more than 500 such agents, vetted by its in-house lawyers across practice areas from capital markets to environmental law to bankruptcy. Custom agents, built through Harvey's Agent Builder, take an organization's own templates, standards, and review steps and turn them into a reusable agent the whole team can run. Most organizations end up using all three, with the mix shifting toward custom as internal expertise gets encoded.

The Five Stages an Agent Runs on Every Legal Task

The five stages of an agent's work are the operating model lawyers should hold in their heads when assessing whether an agent is built for serious legal practice. Plan. Research. Work. Deliver. Review. Each stage is visible. Each one is auditable. The lawyer sees what the agent intends to do before it does it, watches the work happen, and signs off on what comes back. An agent that hides any of these stages isn't ready for legal use.

1. Plan

The agent translates a goal into a step-by-step plan. When the scope is ambiguous, it asks clarifying questions before starting. A lawyer asking an agent to "draft a markup of the counterparty's services agreement" might be asked whether the markup should focus on liability and indemnification or take a fuller pass across all material terms. The plan is then shown back to the lawyer, who can review it before execution begins. This is the stage that sets the boundary of the work.

2. Research

The agent pulls from trusted sources. Those sources include the organization's own documents, vetted legal databases, and, where appropriate, the live web. Every claim the agent surfaces in its output traces to a citation the lawyer can audit. This isn't a feature flourish. It's the thing that makes the work reviewable.

3. Work

This is the stage that distinguishes agents from research tools. The agent performs the substantive task, drafting the markup, building the issues list, comparing the disclosure against regulatory requirements, extracting the response action obligations from a record of decision. It applies analysis, not just retrieval. The output is the work product, not a memo about how the work product might be made.

4. Deliver

The agent returns the deliverable in the format the work calls for. A markup looks like a markup, with redlines and comments in the right places. A memo reads like a memo, with structure, citations, and the right register. A diligence summary lands as a structured document, not a chat transcript. This matters because deliverable quality is where most legal AI fails the practical test. Output that needs reformatting before a partner can read it isn't a finished product.

5. Review

The lawyer accepts, refines, or rejects. The decisions made at this stage shape how the agent performs on the next run, which is how custom agents get sharper inside an organization over time. This is also the stage that keeps accountability where it belongs. The agent does the work. The lawyer signs the deliverable.
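The five stages above can be sketched as a small pipeline in which every stage writes to a visible log. This is an illustrative sketch only, not Harvey's implementation: the names `Stage`, `AgentRun`, and `run_task` are hypothetical, and the notes recorded at each stage are placeholders for real work.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    PLAN = "plan"
    RESEARCH = "research"
    WORK = "work"
    DELIVER = "deliver"
    REVIEW = "review"


@dataclass
class AgentRun:
    """Hypothetical record of one agent task; every stage appends to `log`."""
    goal: str
    log: list = field(default_factory=list)

    def record(self, stage: Stage, note: str) -> None:
        self.log.append((stage.value, note))


def run_task(goal: str, lawyer_decision: str) -> AgentRun:
    """Walk one task through the five stages, logging each in order."""
    if lawyer_decision not in {"accept", "refine", "reject"}:
        raise ValueError("decision must be accept, refine, or reject")
    run = AgentRun(goal)
    run.record(Stage.PLAN, f"plan drafted for: {goal}")
    run.record(Stage.RESEARCH, "sources gathered with citations")
    run.record(Stage.WORK, "substantive draft produced")
    run.record(Stage.DELIVER, "deliverable formatted for review")
    # The final stage belongs to the lawyer, not the agent.
    run.record(Stage.REVIEW, f"lawyer decision: {lawyer_decision}")
    return run
```

Calling `run_task("draft a markup of the services agreement", "refine")` returns a run whose log lists all five stages in order, which is the property a reviewer would want to check first.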

Where AI Agents Create the Most Leverage in Legal Work

Agents help most where the work is repeatable and the constraint is hours, not insight. The legal tasks where adoption has moved fastest share three traits. They run at high volume, follow a consistent structure across matters even when the inputs vary, and run into human time as the limit on throughput. Where those three conditions hold, an agent compounds in value with every run. The categories below are where they hold most clearly.

Transactional Work

Transactional work is the category most often cited, and the reason is structural. Diligence, markup drafting, and issues lists across a deal involve thousands of pages of documents that have to be read, compared, and summarized in patterns that recur across deals. Pre-built agents now handle drafting an issues list for an escrow agreement, identifying issues in an underwriting agreement, and drafting a markup of an acquisition agreement against a counterparty's first cut. Harvey's library alone includes more than 60 agents across mergers and acquisitions and capital markets, which gives a sense of how granular the task taxonomy has gotten in transactional practice.

Litigation

Litigation has been slower to absorb agents because the work depends more on judgment and less on volume, but the high-volume corners of litigation are absorbing them quickly. Drafting a witness examination outline. Comparing an exhibit list against a pretrial order. Drafting responses to requests for production. Extracting key allegations from a government inquiry letter. These are the tasks that consume associate hours without producing the kind of work that builds a litigator's craft, which makes them natural candidates for agent execution and partner-led review.

Compliance and Regulatory Work

Compliance and regulatory work is the category where in-house teams are getting the most leverage. The work is high-volume, jurisdictionally fragmented, and procedurally repetitive in ways that punish manual review. Comparing an environmental, social, and governance (ESG) disclosure against regulatory requirements. Drafting a permit application narrative or updating a code of conduct for new regulatory requirements. Assessing breach notification obligations across affected jurisdictions. The compliance team that runs these as agent tasks gets back hours that go straight into the strategic work the business actually needs.

In-House Operations

In-house operations are where the practical math is starting to matter most. Legal departments are running flat or contracting while the volume of work in front of them keeps growing. Agents close that gap on tasks like assessing master services agreement (MSA) renewal terms against business performance, drafting an employment agreement from offer letter terms, drafting a contract amendment, and identifying issues in counterparty financial statements. The result isn't a smaller team. It's a team that handles more matters with the same headcount and gets out of the queue of contract review long enough to do the strategic work that justifies a seat at the table.

How Agents Change the Way Associates Learn the Craft

If agents handle the first draft, what does a first-year associate do? It's a question every managing partner and chief innovation officer is starting to ask, and the easy answer (less work) is the wrong one. The honest answer reshapes early-career legal work entirely.

Agents shift the early-career skill set toward review, judgment, and source verification, which are exactly the skills that take the longest to develop under the old model. The hours tell the story. A first-year who once spent 40 hours marking up a services agreement might now spend 5 hours reviewing an agent's markup.

The work that takes the remaining 35 hours is different. It's reading the agent's plan and catching where the scope is wrong. It's spotting the citation that doesn't quite support the claim. It's recognizing the issue the agent didn't flag because the agent doesn't know what this client cares about. That is judgment work, and judgment work is what makes a senior associate.

The real concern is whether craft survives the transition. Pattern recognition and instinct come from doing the work yourself the first hundred times. An associate who has never drafted a witness examination outline from scratch can't tell when an agent's version is missing the question that matters. The firms that deploy agents without rethinking training will produce associates who can review competently but cannot draft from a blank page when the matter calls for it.

The firms gaining ground on this are treating it as a curriculum problem. They are pairing agent deployment with structured review programs that walk associates through what good looks like, line by line. They are building intentional friction into training, having associates draft sections by hand before comparing against the agent's output. The goal is to use agents in a way that produces the next generation of partners, not just the next generation of reviewers.

The Governance Work That Decides Whether an Agent Reaches Production

Most coverage of legal AI agents focuses on what they can do. The harder questions for general counsel, chief innovation officers, and managing partners are different. What happens when an agent gets something wrong? Can the work product stand up to a partner's review, a client's audit, or a regulator's inquiry? These are the questions that decide whether an agent moves from a pilot into production, and they get the least airtime in platform marketing materials.

AI governance in legal practice rests on two layers: written policies that set expectations, and the programmatic controls — layered permissions, access boundaries, audit trails, and adoption visibility — that turn those expectations into practice inside the systems where work happens. With assistants, those controls map cleanly onto individual interactions. With agents that run multi-step processes, touch more systems, and produce more consequential output, the governance focus shifts from the user to the work itself. Firms are not only deciding who can use an agent. They are deciding what an agent is permitted to do, which paths and sources it can draw on, and where human review occurs. (For a more detailed look, see Harvey’s guide, Extending Governance to Legal AI Agents.)

Six governance dimensions matter most, and each one addresses a risk that is easier to prevent than to remediate once an agent is in production.

Scope of Access

Governance starts with what an agent is permitted to read and what tools it is permitted to use. The relevant boundary may be the materials the user is already authorized to access, or it may be the systems and actions the firm has enabled for a particular workflow. A diligence agent that can access a deal data room but not unrelated client matters is operating within defined scope. An agent with undefined access boundaries is a confidentiality incident waiting to happen. The governance question is not only which repositories, research sources, or downstream systems an agent can reach, but also what the agent is permitted to do with that access.
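One way to picture a defined access boundary is an explicit allow-list that is checked before any read or tool call. The sketch below is an assumption-laden illustration: `AccessScope`, the repository names, and the tool names are all hypothetical and do not describe any real platform's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessScope:
    """Hypothetical allow-list for one agent workflow."""
    repositories: frozenset  # sources the agent may read
    tools: frozenset         # actions the agent may invoke


def can_read(scope: AccessScope, repository: str) -> bool:
    # Anything not explicitly listed is out of scope.
    return repository in scope.repositories


def can_use(scope: AccessScope, tool: str) -> bool:
    return tool in scope.tools


# A diligence agent scoped to a single deal data room (names are made up).
diligence_scope = AccessScope(
    repositories=frozenset({"deal-data-room"}),
    tools=frozenset({"summarize", "extract_terms"}),
)
```

With this scope, `can_read(diligence_scope, "deal-data-room")` is true while a request for an unrelated client matter is refused, because the default for anything unlisted is denial.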

Authorized Actions

Agents do work, not just retrieve information. Firms need to distinguish between actions an agent can run without intervention and actions where human sign-off is needed. Drafting an internal issues list is one thing. Preparing materials intended to leave the firm, updating a system of record, or finalizing a document is another. Where to draw those lines is a judgment for each firm, but the distinction has to be made explicitly in policy before the agent runs, not discovered after the output has already shipped.
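The distinction between autonomous actions and actions that need sign-off can be written down as policy before the agent runs. The following is a minimal sketch under stated assumptions: the action names and the `authorize` function are hypothetical, and where each firm draws the line is its own judgment.

```python
from typing import Optional

# Hypothetical policy sets; each firm would define its own.
AUTONOMOUS = {"draft_internal_issues_list"}
REQUIRES_SIGN_OFF = {
    "send_external_materials",
    "update_system_of_record",
    "finalize_document",
}


def authorize(action: str, signed_off_by: Optional[str] = None) -> bool:
    """Return True only if policy permits the action as requested."""
    if action in AUTONOMOUS:
        return True
    if action in REQUIRES_SIGN_OFF:
        # Permitted only once a named reviewer has approved it.
        return bool(signed_off_by)
    # Actions the policy never mentions are denied by default.
    return False
```

Denying unlisted actions by default means a new capability has to be classified in policy before an agent can use it, rather than being discovered after output has shipped.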

Reviewability and Human-in-the-Loop Supervision

Because agents operate across multiple steps, lawyers need to review and supervise the work as it progresses, not only at the end. That includes understanding the agent’s proposed approach before it executes, validating sources and outputs at intermediate checkpoints, and intervening before the final work product is assembled. A lawyer must review all agent output before it is relied on or shared — that is non-negotiable. But the firms getting this right are also identifying earlier points in the agent’s execution where escalation or approval is appropriate. Citation auditability is the mechanism that makes this supervision practical. Every claim an agent makes should trace to a source the lawyer can verify. A markup with citations pointing to the exact clause that triggered each issue can be reviewed in minutes. A markup with no citations has to be re-read from scratch, which defeats the purpose.
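Citation auditability lends itself to a simple mechanical check: before a draft reaches the reviewing lawyer, list every claim that has no source attached. This sketch assumes a hypothetical `Claim` structure; it only detects missing citations and does not judge whether a citation actually supports its claim, which remains the lawyer's review.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Claim:
    """Hypothetical unit of agent output: a statement and its source."""
    text: str
    citation: Optional[str]  # e.g. a clause reference; None if unsupported


def uncited_claims(claims: List[Claim]) -> List[Claim]:
    """Return the claims a reviewer cannot trace to any source."""
    return [c for c in claims if not c.citation]


markup = [
    Claim("Liability cap excludes indemnified losses", "Section 9.2"),
    Claim("Termination notice period is shorter than standard", None),
]
flagged = uncited_claims(markup)
```

Here `flagged` contains only the second claim, so the reviewer knows which item has to be checked against the document from scratch.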

Matter-Level Isolation and Data Governance

Work for one client has to stay walled off from work for another. This is a security architecture question, not a feature toggle. Information from a matter should not leak into another matter, into the model’s general behavior, or into outputs for an unrelated client. Closely related is the question of what happens to data as it passes through an agent. What is retained, and for how long? What, if anything, is used to train the underlying models? These are the questions a general counsel needs answered before a single client document goes through an agent. The right answer is usually that client data is not used for training, retention is governed by the customer, and storage locations match the regulatory regime of the work.

Deployment and Configuration Governance

As agent libraries grow and firms begin building their own, the governance question extends beyond who can use an agent to who can configure or publish one for others to run. A deployed agent sets the process for everyone who runs it, which means publishing an agent is closer to writing a policy than to sharing a document. The same principles of role-based access and admin approval that govern other firm systems serve as useful reference points, with the added consideration that a misconfigured agent can produce unreliable output at scale rather than in a single instance.

Accountability and Documentation

Agents do not sign documents. Lawyers do. Internal policy needs to spell out who reviews what, what level of agent output requires partner sign-off, and how the firm or department documents the review. Without that, an agent’s deliverable risks being treated as final work product rather than a draft. Equally important is the audit trail. Firms should document the inputs, plans, steps, sources, and outputs of agent-run work to support auditability and operational transparency. The firms running agents well are the ones that wrote the policy and built the logging before the rollout, not after the first incident.
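An audit trail covering inputs, plans, steps, sources, and outputs can be as simple as one structured record per agent run. The sketch below is illustrative only: `AuditRecord`, its field names, and the sample values are assumptions, and a real deployment would add timestamps, storage, and access controls.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class AuditRecord:
    """Hypothetical log entry for one agent-run task."""
    matter_id: str
    inputs: List[str] = field(default_factory=list)
    plan: List[str] = field(default_factory=list)
    steps: List[str] = field(default_factory=list)
    sources: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)
    reviewed_by: str = ""  # empty until a lawyer signs off


def to_log_line(record: AuditRecord) -> str:
    """Serialize the record as one JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


record = AuditRecord(
    matter_id="M-001",
    inputs=["counterparty_draft.docx"],
    plan=["identify liability terms", "draft markup"],
    steps=["read draft", "compare against template"],
    sources=["Section 9.2"],
    outputs=["markup_v1.docx"],
)
line = to_log_line(record)
```

Leaving `reviewed_by` empty until sign-off makes unreviewed deliverables easy to find in the log.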

What Credible Legal AI Agents Look Like in Practice

What separates a production-grade legal agent from a demo that won't survive a procurement conversation comes down to three things. The agent has to be built on a vetted library of legal tasks. Its reasoning has to be transparent, with citations the lawyer can audit at every step. And the platform has to let an organization turn its own templates, standards, and review steps into custom agents the whole team can run. Anything short of all three is a tool, not a platform.

The legal AI category has consolidated around a small number of platforms operating at scale across the largest law firms and in-house teams. The firms running agents in production today aren't experimenting at the edges. They are running thousands of agent tasks a day across diligence, drafting, regulatory analysis, and litigation work, with internal policy governing how outputs are reviewed and signed off.

Harvey, used by more than half of the AmLaw 100 and by global firms running agents in production, is the platform most often cited when legal leaders compare notes on agent deployment. Customers include Reed Smith, Macfarlanes, Vinson and Elkins, and Willkie Farr and Gallagher. The platform operates at a scale that gives a sense of where legal AI has actually moved. Harvey reports more than 700,000 daily tasks run using agents and 50 million terms extracted weekly across its customer base. Those numbers matter less as a marketing point than as evidence that agent work has crossed from pilot into production volume inside the largest legal organizations.

What a Successful Agent Rollout Looks Like in the First Six Months

The biggest predictor of whether a legal AI agent rollout produces real value is how the first six months are structured. The failure mode is consistent. A firm or department announces a top-down mandate, deploys across every practice group at once, and runs into a wall of cultural resistance, governance gaps, and uneven adoption that takes another year to unwind. The pattern that works is almost the opposite. It looks slower at the start and produces more durable results.

The organizations getting agents into production tend to follow a four-step pattern.

1. Pilot with a single practice group. Start with a high-volume, well-bounded use case where the work is repeatable, the volume produces learning fast, and the partner or department head is genuinely interested in the outcome. Pilots that meet these criteria produce signal in six to eight weeks. Pilots that don't produce political noise instead.

2. Measure in lawyer terms, not platform terms. The metrics that matter are quality of output, hours redirected to higher-value work, and partner satisfaction with deliverables. Usage counts and feature adoption tell you whether people are clicking, not whether the agent is producing work the firm trusts.

3. Build internal champions inside each practice group. Identify two or three lawyers per group who become the people their peers come to with agent questions. Adoption happens lawyer to lawyer, and these champions translate lessons into the language of the practice itself.

4. Expand horizontally before going firm-wide. Once the first practice group is producing trusted output, the next step is a second practice group with adjacent characteristics. The horizontal expansion lets each practice group benefit from the lessons of the last one, and it gives leadership an honest read on which workflows the agent serves well.

The firms scaling agents fastest aren't the ones with the most ambitious announcements. They are the ones with the most disciplined first six months. Ambition is earned a practice group at a time, which is the only way the work holds up under partner and client scrutiny.

The Next Phase of How Legal Work Gets Done Is Already Underway

The first wave of legal AI agents has been about single-task execution, like drafting a markup, identifying issues in an underwriting agreement, or comparing a disclosure against regulatory requirements. The next phase of agentic AI for legal teams is already taking shape. Agents will run diligence, drafting, and analysis simultaneously across a single matter, integrate natively with the document management systems and collaboration tools where legal work already happens, and get sharper over time as custom agents trained on a firm's review history become an extension of its institutional knowledge.

The question for legal leaders is no longer whether to use agents. It is what work to delegate, what judgment to retain, and how to train the next generation of lawyers in a profession where the first draft is no longer where craft is built. The firms that take the question seriously now will be the ones still doing recognizable legal work in five years, with better tools and the same standards.

The platform legal teams choose for that work matters. Harvey Agents were built for the way lawyers actually work, with a library of more than 500 pre-built agents, a custom agent builder, governance, matter-level isolation, and the citation-grounded reasoning that makes the work reviewable. The legal teams getting agents into production are the ones working with a platform that has already solved the hard problems. Schedule a demo to see Harvey running on the work your team handles every week.
