Attorney-Client Privilege and AI: A Partner's Guide to Vendors, Data Paths, and the Waiver Problem

Every legal-AI sales call ends at the same question, eventually, and it is almost always the general counsel or the managing partner who asks it: where does our data go, and what does that do to privilege? The vendor answers with a mix of acronyms and boilerplate — SOC 2, zero-retention, encryption at rest — and everyone nods. Nobody in the room wants to be the one who slows down the deal, so the deal closes. A year later the firm has shipped twenty thousand client documents through a pipeline nobody fully understands.

This is the waiver problem, and it is worth a serious look before the next renewal. Not because the sky is falling. Most AI vendors are run by competent engineers who take security seriously. The problem is that privilege is not a security question. It is a control question. And the deployment model a firm picks in 2026 determines, more than any feature, how much control it keeps.

What follows is the framework I walk partners through before we sign anything with any vendor — ours, theirs, anyone’s. It is not legal advice; every jurisdiction has its own wrinkles and your ethics counsel is the right person to sign off on final language. But these are the questions that keep the conversation honest.

The waiver problem, stated plainly

Attorney-client privilege protects communications made in confidence for the purpose of obtaining legal advice. The operative word is confidence. The moment a privileged communication is shared with a third party who is not acting as the attorney’s agent in furtherance of the representation, privilege is arguably waived.

Courts have long carved out space for necessary third parties: translators, paralegals, outside experts retained by counsel, the IT vendor who keeps the email server running. The common thread is that each of those parties is working for the firm, in service of the representation, under a confidentiality structure that the firm controls.

AI vendors are, in 2026, the newest third party in that chain. The question is whether they sit inside the protective bubble or outside it. The answer depends almost entirely on the deployment model.

Three deployment models. Only one of them is yours.

Strip away the marketing and every AI vendor in the legal space is selling one of three architectures. They look similar on a slide. They are not similar when the data starts moving.

Multi-tenant SaaS. Your firm logs into a web app that the vendor runs on infrastructure shared with every other customer. Your prompts and documents pass through the vendor’s servers, often then through a foundation-model provider’s API (OpenAI, Anthropic, Google), and the responses come back the same way. The vendor’s privacy terms promise that your data will not be used to train the shared model. That promise is almost always credible. It is also almost always accompanied by a long list of logging, caching, abuse-monitoring, and support-access carve-outs that most partners have never read.

Single-tenant hosted. The vendor runs a dedicated instance for your firm, either in their cloud account or in yours. The model and the application are isolated. Your documents do not commingle with any other firm’s. The vendor still has operational access — they push updates, they respond to support tickets, they have the keys to the infra — but the blast radius of a bug or a misconfiguration stops at your firm’s door. This tier usually costs three to ten times what the multi-tenant tier costs.

On-prem or self-hosted custom. The software runs on infrastructure the firm controls, whether that is an AWS account the firm owns, an on-prem server room, or a private Azure tenant. The source code is the firm’s. The keys are the firm’s. Nobody outside the firm sees a document unless the firm’s own staff explicitly grants access. This is the model that looks most like the IT arrangements lawyers have been comfortable with for thirty years. It is also, historically, the most expensive — which is why it used to be reserved for the Am Law 50. In 2026, a focused custom build costs less than two years of a twenty-attorney per-seat SaaS contract.

The privilege analysis runs in the same direction as the cost analysis. Multi-tenant SaaS carries the most exposure and the most ambiguity. Single-tenant hosted narrows the exposure. Self-hosted custom collapses the vendor from a third party into a tool the firm owns, which is exactly the posture that fits most cleanly inside the privilege bubble.
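The cost side of that comparison is easy to check against a firm's own quotes. What follows is a minimal sketch, assuming a flat per-seat monthly price and a one-time build cost, and ignoring hosting, maintenance, and model-usage fees. The function names are mine, and the figures in the usage lines are placeholders, not market prices.

```python
def saas_cost(seats: int, price_per_seat_per_month: float, months: int) -> float:
    """Total subscription spend over the period, assuming a flat per-seat price."""
    return seats * price_per_seat_per_month * months


def break_even_months(build_cost: float, seats: int,
                      price_per_seat_per_month: float) -> float:
    """Months of subscription spend that equal a one-time build cost."""
    monthly = seats * price_per_seat_per_month
    if monthly <= 0:
        raise ValueError("seats and price must both be positive")
    return build_cost / monthly


# Placeholder inputs only; substitute the firm's actual quotes.
seats = 20
price = 500.0          # hypothetical per-seat monthly price
build = 200_000.0      # hypothetical one-time build cost

two_year_saas = saas_cost(seats, price, months=24)
months_to_break_even = break_even_months(build, seats, price)
```

With those placeholder inputs the break-even lands at 20 months. The useful output is not the number itself but the habit of running both quotes through the same arithmetic before the renewal conversation.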

What the state bars have actually said

Ethics guidance on generative AI has moved quickly since 2023. The ABA’s Formal Opinion 512, issued in the summer of 2024, lays out the baseline: lawyers using generative AI tools owe the same duties of competence, confidentiality, and informed client consent that they owe when using any other technology. That sounds unremarkable until you read the rest, which asks lawyers to understand the tool well enough to make those determinations.

Several state bars have gone further. California, Florida, New Jersey, and New York have each published generative-AI opinions or guidance that, in different language, arrive at a similar set of obligations: confirm how the vendor handles inputs, understand what data is retained and for how long, understand whether prompts or outputs train any shared model, and, in several jurisdictions, obtain informed client consent before sending confidential client material to a third-party AI service the firm does not control.

The common thread across the opinions is not the specific technical requirements. It is the burden of knowing what your vendor actually does with your data. A partner who signs a multi-tenant SaaS contract without reading the sub-processor list, the logging terms, and the support-access carve-outs has not satisfied that burden. A partner running a self-hosted custom module has an easier case to make, because the answer to “what does your vendor do with your data” is “there is no vendor.”

The due diligence checklist I run with partners

Before the firm signs with any vendor, including ours, run this list. If the vendor cannot answer every question in writing within two business days, the deal is not ready to close.

1. What is the data path? From the moment a document leaves the firm’s network to the moment a response comes back, list every system it touches. Vendor servers, sub-processors, foundation-model APIs, logging services, support databases. If the vendor cannot draw that diagram for you, they do not know their own product well enough to deserve your data.

2. What is retained, where, and for how long? Prompts, documents, outputs, metadata, and error logs are all separate questions. Some vendors delete documents in minutes and retain prompt text for thirty days for abuse monitoring. Some retain everything for ninety days by default. Every answer is legitimate. None of the answers should be a mystery.

3. Is firm data used, ever, to train any shared model? The answer should be an unambiguous no. If the answer is “no, unless you opt in,” confirm that new accounts are excluded from training by default and that the firm’s tenant is documented as opted out.

4. Who at the vendor can read firm data, under what circumstances? Support engineers, incident responders, security staff, legal staff. Multi-tenant SaaS products frequently have broader internal access than partners realize. Single-tenant hosted products can be configured so that vendor staff cannot read firm data without explicit firm approval. Custom self-hosted products have no vendor staff at all.

5. What is the sub-processor list, and does any sub-processor sit outside the United States? Data that crosses a border acquires a new regulatory overlay — GDPR, UK DPA, Canadian PIPEDA, and, for some clients, cross-border discovery implications. Most firms find this question surfaces at least one sub-processor the partner committee had not previously heard of.

6. What is the termination posture? On the day the contract ends, how quickly is firm data deleted from every system in the data path? Get the number in days, and get it in writing. A product that cannot tell you this cleanly is a product that has not planned for a graceful exit.

7. What happens in a breach or legal-process scenario? If the vendor is served a subpoena for firm data, what is the notice protocol? How quickly does the firm learn? Does the firm get a chance to move to quash before the vendor complies? Partners often assume the answer is favorable. In the actual contracts, it frequently is not.

8. Can the firm exit without losing the work product? If the firm leaves the vendor, do the prompts, configurations, and integration glue come with the firm or stay with the vendor? This is the question that quietly converts multi-year AI contracts into effective lock-in, and it is almost never answered in the first sales call.
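One way to keep the two-business-day rule honest is to track the eight questions as data. The sketch below is a hypothetical tracker, not part of any vendor's tooling: the question labels and the names `Answer`, `business_days_between`, `open_items`, and `deal_ready` are illustrative. It counts weekdays only, ignores holidays, and treats an answer that arrives on the second business day as on time.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Short labels for the eight checklist questions (labels are illustrative).
QUESTIONS = [
    "data_path",
    "retention",
    "training_use",
    "vendor_access",
    "sub_processors",
    "termination",
    "breach_and_legal_process",
    "exit_and_portability",
]

MAX_BUSINESS_DAYS = 2  # the checklist's deadline for written answers


@dataclass
class Answer:
    question: str
    asked_on: date
    answered_on: Optional[date] = None  # None means still unanswered
    in_writing: bool = False


def business_days_between(start: date, end: date) -> int:
    """Count weekdays after `start` up to and including `end` (holidays ignored)."""
    days = 0
    current = start
    while current < end:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday is 0, Friday is 4
            days += 1
    return days


def open_items(answers: list[Answer]) -> list[str]:
    """Questions that are missing, not in writing, or answered past the deadline."""
    by_question = {a.question: a for a in answers}
    problems = []
    for q in QUESTIONS:
        a = by_question.get(q)
        if a is None or a.answered_on is None or not a.in_writing:
            problems.append(q)
        elif business_days_between(a.asked_on, a.answered_on) > MAX_BUSINESS_DAYS:
            problems.append(q)
    return problems


def deal_ready(answers: list[Answer]) -> bool:
    """True only when every question has a timely written answer."""
    return not open_items(answers)
```

Calling `open_items` with whatever the vendor has sent so far returns the labels still outstanding, which is a convenient agenda for the next call.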

Why custom AI sidesteps most of this

The honest answer to most of the questions on the checklist, for a custom module the firm owns and hosts, is “not applicable.” Not because the questions are unimportant, but because the firm occupies the role that would otherwise belong to the vendor.

The data path is the firm’s own infrastructure, ending at whichever foundation-model provider the firm has chosen to use, under a direct contract with that provider that includes the strongest confidentiality and zero-retention terms that provider offers. Retention policy is the firm’s retention policy. Access control is the firm’s access control. The sub-processor list is short and deliberate. Termination is not a concern because there is nothing to terminate — the code is in the firm’s repo, the infra is in the firm’s cloud account, the model provider contract is the firm’s contract.
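For a firm-hosted module, the data-path inventory can be short enough to live in the repo next to the code. The sketch below assumes a hypothetical three-hop path. The `Hop` type, the hop names, and the retention figures are placeholders for illustration and do not describe any real provider's terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    name: str            # system the document touches
    operated_by: str     # "firm", or the name of an outside party
    retention_days: int  # 0 means nothing is kept once the request completes


# Hypothetical path for a firm-hosted module; every value is a placeholder.
DATA_PATH = [
    Hop("document-management system", "firm", 3650),
    Hop("custom AI module (firm cloud account)", "firm", 30),
    Hop("foundation-model API", "model provider", 0),
]


def outside_parties(path: list[Hop]) -> list[str]:
    """Every party other than the firm that touches the data, in path order."""
    seen: list[str] = []
    for hop in path:
        if hop.operated_by != "firm" and hop.operated_by not in seen:
            seen.append(hop.operated_by)
    return seen


def longest_outside_retention(path: list[Hop]) -> int:
    """Longest retention, in days, at any hop the firm does not operate."""
    return max(
        (h.retention_days for h in path if h.operated_by != "firm"),
        default=0,
    )
```

If `outside_parties` ever returns more names than the partner committee expects, that is the sub-processor conversation surfacing early rather than at renewal.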

This is not a magic trick. It is simply the original posture of firm-owned technology, applied to a new class of tools. Firms have been running trust-account software, document-management systems, and case-management platforms on owned infrastructure for decades. AI is not, on the privilege axis, a different category. It is only harder to reason about because the model that currently dominates the market is multi-tenant SaaS, which happens to look like an ordinary subscription.

What to do if you are already in a contract

Most firms reading this are not starting from scratch. They have one or two AI tools in production already, usually on a multi-tenant SaaS tier, often with a contract that was signed before the ethics opinions came out. The right move is not to rip everything out. The right move is to run the checklist against the existing contract, document the gaps in a short memo, and decide which workflows are acceptable on the current footing and which are not.

The workflows that tend to be acceptable on a multi-tenant tier are the ones that do not touch privileged material: internal knowledge search over non-privileged documents, drafting of non-client content, administrative workflows. The workflows that tend to need a stronger deployment model are the ones that touch client files, draft client-facing documents, process intake from represented parties, or generate anything that might end up on a privilege log. That is also, not coincidentally, the list of workflows where a custom build produces the largest return.
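That triage can be written down as a simple rule for the gap memo. The sketch below assumes a hypothetical `Workflow` record with four yes/no flags taken from the list in the paragraph above. The rule is a deliberate simplification for sorting an inventory, not an ethics determination.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    name: str
    touches_client_files: bool = False
    drafts_client_facing_documents: bool = False
    processes_intake_from_represented_parties: bool = False
    may_appear_on_privilege_log: bool = False

    def needs_stronger_model(self) -> bool:
        """True if any of the four flags is set."""
        return any((
            self.touches_client_files,
            self.drafts_client_facing_documents,
            self.processes_intake_from_represented_parties,
            self.may_appear_on_privilege_log,
        ))


def triage(workflows: list[Workflow]) -> dict[str, list[str]]:
    """Sort workflow names into the two buckets used in the gap memo."""
    result: dict[str, list[str]] = {
        "acceptable_on_multi_tenant": [],
        "needs_stronger_model": [],
    }
    for w in workflows:
        key = ("needs_stronger_model" if w.needs_stronger_model()
               else "acceptable_on_multi_tenant")
        result[key].append(w.name)
    return result
```

A workflow with no flags set, such as internal knowledge search over non-privileged documents, falls into the first bucket. Setting any single flag moves it to the second.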

In other words, the privilege analysis and the ROI analysis point in the same direction. The workflows that most justify owning the stack are the same workflows where ownership protects the firm’s oldest and most valuable contract — the one with the client.

The question worth asking in the next partner meeting

The next time a vendor pitches the firm, make the checklist the first slide, not the last. If the vendor cannot answer the eight questions cleanly, the deal is premature. If they can answer them cleanly but the answers push privileged work through a multi-tenant pipeline the firm does not control, the deal might still be right for some workflows and wrong for others.

And if the answers leave the partner group uneasy, that is worth hearing. Uneasy is the correct reaction to a data path nobody in the firm can draw. The fix is not to sign more aggressive indemnity language. The fix is to pick a deployment model where the privilege bubble is the firm’s bubble, and the vendor, if there is one, is a tool inside it rather than a party outside it.

If you want a second set of eyes on the checklist applied to a specific vendor, that is exactly the kind of thing a thirty-minute bottleneck audit can close out in one call. Bring the contract, bring the data-path question, and we will work through it together.