How KPMG Can Transform Audit Quality and Advisory Services with Agentic AI
Audit and advisory teams have spent the last few years experimenting with automation: rule-based bots, analytics scripts, and chat-style assistants that can summarize a file or draft an email. Useful, but limited.
The next step is already taking shape across the profession: Agentic AI in audit and advisory, where AI doesn’t just answer questions, but plans work, executes steps across systems, validates outputs, and produces documentation that stands up to scrutiny. Done well, this becomes a quality multiplier: more consistent procedures, tighter evidence chains, faster exception handling, and better visibility into what happened, when, and why.
Done poorly, it becomes a risk amplifier.
This guide explains what agentic AI is, why it matters for audit quality transformation and advisory workflow automation, where it fits (and where it doesn’t), and how KPMG could deploy it with the governance and controls expected in high-stakes, regulated environments.
What “Agentic AI” Means (and Why It’s Different From GenAI Chat)
Definition (simple and executive-friendly)
Agentic AI in audit and advisory refers to AI systems that can plan multi-step tasks, use approved tools (like audit platforms, ERPs, and document repositories), take actions under defined permissions, and self-check results before presenting them for human review and sign-off.
A chat assistant can explain a standard or summarize a document. An agentic system can run a workflow: gather the right files, pull the right data, perform tie-outs, flag exceptions, draft workpapers, and route items to the right reviewer with an audit trail.
Core capabilities that matter for audit/advisory
Agentic AI becomes relevant to AI in external audit and advisory work because it handles the messy middle: the orchestration work that usually consumes hours and introduces inconsistency.
Key capabilities include:
Task planning and decomposition
An agent can break “test revenue completeness” into specific steps: extract populations, define criteria, run queries, select samples, obtain support, evaluate exceptions, and document results.
Tool use across systems
Audit and advisory work is distributed across systems: ERP, consolidation tools, data catalogs, document management, email, ticketing, and audit workflow platforms. Agentic systems can interact with these tools through connectors and controlled actions.
Memory and context handling
Engagement context matters: prior-year issues, risk assessments, control deviations, known journal entry patterns, and client-specific policies. Agents can retain and apply context so the same issue doesn’t get rediscovered every year.
Verification loops and exception handling
Strong agentic AI includes checks like reconciliations, cross-validation across sources, threshold logic tied to materiality, and “stop conditions” when data is missing or conflicts appear.
This is where assurance automation becomes more than speed. It becomes a mechanism for consistency.
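To make the verification idea concrete, here is a minimal sketch of a stop-condition check, assuming illustrative parameter names (materiality, tolerance) rather than any particular firm methodology:

```python
# Minimal sketch of a verification loop with stop conditions.
# The thresholds and return shapes are illustrative assumptions.

def verify_tie_out(gl_total: float, support_total: float,
                   materiality: float, tolerance: float = 0.01) -> dict:
    """Compare a ledger total to supporting evidence and decide the next action."""
    difference = gl_total - support_total
    if abs(difference) <= tolerance:
        return {"status": "pass", "difference": difference}
    if abs(difference) < materiality:
        # Below materiality: flag an exception for human review, don't conclude.
        return {"status": "exception", "difference": difference,
                "action": "route_to_reviewer"}
    # Stop condition: the agent halts and escalates instead of forcing a conclusion.
    return {"status": "stop", "difference": difference,
            "action": "halt_and_escalate"}
```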
Where agentic AI can go wrong (and why governance matters)
Agentic AI in audit and advisory adds a new category of risk: it can take action, not just generate text. That changes the control conversation.
Common failure modes include:
Hallucinations become action-taking errors
A chat tool might produce an incorrect summary. An agent might route the wrong evidence, apply the wrong filter, or draft a conclusion that doesn’t match the underlying support.
Over-reliance and reviewer complacency
If the workflow looks polished, reviewers may stop challenging assumptions. That’s dangerous in an environment built on professional skepticism.
Confidentiality and data leakage
Audit and advisory teams handle sensitive client data. Without strict data access boundaries, retention rules, and logging, risk escalates quickly.
Regulatory scrutiny and explainability gaps
Regulators and internal quality teams will care less that an AI system is impressive and more that it is controlled, repeatable, and reviewable. AI governance and controls can’t be an afterthought.
The takeaway: agentic AI can improve audit quality transformation, but only when autonomy boundaries and human-in-the-loop review are designed in from day one.
The Audit Quality Levers Agentic AI Can Improve
Audit quality is often discussed abstractly. In practice, it’s shaped by a small number of operational levers: planning quality, evidence quality, documentation completeness, and the ability to detect exceptions without drowning in false positives. Agentic AI in audit and advisory touches each of these.
Planning and risk assessment
Planning is where inconsistency compounds. Two teams can look at the same client and produce very different risk narratives and procedures. Agentic AI can reduce that variability.
High-leverage applications include:
Ingesting prior-year artifacts at scale
Agents can pull prior-year workpapers, meeting minutes, key contracts, management representations, control narratives, and issue logs, then summarize what changed and what remains relevant.
Surfacing risk factors with evidence links
Instead of generic risks, an agent can propose engagement-specific hypotheses tied to actual documents, transactions, or changes in the business.
Generating tailored audit programs mapped to methodology
The goal is not to “let the AI decide the audit,” but to accelerate first drafts and ensure completeness against firm methodology and standards, with clear traceability.
This is one reason Agentic AI in audit and advisory is different from basic GenAI: it’s structured around inputs, checks, and outputs, not just language.
Better audit evidence collection and orchestration
A major driver of inefficiency and quality issues is evidence sprawl: files spread across email threads, portals, shared drives, and ad hoc folders. Agents can coordinate this work and enforce structure.
Examples:
PBC item requests, tracking, and validation
An agent can manage request lists, send reminders based on rules, flag missing metadata, and detect duplicates.
Automated tie-outs
Tie-outs are conceptually simple and operationally painful. Agentic systems can reconcile trial balance to lead schedules to financial statements, flag differences, and produce an exception package (see the sketch after this list).
Exception queues that preserve human attention
Instead of having teams re-check everything, agents can route only the exceptions to humans, along with supporting context and steps already performed.
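As a rough illustration of an automated tie-out, the sketch below reconciles a trial balance to a lead schedule with pandas. The column names and flat tolerance are assumptions for the example, not a prescribed approach:

```python
import pandas as pd

# Illustrative tie-out; the column names and flat tolerance are assumptions.
def tie_out(trial_balance: pd.DataFrame, lead_schedule: pd.DataFrame,
            tolerance: float = 1.00) -> pd.DataFrame:
    """Reconcile trial balance to lead schedule by account; return exceptions only."""
    merged = trial_balance.merge(lead_schedule, on="account", how="outer",
                                 suffixes=("_tb", "_lead"))
    merged = merged.fillna({"balance_tb": 0.0, "balance_lead": 0.0})
    merged["difference"] = merged["balance_tb"] - merged["balance_lead"]
    # Humans see only differences above tolerance, not every matched line.
    return merged[merged["difference"].abs() > tolerance]

tb = pd.DataFrame({"account": ["4000", "4100"], "balance": [1_250_000.0, 430_000.0]})
lead = pd.DataFrame({"account": ["4000", "4100"], "balance": [1_250_000.0, 429_250.0]})
exceptions = tie_out(tb, lead)  # one exception: account 4100, difference 750.00
```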
When designed correctly, this raises audit quality by focusing human effort where judgment is actually needed.
Consistency and completeness of documentation
Audit documentation automation is often framed as “draft the workpaper.” The better framing is “standardize what good documentation includes, every time.”
A well-designed agent can produce workpapers that consistently capture the following (sketched as a simple schema after this list):
Purpose of the procedure and the assertion addressed
Population definition and data source
Steps performed and parameters used (filters, thresholds, time periods)
Evidence references and where they are stored
Results, exceptions, and how exceptions were resolved
Reviewer notes and a resolution trail
Timing, performer, and review status
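One way to enforce that consistency is to treat the workpaper as a structured record. The sketch below is a hypothetical schema, not a firm template; every field name is an assumption:

```python
from dataclasses import dataclass, field

# Hypothetical workpaper schema; field names are assumptions, not a firm template.
@dataclass
class Workpaper:
    purpose: str                  # procedure objective and assertion addressed
    population: str               # population definition and data source
    steps: list[str]              # steps performed, with parameters used
    evidence_refs: list[str]      # references to where evidence is stored
    results: str                  # results, exceptions, and how they were resolved
    reviewer_notes: list[str] = field(default_factory=list)
    performed_by: str = ""
    review_status: str = "draft"  # draft -> in_review -> approved
```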
This doesn’t replace judgment. It reduces the risk that documentation quality varies by seniority, deadline pressure, or team turnover.
Continuous auditing and monitoring (where appropriate)
Continuous auditing is attractive, but it’s not a universal fit. Not every audit procedure should run “always on,” and not every client environment supports it.
Where it can work:
Near-real-time monitoring of specific populations
For example, journal entries with certain characteristics, changes to key master data fields, or control exceptions in a workflow system.
Alerts tied to thresholds and materiality logic
A good continuous auditing setup doesn’t generate noise. It uses logic aligned to risk and materiality and escalates only what matters (see the sketch below).
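A minimal sketch of such a risk-aligned alert rule, with illustrative thresholds and field names, might look like this:

```python
# Illustrative journal-entry alert rule; thresholds and field names are assumptions.
def should_alert(entry: dict, materiality: float, sensitive_accounts: set) -> bool:
    """Escalate only entries that meet risk-aligned criteria, not every anomaly."""
    late_night = entry["posted_hour"] >= 22 or entry["posted_hour"] < 5
    large = abs(entry["amount"]) >= materiality
    sensitive = entry["account"] in sensitive_accounts
    # Require size plus a risk condition, so the queue stays quiet.
    return large and (late_night or sensitive)
```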
Guardrails matter
Continuous auditing can create reliance questions, scope creep, and expectations gaps. Agentic AI in audit and advisory should support defined procedures, not silently expand them.
Advisory Services: Where Agentic AI Changes Delivery and Value
Advisory teams are under pressure to move faster without diluting rigor. Agentic AI helps by compressing time spent on discovery, documentation, and coordination, which often consumes more effort than the “analysis” itself.
Faster diagnostics and current-state assessments
Current-state work typically involves collecting artifacts, conducting interviews, summarizing gaps, and aligning findings to a framework. Agents can accelerate each step.
Practical applications:
Artifact gathering and normalization
Agents can pull policies, process maps, control matrices, tickets, logs, and prior assessments, then organize them into a consistent structure.
Gap summaries against frameworks
Whether the engagement is aligned to COSO, ISO, NIST, or internal frameworks, agents can draft initial gap assessments that consultants refine.
Interview guide generation
Based on artifacts and observed gaps, an agent can propose stakeholder-specific questions and follow-up prompts.
This is advisory workflow automation that preserves consultant time for judgment, prioritization, and stakeholder alignment.
Proposal-to-delivery acceleration (without sacrificing quality)
Proposal work often repeats: assumptions, milestones, deliverables, RACI outlines, and risk language. Agentic AI can standardize and speed up drafts while improving traceability from day one.
Useful patterns:
Scoping agents that draft structures, not promises
Agents can propose engagement structures with clear assumptions and dependencies, which teams validate and finalize.
Research agents with enforced sourcing discipline
Instead of “summarize the latest guidance,” an agent can retrieve and compile information with strict source constraints and a clear separation between facts and interpretations.
Delivery agents that maintain traceability
As findings evolve, agents can keep the chain intact: finding to evidence, evidence to recommendation, recommendation to implementation tasks.
That traceability is increasingly important for risk and compliance AI engagements, where clients and regulators expect evidence-backed conclusions.
High-impact advisory use cases
Agentic AI in audit and advisory can support multiple advisory lines without changing the core concept: structured workflows, tool use, verification, and documentation.
High-impact areas include:
Controls transformation and SOX optimization
Finance transformation: close optimization, reconciliations, working capital improvements
Cyber risk assessments with remediation tracking
Regulatory compliance readiness: documentation, evidence collection, reporting packages
Practical Use Cases for KPMG (Examples + Mini Workflows)
Below are practical “use-case cards” to make agentic AI tangible. Each use case includes a goal, inputs, tools, outputs, and controls. These examples are deliberately designed around clear inputs and outputs because that’s where agent initiatives succeed or fail.
Use case #1 — PBC management agent
Goal
Improve evidence collection speed and completeness while reducing follow-up cycles.
Inputs
PBC list, engagement timeline, client contacts, prior-year request history, received files.
Tools
Email and calendar, client portal, document management system, audit workflow platform.
Outputs
PBC status dashboard, reminders and escalations, audit-ready evidence packages with standardized naming and metadata.
Controls
Role-based access to client files
Audit logs for every request, reminder, and file move
Validation checks for completeness (required fields, file type, period coverage; sketched below)
Human approval gates before evidence is marked “accepted”
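To show what the completeness validation in the controls above could look like, here is a minimal sketch; the required fields, allowed file types, and rules are assumptions for illustration:

```python
# Illustrative completeness check for a received PBC file; all rules are assumptions.
REQUIRED_FIELDS = {"client", "period", "request_id", "prepared_by"}
ALLOWED_TYPES = (".xlsx", ".csv", ".pdf")

def validate_submission(metadata: dict, filename: str, expected_period: str) -> list:
    """Return validation failures; an empty list means ready for human acceptance."""
    failures = [f"missing field: {f}" for f in REQUIRED_FIELDS - metadata.keys()]
    if not filename.lower().endswith(ALLOWED_TYPES):
        failures.append(f"unsupported file type: {filename}")
    if metadata.get("period") != expected_period:
        failures.append("period does not match the request")
    return failures
```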
Use case #2 — Journal entry testing agent
Goal
Increase coverage and consistency in journal entry testing and exception documentation.
Inputs
Journal entry population, chart of accounts, user access lists, posting policies, materiality thresholds.
Tools
ERP queries, analytics environment, audit documentation system.
Outputs
Explainable anomaly flags, sampling suggestions, exception list with rationale and linkage to supporting data.
Controls
Pre-approved query templates with change control
Threshold logic tied to engagement parameters
Validation steps that reconcile totals to source reports (see the sketch below)
Mandatory human-in-the-loop review for any conclusion language
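As an example of the reconciliation control above, here is a sketch of a population check that halts the workflow before any testing begins, assuming an illustrative column name and control total:

```python
import pandas as pd

# Illustrative population check; the column name and control total are assumptions.
def reconcile_population(journal_entries: pd.DataFrame,
                         gl_control_total: float) -> None:
    """Halt the workflow if the extracted population does not tie to the GL."""
    extracted_total = journal_entries["amount"].sum()
    if round(extracted_total - gl_control_total, 2) != 0:
        raise RuntimeError(
            f"Population does not reconcile: extracted {extracted_total:,.2f} "
            f"vs control total {gl_control_total:,.2f}; halting before testing."
        )
```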
Use case #3 — Contract review agent for revenue/lease impacts
Goal
Speed up term extraction and route issues to the right specialists.
Inputs
Executed contracts, amendments, pricing schedules, relevant accounting policy positions.
Tools
Document processing and OCR, clause extraction, issue tracker, document repository.
Outputs
Key term extraction, clause-level citations, issue log, summary for specialist review.
Controls
Source-linked outputs only, no unsupported assertions
Confidence scoring and “needs review” flags for ambiguous clauses
Retention and confidentiality controls aligned to client requirements
Human approval before policy conclusions are added to workpapers
Use case #4 — Controls walkthrough and narrative drafting agent
Goal
Standardize control narratives and reduce rework in documentation.
Inputs
Walkthrough notes, process recordings/transcripts, control matrices, risk register.
Tools
Transcription tools, document templates, audit workflow system.
Outputs
Draft narratives, identified control points, mapping of controls to risks, draft questions for follow-up.
Controls
Template enforcement to match firm methodology
Visible evidence links from notes to narrative statements
Reviewer sign-off required before narratives are finalized
Change tracking to show what was modified by humans and why
Use case #5 — Advisory PMO agent
Goal
Reduce administrative overhead and increase delivery transparency.
Inputs
Project plan, meeting notes, decision logs, issue lists, status updates from workstreams.
Tools
Project management tools, email/calendar, document repository.
Outputs
Weekly status packs, RAID logs, action item tracking, escalation summaries.
Controls
Approval workflow for client-facing status reports
Permissions aligned to workstream roles
Audit log of changes to RAID items
Clear labeling of draft vs. approved outputs
These examples show why Agentic AI in audit and advisory is best implemented as a set of targeted agents, not a single “do everything” system. Smaller workflows distribute risk, make evaluation easier, and create reusable building blocks across service lines.
Operating Model: How KPMG Could Deploy Agentic AI Responsibly
The difference between a pilot and a durable program is the operating model: who owns it, how it’s controlled, how it’s evaluated, and how humans stay accountable.
Human-in-the-loop design (review, approval, escalation)
Human-in-the-loop review is not a box to check. It’s the core control surface for agentic systems.
A practical approach is to define three layers of agent behavior:
Drafting actions (low risk)
Summaries, first-draft workpapers, checklists, and request emails prepared for review.
Execution actions (medium risk)
Running approved queries, performing tie-outs, compiling evidence packages, creating exception lists.
Decision actions (high risk)
Conclusions, sign-offs, scope changes, and anything that would alter audit strategy or advisory recommendations.
In most assurance contexts, the third layer should remain human-owned. Agentic AI can tee up decisions, but not make them.
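One way to encode those three layers is an explicit autonomy policy in which decision-tier actions can only ever be routed to a human. A minimal sketch, with illustrative names:

```python
from enum import Enum

# Minimal sketch of a three-tier autonomy policy; tiers mirror the layers above.
class ActionTier(Enum):
    DRAFT = "draft"      # low risk: prepare outputs for review
    EXECUTE = "execute"  # medium risk: approved queries, tie-outs, evidence packages
    DECIDE = "decide"    # high risk: conclusions, sign-offs, scope changes

def dispatch(action: str, tier: ActionTier) -> dict:
    if tier is ActionTier.DECIDE:
        # Decision-tier actions are never automated; they route to a human owner.
        return {"action": action, "status": "queued_for_human_decision"}
    if tier is ActionTier.EXECUTE:
        return {"action": action, "status": "executed", "logged": True}
    return {"action": action, "status": "draft_ready_for_review"}
```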
Reviewer experience matters too. Reviewers should see:
What data was used
What steps were executed
What checks were performed
What sources support each claim
What remains uncertain
That visibility reduces reviewer complacency and supports defensible accountability.
Governance, compliance, and risk controls
AI governance and controls for agentic AI in audit and advisory should look familiar to risk leaders: policies, technical controls, and monitoring.
Key control categories:
Policy controls
Client confidentiality rules
Data retention requirements
Approved use cases and prohibited data types
Engagement-level approvals and exception processes
Technical controls
Role-based access and least privilege
Encryption in transit and at rest
Audit logs for every action and output
Environment separation between sandbox and production
Model controls (aligned to model risk management)
Evaluation before release and after changes
Bias and performance testing where relevant
Monitoring for drift, regressions, and rising error rates
Versioning and change control for prompts, tools, and workflows
For firms with formal model risk management (MRM) practices, agentic AI should be treated like any other system that influences decisions and documentation: tested, monitored, and governed.
Tooling architecture (high level)
At a high level, agentic AI in audit and advisory typically needs:
An orchestrator layer to manage workflows and tool calls
Secure connectors to systems like ERP, document repositories, and audit platforms
A retrieval layer for policies, methodologies, and engagement artifacts with strong access controls
A logging and analytics layer for monitoring, audits, and investigations
Versioning and change control for workflow definitions
Architecture matters because audit and advisory workflows cross many systems. Without a secure, cross-platform approach, teams end up with fragile point solutions that don’t scale.
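A highly simplified sketch of how those layers could fit together, with every interface and name assumed for illustration:

```python
from typing import Protocol

# Highly simplified layering; every interface and name here is illustrative.
class Connector(Protocol):
    def fetch(self, query: str) -> list: ...

class AuditLog(Protocol):
    def record(self, actor: str, action: str, detail: dict) -> None: ...

class Orchestrator:
    """Routes workflow steps to system connectors and logs every action taken."""
    def __init__(self, connectors: dict, log: AuditLog, workflow_version: str):
        self.connectors = connectors              # e.g. {"erp": ..., "docs": ...}
        self.log = log                            # logging layer for monitoring/audits
        self.workflow_version = workflow_version  # change control for workflows

    def run_step(self, system: str, query: str, actor: str) -> list:
        result = self.connectors[system].fetch(query)
        self.log.record(actor, f"fetch:{system}", {
            "query": query, "rows": len(result),
            "workflow_version": self.workflow_version,
        })
        return result
```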
Change management and skills
Scaling Agentic AI in audit and advisory will require new roles and new habits.
Common roles that emerge:
AI engagement lead to manage use-case selection, setup, and stakeholder alignment
AI quality reviewer focused on validation steps, documentation integrity, and exception handling
Agent librarian or workflow owner to manage templates, updates, and reuse
Training should emphasize:
Professional skepticism in an AI-assisted workflow
Validation steps and when to escalate
Documentation standards and what must be explicitly supported
How to interpret confidence, exceptions, and tool logs
Incentives matter as well. If teams are rewarded only for speed, quality will degrade. Strong programs measure quality and risk outcomes alongside efficiency.
Measuring Success: Audit Quality and Advisory KPIs That Matter
Agentic AI programs fail when success metrics are vague. Success measures for Agentic AI in audit and advisory should span quality, efficiency, and risk.
Audit quality metrics
Reduced rework and fewer review notes
Increased consistency in documentation across teams and engagements
Faster evidence retrieval and tie-outs with fewer unresolved differences
Improved anomaly detection coverage, tracked alongside false-positive rates
Advisory delivery metrics
Cycle time reduction for assessments and deliverables
Improved client satisfaction driven by clearer, more traceable recommendations
Traceability score: how consistently the chain from finding to evidence to recommendation to action is maintained
Risk metrics (to prove “safe acceleration”)
Data access exceptions and permission violations
Unsupported-claim rate, measured through sampling audits of agent outputs
Override frequency: how often humans reject or modify agent outputs, and why
Escalation outcomes: time to resolution and root cause patterns
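One of these risk metrics, the unsupported-claim rate, can be estimated with a simple sampling audit. A minimal sketch, assuming a caller-supplied has_support check for evidence links:

```python
import random

# Illustrative sampling audit; the output format and has_support check are assumptions.
def sample_unsupported_rate(outputs: list, sample_size: int, has_support) -> float:
    """Sample agent outputs and return the share lacking evidence links."""
    sample = random.sample(outputs, min(sample_size, len(outputs)))
    unsupported = sum(1 for o in sample if not has_support(o))
    return unsupported / len(sample) if sample else 0.0
```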
If these metrics are visible, teams can scale with confidence. If they’re invisible, risk accumulates silently.
Roadmap for Adoption (0–90 Days → 12 Months)
A realistic roadmap treats Agentic AI in audit and advisory as a sequence: start with structured, low-risk workflows, then integrate deeper, then standardize and scale.
Phase 1 (0–90 days): Low-risk pilots
Select 2–3 use cases with:
Clear inputs and outputs
High documentation value
Minimal autonomous action risk
Accessible data sources
Then set up an evaluation harness:
Define baseline cycle time, error rates, and rework
Create test datasets and edge cases
Establish approval gates and logging requirements
Many programs stall because they start with a complex, high-autonomy use case. Early wins should be boring, repeatable, and measurable.
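An evaluation harness does not need to be elaborate to be useful. A minimal sketch, assuming hypothetical test cases with known expected outputs:

```python
# Minimal evaluation harness; the case format and metric are assumptions.
def evaluate(agent_fn, test_cases: list) -> dict:
    """Run the agent on known cases and report an error rate against expectations."""
    errors = sum(1 for case in test_cases
                 if agent_fn(case["input"]) != case["expected"])
    return {"cases": len(test_cases),
            "error_rate": errors / len(test_cases) if test_cases else 0.0}

# Run the same cases before every release and after every change,
# and compare results against the pilot baseline before expanding scope.
```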
Phase 2 (3–6 months): Integrate with core platforms
Once the workflow patterns are proven, integrate with:
Audit workflow tools
Document management systems
Data catalogs and governed data sources
Then expand to multi-agent patterns (a minimal loop is sketched after this list):
Planner agent to decompose tasks
Executor agent to run tool actions
Validator agent to cross-check outputs
Documenter agent to generate workpapers and evidence packages
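A minimal sketch of how those four roles could compose into one loop, with all interfaces assumed for illustration:

```python
# Minimal planner/executor/validator/documenter loop; all interfaces are illustrative.
def run_engagement_task(task: str, planner, executor, validator, documenter):
    steps = planner(task)                 # decompose the task into discrete steps
    results, exceptions = [], []
    for step in steps:
        output = executor(step)           # run an approved tool action
        issues = validator(step, output)  # cross-check the output against sources
        record = {"step": step, "output": output, "issues": issues}
        (exceptions if issues else results).append(record)
    # The documenter drafts; humans still review and approve before sign-off.
    return documenter(task, results, exceptions)
```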
This is where assurance automation becomes operationally meaningful, not just experimental.
Phase 3 (6–12 months): Scale and standardize
Scaling requires standardization:
Reusable agent templates tied to playbooks
Formal controls testing of the AI system itself
Periodic internal audits of logs, access, and output quality
Rollout governance across service lines and geographies
By this point, the program should resemble a product: versioned, monitored, controlled, and continuously improved.
Conclusion: The Responsible Path to Higher Quality
Agentic AI in audit and advisory is not about replacing auditors or consultants. It’s about building a methodology layer that improves consistency, traceability, and execution across complex workflows.
For KPMG, the highest-impact path is straightforward:
Build safe architecture that supports cross-platform workflows and strong access control
Design human-in-the-loop review so accountability stays clear
Measure outcomes across quality, efficiency, and risk so acceleration remains defensible
The firms that get this right won’t just move faster. They’ll produce more consistent work, with clearer evidence trails and stronger governance, even as complexity grows.
To see how enterprise teams build governed AI agents that connect across systems and support real workflows, book a StackAI demo: https://www.stack-ai.com/demo
