How Con Edison Can Transform Utility Grid Management and Customer Services with Agentic AI

StackAI

AI Agents for the Enterprise

Agentic AI for utility grid management is quickly moving from an innovation buzzword to a practical operating model for utilities that need to improve reliability, accelerate outage response, and rebuild customer trust during high-stakes events. For Con Edison, the opportunity is especially clear: a dense urban service territory, complex infrastructure, increasingly volatile weather, and rising expectations for real-time updates create the perfect environment for AI agents that can coordinate work across systems, teams, and channels.


This isn’t about replacing grid operators, dispatchers, or customer care teams. It’s about reducing the friction that slows them down: searching across SCADA alarms, OMS tickets, AMI signals, GIS layers, asset history, playbooks, and customer records just to answer basic questions like “What’s happening?”, “What should we do next?”, and “Who needs to know?”


Agentic AI in utilities can help Con Edison move faster with better consistency by turning insights into actions, under strict human control and operational guardrails. Done well, it becomes a shared orchestration layer across grid operations automation and customer experience: one system that understands the situation, follows approved procedures, and executes repeatable steps at machine speed.


What “Agentic AI” Means for a Utility (and Why It’s Different)

Definition

Agentic AI for utility grid management refers to goal-driven AI systems that can plan tasks, make decisions within defined limits, and take actions across utility tools and workflows. Unlike a chatbot that only answers questions, an AI agent can read telemetry and tickets, follow operating procedures, call approved system APIs, generate drafts and recommendations, and route work for human approval when risk is high.


The key is that the agent is connected to real workflows and governed by strict controls. It doesn’t “freestyle” grid operations. It performs structured work: gather context, apply rules and policies, propose a next step, and either execute or escalate depending on risk.
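The gather, apply, propose, then execute-or-escalate pattern described above can be sketched in a few lines. This is a minimal illustration, not a production design: the context keys (`missing_data`, `requires_switching`), the action names, and the rule order are all hypothetical placeholders.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Proposal:
    action: str
    risk: Risk
    rationale: str


def propose_next_step(context: dict) -> Proposal:
    """Apply simple, hypothetical rules to the context the agent gathered."""
    if context.get("missing_data"):
        return Proposal("escalate_to_operator", Risk.HIGH, "context incomplete")
    if context.get("requires_switching"):
        return Proposal("draft_switching_plan", Risk.MEDIUM, "restoration path available")
    return Proposal("send_status_summary", Risk.LOW, "informational only")


def handle(context: dict, auto_execute_max: Risk = Risk.LOW) -> str:
    """Execute only proposals at or below the allowed risk; escalate the rest."""
    proposal = propose_next_step(context)
    if proposal.risk.value <= auto_execute_max.value:
        return f"executed:{proposal.action}"
    return f"escalated:{proposal.action}"
```

With the default threshold, only the informational summary runs without a human; anything involving switching or incomplete data is routed for review.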


Here’s a practical way to distinguish the common approaches:


  • Traditional automation (rules/RPA): Executes fixed steps when conditions match, but can’t adapt well when inputs are messy or incomplete.

  • Predictive ML: Forecasts outcomes (failures, loads, churn), but usually stops short of doing the next operational step.

  • Chatbots: Provide conversation and knowledge retrieval, but typically don’t connect to tools to complete work.

  • Agentic AI in utilities: Combines understanding, planning, tool use, and controlled action to complete multi-step workflows end to end.


Why agentic AI is emerging now

Utilities have long used automation and analytics, but today’s constraints are different. A modern grid is more dynamic, and the work is more cross-functional. Agentic AI is emerging because the ingredients finally line up:


  • Better integration options: More mature APIs and event streams make it easier to connect OMS, ADMS/DMS, AMI/MDMS, CRM, and work management.

  • Stronger orchestration frameworks: Teams can build multi-step workflows with logging, approvals, retries, and versioned procedures.

  • Operational complexity is increasing: DER adoption, EV charging growth, and aging assets add new variables to every decision.

  • Customer expectations have shifted: People want proactive updates, self-service resolution, and consistent answers across channels, especially during outages.


For Con Edison, these forces converge in daily grid and service operations. Agentic AI for utility grid management becomes a way to standardize execution without forcing every team into yet another manual process.


Con Edison’s High-Impact Use Cases for Grid Management

Agentic AI for utility grid management becomes most valuable when it touches high-volume, high-variability workflows where decisions depend on multiple systems and time matters. The goal is not to create a single “super-agent,” but a set of focused AI agents for outage management, maintenance, distribution optimization, and storm response, each with clear inputs, outputs, and guardrails.


Outage detection, triage, and restoration orchestration

Outage response is a coordination problem. Signals arrive from everywhere: AMI last-gasp messages, SCADA alarms, customer calls, IVR selections, mobile app reports, OMS tickets, and weather feeds. Humans can manage this, but the first 15 to 30 minutes are where delays cascade.


An AI agent can continuously monitor these feeds and do the repetitive work at speed:


  • Cluster signals into probable feeder- or transformer-level events

  • Cross-check with known device topology from GIS and switching constraints from ADMS/DMS

  • Prioritize incidents based on impacted customers, critical facilities, and safety flags

  • Propose switching plans for human review, especially for FLISR-style restoration options

  • Update estimated time of restoration (ETR) confidence as milestones are completed


This is where AI agents for outage management can improve outcomes without taking unsafe actions. The agent prepares the decision package; operators approve the steps.
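As a rough sketch of the clustering and prioritization steps listed above, the example below groups incoming signals by feeder and ranks the resulting clusters. The `Signal` fields, the feeder identifiers, and the ranking rule (critical facilities first, then customer count) are assumptions for illustration; a real implementation would rely on GIS topology and the utility's own prioritization policy.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    source: str      # e.g. "AMI", "SCADA", "CALL"
    feeder_id: str   # hypothetical feeder identifier resolved from topology
    customers: int   # customers associated with the reporting device
    critical: bool   # True if a critical facility is affected


def cluster_by_feeder(signals):
    """Group raw signals into one probable event per feeder."""
    clusters = defaultdict(list)
    for s in signals:
        clusters[s.feeder_id].append(s)
    return dict(clusters)


def prioritize(clusters):
    """Rank feeders: any critical facility first, then by summed customer count.

    Summing customers across signals is a crude proxy and may double-count.
    """
    def score(item):
        _, sigs = item
        return (any(s.critical for s in sigs), sum(s.customers for s in sigs))

    ranked = sorted(clusters.items(), key=score, reverse=True)
    return [feeder for feeder, _ in ranked]
```

The output is only an ordered list for operators to review; nothing in this sketch takes a switching action.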


KPIs that can move with this approach include:


  • Faster fault location and triage time

  • Better ETR accuracy and consistency across channels

  • Reduced time to restoration when safe switching alternatives exist

  • Improved SAIDI/SAIFI performance where process delays are reduced


Predictive maintenance that turns insights into work orders

Most utilities already experiment with predictive maintenance, but many programs stall at the “nice dashboard” stage. The hard part is converting predictions into scheduled work that fits constraints: crew availability, parts, access windows, load conditions, and safety requirements.


This is a natural fit for agentic AI because agents can push the workflow forward.


Inputs the agent can use:


  • Asset health indices and failure history

  • Inspection notes and defect codes

  • Thermal imaging and condition monitoring alerts

  • Vegetation management data

  • Work backlog, crew calendars, and outage windows


Outputs the agent can produce:


  • A prioritized maintenance backlog with risk scoring and rationale

  • Draft work orders with recommended scope, parts/tools checklist, and safety notes

  • Suggested scheduling windows that consider load, weather, and operational constraints

  • Escalations when data is missing or risk crosses a threshold


Instead of asking planners to “go look at the model,” the agent delivers ready-to-review work packages and keeps the backlog current as new data arrives.
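One way to picture the step from prediction to work package is a scored backlog that either drafts a work order or escalates. The scoring formula, the 0-to-1 health index scale, and the escalation threshold below are invented for illustration and are not drawn from any particular asset management system.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    asset_id: str
    health_index: float    # hypothetical scale: 0.0 (failed) to 1.0 (like new)
    customers_served: int
    open_defects: int


def risk_score(a: Asset) -> float:
    """Illustrative score: likelihood proxy multiplied by customers served."""
    likelihood = (1.0 - a.health_index) + 0.1 * a.open_defects
    return round(likelihood * a.customers_served, 1)


def build_backlog(assets, escalate_above=500.0):
    """Return assets ordered by risk, each with a status and a short rationale."""
    backlog = []
    for a in sorted(assets, key=risk_score, reverse=True):
        score = risk_score(a)
        backlog.append({
            "asset_id": a.asset_id,
            "risk_score": score,
            "status": "escalate" if score > escalate_above else "draft_work_order",
            "rationale": f"health={a.health_index}, defects={a.open_defects}",
        })
    return backlog
```

Keeping the rationale next to the score is what lets a planner review the package rather than re-derive it from the model.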


Distribution optimization with ADMS/DMS plus DER coordination

As DER penetration increases, distribution operations become more variable. Voltage and load issues can change quickly, and optimization needs to account for local conditions and operating policies. Agentic AI can support ADMS optimization without bypassing safety rules.


Examples of agentic support in distribution management system (DMS) workflows include:


  • Assisting with voltage/VAR optimization setpoints by compiling constraints, recent events, and device statuses

  • Supporting FLISR by generating alternative restoration paths and highlighting constraint conflicts

  • Recommending peak load mitigation actions when constraints emerge (with operator approval)


DER orchestration AI becomes increasingly relevant as solar, storage, and EV charging change net load patterns. Even when a utility is not directly controlling customer DERs, an agent can coordinate available programs and operational levers:


  • Recommend demand response events or targeted outreach where programs allow

  • Identify feeders with recurring overload risk and propose staged mitigation plans

  • Suggest EV charging demand shaping strategies through partnerships or customer programs


The value is less about “AI makes the grid optimal” and more about “AI reduces the time to produce a safe, policy-compliant plan.”


Storm response automation (before, during, after)

Storm operations are where process maturity is tested. The challenge is volume, speed, and uncertainty. AI for utility storm response works best as a set of focused agents that support each phase.


Pre-storm:


  • Identify circuits with higher vulnerability based on historical performance and forecasted conditions

  • Recommend staging locations for crews and materials based on predicted impact zones

  • Draft pre-event communications and preparedness reminders with approved language


During storm:


  • Continuously reconcile OMS, AMI signals, and customer reports to refine incident clusters

  • Recommend dynamic crew reallocation based on restoration progress and priorities

  • Generate situation reports at defined intervals without manual cut-and-paste


Post-storm:


  • Automate reporting packages by pulling timelines, actions taken, and outcomes

  • Support root-cause analysis by summarizing recurring failure modes

  • Propose resilience investments and operational changes based on patterns


The distinction here is important: an agent doesn’t need to “run restoration.” It needs to reduce the coordination overhead so humans can focus on judgment calls and safety.
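The situation-report item in the during-storm list is a good example of coordination overhead that is easy to automate. The sketch below assumes incidents arrive as simple dictionaries with hypothetical `id`, `customers`, and `status` fields; the real OMS schema would differ.

```python
def situation_report(incidents, as_of):
    """Build a plain-text report from a list of incident dictionaries."""
    active = [i for i in incidents if i["status"] != "restored"]
    restored = [i for i in incidents if i["status"] == "restored"]
    lines = [
        f"Situation report as of {as_of}",
        f"Active incidents: {len(active)}",
        f"Customers out: {sum(i['customers'] for i in active)}",
        f"Customers restored: {sum(i['customers'] for i in restored)}",
    ]
    # List active incidents, largest customer impact first.
    for i in sorted(active, key=lambda i: i["customers"], reverse=True):
        lines.append(f"- {i['id']}: {i['customers']} customers, status {i['status']}")
    return "\n".join(lines)
```

Running this on a schedule would replace the manual cut-and-paste step while leaving interpretation to the storm team.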


Top 5 agentic AI grid use cases (quick list):

  1. AI agents for outage management: detection, clustering, and triage support

  2. ETR and customer update orchestration with consistent status logic

  3. Predictive maintenance that generates work orders

  4. ADMS/DMS decision support for switching and constraint management

  5. Storm response automation: situational reporting and resource coordination


Transforming Customer Service with Agentic AI (Beyond Chatbots)

Customer experience improvements often start with a chatbot and end with frustration because the bot can’t do anything. Agentic AI changes this dynamic by connecting customer intent to real workflow actions, while still protecting sensitive data and enforcing policy constraints.


Proactive, personalized outage communications

The fastest way to reduce inbound volume and frustration is to reduce uncertainty. When customers don’t know what’s happening, they call.


An agent can trigger proactive updates via SMS, email, app notifications, or automated voice based on OMS status, geography, and customer preferences. The best version is not “one message to everyone,” but neighborhood- and incident-specific status:


  • ETR ranges with confidence, not false precision

  • Clear explanations of what has been completed (assessment, switching, repairs)

  • Safety guidance relevant to conditions (downed wire warnings, generator safety)

  • Links to local resources where appropriate


This is where agentic AI in utilities improves trust: consistent, timely updates that don’t contradict what customers see on the outage map or hear from a live agent.
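To make “ETR ranges with confidence, not false precision” concrete, the sketch below widens the communicated restoration window as confidence drops. The confidence thresholds, window widths, and message wording are illustrative assumptions; approved templates and calibrated confidence values would come from the utility.

```python
from datetime import datetime, timedelta


def etr_window(estimate: datetime, confidence: float):
    """Return (start, end) around the estimate; lower confidence, wider window."""
    if confidence >= 0.8:
        half_width = timedelta(minutes=30)
    elif confidence >= 0.5:
        half_width = timedelta(hours=1)
    else:
        half_width = timedelta(hours=2)
    return estimate - half_width, estimate + half_width


def outage_message(area: str, stage: str, estimate: datetime, confidence: float) -> str:
    """Compose an incident-specific update with a restoration range."""
    start, end = etr_window(estimate, confidence)
    return (
        f"{area}: crews have completed {stage}. "
        f"Estimated restoration between {start:%H:%M} and {end:%H:%M}."
    )
```

Because every channel would call the same function, the outage map, SMS, and live agents would all quote the same range.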


“Resolve my issue” self-service agent (billing, service, field)

A customer self-service virtual agent becomes genuinely useful when it can take approved actions. For Con Edison, that means tying together CIS/billing, CRM, service orders, appointment scheduling, and field dispatch workflows.


Actions an agent can support, under guardrails:


  • Start, stop, or move service with identity verification and policy checks

  • Payment arrangement workflows, eligibility screening, and next-step instructions

  • Dispute triage: gather required information, categorize the case, and open the right ticket

  • Schedule appointments and coordinate field visits, including confirmations and reminders


Even when the agent hands off to a human, it can reduce transfers by summarizing intent, relevant account context, prior actions, and what the customer has already tried. That’s where utility contact center automation programs see the most immediate operational gains.


Step-by-step: how an AI agent resolves a billing issue

  1. Authenticate the customer and confirm account scope (single or multiple locations)

  2. Identify the issue type (high bill, payment posting, rate plan confusion, disputed charge)

  3. Pull relevant data: recent bills, meter reads, usage patterns, payments, adjustments

  4. Check for known events: estimated reads, service changes, outages, billing schedule shifts

  5. Explain the situation in plain language and present options allowed by policy

  6. If needed, create a case with the correct category and attach a summary plus evidence

  7. Offer next actions: payment plan, review request, appointment, or escalation path

  8. Confirm what will happen next and when the customer will hear back

  9. Log the interaction for audit and quality review
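Steps 3 through 6 above can be sketched as a small rule check over the data the agent has pulled. The field names (`estimated_read`, `days_in_period`, `usage_kwh`, `prior_usage_kwh`) and the thresholds are hypothetical; actual billing rules and customer-facing wording would come from policy.

```python
def explain_high_bill(bill: dict) -> list:
    """Return plain-language findings, or a fallback that opens a case."""
    findings = []
    if bill.get("estimated_read"):
        findings.append("This bill is based on an estimated meter read.")
    if bill["days_in_period"] > 32:
        findings.append("The billing period was longer than usual.")
    if bill["usage_kwh"] > 1.25 * bill["prior_usage_kwh"]:
        findings.append("Usage rose more than 25% compared with the prior bill.")
    # No known explanation: hand off with a case instead of guessing.
    return findings or ["No known cause found; opening a case for review."]
```

The fallback matters as much as the rules: when no known event explains the bill, the agent creates a case with evidence attached instead of improvising an answer.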


Contact center agent assist (real-time)

Not every call should be automated, especially complex or emotionally charged situations. But agent assist can reduce handle time and improve consistency without risking incorrect autonomous actions.


Real-time capabilities include:


  • Live summarization and structured notes during the call

  • Next-best action prompts based on policy and the customer’s issue type

  • Required disclosure reminders for compliance consistency

  • Fast knowledge retrieval from tariffs, program requirements, outage playbooks, and internal procedures


In practice, this improves first contact resolution (FCR) because agents spend less time searching and more time resolving.


Multilingual and accessibility improvements

Utilities serve diverse communities. Agentic AI can improve clarity and access when it is trained on approved terminology and wrapped in strong quality controls:


  • Higher-quality translation for customer communications

  • Plain-language explanations of bills, programs, and fees

  • Support for accessibility-aligned content formatting and delivery across channels


This is not just a brand improvement; it reduces repeat contacts that happen when customers don’t understand what they’re being told.


Reference Architecture: How Agentic AI Would Plug into Con Edison Systems

A successful agentic AI for utility grid management program depends on integration and control, not just model choice. The agent must plug into the systems where work happens and provide end-to-end auditability.


Core systems the agents must integrate with (examples)

Grid/operations:


  • SCADA

  • OMS

  • ADMS/DMS

  • GIS

  • AMI/MDMS

  • Asset management / EAM

  • Work management and scheduling

  • Field mobility tools


Customer:


  • CIS/billing

  • CRM

  • Contact center/IVR platform

  • Web/app experiences

  • Outage map and notification systems


Enterprise:


  • Data platform (lakehouse/warehouse)

  • Document management and knowledge base (SOPs, playbooks, policies)

  • Ticketing and collaboration tools


The agent stack (layered view)

A practical stack for agentic AI in utilities usually looks like this:


  • Orchestration layer: plans tasks, breaks work into steps, manages retries and timeouts, routes to tools

  • Tool layer: approved connectors and APIs to OMS/DMS/CRM/EAM, messaging systems, ticketing, scheduling

  • Knowledge layer: procedures, policies, and playbooks the agent can reference for consistent outputs

  • Observability layer: logs, evaluations, approvals, versioning, and monitoring for production governance


This layered approach also makes it easier to scale: you can build a second or third agent that reuses the same tools and governance patterns.


Human-in-the-loop controls by risk level

Utilities can’t treat all actions the same. A simple but effective approach is to tier controls:


Low-risk (can automate with monitoring):


  • Draft outage updates and service emails for review or auto-send with approved templates

  • Summarize tickets and generate structured reports

  • Retrieve policy answers and propose guidance


Medium-risk (recommendation with approval):


  • Recommend dispatch changes or scheduling adjustments

  • Propose switching plans and restoration sequences

  • Generate maintenance work orders pending planner approval


High-risk (strict authorization required):


  • Any action that directly changes operational states or could affect safety and reliability must require explicit authorization, step-level logging, and role-based permissions


This is how agentic AI for utility grid management becomes usable: it earns trust through boundaries.
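A tiered control scheme like this can be expressed as a small policy table. The action names, tier assignments, and role names below are placeholders; the one deliberate design choice is that any action missing from the table defaults to the high-risk tier.

```python
# Hypothetical mapping of agent actions to risk tiers.
TIER_BY_ACTION = {
    "summarize_ticket": "low",
    "send_templated_outage_update": "low",
    "recommend_dispatch_change": "medium",
    "propose_switching_plan": "medium",
    "create_maintenance_work_order": "medium",
    "operate_field_device": "high",
}

# Hypothetical roles permitted to approve each non-low tier.
ROLES_ALLOWED_TO_AUTHORIZE = {
    "medium": {"planner", "operator"},
    "high": {"operator"},
}


def route(action: str, approver_role=None) -> str:
    """Decide whether an action runs automatically or waits for approval."""
    tier = TIER_BY_ACTION.get(action, "high")  # unknown actions are treated as high risk
    if tier == "low":
        return "auto_execute_and_log"
    if approver_role in ROLES_ALLOWED_TO_AUTHORIZE[tier]:
        return f"approved_by_{approver_role}"
    return "pending_approval"
```

A planner can approve a proposed switching plan in this sketch, but only an operator role can authorize an action that changes operational state.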


Governance, Security, and Regulatory Realities (Non-Negotiables)

Utilities operate in a high-accountability environment. Agentic systems must be designed with risk management at the center, not added later.


Safety and operational risk management

The safest agent is one that can never exceed operating limits or bypass required approvals. Practical controls include:


  • Fail-safe design: agents should default to escalation when uncertain

  • Role-based access control (RBAC) with least privilege

  • Change management processes for workflows, policies, and tool permissions

  • Clear separation between recommendation and execution for operational actions


A disciplined approach also improves adoption: operators are far more likely to trust an agent that behaves predictably and documents its work.


Data privacy and customer trust

Customer data is sensitive, and utilities can’t afford mishandled PII. Core requirements typically include:


  • Redaction or minimization of PII in transcripts and logs where possible

  • Strong retention controls aligned with policy and regulatory needs

  • Transparent disclosures when customers are interacting with AI-supported channels

  • Consistent language and escalation paths for sensitive situations


Trust is built when customers feel informed, not handled.
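As a minimal sketch of redaction before logging, the example below masks a few obvious PII patterns with regular expressions. The patterns (including the assumption that account numbers are runs of ten or more digits) are illustrative only; pattern matching alone will miss names, addresses, and many other identifiers, so it should be treated as one layer among several.

```python
import re

# Illustrative patterns only; real formats would be defined by the utility.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "ACCOUNT": re.compile(r"\b\d{10,}\b"),
}


def redact(text: str) -> str:
    """Replace each matched pattern with a bracketed label before logging."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

For example, `redact("Reach me at jane@example.com")` returns `"Reach me at [EMAIL]"`.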


Cybersecurity for agentic systems

Cybersecurity for agentic systems involves a different threat model than a standalone analytics tool does. When an agent can call tools, it becomes a new control plane that must be protected.


Common risks:


  • Prompt injection through untrusted text inputs (tickets, emails, notes)

  • Tool misuse if the agent is given overly broad permissions

  • Data exfiltration through connectors or misconfigured logs

  • Compromised APIs or credentials used by agent tools


Core defenses:


  • Network segmentation respecting IT/OT boundaries

  • Allowlisted tools and allowlisted actions per workflow

  • Strong authentication, secrets management, and key rotation

  • Continuous monitoring and anomaly detection for tool calls

  • Incident response runbooks specifically for agent components
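The allowlisting defense above can be enforced in the wrapper that every tool call passes through. In this sketch the workflow names, tool names, and registry shape are hypothetical; the point is that the check and the audit entry happen before the tool runs, including for denied calls.

```python
# Hypothetical per-workflow allowlist of tool names.
ALLOWED_TOOLS = {
    "outage_comms": {"oms.read_incident", "notify.send_template"},
    "agent_assist": {"kb.search", "crm.read_account"},
}


class ToolNotAllowed(Exception):
    """Raised when a workflow requests a tool outside its allowlist."""


def call_tool(workflow, tool, registry, audit_log, **kwargs):
    """Log the attempt, enforce the allowlist, then invoke the tool."""
    allowed = tool in ALLOWED_TOOLS.get(workflow, set())
    audit_log.append({"workflow": workflow, "tool": tool, "allowed": allowed})
    if not allowed:
        raise ToolNotAllowed(f"{tool} is not allowlisted for {workflow}")
    return registry[tool](**kwargs)
```

Logging denied attempts gives the monitoring layer something to alert on, for instance when injected text in a ticket tries to steer the agent toward a tool it should never use.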


Compliance considerations to address early

Even when specific frameworks vary, the themes are consistent:


  • Records retention and auditability for actions, approvals, and communications

  • Procurement and vendor risk management

  • Accessibility requirements for customer-facing communications

  • Clear accountability for model updates, workflow changes, and approvals


When governance is built into the workflow, scaling becomes much easier.


Implementation Roadmap (90 Days to 12 Months)

Agentic AI for utility grid management succeeds when it is deployed like a utility program: phased, measured, governed, and operationally owned. The fastest wins typically come from customer and communications workflows first, then deeper grid integrations.


Phase 1 (0–90 days): Pilot with measurable ROI

Pick 1–2 use cases where value is visible and risk is manageable:


  • Contact center agent assist

  • Outage communications automation


Set success metrics upfront, such as:


  • Average handle time (AHT) reduction

  • Digital containment / call deflection improvement

  • CSAT movement for outage interactions

  • ETR update consistency and reduced “where is my crew” follow-up calls


Build the foundation during the pilot:


  • Secure data connectors to a limited set of systems

  • Logging and audit trails for every agent action

  • An evaluation harness to test quality, compliance, and failure modes


This phase should prove that the agent can operate reliably within guardrails.


Phase 2 (3–6 months): Expand to cross-functional workflows

After the pilot, expand integration and permissions carefully:


  • Add OMS and CRM integration for unified outage context and customer communications

  • Connect EAM/work management for maintenance and work order workflows

  • Introduce role-based action permissions and approval routing by risk tier

  • Establish governance routines: testing, red-teaming, QA, and versioned procedures


This is where grid operations automation starts to become real: the agent is no longer just a helpful assistant, but a workflow engine.


Phase 3 (6–12 months): Production-scale agentic operations

At scale, the big gains come from multi-agent orchestration, where specialized agents coordinate like a team:


  • Storm response operations: reporting, prioritization support, and communications coordination

  • Field service dispatch optimization recommendations based on evolving conditions

  • Continuous improvement loops where human feedback updates playbooks and constraints


Training and change management matter here. The goal is adoption, not novelty. Operators, supervisors, and customer service leaders should help define what “good” looks like and what the agent should never do.


90-day pilot checklist (practical)

  • Select one customer workflow and one operational workflow with clear scope

  • Map inputs and outputs: systems, data fields, required policies, escalation points

  • Define risk tiers and approval rules before building

  • Build connectors with least-privilege permissions and strong logging

  • Create test cases from real historical events (including storms and edge cases)

  • Run evaluations weekly and track error categories, not just averages

  • Launch with a limited user group and a clear feedback loop

  • Document ownership: who changes procedures, who approves expansions, who audits logs


Measuring Success: KPIs That Matter for Con Edison

To justify scaling agentic AI for utility grid management, measurement should cover outcomes, efficiency, and risk. The best programs define attribution rules early so improvements don’t get lost in broader modernization efforts.


Grid operations KPIs

  • SAIDI/SAIFI improvements where process speed and coordination contribute

  • Fault location time and time-to-triage

  • Restoration time for comparable events

  • ETR accuracy and update consistency

  • Preventive vs corrective maintenance ratio shifts

  • Crew utilization and truck rolls avoided via better scheduling and triage


Customer service KPIs

  • Call deflection and digital containment rates

  • AHT and after-call work reduction (especially with agent assist)

  • First contact resolution (FCR)

  • CSAT/NPS changes, especially during outage events

  • Complaint rate and escalation volume


Business and risk KPIs

  • Cost-to-serve changes by channel and interaction type

  • Agent error rate by category (policy errors, hallucinations, tool errors, missing context)

  • Override rate and escalation rate by workflow

  • Audit findings, security incidents detected/prevented, and time-to-remediate


A mature program will treat these like operational KPIs, not data science metrics.
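Several of the risk KPIs above fall directly out of the agent's audit log. The sketch below assumes each logged event is a dictionary with optional `overridden`, `escalated`, and `error_category` keys; those field names are hypothetical.

```python
from collections import Counter


def risk_kpis(events):
    """Compute override rate, escalation rate, and error counts by category."""
    total = len(events)
    if total == 0:
        return {"override_rate": 0.0, "escalation_rate": 0.0, "errors_by_category": {}}
    overrides = sum(1 for e in events if e.get("overridden"))
    escalations = sum(1 for e in events if e.get("escalated"))
    errors = Counter(e["error_category"] for e in events if e.get("error_category"))
    return {
        "override_rate": overrides / total,
        "escalation_rate": escalations / total,
        "errors_by_category": dict(errors),
    }
```

Reporting errors by category rather than as a single average makes it clearer whether a problem sits in policy content, tool integrations, or missing context.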


Contentious Questions (And Practical Answers)

Will AI replace dispatchers or customer service agents?

In practice, agentic AI in utilities is more effective as augmentation than replacement. Utilities have deep institutional knowledge embedded in experienced people. The immediate value comes from removing low-value work:


  • Searching across multiple systems for context

  • Repeating the same explanations and documentation

  • Manual report creation and status reconciliation

  • Routine case creation and routing


Over time, roles may evolve. But the most realistic near-term outcome is that teams handle more volume with less burnout and more consistent quality.


Can we trust AI during storms?

Only if the system is engineered for storms, not just tested on calm-day data. Trust comes from:


  • Guardrails that prevent unsafe actions

  • Approval workflows for higher-risk recommendations

  • Strong testing on historical storm events and edge cases

  • A clear fall-back mode when data quality degrades


A sensible approach is to start with decision support and communications, then graduate to constrained actions as confidence grows.


What data do we need, and what if it’s messy?

Messy data is the norm, not the exception. The key is to define minimum viable data for a pilot and improve progressively.


Minimum viable inputs often include:


  • A reliable incident source (OMS tickets, outage events)

  • A communication channel and customer preferences source (CRM/CIS plus notification platform)

  • A knowledge base of approved language and procedures

  • Basic telemetry indicators (AMI last gasp or SCADA alarms) if the use case requires it


As the program scales, data quality work can be prioritized based on where it increases automation safely.


Conclusion: A Dual Transformation Powered by Agentic AI

Con Edison doesn’t need agentic AI for utility grid management because it’s trendy. It needs it because grid reliability and customer trust now depend on faster, more consistent execution across dozens of interconnected systems and teams. The same agentic layer that speeds outage triage can also improve outage communications. The same governance that protects operational actions can also protect customer data and compliance.


The best path forward is phased:


  • Start with a 90-day pilot that improves customer experience and reduces operational load

  • Expand to cross-functional workflows that connect OMS, CRM, and work management

  • Scale into production-grade storm response and field coordination with strong controls


To see what an enterprise-grade agentic workflow looks like in practice, book a StackAI demo: https://www.stack-ai.com/demo
