Beta Digital

The Future of SAP Support Is Not a Chatbot. It Is a Governed Support Plane

Adam Hislop

There is a version of "AI for support" that sounds impressive in a demo and then falls apart the first time it meets a real enterprise system.

You point a model at some logs. It summarises the error. Maybe it suggests a likely cause. Perhaps it even drafts a ticket update. Useful? Sometimes. Transformational? Not really.

That is not a support model. It is autocomplete over incidents.

The harder question is this: what would it take for AI to become genuinely useful in enterprise application support, particularly in a SAP landscape where a single failed process might represent a delayed order, a blocked customer, a duplicate posting, a security issue, or a perfectly valid business rejection?

For organisations in Australia and New Zealand running SAP estates, this is not an academic question. Support is increasingly spread across S/4HANA, ECC remnants, SAP BTP, Integration Suite, Cloud ALM, hyperscaler services, partner systems, ITSM platforms and managed service teams. The context needed to resolve an incident rarely lives in one place.

The future of support is not simply a smarter chat window. It is an AI support plane: a governed layer that connects operational knowledge, live system evidence, ITSM context and approved actions, while runbooks and guardrails decide what the AI is allowed to do.

The AI proposes. Governance disposes.

The support problem is structural

Most support processes still rely on people stitching context together manually: alerts in one tool, application logs in another, SAP state behind a specialist, runbooks in a document library, prior incidents buried in ITSM.

This is slow, but the bigger problem is consistency. Different analysts can take different paths, and the outcome depends too heavily on who is looking at the incident and how much of the landscape they personally understand.

AI can help, but only if it is connected to the right things and constrained in the right way. The important distinction is not really chatbot versus non-chatbot. It is ungoverned tool access versus governed tool access. The interface may still be conversational. The difference is that the conversation is connected to controlled systems of record, not model confidence alone.

Runbooks need to become operational policy

Runbooks have always been part of good support. The problem is that many runbooks are passive documents. They are useful if someone finds them, trusts them, reads them correctly and applies them at the right point in the incident.

In an AI-assisted support model, runbooks need to become machine-readable operational knowledge: still readable by humans, but structured enough for tools and workflows to use safely.

A useful runbook should define the failure category, eligible and ineligible symptoms, required evidence, systems to check, allowed actions, approval rules, evidence write-back and recurrence handling.
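One way to picture such a runbook is as a small typed record. The sketch below is illustrative only: the class name `Runbook`, its field names and the example values are assumptions made for this example, not a published schema.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Runbook:
    """Hypothetical machine-readable runbook; all field names are illustrative."""

    failure_category: str
    eligible_symptoms: frozenset[str]
    ineligible_symptoms: frozenset[str]
    required_evidence: tuple[str, ...]
    systems_to_check: tuple[str, ...]
    allowed_actions: tuple[str, ...]
    approval_rule: str          # e.g. "operator" or "none"
    evidence_write_back: str    # where the evidence record is written
    recurrence_handling: str    # what to do when the pattern repeats

    def applies_to(self, symptom: str) -> bool:
        # Ineligible symptoms always win over eligible ones.
        if symptom in self.ineligible_symptoms:
            return False
        return symptom in self.eligible_symptoms


# Example values are invented for illustration.
receiver_outage = Runbook(
    failure_category="receiver_connectivity",
    eligible_symptoms=frozenset({"receiver_unavailable", "connection_timeout"}),
    ineligible_symptoms=frozenset({"business_rejection", "duplicate_posting"}),
    required_evidence=("business_key", "correlation_id", "receiver_health"),
    systems_to_check=("integration_runtime", "sap_backend"),
    allowed_actions=("controlled_replay",),
    approval_rule="operator",
    evidence_write_back="itsm_incident",
    recurrence_handling="raise_problem_record",
)
```

Keeping the record immutable (`frozen=True`) reflects the idea that a runbook is policy: it is read at decision time, not edited by whatever is consuming it.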

This is where the support model starts to change. The runbook stops being a document someone reads after the event. It becomes the policy layer that grounds AI-assisted support in approved operational practice.

There is a governance implication too. If runbooks become policy, stale runbooks become stale policy. They need owners, review cycles, version control and evidence that the support path still works as the SAP estate changes. This is consistent with the broader direction of AI management systems such as ISO/IEC 42001, where governance, accountability and controlled operation matter as much as the AI capability itself. [6]

The support plane connects knowledge to evidence

Once runbooks are structured, the next step is connecting them to live evidence: application state, jobs, workflow, authorisations, master data, configuration, integration logs, monitoring signals, security events, identity status, prior incidents and approved automation flows.

This is where MCP-style tool access becomes interesting. MCP, the Model Context Protocol, provides a standard way for AI applications to connect to external tools, data sources and workflows. In practical terms, it lets an AI client call controlled tool surfaces rather than relying on pasted screenshots, copied logs or ad hoc API integrations. [1]

But tool access is not, by itself, the answer. In fact, unrestricted tool access is exactly how AI support becomes dangerous. The support plane has to separate reading from acting.

Reading can be broad, but still governed: metadata-first, least privilege, audited, no unnecessary payload exposure, no secrets in model context. Acting must be narrow: approved operations only, explicit preconditions, evidence capture, human approval where required, and no ability for the model to override the policy.
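The read/act split can be sketched as a small authorisation function. This is a minimal illustration under stated assumptions: the tool names, the `TOOL_MODES` catalogue and the `authorise` signature are hypothetical and do not come from MCP or any SAP product.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    READ = "read"
    ACT = "act"


# Hypothetical tool catalogue: every tool is declared as read or act up front.
TOOL_MODES = {
    "get_message_metadata": Mode.READ,
    "get_receiver_health": Mode.READ,
    "replay_message": Mode.ACT,
}


def authorise(tool: str, guardrail_passed: bool = False,
              approved_by: Optional[str] = None) -> bool:
    """Return True only if the call is permitted; unknown tools fail closed."""
    mode = TOOL_MODES.get(tool)
    if mode is None:
        return False
    if mode is Mode.READ:
        return True
    # Acting needs both a passed guardrail and a named human approver.
    return guardrail_passed and bool(approved_by)
```

The model never appears in the signature. It can request a tool call, but whether the call goes ahead depends only on the catalogue, the guardrail result and the approver.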

Guardrails are what make this enterprise-safe

The biggest mistake in AI support is to treat the model's answer as the control point. It should not be.

The model can assemble facts. It can explain likely causes. It can retrieve the runbook. It can propose the next action. But the decision about whether an action is allowed should come from deterministic guardrails.

This is not only a design preference. It lines up with mainstream AI risk guidance. NIST's Generative AI Profile frames generative AI risk as something to govern, map, measure and manage, while OWASP's LLM06:2025 Excessive Agency risk calls out excessive functionality, permissions and autonomy in tool-using LLM systems. Its mitigations - minimise tools, apply least privilege, require human confirmation for high-impact actions and log activity - read very much like this support-plane pattern. [2][3]

For example: business validation rejections are never replayed; duplicates are never blindly replayed; security, credential and certificate issues stop and hand off; missing business keys or correlation IDs mean do not act. Receiver or connectivity failures may be replayable, but only after health checks, retry limits, idempotency checks, evidence capture and human approval.

A critical point is who classifies the incident. The safe answer is not "the model decides". The model can suggest a runbook match, but the gate should key off structured system signals: status codes, SAP return messages, error categories stamped by the integration flow, business rejection codes, receiver health, retry count, business key and correlation ID. If those facts are missing or ambiguous, the gate should fail closed.

That is the difference between a guardrail and a suggestion. The AI's natural-language interpretation is not trusted for the act decision. It supplies or retrieves facts. The guardrail engine evaluates them.

A naive bot might see an error and try again. A governed support plane asks whether retry is even the right class of action.
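The rules in this section can be sketched as a deterministic gate over structured evidence. This is a simplified illustration, not the demonstrator's engine: the category names, the `Evidence` fields and the `MAX_REPLAYS` ceiling are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional

MAX_REPLAYS = 3  # hypothetical ceiling; the real value belongs in the runbook

STOP_AND_HAND_OFF = {"security", "credential", "certificate"}
NEVER_REPLAY = {"business_validation", "duplicate"}
MAY_REPLAY = {"receiver", "connectivity"}


@dataclass(frozen=True)
class Evidence:
    error_category: Optional[str]
    business_key: Optional[str]
    correlation_id: Optional[str]
    receiver_healthy: bool = False
    retry_count: int = 0
    already_posted: bool = True    # idempotency check; defaults to the unsafe answer
    evidence_captured: bool = False


def evaluate_replay(e: Evidence) -> str:
    """Deterministic gate. Returns 'hand_off', 'refuse' or 'needs_approval'."""
    if e.error_category in STOP_AND_HAND_OFF:
        return "hand_off"
    if not e.business_key or not e.correlation_id:
        return "refuse"
    if e.error_category in NEVER_REPLAY:
        return "refuse"
    if e.error_category not in MAY_REPLAY:
        return "refuse"  # missing or unrecognised category: fail closed
    checks = (
        e.receiver_healthy,
        e.retry_count < MAX_REPLAYS,
        not e.already_posted,
        e.evidence_captured,
    )
    return "needs_approval" if all(checks) else "refuse"
```

Note that the gate has no "allow" outcome. The best a replay candidate can earn is `needs_approval`, which keeps the human decision outside anything the model can influence.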

ITSM becomes part of the loop

This also changes the role of ITSM. In many organisations, ITSM is where incidents are recorded, assigned and closed. In a stronger model, ITSM becomes part of the evidence loop.

The incident should record which runbook matched, which evidence was gathered, which guardrails passed or failed, who approved the action, what was done, what was refused, and whether the issue is recurring.
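As a rough sketch, that evidence could be held in a structured record and written to the incident as a work note. The `IncidentEvidence` class, its field names and the sample values are hypothetical; a real ITSM platform would have its own fields and API.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class IncidentEvidence:
    """Hypothetical evidence record attached to an ITSM incident."""

    incident_id: str
    runbook_matched: str
    evidence_gathered: List[str] = field(default_factory=list)
    guardrails: Dict[str, bool] = field(default_factory=dict)  # name -> passed
    approved_by: Optional[str] = None
    action_taken: Optional[str] = None
    action_refused: Optional[str] = None
    recurring: bool = False

    def to_work_note(self) -> str:
        # Sorted keys keep the note stable and easy to compare across incidents.
        return json.dumps(asdict(self), sort_keys=True, indent=2)


# Sample values are invented for illustration.
record = IncidentEvidence(
    incident_id="INC0001",
    runbook_matched="receiver_outage",
    evidence_gathered=["receiver_health", "message_metadata"],
    guardrails={"receiver_healthy": True, "retry_below_ceiling": True},
    approved_by="operator.a",
    action_taken="controlled_replay",
)
note = record.to_work_note()
```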

That matters because the goal is operational learning, not just faster ticket closure. ITIL-style known-error records document root causes and workarounds; SRE postmortem practice treats incidents as records of impact, action, cause and follow-up work to prevent recurrence. [5][4]

If the same receiver outage happens five times in a month, the answer is not to get better at replaying messages. It is to raise a problem record and fix the underlying design. AI-assisted support becomes interesting when it helps the service improve.
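A recurrence check of this kind is simple to sketch. The function name, the failure-signature strings, the 30-day window and the threshold of five are assumptions for illustration; in practice they would be set per runbook.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

Incident = Tuple[str, datetime]  # (failure signature, time the incident opened)


def problem_candidates(incidents: Iterable[Incident], now: datetime,
                       window_days: int = 30, threshold: int = 5) -> List[str]:
    """Return failure signatures seen at least `threshold` times in the window."""
    cutoff = now - timedelta(days=window_days)
    counts = Counter(sig for sig, opened in incidents if opened >= cutoff)
    return sorted(sig for sig, n in counts.items() if n >= threshold)
```

Anything this returns is a signal to raise a problem record, not a cue to replay faster.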

Testing the model: a SAP BTP support-plane demo

To make one part of this model concrete, we built a working SAP BTP support-plane demonstrator around an order-to-cash integration.

The scenario is deliberately familiar. An ordering platform submits customer orders through SAP BTP Integration Suite to a SAP backend, here an S/4HANA system. The support task is to determine whether the issue sits in the ordering platform, the integration runtime, SAP, business data, or the control process around replay.

Claude is used as the prototype support interface because the point of the demonstrator is to put AI at the centre of the support workflow, not at the edge of it. Claude brings together runbooks, monitoring signals, message logs, application state, SAP-side evidence and guardrail decisions in one conversation.

In production, the distinction is not that Claude disappears into the background. The distinction is that Claude is embedded into the organisation’s support tooling, access model, ITSM process and operational controls. MCP surfaces define what it can reach. Runbooks define the approved policy. Guardrails define what it can and cannot do. Human approval remains explicit for consequential action.

Claude is the support-plane interface. Governance is the control layer.

Underneath the demonstrator sits a Docker Desktop MCP profile with four tool surfaces: GitHub for the runbook repository, a support-plane server for the ordering platform and the SAP backend, a BTP Integration Suite MPL server for message evidence, and a SAP Cloud ALM server for monitoring analytics.

The runbooks live in GitHub as Markdown with structured front matter. They encode error categories, eligible and ineligible symptoms, replay rules, approval requirements and evidence requirements. Claude can read those runbooks. It cannot rewrite them.
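To show what reading such a file might involve, here is a minimal sketch that splits front matter from the Markdown body. It assumes YAML-style front matter delimited by `---` lines and handles only flat `key: value` pairs and inline `[a, b]` lists; the key names in the sample are invented, and a real implementation would more likely use a proper YAML parser.

```python
from typing import Dict, List, Tuple, Union

Value = Union[str, List[str]]


def split_front_matter(text: str) -> Tuple[Dict[str, Value], str]:
    """Split a Markdown document into (front matter, body)."""
    lines = text.splitlines()
    stripped = [line.strip() for line in lines]
    if not stripped or stripped[0] != "---":
        return {}, text
    try:
        end = stripped.index("---", 1)
    except ValueError:
        return {}, text  # unterminated front matter: treat as plain body
    meta: Dict[str, Value] = {}
    for line in stripped[1:end]:
        if ":" not in line:
            continue
        key, _, raw = line.partition(":")
        raw = raw.strip()
        if raw.startswith("[") and raw.endswith("]"):
            meta[key.strip()] = [v.strip() for v in raw[1:-1].split(",") if v.strip()]
        else:
            meta[key.strip()] = raw
    return meta, "\n".join(lines[end + 1:])


# Hypothetical runbook file; the keys are illustrative only.
doc = """---
error_category: receiver_connectivity
eligible_symptoms: [receiver_unavailable, connection_timeout]
approval: operator
---
# Receiver outage
Check receiver health before any replay.
"""
meta, body = split_front_matter(doc)
```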

The monitoring and runtime tools provide the evidence trail. Cloud ALM gives the aggregate signal. The Integration Suite message log gives the specific message: GUID, status, correlation ID, business key, error category, error text and timings.

The support loop is: Cloud ALM signal -> BTP Integration Suite message evidence -> ordering and SAP-side state -> GitHub runbook -> deterministic guardrail -> human approval or refusal -> evidence record -> problem management.

That is the loop a strong support engineer would follow manually. The AI support plane makes it conversational and more consistent.

The moment the demo becomes real

The most useful part of the demo is not the happy path. It is the contrast between two incidents that look similar from a distance but require opposite decisions.

In the first case, the SAP receiver is temporarily unavailable. The order has left the ordering platform, but it has not reached a definitive business outcome in SAP. The Integration Suite message fails with a receiver or connectivity category. The runbook says controlled replay may be allowed, but only after the receiver is healthy, the business key and correlation ID are present, the replay count is below the ceiling, evidence has been captured, and the operator explicitly approves.

Even then, replay is not automatic. Claude can ask for approval. It cannot approve itself.

In the second case, the integration message completes technically. The runtime can look green. But SAP has returned a business rejection: the customer is blocked.

This is the important point: technically completed does not always mean commercially successful.

The order reached SAP. SAP processed it. SAP gave a definitive business response. Replaying that order would not solve anything. It would simply repeat the same rejection and create noise. The correct action is to write the evidence and hand off to the business, master-data or credit process owner.

Green in the dashboard, lost in the business.

That is the kind of distinction a real support model has to make. A chatbot over logs will struggle with it. A governed support plane can demonstrate the design: the runbook, the structured SAP evidence and the guardrail engine all point to the same outcome. This is not replayable.
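The contrast between the two incidents comes down to reading two signals together rather than one. The sketch below is an assumption-laden illustration: `runtime_status` and `sap_return_type` are hypothetical field names, and it assumes the backend returns a message type letter in the common SAP style, where `S` indicates success and `E` or `A` indicate error or abort.

```python
from typing import Optional


def business_outcome(runtime_status: str, sap_return_type: Optional[str]) -> str:
    """Combine the technical status with the SAP-side business response."""
    if runtime_status != "COMPLETED":
        return "technical_failure"     # candidate for the replay guardrail
    if sap_return_type in {"E", "A"}:
        return "business_rejection"    # definitive answer: never replay
    if sap_return_type == "S":
        return "business_success"
    return "unknown"                   # no clear structured response: do not act
```

A technically completed message with a rejection from SAP lands in `business_rejection`, which is the "green in the dashboard, lost in the business" case. Only `technical_failure` is ever passed on to a replay decision.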

Trust matters more than cleverness

This is where enterprise design matters. In the demo, each MCP server is single-purpose. Credentials are injected at runtime. The GitHub server is read-only. The SAP monitoring credentials are read-only. The systems-plane tools are metadata-first. Raw payload bodies are not exposed by default. Public-facing mode defaults to read-only. Destructive action is not performed; controlled replay is modelled and evidence is written.

For Australian and New Zealand organisations, this read-plane discipline matters beyond AI safety. Many SAP estates carry commercially sensitive, regulated or residency-sensitive data. A support-plane model should not require raw business payloads, credentials or unrestricted system access to be pushed into model context. Metadata-first access becomes part of the data-residency posture: enough context to support the investigation, without turning the support conversation into another uncontrolled data store.
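Metadata-first access can be sketched as an allow-list projection applied before anything reaches the model. The field names mirror the message evidence described earlier, but the `METADATA_FIELDS` tuple and the `to_model_context` function are hypothetical names for this example.

```python
from typing import Any, Dict

# Hypothetical allow-list: only these fields ever reach the model context.
METADATA_FIELDS = (
    "guid", "status", "correlation_id", "business_key",
    "error_category", "error_text", "timings",
)


def to_model_context(message: Dict[str, Any]) -> Dict[str, Any]:
    """Project a raw message record onto allow-listed metadata only."""
    return {k: message[k] for k in METADATA_FIELDS if k in message}
```

An allow-list is the safer default here: a new field added upstream, such as a payload body or an authorisation header, stays out of model context until someone deliberately adds it.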

The demo does not claim that an AI should be given broad production authority on day one. Quite the opposite. It shows that the useful first step is to let AI observe broadly, reason across evidence, and act narrowly only where the organisation has already defined the control model. Holding back destructive action is not a weakness. It is the correct first step in proving the operating model.

What this means for SAP managed services

For SAP managed services, the implication is broader than integration support. The managed service of the future is not just a queue of tickets staffed by people with access to dashboards. It is an engineered support loop across the SAP estate: business processes, applications, integrations, batch jobs, user access, data flows, platform services, security events, change activity and automation.

A finance close issue may involve jobs, logs, authorisations, master data and prior incidents. A sales order problem may sit between pricing, credit, ATP, integration, workflow or a warehouse system. A BTP issue may involve destinations, certificates, Cloud Connector, identity, entitlements, application logs and hyperscaler connectivity.

The support analyst's job is often less about solving one technical error and more about assembling enough context to know what kind of problem it really is.

A support-plane model would not remove that complexity. It would make the path through it more consistent: better first response, cleaner escalation evidence, and a safer route from manual handling to operator-approved action, then to bounded automation only when the pattern is well understood, observable, low-risk and reversible.

That is the hypothesis this demonstrator supports. It does not prove universal MTTR reduction, nor does it remove the need for experienced SAP support people. It shows a credible direction: support improves when knowledge, evidence, action and learning are connected in a governed loop.

Most importantly, the service starts to learn. Repeated job failures, business rejections, certificate issues or destination problems should become problem-management signals. If support keeps applying the same workaround, the better answer may be a design change, not a faster ticket.

The point is not autonomous AI fixing production. The point is governed operational knowledge connected to live evidence, with AI as the support interface and humans retaining accountability.

That is where I think SAP support is heading.

Not a chatbot.

A support plane.

Sources & references

  1. Model Context Protocol - Introduction. modelcontextprotocol.io

    MCP is described as an open standard for connecting AI applications to external systems, including tools, data sources and workflows.

  2. NIST AI 600-1 - Artificial Intelligence Risk Management Framework: Generative AI Profile. National Institute of Standards and Technology

    A companion profile to the NIST AI RMF for generative AI risk management.

  3. OWASP GenAI Security Project - LLM06:2025 Excessive Agency. GenAI Security Project

    OWASP describes excessive agency in tool-using LLM systems and recommends controls such as minimising extensions, least privilege, human approval for high-impact actions and audit logging.

  4. Google SRE Book - Blameless Postmortem Culture. Google

    Background for the idea that incidents should create evidence, learning and follow-up actions.

  5. Atlassian ITSM - Problem Management Process. Atlassian

    Accessible explainer of ITIL-style known-error records, root causes, workarounds and known-error databases.

  6. ISO/IEC 42001:2023 - Artificial intelligence management systems. ISO.org

    Further reading for enterprise AI governance and management-system thinking.