The Design System As The Operational Layer for Responsible Human-AI Interaction

Sara Portell • February 6, 2026


Design systems were built to scale consistency, efficiency and quality in user-centric applications: reusable components, shared patterns and practices, and a common language across design and engineering, promoting collaboration. They improve velocity because teams stop solving the same interface problems repeatedly, providing measurable ROI. 


AI introduces both immense opportunities and complex (technical, legal and social) challenges, and it is reshaping the operating conditions traditional design systems were built for. User-facing outputs are adaptive and can vary by input, model behaviour can shift over time and responses that sound credible can still be wrong. These systems can also reproduce or amplify bias, creating unequal outcomes across users. In high-confidence, relational interactions, they can shape user judgment and behaviour. These shifts raise the bar for accountability, transparency, and governance across the full product lifecycle.


The challenge is not only consistency and quality. It is ensuring consistency and quality safely, fairly and responsibly as both system behaviour and human behaviour evolve.


At the same time, AI-powered copilots and no-code tools are increasingly used in the design process to support ideation, prototyping, and delivery, but their adoption also raises concerns about transparency, bias, privacy, and the need to preserve human judgment and oversight. Fast, polished design outputs often look complete even when the underlying logic is incomplete or flawed. As a result, familiar UX failures (misalignment with real user needs, hidden edge cases, context breakdowns) become harder to detect and more costly to correct later.


Design systems can take on a bigger operational role in AI-enabled product development by codifying user-centric foundations, rules and infrastructure that guide consistent, safe, ethical and scalable human-AI experiences.

In AI-enabled contexts, design systems increasingly function as product systems, codifying behavioural guardrails, human oversight controls, and lifecycle governance. These system-level safeguards help teams manage risks that accumulate over time, including model drift, hallucinations and inaccuracies, over- or under-trust, erosion of user agency and decision-making, unequal outcomes from bias, and contextual or cultural misfit.


The designer’s role is expanding beyond interface craft into shaping system behaviour, orchestrating human-AI collaboration and managing interaction risks likely to emerge over time. Accountability is now distributed: outcomes are shaped by interdependent variables owned across teams (i.e., prompts, models, retrieval pipelines, guardrails, interaction patterns, monitoring/update cycles). As a result, governance cannot be treated as a policy layer; it becomes a cross-functional design challenge embedded in day-to-day product decisions.


AI ethics standards provide guidelines and structure, but product teams still need to convert those principles into everyday decisions (i.e., what to ship, how it behaves, how it's explained, what to block, what to review (by whom and at what point), what to escalate, etc.). In practice, this is where teams operationalise recognised frameworks like NIST AI RMF and ISO/IEC 42001/23894, and, in the EU, align interaction controls with the EU AI Act’s risk-based obligations. That translation gap is where design systems can create important leverage. Because they function as shared cross-functional operational memory, design systems can turn governance into design and delivery logic. They can enforce safe and effective interaction patterns, human oversight and controls embedded in how teams already work.


In other words, governance becomes built-in by default, not layered on after release, making design systems central to sustaining UX quality and safety over time.

Creating a Design System for Responsible Human-AI interaction

A practical implementation in 5 modules:

A responsible design system helps teams ship effectively at scale while maintaining quality and managing behavioural impact and risk.


Such a system should work across interfaces (UI, voice, agentic experiences), model providers and tooling environments (Figma, code assistants, no-code builders). It should also connect principles to enforceable rules in production.

1. Audit the human-AI interaction
2. Build the responsible interaction layer
3. Operationalise governance
4. Enforce constraints in AI-assisted production
5. Systematic testing and continuous monitoring

module 1

Audit the human-AI interaction

Start by mapping where AI is already present in the experience, and where it soon will be. A strong baseline audit should produce:

AI touchpoint inventory

List where users encounter AI outputs and actions, with consistent metadata (surface, modality, capability type, automation level, owner, initial risk rating). For high-risk touchpoints, add traceability fields (user goal, decision stakes, key inputs, escalation/ownership) and an evidence pack (touchpoint spec, decision rationale, change/version log, disclosure and UX copy, human oversight/escalation playbook, and incident/near-miss record) so evidence and accountability remain auditable over time.


Failure risks

Identify the interaction risks that can occur across the experience (e.g., misleading certainty signals, inaccuracies, hidden automation, bias/unfair outcomes, inadequate contestability, unsafe delegation or responses, sensitive inference, escalation failures), mapped back to the touchpoints where they appear.


Gap analysis

Compare the current experience to defined internal standards (principles, safety, UX, accessibility, content) and relevant external obligations (ethical requirements, regulations and industry-level standards). Record the evidence reviewed (designs, flows, policies, logs, evaluations) so gaps are traceable and auditable.


Risk map

Rank issues by severity of impact, likelihood, exposure/scale, and detectability. Include regulatory classification as one input to prioritization (alongside user impact and operational risk). Include vulnerability as a risk input, both cohort vulnerability (e.g., minors, mental health contexts, low literacy) and situational vulnerability (e.g., high-pressure decisions, urgency, on-the-go), since it materially shifts stakes and harm likelihood.
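The ranking above can be made explicit with a simple scoring sketch. The factors mirror the text (severity, likelihood, exposure, detectability, with vulnerability as a multiplier), but the scales, weights and example issues are illustrative assumptions, not a validated risk model.

```python
# Illustrative risk-prioritisation sketch. All scores and the
# multiplicative model are placeholder assumptions.
def risk_score(severity: int, likelihood: int, exposure: int,
               detectability: int, vulnerability: float = 1.0) -> float:
    """Each factor on a 1-5 scale; a higher detectability score means the
    issue is harder to detect. `vulnerability` inflates the score for
    vulnerable cohorts (e.g. minors) or situations (e.g. urgent decisions)."""
    return severity * likelihood * exposure * detectability * vulnerability

# Hypothetical issues from an audit, scored and ranked
issues = [
    ("hidden automation in checkout", risk_score(4, 3, 5, 4)),
    ("biased ranking for minors",     risk_score(5, 3, 3, 4, vulnerability=1.5)),
    ("stale citation links",          risk_score(2, 4, 2, 1)),
]
ranked = sorted(issues, key=lambda item: item[1], reverse=True)
```

Note how the vulnerability multiplier pushes the minors-related issue to the top even though its raw exposure is lower; that is the point of treating vulnerability as a risk input.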


module 2

Build the responsible interaction layer

The interaction layer is where we translate evidence-informed principles and behavioural insights into enforceable requirements for human-AI interaction.


It also turns those requirements into reusable, responsible building blocks and patterns.

Principles 

Principles should translate into interaction requirements, reusable patterns and review criteria. They should define both what behavioural and ethical outcomes to enable (e.g., informed trust, better decisions, confident recovery) and what harms to prevent (e.g., bias, opacity, privacy risk, unsafe or manipulative interaction patterns).


Core areas include agency and recoverability, transparency, trust calibration, decision support under uncertainty, safety safeguards, fairness, traceability and human oversight. Trust calibration needs explicit design: show what the system used (and didn’t), communicate uncertainty without false precision, nudge verification in proportion to stakes, and add “how this works” primers to prevent magical thinking.
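"Nudge verification in proportion to stakes" can be sketched as a small decision rule. The thresholds, stake levels and nudge names below are hypothetical; the point is that the mapping is explicit and reviewable rather than left to per-feature improvisation.

```python
# Sketch of stakes-proportional verification nudges. Thresholds and
# nudge names are illustrative assumptions, not recommended values.
def verification_nudge(confidence: float, stakes: str) -> str:
    """Map model confidence (0-1) and decision stakes
    ("low" | "medium" | "high") to a UI nudge level."""
    if stakes == "high" or confidence < 0.5:
        return "block-until-reviewed"   # require an explicit human check
    if stakes == "medium" or confidence < 0.8:
        return "inline-warning"         # surface sources and uncertainty
    return "passive-disclosure"         # lightweight provenance cue only
```

Even a highly confident output gets the strongest nudge when stakes are high, which is what distinguishes trust calibration from raw confidence display.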


Foundations 

Foundations are baseline rules for all AI-mediated flows. They define role boundaries, tone and behavioural limits, confirmation norms, required disclosures, automation thresholds, recovery patterns, and data-use rules at the interaction layer.


They also define change transparency: which behaviour shifts require internal escalation, which (such as those affecting trust, control, outcomes or data use) must be disclosed to users, and the signals that trigger both.


Set clear data boundaries (what data can be used for inference, personalisation, and training, with purpose-specific rules, retention limits, and user controls). Prohibit or tightly control sensitive inference with detection and escalation paths, and require explicit consent for proactive and background actions.


These are minimum foundations, not a closed list. They should be adapted by domain, risk tier and level of automation.


Components and interaction patterns

Components provide the building blocks teams use daily (e.g., review-before-apply, pause/undo automation controls, uncertainty signals, escalation handoffs, provenance cues where AI contributes to outputs).


Patterns define how those blocks work together across real user journeys (e.g., setting expectations, supporting recovery, preserving user control, consent mechanisms, handling safe refusal, calibrating trust, reducing over-reliance over time).


Where organizations usually fail

In transformation work, Mark Reynolds (design systems expert and founder at Atomle) sees three recurring structural failures:

1. Unstructured foundations

(Design tokens aren’t AI-ready)

Most design systems were built for human consumption, not machine interpretation. Poorly structured tokens, inconsistent naming, messy hierarchies, and unclear JSON schemas make it difficult for AI to reliably understand color, spacing, typography, and semantic intent, resulting in incorrect or inconsistent outputs.

Mark Reynolds, Design System Director; Founder at Atomle

2. Schema-free components

(Forcing AI to Guess)

Without explicit schemas for components, patterns and templates, AI is forced to reverse-engineer intent by inspecting Figma files and component libraries. This visual interpretation is unreliable, brittle and context-blind, leading to hallucinated properties, broken layouts and misuse of components at scale.
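By contrast, an explicit schema lets tooling validate proposed usage instead of guessing from visuals. A minimal sketch, assuming a hypothetical `ConfirmDialog` component with made-up property names:

```python
# Hypothetical machine-readable schema for a "ConfirmDialog" component,
# so an AI assistant can be validated against it instead of guessing
# intent from Figma files. Property names are illustrative.
CONFIRM_DIALOG_SCHEMA = {
    "component": "ConfirmDialog",
    "props": {
        "title":        {"type": str,  "required": True},
        "body":         {"type": str,  "required": True},
        "destructive":  {"type": bool, "required": False},
        "confirmLabel": {"type": str,  "required": True},
    },
}

def validate(instance: dict, schema: dict) -> list[str]:
    """Return a list of violations for a proposed component instance."""
    errors = []
    props = schema["props"]
    for name, rule in props.items():
        if rule["required"] and name not in instance:
            errors.append(f"missing required prop: {name}")
        elif name in instance and not isinstance(instance[name], rule["type"]):
            errors.append(f"wrong type for prop: {name}")
    for name in instance:
        if name not in props:
            # a property the schema never defined: a hallucination
            errors.append(f"unknown prop: {name}")
    return errors

errors = validate(
    {"title": "Delete file?", "confirmLabel": "Delete", "color": "red"},
    CONFIRM_DIALOG_SCHEMA,
)
```

Here the validator catches both a missing required property and a hallucinated one, exactly the failure modes described above.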

3. Missing guardrails

(No built-in brand, accessibility or responsibility controls)


Design systems rarely encode brand rules, accessibility requirements, or responsible design constraints in a way AI can enforce. Without these baked-in guardrails, AI-generated outputs drift off-brand, violate accessibility standards and introduce compliance and ethical risks that teams must manually fix afterward.
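Accessibility is one guardrail that can be encoded directly. The sketch below implements the WCAG 2.x relative-luminance and contrast-ratio formulas and gates colour pairs against the AA threshold for normal text (4.5:1); how it is wired into a token pipeline is left as an assumption.

```python
# Enforceable accessibility guardrail: WCAG 2.x contrast check that can
# gate AI-generated colour choices at build time.
def _luminance(hex_color: str) -> float:
    """Relative luminance per WCAG 2.x for a "#rrggbb" colour."""
    rgb = [int(hex_color.lstrip("#")[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Linearise each sRGB channel before weighting
    lin = [c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
           for c in rgb]
    return 0.2126 * lin[0] + 0.7152 * lin[1] + 0.0722 * lin[2]

def contrast_ratio(fg: str, bg: str) -> float:
    """(L_lighter + 0.05) / (L_darker + 0.05), ranging 1:1 to 21:1."""
    lighter, darker = sorted((_luminance(fg), _luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

def passes_aa_normal_text(fg: str, bg: str) -> bool:
    """WCAG AA threshold for normal-size text is 4.5:1."""
    return contrast_ratio(fg, bg) >= 4.5
```

A check like this can fail a build or flag an AI-generated variant before it ships, rather than relying on manual review afterward.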


module 3

Operationalise governance

This module focuses on making governance executable in day-to-day delivery. The goal is to define clear human accountability, decision rights, revision processes and evidence requirements so teams can innovate without bottlenecks while maintaining safety, accountability and alignment with compliance requirements.

1. Human oversight model

Where and when human oversight is required (including override/escalation rights).

2. Decision & approval criteria

How decisions are made (pass/fail thresholds, required evidence, and documentation standards). By risk tier, define ship/hold/monitor checkpoints, release/rollout controls, and incident-response requirements. High-risk touchpoints require logged sign-off.

3. Ownership & accountability model

Responsibilities and decision authority across required functions (i.e., Design, Product, Engineering, Legal/Risk).

4. Behaviour change

How changes in system behaviour (model, prompt, policy, retrieval) and human behaviour (usage patterns, workarounds, risk signals, decision drift) are monitored, documented, approved and communicated.

5. Escalation triggers

Exception rules that define when the normal flow must stop, when a higher-risk path is activated, and which function owns the escalation.
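The risk-tiered checkpoints above can be sketched as a small release gate. Tier names, evidence items and decision labels are illustrative placeholders, not a prescribed process:

```python
# Sketch of risk-tiered release gating. Tiers, required evidence and
# decision labels are hypothetical placeholders.
GATES = {
    "low":    {"requires_signoff": False,
               "evidence": ["touchpoint_spec"]},
    "medium": {"requires_signoff": False,
               "evidence": ["touchpoint_spec", "eval_results"]},
    "high":   {"requires_signoff": True,
               "evidence": ["touchpoint_spec", "eval_results",
                            "oversight_playbook", "signoff_log"]},
}

def release_decision(tier: str, evidence: set[str], signed_off: bool) -> str:
    """Return "ship" or "hold" for a touchpoint at the given risk tier."""
    gate = GATES[tier]
    missing = [item for item in gate["evidence"] if item not in evidence]
    if missing:
        return "hold"   # evidence pack incomplete
    if gate["requires_signoff"] and not signed_off:
        return "hold"   # high-risk touchpoints need logged sign-off
    return "ship"
```

The useful property is that the hold conditions are explicit and auditable: a release is blocked for a named reason, not a reviewer's intuition.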

module 4

Copilots and no-code tools

A responsible design system must operate in the reality of AI-assisted production. Copilots and no-code tools now generate UI and code, compressing development cycles. In this environment, documentation is necessary but insufficient.


Teams also need a risk-tiered Evidence Pack for high-impact patterns (touchpoint spec, decision rationale, change/version log, disclosure & UX copy, human oversight/escalation playbook, and incident/near-miss record) that travels with the work and is required for release. To keep this scalable, guardrails can’t live only in docs. They need to be built into how work is produced and shipped.


That means translating standards into reusable building blocks and non-negotiable checks (required disclosures, accessibility, traceable records of key AI actions and user controls, and clear no-go patterns for high-risk interactions), plus clear requirements for AI UI elements (attribution, uncertainty, user override) and consistent tracking so teams can monitor drift and catch issues early.

AI can also extend governance across more of the product lifecycle. Policy-aware agents can review implementation quality, flag deviations, support conformance checks, and, in low-risk cases, suggest or auto-correct adoption issues.


A practical model combines global enforcement of user-centered principles and ethical, safety, and compliance constraints with local flexibility in implementation. At the same time, teams must avoid encoding principles so rigidly that AI-assisted outputs become formulaic. Effective governance combines hard safety constraints with flexible guidance that preserves creativity and contextual judgment.

1. Platform-level non-negotiables

E.g., approved AI interaction patterns, mandatory disclosures, telemetry/logging requirements, explicit confirmation for high-stakes actions.

2. Team-level flexibility

E.g., tone adaptation, microcopy variants, contextual nudges, domain-specific implementation choices.
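One way to sketch this split is a config resolver that applies team overrides but refuses to touch platform non-negotiables. The keys and values below are hypothetical examples of each category:

```python
# Sketch: merge team-level settings over platform defaults while
# protecting platform non-negotiables. Keys are hypothetical.
PLATFORM_DEFAULTS = {
    "ai_disclosure": True,        # non-negotiable
    "confirm_high_stakes": True,  # non-negotiable
    "tone": "neutral",            # team-adjustable default
}
NON_NEGOTIABLE = {"ai_disclosure", "confirm_high_stakes"}

def resolve_config(team_overrides: dict) -> dict:
    """Apply team overrides on top of platform defaults; reject any
    attempt to change a non-negotiable setting."""
    config = dict(PLATFORM_DEFAULTS)
    for key, value in team_overrides.items():
        if key in NON_NEGOTIABLE and value != PLATFORM_DEFAULTS[key]:
            raise ValueError(f"cannot override non-negotiable: {key}")
        config[key] = value
    return config

cfg = resolve_config({"tone": "friendly"})  # local flexibility succeeds
```

Attempting `resolve_config({"ai_disclosure": False})` fails loudly, which is the behaviour you want from a hard safety constraint.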

module 5

Testing and monitoring playbook

Continuous research and testing with users helps you design for real-world conditions and anticipate behavioural risks. Pair this with scenario-based evaluations across end-to-end journeys and targeted stress testing (red teaming) of high-risk interactions.

After launch, continuous human oversight and feedback loops make emergent behaviour and risk visible and manageable. Combine telemetry with ongoing user research to detect both model and behaviour drift that metrics alone won’t capture.

Pay special attention to behavioural failure and risk modes that develop over time, such as:

  • over-reliance, bias and accuracy risk (e.g., rising error rates, increases in accepted-wrong outcomes, widening gaps across user groups or contexts)
  • lack of adoption (e.g., trust or usefulness mismatches, poor fit to real workflows)
  • misplaced or transferred authority (e.g., treating output as expert judgment, increasing reliance, low verification)
  • relationship attachment (e.g., anthropomorphism, emotional reliance, oversharing)
  • misuse and weak recovery (e.g., off-label use/retry loops, jailbreaking, silent agent actions, limited undo/appeal pathways, repeat incidents)
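One of these signals, a rising rate of accepted-but-wrong outputs, can be tracked with a simple rolling monitor. The window size and alert threshold below are illustrative assumptions, not recommended values:

```python
from collections import deque

# Sketch of a behavioural drift monitor: flags when the rolling rate of
# accepted-but-wrong outputs exceeds a threshold. Window and threshold
# values are illustrative placeholders.
class AcceptedWrongMonitor:
    def __init__(self, window: int = 100, threshold: float = 0.05):
        self.events = deque(maxlen=window)  # True = accepted a wrong output
        self.threshold = threshold

    def record(self, accepted: bool, correct: bool) -> None:
        self.events.append(accepted and not correct)

    def alert(self) -> bool:
        if not self.events:
            return False
        return sum(self.events) / len(self.events) > self.threshold

mon = AcceptedWrongMonitor(window=10, threshold=0.2)
for _ in range(7):
    mon.record(accepted=True, correct=True)
for _ in range(3):
    mon.record(accepted=True, correct=False)  # users accepting wrong outputs
```

A monitor like this surfaces over-reliance that accuracy metrics alone miss, because it conditions on what users actually accepted.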

In product organizations, design systems can serve as one operational mechanism to make responsible human-AI interaction repeatable, allowing quality, safety and governance to scale with delivery.

Where to start? Assess readiness before you scale

Before scaling AI across products and teams, assess whether your design system and governance can support it safely and consistently.



Start with a Design System + AI Readiness Sprint, led by design systems and human-centered responsible AI practitioners from HCRAI and Atomle.


We’ll assess your system foundations (tokens, components, interaction patterns), documentation, and governance, then deliver a practical gap analysis and a prioritized roadmap to support AI-native workflows and responsible human–AI interaction at scale.

References


Ashfin, P. (2024). Towards Responsible Engineering Software: Ethical, Legal and Social Implications of Automated Design and AI-Driven Tools. Frontiers in Computer Science and Artificial Intelligence, 3(1), 1–14.

Fabricio de Barros, C., & Sandberg, R. (2025). Designing for UX Designers: Creating Sustainable and Usable Design Systems (Dissertation). Retrieved from https://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-245806


Fessenden, T. (2021).
Design systems 101. Nielsen Norman Group


Kimm, G. (2025). Supporting Designers’ Authorship with AI: Design Computing Patterns to Navigate Across Human and Artificial Intelligences (Version 1). Swinburne. https://doi.org/10.25916/sut.28340456.v1


Lee, K. S., Choi, M. & Asni., E.Y. (2025). AI Opportunity Cards: Developing a Toolkit for AI as a Design Material. In Proceedings of the 2025 International Conference on Information Technology for Social Good (GoodIT '25). Association for Computing Machinery, New York, NY, USA, 396–402. https://doi.org/10.1145/3748699.3749817


Lere, H. M., & Bilkisu, H. (2025). AI-driven architectural design: Opportunities and ethical challenges.
ARCN International Journal of Sustainable Development, 14(2), 97–110.  ISSN: 2384-5341

Myllylä, M., Karvonen, A., Koskinen, H. (2024). Design Systems for Intelligent Technology. In: Tareq Ahram, Waldemar Karwowski, Dario Russo and Giuseppe Di Bucchianico (eds) Intelligent Human Systems Integration (IHSI 2024): Integrating People and Intelligent Systems. AHFE (2024) International Conference. AHFE Open Access, vol 119. AHFE International, USA.

http://doi.org/10.54941/ahfe1004490


Okpala, B. (2024). Examining the Impact of Generative AI on UX/UI Design. SSRN. 
http://dx.doi.org/10.2139/ssrn.5312384


Saeidnia, H. R. and Ausloos, M. (2024). Integrating Artificial Intelligence into Design Thinking: A Comprehensive Examination of the Principles and Potentialities of AI for Design Thinking Framework. InfoScience Trends, 1(2), 1-9. doi: 10.61186/ist.202401.01.09


Salem, Al. (2024). Component Constellations: Future Perspectives on Design Systems. [MRP]. OCAD. https://openresearch.ocadu.ca/id/eprint/4188


Speicher, M., & Baena Wehrmann, G. (2022).
One formula to rule them all: The ROI of a design system. Smashing Magazine.


Windarto, Y. (2024). Study of Research Trends and Leveraging AI on User Experience and Interface Design. SSRN. http://dx.doi.org/10.2139/ssrn.5142285


Yu, C., Zheng, P., Peng, T., Xu, X., Vos, S., & Ren, X. (2025). Design meets AI: challenges and opportunities. Journal of Engineering Design, 36(5–6), 637–641. https://doi.org/10.1080/09544828.2025.2484085

Author

Sara Portell
Behavioural Scientist & Responsible AI Advisor
Founder, HCRAI



