The Yes Machine: Sycophantic AI and Its Developmental Risks for Children

Sycophancy in the context of large language models (LLMs) “refers to the propensity of models to excessively agree with or flatter users, often at the expense of factual accuracy or ethical considerations”. It differs from “calibrated empathy”, which we introduced in our previous post as the ability of an AI agent to respond with emotional attunement while remaining grounded in honesty and therapeutic utility. To put the distinction simply: calibrated empathy validates the person; sycophancy validates the claim.

Sycophancy emerges from Reinforcement Learning from Human Feedback (RLHF), a training method that rewards models for maximizing human approval rather than accuracy. Researchers such as Cheng and colleagues address the ELEPHANT in the room in their work on social sycophancy, as distinct from other types such as regressive, progressive and opinion-based sycophancy.

Now, if you’re already thinking about the numerous dangers this behavioural pattern poses for adults, imagine the repercussions for a younger audience. Children in key developmental stages (6–12, 13–17) are actively building self-concept, resilience, and metacognitive skills. They anthropomorphize AI more readily, so sycophantic praise carries more emotional weight, and they lack the critical AI literacy to interrogate or discount AI feedback.

AI systems that praise indiscriminately create an illusory sense of competence decoupled from actual performance. A 2025 controlled study found that LLMs affirm user actions 50% more than humans do, even when those actions are objectively flawed, and users rated these sycophantic responses as “higher quality”. For children aged 6–12, this is particularly harmful, because it is the exact window when children should be transitioning toward more realistic self-appraisal. Blocking this calibration risks producing a fragile ego that collapses under genuine evaluative pressure.

Some LLMs might not have any corrective feedback loops, and this could lead to lower resilience and grit. Corrective feedback is not just pedagogically useful; it is a developmental necessity. Fyfe and colleagues (2022) reviewed 44 empirical studies and found corrective feedback improved children's learning outcomes in 93% of cases. Duckworth and colleagues' (2007) grit research identifies persistence through difficulty as the core mechanism of long-term achievement, a capacity that only develops through productive failure. An AI that smooths all friction removes the very conditions needed for resilience and grit to form.

Adolescents experiencing social anxiety are disproportionately drawn to AI companions for validation, making them the most vulnerable to relational displacement. A 2025 US survey found that 20% of teens aged 13–17 spent as much or more time with AI companions than with real friends. This is developmentally dangerous: peer relationships in adolescence are the primary mechanism for learning conflict resolution, perspective-taking, and identity negotiation, and these are functions AI cannot replicate.

Mueller and Dweck's (1998) landmark study showed experimentally that praising children for intelligence, rather than process, caused them to avoid challenges and perform worse after setbacks. An AI that never questions effort produces the same effect at scale: it signals innate capability rather than developing capability. As we have previously raised in our article about AI in education, EdTech platforms optimizing for engagement systematically bias toward positive sentiment, creating a structural design failure with real consequences for children's cognitive growth.

Design tiered feedback profiles based on age group (e.g., 6–9, 10–12, 13–17) that modulate tone and directness, ensuring even young children receive constructive, not just validating, responses.
Implement "effort + growth" framing: responses should acknowledge what the child did well and suggest one concrete next step, modeled on established pedagogical frameworks like formative assessment.
Audit outputs regularly for sycophantic patterns across age groups using red-teaming prompts that simulate common child inputs (e.g., seeking praise for mediocre work, presenting false beliefs for validation).
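
The first two recommendations above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the age bands come from the examples in the text, while the names `FeedbackProfile`, `profile_for_age` and `effort_growth_feedback`, the tone labels and the directness values are all hypothetical placeholders that a product team would replace with values agreed with child-development practitioners.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedbackProfile:
    """Hypothetical per-age-band settings handed to a response generator."""
    min_age: int
    max_age: int
    tone: str          # illustrative label, not a real model parameter
    directness: float  # 0.0 = very gentle, 1.0 = plain-spoken (assumed scale)


# Age bands follow the examples in the text; the values are placeholders.
PROFILES = [
    FeedbackProfile(6, 9, "warm and simple", 0.3),
    FeedbackProfile(10, 12, "encouraging", 0.5),
    FeedbackProfile(13, 17, "respectful and candid", 0.8),
]


def profile_for_age(age: int) -> FeedbackProfile:
    """Return the profile whose age band contains the given age."""
    for profile in PROFILES:
        if profile.min_age <= age <= profile.max_age:
            return profile
    raise ValueError(f"no feedback profile for age {age}")


def effort_growth_feedback(did_well: str, next_step: str) -> str:
    """Pair one specific acknowledgement with exactly one concrete next step."""
    if not did_well.strip() or not next_step.strip():
        raise ValueError("both an acknowledgement and a next step are required")
    return f"You {did_well}. Next, try to {next_step}."
```

Requiring both arguments in `effort_growth_feedback` is the point of the design: the function refuses to produce praise that is not paired with a next step, so every profile, including the youngest, yields constructive feedback.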

Provide parents with interaction summaries rather than full transcripts by default: periodic reports that surface patterns such as repeated validation-seeking or emotionally dependent exchanges.
Include sycophancy indicators in parental dashboards, which should not be read-only: parents should be able to flag specific patterns directly within the interface, submitting structured reports that feed into a product-level review queue. These reports should be categorised (e.g., "excessive praise," "unchallenged false belief," "emotional dependency signal") and reviewed by a designated team on a defined cadence, closing the loop between parental concern and product accountability.
Offer opt-in "honest mode" controls that parents can activate to increase the calibration of feedback, with a clear explanation of what this means and why it matters developmentally. "Honest mode" could do three things: reduce praise frequency by raising the threshold at which positive reinforcement is generated; introduce corrective responses when the child's work or belief contains a factual or evaluative error; and replace agreement with probing questions, for example substituting "That's a great point!" with "That's an interesting view. What made you think of it that way?"
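
As a rough sketch of how the parental controls above might be represented in code, the Python below models categorised parent flags feeding a review queue, plus an "honest mode" setting with its three behaviours. The three category labels are taken from the text; everything else (the class names, the 0 to 1 quality score, and the threshold values 0.5 and 0.8) is a hypothetical assumption for illustration only.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class FlagCategory(Enum):
    """Report categories named in the text."""
    EXCESSIVE_PRAISE = "excessive praise"
    UNCHALLENGED_FALSE_BELIEF = "unchallenged false belief"
    EMOTIONAL_DEPENDENCY_SIGNAL = "emotional dependency signal"


@dataclass
class ParentFlag:
    """A structured report submitted from the parental dashboard."""
    category: FlagCategory
    note: str


@dataclass
class ReviewQueue:
    """Hypothetical product-level queue read by the designated review team."""
    flags: List[ParentFlag] = field(default_factory=list)

    def submit(self, flag: ParentFlag) -> None:
        self.flags.append(flag)

    def summary(self) -> Dict[str, int]:
        """Count flags per category for the periodic review."""
        return dict(Counter(f.category.value for f in self.flags))


@dataclass(frozen=True)
class HonestModeSettings:
    praise_threshold: float       # minimum quality score (0 to 1) before praising
    correct_errors: bool          # add a corrective response when an error is found
    probe_instead_of_agree: bool  # swap plain agreement for a probing question


# Placeholder values; real thresholds would need empirical calibration.
DEFAULT_MODE = HonestModeSettings(0.5, False, False)
HONEST_MODE = HonestModeSettings(0.8, True, True)


def should_praise(quality_score: float, settings: HonestModeSettings) -> bool:
    """Praise only when the (assumed) quality score clears the threshold."""
    return quality_score >= settings.praise_threshold
```

With these placeholder values, a piece of work scored at 0.6 would be praised in the default mode but not in honest mode, which is the "raised threshold" behaviour the text describes. How the quality score itself is produced is outside the scope of this sketch.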

Cap consecutive agreement sequences: if the model agrees with or praises a child more than a defined number of times in a row, trigger a diversity-of-perspective injection, like a gentle alternative viewpoint or a probing question.
Build in reflective prompts that shift the dynamic from validation-seeking to critical thinking (e.g., "That's an interesting view. What made you think of it that way?"), modeled on Socratic questioning techniques.
Log and flag validation loop patterns at the system level for human review, particularly in mental health or educational contexts where distorted feedback carries the highest developmental risk.
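
The agreement cap and system-level logging described above could look something like the following Python sketch. It assumes an upstream classifier already labels each model reply as agreement/praise or not (the `is_agreement` argument); that classifier, the class name `AgreementCap` and the default limit of 3 are hypothetical. The probing question is the example given in the text.

```python
from dataclasses import dataclass, field
from typing import List

# Reflective prompt quoted from the text above.
PROBE = "That's an interesting view. What made you think of it that way?"


@dataclass
class AgreementCap:
    """Tracks consecutive agreeing replies within one conversation."""
    max_consecutive: int = 3  # placeholder limit, to be set per product
    streak: int = 0
    turn: int = 0
    flagged_turns: List[int] = field(default_factory=list)  # for human review

    def review(self, reply: str, is_agreement: bool) -> str:
        """Return the reply to send, swapping in a probe once the cap is exceeded."""
        self.turn += 1
        if not is_agreement:
            self.streak = 0  # any non-agreeing reply breaks the run
            return reply
        self.streak += 1
        if self.streak > self.max_consecutive:
            self.flagged_turns.append(self.turn)  # log the validation loop
            self.streak = 0
            return PROBE
        return reply
```

For example, with `max_consecutive=2`, the first two agreeing replies pass through unchanged and the third is replaced by the probing question, with turn 3 recorded in `flagged_turns` for later human review.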

Establish a multidisciplinary review board that includes child psychologists, clinical therapists, and educators as members of the product development cycle.
Conduct clinical scenario testing prior to any major model update, using realistic child-use cases developed with practitioner input to assess whether the model's feedback patterns remain developmentally appropriate.
Publish transparency reports detailing the clinical oversight process, the types of sycophancy evaluations conducted, and how child safety considerations are integrated into RLHF reward modeling.

We now have a good grasp of what happens when systems optimized for adult approval are deployed, largely unchecked, in the hands of children. Sycophancy in AI is an alignment problem, but in child-facing applications it is also a developmental one.

Let this be a reminder that children are in the process of building the cognitive and emotional architecture that will carry them through life. When an AI short-circuits that process with unconditional validation, the damage might not be loud or visible, but it is there.

We work with founders, product teams, and builders to assess how AI systems affect children and teenagers in practice. We look closely at interaction patterns, behaviour over time, points where risk accumulates, and the safeguards needed as products evolve.

Our work turns behavioural evidence into practical product decisions: safer interaction design, clearer system boundaries, stronger oversight, and better alignment with emerging regulatory expectations.

If you’re building or deploying AI systems for children or teens and want a clearer view of real-world safety risks, get in touch.

References

Carey, T. A., & Mullan, R. J. (2004). What is Socratic questioning? Psychotherapy, 41(3), 217–226. https://doi.org/10.1037/0033-3204.41.3.217

Cheng, M., Lee, C., Khadpe, P., Yu, S., Han, D., & Jurafsky, D. (2025). Sycophantic AI decreases prosocial intentions and promotes dependence. arXiv. https://doi.org/10.48550/arxiv.2510.01395

Cheng, M., Yu, S., Lee, C., Khadpe, P., Ibrahim, L., & Jurafsky, D. (2025). ELEPHANT: Measuring and understanding social sycophancy in LLMs. arXiv (Cornell University). https://doi.org/10.48550/arxiv.2505.13995

Dai, J., Pan, X., Sun, R., Ji, J., Xu, X., Liu, M., Wang, Y., & Yang, Y. (2023). Safe RLHF: Safe Reinforcement Learning from Human Feedback. arXiv (Cornell University). https://doi.org/10.48550/arxiv.2310.12773

Duckworth, A. L., Peterson, C., Matthews, M. D., & Kelly, D. R. (2007). Grit: Perseverance and passion for long-term goals. Journal of Personality and Social Psychology, 92(6), 1087–1101. https://doi.org/10.1037/0022-3514.92.6.1087

Fyfe, E. R., Borriello, G. A., & Merrick, M. (2022). A developmental perspective on feedback: How corrective feedback influences children’s literacy, mathematics, and problem solving. Educational Psychologist, 58(3), 130–145. https://doi.org/10.1080/00461520.2022.2108426

Harter, S. (2015). The Construction of the Self, second edition: Developmental and Sociocultural Foundations. Guilford Publications.

Jiao, J., Afroogh, S., Chen, K., Murali, A., Atkinson, D., & Dhurandhar, A. (2025). LLMs and childhood safety: Identifying risks and proposing a protection framework for safe child-LLM interaction. arXiv. https://doi.org/10.48550/arxiv.2502.11242

Malmqvist, L. (2025). Sycophancy in Large Language Models: Causes and mitigations. In Lecture notes in networks and systems (pp. 61–74). https://doi.org/10.1007/978-3-031-92611-2_5

Moss, C. M., & Brookhart, S. M. (2019). Advancing formative assessment in every classroom: A Guide for Instructional Leaders. ASCD.

Mueller, C. M., & Dweck, C. S. (1998). Praise for intelligence can undermine children’s motivation and performance. Journal of Personality and Social Psychology, 75(1), 33–52. https://doi.org/10.1037/0022-3514.75.1.33

Neugnot-Cerioli, M. (2026). Adolescents & Anthropomorphic AI: Rethinking Design for Wellbeing. An Evidence-Informed Synthesis for Youth Wellbeing and Safety. arXiv (Cornell University). https://doi.org/10.48550/arxiv.2603.06960

Portell, S. (2026). When AI Enters the Learning Process: Design Failures, Regulatory Risk and Guardrails for EdTech. HCRAI. https://www.hcrai.com/when-ai-enters-the-learning-process-design-failures-regulatory-risk-and-guardrails-for-edtech

Shapira, I., Benade, G., & Procaccia, A. D. (2026). How RLHF amplifies sycophancy. arXiv (Cornell University). https://doi.org/10.48550/arxiv.2602.01002

Spry, L., & Olsson, C. (2025). Teens are increasingly turning to AI companions, and it could be harming them. The Conversation. https://doi.org/10.64628/aa.seteyqwd5
