AI Summaries Are Reshaping Consumer Decisions: The Real Risk Is More Than Misinformation

By TopGPTHub · 17 min read

Research from the University of California, San Diego shows that when summaries appear before the original content, platforms are not just organizing information—they are shaping users’ first impressions and behavioral tendencies.

Key Insights:

  • A study by the University of California, San Diego found that participants showed a 32% higher purchase intention toward the same product after reading an LLM-generated review summary than after reading the original reviews.
  • The real significance of this study is not just high hallucination rates, but that AI summaries reorder human judgment through framing bias and primacy bias.
  • As AI summaries become the default entry point for shopping, search, and enterprise knowledge systems, the focus of competition is no longer just model accuracy, but who controls what users see first, understand first, and then decide whether to act. The first half is research-supported fact; the second half is extended interpretation based on platform product trends.

Many people, upon seeing a “60% hallucination rate,” instinctively categorize it as another familiar AI risk story: the model is not reliable enough, so don’t trust it too much. But this view only captures half the picture. The deeper problem is that the study measured not just factual errors, but the fact that the summary format itself is enough to alter users’ perceptions and preferences toward the same set of information. In other words, this is not just a question of content correctness, but of how the interface preorganizes content, compresses context, and highlights certain signals for you.

01|Clarify the Study Boundaries First to Avoid Overgeneralizing Numbers

This research comes from a team at the University of California, San Diego (UC San Diego), with the paper titled Quantifying Cognitive Bias Induction in LLM-Generated Content, accepted for IJCNLP-AACL 2025. It does not broadly discuss “whether AI is biased,” but attempts to quantify three specific risks: framing bias, primacy bias, and hallucinations stemming from knowledge cutoff limitations. The team also tested 18 mitigation methods, meaning the paper not only identifies problems but also moves toward corrective mechanisms.

Four core figures are publicly verifiable:

  1. LLM-generated content altered the emotional tone of the original context in 26.42% of cases—this is framing bias.
  2. The rate of primacy bias was 10.12%.
  3. When answering questions about new information beyond the knowledge cutoff, models produced hallucinations in an average of 60.33% of cases.
  4. In product review experiments, human purchase intention was 32% higher after reading LLM summaries. The university’s press release phrased it more simply: among 70 participants, 84% of those who saw chatbot summaries stated they would buy the product, compared to 52% of those who read original reviews.

However, two boundaries must be clarified upfront. First, this does not prove “all AI summaries make people more willing to buy,” but that under a specific controlled experimental design, the summary format has the ability to alter judgment. Second, the study measures purchase intention or tendency, not actual conversion, so it cannot be directly extrapolated to “AI summaries definitely boost sales.” The research supports that behavioral tendencies are affected, not that the same effects replicate in every real transaction scenario.

02|The Real Risk Is Not Just Misinformation—but Summaries Prestructure Your Understanding

What truly deserves attention is that the power of summaries often comes not from fabricating facts, but from deciding how you interpret facts first. Most people’s understanding of AI risk still stops at “will it make things up?” This concern is valid, but limiting risk to true/false questions underestimates the role of summary interfaces. In real usage, people rarely verify every piece of information word for word; instead, they form first impressions based on a small amount of condensed, sorted content, then decide whether to invest more time. Google’s official description of AI Overviews is to provide key information and related links first; Amazon’s official description of review highlights is to help consumers quickly grasp common opinions. These product designs do the same thing: reduce your cognitive load first, and leave raw material for later.

The key point is that this can change behavior without obvious fabrication. Suppose 100 original reviews contain both positive and negative feedback. When the model condenses them into “most consumers praise battery life and easy installation, while a few find the price slightly high,” what you receive is no longer 100 conflicting opinions, but a unified narrative impression. This impression may not invent reviews out of thin air, but it has collapsed complex, ambiguous raw data into a direction that facilitates quick decisions. Framing bias, as identified in the study, takes effect during this condensation process.
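
The condensation step above can be sketched numerically. This is a toy illustration, not the study's method: the reviews and the keyword-based "sentiment" scoring are invented for demonstration, to show that a one-line summary preserves only the dominant direction of a mixed opinion distribution.

```python
# Hypothetical reviews and a crude keyword scorer, for illustration only.
reviews = [
    "Battery life is great",                         # positive
    "Easy installation",                             # positive
    "Price is too high",                             # negative
    "Stopped working after a week",                  # negative
    "Battery life is great, but support was slow",   # mixed
]

def naive_sentiment(text: str) -> int:
    """Crude keyword score: clamps to +1 (positive), -1 (negative), 0 (mixed/neutral)."""
    positives = ("great", "easy")
    negatives = ("too high", "stopped working", "slow")
    t = text.lower()
    score = sum(p in t for p in positives) - sum(n in t for n in negatives)
    return max(-1, min(1, score))

scores = [naive_sentiment(r) for r in reviews]
distribution = {s: scores.count(s) for s in (-1, 0, 1)}

# The raw distribution is 2 negative, 1 mixed, 2 positive, but the one-line
# summary keeps only the dominant framing; the negative signal survives
# merely as "a few find the price slightly high."
summary = ("Most consumers praise battery life and easy installation, "
           "while a few find the price slightly high.")
print(distribution)
```

The point of the sketch is that nothing in `summary` is fabricated, yet the 40% negative share of the underlying data is no longer visible to the reader.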

Therefore, the value and risk of the AI summary layer are inherently two sides of the same coin. Platforms can rightly claim this makes users more efficient and provides a smoother experience without reading hundreds of entries. This is true and one reason AI summaries have spread rapidly. The problem is that once users’ attention is captured by default summaries, platforms are no longer just information carriers—they become context shapers. For most people, first impressions are not final decisions, but they are often the starting point of final decisions. This is why the study deserves broad discussion. The first half is product fact; the second half is extended interpretation combining research findings and product design.

03|Fluent Summaries Are Influential Because They Make People Believe They Grasp the Point Faster

A deeper problem is that judgment is often swayed not by how wrong the content is, but by the polished format itself, which reads as more trustworthy. Put simply, the study reminds us of a basic human reality: people are influenced not only by facts, but by “organized facts.” Original reviews are often lengthy, repetitive, tonally inconsistent, and even contradictory. The advantage of AI summaries lies precisely in turning this messy material into smoother, shorter, more conclusive text. This formal advantage makes users more likely to feel “I already get the point,” and that feeling itself can reduce the motivation to check the original data. This paragraph is a reasonable inference based on research findings and real product usage, not a direct conclusion of the paper.

Beyond framing bias, the paper also discusses primacy bias. This is especially notable in real products, as many usage scenarios involve long reviews, documents, or discussion threads rather than single-turn Q&A. When a model overweights early content in summaries, it may not completely ignore later counterexamples or supplements, but those lose equal visibility during condensation. For users, this is not an obvious error, as the summary still appears comprehensive. This shift is less conspicuous than outright fabricating a nonexistent specification, but more likely to seep into real decisions gradually.

More thought-provoking is that the change in purchase intention in the study shows the summary format alone can shift attitudes toward the same information. Combining the 26.42% framing bias rate and 32% purchase intention difference, a reasonable interpretation is: influence may not rest entirely on content correctness, but also on perceived credibility of tone, reduced cognitive cost, and overall impressions formed by reordered information. This statement is extended interpretation and should not be presented as a causal conclusion directly proven by the research.

04|When “Summary-First” Becomes the Default Interface, Platform Competition Shifts to Interpretive Power

This study cannot be treated merely as an academic warning because “summary-first” has already become a mainstream product design direction. More and more products treat AI summaries as the default interface, not an optional feature. Amazon states officially that review highlights provide a short text on product pages summarizing common positive, neutral, and negative consumer opinions to help users quickly judge suitability. Google’s positioning of AI Overviews is consistent: deliver key information faster, then extend via links. These product claims are no secret; they center “reducing user cognitive cost” as core value.

From a business logic perspective, this makes sense. If platforms help users form actionable understanding in less time, conversion rates, dwell time, satisfaction, and search completion rates may all improve. But this is precisely what complicates the problem. The more efficient the summary layer becomes, the more it serves not only users but also the expansion of platform influence. Platforms previously competed mainly on data volume, indexing ability, logistics speed, or algorithmic ranking; now they also compete on who can translate complex information faster into a statement that prompts an immediate next step. This is not a minor product update, but an upgrade of the information-gateway role. The first half is reasonable business inference; the second half is interpretation of product structural trends.

Viewed within a larger industrial structure, this is not unique to e-commerce. Search, maps, travel price comparison, medical information, enterprise knowledge bases, customer service self-service centers, insurance policy explanations, and regulatory interpretation interfaces are all moving in the same direction: give you a summary first, then let you decide whether to read the original. Once this interface becomes mainstream, information advantage no longer lies only with “who has more content,” but increasingly with “who can condense context more effectively for you.” The value of the study is to add a critical reminder to this trend: condensation is not a neutral act—it can alter decisions.

05|The Question Is Not Whether to Use AI Summaries—but Whether Enterprises Can Govern Them

No truly neutral summary exists in the real world; the difference is whether biases are visible, verifiable, and governable. Human editors have positions, review rankings are affected by platform rules, and original reviews may be dominated by emotional language and extreme samples. From this perspective, AI summaries do not create bias out of nowhere, but reorganize biases scattered across large volumes of content into a more readable form. Furthermore, most users cannot sift through raw data line by line for every shopping, search, or work judgment. Summaries exist partly in response to information overload, not creating it. This counterargument aligns with Amazon and Google’s official positioning of summary products.

This counterargument cannot be dismissed lightly, as it reflects product reality. Many AI tools are adopted precisely because people lack time. A system that condenses 100-page documents into one page, organizes dozens of complaints into three common issues, or compresses hundreds of product reviews into a scannable description truly reduces massive cognitive burden. For customer service managers, procurement staff, compliance assistants, researchers, and HR training teams, summaries are not a luxury but a necessary tool to move work forward. This paragraph is an inference about enterprise usage scenarios, not directly measured results of the study.

The real problem is not whether to use summaries, but whether summaries can be governed. Risk arises not from AI itself, but from adoption practices: if systems only provide conclusions without source traceback; if high-risk scenarios treat summaries as directly credible output; or if vendors only talk about efficiency, not bias monitoring, human review, and error correction mechanisms. NIST’s AI Risk Management Framework and its Generative AI Profile prioritize governance, measurement, traceability, human oversight, and contextualized risk—areas enterprises should focus on during adoption.

06|Where Summaries Are Placed in Workflows Determines Whether They Are Efficiency Tools or Risk Amplifiers

For enterprises, the first distinction is not whether the tool is easy to use, but whether the summary aids reading or pre-shapes judgment. Many organizations react to generative AI by finding time-saving opportunities. This direction is valid, but without distinguishing workflows that merely accelerate reading from those that shape decisions, different risk-level tasks will be handled uniformly. This is the greatest danger. This paragraph is a practical inference based on research findings and governance frameworks.

A simple classification divides summary tasks into three layers:

  1. Low-risk navigational summaries: e.g., preliminary product review sorting, internal meeting notes, research lead consolidation. These can be widely used but must retain access to original text.
  2. Medium-risk interpretive summaries: e.g., supplier comparisons, complaint analysis, internal audit anomaly sorting. These require not just summaries, but sampling verification and exception disclosure.
  3. High-risk decision-preparatory summaries: e.g., medical summaries, compliance interpretations, labor assessments, credit data, student evaluations, public policy references. Even with summaries, these require human review, record-keeping, and responsibility assignment embedded in workflows—not just disclaimers on tool interfaces. This three-layer framework is proposed herein, but aligns with NIST and EU AI Act requirements for high-risk AI.
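
The three-layer framework above could be encoded directly in a deployment pipeline, so that each summary feature is routed to a minimum set of controls. The following is a minimal sketch under stated assumptions: the domain keywords, control names, and routing rule are all hypothetical placeholders, not part of the study or any standard.

```python
from enum import Enum

class SummaryRisk(Enum):
    NAVIGATIONAL = 1    # low risk: must retain access to original text
    INTERPRETIVE = 2    # medium risk: sampling checks, exception disclosure
    DECISION_PREP = 3   # high risk: human review, records, accountability

# Hypothetical keyword routing; a real deployment would classify per workflow,
# not per keyword.
HIGH_RISK_DOMAINS = {"medical", "compliance", "credit", "labor", "policy"}
MEDIUM_RISK_DOMAINS = {"supplier", "complaint", "audit"}

def classify(workflow_domain: str) -> SummaryRisk:
    d = workflow_domain.lower()
    if d in HIGH_RISK_DOMAINS:
        return SummaryRisk.DECISION_PREP
    if d in MEDIUM_RISK_DOMAINS:
        return SummaryRisk.INTERPRETIVE
    return SummaryRisk.NAVIGATIONAL

def required_controls(risk: SummaryRisk) -> list[str]:
    """Controls accumulate: higher tiers inherit every lower-tier control."""
    controls = ["link_to_original"]
    if risk.value >= SummaryRisk.INTERPRETIVE.value:
        controls += ["sampling_verification", "exception_disclosure"]
    if risk is SummaryRisk.DECISION_PREP:
        controls += ["human_review", "record_keeping", "accountability_owner"]
    return controls
```

The design choice worth noting is that controls are cumulative: a high-risk summary never loses the low-risk requirement of linking back to the original.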

In practical enterprise applications, at least two specific scenarios stand out. First, procurement and supplier management. If procurement teams use AI summaries to organize bid documents, specification differences, customer feedback, and supplier risks, the real questions are not “whether to summarize,” but whether summaries retain citations, flag inconsistencies, and disclose omitted negative signals. Second, customer service and knowledge management. Many enterprises use models to organize large volumes of customer service records, FAQs, and repair instructions into knowledge base entries. If summaries downplay rare but serious exceptions, frontline staff may gain false confidence from seemingly smooth knowledge cards. These scenarios are practical examples based on governance principles.

07|A Practical Next Step Is to Establish an Actionable Three-Question Checklist

To apply this research in real enterprises, the most valuable outcome is not added anxiety, but translating risk judgment into actionable check questions. AI summaries are dangerous not because they always make big mistakes, but because they often seem “good enough.” The most common failure point in organizational decisions is not absurd errors, but “close-enough correctness” that makes people skip further checks. The first sentence is the author’s judgment; the second is an empirical governance reminder.

I recommend enterprises start with a practical three-question checklist:

  1. Does this summary directly affect whether people take action (purchase, sign, lend, approve, reject, diagnose, promote)? If yes, it is not an ordinary summary.
  2. Can users easily return to the original source? If not, risk rises significantly.
  3. After an error occurs, can the organization trace which layer failed? If it cannot distinguish between model rewriting, data sources, workflow design, or human review failure, the system is unfit for high-risk workflows. These three questions form an actionable framework proposed herein, translating NIST and high-risk AI governance requirements into department-level check items.
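
The three-question checklist above lends itself to a simple gate in an internal review process. This sketch is illustrative only: the record fields and verdict labels are hypothetical names chosen for the example, not terms from the study or from NIST.

```python
from dataclasses import dataclass

@dataclass
class SummaryDeployment:
    """Hypothetical record describing one summary feature in a workflow."""
    drives_action: bool           # Q1: does it gate purchase/approval/diagnosis?
    source_link_available: bool   # Q2: can users easily reach the original?
    failure_traceable: bool       # Q3: can errors be attributed to a layer?

def checklist_verdict(d: SummaryDeployment) -> str:
    # An action-driving summary without traceback or source access fails the gate.
    if d.drives_action and not (d.source_link_available and d.failure_traceable):
        return "unfit_for_high_risk"
    if d.drives_action:
        return "high_risk_with_controls"
    return "ordinary_summary"
```

Run against a feature that approves loans but offers no source link, the gate would return "unfit_for_high_risk" before the tool reaches production.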

Furthermore, vendor evaluation should be upgraded. When purchasing summary AI tools, enterprises should no longer ask only about model brand, speed, and pricing, but dig deeper into:

  • source citation and traceability;
  • the ability to surface inconsistent information;
  • access for sampling verification;
  • operation log export;
  • error-correction feedback into models or rules;
  • mandatory human confirmation for high-risk scenarios.

These questions sound less flashy than product demos, but they better reflect adoption maturity. NIST and EU guidelines for trustworthy AI are pushing enterprises toward this kind of governance-focused procurement.

08|The Study Is Important—but Not a Universal Law for All Scenarios

Clarifying limitations prevents the market from overamplifying an important study into simplistic slogans.

  1. Results are measured under specific experimental design, proving the summary format affects behavioral tendencies—not that all platforms, tasks, or user groups show the same shift magnitude.
  2. Purchase intention differs from actual buying; it is a valuable indicator but not equivalent to real sales.
  3. The 60.33% hallucination rate applies to knowledge-cutoff questions, a specific task risk—not a generalization for all generative AI scenarios.

Additionally, while the paper evaluated 18 mitigation methods, public coverage of their real-world effectiveness remains limited. Claims about “proven effective governance techniques” require deeper analysis of the full paper or follow-up research. This reminds us that a reasonable stance now is not to declare problems solved, but to acknowledge: the value of summary AI is market-proven, its risks are being quantified by research, but the optimal balance between the two is far from settled.

Summary|When AI Summaries Become Gateways, What Needs Governance Is Front-End Interpretive and Control Power

The most important takeaway from this study is not “AI hallucinates,” but “AI-generated summaries can make people more willing to act.” This subtle difference shifts the issue from model quality to interface power. Platforms once mainly focused on retrieving data; now they increasingly present preinterpreted data to users. As interpretive power shifts from original content to the summary layer, front-end summary control becomes critical. This conclusion is extended interpretation based on research and platform product trends.

For decision-makers, the real adjustment is not rejecting AI summaries entirely, but stopping treating them as low-risk minor features. Any summary affecting purchases, approvals, assessments, credit, healthcare, compliance, or knowledge interpretation should be treated as a workflow design issue, not a single model problem. This means part of future organizational competitiveness will come from placing summary tools correctly: heavy use in low-risk scenarios, plus traceback, records, sampling, review, and accountability chains for medium-to-high-risk scenarios—not treating all “summarize this” requests equally. The first sentence is the author’s recommendation; the second aligns with current governance frameworks.

The most valuable ongoing metric to watch is the rate at which users return to original sources after reading AI summaries. This metric is not directly measured in the study, but is the most suitable internal governance indicator for enterprises. If platforms and enterprise systems move toward summary priority while original-content revisit rates decline, first impressions will concentrate increasingly on the summary layer. At that point, organizations must ask directly: is the AI we adopted helping employees understand data faster, or making them give up judgment faster?
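
The revisit-rate metric proposed above could be computed from ordinary interaction logs. A minimal sketch, assuming a hypothetical event log of (session, event) pairs; the event names are invented for the example and would differ in any real analytics schema.

```python
def revisit_rate(events):
    """Share of sessions that saw a summary and later clicked through
    to the original source."""
    saw_summary, revisited = set(), set()
    for session_id, event in events:
        if event == "summary_view":
            saw_summary.add(session_id)
        elif event == "source_click" and session_id in saw_summary:
            revisited.add(session_id)
    return len(revisited) / len(saw_summary) if saw_summary else 0.0

# Hypothetical log: 2 of 3 summary-viewing sessions returned to the source.
log = [
    ("s1", "summary_view"), ("s1", "source_click"),
    ("s2", "summary_view"),
    ("s3", "summary_view"), ("s3", "source_click"),
]
print(revisit_rate(log))
```

A sustained decline in this number would be the operational signal the paragraph above warns about: first impressions concentrating in the summary layer.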

FAQ

Q1|What Risk Does the Study Really Identify?

The study’s core finding is not just that AI summaries may hallucinate, but that AI-generated summaries themselves can alter how humans interpret the same information and their behavioral tendencies. The problem lies not only in “factual errors,” but in “how information is organized, sorted, condensed, and emphasized.” When users see summaries first, not raw material, their first impressions are already influenced.

Q2|Why Can’t This Only Be Understood as “AI Hallucinates”?

Limiting risk to hallucinations narrows the problem excessively. Hallucinations refer to fabricated information, but the study’s more important finding is that even without obvious fabrication, the summary format itself can shift users’ emotional interpretation, priority sorting, and willingness to act. This risk is less detectable than explicit errors and more likely to seep into daily decision-making.

Q3|What Does the 32% Purchase Intention Difference Mean?

It means that under specific controlled experimental conditions, participants showed higher purchase intention toward the same product after reading LLM-generated summaries. This figure proves the summary format can influence behavioral tendencies, not directly that AI summaries boost real-world sales. In short, the study delivers a strong warning: summaries save time and also alter attitudes toward information and decision starting points.

Q4|What Are Framing Bias and Primacy Bias?

Framing bias means reorganizing, condensing, or phrasing the same original information shifts users’ emotional understanding and judgment direction. Primacy bias means people assign higher weight to information appearing first, with later supplements, corrections, or counterexamples less likely to reverse initial impressions. In AI summaries, this means models can alter final user understanding through organization and sorting, even without fabricating content.

Q5|Why Are Amazon and Google Summary Products Notable?

Both are pushing AI summaries to the front of information gateways. Amazon places AI-generated review highlights on product pages; Google positions AI Overviews as a summary interface delivering key information and links first. Platforms no longer only provide raw data—they preinterpret information for users. As summaries become the first layer of access, platforms evolve from information carriers to context shapers.

Q6|What Should Enterprises Ask First When Adopting AI Summaries?

The first question is not ease of use, but whether the summary aids reading or pre-shapes judgment. If summaries directly affect purchases, contracts, credit, approvals, healthcare, compliance, or assessments, they are not low-risk features. Enterprises must verify at minimum: user access to original sources, error traceability, and embedded human review and accountability.

Q7|What Is the Most Practical Reminder for Enterprises?

The practical reminder is not “avoid AI entirely,” but not treating summary tools as low-risk features for all workflows. Instead, identify the risk level of the host workflow first, then decide whether to add original-source traceback, citations, sampling verification, human review, record-keeping, and accountability chains. The focus is not opposing summaries, but governing them.
