Our testing found that the AI tools professionals trust most still miss errors, invent citations, and deliver inconsistent results.
AI-powered document summaries promise to save time, and many professionals now trust them to analyze contracts, reports, and policies with minimal oversight. To test how well that trust holds up, Smallpdf asked four leading AI tools to summarize 40 business documents, each containing a deliberately planted contradiction, and surveyed more than 1,000 professionals about their experiences. The results revealed a significant gap between confidence in AI summaries and the accuracy professionals actually receive.
AI document summaries can appear trustworthy even when they miss important details: 88% of professionals consider them accurate, yet in our tests the AI tools failed to catch a document's planted contradiction 54% of the time on average.
Across all four AI tools tested, 1 in 6 document summaries contained at least one citation to a section or page that doesn't exist.
66% of professionals favor ChatGPT, yet it hallucinated more than any other tool (16% of its citations pointed to something that didn't exist).
71% of professionals act on AI document summaries without checking the original file.
43% have caught AI inventing a number, date, or clause that was never in the document, and of those, 13% only discovered it after they had already acted on it.
AI tools can process large documents in seconds, but speed doesn't always translate into accuracy. Important details are often hidden in footnotes, conflicting clauses, or references that require careful review.





AI has quickly become a routine part of workplace productivity. Across industries, professionals are relying on these tools to summarize, review, and analyze documents every day.

Daily AI use was especially common in several industries. Retail led the way at 64%, followed by information technology at 60%. Manufacturing workers reported daily use at 57%, while finance, along with consulting and professional services, each reached 51%.

Many professionals use AI summaries to save time, especially when reviewing lengthy documents. However, relying on summaries without verifying the source can create significant risks.


These mistakes sometimes led to measurable business consequences. Fourteen percent experienced a missed deadline or project delay after relying on inaccurate AI output. Another 12% had to correct or retract work with a client or external party, while 10% reported a financial error or loss connected to fabricated or inaccurate information.
AI tools have become valuable workplace assistants, helping professionals process documents faster than ever before. Yet the findings suggest that speed and convenience can create a false sense of certainty when important details are overlooked or fabricated.
While some platforms performed better than others, every tool tested missed contradictions, produced hallucinated citations, or delivered inconsistent results under certain conditions. At the same time, most professionals continue to trust AI summaries and frequently act on them without reviewing the original document. As AI becomes more deeply integrated into document workflows, the most productive approach may not be choosing between AI and human review, but combining both to ensure critical decisions are built on accurate information.
We created 40 business documents, including 10 contracts, 10 financial reports, 10 HR policies, and 10 research summaries. Each document contained one deliberately planted error: a clause that contradicted an earlier clause, a math error in a subtotal, a footnote that modified a body figure, or a reference to a section that doesn't exist. We logged every fact in every document in advance (1,082 facts in total) as our answer key.
We then uploaded each PDF to four leading AI tools on their paid tiers: ChatGPT (GPT-5), Gemini 3.1 Pro, Microsoft Copilot (Smart mode), and ChatPDF Plus. Each tool received the same standardized prompt, which asked it to summarize the document and list every fact. That produced 160 baseline tests (4 tools × 40 documents). Each test used a fresh chat session with no follow-up questions. We reran 12 of the tests in separate sessions to measure whether the same tool gives the same answer twice.
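The 4 × 40 test matrix described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the tooling used in the study: the document identifiers (such as `contract_01`) and the variable names are hypothetical, while the tool names and document counts come from the methodology.

```python
from itertools import product

# Tool names as listed in the methodology.
TOOLS = [
    "ChatGPT (GPT-5)",
    "Gemini 3.1 Pro",
    "Microsoft Copilot (Smart mode)",
    "ChatPDF Plus",
]

# Four document types, 10 documents each (identifiers are hypothetical).
DOC_TYPES = ["contract", "financial_report", "hr_policy", "research_summary"]
documents = [f"{doc_type}_{i:02d}" for doc_type in DOC_TYPES for i in range(1, 11)]

# One baseline test per (tool, document) pair, each run in a fresh session.
baseline_tests = list(product(TOOLS, documents))

print(len(documents))       # 40
print(len(baseline_tests))  # 160
```

Pairing every tool with every document keeps the comparison balanced: each tool sees the same 40 files and the same prompt.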
To score the results, we compared every output against the ground-truth fact log. We measured whether each AI caught the planted contradiction, how completely it covered the document's facts, and whether its citations referenced things that actually exist in the source document. The composite grade weighted these three measures at 40%/40%/20%.
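As a rough illustration of how a 40%/40%/20% composite could be computed, here is a minimal Python sketch. It rests on several assumptions that the methodology does not spell out: that the weights apply in the order the measures are listed (contradiction, coverage, citations), that each measure is scaled from 0 to 1, and that a summary with no citations receives full citation marks. The names `SummaryResult` and `composite_grade` are hypothetical.

```python
from dataclasses import dataclass

# Assumed mapping: weights applied in the order the measures are listed.
WEIGHTS = {"contradiction": 0.40, "coverage": 0.40, "citations": 0.20}


@dataclass
class SummaryResult:
    caught_contradiction: bool  # did the tool flag the planted error?
    facts_covered: int          # facts from the answer key found in the output
    facts_total: int            # facts logged for the document
    citations_valid: int        # citations that point to something real
    citations_total: int        # all citations in the output


def composite_grade(r: SummaryResult) -> float:
    """Return a 0-100 grade from the three weighted measures."""
    contradiction = 1.0 if r.caught_contradiction else 0.0
    coverage = r.facts_covered / r.facts_total if r.facts_total else 0.0
    # Assumption: no citations at all means nothing was hallucinated.
    citations = r.citations_valid / r.citations_total if r.citations_total else 1.0
    score = (
        WEIGHTS["contradiction"] * contradiction
        + WEIGHTS["coverage"] * coverage
        + WEIGHTS["citations"] * citations
    )
    return round(100 * score, 1)


# Made-up example: caught the error, covered 20 of 25 facts, 4 of 5 citations valid.
example = SummaryResult(True, 20, 25, 4, 5)
print(composite_grade(example))  # 88.0
```

Under these assumptions, missing the planted contradiction alone caps a summary at 60 out of 100, however complete its coverage.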
We also surveyed 1,005 working professionals about how they use AI to summarize, draft, review, and analyze documents, and how often that AI output turns out to be inaccurate. Participants spanned a range of job levels, from individual contributors (56%) and managers (29%) to senior leaders including directors, vice presidents, C-suite executives, and company owners (14%). The sample spanned generations, including Gen Z (16%), millennials (55%), Gen X (25%), and baby boomers (4%). Where percentages in this methodology do not total 100%, the difference is due to rounding. The data was collected in May 2026.
Smallpdf helps professionals, students, and businesses work more efficiently with digital documents. From converting and compressing PDFs to editing, organizing, and eSigning files, Smallpdf provides simple tools that streamline document workflows and reduce friction. As AI becomes more common in document management, Smallpdf helps users maintain accuracy, security, and control throughout the process.
The information and findings presented in this article may be used for noncommercial purposes only. If you share or reference this content, please provide proper attribution and include a link back to Smallpdf.