Perplexity AI


Overall: 71 · MIXED — USE WITH CAUTION
Fast research, shaky sourcing. Perplexity impressed in Phase 1, then started failing its own citations in Phase 2. A tool built on a premise it can't always deliver.

Phase 1 · First Contact: 84
Phase 2 · Stress Test: 54
Phase 3 · Real Work: 73
Phase 4 · Verdict: 71

Category: AI Research & Search
Pricing: Free tier + Pro ($20/mo)
Testing Period: 14 days — March 2026
Platform: Web, iOS, Chrome extension
01 · First Contact

The onboarding experience is genuinely good. Fast, frictionless, no lengthy setup wizard. I was running queries within two minutes of landing. The interface is clean and the Copilot mode stood out immediately — it asked clarifying questions before diving in, which is behaviour most research tools skip entirely. Initial impressions were high enough that I went into Phase 2 expecting it to hold.

Phase 1 Evidence Log · 4 entries
Day 1 · SUCCESS
Onboarding was fast and frictionless. No account required to start querying. First search returned results in under three seconds with inline citations visible. Zero setup friction.
Day 2 · SUCCESS
Copilot mode prompted me to clarify a vague research question before answering. Asked whether I wanted recent news or foundational background. This is the right behaviour — most tools just answer and move on.
Day 3 · OBSERVATION
Source attribution UI creates a strong sense of credibility. Numbered citations next to claims, clickable through to source. The visual design implies verification. I noticed I was trusting the output more because it looked cited — before I had checked whether the citations were accurate.
Day 4 · OBSERVATION
Tested the Chrome extension, which adds Perplexity answers alongside Google results. Marginal value: useful when the Google results are thin, but not a workflow changer. I ignored it for the rest of the test.
02 · Stress Test

This is where it came apart. The citation UI that looked so trustworthy in Phase 1 turned out to be decorative in too many cases. I ran a structured series of verifiable-fact queries across topics I could cross-check independently. The failure rate was higher than I expected from a tool whose entire value proposition is sourced answers. The Phase 2 score of 54 is not a rounding error — it reflects genuine problems with the core promise.

SOURCING WARNING
Perplexity regularly cites sources it has not fully accessed. Paywalled articles, discontinued pages, and summaries derived from secondary sources are all presented with the same citation UI as fully-verified content. The interface does not distinguish between these. Treat every citation as unverified until you have opened and read the source yourself.
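A rough script can automate the first pass of that verification: fetch each cited URL and test whether the attributed text is actually retrievable from it. The sketch below is my own illustration, not Perplexity tooling. The URL and claim are placeholders modelled on the Day 5 entry, a naive substring match will miss paraphrased claims, and some paywalls return HTTP 200 with only preview text, so a FOUND result is a prompt to read the source, not proof.

    # First-pass citation check: fetch each cited URL and test whether the
    # attributed text is present in what the page actually returns. Catches
    # dead links and many paywalls; it cannot catch paraphrased claims.
    import requests

    def spot_check(citations: dict[str, str]) -> None:
        """citations maps a cited URL to the exact phrase attributed to it."""
        for url, claim in citations.items():
            try:
                resp = requests.get(url, timeout=10)
            except requests.RequestException as exc:
                print(f"UNREACHABLE  {url}  ({exc})")
                continue
            if resp.status_code != 200:
                # Paywalled or removed pages often surface here as 401/403/404.
                print(f"HTTP {resp.status_code}  {url}")
                continue
            found = claim.lower() in resp.text.lower()
            print(f"{'FOUND' if found else 'NOT FOUND'}  {url}")

    # Hypothetical entry modelled on the Day 5 failure below.
    spot_check({
        "https://example.org/dcms-report": "cannot identify sponsored content",
    })

The Day 5 and Day 6 entries below are exactly the failure modes a check like this surfaces.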
Phase 2 Evidence Log · 6 entries
Day 5 · FAILURE
Query on UK digital media regulation returned a statistic attributed to a DCMS report: "34% of UK adults cannot identify sponsored content online." The DCMS report cited does not contain this figure. The number appears to be a hallucination attached to a real document reference.
Day 6 · FAILURE
Cross-checked 8 citations from a single response on AI regulation. 3 of the 8 sources were behind paywalls Perplexity could not have accessed. The summaries presented as sourced content were fabricated from the article title and visible preview text only.
Day 7 · FAILURE
Asked a follow-up question that directly contradicted a claim in the previous answer. Perplexity agreed with the new framing and argued against its own prior position — without flagging the contradiction or acknowledging the reversal. No consistency checking between turns.
Day 8 · OBSERVATION
Intentionally submitted a vague, under-specified query to test default behaviour. Rather than asking for clarification (as Copilot mode had done in Phase 1), the standard interface returned a confident, detailed answer. The confidence of the output is not calibrated to the quality of the input.
Day 9 · FAILURE
Query on a specific SaaS tool returned a feature list that included capabilities from a version discontinued 18 months prior. The product page had since been updated; Perplexity's answer was drawn from older indexed content and presented as current without any recency caveat.
Day 10 · INSIGHT
Multi-step reasoning queries revealed the core limitation: Perplexity is retrieval with a summarisation layer. It finds and stitches. It does not reason across sources to synthesise a new position. Queries requiring actual inference — not just aggregation — consistently returned thin or evasive answers.
03 · Real Work Integration

I ran Perplexity inside actual work for the second half of the test — daily research briefings, competitor analysis, background prep for calls. The results were nuanced. For orientation tasks it genuinely saved time. For anything where accuracy at the fact level mattered, I was spending that saved time doing verification anyway. The habit-formation risk is real: the tool makes it easy to feel done when you are not.

Phase 3 Evidence Log · 4 entries
Day 11 · SUCCESS
Used as a daily research briefing tool across a week. Saved an estimated 12–15 minutes per day on initial topic orientation. Required 2–3 corrections per session when I spot-checked against primary sources. Net time saving still positive for low-stakes orientation work.
Day 12 · OBSERVATION
Competitor analysis query returned a mixed result: recent funding rounds accurate, product feature descriptions drawn from 6–9 month old content. No recency markers on individual claims, only a general "sources from the last 30 days" label on the response. Recency claims are at the response level, not the claim level.
Day 13 · SUCCESS
Follow-up questioning within a thread is genuinely strong. Perplexity maintains context across a conversation and handles progressive refinement well. Asked five follow-up questions on a single topic without losing thread. This is where the tool earns its place.
Day 14 · FAILURE
Caught myself accepting a Perplexity answer on a topic I know well. The answer was wrong on a specific detail I would normally have questioned. The citation UI had reduced my verification instinct. Habit formation risk: the tool trains you to trust before you verify.
04 · Verdict

Perplexity earns a place in a research workflow if the scope is orientation, not conclusion. It is fast and the conversational thread handling is genuinely good. But the citation layer is partially theatre — it signals rigour without consistently delivering it. Use it to find the right questions. Do not use it to find the right answers.

WORKS WELL
  • Fast orientation on unfamiliar topics
  • Copilot mode asks clarifying questions
  • Thread-based follow-up questioning is strong
  • Clean interface with low friction entry
  • Daily briefing use case: net time positive
FAILS CONSISTENTLY
  • Cites paywalled sources it has not accessed
  • Hallucinated statistics attached to real documents
  • No consistency checking between conversation turns
  • Confident output on under-specified input
  • Outdated content presented without recency caveat
  • Multi-step reasoning is retrieval in disguise
FINAL VERDICT
Perplexity earns a place in a research workflow if you treat every output as a hypothesis, not a finding. It is genuinely fast and the conversational follow-up is strong. The citation UI will train you to trust before you verify if you let it. Score 71: useful tool, wrong mental model. Use it for orientation. Use something else for conclusions.
Friction Analysis
1 = no friction, 5 = high friction
                   Setup     Daily Use  Edge Cases  Accuracy Check  Team Use
Onboarding         1 Low     1 Low      2 Low       3 Medium        2 Low
Output Quality     2 Low     3 Medium   5 High      5 High          4 High
Speed              1 Low     1 Low      1 Low       2 Low           1 Low
Verification Need  3 Medium  4 High     5 High      5 High          5 High

Legend: Low friction (1–2) · Medium friction (3) · High friction (4–5)
TOOL SPECS
Pricing: Free tier available. Pro plan $20/month. API access billed separately.
Model: Proprietary search + summarisation. Pro allows switching to Claude, GPT-4o, Sonar.
Web Access: Yes — live web access is the core feature. Real-time retrieval on all queries.
API Access: Yes — Sonar API available for developers. Separate pricing from consumer plans (see the sketch below).
Mobile App: iOS and Android apps available. Feature-equivalent to web for standard use.
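For developers, the same discipline applies at the API level. A minimal sketch, assuming the OpenAI-compatible chat completions endpoint, the "sonar" model name, and the top-level citations field that Perplexity documented during the testing period; verify all three against the current API docs before relying on this.

    # Query the Sonar API and print the citation URLs separately from the
    # answer text, so each source can be opened and read before the answer
    # is used. PERPLEXITY_API_KEY must be set in the environment.
    import os
    import requests

    resp = requests.post(
        "https://api.perplexity.ai/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['PERPLEXITY_API_KEY']}"},
        json={
            "model": "sonar",
            "messages": [
                {"role": "user", "content": "Summarise current UK digital media regulation."},
            ],
        },
        timeout=30,
    )
    resp.raise_for_status()
    data = resp.json()
    print(data["choices"][0]["message"]["content"])

    # Citations arrive as a flat list of URLs, not tied to individual claims,
    # which mirrors the claim-level recency problem logged on Day 12.
    for url in data.get("citations", []):
        print("verify before use:", url)

Keeping the citation list separate from the answer text makes the Phase 2 lesson operational: the answer and its sources are different artefacts, and only the sources are checkable.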
