Perplexity Sonar
by Perplexity AI
Web-grounded Sonar models delivering cited, real-time API answers
About
Perplexity Sonar is the in-house family of web-grounded AI search models that power the Perplexity API Platform and much of the Perplexity product experience. Rather than operating as a static, offline language model, Sonar couples large language model generation with search and retrieval over the live web, so every response can be grounded in current online content and accompanied by inline citations. This design makes it suitable for applications where factual accuracy, source transparency, and up-to-date information are critical. Within the Sonar family, Perplexity offers multiple specialized variants tuned for different workloads. The base Sonar model is a general-purpose grounded search model optimized for fast, cited answers over web content. Higher tiers such as Sonar Pro and Sonar Reasoning Pro add greater reasoning depth and more advanced multi-step search capabilities for complex research and analysis tasks, while Sonar Deep Research is designed for long-context, agentic research workflows that autonomously explore, read, and synthesize large volumes of web pages. All of these models can call Perplexity’s web_search tool and related search APIs to retrieve relevant documents and then synthesize answers that reference those sources. Developers access Sonar through the Perplexity API Platform using standard HTTP APIs. The platform exposes Sonar through agent-style interfaces, chat-style completions, and dedicated Search APIs, so it can be embedded into chatbots, research assistants, data copilots, or any product that needs real-time answers grounded in web data. Because the models return citations and are designed around a retrieval-augmented pipeline, they are well suited for use cases like market and competitive research, technical troubleshooting with current documentation, or knowledge tools that must reflect the latest changes on the web without custom indexing. What makes Sonar distinctive is its focus on being web-native and grounded by default, with clear, transparent pricing at the API level and optional tools like the Search API for raw web results. The combination of in-house LLMs, live web search, agentic research behaviors in the higher-end Sonar variants, and production-oriented APIs means teams can build applications that behave more like a research analyst than a static chatbot. By offloading the search, retrieval, and synthesis stack to Perplexity’s infrastructure, developers can focus on product logic and UX while still giving end users fast, cited answers that trace back to actual web pages.
What you can do with it
- Build a customer-facing research assistant that answers questions with live web citations embedded in a SaaS product
- Automate competitive intelligence reports by querying Sonar Deep Research on markets, companies, and technologies
- Power a developer documentation copilot that searches the web and official docs to troubleshoot errors with cited fixes
- Integrate Sonar into an internal knowledge tool that synthesizes recent news and industry updates for executives
- Create a browser or IDE extension that surfaces grounded, real-time answers and source links directly in the workflow
Pricing
Sonar — $1.00/M input tokens, $1.00/M output tokens, plus per-request web_search fees when used Sonar Pro — $3.00/M input tokens, $15.00/M output tokens, plus per-request web_search fees when used Sonar Reasoning Pro — $2.00/M input tokens, $8.00/M output tokens, plus per-request web_search fees when used Sonar Deep Research — $2.00/M input tokens, $8.00/M output tokens, optimized for long-context research Search API — $5.00 per 1,000 requests, no additional token costs web_search tool — $0.005 per invocation, added as a request fee on top of model token costs for Sonar family calls that use web search
How to access
Perplexity Sonar is accessed through the Perplexity API Platform via HTTPS endpoints for Agent API, Chat Completions, and Search, using API keys created in a Perplexity account with open self-serve signup and higher-volume or enterprise access available through Perplexity sales; it is primarily integrated into web backends, serverless functions, and custom applications, and also powers parts of the Perplexity web and mobile experiences.
Access is via the Perplexity API Platform with API keys created in a Perplexity account, so you must sign up or log in with an email or SSO identity and generate keys in the dashboard; Sonar models are available through HTTPS API calls (Agent API, Chat Completions, and Search endpoints) and can be integrated into web, backend, and mobile apps; there is a free tier/credited usage to get started and paid usage is billed based on token and request pricing, with higher-volume or enterprise tiers available through Perplexity sales.
Tips for getting the best results
To use Sonar effectively, first sign up for the Perplexity API Platform, create an API key, and decide which Sonar variant (Sonar, Sonar Pro, Sonar Reasoning Pro, or Sonar Deep Research) matches your latency-versus-depth needs. Start by sending clear, well-scoped natural language queries and ensure you enable web_search or the Search API when you need fresh, grounded answers, since turning off search will reduce Sonar to more standard LLM behavior. For complex research tasks, structure prompts to specify goals, constraints, and desired citation formats (for example, asking for bullet summaries with 3–5 citations) so that the agent can plan multi-step research; Deep Research and Reasoning Pro benefit from prompts that encourage multi-hop reasoning and explicit verification. In integration, cache or store structured outputs (like extracted facts or URLs) rather than re-calling Sonar for identical queries to manage costs, and monitor token usage plus web_search invocation counts because total cost is token fees plus per-request search fees. Many new users are tripped up by not handling partial or long responses—implement streaming or pagination where possible and design your UI to surface citations clearly so users can click through to verify sources.
Known limitations
Because Sonar relies on live web content, its answers are constrained by crawl coverage, site accessibility, and the quality or bias of available sources; it may miss information behind paywalls, in walled gardens, or on sites that block crawlers. Like other LLM-based systems, it can still hallucinate or misinterpret sources, especially on niche topics or where web results are sparse or contradictory, so citations must be treated as starting points rather than definitive proof. Costs can grow with heavy usage because pricing combines token-based billing with per-request web_search fees, and repeated broad queries without caching or scoping can become expensive. The API is focused on text and web-grounded search; while some Sonar variants support longer context and more complex reasoning, they are not a drop-in replacement for specialized multimodal or structured-knowledge systems. Finally, developers must implement their own guardrails, content filtering, and compliance controls, as Sonar does not automatically enforce domain-specific policies or legal requirements in downstream applications.
Model / Technology
RAG pipeline over live web index with Perplexity Sonar LLM family
Commercial use
Perplexity positions the API Platform, including Sonar models, for production and commercial use, allowing developers and organizations to integrate it into commercial products as long as they comply with Perplexity’s API terms of service and acceptable use policies; outputs can generally be used commercially without attribution requirements to individual web sources, but developers remain responsible for respecting external site terms and any downstream IP or data-use obligations, and enterprise customers may operate under bespoke agreements.
Training data
Public documentation indicates that Perplexity’s in-house Sonar models are trained using a mixture of licensed data, publicly available web content, and proprietary curation, and then augmented at inference time with retrieval over a live web index; like other web-trained LLMs, this implies large-scale web scraping and aggregation, and while no major Sonar-specific controversies are widely reported, general industry debates about the use of web content, copyright, and data consent apply to Perplexity’s models as well; for API users, the key point is that Sonar’s answers are grounded in current crawled pages rather than a fixed static corpus.