Cognition • agent
Cloud-based autonomous software engineer aimed at real engineering teams, multi-repo work and delegated ticket execution.
| Context Window | Input Price per 1M tokens | Output Price per 1M tokens | Parameters |
|---|---|---|---|
| N/A | N/A | N/A | N/A |
Devin results on the main AI model evaluation benchmarks. Higher scores indicate better performance.
| Benchmark | Score | Maximum |
|---|---|---|
| SWEN Agent Composite | 91.6 | 100.0 |
| SWEN Agent Autonomy | 93.0 | 100.0 |
| SWEN Agent Integration | 90.0 | 100.0 |
| SWEN Agent Reliability | 88.0 | 100.0 |
| SWEN Agent Tool Use | 92.0 | 100.0 |
| SWEN Agent Value | 79.0 | 100.0 |

Methodology (all rows): SWEN Agent Registry v2026-06-22. Editorial multimodal ranking with modality-specific scoring based on product capability, control, speed, value and integration readiness.
Devin is an AI model developed by Cognition, classified as an agent model. It focuses on text processing and natural language generation. As a proprietary model, it is available via Cognition's cloud API.
Devin does not have public per-token pricing available at this time. Some models offer access via enterprise plans or research programs. Check Cognition's official website for up-to-date availability and pricing.
Devin was evaluated on 6 benchmarks: one composite score plus five categories (Autonomy, Integration, Reliability, Tool Use, and Value). Scores range from 79.0 (Value) to 93.0 (Autonomy), with a composite of 91.6 out of 100.
Benchmarks measure specific aspects and don't capture the full user experience. Factors like instruction adherence, behavior in long conversations, and real-world task quality vary significantly between models and aren't always reflected in standard scores.
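As a quick sanity check on how the listed scores relate to one another, the sketch below summarizes the numbers from the benchmark results on this page. It assumes the five category scores are the inputs to the composite, which the page does not state; the variable names are illustrative only.

```python
# Scores copied from the benchmark results on this page.
scores = {
    "Composite": 91.6,
    "Autonomy": 93.0,
    "Integration": 90.0,
    "Reliability": 88.0,
    "Tool Use": 92.0,
    "Value": 79.0,
}

# Assumption: everything except the composite is a category sub-score.
# The page does not say how the composite is actually derived.
sub_scores = {name: s for name, s in scores.items() if name != "Composite"}

lowest = min(sub_scores, key=sub_scores.get)
highest = max(sub_scores, key=sub_scores.get)
unweighted_mean = sum(sub_scores.values()) / len(sub_scores)

print(f"lowest category:  {lowest} ({sub_scores[lowest]})")
print(f"highest category: {highest} ({sub_scores[highest]})")
print(f"unweighted mean of categories: {unweighted_mean:.1f}")
print(f"published composite:           {scores['Composite']}")
```

The unweighted mean of the five categories (88.4) differs from the published composite (91.6), so the composite does not appear to be a simple average; the registry's weighting is not given on this page.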
Devin specializes in agent workloads: autonomous software engineering, multi-repo work and delegated ticket execution.
In the 2026 AI model ecosystem, Devin competes directly with similarly capable models. Cognition competes in this segment against OpenAI, Anthropic, Google, and Meta. The choice between models depends on the specific use case, budget, latency requirements, and need for features like multimodality and tool calling.
For a detailed side-by-side comparison, use our comparison tool or check the overall model ranking.
Devin does not have public per-token pricing available at this time. Check Cognition's official website for up-to-date information.
In available benchmarks, Devin scored 91.6/100 on SWEN Agent Composite, 93/100 on SWEN Agent Autonomy, and 90/100 on SWEN Agent Integration. See the benchmark results above for the remaining scores.
Devin is not open source; it is a proprietary model from Cognition, available via cloud API. For open source alternatives, check our open source model ranking.
Devin excels at general-purpose language tasks. It supports tool calling for API integrations and automation.
Last updated: June 22, 2026