Contact Together AI

What would you like to do?

Contact Sales

For inquiries about our products and solutions, connect with the Sales Team.

Help Center

Check out our constantly expanding knowledge base and get expert help from our Support Team.

Subscribe to newsletter

© 2025 San Francisco, CA 94114