Inference pricing
Over 100 leading open-source Chat, Language, Image, Code, and Embedding models are available through the Together Inference API. For these models you pay just for what you use.
Serverless Endpoints
Prices are per 1 million tokens, covering both input and output tokens for Chat, Language, and Code models; input tokens only for Embedding models; and based on image size and number of steps for Image models. Special promotional pricing applies to Llama-2 and CodeLlama models.
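As an illustration, per-token billing reduces to simple arithmetic. This sketch uses the $0.20 rate for the 4.1B - 8B tier from the table below; the actual rate depends on the model you call.

```python
# Estimate the cost of one request under per-token pricing.
# For Chat, Language, and Code models, input and output tokens
# are billed together at the same per-1M-token rate.

PRICE_PER_MILLION = 0.20  # USD per 1M tokens (4.1B - 8B tier, from the table below)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request, counting input + output tokens."""
    total_tokens = input_tokens + output_tokens
    return total_tokens / 1_000_000 * PRICE_PER_MILLION

# e.g. a 1,500-token prompt with a 500-token reply (2,000 tokens total):
print(f"${request_cost(1500, 500):.6f}")
```

For Embedding models the same arithmetic applies, but only input tokens are counted.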
Chat, language, and code models

Model size      Price per 1M tokens
Up to 4B        $0.10
4.1B - 8B       $0.20
8.1B - 21B      $0.30
21.1B - 41B     $0.80
41B - 70B       $0.90
Mixture-of-experts

Model size                        Price per 1M tokens
Up to 56B total parameters        $0.60
56.1B - 176B total parameters     $1.20
176.1B - 480B total parameters    $2.40
Embedding models

Model size      Price per 1M tokens
Up to 150M      $0.008
151M - 350M     $0.016
Image models

Image size    25 steps    50 steps    75 steps    100 steps
512x512       $0.001      $0.002      $0.0035     $0.005
1024x1024     $0.01       $0.02       $0.035      $0.05
Genomic models

Model size    Price per 1M tokens
4.1B - 8B     $2.00
Dedicated instances
When hosting your own model, you pay hourly for the GPU instances, whether it is a model you fine-tuned with Together Fine-tuning or any other model you choose to host. You can start or stop your instance at any time through the web-based Playground or via the start/stop instance APIs.
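Since dedicated instances bill only for the hours they run, estimating a month of hosting is one multiplication. The $1.40/hr rate here is the 1x L40 48GB tier from the table below.

```python
# Sketch of hourly billing for a dedicated instance: you pay only
# for the hours the instance is running, at the tier's hourly rate.

HOURLY_RATE = 1.40  # USD per hour, 1x L40 48GB tier (from the table below)

def hosting_cost(hours_running: float, rate: float = HOURLY_RATE) -> float:
    """Total USD for the hours the instance was actually running."""
    return hours_running * rate

# Running 8 hours a day over a 30-day month (240 hours):
print(f"${hosting_cost(8 * 30):.2f}")
```

Stopping the instance outside working hours, as in this example, is what makes the hourly model cheaper than always-on hosting.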
Your fine-tuned models
Hardware type           Price per hour    Model size
Fractional L40 48GB     $0.70             Up to 4B
1x L40 48GB             $1.40             4.1B - 21B
2x L40 48GB             $2.80             21.1B - 41B
2x A100 80GB            $6.17             41.1B - 70B, 8x7B MoE
Interested in a dedicated instance for your own model?
Fine-tuning pricing
Pricing for fine-tuning is based on model size, dataset size, and the number of epochs.
Download checkpoints and final model weights.
View job status and logs through CLI or Playgrounds.
Deploy a model instantly once it’s fine-tuned.
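The three factors above combine multiplicatively: total training tokens are dataset tokens times epochs. This is only a sketch of how the cost scales; the per-token rate below is a placeholder, not a published price, so use the interactive calculator for real figures.

```python
# Hedged sketch of fine-tuning cost scaling with dataset size and epochs.
# The rate is a PLACEHOLDER assumption (this page does not list
# fine-tuning rates); model size would determine the real rate.

PLACEHOLDER_RATE_PER_MILLION = 1.00  # USD per 1M training tokens (hypothetical)

def finetune_cost(dataset_tokens: int, epochs: int,
                  rate_per_million: float = PLACEHOLDER_RATE_PER_MILLION) -> float:
    """Total training tokens = dataset tokens x epochs, billed per 1M."""
    total_tokens = dataset_tokens * epochs
    return total_tokens / 1_000_000 * rate_per_million

# e.g. a 10M-token dataset trained for 3 epochs at the placeholder rate:
print(f"${finetune_cost(10_000_000, 3):.2f}")
```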
Try the interactive calculator
Together GPU Clusters Pricing
Together Compute provides private, state-of-the-art clusters with H100 and A100 GPUs, connected over fast 200 Gbps non-blocking Ethernet or InfiniBand networks at up to 3.2 Tbps.
Hardware types available    Networking
A100 PCIe 80GB              200 Gbps non-blocking Ethernet
A100 SXM 80GB               200 Gbps non-blocking Ethernet or 1.6 Tbps InfiniBand configs available
H100 80GB                   3.2 Tbps InfiniBand