SambaNova Cloud

SambaNova Cloud is built for very fast inference on open-source models such as Llama 3.1 405B and Qwen 2.5, running on the purpose-built SN40L Reconfigurable Dataflow Unit (RDU). New users receive a one-time $5 free credit.

Overview

Provider Type: Trial Credits
API Endpoint: https://api.sambanova.ai/v1
Free Tier Highlights: Varies by model

Why Choose SambaNova Cloud?

SambaNova Cloud serves open-source models on its own RDU hardware and can be called through the OpenAI SDK, so existing OpenAI-style code can be pointed at it by changing the base URL and API key. Setup takes three steps: create an account, generate a key, and configure the client.

Quick Start Guide

1. Create Account: Sign up at cloud.sambanova.ai to get your $5 free credit.
2. Get API Key: Navigate to the API Keys section in the dashboard.
3. Configure: Use the OpenAI SDK with base_url='https://api.sambanova.ai/v1' and your API key.

Available Models

Model Name                ID                             Context (tokens)   Capabilities
Llama 3.3 70B Instruct    Meta-Llama-3.3-70B-Instruct    128,000            -
Llama 3.1 405B Instruct   Meta-Llama-3.1-405B-Instruct   8,000              -
Llama 3.1 70B Instruct    Meta-Llama-3.1-70B-Instruct    128,000            -
Llama 3.1 8B Instruct     Meta-Llama-3.1-8B-Instruct     128,000            -
Qwen 2.5 Coder 32B        Qwen2.5-Coder-32B-Instruct     32,000             -
Qwen 2.5 72B Instruct     Qwen2.5-72B-Instruct           32,000             -
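
Because the context window differs so much between models, it can help to filter model IDs by how many tokens a request needs. The following is a minimal sketch, not part of any SambaNova tooling: the helper name `models_that_fit` is hypothetical, the window sizes are copied from the table above, and it assumes that prompt and output tokens share one context window.

```python
# Context windows (in tokens) copied from the model table above.
CONTEXT_WINDOWS = {
    "Meta-Llama-3.3-70B-Instruct": 128_000,
    "Meta-Llama-3.1-405B-Instruct": 8_000,
    "Meta-Llama-3.1-70B-Instruct": 128_000,
    "Meta-Llama-3.1-8B-Instruct": 128_000,
    "Qwen2.5-Coder-32B-Instruct": 32_000,
    "Qwen2.5-72B-Instruct": 32_000,
}


def models_that_fit(prompt_tokens: int, max_output_tokens: int = 0) -> list:
    """Return model IDs whose listed window covers prompt plus output tokens.

    Hypothetical helper; assumes prompt and output share one context window.
    """
    needed = prompt_tokens + max_output_tokens
    return [model for model, window in CONTEXT_WINDOWS.items() if window >= needed]
```

For example, a 10,000-token prompt rules out Meta-Llama-3.1-405B-Instruct, whose listed window is 8,000 tokens, while the other five models remain candidates.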

Integration Examples

Ready-to-use code snippets for your applications.

main.py
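import os

import openai

# Read the API key from the environment rather than hard-coding it.
# SAMBANOVA_API_KEY is an assumed variable name; the placeholder is used if it is unset.
client = openai.OpenAI(
    api_key=os.environ.get("SAMBANOVA_API_KEY", "YOUR_SAMBANOVA_API_KEY"),
    base_url="https://api.sambanova.ai/v1",
)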

response = client.chat.completions.create(
    model="Meta-Llama-3.3-70B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in one sentence."}
    ],
    temperature=0.1,
    top_p=0.1
)

print(response.choices[0].message.content)
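
For environments without the OpenAI SDK, the same call can be sketched with the Python standard library. This assumes the endpoint accepts the OpenAI chat completions wire format (a POST to `/chat/completions` under the base URL with a Bearer token), which is what the SDK sends when configured as above. The function name `build_chat_request` is hypothetical, and the sketch only builds the request so it can be inspected before anything is sent.

```python
import json
import urllib.request

BASE_URL = "https://api.sambanova.ai/v1"


def build_chat_request(api_key: str, model: str, user_message: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completions request.

    Hypothetical helper; assumes the OpenAI-compatible request format.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Under the same assumption, the reply is JSON with the generated text at `choices[0].message.content`, matching what the SDK example prints.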

Free Tier Pricing & Limits

Rate Limit (requests per minute): Varies by model
Daily Quota (requests per day): Dependent on credits
Token Limit: ~30,000,000 Llama 8B tokens with the $5 credit (a total budget implied by the credit, not a per-minute rate)
Monthly Quota (per-month limit): $5 free credit for new users (one-time, not renewed monthly)
Free Credits (one-time): $5
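
The figures above allow a rough budget calculation. The sketch below uses only the two numbers from this listing ($5 of credit and roughly 30,000,000 Llama 8B tokens); the implied per-token rate is derived from them and is an estimate, not published pricing. Both function names are hypothetical, and larger models will use up the credit faster.

```python
FREE_CREDIT_USD = 5.0
# "~30 000 000 Llama 8B Tokens (w/ $5 credit)" as stated in the listing.
ESTIMATED_TOKENS_FOR_CREDIT = 30_000_000


def implied_usd_per_million_tokens() -> float:
    """Blended rate implied by the listing's estimate (not official pricing)."""
    return FREE_CREDIT_USD / (ESTIMATED_TOKENS_FOR_CREDIT / 1_000_000)


def requests_within_credit(tokens_per_request: int) -> int:
    """Estimate how many Llama 8B requests of a given total size fit in the credit."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return ESTIMATED_TOKENS_FOR_CREDIT // tokens_per_request
```

Under these assumptions the implied rate is about $0.17 per million tokens, and requests averaging 2,000 tokens each (prompt plus completion) would fit about 15,000 times into the free credit.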

Use Cases

Real-time Complex Reasoning

Agentic Workflows

Live Coding Assistants

High-Volume Processing

Financial Modeling

Limitations & Considerations

Free credits are one-time for new users

Context window varies by model (8k - 128k)

Rate limits apply to free tier
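
Since rate limits apply on the free tier, client code usually needs to retry throttled requests. The listing does not document how throttling is signalled, so the sketch below is a generic exponential backoff wrapper rather than SambaNova-specific behaviour. The name `call_with_backoff` and its parameters are hypothetical, and the exception type to retry on is supplied by the caller.

```python
import time
from typing import Callable, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    retry_on: Type[BaseException],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on retry_on with delays of base_delay * 2**attempt.

    Hypothetical helper: re-raises the last error once max_attempts is reached.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("unreachable")
```

With the OpenAI SDK, one option is to pass `openai.RateLimitError` as `retry_on` and wrap the `client.chat.completions.create(...)` call in a lambda. The `sleep` parameter exists so the delays can be replaced in tests.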
