AI / inference
A powerful foundation to deploy, test and fine-tune your models
- Deploy low-latency inference to power production assistants
- Test and fine-tune your models (Llama, Mistral, audio, vision)
Access on-demand GPUs for intensive rendering, simulation, and AI workloads while keeping full control over your costs.
Every scenario shares the same promise: deploy sovereign GPUs, keep control of your costs and ship faster.
A powerful foundation to deploy, test and fine-tune your models
An express GPU pipeline for studios and creators.
Raw power to simulate and explore.
Deploy your pipelines and push your workloads to production
Case study · Gladia x Shadow GPU
See how a modular GPU strategy unlocked real-time audio inference while keeping spend flat.
Real-world benchmarks of our GPU configurations on production AI models.
| AI Model | GPU | Time to First Token | Avg Throughput | Peak Throughput |
|---|---|---|---|---|
| Llama 3.2 (3B) | RTX A4500 x4 | from 0.56 s | ~510 tok/s | 550 tok/s |
| Llama 3.2 (3B) | RTX 2000 Ada x4 | from 0.91 s | ~320 tok/s | 410 tok/s |
| Mistral Small 3.2 (24B) | RTX A4500 x4 | from 0.86 s | ~120 tok/s | 160 tok/s |
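As a rough sanity check, end-to-end response time can be estimated from these two numbers: time to first token, plus the number of generated tokens divided by throughput. A minimal sketch, with the benchmark figures above hard-coded for illustration:

```python
def estimated_latency(ttft_s: float, throughput_tok_s: float, n_tokens: int) -> float:
    """Rough end-to-end latency: time to first token, then steady-state decoding."""
    return ttft_s + n_tokens / throughput_tok_s

# Llama 3.2 (3B) on RTX A4500 x4: 0.56 s TTFT, ~510 tok/s average throughput
latency = estimated_latency(0.56, 510, 512)  # a 512-token completion
print(f"~{latency:.2f} s for 512 tokens")  # ~1.56 s
```

Real latency also depends on batch size and prompt length, so treat this as a back-of-the-envelope estimate, not a guarantee.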
Three pillars to guarantee performance, flexibility and total cost control.
Build your GPU stack exactly as you envision it.
Plug into your existing pipelines in 5 minutes.
Every euro spent is tracked, justified and optimizable.
Total flexibility, controlled budget and sovereign GPU infrastructure. Select the model suited to your workload, from testing to production.
Instant
Pay only for what you consume, with no commitment. Ideal for one-off needs and quick tests.
Predictable
Fixed and predictable monthly budget. Perfect for regular use with controlled costs.
Enterprise
Fully customized solution. Designed for organizations with specific and critical needs.
Compare billing models and estimate your costs based on actual usage.
Choose the configuration tailored to your AI and 3D rendering needs.
Latest-generation Ada Lovelace architecture, delivering 27.7 TFLOPS of RT Core performance and 191.9 TFLOPS of Tensor performance, double that of the previous generation
starting at €0.29/h (approximately €220/month)
46.2 TFLOPS of RT Core performance and 189.2 TFLOPS of Tensor performance, which scale further by parallelizing up to 8 cards within a single instance
starting at €0.35/h (approximately €250/month)
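The monthly figures above follow directly from the hourly rates; a quick sketch of the arithmetic, assuming a ~730-hour month of continuous use (an assumption, not a billing rule):

```python
HOURS_PER_MONTH = 730  # average month: 24 h x 365 days / 12

def monthly_cost(hourly_rate_eur: float) -> float:
    """Estimated cost of running one instance continuously for a month."""
    return hourly_rate_eur * HOURS_PER_MONTH

print(f"~€{monthly_cost(0.29):.0f}/month")  # €0.29/h -> ~€212, in line with the quoted ~€220
print(f"~€{monthly_cost(0.35):.0f}/month")  # €0.35/h -> ~€255, in line with the quoted ~€250
```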
Choose a Spot, On Demand or Dedicated model to align costs, availability and governance with your challenges.
Performance at best price
Low-cost instances for workloads that tolerate interruptions.
Preemptible based on availability
Flexibility and continuity
Guaranteed instances that you can activate on demand for your active projects.
Once allocated, availability is assured
Permanently guaranteed capacity
Reserved and isolated capacity, ideal for production and critical environments.
For the entire reservation period
💡 We continuously innovate to give technical teams a head start and create new touchpoints with our community.
We're making AI model deployment even simpler. Soon, you'll be able to upload your private models or use public models hosted by Cloud GPU, and only be billed for usage via a simple endpoint.
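To illustrate what such a usage-billed endpoint could look like, here is a minimal sketch in Python. The URL, header names and payload fields are all hypothetical, since the feature is not yet released; nothing is actually sent over the network here:

```python
import json

# Hypothetical endpoint and token -- real values will come with the release.
ENDPOINT = "https://inference.example.com/v1/models/my-private-model/generate"
API_TOKEN = "YOUR_API_TOKEN"

def build_request(prompt: str, max_tokens: int = 256) -> dict:
    """Assemble the HTTP request we would send to the managed endpoint."""
    return {
        "url": ENDPOINT,
        "headers": {
            "Authorization": f"Bearer {API_TOKEN}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"prompt": prompt, "max_tokens": max_tokens}),
    }

req = build_request("Summarize this meeting transcript:")
print(req["url"])
```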
Everything you need to know about instance limits, billing and support from our experts.
The limit can be revised after several regular billing cycles. Contact our Sales team for quick validation and to avoid service interruption.
Two billing modes are available: pay-as-you-go (Instant) and a fixed monthly plan (Predictable).
Our Cloud and GPU experts support you in sizing your infrastructure and choosing the configuration best suited to your needs. Fill out the contact form and we'll get back to you quickly.
Join the teams who chose performance, transparency and sovereignty.
⚡ 24h activation • 🔒 Secure data • 🇪🇺 Sovereign infrastructure
🚀 French pioneer and cloud technology leader since 2015
A proven infrastructure powering the most ambitious projects worldwide.
+15 000
GPUs in our fleet available during business hours
14
Countries covered (EU, US, CA)
100%
Enterprise grade security
API
OpenStack / K8s standard
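Because the platform exposes standard OpenStack and Kubernetes APIs, GPU capacity can be requested the usual Kubernetes way, via the standard `nvidia.com/gpu` extended resource. A minimal sketch that assembles such a pod manifest as a plain Python dict; the pod name and container image are illustrative, not platform defaults:

```python
def gpu_pod_spec(name: str, image: str, gpus: int = 1) -> dict:
    """Build a Kubernetes Pod manifest requesting NVIDIA GPUs via the
    standard 'nvidia.com/gpu' extended resource (needs the device plugin)."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": name,
                "image": image,
                "resources": {"limits": {"nvidia.com/gpu": gpus}},
            }],
            "restartPolicy": "Never",
        },
    }

# Example: a 4-GPU inference pod (image name is illustrative)
pod = gpu_pod_spec("llama-inference", "vllm/vllm-openai:latest", gpus=4)
print(pod["spec"]["containers"][0]["resources"]["limits"])
```

The same manifest can be serialized to YAML and applied with `kubectl apply -f`, or submitted through any Kubernetes client library.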