
On-demand GPU instances

Access on-demand GPUs for intensive rendering, simulation, and AI workloads while keeping full control over your costs.

Available Datacenters 🇪🇺 Europe: Strasbourg, Dunkirk, Frankfurt | 🇺🇸 North America: Washington, Portland, Montreal

Real-world use cases for your teams.

Each scenario delivers on the same promise: deploy sovereign GPUs, keep control of your costs and deliver faster.

AI / Inference

A powerful foundation to deploy, test and fine-tune your models

  • Deploy low-latency inference to power production assistants.
  • Test and fine-tune your models (Llama, Mistral, audio, vision), as sketched below.
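
As a rough illustration of the inference side, here is a minimal sketch that loads a small open-weight model on a GPU instance. It assumes an instance with CUDA drivers plus PyTorch and the transformers library installed; the model name and prompt are purely illustrative.

```python
# Minimal sketch: run a small open-weight LLM on a GPU instance.
# Assumes CUDA drivers plus `torch` and `transformers` are installed;
# the model name and prompt are illustrative, not a recommendation.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="mistralai/Mistral-7B-Instruct-v0.3",   # any Hugging Face causal LM
    torch_dtype=torch.float16,                    # half precision to fit in VRAM
    device=0 if torch.cuda.is_available() else -1,
)

out = generator(
    "Summarise the benefits of on-demand GPUs in one sentence.",
    max_new_tokens=64,
)
print(out[0]["generated_text"])
```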

Video & 3D Rendering

An express GPU pipeline for studios and creators.

  • Reduce time-to-render with a shared CUDA pool.
  • Run your Blender, Unreal or Houdini batches in parallel (a Blender example is sketched below).
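
A minimal sketch of that batch idea, assuming Blender is installed on the instance: frames are fanned out across a handful of worker processes using Blender's standard -b (background) and -f (frame) flags. The scene path, frame range and worker count are placeholders.

```python
# Sketch: fan the frames of a Blender scene out across worker processes.
# Assumes Blender is installed on the instance; paths and ranges are placeholders.
import subprocess
from concurrent.futures import ThreadPoolExecutor

SCENE = "/data/scenes/shot_010.blend"   # hypothetical scene file
FRAMES = range(1, 101)                  # frames 1..100

def render(frame: int) -> int:
    # -b = background (headless), -f = render a single frame
    cmd = ["blender", "-b", SCENE, "-f", str(frame)]
    return subprocess.run(cmd, check=False).returncode

with ThreadPoolExecutor(max_workers=4) as pool:   # e.g. one worker per GPU
    results = list(pool.map(render, FRAMES))

print(f"{results.count(0)}/{len(FRAMES)} frames rendered successfully")
```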

Scientific Computing

Raw power to simulate and explore.

  • Accelerate your parallel workloads in Python, R or C++ (see the sketch after this list).
  • Process large datasets without moving sensitive data.
  • Track each run with auditable metrics.
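
As one illustration of the Python path, the sketch below runs a Monte Carlo estimate of pi entirely on the GPU with CuPy. CuPy is only one option (Numba, JAX or plain CUDA C++ work equally well), and the sample size is arbitrary.

```python
# Sketch: a NumPy-style workload moved onto the GPU with CuPy.
import cupy as cp

n = 10_000_000                       # arbitrary sample size
x = cp.random.random(n)              # draws stay in GPU memory
y = cp.random.random(n)
inside = cp.count_nonzero(x * x + y * y <= 1.0)
pi_estimate = 4.0 * inside / n       # classic Monte Carlo estimate of pi
print(float(pi_estimate))
```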

Production-ready GPUs, CI/CD compatible

Deploy your pipelines and push your workloads to production

  • Add GPU runners to Jenkins, GitLab or GitHub Actions, natively or via Kubernetes
  • Automate tests and benchmarks, and package your workloads (a smoke-test sketch follows below)
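
For instance, a GPU runner could execute a short Python smoke test like this one on every pipeline run. It assumes PyTorch is available on the runner; the matrix size and the 5-second threshold are arbitrary placeholders rather than recommended values.

```python
# Sketch: a smoke test for a GPU runner in Jenkins, GitLab CI or GitHub Actions.
# Exits non-zero if no CUDA device is visible or a tiny matmul benchmark is too slow.
import sys
import time

import torch

if not torch.cuda.is_available():
    sys.exit("No CUDA device visible to the runner")

a = torch.randn(4096, 4096, device="cuda")
b = torch.randn(4096, 4096, device="cuda")
torch.cuda.synchronize()

start = time.perf_counter()
for _ in range(10):
    a @ b
torch.cuda.synchronize()
elapsed = time.perf_counter() - start

print(f"10 matmuls in {elapsed:.2f}s on {torch.cuda.get_device_name(0)}")
sys.exit(0 if elapsed < 5.0 else 1)   # placeholder regression threshold
```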

Case study · Gladia x Shadow GPU

Discover the Gladia case study: +20% performance, zero extra cost.

See how a modular GPU strategy unlocked real-time audio inference while keeping spend flat.


🚀 French pioneer and cloud technology leader since 2015

A proven infrastructure powering the most ambitious projects worldwide.

GPU Performance

Real-world benchmarks of our GPU configurations on production AI models.

AI Model                  GPU                Time to First Token   Avg Throughput   Peak Throughput
Llama 3.2 (3B)            RTX A4500 x4       from 0.56 s           ~510 tok/s       550 tok/s
Llama 3.2 (3B)            RTX 2000 Ada x4    from 0.91 s           ~320 tok/s       410 tok/s
Mistral Small 3.2 (24B)   RTX A4500 x4       from 0.86 s           ~120 tok/s       160 tok/s
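
To reproduce comparable measurements on your own instance, the sketch below times time-to-first-token and an approximate throughput using the transformers streaming API. The model name is illustrative (Llama 3.2 is a gated download, and device_map="auto" requires the accelerate package); results vary with drivers, prompt length and batch size.

```python
# Sketch: measure time-to-first-token (TTFT) and approximate throughput.
import time
from threading import Thread

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

name = "meta-llama/Llama-3.2-3B-Instruct"   # illustrative; any causal LM works
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.float16, device_map="auto"   # needs `accelerate`
)

inputs = tok("Explain GPU preemption in two sentences.", return_tensors="pt").to(model.device)
streamer = TextIteratorStreamer(tok, skip_prompt=True)

start = time.perf_counter()
Thread(
    target=model.generate,
    kwargs=dict(**inputs, max_new_tokens=256, streamer=streamer),
).start()

ttft, chunks = None, 0
for _ in streamer:                 # each chunk is roughly one decoded token
    chunks += 1
    if ttft is None:
        ttft = time.perf_counter() - start

total = time.perf_counter() - start
print(f"TTFT {ttft:.2f}s, ~{chunks / total:.0f} chunks/s over {chunks} chunks")
```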

GPU infrastructure that adapts to your pace

Three pillars to guarantee performance, flexibility and total cost control.

Configurable Power at Will

Build your GPU stack exactly as you envision it.

GPU: NVIDIA RTX 2000 Ada · RTX A4500
Resources: RAM · Storage · OS of choice
Scaling: Scale up/down frictionlessly

Native Integration with Your Stack

Plug into your existing pipelines in 5 minutes.

API: OpenStack · Industry standard
IaC: Terraform · Ansible · Pulumi
Orchestration: Kubernetes ready
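
As a concrete example of that integration, the sketch below provisions a GPU instance through an OpenStack-compatible API using the openstacksdk Python client. The cloud name, image, flavor and network are hypothetical placeholders; Terraform, Ansible or Pulumi can drive the same API declaratively.

```python
# Sketch: provision a GPU instance via an OpenStack-compatible API.
# Cloud, image, flavor and network names below are hypothetical placeholders.
import openstack

conn = openstack.connect(cloud="my-gpu-cloud")         # credentials from clouds.yaml

image = conn.compute.find_image("ubuntu-22.04")
flavor = conn.compute.find_flavor("gpu.rtx2000ada.1")  # hypothetical flavor name
network = conn.network.find_network("private")

server = conn.compute.create_server(
    name="render-node-01",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
)
server = conn.compute.wait_for_server(server)
print(server.status, server.id)
```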

Absolute Financial Transparency

Every euro spent is tracked, justified and optimizable.

Model: Pay-as-you-go, per minute
Monitoring: Real-time cost dashboards
Flexibility: Zero commitment · Stop anytime

Choose the billing model that fits you

Total flexibility, controlled budget and sovereign GPU infrastructure. Select the model suited to your workload, from testing to production.

Instant

Pay as you go

Pay only what you consume, no commitment. Ideal for one-off needs and quick tests.

Pay-as-you-go.
  • Instant start with no commitment
  • Ultra-fine per-minute billing
  • Standard weekday support

Predictable

Monthly plan

Fixed and predictable monthly budget. Perfect for regular use with controlled costs.

Fixed monthly billing
  • Guaranteed 24/7 access to your instances
  • Fixed and predictable monthly budget
  • Success Manager on demand

Enterprise

Custom offer

Fully customized solution. Designed for organizations with specific and critical needs.

Customized terms.
  • Quotas and SLAs negotiated to match your activity
  • Custom integrations (SSO, reporting, connectors)
  • Dedicated Success Manager

Cost simulator

Compare billing models and estimate your costs based on actual usage.

Pick a billing model (pay-as-you-go or monthly plan), a configuration (e.g. 1 × RTX 2000 Ada), a rate type (e.g. Spot) and your daily usage in hours; the simulator then estimates your hourly, daily, weekly (5-day) and fixed monthly costs.
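
The underlying arithmetic is simple enough to reproduce yourself. The sketch below compares pay-as-you-go with a fixed monthly plan, using the illustrative €0.29/h and €220/month starting prices listed in the configurations below; actual rates depend on your configuration and rate type.

```python
# Sketch: the arithmetic behind the cost simulator (illustrative rates only).
HOURLY_RATE = 0.29      # € per hour, pay-as-you-go (RTX 2000 Ada starting price)
MONTHLY_PLAN = 220.0    # € per month, fixed plan (approximate)
HOURS_PER_DAY = 8       # example daily usage
DAYS_PER_WEEK = 5

daily = HOURLY_RATE * HOURS_PER_DAY
weekly = daily * DAYS_PER_WEEK
monthly_payg = weekly * 4.33        # average weeks per month

print(f"Pay-as-you-go: {daily:.2f} €/day, {weekly:.2f} €/week, {monthly_payg:.2f} €/month")
print(f"Fixed plan:    {MONTHLY_PLAN:.2f} €/month")
print("Cheaper at this usage:", "pay-as-you-go" if monthly_payg < MONTHLY_PLAN else "monthly plan")
```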

Available GPU configurations

Choose the configuration tailored to your AI and 3D rendering needs.

RTX 2000 Ada GPU Instance

Latest-generation Ada Lovelace architecture, delivering 27.7 TFLOPS of RT Core performance and 191.9 TFLOPS of Tensor performance, double that of the previous generation

starting at €0.29/h (approximately €220/month)

  • Handle models and datasets without saturation, whether for AI inference on LLMs with a few billion parameters or for 3D rendering
  • Boost your AI capabilities by accelerating your inference tasks for image creation or NLP model processing

RTX A4500 GPU Instance

46.2 TFLOPS of RT Core performance and 189.2 TFLOPS of Tensor performance, multiplied by parallelizing up to 8 cards within a single instance

starting at €0.35/h (approximately €250/month)

  • Enjoy an ideal power/cost balance for your demanding AI tasks and complex 3D renders without investing in expensive DGX-type workstations
  • Run your AI inference workloads on large datasets, such as multilingual NLP or real-time audio TTS/STT processing, and fine-tune large pre-trained models
  • Achieve advanced 3D rendering and visualization, such as architectural ray-traced rendering, with 20 GB of VRAM to load detailed scenes

An offer for every need, from testing to production.

Choose a Spot, On Demand or Dedicated model to align costs, availability and governance with your challenges.

Spot

Performance at best price

Economical instances for interruption-tolerant workloads.

Not guaranteed

Preemptible based on availability

Use cases:

  • R&D and experimentation
  • Batch processing
  • CI/CD
  • One-off calculations

On Demand

Flexibility and continuity

Guaranteed instances, activatable on demand for your active projects.

Guaranteed

Once allocated, availability assured

Use cases:

  • 3D and video rendering
  • AI inference
  • Development
  • Interactive workloads

Dedicated 24/7

Permanently guaranteed capacity

Reserved and isolated capacity, ideal for production and critical environments.

Complete guarantee

For the entire reservation period

Use cases:

  • Model training
  • Production AI pipelines
  • Permanent workloads
  • Critical environments

The future of the Cloud GPU offering

💡 We continuously innovate to give technical teams a head start and create new touchpoints with our community.

Coming soon: Inference as a service

We're making AI model deployment even simpler. Soon, you'll be able to upload your private models or use public models hosted by Cloud GPU, and be billed only for usage via a simple endpoint.

Frequently asked questions

Everything you need to know about instance limits, billing and support from our experts.

When can the instance limit be increased?

The limit can be revised after several regular billing cycles. Contact our Sales team for quick validation and to avoid service interruption.

How am I billed?

Two billing modes are available:

  • Pay-as-you-go: per-minute billing, with a detailed summary of consumed resources.
  • Monthly plan: a fixed rate each month, with additional billing if the planned quota is exceeded.

How can I get expert support?

Our Cloud and GPU experts support you in sizing your infrastructure and choosing the configuration best suited to your needs. Fill out the contact form and we'll get back to you quickly.

Ready to deploy your GPUs?

Join the teams who chose performance, transparency and sovereignty.

⚡ 24h activation • 🔒 Secure data • 🇪🇺 Sovereign infrastructure