How scaling works

The AI runs on your machine.
Your hardware sets the ceiling.

Studio Copilot has no cloud dependency. Every embedding, cluster, and recommendation runs locally. A basic laptop gets you started — a GPU-equipped machine like the NVIDIA GB10 gets you real-time results on a full shoot.

Why local-first?

Most AI tools for creatives work by uploading your files to a server, running inference in the cloud, and returning results. That model works — but it means your client's photos sit on someone else's infrastructure, you pay per image, and you need a connection to work.

Studio Copilot runs the entire pipeline — embedding, clustering, similarity search, recommendations — on hardware you control. The tradeoff is that what your machine can do determines what you get. That's the point. Better hardware means better results, not a higher subscription tier.

Your photos never leave your machine

Client images are often under NDA. Running AI locally means no upload to third-party servers, no third-party retention policy to worry about, and no breach surface beyond your own machine.

Works offline, always

On location after a shoot, on a flight between events — the pipeline doesn't need a connection. It runs on your hardware, on your schedule.

No per-inference cost

Cloud AI charges per image or per API call. Local inference has no per-image fee after setup. A 10,000-photo shoot costs the same in fees as a 10-photo test: nothing. It just takes longer to run.

What actually changes as you scale?

Scaling up hardware doesn't change the interface or the workflow. It changes three things: how fast the pipeline runs, how powerful the model is, and how large a shoot it can handle in one pass.

Processing time

Every photo gets converted into an embedding vector, a numerical fingerprint. More CPU/GPU cores means more photos per second. On a 500-photo shoot, Power is roughly 20× faster than Base (about 25 seconds versus about 8 minutes).
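To make the "numerical fingerprint" idea concrete, here is a minimal sketch of how two embeddings can be compared. It assumes cosine similarity as the comparison measure, which is a common choice but not confirmed as what Studio Copilot uses. The function names and the three-dimensional toy vectors are hypothetical; real embeddings have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a zero vector has no direction")
    return dot / (norm_a * norm_b)

def most_similar(query, index):
    """Return the key in `index` whose vector points most nearly the same way as `query`."""
    return max(index, key=lambda name: cosine_similarity(query, index[name]))

# Hypothetical toy index: file name -> embedding (illustrative values only).
toy_index = {
    "portrait_001": [0.9, 0.1, 0.0],
    "portrait_002": [0.8, 0.2, 0.1],
    "dance_floor_117": [0.0, 0.3, 0.9],
}
```

Calling `most_similar([1.0, 0.0, 0.0], toy_index)` picks the portrait whose vector is best aligned with the query. Doing this for every photo in a shoot is the part that benefits from more cores.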

Model depth

Larger models (ViT-L, ViT-H) have more parameters and tend to understand scene context better, catching that the 'best' portrait has the right light, not just the sharpest focus. They need more memory to run, and the largest need a GPU.

Index size

More RAM means the system can keep a larger embedding index in memory — essential when you're comparing 2,000 frames from a multi-day shoot all at once.

Real-world example: 500-photo wedding shoot

Base (laptop CPU): ~8 min

Pro (fast CPU, 16 GB RAM): ~2 min

Power (NVIDIA GB10 GPU): ~25 sec
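The arithmetic behind these figures can be checked in a few lines. The sketch below uses only the three timings quoted for the 500-photo shoot. The helper names are hypothetical, and `estimate_seconds` assumes processing time grows linearly with photo count, which the page does not state.

```python
# Timings quoted for the 500-photo wedding shoot, in seconds.
PHOTOS = 500
TIER_SECONDS = {"Base": 8 * 60, "Pro": 2 * 60, "Power": 25}

def photos_per_second(seconds, photos=PHOTOS):
    """Average throughput for a run of `photos` images that took `seconds`."""
    return photos / seconds

def speedup(slow_tier, fast_tier, timings=TIER_SECONDS):
    """How many times faster `fast_tier` is than `slow_tier` on the same shoot."""
    return timings[slow_tier] / timings[fast_tier]

def estimate_seconds(tier, photos, timings=TIER_SECONDS):
    """Rough estimate for another shoot size (assumes linear scaling)."""
    return timings[tier] * photos / PHOTOS

for tier, seconds in TIER_SECONDS.items():
    print(f"{tier}: {photos_per_second(seconds):.1f} photos/sec")
print(f"Power vs Base: {speedup('Base', 'Power'):.1f}x")
```

The Power tier works out to 20 photos per second, and the Base-to-Power ratio is 19.2, which is where the "roughly 20×" figure comes from.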

Hardware tiers

Tier 01

Base

Any laptop or desktop

The full workflow runs on whatever you already own. Sort a shoot, manage paperwork, track reviews — no extra setup.

Unlocks: Full workflow, CPU inference, up to ~300 photos/shoot

Speed: Good

Model: ViT-B/32

500 photos: ~8 min

Tier 02

Pro

Modern CPU, 16 GB+ RAM

Faster indexing means you can process mid-size shoots during a coffee break. Deeper similarity clusters surface better selects.

Unlocks: Larger shoots, richer grouping, stronger recommendations

Speed: Fast

Model: ViT-L/14

500 photos: ~2 min

Tier 03

Power

Dedicated GPU (e.g. NVIDIA GB10)

GPU acceleration makes the pipeline feel instant. A 500-photo wedding shoot processes in under 30 seconds. This is how we ran it at GTC.

Unlocks: Real-time processing, largest models, full shoot in seconds

Speed: Real-time

Model: ViT-H/14 or custom

500 photos: ~25 sec

Tier 04

Hosted

Shared cloud infrastructure

Optional layer for teams that need shared project state and remote access. Only worth adding once local value is clear.

Unlocks: Multi-user, remote access, shared projects

Speed: Variable

Model: Configurable

500 photos: Scales with budget
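The three local tiers can be read as a simple lookup from hardware to model. The sketch below is an illustration of that mapping, not Studio Copilot's actual detection logic: the `Machine` class and `pick_tier` function are hypothetical, the 16 GB threshold comes from the Pro tier description, and "modern CPU" is ignored because it is not defined precisely enough to test. The Hosted tier is left out because it is an optional shared layer, not a local hardware level.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Machine:
    """Hypothetical description of the local hardware."""
    ram_gb: float
    has_dedicated_gpu: bool = False

# Model names as listed for each local tier.
TIER_MODELS = {"Base": "ViT-B/32", "Pro": "ViT-L/14", "Power": "ViT-H/14"}

def pick_tier(machine):
    """Map a machine to the highest local tier it qualifies for (simplified)."""
    if machine.has_dedicated_gpu:
        return "Power"
    if machine.ram_gb >= 16:
        return "Pro"
    return "Base"

laptop = Machine(ram_gb=8)
tier = pick_tier(laptop)
print(f"{tier}: {TIER_MODELS[tier]}")
```

Checking the GPU first reflects the ordering of the tiers: a dedicated GPU qualifies for Power regardless of how the RAM compares with the Pro threshold.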

Works on any machine. Faster with more hardware.

What stays the same

Same workflow at every tier.

Sorting, paperwork, review, and delivery work identically whether you're on a MacBook or an NVIDIA workstation. You never need to change how you work to get more power — just plug in the hardware and the pipeline gets faster automatically.

What changes

Speed, depth, and model size.

Better hardware means faster scans, higher-quality similarity matching, and the ability to load larger models that understand lighting, composition, and scene context at a deeper level. The GB10 we used at GTC processed a 500-photo shoot in under 30 seconds.

Try it

Running on a laptop right now. Upgrade when you're ready.

