6 GPU and ML
06_gpu_and_ml
Let’s get to what we’re all here for: GPUs!
6.1 import_torch.py
6.1.1 PyTorch with CUDA GPU support
This example shows how to use CUDA GPUs in Modal with a minimal PyTorch image. You specify GPU requirements in the @app.function decorator; here, gpu="any" requests any available GPU type.
import_torch.py
import time

import modal

app = modal.App(
    "example-import-torch",
    image=modal.Image.debian_slim().pip_install(
        "torch", find_links="https://download.pytorch.org/whl/cu116"
    ),
)


@app.function(gpu="any")
def gpu_function():
    import subprocess

    import torch

    subprocess.run(["nvidia-smi"])
    print("Torch version:", torch.__version__)
    print("CUDA available:", torch.cuda.is_available())
    print("CUDA device count:", torch.cuda.device_count())


if __name__ == "__main__":
    t0 = time.time()
    with app.run():
        gpu_function.remote()
    print("Full time spent:", time.time() - t0)
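The script above measures the whole run with time.time(). If you want to time several steps separately, a small context manager can help. This is only a sketch: the name `timed` and the `results` dictionary are hypothetical helpers, not part of Modal, and it uses time.perf_counter() instead of time.time() because it is meant for measuring intervals.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label: str, results: dict):
    """Store the wall-clock duration of the wrapped block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the elapsed seconds even if the wrapped block raises.
        results[label] = time.perf_counter() - start


if __name__ == "__main__":
    timings = {}
    with timed("sleep", timings):
        time.sleep(0.01)  # stand-in for a remote call such as gpu_function.remote()
    print("Full time spent:", timings["sleep"])
```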
Let’s run it:
$ modal run 06_gpu_and_ml/import_torch.py
✓ Initialized. View run at https://modal.com/charlotte-llm/main/apps/ap-xxxxxxxxxx
Building image im-q9v0dExl8NyFXzmsp0RKxA
=> Step 0: FROM base
=> Step 1: RUN python -m pip install torch --find-links https://download.pytorch.org/whl/cu116
Looking in indexes: http://pypi-mirror.modal.local:5555/simple
Looking in links: https://download.pytorch.org/whl/cu116
Collecting torch
Downloading http://pypi-mirror.modal.local:5555/simple/torch/torch-2.3.1-cp310-cp310-manylinux1_x86_64.whl (779.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 779.1/779.1 MB 249.1 MB/s eta 0:00:00
...
Installing collected packages: mpmath, typing-extensions, sympy, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, networkx, MarkupSafe, fsspec, filelock, triton, nvidia-cusparse-cu12, nvidia-cudnn-cu12, jinja2, nvidia-cusolver-cu12, torch
Attempting uninstall: typing-extensions
Found existing installation: typing_extensions 4.7.0
Uninstalling typing_extensions-4.7.0:
Successfully uninstalled typing_extensions-4.7.0
Successfully installed MarkupSafe-2.1.5 filelock-3.15.1 fsspec-2024.6.0 jinja2-3.1.4 mpmath-1.3.0 networkx-3.3 nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.20.5 nvidia-nvjitlink-cu12-12.5.40 nvidia-nvtx-cu12-12.1.105 sympy-1.12.1 torch-2.3.1 triton-2.3.1 typing-extensions-4.12.2
[notice] A new release of pip is available: 23.1.2 -> 24.0
[notice] To update, run: pip install --upgrade pip
Creating image snapshot...
Finished snapshot; took 11.22s
Built image im-q9v0dExl8NyFXzmsp0RKxA in 106.04s
✓ Created objects.
├── 🔨 Created mount /Users/ryan/modal-examples/06_gpu_and_ml/import_torch.py
└── 🔨 Created function gpu_function.
Fri Jun 14 20:37:32 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A10                     On  |   00000000:CA:00.0 Off |                 ERR! |
|  0%   30C   ERR!             15W / 150W |        0MiB / 23028MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
Torch version: 2.3.1+cu121
CUDA available: True
CUDA device count: 1
Stopping app - local entrypoint completed.
✓ App completed. View run at https://modal.com/charlotte-llm/main/apps/ap-xxxxxxxxxx
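The log reports "CUDA available: True" and one device. Code that should also run on a machine without a GPU (or without PyTorch) often wraps that check in a helper. The following is a minimal sketch; `pick_device` is a hypothetical name, and it relies only on torch.cuda.is_available(), the same call used in gpu_function.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily, as in gpu_function
    except ImportError:
        # PyTorch is not installed in this environment, so there is no CUDA device to use.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("Selected device:", pick_device())
```

Inside a Modal function with a GPU attached this should select "cuda"; on a laptop without a GPU it falls back to "cpu".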
6.2 Stable diffusion via HF
For this example, you’ll need to create a Modal secret named huggingface that holds your Hugging Face access token under the key HF_TOKEN.
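The function in this section reads the token from os.environ["HF_TOKEN"], so a missing or misnamed secret shows up as a bare KeyError. A small guard can give a clearer message. This is a sketch under that assumption; `require_hf_token` is a hypothetical helper, not part of Modal or diffusers.

```python
import os


def require_hf_token(env=None) -> str:
    """Return the Hugging Face token from the environment, or raise a readable error."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set. Check that the Modal secret is attached "
            "to the function and defines the key HF_TOKEN."
        )
    return token


if __name__ == "__main__":
    # Passing a dict keeps the example runnable without a real secret.
    print(require_hf_token({"HF_TOKEN": "hf_example"}))
```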
stable_diffusion.py
import io
import os

import modal

app = modal.App()


@app.function(
    image=modal.Image.debian_slim().pip_install("torch", "diffusers[torch]", "transformers", "ftfy"),
    secrets=[modal.Secret.from_name("huggingface")],
    gpu="any",
)
async def run_stable_diffusion(prompt: str):
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        use_auth_token=os.environ["HF_TOKEN"],
    ).to("cuda")

    image = pipe(prompt, num_inference_steps=10).images[0]

    buf = io.BytesIO()
    image.save(buf, format="PNG")
    img_bytes = buf.getvalue()

    return img_bytes


@app.local_entrypoint()
def main():
    img_bytes = run_stable_diffusion.remote("Tri-color beagle riding a bike in Paris, wearing a black beret, and a baguette in a bag in the bike's front basket.")
    with open("/tmp/parisian-beagle.png", "wb") as f:
        f.write(img_bytes)
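The remote function returns the image as raw PNG bytes, and the local entrypoint writes them to disk unchanged. Before writing, you may want a cheap sanity check that the payload really is a PNG. The sketch below tests for the standard eight-byte PNG signature; `looks_like_png` is a hypothetical helper name.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the first eight bytes of every PNG file


def looks_like_png(data: bytes) -> bool:
    """Return True if data starts with the PNG file signature."""
    return data.startswith(PNG_SIGNATURE)


if __name__ == "__main__":
    fake_png = PNG_SIGNATURE + b"...rest of file..."
    print(looks_like_png(fake_png))
    print(looks_like_png(b"not an image"))
```

In main(), such a check could run on img_bytes just before the file is opened for writing.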
Let’s run it!
$