6  GPU and ML

06_gpu_and_ml

Author

Ryan Wesslen

Let’s get to what we’re all here for: GPUs!

6.1 import_torch.py

6.1.1 PyTorch with CUDA GPU support

This example shows how you can use CUDA GPUs in Modal, with a minimal PyTorch image. You can specify GPU requirements in the app.function decorator.

import_torch.py
import time

import modal

app = modal.App(
    "example-import-torch",
    image=modal.Image.debian_slim().pip_install(
        "torch", find_links="https://download.pytorch.org/whl/cu116"
    ),
)


@app.function(gpu="any")
def gpu_function():
    import subprocess

    import torch

    subprocess.run(["nvidia-smi"])
    print("Torch version:", torch.__version__)
    print("CUDA available:", torch.cuda.is_available())
    print("CUDA device count:", torch.cuda.device_count())


if __name__ == "__main__":
    t0 = time.time()
    with app.run():
        gpu_function.remote()
    print("Full time spent:", time.time() - t0)

Let’s run it:

$ modal run 06_gpu_and_ml/import_torch.py 
 Initialized. View run at https://modal.com/charlotte-llm/main/apps/ap-xxxxxxxxxx
Building image im-q9v0dExl8NyFXzmsp0RKxA

=> Step 0: FROM base

=> Step 1: RUN python -m pip install torch --find-links https://download.pytorch.org/whl/cu116
Looking in indexes: http://pypi-mirror.modal.local:5555/simple
Looking in links: https://download.pytorch.org/whl/cu116
Collecting torch
  Downloading http://pypi-mirror.modal.local:5555/simple/torch/torch-2.3.1-cp310-cp310-manylinux1_x86_64.whl (779.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 779.1/779.1 MB 249.1 MB/s eta 0:00:00

...

Installing collected packages: mpmath, typing-extensions, sympy, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, networkx, MarkupSafe, fsspec, filelock, triton, nvidia-cusparse-cu12, nvidia-cudnn-cu12, jinja2, nvidia-cusolver-cu12, torch
  Attempting uninstall: typing-extensions
    Found existing installation: typing_extensions 4.7.0
    Uninstalling typing_extensions-4.7.0:
      Successfully uninstalled typing_extensions-4.7.0
Successfully installed MarkupSafe-2.1.5 filelock-3.15.1 fsspec-2024.6.0 jinja2-3.1.4 mpmath-1.3.0 networkx-3.3 nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.20.5 nvidia-nvjitlink-cu12-12.5.40 nvidia-nvtx-cu12-12.1.105 sympy-1.12.1 torch-2.3.1 triton-2.3.1 typing-extensions-4.12.2

[notice] A new release of pip is available: 23.1.2 -> 24.0
[notice] To update, run: pip install --upgrade pip
Creating image snapshot...
Finished snapshot; took 11.22s

Built image im-q9v0dExl8NyFXzmsp0RKxA in 106.04s
 Created objects.
├── 🔨 Created mount /Users/ryan/modal-examples/06_gpu_and_ml/import_torch.py
└── 🔨 Created function gpu_function.
Fri Jun 14 20:37:32 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A10                     On  |   00000000:CA:00.0 Off |                 ERR! |
|  0%   30C ERR!              15W /  150W |       0MiB /  23028MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
Torch version: 2.3.1+cu121
CUDA available: True
CUDA device count: 1
Stopping app - local entrypoint completed.
 App completed. View run at https://modal.com/charlotte-llm/main/apps/ap-xxxxxxxxxx

6.2 Stable diffusion via HF

For this, you’ll need to create a secret via Modal for Huggingface.

stable_diffusion.py
import io
import os

import modal

app = modal.App()


@app.function(
    image=modal.Image.debian_slim().pip_install("torch", "diffusers[torch]", "transformers", "ftfy"),
    secrets=[modal.Secret.from_name("huggingface")],
    gpu="any",
)
async def run_stable_diffusion(prompt: str):
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        use_auth_token=os.environ["HF_TOKEN"],
    ).to("cuda")

    image = pipe(prompt, num_inference_steps=10).images[0]

    buf = io.BytesIO()
    image.save(buf, format="PNG")
    img_bytes = buf.getvalue()

    return img_bytes


@app.local_entrypoint()
def main():
    img_bytes = run_stable_diffusion.remote("Tri-color beagle riding a bike in Paris, wearing a black beret, and a baguette in a bag in the bike's front basket.")
    with open("/tmp/parisian-beagle.png", "wb") as f:
        f.write(img_bytes)

Let’s run it!

$