Technical & Development | Intermediate

hugging-face-jobs

Run compute jobs and Python scripts on HF infrastructure

Developer Setup

Setup & Installation

npx skills add https://github.com/huggingface/skills --skill hugging-face-jobs

Or paste this URL into your assistant to install:

https://github.com/huggingface/skills/tree/main/skills/hugging-face-jobs

Overview

What This Skill Does

Runs Python workloads on Hugging Face managed infrastructure — CPUs, GPUs, or TPUs — without any local setup. Supports UV scripts with inline dependencies, Docker-based jobs, scheduled tasks, and result persistence to the Hugging Face Hub. Handles authentication, secrets management, hardware selection, and timeout configuration.
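As an illustration of "UV scripts with inline dependencies", the sketch below embeds a small script that carries its dependency list in a PEP 723 comment block (the format `uv run` reads), plus a helper that extracts that block. The package names in the example script and the helper name `extract_inline_metadata` are illustrative assumptions, not part of this skill.

```python
from typing import Optional

# A hypothetical UV script: the "# /// script" block declares its own
# dependencies, so no separate requirements file is needed.
EXAMPLE_SCRIPT = '''\
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "datasets",
#     "huggingface_hub",
# ]
# ///
print("hello from a job")
'''


def extract_inline_metadata(script: str) -> Optional[str]:
    """Return the TOML text of the inline metadata block, or None if absent.

    A simplified line-based reader for illustration; it is not the
    reference parser from PEP 723.
    """
    lines = script.splitlines()
    try:
        start = lines.index("# /// script")
    except ValueError:
        return None  # no metadata block in this script
    body = []
    for line in lines[start + 1:]:
        if line == "# ///":
            return "\n".join(body)  # closing marker reached
        if not line.startswith("#"):
            return None  # block was never closed
        # Strip the leading "# " (or a bare "#" on empty comment lines).
        body.append(line[2:] if line.startswith("# ") else line[1:])
    return None


if __name__ == "__main__":
    print(extract_inline_metadata(EXAMPLE_SCRIPT))
```

The extracted text is plain TOML, so a runner can resolve `dependencies` before executing the script body.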

Application

When to use this Skill

Configuring integration settings for custom agent workflows.
Optimizing query execution and response latency in production.
Developing clean, standard-compliant implementations for enterprise services.
Troubleshooting connection timeouts and authentication handshakes.
Monitoring API rate limits and execution pipelines programmatically.

Documentation

Hugging Face ZeroGPU

Rules and patterns for ML demos on Hugging Face Spaces with ZeroGPU hardware. Covers @spaces.GPU, duration and quota tuning, process isolation, the CUDA availability model, concurrency safety, and CUDA build constraints.
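A minimal sketch of the `@spaces.GPU` pattern mentioned above, assuming the `spaces` package is importable on a ZeroGPU Space. The fallback decorator, the `generate` handler, and the 30-second `duration` are illustrative assumptions so the example also runs on a machine without `spaces` installed.

```python
try:
    import spaces  # available inside ZeroGPU Spaces
    gpu = spaces.GPU
except ImportError:
    # Assumption for local runs: a no-op stand-in with the same call shapes
    # (@gpu and @gpu(duration=...)).
    def gpu(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]  # used as a bare decorator

        def wrap(fn):
            return fn  # used as a decorator factory
        return wrap


@gpu(duration=30)  # request a shorter GPU window than the default
def generate(prompt: str) -> str:
    # Real code would run model inference on CUDA here. The handler
    # returns a plain Python value rather than a CUDA tensor.
    return prompt.upper()


if __name__ == "__main__":
    print(generate("hello"))
```

Keeping GPU work inside the decorated function and returning CPU-side values follows the guidance in the reference files listed in this document.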

Scope

This skill is for Gradio SDK Spaces using ZeroGPU hardware. Docker and Static Spaces cannot schedule onto ZeroGPU, and Streamlit apps now run as Docker Spaces — so this skill applies only to Gradio. For general Gradio coding (components, layouts, event listeners), see the huggingface-gradio skill in this repo. The authoritative ZeroGPU docs live at https://huggingface.co/docs/hub/spaces-zerogpu — refer to them for the current backing GPU, runtime version lists, and tier thresholds, all of which change over time.

Reference Files

| Reference | When to read |
| --- | --- |
| `references/concurrency.md` | Always read alongside SKILL.md when writing ZeroGPU code; handlers run in parallel by default |
| `references/how-zerogpu-works.md` | When reasoning about cold-starts, worker reuse, why module-scope warmup does not carry to requests, or why returning CUDA tensors hangs |
| `references/how-quota-works.md` | When choosing `duration` values, debugging `illegal duration` vs `quota exceeded` errors, or explaining why default 60s blocks short tasks |
| `references/cuda-and-deps.md` | When installing CUDA-dependent packages (e.g. `flash-attn`), pinning torch side-cars, or reading wheel filename tags |

Hardware

ZeroGPU exposes two GPU sizes that map to a fraction of the backing card:

| `size` | Slice of backing GPU | Quota cost |
| --- | --- | --- |
| `large` (default) | Half | 1x |
| `xlarge` | Full | 2x |
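The quota column can be read as a simple multiplier. The sketch below assumes quota is charged as requested duration times the size multiplier; that charging model and the helper name `quota_cost_seconds` are assumptions for illustration, so check the ZeroGPU docs for the actual accounting.

```python
# Multipliers taken from the table above.
SIZE_MULTIPLIER = {"large": 1, "xlarge": 2}


def quota_cost_seconds(duration_s: float, size: str = "large") -> float:
    """Estimate the quota consumed by one call (assumed model: duration x multiplier)."""
    if size not in SIZE_MULTIPLIER:
        raise ValueError(f"unknown size: {size!r}")
    return duration_s * SIZE_MULTIPLIER[size]


if __name__ == "__main__":
    for size in ("large", "xlarge"):
        print(size, quota_cost_seconds(60, size))
```

Under this assumption, the same `duration` on `xlarge` draws down quota twice as fast as on `large`.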
