Technical & Development, Intermediate

hugging-face-model-trainer

Train models with TRL: SFT, DPO, GRPO, GGUF conversion

Developer Setup

Setup & Installation

npx skills add https://github.com/huggingface/skills --skill hugging-face-model-trainer

Or paste this URL into your assistant to install:

https://github.com/huggingface/skills/tree/main/skills/hugging-face-model-trainer

Overview

What This Skill Does

Trains and fine-tunes language models on Hugging Face's cloud GPU infrastructure using TRL. Supports SFT, DPO, GRPO, and reward modeling. Handles job submission, dataset validation, cost estimation, and GGUF conversion for local deployment.
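
As a rough illustration of what dataset validation might involve, the sketch below checks whether a dataset's column names match a layout that each training method commonly expects. The column layouts and the names `EXPECTED_COLUMNS` and `check_columns` are assumptions made for this example, not the skill's actual validator; consult the TRL documentation for the formats each trainer accepts.

```python
# Hypothetical mapping from training method to accepted column layouts.
# These layouts are assumptions for illustration, not the skill's real rules.
EXPECTED_COLUMNS = {
    "sft": [{"messages"}, {"text"}, {"prompt", "completion"}],
    "dpo": [{"chosen", "rejected"}],
    "reward": [{"chosen", "rejected"}],
    "grpo": [{"prompt"}],
}


def check_columns(method, columns):
    """Return True if `columns` contains at least one accepted layout for `method`."""
    if method not in EXPECTED_COLUMNS:
        raise ValueError(f"unknown method: {method!r}")
    present = set(columns)
    # A layout matches when every column it requires is present.
    return any(layout <= present for layout in EXPECTED_COLUMNS[method])


if __name__ == "__main__":
    print(check_columns("sft", ["messages"]))
    print(check_columns("dpo", ["prompt", "chosen"]))
```

Running a check like this before submitting a job can catch a mismatched dataset early, before any GPU time is spent.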

Application

When to use this Skill

Configuring integration settings for custom agent workflows.
Optimizing query execution and response latency in production.
Developing clean, standard-compliant implementations for enterprise services.
Troubleshooting connection timeouts and authentication handshakes.
Monitoring API rate limits and execution pipelines programmatically.

Documentation

TRL Training on Hugging Face Jobs

Overview

Train language models using TRL (Transformer Reinforcement Learning) on fully managed Hugging Face infrastructure. No local GPU setup required—models train on cloud GPUs and results are automatically saved to the Hugging Face Hub.

TRL provides multiple training methods:

SFT (Supervised Fine-Tuning) - Standard instruction tuning
DPO (Direct Preference Optimization) - Alignment from preference data
GRPO (Group Relative Policy Optimization) - Online RL training
Reward Modeling - Train reward models for RLHF
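
To make the "group relative" part of GRPO concrete, here is a minimal sketch of the idea: several completions are sampled for the same prompt, each is scored, and each score is normalized against the others in its group. The function name `group_relative_advantages` and the `eps` value are assumptions for this example; real implementations, including TRL's, may differ in details such as how the standard deviation is computed or whether scaling is applied at all.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize rewards within one group of completions for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps).
    Uses the population standard deviation; this choice is an assumption.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Four completions for one prompt, scored by some reward function.
    advantages = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
    print(advantages)
```

Completions that score above their group's average get a positive advantage and those below get a negative one, so no separate value model is needed to provide a baseline.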

For detailed TRL method documentation:

hf_doc_search("your query", product="trl")
hf_doc_fetch("https://huggingface.co/docs/trl/sft_trainer")  # SFT
hf_doc_fetch("https://huggingface.co/docs/trl/dpo_trainer")  # DPO
# etc.

See also: references/training_methods.md for method overviews and selection guidance

When to Use This Skill

Use this skill when users want to:
