Technical & Development, Intermediate

hugging-face-evaluation

Model evaluation with vLLM/lighteval and eval tables

Developer Setup

Setup & Installation

npx skills add https://github.com/huggingface/skills --skill hugging-face-evaluation

Or paste this URL into your assistant to install:

https://github.com/huggingface/skills/tree/main/skills/hugging-face-evaluation

Overview

What This Skill Does

Adds and manages evaluation results in Hugging Face model cards using the model-index metadata format. Supports extracting benchmark tables from README files, importing scores from the Artificial Analysis API, and running evaluations with vLLM or lighteval on local GPUs or HF Jobs infrastructure.

Application

When to use this Skill

Running inspect-ai or lighteval evaluations on local hardware against models on the Hugging Face Hub.
Choosing an inference backend among vllm, Hugging Face Transformers, and accelerate.
Running a smoke test before committing to a full evaluation.
Selecting the evaluation tasks to run.
Planning a backend fallback strategy.

Documentation

Overview

This skill runs evaluations on local hardware against models hosted on the Hugging Face Hub.

It covers:

inspect-ai with local inference
lighteval with local inference
choosing between vllm, Hugging Face Transformers, and accelerate
smoke tests, task selection, and backend fallback strategy
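A backend fallback strategy can be pictured as walking an ordered preference list and taking the first backend that is actually installed. The sketch below is a minimal illustration under stated assumptions: the preference order, the `pick_backend` and `is_installed` helpers, and the `probe` parameter are hypothetical names introduced here, not part of this skill's scripts. It only checks whether each package can be found; it does not load a model or run an evaluation.

```python
import importlib.util

# Assumed preference order, for illustration only; the skill's own scripts may
# rank or select backends differently.
BACKEND_PREFERENCE = ("vllm", "accelerate", "transformers")


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be found without importing it."""
    return importlib.util.find_spec(package) is not None


def pick_backend(preference=BACKEND_PREFERENCE, probe=is_installed) -> str:
    """Return the first backend in `preference` that `probe` reports as available."""
    for name in preference:
        if probe(name):
            return name
    raise RuntimeError(
        f"No inference backend available; tried: {', '.join(preference)}"
    )


if __name__ == "__main__":
    try:
        print("Selected backend:", pick_backend())
    except RuntimeError as err:
        print(err)
```

Passing a custom `probe` makes the selection logic testable without installing any backend. A fuller check might also confirm that a GPU is visible before preferring vllm, which this sketch leaves out.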

It does not cover:

Hugging Face Jobs orchestration
model-card or model-index edits
README table extraction
Artificial Analysis imports
.eval_results generation or publishing
PR creation or community-evals automation

If the user wants to run the same eval remotely on Hugging Face Jobs, hand off to the hugging-face-jobs skill and pass it one of the local scripts in this skill.

If the user wants to publish results into the community evals workflow, stop after generating the evaluation run and hand off that publishing step to ~/code/community-evals.

All paths below are relative to the directory containing this SKILL.md.

When To Use Which Script
