Technical & Development | Intermediate

hugging-face-evaluation

Model evaluation with vLLM/lighteval and eval tables

Developer Setup

Setup & Installation

bash
npx skills add https://github.com/huggingface/skills --skill hugging-face-evaluation

Overview

What This Skill Does

Adds and manages evaluation results in Hugging Face model cards using the model-index metadata format. Supports extracting benchmark tables from README files, importing scores from the Artificial Analysis API, and running evaluations with vLLM or lighteval on local GPUs or HF Jobs infrastructure.

Application

When to use this Skill

Documentation

Overview

This skill is for running evaluations on local hardware against models hosted on the Hugging Face Hub.

It covers:

  • inspect-ai with local inference
  • lighteval with local inference
  • choosing between vllm, Hugging Face Transformers, and accelerate
  • smoke tests, task selection, and backend fallback strategy
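The backend fallback idea in the list above can be sketched roughly as follows. This is an illustration only, not code from the skill: the function names `available_backends` and `pick_backend` are hypothetical, and the preference order (vllm first, then accelerate, then transformers) is an assumption rather than something the skill states. The sketch only checks whether each package is importable.

```python
import importlib.util

# Assumed preference order (not taken from the skill): try the
# throughput-oriented backend first and the most portable one last.
PREFERRED_BACKENDS = ("vllm", "accelerate", "transformers")


def available_backends(candidates=PREFERRED_BACKENDS, is_installed=None):
    """Return the candidates that are importable, keeping preference order."""
    if is_installed is None:
        # find_spec returns None when a top-level package is not installed.
        def is_installed(name):
            return importlib.util.find_spec(name) is not None
    return [name for name in candidates if is_installed(name)]


def pick_backend(candidates=PREFERRED_BACKENDS, is_installed=None):
    """Pick the first installed backend, or raise if none is installed."""
    found = available_backends(candidates, is_installed)
    if not found:
        tried = ", ".join(candidates)
        raise RuntimeError(f"No inference backend installed; tried: {tried}")
    return found[0]


if __name__ == "__main__":
    try:
        print("Selected backend:", pick_backend())
    except RuntimeError as err:
        print(err)
```

The `is_installed` parameter exists so the selection logic can be exercised without installing any of the backends. An importability check says nothing about whether a GPU is present or whether a given model loads, so a short smoke test on a handful of samples is still worth running before a full evaluation.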

It does not cover:

  • Hugging Face Jobs orchestration
  • model-card or model-index edits
  • README table extraction
  • Artificial Analysis imports
  • .eval_results generation or publishing
  • PR creation or community-evals automation

If the user wants to run the same eval remotely on Hugging Face Jobs, hand off to the hugging-face-jobs skill and pass it one of the local scripts in this skill.

If the user wants to publish results to the community evals workflow, stop once the evaluation run has been generated and hand off the publishing step to ~/code/community-evals.

All paths below are relative to the directory containing this SKILL.md.

When To Use Which Script
