Office & DocumentsIntermediate
read-file
Read any data file (CSV, JSON, Parquet, Avro, Excel, spatial) local...
Developer Setup
Setup & Installation
bash
npx skills add https://github.com/duckdb/duckdb-skills --skill read-filenpx skills add https://github.com/duckdb/duckdb-skills --skill read-fileOr paste this URL into your assistant to install:
Overview
What This Skill Does
Read any data file (CSV, JSON, Parquet, Avro, Excel, spatial) locally or from remote storage
Application
When to use this Skill
- Integrating read file into your development workflow.
- Following best practices for read any data file (csv, json, parquet, avro, excel, spatial) locally or from remote storage.
- Automating repetitive tasks with AI-assisted tooling.
- Building production-grade applications with proper standards.
- Debugging and troubleshooting common implementation issues.
Documentation
Show Skills.md file
You are helping the user read and analyze a data file using DuckDB.
Filename given: $0
Question: ${1:-describe the data}
Step 1 — Read it
RESOLVED_PATH is $0. If the user gave a bare filename (no /), resolve it to a full path with find first.
Run a single DuckDB command that defines the read_any macro inline and reads the file.
For remote files, prepend the necessary LOAD/SECRET before the macro:
| Protocol | Prepend |
|---|---|
https:// / http:// |
LOAD httpfs; |
s3:// |
LOAD httpfs; CREATE SECRET (TYPE S3, PROVIDER credential_chain); |
gs:// / gcs:// |
LOAD httpfs; CREATE SECRET (TYPE GCS, PROVIDER credential_chain); |
az:// / azure:// / abfss:// |
LOAD httpfs; LOAD azure; CREATE SECRET (TYPE AZURE, PROVIDER credential_chain); |
For local files, no prefix needed.
duckdb -csv -c "
CREATE OR REPLACE MACRO read_any(file_name) AS TABLE
WITH json_case AS (FROM read_json_auto(file_name))
, csv_case AS (FROM read_csv(file_name))
, parquet_case AS (FROM read_parquet(file_name))
, avro_case AS (FROM read_avro(file_name))
, blob_case AS (FROM read_blob(file_name))
, spatial_case AS (FROM st_read(file_name))
, excel_case AS (FROM read_xlsx(file_name))
, sqlite_case AS (FROM sqlite_scan(file_name, (SELECT name FROM sqlite_master(file_name) LIMIT 1)))
, ipynb_case AS (
WITH nb AS (FROM read_json_auto(file_name))
SELECT cell_idx, cell.cell_type,
array_to_string(cell.source, '') AS source,
cell.execution_count
FROM nb, UNNEST(cells) WITH ORDINALITY AS t(cell, cell_idx)
ORDER BY cell_idx
)
FROM query_table(
CASE
WHEN file_name ILIKE '%.json' OR file_name ILIKE '%.jsonl' OR file_name ILIKE '%.ndjson' OR file_name ILIKE '%.geojson' OR file_name ILIKE '%.geojsonl' OR file_name ILIKE '%.har' THEN 'json_case'
WHEN file_name ILIKE '%.csv' OR file_name ILIKE '%.tsv' OR file_name ILIKE '%.tab' OR file_name ILIKE '%.txt' THEN 'csv_case'
WHEN file_name ILIKE '%.parquet' OR file_name ILIKE '%.pq' THEN 'parquet_case'
WHEN file_name ILIKE '%.avro' THEN 'avro_case'
WHEN file_name ILIKE '%.xlsx' OR file_name ILIKE '%.xls' THEN 'excel_case'
WHEN file_name ILIKE '%.shp' OR file_name ILIKE '%.gpkg' OR file_name ILIKE '%.fgb' OR file_name ILIKE '%.kml' THEN 'spatial_case'
WHEN file_name ILIKE '%.ipynb' THEN 'ipynb_case'
WHEN file_name ILIKE '%.db' OR file_name ILIKE '%.sqlite' OR file_name ILIKE '%.sqlite3' THEN 'sqlite_case'
ELSE 'blob_case'
END
);
DESCRIBE FROM read_any('RESOLVED_PATH');
SELECT count(*) AS row_count FROM read_any('RESOLVED_PATH');
FROM read_any('RESOLVED_PATH') LIMIT 20;
"
Lines 1 - 60 of 71
Recommendations