SQL
The query
command allows you to run SQL queries against CSV, JSON, and Parquet files as if they were database tables.
Basic Usage
rabbet query --as <table_name> <file> -- "<SQL query>"
Arguments
--as
: Alias name for the table in your SQL queryfile
: Input file (CSV, JSON, or Parquet)--
: Separator between options and the SQL querySQL query
: Standard SQL SELECT statement
Multiple Tables
You can query multiple files by specifying multiple --as
options:
rabbet query --as customers customers.csv --as orders orders.csv -- \
"SELECT * FROM customers JOIN orders ON customers.id = orders.customer_id"
Supported SQL Features
SELECT
with column selection and aliasesFROM
with table aliasesWHERE
with complex conditionsJOIN
(INNER, LEFT, RIGHT, FULL)GROUP BY
with aggregationsORDER BY
with ASC/DESCLIMIT
andOFFSET
- Common functions (COUNT, SUM, AVG, MIN, MAX, etc.)
- String functions (UPPER, LOWER, CONCAT, etc.)
- Date functions (DATE_TRUNC, EXTRACT, etc.)
Examples
Basic Query
Test query command with simple SELECT statement
$ rabbet query --as orders data/orders/orders.csv -- "SELECT * FROM orders WHERE product_id = 'PRODUCT-005'"
╭────────────────────────────────────────────────────────────────────────╮
│ order_id customer_id product_id quantity price order_date │
╞════════════════════════════════════════════════════════════════════════╡
│ ORDER-001 CUSTOMER-003 PRODUCT-005 1 10.0 2022-01-01 │
│ ORDER-002 CUSTOMER-003 PRODUCT-005 2 20.0 2022-01-02 │
╰────────────────────────────────────────────────────────────────────────╯
Aggregation Query
Test query command with aggregation functions
$ rabbet query --as orders data/orders/orders.csv -- "SELECT product_id, COUNT(*) as order_count, SUM(quantity) as total_quantity, AVG(price) as avg_price FROM orders GROUP BY product_id ORDER BY order_count DESC, product_id ASC"
╭────────────────────────────────────────────────────────╮
│ product_id order_count total_quantity avg_price │
╞════════════════════════════════════════════════════════╡
│ PRODUCT-005 2 3 15.0 │
│ PRODUCT-001 1 5 50.0 │
│ PRODUCT-002 1 4 40.0 │
│ PRODUCT-003 1 3 30.0 │
│ PRODUCT-004 1 6 60.0 │
╰────────────────────────────────────────────────────────╯
Complex Join Query
For complex multi-table queries:
rabbet query --as customers data/orders/customers.csv --as orders data/orders/orders.csv -- <<EOF
SELECT
c.customer_name,
c.customer_city,
COUNT(o.order_id) as order_count,
SUM(o.quantity * o.price) as total_spent,
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_name, c.customer_city
HAVING order_count > 0
ORDER BY total_spent DESC
EOF
╭─────────────────────────────────────────────────────────────╮
│ customer_name customer_city order_count total_spent │
╞═════════════════════════════════════════════════════════════╡
│ Robert Brown Anytown 1 250.0 │
│ Emily Davis Anytown 1 160.0 │
│ Michael Johnson Anytown 3 140.0 │
╰─────────────────────────────────────────────────────────────╯
Notes
- Table names in SQL must match the aliases specified with
--as
- SQL keywords are case-insensitive
- Column names are case-sensitive and must match the file headers
- The query must be a SELECT statement (no INSERT, UPDATE, DELETE)
- Complex queries may require more memory for processing