SKILL.md 7.8 KB


name: timescaledb

description: Manage time-series data in PostgreSQL with TimescaleDB. Use this skill to install, configure, optimize, and interact with TimescaleDB for high-performance time-series data storage and analysis. This includes creating hypertables, continuous aggregates, handling data retention, and querying time-series data efficiently.

TimescaleDB Skill

Manage time-series data in PostgreSQL using TimescaleDB, extending PostgreSQL for high-performance time-series workloads.

When to Use This Skill

Use this skill when you need to:

  • Work with time-series data in PostgreSQL
  • Install and configure TimescaleDB
  • Create and manage hypertables
  • Optimize performance for time-series data
  • Implement continuous aggregates for rollup data
  • Manage data retention and compression
  • Query and analyze time-series data
  • Migrate existing PostgreSQL tables to hypertables
  • Integrate with other PostgreSQL tools and extensions

Not For / Boundaries

This skill is NOT for:

  • General PostgreSQL administration (use a specific PostgreSQL skill for that)
  • Deep database tuning unrelated to time-series performance
  • Replacing dedicated time-series databases if TimescaleDB's PostgreSQL foundation is not a requirement
  • Providing data visualization beyond basic SQL queries (use a BI tool or separate visualization library)

Quick Reference

Installation & Configuration

Install TimescaleDB Extension (Debian/Ubuntu):

sudo apt install -y postgresql-{{pg_version}}-timescaledb
sudo pg_createcluster {{pg_version}} main --start
sudo pg_ctlcluster {{pg_version}} main start
sudo -u postgres psql -c "CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;"

(Replace {{pg_version}} with your PostgreSQL version, e.g., 16)

Configuration (postgresql.conf):

# Add to postgresql.conf
shared_preload_libraries = 'timescaledb'
timescaledb.max_background_workers = 8 # Adjust based on CPU cores
max_connections = 100 # Adjust based on workload

(After changes, restart PostgreSQL: sudo systemctl restart postgresql)

Hypertables

Create Hypertables:

CREATE TABLE sensor_data (
  time TIMESTAMPTZ NOT NULL,
  device_id INT,
  temperature DOUBLE PRECISION,
  humidity DOUBLE PRECISION
);

SELECT create_hypertable('sensor_data', 'time');

Convert Existing Table to Hypertable:

SELECT create_hypertable('your_existing_table', 'time_column', migrate_data => true);

Show Hypertables:

\d+
SELECT * FROM timescaledb_information.hypertables;

Continuous Aggregates

Create Continuous Aggregate:

CREATE MATERIALIZED VIEW device_hourly_summary
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('1 hour', time) AS bucket,
  device_id,
  AVG(temperature) AS avg_temp,
  MAX(temperature) AS max_temp
FROM sensor_data
GROUP BY time_bucket('1 hour', time), device_id
WITH NO DATA; -- Initially create without data

-- Refresh the continuous aggregate
CALL refresh_continuous_aggregate('device_hourly_summary', NULL, NULL);

Get Continuous Aggregates Info:

SELECT * FROM timescaledb_information.continuous_aggregates;

Data Retention & Compression

Set Data Retention Policy (Drop data older than 3 months):

SELECT add_retention_policy('sensor_data', INTERVAL '3 months');

Enable Compression (Compress data older than 7 days):

ALTER TABLE sensor_data SET (timescaledb.compress = TRUE);
SELECT add_compression_policy('sensor_data', INTERVAL '7 days');

Show Compression Status:

SELECT * FROM timescaledb_information.compression_settings;

Querying Time-Series Data

Basic Time-Range Query:

SELECT * FROM sensor_data
WHERE time >= NOW() - INTERVAL '1 day'
  AND time < NOW()
ORDER BY time DESC;

Gapfilling and Interpolation:

SELECT
  time_bucket('1 hour', time) AS bucket,
  AVG(temperature) AS avg_temp,
  locf(AVG(temperature)) OVER (ORDER BY time_bucket('1 hour', time)) AS avg_temp_locf
FROM sensor_data
GROUP BY bucket
ORDER BY bucket;

High-Performance Queries

Approximate Count:

SELECT COUNT(*) FROM sensor_data TABLESAMPLE BERNOULLI (1);

Top-N Queries:

SELECT time, device_id, temperature
FROM sensor_data
WHERE time >= NOW() - INTERVAL '1 day'
ORDER BY temperature DESC
LIMIT 10;

Examples

Example 1: IoT Sensor Data Pipeline

  • Input: Stream of sensor readings (time, device_id, value)
  • Steps:
    1. Create a hypertable for iot_readings.
    2. Ingest data into the hypertable.
    3. Create a continuous aggregate to compute hourly average readings.
    4. Query the continuous aggregate for a specific device's hourly trend.
    5. Set a retention policy to keep only 1 year of raw data.
  • Expected output / acceptance: Efficient storage, automatic hourly rollups, and proper data pruning.

Example 2: Financial Tick Data Analysis

  • Input: High-frequency financial tick data (timestamp, symbol, price, volume)
  • Steps:
    1. Create a hypertable tick_data with proper chunk sizing for high ingest rate.
    2. Enable compression for older tick_data.
    3. Query tick_data to calculate 5-minute VWAP (Volume Weighted Average Price) for a specific symbol.
    4. Visualize the VWAP over the last trading day.
  • Expected output / acceptance: Ability to ingest and analyze millions of rows/second, with optimized storage and fast analytical queries.

Example 3: Monitoring System Metrics

  • Input: Server metrics (timestamp, host_id, cpu_usage, memory_usage, network_io)
  • Steps:
    1. Create a hypertable system_metrics partitioned by time and host_id.
    2. Use a time_bucket_gapfill query to find CPU usage for all hosts over the last 24 hours, filling in missing data points.
    3. Create an alert based on MAX(cpu_usage) exceeding a threshold using a continuous aggregate.
  • Expected output / acceptance: Comprehensive monitoring with gap-filled data for visualization and real-time alerting.

References

  • references/installation.md: Detailed installation and setup
  • references/hypertables.md: Deep dive into hypertable management
  • references/continuous_aggregates.md: Advanced continuous aggregate techniques
  • references/compression.md: Comprehensive guide to data compression
  • references/api.md: TimescaleDB SQL functions and commands reference
  • references/performance.md: Performance tuning and best practices
  • references/getting_started.md: Official TimescaleDB Getting Started Guide
  • references/llms.md: Using TimescaleDB with LLMs (e.g., storing embeddings, RAG)
  • references/llms-full.md: Full LLM integration scenarios
  • references/tutorials.md: Official TimescaleDB Tutorials and Use Cases
  • references/time_buckets.md: Guide to time_bucket and gapfilling functions
  • references/hyperfunctions.md: Advanced analytical functions for time-series

Maintenance

  • Sources: Official TimescaleDB Documentation, GitHub repository, blog posts.
  • Last updated: 2025-12-17
  • Known limits: This skill focuses on core TimescaleDB features. Advanced PostgreSQL features (e.g., PostGIS, JSONB) are covered by other specialized skills.

Troubleshooting

Slow Queries

  • Ensure indexes are on time and other frequently queried columns.
  • Verify chunk sizing is appropriate for your data ingestion rate.
  • Use EXPLAIN ANALYZE to identify bottlenecks.
  • Consider creating continuous aggregates for frequently accessed aggregated data.

High Disk Usage

  • Implement data retention policies for older, less critical data.
  • Enable compression for older chunks.
  • Regularly run VACUUM ANALYZE on your tables.

Failed to Create Hypertable

  • Ensure the time column is TIMESTAMPTZ or a supported integer type.
  • The table must be empty or you must use migrate_data => true.
  • Check for existing triggers or foreign keys that might conflict.

This skill provides a robust foundation for managing time-series data with TimescaleDB!