[GH-ISSUE #21] feat: Structured log aggregator for Docker container observability #5

Closed
opened 2026-03-02 05:12:26 +03:00 by kerem · 0 comments
Owner

Originally created by @dviejokfs on GitHub (Feb 26, 2026).
Original GitHub issue: https://github.com/gotempsh/temps/issues/21

Summary

Add a comprehensive structured log aggregation system for Docker container observability. The platform currently has no way to collect, store, search, or stream logs from deployed containers.

Requirements

Core Log Collection

  • Real-time Docker log streaming from running containers
  • Automatic container discovery via sh.temps.* Docker labels
  • Streaming resilience: reconnect tracking, container-gone detection, bounded retries
  • Docker events listener kept alive permanently by an outer retry loop
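
The "bounded retries" and "container-gone detection" items above could look roughly like the following. This is a minimal sketch, not the crate's actual code: the names `StreamEnd`, `RetryResult`, `backoff_delay`, and `stream_with_retries`, and the 100 ms / 5 s backoff values, are all assumptions made for illustration. The attach and sleep steps are passed in as closures so the loop can be exercised without a Docker daemon.

```rust
use std::time::Duration;

/// How one attempt to follow a container's log stream ended (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum StreamEnd {
    /// The stream dropped, but the container may still be running.
    Disconnected,
    /// The container no longer exists, so retrying is pointless.
    ContainerGone,
}

/// Final result of the bounded retry loop (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum RetryResult {
    ContainerGone { attempts: u32 },
    RetriesExhausted { attempts: u32 },
}

/// Exponential backoff capped at `max`: base, 2*base, 4*base, ...
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    // checked_shl returns None once the shift reaches the bit width.
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(max).min(max)
}

/// Calls `attach` up to `max_attempts` times, sleeping between attempts.
/// Stops early as soon as the container is reported gone.
pub fn stream_with_retries<F, S>(max_attempts: u32, mut attach: F, mut sleep: S) -> RetryResult
where
    F: FnMut(u32) -> StreamEnd,
    S: FnMut(Duration),
{
    for attempt in 0..max_attempts {
        match attach(attempt) {
            StreamEnd::ContainerGone => {
                return RetryResult::ContainerGone { attempts: attempt + 1 };
            }
            StreamEnd::Disconnected => {
                // No point sleeping after the final attempt.
                if attempt + 1 < max_attempts {
                    sleep(backoff_delay(
                        attempt,
                        Duration::from_millis(100), // assumed base delay
                        Duration::from_secs(5),     // assumed cap
                    ));
                }
            }
        }
    }
    RetryResult::RetriesExhausted { attempts: max_attempts }
}
```

In a real collector, `attach` would wrap the Docker log-follow call and `sleep` would be an async timer; the events listener's outer loop would simply restart this routine indefinitely.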

Storage

  • Compressed NDJSON chunk storage using zstd compression
  • Pluggable storage backends: filesystem (default) and S3
  • Configurable via TEMPS_LOG_STORAGE_BACKEND and TEMPS_LOG_S3_* env vars
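
Backend selection from the environment might be sketched as below. Only the variable name `TEMPS_LOG_STORAGE_BACKEND` and the filesystem default come from this issue; the accepted values `"filesystem"` and `"s3"`, and the names `StorageBackend`, `parse_backend`, and `backend_from_env`, are assumptions.

```rust
/// Storage backends named in the issue (hypothetical enum).
#[derive(Debug, PartialEq)]
pub enum StorageBackend {
    Filesystem,
    S3,
}

/// Parses the raw variable value; an unset or empty value falls back to the
/// filesystem default. The accepted strings are assumed, not confirmed.
pub fn parse_backend(raw: Option<&str>) -> Result<StorageBackend, String> {
    match raw.map(|s| s.trim().to_ascii_lowercase()).as_deref() {
        None | Some("") | Some("filesystem") => Ok(StorageBackend::Filesystem),
        Some("s3") => Ok(StorageBackend::S3),
        Some(other) => Err(format!(
            "unknown TEMPS_LOG_STORAGE_BACKEND value: {other}"
        )),
    }
}

/// Reads the variable from the process environment.
pub fn backend_from_env() -> Result<StorageBackend, String> {
    parse_backend(std::env::var("TEMPS_LOG_STORAGE_BACKEND").ok().as_deref())
}
```

Keeping the parsing separate from the environment lookup makes the default and the error path easy to unit test.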

Search & Retrieval

  • Dual search paths: TimescaleDB index for ERROR/WARN (fast), archive scan for full-text (thorough)
  • Full-text search with 24-hour time range limit
  • Field-based filtering (eq, gt operators on structured JSON fields)
  • Level, service, environment, and deploy_id filtering
  • Cursor-based pagination
  • Context retrieval (surrounding lines around a match)
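
To make the eq/gt field filtering concrete, here is a minimal sketch over an already-parsed record. The types `FieldValue`, `Op`, and `FieldFilter` are hypothetical, and a flat `HashMap` stands in for a parsed NDJSON line; the real crate presumably works on values from a JSON library.

```rust
use std::collections::HashMap;

/// A simplified structured field value (stand-in for a JSON value).
#[derive(Debug, Clone, PartialEq)]
pub enum FieldValue {
    Num(f64),
    Text(String),
}

/// The two operators listed in the requirements.
#[derive(Debug, Clone, Copy)]
pub enum Op {
    Eq,
    Gt,
}

/// One filter clause, e.g. `status gt 499` (hypothetical shape).
pub struct FieldFilter {
    pub field: String,
    pub op: Op,
    pub value: FieldValue,
}

impl FieldFilter {
    /// Returns true when the record has the field and the comparison holds.
    /// A missing field or a type mismatch under `Gt` never matches.
    pub fn matches(&self, record: &HashMap<String, FieldValue>) -> bool {
        let Some(actual) = record.get(&self.field) else {
            return false;
        };
        match (self.op, actual, &self.value) {
            (Op::Eq, a, b) => a == b,
            (Op::Gt, FieldValue::Num(a), FieldValue::Num(b)) => a > b,
            (Op::Gt, FieldValue::Text(a), FieldValue::Text(b)) => a > b,
            _ => false,
        }
    }
}
```

An archive scan would apply every filter in a query to each decompressed line and keep only the lines where all of them match.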

Live Tail

  • Server-Sent Events (SSE) streaming with project/service/level filtering
  • 30-minute inactivity auto-close
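
As an illustration of the live tail, the sketch below frames one log line as a Server-Sent Events message and checks the 30-minute inactivity limit. The wire format (`event:` and `data:` fields, blank-line terminator) follows the SSE specification; the function names, the `IDLE_LIMIT` constant, and the idea of tracking a last-activity `Instant` are assumptions about how this could be built.

```rust
use std::time::{Duration, Instant};

/// Inactivity limit from the requirements (30 minutes).
pub const IDLE_LIMIT: Duration = Duration::from_secs(30 * 60);

/// Formats one SSE message. Multi-line payloads are split into several
/// `data:` lines, and a blank line terminates the message.
pub fn sse_frame(event: &str, data: &str) -> String {
    let mut out = String::new();
    out.push_str("event: ");
    out.push_str(event);
    out.push('\n');
    for line in data.split('\n') {
        out.push_str("data: ");
        out.push_str(line);
        out.push('\n');
    }
    out.push('\n');
    out
}

/// True once the connection has been idle for at least `limit`.
pub fn is_idle(last_activity: Instant, now: Instant, limit: Duration) -> bool {
    now.duration_since(last_activity) >= limit
}
```

A tail handler would update the last-activity timestamp whenever a line passes the project/service/level filters, and close the stream when `is_idle` returns true.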

Retention & Cleanup

  • Configurable retention policies (default: 30 days for chunks, 7 days for events)
  • Manual purge endpoint with audit logging
  • Automated retention scheduler that runs every 24 hours
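
The retention defaults above translate into a simple cutoff calculation. The following is a minimal sketch; `RetentionPolicy`, `cutoff`, and `is_expired` are hypothetical names, and only the 30-day and 7-day defaults come from this issue.

```rust
use std::time::{Duration, SystemTime};

const SECS_PER_DAY: u64 = 86_400;

/// Retention windows in days (hypothetical struct).
pub struct RetentionPolicy {
    pub chunk_days: u64,
    pub event_days: u64,
}

impl Default for RetentionPolicy {
    fn default() -> Self {
        // Defaults stated in the requirements.
        Self { chunk_days: 30, event_days: 7 }
    }
}

/// Oldest timestamp that is still retained, or None if the subtraction
/// would fall outside the representable range.
pub fn cutoff(now: SystemTime, days: u64) -> Option<SystemTime> {
    now.checked_sub(Duration::from_secs(days.saturating_mul(SECS_PER_DAY)))
}

/// True when an item created at `created` is older than the window.
pub fn is_expired(created: SystemTime, now: SystemTime, days: u64) -> bool {
    match cutoff(now, days) {
        Some(limit) => created < limit,
        None => false,
    }
}
```

A scheduler tick would compute one cutoff for chunks and one for events, then delete everything older; the manual purge endpoint could reuse the same check with a caller-supplied window.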

Security

  • Permission-guarded handlers: LogsRead for search/context/tail, LogsDelete for purge
  • Audit logging on destructive operations (purge)

Technical Design

  • New temps-log-aggregator crate with: parser, storage, chunk_writer, collector, metadata, search, tail, retention services
  • TimescaleDB tables: log_chunks (chunk metadata), log_events (indexed ERROR/WARN lines)
  • project_id_to_uuid() bridges the platform's i32 project IDs to the UUID-based log storage
  • Plugin system integration via LogAggregatorPlugin

Acceptance Criteria

  • 101 tests passing (unit + integration)
  • Zero compiler warnings
  • Compression roundtrip verified at scale (1000+ lines)
  • Permission guards tested (Reader cannot purge, can search)
  • Full-text search verified through storage roundtrip
  • No regressions in existing crates (temps-auth, temps-deployer)
kerem closed this issue 2026-03-02 05:12:27 +03:00