Tomas Kracmar 0a4fb7b55e
feat: initial KosmoConnect platform v0.1
Includes:
- Backend services: ingestion (:8001), weather API (:8002),
  gateway (:8003), billing (:8004) with BTCPay integration
- Shared asyncpg pool, TimescaleDB hypertable, Redis, Mosquitto MQTT
- React frontend: Dashboard (MapLibre) and Messaging (chat UI)
- Bridge daemon for Pi + Meshtastic (Serial/TCP T-Deck support)
- Production Docker Compose, Nginx reverse proxy, ops scripts
- DEPLOY.md with step-by-step deployment guide
2026-04-12 17:30:15 +02:00

Operations (Ops)

This directory contains all infrastructure-as-code, deployment automation, and monitoring configuration.

Structure

ops/
├── terraform/            # Cloud infrastructure definitions
│   ├── modules/
│   ├── environments/
│   │   ├── staging/
│   │   └── production/
│   └── global/
├── ansible/              # Server provisioning and configuration
│   ├── playbooks/
│   ├── roles/
│   └── inventory/
└── monitoring/           # Observability stack
    ├── prometheus/
    ├── grafana/
    ├── loki/
    └── alertmanager/
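
As a quick sanity check on the layout above, a small script can report which of the listed directories are missing from a checkout. This is only a sketch: the function name `missing_dirs` and the idea of running such a check are assumptions for illustration, not part of the repository.

```python
from __future__ import annotations

from pathlib import Path

# Directories copied from the tree above, relative to the ops/ root.
EXPECTED_DIRS = [
    "terraform/modules",
    "terraform/environments/staging",
    "terraform/environments/production",
    "terraform/global",
    "ansible/playbooks",
    "ansible/roles",
    "ansible/inventory",
    "monitoring/prometheus",
    "monitoring/grafana",
    "monitoring/loki",
    "monitoring/alertmanager",
]


def missing_dirs(ops_root: Path) -> list[str]:
    """Return the expected directories that do not exist under ops_root.

    Hypothetical helper: it only checks for presence and says nothing
    about the contents of each directory.
    """
    return [d for d in EXPECTED_DIRS if not (ops_root / d).is_dir()]
```

Calling `missing_dirs(Path("ops"))` from the repository root returns an empty list when the layout matches the tree.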

Terraform

Defines the cloud infrastructure on the chosen provider; Hetzner, AWS, or DigitalOcean is recommended for cost efficiency.

Resources:

  • Kubernetes cluster or Docker Swarm hosts
  • PostgreSQL managed database (or self-hosted)
  • TimescaleDB instance
  • RabbitMQ / Redis managed service
  • Object storage (S3-compatible) for backups and kit assets
  • Load balancers and DNS records
  • VPN / WireGuard for secure bridge-to-cloud communication

Ansible

Playbooks for:

  • Installing Docker and dependencies on bare metal
  • Configuring infrastructure nodes (Raspberry Pi OS setup, bridge daemon deployment)
  • Rotating TLS certificates
  • Security hardening (fail2ban, firewall rules)

Monitoring

Stack: Prometheus + Grafana + Loki + Alertmanager

Metrics:

  • Node uptime and health
  • Message throughput (inbound/outbound)
  • API request rates and error rates
  • Database performance
  • Bridge daemon connectivity

Alerts:

  • Node offline > 6 hours
  • Bridge daemon disconnected > 15 minutes
  • API error rate > 1%
  • Disk usage > 85%
  • Subscription payment failures spike
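
The thresholds above can be read as simple comparisons against current readings. The sketch below illustrates that logic in plain Python; in the actual stack these conditions would presumably live in Prometheus and Alertmanager configuration. The `Snapshot` type, its field names, and the alert labels are hypothetical. The disk threshold is treated as disk usage (an assumption), and the payment-failure alert is left out because no numeric threshold is given for it.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import timedelta

# Thresholds taken from the alert list above.
NODE_OFFLINE_LIMIT = timedelta(hours=6)
BRIDGE_DISCONNECT_LIMIT = timedelta(minutes=15)
API_ERROR_RATE_LIMIT = 0.01  # 1%
DISK_USAGE_LIMIT = 0.85      # 85%, read as used fraction (assumption)


@dataclass
class Snapshot:
    """Hypothetical point-in-time readings; field names are illustrative."""
    node_offline_for: timedelta
    bridge_disconnected_for: timedelta
    api_requests: int
    api_errors: int
    disk_used_fraction: float


def firing_alerts(s: Snapshot) -> list[str]:
    """Return the labels of all alerts whose threshold is exceeded."""
    alerts = []
    if s.node_offline_for > NODE_OFFLINE_LIMIT:
        alerts.append("node_offline")
    if s.bridge_disconnected_for > BRIDGE_DISCONNECT_LIMIT:
        alerts.append("bridge_disconnected")
    # Treat a window with no requests as a zero error rate.
    error_rate = s.api_errors / s.api_requests if s.api_requests else 0.0
    if error_rate > API_ERROR_RATE_LIMIT:
        alerts.append("api_error_rate")
    if s.disk_used_fraction > DISK_USAGE_LIMIT:
        alerts.append("disk_usage")
    return alerts
```

All comparisons are strict, matching the ">" in the list, so a reading exactly at a threshold does not fire.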