Distributed Systems

SmartQueue

AI-powered adaptive task scheduler with LSTM runtime prediction and K8s autoscaling.

Next.js · FastAPI · TypeScript · Python · PostgreSQL · Kubernetes · LSTM

1→5

Worker Scaling

<100 ms

Scheduling Latency

2

LSTM Layers

Improving

Prediction Accuracy

01

Problem

Problem statement, constraint shape, and the gap this project explores.

Problem Statement

How do you build a task scheduler that improves its priority decisions over time without human intervention?

Challenge

Built a distributed, multi-tenant task scheduling platform that learns from historical execution patterns using a custom LSTM to predict job runtimes and dynamically adjust priorities.

Why Existing Approaches Failed

Static priority schedulers are brittle: they assign priority at submission time and never adapt. In heterogeneous workloads where job runtime varies by 10x, static priorities cause a form of priority inversion, where short low-priority jobs queue behind long high-priority jobs. SmartQueue uses a custom LSTM trained on historical execution data to predict runtime and compute dynamic priority scores, so priority decisions can improve as more execution history accumulates.

02

Constraints

Non-negotiable boundaries that shaped the implementation.

Multi-tenancy

Org-scoped job isolation with RBAC

Concurrency

Safe concurrent queue access across parallel workers

Scaling

K8s HPA 1→5 worker pods

ML

LSTM from scratch (NumPy only)

Consistency

No double-processing under parallel workers

03

Architecture

The primary design surface: flow, subsystem roles, and state boundaries.

Architecture Brief

SmartQueue has a Next.js frontend, FastAPI scheduling backend, and PostgreSQL state store. Workers poll the queue using FOR UPDATE SKIP LOCKED for safe concurrent dequeue. The LSTM service runs as a FastAPI sidecar, consuming execution history and outputting runtime predictions used to recalculate priority scores.

Execution Flow
01

Job submit

02

Priority score (LSTM prediction)

03

PostgreSQL queue

04

Worker dequeue (SKIP LOCKED)

05

Execute

06

Record result

07

Retrain LSTM

01

Job API

FastAPI endpoints for job submission, status, and cancellation.

02

Scheduler

PostgreSQL advisory locks for leader election; min-heap for priority queue.

03

Workers

Stateless pods polling queue via FOR UPDATE SKIP LOCKED.

04

LSTM Service

2-layer LSTM built in NumPy trained on execution history.

05

Analytics Dashboard

Next.js frontend showing predicted vs actual runtime and accuracy.

06

K8s HPA

Horizontal pod autoscaler scaling workers 1→5 based on queue depth.
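The queue-depth scaling rule can be approximated with the replica formula from the Kubernetes HPA documentation, desired = ceil(currentReplicas × currentMetric / targetMetric), clamped to the 1→5 range. The sketch below is illustrative only: `target_per_worker` (jobs per worker) is an assumed value, not a setting taken from this project, and the HPA tolerance band and stabilization window are left out.

```python
import math

MIN_WORKERS = 1  # lower bound of the 1→5 range
MAX_WORKERS = 5  # upper bound of the 1→5 range


def desired_workers(queue_depth: int, target_per_worker: int = 10) -> int:
    """Approximate the HPA replica count for a queue-depth metric.

    HPA computes ceil(currentReplicas * currentMetric / target). When the
    metric is an average per pod (queue_depth / currentReplicas), the current
    replica count cancels out, leaving ceil(queue_depth / target).
    target_per_worker is a hypothetical target chosen for illustration.
    """
    raw = math.ceil(queue_depth / target_per_worker)
    return max(MIN_WORKERS, min(MAX_WORKERS, raw))
```

With the assumed target of 10 jobs per worker, a backlog of 25 jobs asks for 3 workers, and any backlog above 40 is capped at 5.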

04

Engineering Tradeoffs

Design review notes: what was optimized and what was deliberately left behind.

EDR-01
Decision

FOR UPDATE SKIP LOCKED vs application-level locking

Why Chosen

Application-level locking requires distributed coordination. With SKIP LOCKED, a worker selects and locks a row in a single statement and skips rows already locked by other workers, which prevents double-dequeue without a distributed lock manager.

Alternative Rejected

Database-agnostic implementation

Impact

PostgreSQL FOR UPDATE SKIP LOCKED
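A minimal sketch of what a SKIP LOCKED dequeue can look like from a Python worker. The table and column names (`jobs`, `status`, `priority_score`, `created_at`, `started_at`, `payload`) are assumptions for illustration and are not taken from the project schema; `conn` is any DB-API 2.0 connection to PostgreSQL, such as one from psycopg2.

```python
# Claim the highest-priority pending job. Rows already locked by another
# worker's in-flight transaction are skipped instead of waited on.
DEQUEUE_SQL = """
UPDATE jobs
SET status = 'running', started_at = now()
WHERE id = (
    SELECT id
    FROM jobs
    WHERE status = 'pending'
    ORDER BY priority_score DESC, created_at
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING id, payload;
"""


def dequeue_one(conn):
    """Claim one job; return (id, payload), or None if nothing is available."""
    cur = conn.cursor()
    try:
        cur.execute(DEQUEUE_SQL)
        row = cur.fetchone()
        conn.commit()  # releases the row lock; status='running' keeps the claim
        return row
    finally:
        cur.close()
```

Because the commit releases the row lock, the `status` column is what keeps the job claimed afterwards, which is why a crashed worker needs a separate recovery path.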

EDR-02
Decision

Custom LSTM vs PyTorch

Why Chosen

The goal was full understanding of LSTM mechanics and gradient flow. Using PyTorch would obscure the learning objective.

Alternative Rejected

GPU acceleration, automatic differentiation

Impact

Hand-built 2-layer LSTM in NumPy
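For reference, the forward pass of a standard LSTM cell fits in a few lines of NumPy. This is a sketch of the textbook gate equations, not the project's actual code: the weight layout (one stacked matrix per layer, gates ordered forget, input, candidate, output) and the function names are assumptions, and backpropagation through time is omitted.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x, h_prev, c_prev, W, b):
    """One forward step of a standard LSTM cell.

    W has shape (4 * hidden, input + hidden) and b has shape (4 * hidden,).
    The four row blocks hold the forget, input, candidate, and output gates.
    """
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    f = sigmoid(z[:hidden])                 # forget gate
    i = sigmoid(z[hidden:2 * hidden])       # input gate
    g = np.tanh(z[2 * hidden:3 * hidden])   # candidate cell state
    o = sigmoid(z[3 * hidden:])             # output gate
    c = f * c_prev + i * g                  # new cell state
    h = o * np.tanh(c)                      # new hidden state
    return h, c


def two_layer_step(x, state, params):
    """Stack two cells: layer 1's hidden state is layer 2's input."""
    (h1, c1), (h2, c2) = state
    h1, c1 = lstm_step(x, h1, c1, *params[0])
    h2, c2 = lstm_step(h1, h2, c2, *params[1])
    return h2, ((h1, c1), (h2, c2))
```

A runtime prediction would then come from a small output layer applied to the top hidden state, which is left out here.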

EDR-03
Decision

PostgreSQL advisory locks for scheduler election

Why Chosen

PostgreSQL is already in the stack. Adding Redis/etcd for one lock is operational overhead without benefit.

Alternative Rejected

Distributed lock manager (Redis, etcd)

Impact

pg_try_advisory_lock for leader election
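A sketch of leader election with `pg_try_advisory_lock`, assuming a DB-API 2.0 connection with psycopg2-style `%s` placeholders. The lock key and the function name are hypothetical; only `pg_try_advisory_lock` itself is a PostgreSQL built-in.

```python
SCHEDULER_LOCK_KEY = 728301  # hypothetical application-wide lock id


def try_become_leader(conn) -> bool:
    """Return True if this session now holds the scheduler lock.

    pg_try_advisory_lock returns true or false immediately instead of
    waiting. The lock is session-scoped, so conn must stay open for as
    long as this process acts as leader.
    """
    cur = conn.cursor()
    try:
        cur.execute("SELECT pg_try_advisory_lock(%s)", (SCHEDULER_LOCK_KEY,))
        (acquired,) = cur.fetchone()
        return bool(acquired)
    finally:
        cur.close()
```

PostgreSQL releases session-level advisory locks when the session ends, so a crashed leader frees the lock and a standby that keeps calling `try_become_leader` can take over.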

05

Failure Modes

Incident-style notes for the ways the design can break.

Priority inversion under high load

FM-01
Impact

LSTM predictions lag actual runtime changes.

Mitigation

Recalculate scores at dequeue time, not just at submission time.
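One way to rescore at dequeue time is sketched below. The scoring formula is an assumption made for illustration (base priority plus time waited relative to predicted runtime); the project's actual formula is not given here, and `Job`, `dynamic_score`, and `pick_next` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: str
    base_priority: float        # assigned at submission
    predicted_runtime_s: float  # latest runtime estimate from the model
    enqueued_at: float          # Unix timestamp


def dynamic_score(job: Job, now: float, aging_weight: float = 1.0) -> float:
    """Hypothetical score: base priority plus wait time relative to runtime.

    Short jobs gain score quickly while waiting, so they are not stuck
    indefinitely behind long high-priority jobs.
    """
    waited = max(0.0, now - job.enqueued_at)
    return job.base_priority + aging_weight * waited / max(job.predicted_runtime_s, 1.0)


def pick_next(jobs, now: float) -> Job:
    """Rescore every candidate at dequeue time and return the best one."""
    return max(jobs, key=lambda j: dynamic_score(j, now))
```

With a 600 s job at base priority 5 and a 10 s job at base priority 1, the long job wins at submission, but after 100 s of waiting the short job's score (11) overtakes it (about 5.17).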

Worker crash mid-job

FM-02
Impact

Job locked but not completed.

Mitigation

Job heartbeat timeout: jobs that do not heartbeat within 30s are re-enqueued.
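The heartbeat check can be sketched with in-memory structures standing in for the database. The 30s timeout comes from the mitigation above; the function names and the dict/list representation are assumptions for illustration.

```python
HEARTBEAT_TIMEOUT_S = 30.0  # timeout stated in the mitigation


def find_stale_jobs(last_heartbeat: dict, now: float,
                    timeout_s: float = HEARTBEAT_TIMEOUT_S) -> list:
    """Return ids of running jobs whose last heartbeat is older than timeout_s."""
    return sorted(job_id for job_id, ts in last_heartbeat.items()
                  if now - ts > timeout_s)


def requeue_stale(running: dict, pending: list, now: float) -> list:
    """Move stale jobs from running back to pending; return the moved ids."""
    stale = find_stale_jobs(running, now)
    for job_id in stale:
        del running[job_id]
        pending.append(job_id)
    return stale
```

Re-enqueueing after a timeout means a job may run more than once if the original worker was only slow, so this approach works best when jobs are idempotent.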

LSTM training on stale data

FM-03
Impact

Long-lived jobs skew runtime distribution.

Mitigation

Exponential decay weighting: recent executions are weighted higher.
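Exponential decay weighting can be expressed as a per-sample weight in the training loss. In this sketch the one-day half-life and the function names are assumptions, not values from the project.

```python
def decay_weights(ages_s, half_life_s=86400.0):
    """Weight each sample by 0.5 ** (age / half_life), normalized to sum to 1.

    ages_s holds the age of each execution record in seconds; the default
    half-life of one day is a hypothetical choice.
    """
    raw = [0.5 ** (age / half_life_s) for age in ages_s]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_mse(predicted, actual, weights):
    """Mean squared error where each sample contributes by its weight."""
    return sum(w * (p - a) ** 2 for p, a, w in zip(predicted, actual, weights))
```

A record that is one half-life old counts half as much as a fresh one, so after normalization a fresh sample and a day-old sample get weights of 2/3 and 1/3.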

K8s HPA lag

FM-04
Impact

Autoscaler reacts after queue depth spikes.

Mitigation

Pre-warm workers based on queue growth predicted by the LSTM.

06

Benchmarks

Environment first, numbers second. Metrics should be inspectable, not ornamental.

Test Environment
Runtime

Next.js / FastAPI

Workload

K8s HPA 1→5 workers

Stack

Next.js, FastAPI, TypeScript, Python

Scope

Distributed Systems

Evidence

Project-level benchmark notes

Performance Results
Worker Scaling

1→5 pods (HPA)

Scheduling Latency

<100 ms

LSTM Layers

2 layers (NumPy)

Prediction Accuracy

Improving over time

Concurrent Workers

5 max

07

Lessons Learned

Engineering takeaways from the implementation, including remaining work.

01

FOR UPDATE SKIP LOCKED is underrated: it eliminates an entire class of distributed coordination problems with one SQL clause.

02

Building LSTM from scratch in NumPy is tedious but essential for deep understanding of gate mechanics and gradient flow.

03

K8s HPA reacts to metrics, not predictions. Adding predictive scaling on top of reactive HPA lets workers start before a queue-depth spike arrives, which reduces cold-start latency.

04

Future: replace NumPy LSTM with a proper online learning approach that updates weights incrementally rather than full retraining.

Akshat