Real-time city intelligence

AI for Kuala Lumpur

Project documentation

AI for Kuala Lumpur — Documents

This page is the full documentation layer of the project. It explains the origin of the platform, the business problem it addresses, the architecture choices, the data logic, the AI copilot design, the implementation roadmap, and the value of the project from both an engineering and a recruiter perspective. It is designed to act as a hybrid of product handbook, README, interview support material, and portfolio-ready technical narrative.

Why this project exists

Project vision

AI for Kuala Lumpur is a real-time smart city intelligence platform designed to monitor urban traffic, air quality, weather, humidity, and transport conditions through a modern enterprise-style data stack.
The project was intentionally designed as more than a dashboard. It aims to simulate a complete decision-support platform where data is ingested, transformed, served, interpreted, and finally turned into AI-assisted operational recommendations.
It demonstrates strong skills in Data Engineering, Analytics Engineering, AI Engineering, dashboard design, real-time serving, and product thinking.

How the platform was born

Project story

The project started from a simple idea: many dashboards can display data, but far fewer projects show how data is ingested, how it is served live, how it is transformed analytically, and how it can be turned into operational reasoning through AI.
The first objective was to create a visible product layer so that the use case would immediately feel concrete. Once the product angle was clear, the backend, live serving logic, warehouse layer, and AI copilot could be progressively introduced.
This evolution mirrors how real enterprise systems are often built: first make the use case tangible, then industrialize the data stack behind it.

What the project is trying to solve

Problem statement

Urban operations generate multiple types of signals: mobility, environment, weather, and transport reliability. In practice, these signals often live in separate systems and are difficult to align in real time.
Decision makers usually face two gaps: live data can be noisy and hard to interpret, while historical analytics are more stable but less immediate. This project addresses that tension directly.
The platform is therefore designed to combine live visibility, analytical perspective, and AI-assisted interpretation in a single operational surface.

Operational and business value

Why this project is useful

From an operational point of view, the project helps answer questions such as: Which district is currently under the most pressure? Is the issue driven by traffic, AQI, temperature, or transit delays? Which districts look structurally more sensitive according to warehouse analytics?
From a business point of view, the platform demonstrates how a company can move from passive dashboards to active decision-support systems combining data engineering, analytics, and AI.
This is useful for consulting, public-sector transformation, mobility platforms, urban services, smart city experimentation, and any context where operational data must be interpreted quickly and responsibly.

Product walkthrough

Current pages and features

Live Overview shows the current operational picture of Kuala Lumpur through traffic, AQI, temperature, humidity, and transit metrics, now with a multi-district overview and a live district focus.
Live Map visualizes district activity geographically and serves as the spatial entry point of the platform.
AI Copilot interprets the current snapshot, warehouse context, and governance logic to produce grounded operational, analytical, or explanatory answers.
Documents acts as the project handbook, technical note, README equivalent, and recruiter-friendly project narrative.
Governance is intended to explain data lineage, quality rules, control logic, AI grounding principles, and enterprise governance alignment.

How city signals are represented

Data sources and APIs

In its current state, the platform uses simulated city signals in order to reproduce realistic operating conditions while keeping the project simple, free, reproducible, and under full control.
The current live layer generates district snapshots including traffic index, AQI, temperature, humidity, and transit delay. These signals are then pushed through the serving layer to simulate how a city monitoring platform would behave in practice.
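The simulated district snapshot described above can be sketched as follows. This is a minimal, illustrative version: the field names, value ranges, and `DistrictSnapshot` type are assumptions for the sketch, not the project's actual schema.

```python
import random
from dataclasses import dataclass, asdict

@dataclass
class DistrictSnapshot:
    district: str
    traffic_index: float   # 0 (free-flowing) .. 100 (gridlock)
    aqi: int               # air quality index
    temperature_c: float
    humidity_pct: float
    transit_delay_min: float

def generate_snapshot(district: str, rng: random.Random) -> DistrictSnapshot:
    """Produce one plausible simulated reading for a district."""
    return DistrictSnapshot(
        district=district,
        traffic_index=round(rng.uniform(10, 95), 1),
        aqi=rng.randint(20, 180),
        temperature_c=round(rng.uniform(26, 36), 1),
        humidity_pct=round(rng.uniform(55, 95), 1),
        transit_delay_min=round(rng.uniform(0, 20), 1),
    )

snapshot = generate_snapshot("Bukit Bintang", random.Random(42))
print(asdict(snapshot))
```

Seeding the generator keeps demo runs reproducible, which matters when the goal is a controlled, free, fully local simulation.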
In a production version, the platform could connect to real APIs such as:
• traffic APIs (Google Maps, TomTom, Waze-like services)
• air quality APIs (OpenAQ, governmental sensor networks)
• weather APIs (OpenWeather, Meteostat)
• transport APIs (GTFS feeds, city transport data)
• event / disruption feeds (public notices, city incidents)
This design choice is important: the project is not limited by a specific external provider. It is built as an extensible architecture able to ingest real signals later without changing the overall product logic.

How data moves across the platform

End-to-end data pipeline

The project follows a clear end-to-end pipeline logic: a producer generates raw city signals, a consumer processes them, Redis stores the latest operational state, FastAPI exposes that state to the frontend, and DuckDB/dbt provide analytical transformations for warehouse-level reasoning.
This separation is intentional. The live layer is optimized for freshness and rapid serving. The warehouse layer is optimized for structured transformation, aggregation, risk scoring, and stable analytical consumption.
The current pipeline includes:
• raw signal generation
• live cache storage in Redis
• API serving via FastAPI
• analytical storage in DuckDB
• transformation logic in dbt marts
• consumption by dashboard pages and AI endpoints

Why both layers are necessary

Live data vs warehouse analytics

Live data answers the question: what is happening now? It is immediate, but it can be noisy, local, and highly variable.
Warehouse analytics answer the question: what is consistently happening across districts or over time? They are less immediate but much better suited for comparison, aggregation, prioritization, and structured reasoning.
The project is strong precisely because it does not choose between the two. It combines both layers and uses AI to bridge the gap between real-time perception and analytical interpretation.
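The warehouse-side question ("what is consistently happening across districts?") can be sketched as an aggregation with a naive risk score. sqlite3 stands in for DuckDB here so the example stays dependency-free; in the project this logic would live in a dbt mart, and the table, columns, and scoring formula are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signals (district TEXT, traffic_index REAL, aqi REAL)")
conn.executemany(
    "INSERT INTO signals VALUES (?, ?, ?)",
    [("Chow Kit", 80, 120), ("Chow Kit", 70, 110),
     ("Bangsar", 40, 60), ("Bangsar", 50, 70)],
)

# Naive risk score: mean of normalized traffic (0-100) and AQI (0-200),
# ranked so the structurally most pressured district comes first.
rows = conn.execute("""
    SELECT district,
           AVG(traffic_index) AS avg_traffic,
           AVG(aqi)           AS avg_aqi,
           (AVG(traffic_index) / 100.0 + AVG(aqi) / 200.0) / 2 AS risk_score
    FROM signals
    GROUP BY district
    ORDER BY risk_score DESC
""").fetchall()

for district, avg_traffic, avg_aqi, risk in rows:
    print(f"{district}: traffic={avg_traffic:.0f} aqi={avg_aqi:.0f} risk={risk:.2f}")
```

Unlike a live snapshot, this ranking is stable across refreshes, which is exactly what makes it suitable for comparison and prioritization.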

How the assistant reasons

AI copilot and RAG logic

The AI copilot is not designed as a generic chatbot. It is designed as a grounded assistant specialized in this platform. Its role is to interpret the current city state, connect it to warehouse context, and explain the system using governance-aware reasoning.
Its RAG logic is based on three context layers: live snapshot context, warehouse analytical context, and governance knowledge context. This allows the assistant to answer operational, analytical, and explanatory questions.
This is also an important design decision for trust: the assistant should not invent a city narrative unsupported by the platform. It should answer from structured project context.
Over time, this layer can evolve into a more advanced enterprise-grade assistant with retrieval scoring, memory, richer documentation search, and stronger explainability.
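The routing-plus-grounding idea above can be sketched as follows. The three intents map onto the three context layers; the keyword lists, context strings, and prompt template are illustrative assumptions, and the real routing is presumably richer.

```python
def route_intent(question: str) -> str:
    """Map a user question to one of the three context layers."""
    q = question.lower()
    if any(w in q for w in ("now", "current", "today")):
        return "live"        # answered from the live snapshot context
    if any(w in q for w in ("trend", "compare", "historically", "risk")):
        return "warehouse"   # answered from warehouse analytics context
    return "governance"      # fall back to explanatory / system context

def build_prompt(question: str, contexts: dict[str, str]) -> str:
    """Assemble a grounded prompt: the model only sees project context."""
    layer = route_intent(question)
    return (
        "Answer ONLY from the context below.\n"
        f"[{layer} context]\n{contexts[layer]}\n"
        f"Question: {question}"
    )

contexts = {
    "live": "Chow Kit traffic_index=82, aqi=130.",
    "warehouse": "Chow Kit ranks highest on 30-day risk score.",
    "governance": "Signals are simulated; AI answers are grounded in project context.",
}
print(build_prompt("Which district is under pressure right now?", contexts))
```

The "answer only from context" instruction is the grounding contract: the model is steered away from inventing a city narrative the platform cannot support.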

Current and target architecture

Technical stack

Frontend: Next.js, TypeScript, Tailwind CSS, component-based dashboard architecture, multilingual support.
Backend: FastAPI, REST endpoints, live serving endpoints, AI copilot endpoint, warehouse refresh/status endpoints.
Streaming foundation: producer/consumer logic, Redis live cache, local real-time serving mode.
Warehouse foundation: DuckDB analytical database, dbt transformations, marts for risk and latest district logic.
AI layer: LLM-compatible endpoint (Groq/OpenAI style), intent routing, grounded generation, governance-aware context injection.

Why it was designed this way

Architecture logic

The platform separates the presentation layer, the serving/API layer, the streaming/live layer, and the warehouse/analytics layer.
This mirrors how enterprise products are built: data is ingested, cached or stored, transformed, served through APIs, and then consumed by dashboards and AI systems.
The architecture also supports two deployment mindsets: a richer local real-time engineering mode, and a simpler demo-compatible mode suitable for public frontend deployment.

What made the project hard

Main difficulties encountered

One difficulty was balancing realism and simplicity. Real-time systems can quickly become operationally heavy, especially when mixing producer/consumer logic, Redis, warehouse refresh, and frontend rendering.
Another challenge was making the frontend credible. Some early visual blocks looked decorative rather than meaningful, so the interface had to be reoriented toward true operational value.
The AI layer also created an important challenge: how to make the copilot more natural and intelligent without letting it hallucinate? This is why intent routing, structured context, and grounded answers matter so much.

What will be added next

Implementation roadmap

Phase 1: premium frontend, live overview, map, Redis-powered serving, API foundation, warehouse risk exposure, and AI copilot V1.
Phase 2: stronger documents and governance layers, multilingual consistency, cleaner UX, and better multi-page product structure.
Phase 3: richer warehouse automation, improved refresh logic, stronger analytical comparisons, and more realistic district-level intelligence.
Phase 4: advanced AI assistant, better retrieval, deeper governance integration, and eventually real external APIs and production-style ingestion.

Enterprise readiness

Why this roadmap matters

It shows a progression from visible product value to full data-platform maturity.
This is exactly the kind of evolution recruiters and managers like to see: not just a static showcase, but a project that can grow toward architecture, orchestration, analytics, and AI responsibility.
Each iteration adds a new proof point: frontend quality, backend logic, data movement, transformation, governance, and business-oriented AI.

Real-world constraints

Deployment challenges & technical decisions

This project was not only about building features, but also about facing real-world deployment constraints. The initial objective was to deploy a fully real-time architecture with a streaming pipeline, background worker, Redis cache, and a live dashboard behaving exactly as in local development.
The first deployment attempt used Vercel. However, Vercel is optimized for frontend and serverless functions, not for long-running backend processes or real-time pipelines. This led to build errors, configuration issues, and limitations around persistent services.
The architecture was then migrated to a combination of Netlify (frontend) and Render (backend API). This solved deployment stability, but introduced another constraint: the background workers required to simulate real-time streaming are not available for free on most platforms.
Instead of forcing a paid infrastructure, a pragmatic solution was designed: the frontend itself triggers live data generation every few seconds through the API. This preserves the real-time visual experience while remaining fully free and deployable.
This dual-mode architecture is intentional:
- Local mode: full producer/consumer streaming with Redis
- Public demo mode: frontend-driven live simulation

This demonstrates the ability to adapt architecture to real constraints while preserving product experience.
This phase of the project highlights an important engineering mindset: building a system is not only about technology, but also about trade-offs, cost constraints, and delivering a convincing experience under real-world limitations.

What the finished platform becomes

Final target state

The final target state is a modern enterprise platform combining real-time ingestion, live serving, warehouse transformation, analytical reasoning, AI copilots, and decision-support dashboards.
In portfolio terms, this becomes much more than a dashboard. It becomes a concrete demonstration of product thinking, engineering structure, AI grounding, documentation quality, and enterprise readiness.
In README terms, this page can serve as the narrative backbone of the repository: what the project does, why it matters, how it works, where the data comes from, what was difficult, and how the platform can evolve.