Est. MMXXIV · London · AI Systems

Ozgur Guler · A public record

Production AI, from first principles.

I build, write about, and ship production AI systems — agent harnesses, long-running workflows, inference architecture, EvalOps, and the infrastructure underneath.

A curated record: books in progress, essays, code, build notes, talks, and selected startup work. No hype, no metrics theatre — just the trail.

Now: Drafting AI Inference Engineering and AI Agents in Production.
Building: Durable agent harnesses on Azure AI Foundry · MCP boundaries · EvalOps.
Open to: Talks, sober technical consulting, and selected startup work.

Selected work

A curated index.

  1. AI Inference Engineering (Book, in progress). Serving stacks, latency, throughput, KV-cache pressure, GPU economics, and AI factory architecture.
  2. AI Agents in Production (Book, in progress). Memory, durable execution, tool boundaries, evals, replay, governance, and enterprise deployment.
  3. foundry-demo (Code, Azure AI Foundry workshop). Hosted agents, tracing, governance, grounding, MCP tooling, and A2A flows, packaged as a hands-on workshop.
  4. agent-framework-ozg (Code, workshop fork). Runnable samples for agents, workflows, memory, reasoning, and Azure AI Foundry integration.
  5. Production Agent Workflows: Orchestration and Observability (Essay). Typed workflow graphs, fan-out/fan-in execution, checkpointing, human approval gates, and telemetry.
  6. Local PII Pre-Filter with Presidio and Qwen 2.5 (Essay, guardrails). A local pre-prompt guardrail using Presidio and a small CPU model to reduce PII exposure before LLM calls.

Practice

Two tracks, one discipline.

Production AI is the meeting point of model behaviour, infrastructure economics, and operator trust. I work where those three meet.

Agents

  • Memory, state, and forgetting policies
  • Durable execution and long-running workflows
  • Tool boundaries, MCP, and approval gates
  • Evals, replay, and run-trace observability
  • Governance and enterprise deployment

Inference

  • Serving topology and model routing
  • Latency, throughput, and batching
  • KV-cache pressure and scheduling
  • GPU economics and AI-factory architecture
  • Benchmarking, reliability, observability

Books

Long-form, in progress.

Recent

Writing and build notes.

Startup work

Public-safe notes.

Selected involvement; deep technical work, public framing.

Enlighty.ai

AI-native consumer intelligence platform turning fragmented data into trusted insight.

enlighty.ai

Eachlabs.ai

AI workflow and model platform for app builders, with curated image, video, voice, and text models.

eachlabs.ai

Ozgur Guler
AI systems builder.

Set in Newsreader, Inter, & JetBrains Mono.
Built with Astro · No tracking · No nonsense.