Ozgur Guler · A public record
Production AI, from first principles.
I build, write about, and ship production AI systems — agent harnesses, long-running workflows, inference architecture, EvalOps, and the infrastructure underneath.
A curated record: books in progress, essays, code, build notes, talks, and selected startup work. No hype, no metrics theatre — just the trail.
- Now: Drafting AI Inference Engineering and AI Agents in Production.
- Building: Durable agent harnesses on Azure AI Foundry · MCP boundaries · EvalOps.
- Open to: Talks, sober technical consulting, and selected startup work.
Selected work
A curated index.
- № 01 AI Inference Engineering: Serving stacks, latency, throughput, KV-cache pressure, GPU economics, and AI factory architecture. (Book, in progress)
- № 02 AI Agents in Production: Memory, durable execution, tool boundaries, evals, replay, governance, and enterprise deployment. (Book, in progress)
- № 03 foundry-demo: Hosted agents, tracing, governance, grounding, MCP tooling, and A2A flows, packaged as a hands-on workshop. (Code, Azure AI Foundry workshop)
- № 04 agent-framework-ozg: Runnable samples for agents, workflows, memory, reasoning, and Azure AI Foundry integration. (Code, workshop fork)
- № 05 Production Agent Workflows: Orchestration and Observability. Typed workflow graphs, fan-out/fan-in execution, checkpointing, human approval gates, and telemetry. (Essay)
- № 06 Local PII Pre-Filter with Presidio and Qwen 2.5: A local pre-prompt guardrail using Presidio and a small CPU model to reduce PII exposure before LLM calls. (Essay, guardrails)
Practice
Two tracks, one discipline.
Production AI is the meeting point of model behaviour, infrastructure economics, and operator trust. I work where those three meet.
Agents
- Memory, state, and forgetting policies
- Durable execution and long-running workflows
- Tool boundaries, MCP, and approval gates
- Evals, replay, and run-trace observability
- Governance and enterprise deployment
Inference
- Serving topology and model routing
- Latency, throughput, and batching
- KV-cache pressure and scheduling
- GPU economics and AI-factory architecture
- Benchmarking, reliability, observability
Books
Long-form, in progress.
Recent
Writing and build notes.
Essays
- September's Mega Rounds. A public snapshot of AI infrastructure, foundation model, sovereign AI, and agent funding patterns.
- Production Agent Workflows: Orchestration and Observability. Typed workflow graphs, fan-out/fan-in execution, checkpointing, human approval gates, and telemetry for production-grade agent systems.
- Generative UI: From Static Screens to Adaptive Systems. A practical view of model-assisted interfaces that adapt with contracts, policies, design tokens, and evaluation.
- Local PII Pre-Filter with Microsoft Presidio and Qwen 2.5. A local pre-prompt guardrail using Presidio and a small CPU model to reduce PII exposure before LLM calls.
Build log
- Agent workflow orchestration needs typed state. Converted agent orchestration notes into a reusable production pattern: typed outputs, checkpoints, replay, and telemetry.
- Local PII filtering before model invocation. Packaged a local Presidio plus small-model guardrail pattern for pre-prompt privacy control.
- Data requirements for agentic AI talks. Mapped agentic AI, assisted coding, and governance shifts to concrete data estate requirements.
- Copilot Studio review after a break. Revisited Copilot Studio capabilities through the lens of practical agent deployment and governance.
Startup work
Public-safe notes.
Selected involvement; deep technical work, public framing.
Enlighty.ai
AI-native consumer intelligence platform turning fragmented data into trusted insight.
Eachlabs.ai
AI workflow and model platform for app builders, with curated image, video, voice, and text models.
Ozgur Guler
AI systems builder.