<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"><generator uri="https://jekyllrb.com/" version="4.3.3">Jekyll</generator><link href="https://xavidop.me/feed.xml" rel="self" type="application/atom+xml" /><link href="https://xavidop.me/" rel="alternate" type="text/html" hreflang="en" /><updated>2026-05-17T16:51:16+00:00</updated><id>https://xavidop.me/feed.xml</id><title type="html">Xavier Portilla Edo</title><subtitle>Personal Blog of Xavier Portilla Edo.
</subtitle><author><name>Xavi Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><entry xml:lang="en"><title type="html">Durable Genkit Flows with Temporal: Introducing the genkitx-temporal Plugin (English)</title><link href="https://xavidop.me/genkit/2026-05-16-genkitx-temporal-durable-genkit-flows/" rel="alternate" type="text/html" title="Durable Genkit Flows with Temporal: Introducing the genkitx-temporal Plugin (English)" /><published>2026-05-16T00:00:00+00:00</published><updated>2026-05-17T16:50:50+00:00</updated><id>https://xavidop.me/genkit/genkitx-temporal-durable-genkit-flows</id><content type="html" xml:base="https://xavidop.me/genkit/2026-05-16-genkitx-temporal-durable-genkit-flows/"><![CDATA[<ol class="no_toc" id="markdown-toc">
  <li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li>
  <li><a href="#what-is-genkitx-temporal" id="markdown-toc-what-is-genkitx-temporal">What is <code class="language-plaintext highlighter-rouge">genkitx-temporal</code>?</a></li>
  <li><a href="#why-temporal-and-genkit-are-a-great-match" id="markdown-toc-why-temporal-and-genkit-are-a-great-match">Why Temporal and Genkit are a great match</a></li>
  <li><a href="#installation" id="markdown-toc-installation">Installation</a></li>
  <li><a href="#usage-end-to-end" id="markdown-toc-usage-end-to-end">Usage end to end</a>    <ol>
      <li><a href="#1-define-a-flow-with-definetemporalflow" id="markdown-toc-1-define-a-flow-with-definetemporalflow">1. Define a flow with <code class="language-plaintext highlighter-rouge">defineTemporalFlow</code></a></li>
      <li><a href="#2-start-a-worker" id="markdown-toc-2-start-a-worker">2. Start a Worker</a></li>
      <li><a href="#3-execute-a-flow-as-a-temporal-workflow" id="markdown-toc-3-execute-a-flow-as-a-temporal-workflow">3. Execute a flow as a Temporal Workflow</a></li>
      <li><a href="#configuration" id="markdown-toc-configuration">Configuration</a></li>
      <li><a href="#advanced-combine-with-your-own-workflows-and-activities" id="markdown-toc-advanced-combine-with-your-own-workflows-and-activities">Advanced: combine with your own workflows and activities</a></li>
    </ol>
  </li>
  <li><a href="#api-summary" id="markdown-toc-api-summary">API summary</a></li>
  <li><a href="#when-to-reach-for-genkitx-temporal" id="markdown-toc-when-to-reach-for-genkitx-temporal">When to reach for <code class="language-plaintext highlighter-rouge">genkitx-temporal</code></a></li>
  <li><a href="#how-it-pairs-with-the-rest-of-the-genkit-ecosystem" id="markdown-toc-how-it-pairs-with-the-rest-of-the-genkit-ecosystem">How it pairs with the rest of the Genkit ecosystem</a></li>
  <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ol>

<h2 id="introduction">Introduction</h2>

<p>Anyone who has shipped a non-trivial Gen AI feature to production has hit the same wall: the happy path is fun to demo, but the unhappy path is brutal. Model providers throw 5xx errors. Rate limits kick in at the worst possible moment. A long-running agent gets halfway through a 12-step plan and the pod is restarted by Kubernetes. A user closes the tab in the middle of a streaming response and you have no idea what state the flow ended up in.</p>

<p><a href="https://genkit.dev">Genkit</a> gives you a clean way to author those LLM-orchestrating pipelines as <strong>flows</strong>, but a Genkit flow, by itself, is just a function. It lives and dies with the process that invokes it. There is no durable history, no automatic retry policy, no built-in cancellation, no operations UI.</p>

<p>This is exactly the gap that <a href="https://temporal.io">Temporal</a> was designed to close for general-purpose backend code, and it is the gap that the new <a href="https://github.com/xavidop/genkitx-temporal"><strong>genkitx-temporal</strong></a> plugin closes for Genkit flows.</p>

<p>In this article we will look at:</p>

<ul>
  <li>What the plugin actually does.</li>
  <li>Why Temporal and Genkit are a particularly good match.</li>
  <li>How to use it end to end: define, register, run.</li>
  <li>When you should reach for it (and when you probably shouldn’t).</li>
</ul>

<h2 id="what-is-genkitx-temporal">What is <code class="language-plaintext highlighter-rouge">genkitx-temporal</code>?</h2>

<p><code class="language-plaintext highlighter-rouge">genkitx-temporal</code> is a Genkit plugin that lets you <strong>execute any Genkit flow inside a Temporal Workflow</strong>, transparently. You keep writing flows the way you always have; the plugin takes care of:</p>

<ul>
  <li>Registering the flow so a Temporal Worker can pick it up.</li>
  <li>Wrapping each execution in a deterministic Temporal Workflow.</li>
  <li>Running the non-deterministic LLM/tool/RAG work inside a Temporal Activity, so retries and timeouts are safe.</li>
  <li>Giving you helpers to start workflows from any client.</li>
</ul>

<p>Under the hood, the plugin ships a generic, deterministic workflow called <code class="language-plaintext highlighter-rouge">runGenkitFlow</code> that invokes a single Temporal Activity called <code class="language-plaintext highlighter-rouge">runGenkitFlowActivity</code>. The activity looks up your Genkit flow by name in an in-process registry and runs it inside a full Node environment, where all the messy, non-deterministic things (network calls, model calls, tool calls, RAG lookups) are perfectly fine.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>┌────────────┐  start workflow   ┌──────────────────────┐
│  Client    │ ────────────────▶ │  Temporal Server     │
└────────────┘                   └─────────┬────────────┘
                                           │ task
                                           ▼
                                ┌──────────────────────┐
                                │  Worker process      │
                                │  ┌────────────────┐  │
                                │  │ runGenkitFlow  │  │  (workflow, sandboxed)
                                │  │       │        │  │
                                │  │       ▼        │  │
                                │  │ runGenkit-     │  │  (activity, full Node)
                                │  │ FlowActivity   │  │
                                │  │       │        │  │
                                │  │       ▼        │  │
                                │  │  your Genkit   │  │
                                │  │  flow (LLM…)   │  │
                                │  └────────────────┘  │
                                └──────────────────────┘
</code></pre></div></div>

<p>This split is the whole trick. Temporal Workflows are required to be deterministic so that they can be <strong>replayed</strong> from event history after a crash, deploy or scale event. LLM calls are obviously not deterministic. Putting the LLM work inside an Activity is the canonical Temporal pattern, and the plugin does it for you so you don’t have to think about it.</p>
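
<p>To make the split concrete, here is a minimal in-process sketch of the idea: a registry keyed by flow name, an activity-style function that looks the flow up and runs it, and a workflow-style function that only forwards its arguments. Every name in it (<code class="language-plaintext highlighter-rouge">flowRegistry</code>, <code class="language-plaintext highlighter-rouge">registerFlow</code>, <code class="language-plaintext highlighter-rouge">runFlowActivity</code>, <code class="language-plaintext highlighter-rouge">runFlowWorkflow</code>) is hypothetical. It models the pattern and is not the plugin’s actual source, and it does not talk to a Temporal Server.</p>

```typescript
// Simplified, in-process model of the workflow/activity split.
// All names are hypothetical; this is not the plugin's source code.
type FlowFn = (input: unknown) => Promise<unknown>;
type ActivityFn = (flowName: string, input: unknown) => Promise<unknown>;

// Registry populated when flow definitions are imported.
const flowRegistry = new Map<string, FlowFn>();

function registerFlow(name: string, fn: FlowFn): void {
  if (flowRegistry.has(name)) {
    throw new Error(`Flow "${name}" is already registered`);
  }
  flowRegistry.set(name, fn);
}

// Activity side: full Node environment, so network and model calls are fine.
async function runFlowActivity(flowName: string, input: unknown): Promise<unknown> {
  const flow = flowRegistry.get(flowName);
  if (!flow) {
    throw new Error(`No flow registered under "${flowName}"`);
  }
  return flow(input);
}

// Workflow side: no I/O, no clocks, no randomness. It only forwards the
// flow name and input to the activity, which keeps it safe to replay.
async function runFlowWorkflow(
  activity: ActivityFn,
  flowName: string,
  input: unknown,
): Promise<unknown> {
  return activity(flowName, input);
}

// Usage: register a stand-in flow, then run it through the workflow function.
registerFlow('echoFlow', async (input) => `echo: ${String(input)}`);
const pending = runFlowWorkflow(runFlowActivity, 'echoFlow', 'cats');
```

<p>In the real system the workflow and the activity run in separate contexts and Temporal persists the activity’s input and result; the sketch only shows why the workflow side can stay trivially deterministic.</p>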

<h2 id="why-temporal-and-genkit-are-a-great-match">Why Temporal and Genkit are a great match</h2>

<p>It is easy to underestimate how much engineering is hiding behind “just call the model again if it fails”. Once you start building real agents, the list of things you want from your runtime grows quickly. Genkit gives you the authoring primitives, and Temporal gives you the runtime guarantees. Together:</p>

<ul>
  <li><strong>Automatic retries for transient errors.</strong> LLM providers regularly return 429s and 5xx. Tool calls hit the network. Vector stores time out. Temporal lets you express retry policies (exponential backoff, max attempts, non-retryable error types) declaratively, applied to every flow execution.</li>
  <li><strong>Durable history.</strong> Every event in your workflow’s execution is persisted by the Temporal Server. If your Worker pod is killed mid-flow, another Worker picks up the workflow from that history, with the results of all completed activities intact, and retries the activity that was interrupted. No lost executions, no orphan jobs. Because an interrupted activity runs again from its start, side effects such as charging a user should still be idempotent.</li>
  <li><strong>Timeouts, heartbeats and cancellation.</strong> Long-running agents that browse, plan and call tools for minutes are a nightmare to control with plain HTTP. Temporal models start-to-close, schedule-to-close and heartbeat timeouts as first-class concepts, plus explicit cancellation semantics you can wire to a user closing a tab.</li>
  <li><strong>Operational visibility out of the box.</strong> The Temporal UI gives you a per-execution timeline of every workflow and activity. You can see inputs, outputs, retries, failures and stack traces, search by workflow id, terminate or signal running workflows. This complements the <a href="/genkit/2026-05-14-dev-ui-shift-left-genkit-vercel-mastra/">Genkit Developer UI</a> nicely: Genkit’s UI is your <strong>local debugging tool</strong> during development; Temporal’s UI is your <strong>operational dashboard</strong> in staging and production.</li>
  <li><strong>Horizontal scalability for free.</strong> Need more throughput? Run more Worker processes pointing at the same task queue. The Temporal Server load-balances workflows and activities across them. Your flows did not need to change.</li>
  <li><strong>Long-running, human-in-the-loop friendly.</strong> Temporal workflows can sleep for days, wait for signals from external systems, and resume cleanly. Perfect for agents that need approval before executing a sensitive tool, or RAG pipelines that wait for a document indexing job to finish.</li>
</ul>
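
<p>As a rough illustration of what a declarative retry policy with exponential backoff means in practice, the sketch below computes the wait before each retry. Temporal’s retry policy exposes <code class="language-plaintext highlighter-rouge">initialInterval</code>, <code class="language-plaintext highlighter-rouge">backoffCoefficient</code>, <code class="language-plaintext highlighter-rouge">maximumInterval</code> and <code class="language-plaintext highlighter-rouge">maximumAttempts</code>; the millisecond-suffixed field names and the <code class="language-plaintext highlighter-rouge">backoffSchedule</code> helper are assumptions made for this example, and the Temporal Server does this calculation for you.</p>

```typescript
// Field names mirror Temporal's retry policy, with explicit millisecond units
// added for this sketch. The helper itself is hypothetical.
interface RetryPolicySketch {
  initialIntervalMs: number;  // delay before the first retry
  backoffCoefficient: number; // multiplier applied after each failed attempt
  maximumIntervalMs: number;  // cap on any single delay
  maximumAttempts: number;    // total attempts, including the first one
}

// Returns the delay before each retry, in order.
function backoffSchedule(policy: RetryPolicySketch): number[] {
  const delays: number[] = [];
  for (let retry = 1; retry < policy.maximumAttempts; retry++) {
    const uncapped =
      policy.initialIntervalMs * Math.pow(policy.backoffCoefficient, retry - 1);
    delays.push(Math.min(uncapped, policy.maximumIntervalMs));
  }
  return delays;
}

const schedule = backoffSchedule({
  initialIntervalMs: 1000,
  backoffCoefficient: 2,
  maximumIntervalMs: 10000,
  maximumAttempts: 6,
});
// schedule is [1000, 2000, 4000, 8000, 10000]: five retries for six attempts,
// with the last delay capped at maximumIntervalMs.
```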

<p>In other words, <code class="language-plaintext highlighter-rouge">genkitx-temporal</code> turns your Genkit flows from “smart functions inside a Node process” into <strong>first-class durable workloads</strong> without changing the way you write them.</p>

<h2 id="installation">Installation</h2>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm <span class="nb">install </span>genkitx-temporal genkit
</code></pre></div></div>

<p>The Temporal SDK packages are installed automatically as dependencies of the plugin.</p>

<p>You also need a running Temporal Server. For local development, the easiest path is:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>brew <span class="nb">install </span>temporal
temporal server start-dev
</code></pre></div></div>

<p>This starts a Temporal Server on <code class="language-plaintext highlighter-rouge">localhost:7233</code> and the UI on <code class="language-plaintext highlighter-rouge">http://localhost:8233</code>.</p>

<h2 id="usage-end-to-end">Usage end to end</h2>

<p>The plugin exposes a small, focused API. Three things are enough to ship a durable Genkit flow: define it with the Temporal-aware helper, start a Worker, and execute it from a client.</p>

<h3 id="1-define-a-flow-with-definetemporalflow">1. Define a flow with <code class="language-plaintext highlighter-rouge">defineTemporalFlow</code></h3>

<p><code class="language-plaintext highlighter-rouge">defineTemporalFlow</code> is a drop-in replacement for <code class="language-plaintext highlighter-rouge">ai.defineFlow</code>. The returned object is a normal Genkit flow, so you can still call it directly, expose it via the Developer UI, run it from tests, etc. The only difference is that it is also <strong>registered</strong> internally so a Temporal Worker can find it by name.</p>

<div class="language-ts highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// flows.ts</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">genkit</span><span class="p">,</span> <span class="nx">z</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkit</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">googleAI</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@genkit-ai/google-genai</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">defineTemporalFlow</span><span class="p">,</span> <span class="nx">temporal</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkitx-temporal</span><span class="dl">'</span><span class="p">;</span>

<span class="k">export</span> <span class="kd">const</span> <span class="nx">ai</span> <span class="o">=</span> <span class="nf">genkit</span><span class="p">({</span>
  <span class="na">plugins</span><span class="p">:</span> <span class="p">[</span>
    <span class="nf">googleAI</span><span class="p">(),</span>
    <span class="nf">temporal</span><span class="p">({</span> <span class="na">taskQueue</span><span class="p">:</span> <span class="dl">'</span><span class="s1">my-queue</span><span class="dl">'</span> <span class="p">}),</span>
  <span class="p">],</span>
  <span class="na">model</span><span class="p">:</span> <span class="nx">googleAI</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">gemini-flash-latest</span><span class="dl">'</span><span class="p">),</span>
<span class="p">});</span>

<span class="k">export</span> <span class="kd">const</span> <span class="nx">jokeFlow</span> <span class="o">=</span> <span class="nf">defineTemporalFlow</span><span class="p">(</span>
  <span class="nx">ai</span><span class="p">,</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">jokeFlow</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">inputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
    <span class="na">outputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="p">},</span>
  <span class="k">async </span><span class="p">(</span><span class="nx">subject</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="p">{</span> <span class="nx">text</span> <span class="p">}</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">(</span><span class="s2">`Tell me a joke about </span><span class="p">${</span><span class="nx">subject</span><span class="p">}</span><span class="s2">`</span><span class="p">);</span>
    <span class="k">return</span> <span class="nx">text</span><span class="p">;</span>
  <span class="p">},</span>
<span class="p">);</span>
</code></pre></div></div>

<p>A few things to notice:</p>

<ul>
  <li>You configure the plugin like any other Genkit plugin, passing the Temporal task queue (and optionally <code class="language-plaintext highlighter-rouge">address</code>, <code class="language-plaintext highlighter-rouge">namespace</code>, etc.).</li>
  <li>The flow body is <strong>plain Genkit code</strong>. No Temporal-specific imports, no <code class="language-plaintext highlighter-rouge">proxyActivities</code>, no determinism gymnastics. The plugin handles all that for you.</li>
  <li>The flow’s <code class="language-plaintext highlighter-rouge">name</code> is also its Temporal registration key. Keep it unique within your Worker.</li>
</ul>

<h3 id="2-start-a-worker">2. Start a Worker</h3>

<p>Workers are the processes that actually execute your flows. A Worker imports your flows (so the registry is populated) and then calls <code class="language-plaintext highlighter-rouge">startTemporalWorker</code>:</p>

<div class="language-ts highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// worker.ts</span>
<span class="k">import</span> <span class="dl">'</span><span class="s1">./flows</span><span class="dl">'</span><span class="p">;</span>   <span class="c1">// side-effect import: registers the flows</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">startTemporalWorker</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkitx-temporal</span><span class="dl">'</span><span class="p">;</span>

<span class="nf">startTemporalWorker</span><span class="p">({</span> <span class="na">taskQueue</span><span class="p">:</span> <span class="dl">'</span><span class="s1">my-queue</span><span class="dl">'</span> <span class="p">})</span>
  <span class="p">.</span><span class="k">catch</span><span class="p">((</span><span class="nx">e</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span> <span class="nx">console</span><span class="p">.</span><span class="nf">error</span><span class="p">(</span><span class="nx">e</span><span class="p">);</span> <span class="nx">process</span><span class="p">.</span><span class="nf">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span> <span class="p">});</span>
</code></pre></div></div>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>node ./dist/worker.js
</code></pre></div></div>

<p>You can run as many Worker processes as you want against the same task queue. Temporal will distribute work across them automatically. If a Worker dies mid-flow, another one picks up the work.</p>

<h3 id="3-execute-a-flow-as-a-temporal-workflow">3. Execute a flow as a Temporal Workflow</h3>

<p>From any client (an HTTP handler, a CLI, a cron job, another workflow), use <code class="language-plaintext highlighter-rouge">executeTemporalFlow</code> to start a workflow and <code class="language-plaintext highlighter-rouge">await</code> its result:</p>

<div class="language-ts highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// client.ts</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">executeTemporalFlow</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkitx-temporal</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">jokeFlow</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">./flows</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">result</span> <span class="o">=</span> <span class="k">await</span> <span class="nf">executeTemporalFlow</span><span class="p">(</span><span class="nx">jokeFlow</span><span class="p">,</span> <span class="dl">'</span><span class="s1">cats</span><span class="dl">'</span><span class="p">,</span> <span class="p">{</span>
  <span class="na">taskQueue</span><span class="p">:</span> <span class="dl">'</span><span class="s1">my-queue</span><span class="dl">'</span><span class="p">,</span>
<span class="p">});</span>
<span class="nx">console</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="nx">result</span><span class="p">);</span>
</code></pre></div></div>

<p>If you don’t want to block on the result, use <code class="language-plaintext highlighter-rouge">startTemporalFlow</code> instead. It returns the raw Temporal <code class="language-plaintext highlighter-rouge">WorkflowHandle</code>, which lets you query, signal, or cancel the running workflow later. This is the building block for human-in-the-loop scenarios, scheduled flows, fan-out/fan-in patterns, and so on.</p>

<h3 id="configuration">Configuration</h3>

<p><code class="language-plaintext highlighter-rouge">temporal(options)</code> and every helper accept the same connection options. Anything you don’t pass falls back to environment variables, then to sensible defaults:</p>

<table>
  <thead>
    <tr>
      <th>Option</th>
      <th>Env var</th>
      <th>Default</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">address</code></td>
      <td><code class="language-plaintext highlighter-rouge">TEMPORAL_ADDRESS</code></td>
      <td><code class="language-plaintext highlighter-rouge">localhost:7233</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">namespace</code></td>
      <td><code class="language-plaintext highlighter-rouge">TEMPORAL_NAMESPACE</code></td>
      <td><code class="language-plaintext highlighter-rouge">default</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">taskQueue</code></td>
      <td><code class="language-plaintext highlighter-rouge">TEMPORAL_TASK_QUEUE</code></td>
      <td><code class="language-plaintext highlighter-rouge">genkit</code></td>
    </tr>
  </tbody>
</table>

<p>This makes it straightforward to run the same code locally against a dev server and in production against Temporal Cloud or a self-hosted cluster.</p>
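
<p>The fallback order in the table (explicit option, then environment variable, then default) can be sketched as a small resolver. The function name <code class="language-plaintext highlighter-rouge">resolveConnection</code> and its types are hypothetical and only illustrate the precedence; the plugin’s internal implementation may differ. The environment is passed in as a plain object so the example runs without touching the real process environment.</p>

```typescript
// Hypothetical resolver illustrating the precedence from the table above:
// explicit option first, then environment variable, then default.
interface ConnectionOptions {
  address?: string;
  namespace?: string;
  taskQueue?: string;
}

type Env = Record<string, string | undefined>;

function resolveConnection(
  options: ConnectionOptions = {},
  env: Env = {},
): Required<ConnectionOptions> {
  return {
    address: options.address ?? env.TEMPORAL_ADDRESS ?? 'localhost:7233',
    namespace: options.namespace ?? env.TEMPORAL_NAMESPACE ?? 'default',
    taskQueue: options.taskQueue ?? env.TEMPORAL_TASK_QUEUE ?? 'genkit',
  };
}

// The explicit taskQueue wins over the env var, the address comes from the
// environment, and the namespace falls through to the default.
const resolved = resolveConnection(
  { taskQueue: 'my-queue' },
  { TEMPORAL_ADDRESS: 'temporal.internal:7233', TEMPORAL_TASK_QUEUE: 'from-env' },
);
```

<p>In a real process you would pass <code class="language-plaintext highlighter-rouge">process.env</code> as the second argument.</p>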

<h3 id="advanced-combine-with-your-own-workflows-and-activities">Advanced: combine with your own workflows and activities</h3>

<p>The bundled <code class="language-plaintext highlighter-rouge">runGenkitFlow</code> workflow is enough for the common case. If you want to mix Genkit flows with your own existing Temporal workflows and activities, the plugin lets you bring your own:</p>

<div class="language-ts highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">await</span> <span class="nf">startTemporalWorker</span><span class="p">({</span>
  <span class="na">taskQueue</span><span class="p">:</span> <span class="dl">'</span><span class="s1">my-queue</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">workflowsPath</span><span class="p">:</span> <span class="nx">require</span><span class="p">.</span><span class="nf">resolve</span><span class="p">(</span><span class="dl">'</span><span class="s1">./my-workflows</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">activities</span><span class="p">:</span> <span class="p">{</span> <span class="p">...</span><span class="nf">require</span><span class="p">(</span><span class="dl">'</span><span class="s1">./my-activities</span><span class="dl">'</span><span class="p">)</span> <span class="p">},</span>
<span class="p">});</span>
</code></pre></div></div>

<div class="language-ts highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// my-activities.ts</span>
<span class="k">export</span> <span class="p">{</span> <span class="nx">runGenkitFlowActivity</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkitx-temporal/activities</span><span class="dl">'</span><span class="p">;</span>
<span class="k">export</span> <span class="k">async</span> <span class="kd">function</span> <span class="nf">myOtherActivity</span><span class="p">(</span><span class="cm">/* ... */</span><span class="p">)</span> <span class="p">{</span> <span class="cm">/* ... */</span> <span class="p">}</span>
</code></pre></div></div>

<p>Re-exporting <code class="language-plaintext highlighter-rouge">runGenkitFlowActivity</code> keeps the built-in workflow working, so you can compose Genkit flows alongside hand-written activities for the parts of your system that aren’t AI.</p>

<h2 id="api-summary">API summary</h2>

<p>The full public surface is tiny:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">temporal(options?)</code> — the Genkit plugin.</li>
  <li><code class="language-plaintext highlighter-rouge">defineTemporalFlow(ai, config, fn)</code> — define a flow and register it for Temporal execution.</li>
  <li><code class="language-plaintext highlighter-rouge">startTemporalWorker(options?)</code> — start a Worker process.</li>
  <li><code class="language-plaintext highlighter-rouge">executeTemporalFlow(flow, input, options?)</code> — run a flow inside a Workflow and await the result.</li>
  <li><code class="language-plaintext highlighter-rouge">startTemporalFlow(flow, input, options?)</code> — same, but returns the <code class="language-plaintext highlighter-rouge">WorkflowHandle</code> for fire-and-forget / signalling.</li>
  <li><code class="language-plaintext highlighter-rouge">runGenkitFlowActivity</code> — the underlying activity, re-exported so you can combine it with your own activities.</li>
  <li><code class="language-plaintext highlighter-rouge">registerTemporalFlow(name, flow)</code> — manually register a flow that was defined elsewhere (useful when wrapping flows you don’t own).</li>
</ul>

<p>A small API surface is the point. The plugin is intentionally a thin bridge between two well-designed systems; it does not try to reinvent either of them.</p>

<h2 id="when-to-reach-for-genkitx-temporal">When to reach for <code class="language-plaintext highlighter-rouge">genkitx-temporal</code></h2>

<p>Not every flow needs a durable runtime. A streaming chat response that takes 800ms and either succeeds or is retried by the user is fine running on a plain HTTP handler.</p>

<p>You will feel the benefits the moment your flows look like one of these:</p>

<ul>
  <li><strong>Multi-step agents</strong> that orchestrate several model calls and tool invocations, where partial progress is expensive to throw away.</li>
  <li><strong>Long-running pipelines</strong> (document ingestion, batch summarization, fine-tuning prep) where individual steps can take minutes and the process must survive deploys.</li>
  <li><strong>Critical business workflows</strong> (refunds, account changes, contract generation) where you cannot afford to lose state or accidentally execute a step twice.</li>
  <li><strong>Human-in-the-loop agents</strong> that need to pause for approval, an external webhook, or a manual review before proceeding.</li>
  <li><strong>Anything you currently glue together with a job queue, a retry library, and a state machine.</strong> Temporal subsumes all three, and <code class="language-plaintext highlighter-rouge">genkitx-temporal</code> plugs Genkit straight into it.</li>
</ul>

<p>If your team already runs Temporal for non-AI workloads, this plugin is a no-brainer: it lets your Gen AI features inherit all the operational maturity your platform team has already built around Temporal.</p>

<h2 id="how-it-pairs-with-the-rest-of-the-genkit-ecosystem">How it pairs with the rest of the Genkit ecosystem</h2>

<p>The thing I like the most about this plugin is that it composes cleanly with everything else Genkit gives you, not just one or two features:</p>

<ul>
  <li><strong>Genkit Developer UI.</strong> Because <code class="language-plaintext highlighter-rouge">defineTemporalFlow</code> returns a normal Genkit flow, you can still iterate on it locally with the <a href="/genkit/2026-05-14-dev-ui-shift-left-genkit-vercel-mastra/">Genkit Developer UI</a>: fast feedback loop during development, durable execution in production.</li>
  <li><strong>Genkit middleware.</strong> <a href="/genkit/2026-05-13-genkit-middleware/">Middleware</a> (in-call retries, fallbacks, skill injection, tool approval, prompt rewriting, etc.) runs <em>inside</em> the activity. You get two complementary layers of resilience: middleware for fine-grained in-call recovery, Temporal for whole-flow durability and replay.</li>
  <li><strong>Tools and function calling.</strong> Tools defined with <code class="language-plaintext highlighter-rouge">ai.defineTool</code> are just normal flow code from Temporal’s perspective. Their individual calls and outputs appear in the Genkit trace, while the Temporal event history records the enclosing activity’s input, result and retries.</li>
  <li><strong>RAG primitives.</strong> Retrievers, indexers, embedders and rerankers all live inside the activity. That means heavy ingestion jobs (chunking, embedding, upserting into a vector store) inherit Temporal’s retry policies and survive restarts mid-batch.</li>
  <li><strong>Evaluators and datasets.</strong> Genkit’s evaluators are flows like any other, so you can run eval jobs as Temporal Workflows, schedule them, fan them out across Workers, and inspect every run in the Temporal UI.</li>
  <li><strong>Prompts and Dotprompt.</strong> Versioned <code class="language-plaintext highlighter-rouge">.prompt</code> files, prompt registries and structured outputs all work unchanged. The flow body is plain Genkit.</li>
  <li><strong>Plugins and model providers.</strong> Any Genkit plugin (Google AI, Vertex AI, OpenAI, Anthropic, Ollama, local models, vector stores, etc.) plugs in as usual; the Temporal layer doesn’t care which provider is on the other side of the call.</li>
  <li><strong>Telemetry and tracing.</strong> Genkit’s OpenTelemetry traces continue to be emitted from inside the activity, so they show up in whatever observability backend you already use, alongside Temporal’s own event history.</li>
  <li><strong>Deployment surfaces.</strong> Flows can still be exposed as HTTP endpoints, Cloud Functions, Firebase callable functions or Express handlers; <code class="language-plaintext highlighter-rouge">executeTemporalFlow</code> is just one more entry point, and you can mix and match (e.g. quick chat requests over HTTP, long agent runs through Temporal).</li>
  <li><strong>Multi-language story.</strong> Genkit is available in JS/TS, Go, Python (preview), Dart/Flutter (preview) and through a community Java SDK. This particular plugin targets JS/TS, but the architectural pattern — define your AI logic in Genkit, run it as a Temporal Workflow — is reusable across runtimes thanks to Temporal’s polyglot SDKs.</li>
</ul>
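
<p>One practical consequence of the two resilience layers mentioned in the middleware item above: because middleware retries run inside the activity, each Temporal attempt starts the in-call retry budget afresh, so the two budgets multiply. The sketch below works through that arithmetic for a single model call. It assumes a middleware that retries a fixed number of times, and the function and parameter names are hypothetical.</p>

```typescript
// Worst-case number of provider requests for ONE model call inside a flow,
// assuming both retry layers exhaust their budgets. Names are hypothetical.
function worstCaseModelRequests(
  inCallAttempts: number,   // attempts made by in-call middleware
  activityAttempts: number, // attempts Temporal makes for the whole activity
): number {
  if (inCallAttempts < 1 || activityAttempts < 1) {
    throw new Error('attempt counts must be at least 1');
  }
  return inCallAttempts * activityAttempts;
}

// 3 in-call attempts inside each of 5 activity attempts.
const requests = worstCaseModelRequests(3, 5); // 15
```

<p>Keeping the in-call budget small and letting Temporal handle the longer backoff is one way to keep that product under control.</p>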

<h2 id="conclusion">Conclusion</h2>

<p>Genkit is a great way to <strong>author</strong> Gen AI logic. Temporal is a great way to <strong>run</strong> any long-running, failure-prone workload. <code class="language-plaintext highlighter-rouge">genkitx-temporal</code> is the missing adapter between the two: a small, focused plugin that turns every Genkit flow into a durable, retryable, observable Temporal Workflow without asking you to rewrite a single line of business logic.</p>

<p>If you are building anything more ambitious than a single-turn chat endpoint, give it a try:</p>

<ul>
  <li>Source: <a href="https://github.com/xavidop/genkitx-temporal">github.com/xavidop/genkitx-temporal</a></li>
  <li>Docs: <a href="https://xavidop.github.io/genkitx-temporal/">xavidop.github.io/genkitx-temporal</a></li>
  <li>Runnable example: <a href="https://github.com/xavidop/genkitx-temporal/tree/main/examples/test-app"><code class="language-plaintext highlighter-rouge">examples/test-app</code></a></li>
</ul>

<p>Your future on-call self will thank you the first time a model provider has a bad afternoon and your agents keep humming along regardless.</p>]]></content><author><name>Xavier Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><category term="genkit" /><category term="genkit" /><category term="temporal" /><category term="genkitx" /><category term="durable-execution" /><category term="workflows" /><summary type="html"><![CDATA[Genkit flows are great at orchestrating LLMs, tools and RAG, but they live and die with the process that runs them. The new genkitx-temporal plugin lets you run any Genkit flow as a Temporal Workflow, giving you retries, durable history, timeouts, cancellation and a UI to inspect every execution.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://xavidop.me/assets/img/blog/post-headers/genkit-temporal.png" /><media:content medium="image" url="https://xavidop.me/assets/img/blog/post-headers/genkit-temporal.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry xml:lang="en"><title type="html">Why a Local Debugging Tool is Non-Negotiable for Building AI Apps: Genkit Developer UI vs Vercel AI SDK DevTools vs Mastra Studio (English)</title><link href="https://xavidop.me/genkit/2026-05-14-debugging-tool-shift-left-genkit-vercel-mastra/" rel="alternate" type="text/html" title="Why a Local Debugging Tool is Non-Negotiable for Building AI Apps: Genkit Developer UI vs Vercel AI SDK DevTools vs Mastra Studio (English)" /><published>2026-05-14T00:00:00+00:00</published><updated>2026-05-16T09:41:39+00:00</updated><id>https://xavidop.me/genkit/debugging-tool-shift-left-genkit-vercel-mastra</id><content type="html" xml:base="https://xavidop.me/genkit/2026-05-14-debugging-tool-shift-left-genkit-vercel-mastra/"><![CDATA[<ol class="no_toc" id="markdown-toc">
  <li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li>
  <li><a href="#shift-left-applied-to-gen-ai" id="markdown-toc-shift-left-applied-to-gen-ai">Shift-left, applied to Gen AI</a>    <ol>
      <li><a href="#local-vs-hosted" id="markdown-toc-local-vs-hosted">Local vs hosted</a></li>
      <li><a href="#what-good-looks-like" id="markdown-toc-what-good-looks-like">What “good” looks like</a></li>
    </ol>
  </li>
  <li><a href="#genkit-developer-ui" id="markdown-toc-genkit-developer-ui">Genkit Developer UI</a></li>
  <li><a href="#vercel-ai-sdk-devtools" id="markdown-toc-vercel-ai-sdk-devtools">Vercel AI SDK DevTools</a></li>
  <li><a href="#mastra-studio" id="markdown-toc-mastra-studio">Mastra Studio</a></li>
  <li><a href="#side-by-side-comparison" id="markdown-toc-side-by-side-comparison">Side-by-side comparison</a></li>
  <li><a href="#when-i-would-pick-which" id="markdown-toc-when-i-would-pick-which">When I would pick which</a></li>
  <li><a href="#the-moral-of-the-story-pick-a-debugging-tool-any-debugging-tool" id="markdown-toc-the-moral-of-the-story-pick-a-debugging-tool-any-debugging-tool">The moral of the story: pick a debugging tool, any debugging tool</a></li>
  <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ol>

<h2 id="introduction">Introduction</h2>

<p>If you have ever shipped a backend service without a debugger, a hot-reload dev server or decent logs, you remember how painful it was. Now imagine doing it in a world where your “function” is a non-deterministic black box that can rewrite its own output every time you call it. Welcome to building Gen AI applications.</p>

<p>The single biggest productivity multiplier I have found in the last two years of shipping AI apps is <strong>a good local debugging tool</strong>. Not a hosted dashboard. Not a cloud trace viewer with a 30-second propagation delay. A local tool that runs next to your code, picks up your edits, lets you replay a request with a tweaked prompt, and shows the full trace, tool call by tool call. We will get into the local-vs-hosted trade-off later in the article.</p>

<p>This article is about why that matters and how the three leading JS/TS Gen AI frameworks approach it:</p>

<ul>
  <li><a href="https://genkit.dev/docs/js/devtools/">Genkit Developer UI</a></li>
  <li><a href="https://ai-sdk.dev/docs/ai-sdk-core/devtools">Vercel AI SDK DevTools</a></li>
  <li><a href="https://mastra.ai/docs/studio/overview">Mastra Studio</a></li>
</ul>

<p>They are three very different products solving overlapping problems, and each one makes different trade-offs that are worth understanding before you commit to one.</p>

<h2 id="shift-left-applied-to-gen-ai">Shift-left, applied to Gen AI</h2>

<p>“Shift-left” comes from the world of testing and security: the earlier in the development lifecycle you catch a problem, the cheaper it is to fix. Bugs found in production cost orders of magnitude more than bugs found while you’re typing.</p>

<p>Gen AI apps make this principle existential, not just economic, because the failure modes are weirder:</p>

<ul>
  <li>A prompt regression doesn’t show up as a stack trace. It shows up as the wrong tone, a hallucinated fact or a tool call that misfires once every twenty runs.</li>
  <li>A new model version can quietly change behavior across thousands of code paths with no warning.</li>
  <li>A tool that returns slightly different JSON can cause silent downstream breakage.</li>
</ul>
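<p>To put a number on the “once every twenty runs” case, here is a small back-of-the-envelope sketch. It assumes each run fails independently with the same probability, which is a simplification, and the helper names are made up for illustration. Under that assumption a single run has only a 5% chance of showing the bug, twenty runs get you to roughly 64%, and you need about 59 runs to be 95% sure of seeing it at least once.</p>

```typescript
// Hypothetical helpers, not part of any framework.
// Assumes every run fails independently with the same probability p:
// P(at least one failure in n runs) = 1 - (1 - p)^n

function chanceOfSeeingFailure(failureRate: number, runs: number): number {
  return 1 - Math.pow(1 - failureRate, runs);
}

function runsNeeded(failureRate: number, confidence: number): number {
  // Solve 1 - (1 - p)^n = confidence for n and round up.
  return Math.ceil(Math.log(1 - confidence) / Math.log(1 - failureRate));
}

const onceInTwenty = 1 / 20;

console.log(chanceOfSeeingFailure(onceInTwenty, 1).toFixed(2)); // one manual run
console.log(chanceOfSeeingFailure(onceInTwenty, 20).toFixed(2)); // twenty runs
console.log(runsNeeded(onceInTwenty, 0.95)); // runs for 95% confidence
```

<p>The point is not the exact figure: a failure this rare is practically invisible unless re-running is cheap.</p>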

<p>The only way to keep your sanity is to <strong>collapse the feedback loop to seconds</strong>. You want to be able to:</p>

<ol>
  <li>Change a prompt or a piece of orchestration code.</li>
  <li>Re-run that exact unit, with that exact input.</li>
  <li>See the model’s input, its output, every tool call, every retry, every token used.</li>
  <li>Compare it to the previous run.</li>
  <li>Decide if it’s better, worse or different.</li>
</ol>
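<p>The loop above can be sketched in a few lines of plain TypeScript. This is only an illustration of the idea, not how any of the three tools works internally: <code class="language-plaintext highlighter-rouge">fakeModel</code>, <code class="language-plaintext highlighter-rouge">runUnit</code> and <code class="language-plaintext highlighter-rouge">compareRuns</code> are hypothetical names, the model is a deterministic stand-in so the sketch runs offline, and the token count is a crude word count.</p>

```typescript
// Hypothetical stand-ins: none of these names come from Genkit, the AI SDK or Mastra.
type Run = { prompt: string; input: string; output: string; tokens: number };
type Model = (prompt: string, input: string) => string;

// Deterministic fake model so the sketch runs offline.
const fakeModel: Model = (prompt, input) =>
  prompt.includes("one word") ? input.split(" ")[0] : `Summary of: ${input}`;

// Steps 2 and 3: run one unit with a fixed input and record what happened.
function runUnit(model: Model, prompt: string, input: string): Run {
  const output = model(prompt, input);
  // Crude token estimate (whitespace-separated words), for illustration only.
  const tokens = `${prompt} ${input} ${output}`.split(/\s+/).length;
  return { prompt, input, output, tokens };
}

// Step 4: compare a run with the previous one.
function compareRuns(previous: Run, current: Run) {
  return {
    outputChanged: previous.output !== current.output,
    tokenDelta: current.tokens - previous.tokens,
  };
}

const input = "Temporal makes Genkit flows durable";
const before = runUnit(fakeModel, "Summarize the text.", input);
// Step 1: tweak the prompt, keep the input identical.
const after = runUnit(fakeModel, "Summarize the text in one word.", input);

// Step 5 is yours: look at the diff and decide.
console.log(compareRuns(before, after));
```

<p>A debugging tool gives you the same thing without writing the harness, and with the real model, real tool calls and real token usage in the trace.</p>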

<p>If any one of those steps requires deploying, redeploying, opening a cloud console or grep’ing logs, you have already lost. Cycle time is the metric. A debugging tool is what makes that cycle time low enough to iterate productively.</p>

<h3 id="local-vs-hosted">Local vs hosted</h3>

<p>It is worth being explicit about this trade-off, because all three tools in this article are primarily local:</p>

<ul>
  <li>A <strong>local</strong> debugging tool runs on your machine, next to your code, with zero network latency between your edits and what you see. Iteration is fast, traces are immediate, and there is no risk of leaking prompts/responses to a third party. The downside is that it’s just for you: the traces stay on your machine and are not shared with the rest of the team.</li>
  <li>A <strong>hosted</strong> observability platform (Langfuse, LangSmith, cloud-native APM tools, etc.) is shared by your team, persists data long-term and is essential for production monitoring. The trade-off is propagation delay, configuration overhead and, in some cases, data residency concerns.</li>
</ul>

<p>These are complementary, not competing. The argument here is that the <em>local</em> part of the loop is the one that’s still often missing, and that’s where shift-left lives.</p>

<h3 id="what-good-looks-like">What “good” looks like</h3>

<p>After working with all three of the tools below, I would summarize the qualities of a good Gen AI debugging tool as:</p>

<ul>
  <li><strong>Zero-config or near-zero-config</strong> — it should pick up your code, not the other way around.</li>
  <li><strong>Live reload</strong> — edit your code, save, hit “run” again. No restart.</li>
  <li><strong>Interactive runners</strong> — invoke any prompt, tool, model, or higher-level primitive (agent, workflow, etc.) directly with arbitrary input.</li>
  <li><strong>Full traces</strong> — every step of the generation loop, with input/output/usage at each node.</li>
  <li><strong>Replay and modify</strong> — re-run any past trace with a tweak.</li>
  <li><strong>Minimal code changes</strong> — the framework should be observable by default, ideally without you having to remember to wrap each model call.</li>
</ul>

<p>Let’s see how each tool stacks up.</p>

<h2 id="genkit-developer-ui">Genkit Developer UI</h2>

<p>The Genkit team treats the <a href="https://genkit.dev/docs/js/devtools/">Developer UI</a> as a first-class part of the framework. It ships with the <code class="language-plaintext highlighter-rouge">genkit-cli</code> package and <strong>requires no code changes</strong> in your application to attach to your running process.</p>

<p>You install the CLI once globally:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm <span class="nb">install</span> <span class="nt">-g</span> genkit-cli
</code></pre></div></div>

<p>And then start your app under it:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>genkit start <span class="nt">--</span> npx tsx <span class="nt">--watch</span> src/index.ts
</code></pre></div></div>

<p>That’s it. Genkit attaches to your running Node process, discovers every flow, prompt, model, tool, retriever, indexer, embedder and evaluator you have defined, and exposes them all in a local web app at <code class="language-plaintext highlighter-rouge">http://localhost:4000</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Telemetry API running on http://localhost:4033
Genkit Developer UI: http://localhost:4000
</code></pre></div></div>

<p>What you actually get inside:</p>

<ul>
  <li><strong>Interactive runners for every primitive</strong>. Flows, prompts, tools, models, retrievers, indexers, embedders and evaluators all get an interactive panel where you fill out the input (validated against the Zod schema) and hit run.</li>
  <li><strong>Full traces</strong> with the entire generation graph: every model call, every tool invocation, every middleware, with input/output/usage tokens.</li>
  <li><strong>Live reload</strong>. Combined with <code class="language-plaintext highlighter-rouge">--watch</code>, edits to your code show up without restarting the UI.</li>
  <li><strong>Prompt iteration</strong>. Tweak prompts and re-run inline.</li>
  <li><strong>Evals</strong>. Run evaluators against datasets directly from the UI.</li>
  <li><strong>Auto-open</strong>. Add <code class="language-plaintext highlighter-rouge">-o</code> to open the UI in your browser automatically.</li>
</ul>

<p>What stands out here is that <strong>observability comes for free</strong>. You do not wrap your model and you do not add a middleware just to see traces. Anything defined with the Genkit primitives is automatically introspected and traced, which makes the “let me see what’s happening” cost essentially zero.</p>

<p>If you also use the <a href="/genkit/2026-05-13-genkit-middleware-v2/">middleware system</a>, every middleware shows up in the trace too, which makes debugging things like retries and fallbacks a lot easier.</p>

<h2 id="vercel-ai-sdk-devtools">Vercel AI SDK DevTools</h2>

<p>Vercel introduced the <a href="https://ai-sdk.dev/docs/ai-sdk-core/devtools">AI SDK DevTools</a> more recently, and it is a meaningful step up from having nothing. The design philosophy is different from Genkit’s: it is a <strong>middleware-based capture tool</strong> that focuses tightly on the language-model layer, rather than an integrated development environment for the whole AI app.</p>

<p>You opt in by wrapping your model:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">wrapLanguageModel</span><span class="p">,</span> <span class="nx">gateway</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">ai</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">devToolsMiddleware</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@ai-sdk/devtools</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">model</span> <span class="o">=</span> <span class="nf">wrapLanguageModel</span><span class="p">({</span>
  <span class="na">model</span><span class="p">:</span> <span class="nf">gateway</span><span class="p">(</span><span class="dl">'</span><span class="s1">anthropic/claude-sonnet-4.5</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">middleware</span><span class="p">:</span> <span class="nf">devToolsMiddleware</span><span class="p">(),</span>
<span class="p">});</span>
</code></pre></div></div>

<p>And then run the viewer in another terminal:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npx @ai-sdk/devtools
<span class="c"># open http://localhost:4983</span>
</code></pre></div></div>

<p>What it captures:</p>

<ul>
  <li>Input parameters and prompts.</li>
  <li>Output content and tool calls.</li>
  <li>Token usage and timing.</li>
  <li>Raw provider request/response payloads.</li>
  <li>Multi-step interactions, grouped into “runs” with multiple “steps”.</li>
</ul>

<p>Things to keep in mind when comparing it with the others:</p>

<ul>
  <li><strong>It is opt-in per model.</strong> Every model you want to observe has to be wrapped explicitly. In a real codebase with many models, this usually means a shared factory or repeated wrapping. If you forget on one path, you have a blind spot.</li>
  <li><strong>It focuses on the language-model layer.</strong> Anything happening above that layer (your own application logic, custom orchestration, business steps) is not captured unless you add it yourself.</li>
  <li><strong>It is a viewer, not an interactive playground.</strong> You cannot “re-run this with a different prompt” from the UI; you go back to your code, edit and re-call.</li>
  <li><strong>It does not enumerate AI primitives</strong> because, in the AI SDK, those are just functions in your codebase. There is nothing to enumerate at runtime.</li>
  <li><strong>Storage is a JSON file</strong> (<code class="language-plaintext highlighter-rouge">.devtools/generations.json</code>). The team is clear that this is local-only, never for production, and the middleware automatically appends <code class="language-plaintext highlighter-rouge">.devtools</code> to your <code class="language-plaintext highlighter-rouge">.gitignore</code>.</li>
</ul>
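<p>One way to avoid the blind spot from the first point is the shared factory mentioned there: create every model in a single place and wrap it there. The sketch below shows the pattern only. The <code class="language-plaintext highlighter-rouge">Model</code> and <code class="language-plaintext highlighter-rouge">Middleware</code> types, <code class="language-plaintext highlighter-rouge">captureMiddleware</code>, <code class="language-plaintext highlighter-rouge">createRawModel</code> and <code class="language-plaintext highlighter-rouge">getModel</code> are local stand-ins so that it runs on its own; in a real project the wrapping step would be the <code class="language-plaintext highlighter-rouge">wrapLanguageModel</code> call shown earlier.</p>

```typescript
// Local stand-ins so the sketch runs on its own; they are NOT the AI SDK's types.
type Model = { id: string; generate: (prompt: string) => string };
type Middleware = (model: Model) => Model;

const captured: string[] = [];

// Stand-in for a capture middleware: records every call that passes through.
const captureMiddleware: Middleware = (model) => ({
  id: model.id,
  generate: (prompt) => {
    const output = model.generate(prompt);
    captured.push(`${model.id}: ${prompt} -> ${output}`);
    return output;
  },
});

// Hypothetical provider lookup; a real app would call its provider SDK here.
function createRawModel(id: string): Model {
  return { id, generate: (prompt) => `[${id}] echo: ${prompt}` };
}

// The single place where models are created, so none escapes the wrapper.
// Pass observe = false (for example in production) to skip the capture layer.
function getModel(id: string, observe = true): Model {
  const raw = createRawModel(id);
  return observe ? captureMiddleware(raw) : raw;
}

getModel("chat-model").generate("Hello");
getModel("summary-model").generate("Summarize this");
getModel("chat-model", false).generate("Not captured");

console.log(captured);
```

<p>As long as the rest of the codebase only obtains models through the factory, forgetting to wrap one stops being possible by construction.</p>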

<p>The upside is that it is <strong>probably the quickest of the three to drop into an existing project</strong>: install the package, wrap a model, run the viewer. If you live inside Next.js and the AI SDK and you mainly want a “what just happened?” panel for your model calls, it does the job well. If you are debugging a multi-step agent with custom orchestration, you will probably want to combine it with something that sees more of the picture.</p>

<h2 id="mastra-studio">Mastra Studio</h2>

<p><a href="https://mastra.ai/docs/studio/overview">Mastra Studio</a> is the most ambitious of the three in terms of surface area. It is bundled with <code class="language-plaintext highlighter-rouge">mastra dev</code> and runs at <code class="language-plaintext highlighter-rouge">http://localhost:4111</code>, doubling as a local development tool <strong>and</strong> a deployable team console (you can ship Studio to production for non-developers).</p>

<p>Out of the box, it gives you:</p>

<ul>
  <li><strong>Agents tab</strong> — chat with your agents, hot-swap models, tweak temperature/top-p, view traces, attach scorers.</li>
  <li><strong>Workflows tab</strong> — visualize workflows as graphs, run them step by step with custom inputs, watch the active step in real time, inspect tool calls and JSON outputs.</li>
  <li><strong>Processors tab</strong> — see input/output processors and guardrails wired to each agent.</li>
  <li><strong>Tools tab</strong> — run tools in isolation to debug them.</li>
  <li><strong>Workspaces tab</strong> — file browser into the agent’s workspace filesystem, with a Skills tab listing discovered skills.</li>
  <li><strong>MCP servers tab</strong> — list attached MCP servers and their tools.</li>
  <li><strong>Request context</strong> — set runtime variables that flow into agent instructions/tools through DI, with schema-driven forms.</li>
  <li><strong>Evaluation suite</strong> — Scorers, Datasets and Experiments tabs to run datasets through agents/workflows, attach scorers and compare experiments side-by-side.</li>
  <li><strong>Observability</strong> with traces and logs.</li>
  <li><strong>Editor integration</strong> for non-technical teammates to iterate on agents and version every change without redeploying.</li>
</ul>

<p>This is more than a debugging tool — it is closer to a <strong>development environment plus a team-facing console</strong>. The natural trade-off is that Studio is tightly coupled to Mastra’s primitives (Agents, Workflows, Processors, Workspaces), so it shines the most when your application is structured the Mastra way.</p>

<p>If you build non-trivial agentic systems and you want evaluation and dataset management baked into the same tool, Studio is genuinely very good. For a quick “see my prompt and the model’s response” loop, it can feel like more than you need.</p>

<h2 id="side-by-side-comparison">Side-by-side comparison</h2>

<table>
  <thead>
    <tr>
      <th>Capability</th>
      <th>Genkit Developer UI</th>
      <th>Vercel AI SDK DevTools</th>
      <th>Mastra Studio</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Setup</td>
      <td><code class="language-plaintext highlighter-rouge">genkit start --</code> your code</td>
      <td><code class="language-plaintext highlighter-rouge">wrapLanguageModel(...)</code> + <code class="language-plaintext highlighter-rouge">npx @ai-sdk/devtools</code></td>
      <td><code class="language-plaintext highlighter-rouge">mastra dev</code></td>
    </tr>
    <tr>
      <td>Code changes required</td>
      <td>None (auto-discovers)</td>
      <td>Yes (wrap each model)</td>
      <td>None (auto-discovers Mastra primitives)</td>
    </tr>
    <tr>
      <td>Live reload</td>
      <td>Yes (<code class="language-plaintext highlighter-rouge">--watch</code>)</td>
      <td>N/A (viewer only)</td>
      <td>Yes</td>
    </tr>
    <tr>
      <td>Interactive runners</td>
      <td>Flow, prompt, model, tool, retriever, indexer, embedder, evaluator</td>
      <td>None (viewer only)</td>
      <td>Agent chat, workflow runner, tool runner</td>
    </tr>
    <tr>
      <td>Traces</td>
      <td>Full generation graph</td>
      <td>Per-model-call run/step</td>
      <td>Workflow + agent traces</td>
    </tr>
    <tr>
      <td>Re-run / replay</td>
      <td>Yes, from any primitive</td>
      <td>No (must edit code and re-call)</td>
      <td>Yes, including workflow step-through</td>
    </tr>
    <tr>
      <td>Dataset / eval UI</td>
      <td>Evaluators panel</td>
      <td>Not built-in</td>
      <td>Datasets + Experiments + Scorers</td>
    </tr>
    <tr>
      <td>Workspace / files browser</td>
      <td>No</td>
      <td>No</td>
      <td>Yes</td>
    </tr>
    <tr>
      <td>MCP server browser</td>
      <td>Not first-class</td>
      <td>No</td>
      <td>Yes</td>
    </tr>
    <tr>
      <td>Deployable to team</td>
      <td>No (local-only)</td>
      <td>No (local-only)</td>
      <td>Yes (Studio on Mastra platform)</td>
    </tr>
    <tr>
      <td>Languages</td>
      <td>JS/TS (Go, Python, Dart all have local tooling too)</td>
      <td>JS/TS only</td>
      <td>JS/TS only</td>
    </tr>
    <tr>
      <td>Style</td>
      <td>Auto-discovery of primitives</td>
      <td>Per-model middleware capture</td>
      <td>All-in-one studio for the agent lifecycle</td>
    </tr>
  </tbody>
</table>

<p>A quick note on the Vercel column: simplicity of integration is a real feature for many teams, and it is fair to call this the lightest-touch option. The trade-off is that observability is opt-in per model, so you have to be deliberate to avoid blind spots.</p>

<h2 id="when-i-would-pick-which">When I would pick which</h2>

<p>After using all three, this is roughly how I’d think about it:</p>

<ul>
  <li><strong>Building a serious agent or multi-step pipeline with tools, retries, fallbacks and evals</strong> → <strong>Genkit Developer UI</strong> is hard to beat on pure iteration speed. Zero-config tracing of the full pipeline, every primitive runnable, and middleware shows up in the trace for free.</li>
  <li><strong>Working primarily inside Next.js with the Vercel AI SDK and you mainly want better visibility into model calls</strong> → <strong>Vercel AI SDK DevTools</strong> is a quick win. Lightweight setup and it does what it says on the tin; pair it with something else if you need higher-level orchestration views.</li>
  <li><strong>Building agentic systems with workflows, datasets and scorers, and you want a UI your PM or domain expert can also use</strong> → <strong>Mastra Studio</strong>. The evaluation suite and the deployability are real differentiators.</li>
  <li><strong>Multi-language stack</strong> (JS/TS plus Go, Python or Dart) → <strong>Genkit</strong> is the only one of the three with first-party multi-language support; the local tooling story is consistent across runtimes.</li>
</ul>

<h2 id="the-moral-of-the-story-pick-a-debugging-tool-any-debugging-tool">The moral of the story: pick a debugging tool, any debugging tool</h2>

<p>In a normal backend project, you can sometimes get away with a weak dev loop because the code is deterministic. It makes for a poor developer experience, but it works.</p>

<p>In an AI project, the code is not deterministic. Every iteration is also a small experiment. If your loop is slow or your visibility is poor, you don’t just iterate slower, <strong>you iterate worse</strong>, because you can’t tell whether your changes are improvements or regressions. Shift-left in this world is not a virtue, it is a survival strategy.</p>

<p>A good local debugging tool:</p>

<ul>
  <li>Turns “I think the prompt got better” into “I can see the trace, the tokens, the eval score, side by side”.</li>
  <li>Catches regressions before they reach a user.</li>
  <li>Lets you onboard new engineers in hours instead of days, because they can see the system.</li>
  <li>Lets non-engineers (PMs, designers, domain experts) participate in iteration when the UI is friendly enough — Mastra Studio is explicit about this.</li>
</ul>

<p>If you are starting a Gen AI project today and you are picking a framework, the local tooling story should weigh as much as the model abstraction or the tool API. It is the difference between writing AI code and <strong>engineering</strong> AI features.</p>

<h2 id="conclusion">Conclusion</h2>

<p>All three tools are good in their lane:</p>

<ul>
  <li><strong>Genkit Developer UI</strong> is the most introspective of the three and requires zero code changes to be useful. It’s the one I reach for when I want to move fast and see every step of the pipeline.</li>
  <li><strong>Vercel AI SDK DevTools</strong> is the lightest-touch option to drop into an existing AI SDK app, with a clear focus on the model-call layer. Think of it as a focused viewer rather than a full development environment.</li>
  <li><strong>Mastra Studio</strong> is the most ambitious, closer to a full IDE-meets-console for agentic systems, with first-class evaluation, datasets and a deployable team UI. It’s at its best when your app is structured around Mastra’s primitives.</li>
</ul>

<p>Pick the one that matches the shape of your project, but please pick something. Building Gen AI without a local debugging tool in 2026 is engineering with the lights off.</p>

<p>Further reading:</p>

<ul>
  <li><a href="https://genkit.dev/docs/js/devtools/">Genkit Developer Tools</a></li>
  <li><a href="https://ai-sdk.dev/docs/ai-sdk-core/devtools">Vercel AI SDK DevTools</a></li>
  <li><a href="https://mastra.ai/docs/studio/overview">Mastra Studio overview</a></li>
  <li><a href="/genkit/2026-04-16-top-jsts-genai-frameworks-2026/">Top JS/TS Gen AI Frameworks for 2026</a></li>
  <li><a href="/genkit/2026-05-13-genkit-middleware-v2/">Genkit Middleware deep dive</a></li>
</ul>]]></content><author><name>Xavier Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><category term="genkit" /><category term="genkit" /><category term="vercel-ai-sdk" /><category term="mastra" /><category term="dev-ui" /><category term="devtools" /><summary type="html"><![CDATA[Building AI applications without a local debugging tool is like writing backend code without a debugger. A look at the "shift-left" philosophy applied to Gen AI development, and a hands-on comparison of Genkit Developer UI, Vercel AI SDK DevTools and Mastra Studio.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://xavidop.me/assets/img/blog/post-headers/genkit-dev-ui-comparison.png" /><media:content medium="image" url="https://xavidop.me/assets/img/blog/post-headers/genkit-dev-ui-comparison.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry xml:lang="en"><title type="html">Genkit Middleware: Intercept, Extend and Harden your Gen AI Pipelines (English)</title><link href="https://xavidop.me/genkit/2026-05-13-genkit-middleware/" rel="alternate" type="text/html" title="Genkit Middleware: Intercept, Extend and Harden your Gen AI Pipelines (English)" /><published>2026-05-13T00:00:00+00:00</published><updated>2026-05-14T10:59:44+00:00</updated><id>https://xavidop.me/genkit/genkit-middleware</id><content type="html" xml:base="https://xavidop.me/genkit/2026-05-13-genkit-middleware/"><![CDATA[<ol class="no_toc" id="markdown-toc">
  <li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li>
  <li><a href="#what-is-middleware-in-genkit" id="markdown-toc-what-is-middleware-in-genkit">What is middleware in Genkit</a></li>
  <li><a href="#installation" id="markdown-toc-installation">Installation</a></li>
  <li><a href="#the-built-in-middleware-catalogue" id="markdown-toc-the-built-in-middleware-catalogue">The built-in middleware catalogue</a>    <ol>
      <li><a href="#filesystem--give-the-model-a-sandboxed-file-system" id="markdown-toc-filesystem--give-the-model-a-sandboxed-file-system"><code class="language-plaintext highlighter-rouge">filesystem</code> — give the model a sandboxed file system</a></li>
      <li><a href="#skills--auto-load-markdown-skills-as-system-context" id="markdown-toc-skills--auto-load-markdown-skills-as-system-context"><code class="language-plaintext highlighter-rouge">skills</code> — auto-load Markdown skills as system context</a></li>
      <li><a href="#toolapproval--human-in-the-loop-for-tool-calls" id="markdown-toc-toolapproval--human-in-the-loop-for-tool-calls"><code class="language-plaintext highlighter-rouge">toolApproval</code> — human-in-the-loop for tool calls</a></li>
      <li><a href="#retry--exponential-backoff-with-jitter-for-transient-errors" id="markdown-toc-retry--exponential-backoff-with-jitter-for-transient-errors"><code class="language-plaintext highlighter-rouge">retry</code> — exponential backoff with jitter for transient errors</a></li>
      <li><a href="#fallback--gracefully-degrade-to-a-different-model" id="markdown-toc-fallback--gracefully-degrade-to-a-different-model"><code class="language-plaintext highlighter-rouge">fallback</code> — gracefully degrade to a different model</a></li>
    </ol>
  </li>
  <li><a href="#building-your-own-middleware-with-generatemiddleware" id="markdown-toc-building-your-own-middleware-with-generatemiddleware">Building your own middleware with <code class="language-plaintext highlighter-rouge">generateMiddleware</code></a></li>
  <li><a href="#composition-stacking-middlewares" id="markdown-toc-composition-stacking-middlewares">Composition: stacking middlewares</a></li>
  <li><a href="#the-importance-of-middleware-for-production-agents" id="markdown-toc-the-importance-of-middleware-for-production-agents">The importance of middleware for production agents</a></li>
  <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ol>

<h2 id="introduction">Introduction</h2>

<p>If you have been building anything non-trivial with Genkit, you have probably bumped into the same set of cross-cutting concerns over and over again: retrying transient model errors, falling back to a cheaper model when quota explodes, gating tool execution behind human approval, injecting filesystem access for coding agents, logging every request and response for observability…</p>

<p>Until now, you either wrapped <code class="language-plaintext highlighter-rouge">ai.generate()</code> calls by hand or wrote ad-hoc helpers that ended up duplicated across flows. The new <strong>Genkit Middleware</strong> changes that. It introduces a first-class, composable middleware layer for the <code class="language-plaintext highlighter-rouge">generate()</code> pipeline, with hooks for the <strong>model</strong>, the <strong>tool execution</strong> and the <strong>high-level generation loop</strong>, plus a small but very useful set of official middlewares published in the brand-new <code class="language-plaintext highlighter-rouge">@genkit-ai/middleware</code> package.</p>

<p>This article is a practical tour of what the new middleware system gives you, the built-in middlewares you can drop in today, and how to write your own with <code class="language-plaintext highlighter-rouge">generateMiddleware</code>.</p>

<blockquote>
  <p>The official documentation lives at <a href="https://genkit.dev/docs/js/middleware/">Genkit Middleware</a>. All examples below assume the JavaScript/TypeScript SDK.</p>
</blockquote>

<blockquote>
  <p>A quick reminder: although this article focuses on the JS/TS middleware API, <strong>Genkit is a multi-language framework</strong>. The official SDKs cover <strong>JavaScript/TypeScript</strong> (primary, stable), <strong>Go</strong>, <strong>Python</strong> (preview) and <strong>Dart/Flutter</strong> (preview), and there is a community-maintained <strong>Java</strong> SDK used in production. The middleware concepts described here are JS/TS-specific today, but the underlying <code class="language-plaintext highlighter-rouge">generate()</code> pipeline exists across all SDKs and the same patterns are landing on the other runtimes.</p>
</blockquote>

<h2 id="what-is-middleware-in-genkit">What is middleware in Genkit</h2>

<p>Conceptually, Genkit middleware behaves like the middleware you already know from Express or Koa, only applied to the LLM lifecycle instead of HTTP requests:</p>

<ol>
  <li>A <code class="language-plaintext highlighter-rouge">generate()</code> call is intercepted before it reaches the model.</li>
  <li>Each middleware can inspect or modify the request, decide whether to call <code class="language-plaintext highlighter-rouge">next()</code>, and inspect or modify the response on the way back.</li>
  <li>Multiple middlewares can be chained. They run in the order they are declared and unwind in reverse order, exactly like an onion.</li>
</ol>
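<p>If the onion picture is new to you, this framework-free sketch shows the ordering. It is not Genkit’s implementation or API: the types, <code class="language-plaintext highlighter-rouge">compose</code>, <code class="language-plaintext highlighter-rouge">tracing</code> and <code class="language-plaintext highlighter-rouge">fakeModel</code> are all hypothetical, and everything is synchronous to keep it short, whereas real model calls are asynchronous.</p>

```typescript
// Hypothetical types for illustration; these are not Genkit's.
type GenRequest = { prompt: string };
type GenResponse = { text: string };
type Next = (req: GenRequest) => GenResponse;
type Middleware = (req: GenRequest, next: Next) => GenResponse;

const log: string[] = [];

// A middleware that records when it is entered and when it unwinds.
const tracing = (name: string): Middleware => (req, next) => {
  log.push(`${name}: before`);
  const res = next(req); // hand over to the next layer (or the model)
  log.push(`${name}: after`);
  return res;
};

// Fold from the right so the first declared middleware ends up outermost.
function compose(middlewares: Middleware[], model: Next): Next {
  return middlewares.reduceRight(
    (next: Next, mw: Middleware): Next => (req) => mw(req, next),
    model,
  );
}

// Stand-in for the model call at the centre of the onion.
const fakeModel: Next = (req) => {
  log.push("model");
  return { text: `echo: ${req.prompt}` };
};

const run = compose([tracing("first"), tracing("second")], fakeModel);
const res = run({ prompt: "Hello" });

console.log(log.join(" | "));
// first: before | second: before | model | second: after | first: after
console.log(res.text);
```

<p>A middleware that returns without calling <code class="language-plaintext highlighter-rouge">next</code> short-circuits everything inside it, which is the basic mechanism behind things like approvals or cached responses.</p>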

<p>What makes Genkit’s design interesting is that, instead of a single chokepoint, it gives you <strong>three orthogonal interception phases</strong>:</p>

<ul>
  <li><strong><code class="language-plaintext highlighter-rouge">model</code></strong> — wraps the call to the underlying model. Perfect for retries, fallbacks, request/response logging or response transformations.</li>
  <li><strong><code class="language-plaintext highlighter-rouge">tool</code></strong> — wraps tool execution. Ideal for approvals, sandboxing, audit logs or input/output validation.</li>
  <li><strong><code class="language-plaintext highlighter-rouge">generate</code></strong> — wraps the whole high-level generation loop (prompting, tool calling, output parsing). Best for things like injecting tools or system instructions before the loop starts.</li>
</ul>

<p>You opt in per call via a <code class="language-plaintext highlighter-rouge">use:</code> array, which keeps things explicit and avoids global side effects:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">response</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
  <span class="na">model</span><span class="p">:</span> <span class="nx">googleAI</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">gemini-flash-latest</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">prompt</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Hello</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">use</span><span class="p">:</span> <span class="p">[</span><span class="nf">retry</span><span class="p">({</span> <span class="na">maxRetries</span><span class="p">:</span> <span class="mi">3</span> <span class="p">}),</span> <span class="nf">loggerMiddleware</span><span class="p">({</span> <span class="na">verbose</span><span class="p">:</span> <span class="kc">true</span> <span class="p">})],</span>
<span class="p">});</span>
</code></pre></div></div>

<h2 id="installation">Installation</h2>

<p>The official middlewares ship in their own package, decoupled from the Genkit core:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm <span class="nb">install</span> @genkit-ai/middleware
<span class="c"># or</span>
pnpm add @genkit-ai/middleware
</code></pre></div></div>

<p>You still need <code class="language-plaintext highlighter-rouge">genkit</code> itself and a model provider plugin (for example <code class="language-plaintext highlighter-rouge">@genkit-ai/google-genai</code>).</p>

<h2 id="the-built-in-middleware-catalogue">The built-in middleware catalogue</h2>

<p>Let’s go through the five middlewares the Genkit team ships out of the box.</p>

<h3 id="filesystem--give-the-model-a-sandboxed-file-system"><code class="language-plaintext highlighter-rouge">filesystem</code> — give the model a sandboxed file system</h3>

<p><code class="language-plaintext highlighter-rouge">filesystem</code> injects a standard set of file manipulation tools (<code class="language-plaintext highlighter-rouge">list_files</code>, <code class="language-plaintext highlighter-rouge">read_file</code>, <code class="language-plaintext highlighter-rouge">write_file</code>, <code class="language-plaintext highlighter-rouge">search_and_replace</code>) into the generation loop, restricted to a root directory of your choice.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">genkit</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkit</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">googleAI</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@genkit-ai/google-genai</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">filesystem</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@genkit-ai/middleware</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">ai</span> <span class="o">=</span> <span class="nf">genkit</span><span class="p">({</span> <span class="na">plugins</span><span class="p">:</span> <span class="p">[</span><span class="nf">googleAI</span><span class="p">()]</span> <span class="p">});</span>

<span class="kd">const</span> <span class="nx">response</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
  <span class="na">model</span><span class="p">:</span> <span class="nx">googleAI</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">gemini-flash-latest</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">prompt</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Create a hello world Node app in the workspace</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">use</span><span class="p">:</span> <span class="p">[</span>
    <span class="nf">filesystem</span><span class="p">({</span>
      <span class="na">rootDirectory</span><span class="p">:</span> <span class="dl">'</span><span class="s1">./workspace</span><span class="dl">'</span><span class="p">,</span>
      <span class="na">allowWriteAccess</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
    <span class="p">}),</span>
  <span class="p">],</span>
<span class="p">});</span>
</code></pre></div></div>

<p>Useful options:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">rootDirectory</code> (required) — sandbox root, all paths are confined to it.</li>
  <li><code class="language-plaintext highlighter-rouge">allowWriteAccess</code> — defaults to <code class="language-plaintext highlighter-rouge">false</code>. Read-only by default is a sane choice for safety.</li>
  <li><code class="language-plaintext highlighter-rouge">toolNamePrefix</code> — namespace the injected tools to avoid collisions with your own.</li>
</ul>

<p>This is essentially the building block for a “coding agent” pattern, without having to write tool definitions or path-validation logic yourself.</p>
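
<p>To make “all paths are confined to it” concrete, here is a minimal sketch of the kind of check a sandbox root implies. It is my own illustration, not the code inside <code class="language-plaintext highlighter-rouge">@genkit-ai/middleware</code>: the helper name <code class="language-plaintext highlighter-rouge">resolveInsideRoot</code> is hypothetical, and it assumes POSIX-style paths.</p>

```typescript
import * as path from 'node:path';

// Hypothetical helper (not part of @genkit-ai/middleware): resolves a
// model-supplied path against a sandbox root and rejects anything outside it.
function resolveInsideRoot(rootDirectory: string, requested: string): string {
  const root = path.posix.resolve(rootDirectory);
  const target = path.posix.resolve(root, requested);
  const relative = path.posix.relative(root, target);
  // If the path from root to target has to climb up, the target is outside the root.
  if (relative === '..' || relative.startsWith('../')) {
    throw new Error(`Path escapes sandbox root: ${requested}`);
  }
  return target;
}

console.log(resolveInsideRoot('/workspace', 'src/index.js'));
```

<p>Comparing the relative path, rather than using a plain string-prefix test, also rejects sibling directories such as <code class="language-plaintext highlighter-rouge">/workspace-evil</code>. The sketch does not resolve symlinks, which a real sandbox would also have to consider.</p>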

<h3 id="skills--auto-load-markdown-skills-as-system-context"><code class="language-plaintext highlighter-rouge">skills</code> — auto-load Markdown skills as system context</h3>

<p><code class="language-plaintext highlighter-rouge">skills</code> scans a directory for <code class="language-plaintext highlighter-rouge">SKILL.md</code> files (plus their YAML frontmatter), injects relevant ones into the system prompt, and exposes a <code class="language-plaintext highlighter-rouge">use_skill</code> tool the model can call when it needs more specific guidance.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">skills</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@genkit-ai/middleware</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">response</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
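  <span class="na">model</span><span class="p">:</span> <span class="nx">googleAI</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">gemini-flash-latest</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">prompt</span><span class="p">:</span> <span class="dl">'</span><span class="s1">How do I run tests in this repo?</span><span class="dl">'</span><span class="p">,</span>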
  <span class="na">use</span><span class="p">:</span> <span class="p">[</span><span class="nf">skills</span><span class="p">({</span> <span class="na">skillPaths</span><span class="p">:</span> <span class="p">[</span><span class="dl">'</span><span class="s1">./skills</span><span class="dl">'</span><span class="p">]</span> <span class="p">})],</span>
<span class="p">});</span>
</code></pre></div></div>

<p>Think of it as a lightweight, file-based knowledge layer: every skill is a self-contained Markdown file with metadata, and the middleware decides when to surface them. It is a really clean alternative to ad-hoc system prompt soup.</p>

<h3 id="toolapproval--human-in-the-loop-for-tool-calls"><code class="language-plaintext highlighter-rouge">toolApproval</code> — human-in-the-loop for tool calls</h3>

<p><code class="language-plaintext highlighter-rouge">toolApproval</code> enforces an allowlist of tools the model may execute autonomously. A call to any tool outside the list raises a <code class="language-plaintext highlighter-rouge">ToolInterruptError</code>, so you can pause execution, ask the user, and resume.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">genkit</span><span class="p">,</span> <span class="nx">restartTool</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkit</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">toolApproval</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@genkit-ai/middleware</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">response</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
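  <span class="na">model</span><span class="p">:</span> <span class="nx">googleAI</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">gemini-flash-latest</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">prompt</span><span class="p">:</span> <span class="dl">'</span><span class="s1">write a file</span><span class="dl">'</span><span class="p">,</span>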
  <span class="na">tools</span><span class="p">:</span> <span class="p">[</span><span class="nx">writeFileTool</span><span class="p">],</span>
  <span class="na">use</span><span class="p">:</span> <span class="p">[</span><span class="nf">toolApproval</span><span class="p">({</span> <span class="na">approved</span><span class="p">:</span> <span class="p">[]</span> <span class="p">})],</span> <span class="c1">// empty list -&gt; always interrupt</span>
<span class="p">});</span>

<span class="k">if </span><span class="p">(</span><span class="nx">response</span><span class="p">.</span><span class="nx">finishReason</span> <span class="o">===</span> <span class="dl">'</span><span class="s1">interrupted</span><span class="dl">'</span><span class="p">)</span> <span class="p">{</span>
  <span class="kd">const</span> <span class="nx">interrupt</span> <span class="o">=</span> <span class="nx">response</span><span class="p">.</span><span class="nx">interrupts</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span>

  <span class="c1">// ... ask the user, then mark the tool call as approved</span>
  <span class="kd">const</span> <span class="nx">approvedPart</span> <span class="o">=</span> <span class="nf">restartTool</span><span class="p">(</span><span class="nx">interrupt</span><span class="p">,</span> <span class="p">{</span> <span class="na">toolApproved</span><span class="p">:</span> <span class="kc">true</span> <span class="p">});</span>

  <span class="kd">const</span> <span class="nx">resumedResponse</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
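    <span class="na">model</span><span class="p">:</span> <span class="nx">googleAI</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">gemini-flash-latest</span><span class="dl">'</span><span class="p">),</span>
    <span class="na">messages</span><span class="p">:</span> <span class="nx">response</span><span class="p">.</span><span class="nx">messages</span><span class="p">,</span>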
    <span class="na">resume</span><span class="p">:</span> <span class="p">{</span> <span class="na">restart</span><span class="p">:</span> <span class="p">[</span><span class="nx">approvedPart</span><span class="p">]</span> <span class="p">},</span>
    <span class="na">use</span><span class="p">:</span> <span class="p">[</span><span class="nf">toolApproval</span><span class="p">({</span> <span class="na">approved</span><span class="p">:</span> <span class="p">[]</span> <span class="p">})],</span>
  <span class="p">});</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This is exactly the pattern you want for any agent that touches the real world (filesystem writes, payments, sending emails). No more home-grown approval flags scattered across the codebase.</p>
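
<p>The allowlist idea itself is simple enough to sketch in a few lines. The snippet below is a simplified model of the decision, not the middleware's implementation: <code class="language-plaintext highlighter-rouge">ToolCall</code> and <code class="language-plaintext highlighter-rouge">partitionByApproval</code> are hypothetical names, and the real middleware raises an interrupt instead of returning a list.</p>

```typescript
// Hypothetical shape for this sketch; Genkit's own tool-request types differ.
interface ToolCall {
  name: string;
  input: unknown;
}

// Splits requested tool calls into those that may run on their own and
// those that must wait for a human decision.
function partitionByApproval(calls: ToolCall[], approved: string[]) {
  const allow = new Set(approved);
  const autoRun: ToolCall[] = [];
  const needsApproval: ToolCall[] = [];
  for (const call of calls) {
    (allow.has(call.name) ? autoRun : needsApproval).push(call);
  }
  return { autoRun, needsApproval };
}
```

<p>With an empty <code class="language-plaintext highlighter-rouge">approved</code> list every call lands in <code class="language-plaintext highlighter-rouge">needsApproval</code>, which matches the “always interrupt” behavior in the example above.</p>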

<h3 id="retry--exponential-backoff-with-jitter-for-transient-errors"><code class="language-plaintext highlighter-rouge">retry</code> — exponential backoff with jitter for transient errors</h3>

<p>The <code class="language-plaintext highlighter-rouge">retry</code> middleware retries failed model calls on transient status codes (<code class="language-plaintext highlighter-rouge">UNAVAILABLE</code>, <code class="language-plaintext highlighter-rouge">DEADLINE_EXCEEDED</code>, <code class="language-plaintext highlighter-rouge">RESOURCE_EXHAUSTED</code>, <code class="language-plaintext highlighter-rouge">ABORTED</code>, <code class="language-plaintext highlighter-rouge">INTERNAL</code>) using exponential backoff with jitter.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">retry</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@genkit-ai/middleware</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">response</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
  <span class="na">model</span><span class="p">:</span> <span class="nx">googleAI</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">gemini-pro-latest</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">prompt</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Heavy reasoning task...</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">use</span><span class="p">:</span> <span class="p">[</span>
    <span class="nf">retry</span><span class="p">({</span>
      <span class="na">maxRetries</span><span class="p">:</span> <span class="mi">3</span><span class="p">,</span>
      <span class="na">initialDelayMs</span><span class="p">:</span> <span class="mi">1000</span><span class="p">,</span>
      <span class="na">backoffFactor</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span>
    <span class="p">}),</span>
  <span class="p">],</span>
<span class="p">});</span>
</code></pre></div></div>

<p>Knobs you actually care about:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">maxRetries</code> (default <code class="language-plaintext highlighter-rouge">3</code>)</li>
  <li><code class="language-plaintext highlighter-rouge">statuses</code> — which status codes to retry on</li>
  <li><code class="language-plaintext highlighter-rouge">initialDelayMs</code> / <code class="language-plaintext highlighter-rouge">maxDelayMs</code> / <code class="language-plaintext highlighter-rouge">backoffFactor</code></li>
  <li><code class="language-plaintext highlighter-rouge">noJitter</code> — if you really want deterministic delays</li>
</ul>

<p>This is one of those things every team writes once, badly. Having it in the framework is a very welcome change.</p>
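
<p>If you have never looked at how these knobs interact, the sketch below shows the textbook calculation. The option names mirror the list above, but the formula (exponential growth, a cap, then “full jitter”) is an assumption on my side about how such a middleware typically behaves, not a copy of the Genkit source. <code class="language-plaintext highlighter-rouge">backoffDelayMs</code> is a hypothetical helper.</p>

```typescript
// Option names follow the list above; the formula itself is an assumption.
interface BackoffOptions {
  initialDelayMs: number;
  maxDelayMs: number;
  backoffFactor: number;
  noJitter?: boolean;
}

// Delay to wait before retry number `attempt` (0-based).
function backoffDelayMs(
  attempt: number,
  opts: BackoffOptions,
  random: () => number = Math.random,
): number {
  const exponential = opts.initialDelayMs * Math.pow(opts.backoffFactor, attempt);
  const capped = Math.min(exponential, opts.maxDelayMs);
  if (opts.noJitter) return capped;
  // "Full jitter": pick a uniform delay in [0, capped) so clients do not retry in lockstep.
  return Math.floor(random() * capped);
}
```

<p>With <code class="language-plaintext highlighter-rouge">initialDelayMs: 1000</code>, <code class="language-plaintext highlighter-rouge">backoffFactor: 2</code>, <code class="language-plaintext highlighter-rouge">maxDelayMs: 5000</code> and jitter disabled, the first four delays are 1000, 2000, 4000 and 5000 ms.</p>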

<h3 id="fallback--gracefully-degrade-to-a-different-model"><code class="language-plaintext highlighter-rouge">fallback</code> — gracefully degrade to a different model</h3>

<p><code class="language-plaintext highlighter-rouge">fallback</code> switches to an alternate model when the primary one fails on configurable status codes. The classic use case is “try Pro first, fall back to Flash when quota is exhausted”:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">fallback</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@genkit-ai/middleware</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">response</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
  <span class="na">model</span><span class="p">:</span> <span class="nx">googleAI</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">gemini-pro-latest</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">prompt</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Try the pro model first...</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">use</span><span class="p">:</span> <span class="p">[</span>
    <span class="nf">fallback</span><span class="p">({</span>
      <span class="na">models</span><span class="p">:</span> <span class="p">[</span><span class="nx">googleAI</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">gemini-flash-latest</span><span class="dl">'</span><span class="p">)],</span>
      <span class="na">statuses</span><span class="p">:</span> <span class="p">[</span><span class="dl">'</span><span class="s1">RESOURCE_EXHAUSTED</span><span class="dl">'</span><span class="p">],</span>
    <span class="p">}),</span>
  <span class="p">],</span>
<span class="p">});</span>
</code></pre></div></div>

<p>You can chain multiple fallback models, and <code class="language-plaintext highlighter-rouge">isolateConfig</code> lets you decide whether the fallback inherits the original request configuration or starts clean (handy when the fallback model does not support the same options as the primary).</p>
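
<p>Conceptually, chaining fallbacks is a loop over candidate models that only moves on when the error carries one of the configured statuses. Here is a rough, synchronous sketch of that control flow. <code class="language-plaintext highlighter-rouge">ModelCall</code>, <code class="language-plaintext highlighter-rouge">statusError</code> and <code class="language-plaintext highlighter-rouge">callWithFallback</code> are hypothetical stand-ins, not Genkit APIs, and the real middleware is asynchronous.</p>

```typescript
// Hypothetical stand-ins for this sketch; these are not Genkit types.
type ModelCall = (prompt: string) => string;

// Builds an Error carrying a status string such as 'RESOURCE_EXHAUSTED'.
function statusError(status: string, message: string): Error & { status: string } {
  return Object.assign(new Error(message), { status });
}

function statusOf(err: unknown): string | undefined {
  return err instanceof Error ? (err as Error & { status?: string }).status : undefined;
}

// Tries each model in order; the first entry plays the role of the primary.
function callWithFallback(models: ModelCall[], statuses: string[], prompt: string): string {
  let lastError: unknown = new Error('No models configured');
  for (const model of models) {
    try {
      return model(prompt);
    } catch (err) {
      const status = statusOf(err);
      // Only the configured statuses trigger a switch; anything else propagates.
      if (status === undefined || !statuses.includes(status)) throw err;
      lastError = err;
    }
  }
  // Every model failed with a fallback-eligible status: surface the last error.
  throw lastError;
}
```

<p>The important design choice is the early <code class="language-plaintext highlighter-rouge">throw</code>: an error such as an invalid request should not silently burn through your fallback models.</p>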

<h2 id="building-your-own-middleware-with-generatemiddleware">Building your own middleware with <code class="language-plaintext highlighter-rouge">generateMiddleware</code></h2>

<p>The same primitive that powers all the built-ins is exposed for you. The <code class="language-plaintext highlighter-rouge">generateMiddleware</code> helper gives you typed config schemas (via Zod) and access to the <code class="language-plaintext highlighter-rouge">ai</code> instance.</p>

<p>Here is the canonical “logger” example, straight from the docs but lightly annotated:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">generateMiddleware</span><span class="p">,</span> <span class="nx">z</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkit</span><span class="dl">'</span><span class="p">;</span>

<span class="k">export</span> <span class="kd">const</span> <span class="nx">loggerMiddleware</span> <span class="o">=</span> <span class="nf">generateMiddleware</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">loggerMiddleware</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">description</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Logs requests and responses</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">configSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
      <span class="na">verbose</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">boolean</span><span class="p">().</span><span class="nf">optional</span><span class="p">(),</span>
    <span class="p">}),</span>
  <span class="p">},</span>
  <span class="p">({</span> <span class="nx">config</span><span class="p">,</span> <span class="nx">ai</span> <span class="p">})</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="k">return</span> <span class="p">{</span>
      <span class="c1">// Phase 1: intercept the model call</span>
      <span class="na">model</span><span class="p">:</span> <span class="k">async </span><span class="p">(</span><span class="nx">req</span><span class="p">,</span> <span class="nx">ctx</span><span class="p">,</span> <span class="nx">next</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
        <span class="k">if </span><span class="p">(</span><span class="nx">config</span><span class="p">?.</span><span class="nx">verbose</span><span class="p">)</span> <span class="p">{</span>
          <span class="nx">console</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="dl">'</span><span class="s1">Request:</span><span class="dl">'</span><span class="p">,</span> <span class="nx">JSON</span><span class="p">.</span><span class="nf">stringify</span><span class="p">(</span><span class="nx">req</span><span class="p">));</span>
        <span class="p">}</span>
        <span class="kd">const</span> <span class="nx">resp</span> <span class="o">=</span> <span class="k">await</span> <span class="nf">next</span><span class="p">(</span><span class="nx">req</span><span class="p">,</span> <span class="nx">ctx</span><span class="p">);</span>
        <span class="k">if </span><span class="p">(</span><span class="nx">config</span><span class="p">?.</span><span class="nx">verbose</span><span class="p">)</span> <span class="p">{</span>
          <span class="nx">console</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="dl">'</span><span class="s1">Response:</span><span class="dl">'</span><span class="p">,</span> <span class="nx">JSON</span><span class="p">.</span><span class="nf">stringify</span><span class="p">(</span><span class="nx">resp</span><span class="p">));</span>
        <span class="p">}</span>
        <span class="k">return</span> <span class="nx">resp</span><span class="p">;</span>
      <span class="p">},</span>
      <span class="c1">// You could also add `tool: ...` and `generate: ...` hooks here.</span>
    <span class="p">};</span>
  <span class="p">}</span>
<span class="p">);</span>
</code></pre></div></div>

<p>Using it is identical to using the official ones:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">response</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
  <span class="na">model</span><span class="p">:</span> <span class="nx">googleAI</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">gemini-flash-latest</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">prompt</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Hello</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">use</span><span class="p">:</span> <span class="p">[</span><span class="nf">loggerMiddleware</span><span class="p">({</span> <span class="na">verbose</span><span class="p">:</span> <span class="kc">true</span> <span class="p">})],</span>
<span class="p">});</span>
</code></pre></div></div>

<p>A few patterns I have found very useful:</p>

<ul>
  <li><strong>PII redaction</strong> — implement a <code class="language-plaintext highlighter-rouge">model</code> hook that scrubs the request prompt and the response text against a regex/dictionary, returning the cleaned version.</li>
  <li><strong>Cost accounting</strong> — wrap the <code class="language-plaintext highlighter-rouge">model</code> hook to read <code class="language-plaintext highlighter-rouge">usage</code> tokens from the response, and emit them to your metrics backend tagged by user/feature.</li>
  <li><strong>Per-tenant quotas</strong> — use the <code class="language-plaintext highlighter-rouge">generate</code> hook to check a counter (Redis, Firestore…) before calling <code class="language-plaintext highlighter-rouge">next()</code>; throw your own custom error if the tenant is over quota.</li>
  <li><strong>Caching</strong> — keyed on a hash of the model + request, return a cached response if hit, otherwise call <code class="language-plaintext highlighter-rouge">next()</code> and persist the result.</li>
</ul>
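
<p>As an illustration of the caching pattern, here is a small sketch of the core logic you would place inside a <code class="language-plaintext highlighter-rouge">model</code> hook. Everything in it is an assumption for the sake of the example: the request and response shapes are simplified, <code class="language-plaintext highlighter-rouge">withCache</code> and <code class="language-plaintext highlighter-rouge">cacheKey</code> are hypothetical names, the store is an in-memory <code class="language-plaintext highlighter-rouge">Map</code>, and it is synchronous for brevity while real hooks are async.</p>

```typescript
import { createHash } from 'node:crypto';

// Hypothetical request/response shapes for this sketch; not Genkit types.
interface SketchRequest {
  prompt: string;
  config?: Record<string, unknown>;
}
interface SketchResponse {
  text: string;
}
type Next = (req: SketchRequest) => SketchResponse;

// Key = SHA-256 of model name + request. JSON.stringify is sensitive to
// property order, so requests should be built consistently.
function cacheKey(modelName: string, req: SketchRequest): string {
  return createHash('sha256').update(JSON.stringify([modelName, req])).digest('hex');
}

// Wraps `next` so that identical requests are answered from the store.
function withCache(
  modelName: string,
  next: Next,
  store: Map<string, SketchResponse> = new Map(),
): Next {
  return (req) => {
    const key = cacheKey(modelName, req);
    const hit = store.get(key);
    if (hit !== undefined) return hit;
    const resp = next(req);
    store.set(key, resp);
    return resp;
  };
}
```

<p>In a real deployment you would swap the <code class="language-plaintext highlighter-rouge">Map</code> for Redis or similar and add a TTL, since model answers go stale and memory is finite.</p>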

<p>For more inspiration, the source of the official middlewares is open in the <a href="https://github.com/genkit-ai/genkit/tree/main/js/plugins/middleware">Genkit GitHub repository</a>, and reading them is genuinely educational.</p>

<h2 id="composition-stacking-middlewares">Composition: stacking middlewares</h2>

<p>Middlewares compose in array order. A reasonable production stack might look like this:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">response</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
  <span class="na">model</span><span class="p">:</span> <span class="nx">googleAI</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">gemini-pro-latest</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">prompt</span><span class="p">:</span> <span class="nx">userPrompt</span><span class="p">,</span>
  <span class="na">tools</span><span class="p">:</span> <span class="nx">myTools</span><span class="p">,</span>
  <span class="na">use</span><span class="p">:</span> <span class="p">[</span>
    <span class="nf">loggerMiddleware</span><span class="p">({</span> <span class="na">verbose</span><span class="p">:</span> <span class="kc">false</span> <span class="p">}),</span>       <span class="c1">// outermost: see everything</span>
    <span class="nf">retry</span><span class="p">({</span> <span class="na">maxRetries</span><span class="p">:</span> <span class="mi">3</span> <span class="p">}),</span>                   <span class="c1">// recover from transient failures</span>
    <span class="nf">fallback</span><span class="p">({</span>                                  <span class="c1">// degrade if Pro is overloaded</span>
      <span class="na">models</span><span class="p">:</span> <span class="p">[</span><span class="nx">googleAI</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">gemini-flash-latest</span><span class="dl">'</span><span class="p">)],</span>
      <span class="na">statuses</span><span class="p">:</span> <span class="p">[</span><span class="dl">'</span><span class="s1">RESOURCE_EXHAUSTED</span><span class="dl">'</span><span class="p">],</span>
    <span class="p">}),</span>
    <span class="nf">toolApproval</span><span class="p">({</span> <span class="na">approved</span><span class="p">:</span> <span class="p">[</span><span class="dl">'</span><span class="s1">searchDocs</span><span class="dl">'</span><span class="p">]</span> <span class="p">}),</span> <span class="c1">// gate dangerous tools</span>
  <span class="p">],</span>
<span class="p">});</span>
</code></pre></div></div>

<p>The order matters: outer middlewares see the result of the inner ones. Put logging on the outside if you want it to record the final state after retries and fallbacks; put it on the inside if you want to see every individual model attempt.</p>
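
<p>If the “outer versus inner” wording feels abstract, this toy model of the onion makes it visible. It is deliberately simplified and not Genkit code: middlewares are synchronous, requests are plain strings, and <code class="language-plaintext highlighter-rouge">compose</code> and <code class="language-plaintext highlighter-rouge">tag</code> are hypothetical helpers. It assumes, as in the stack above, that the first array entry is the outermost layer.</p>

```typescript
// Hypothetical, simplified middleware shape: synchronous and string-based.
type Handler = (req: string) => string;
type Middleware = (req: string, next: Handler) => string;

// Folds from the right so that the first array entry ends up outermost.
function compose(middlewares: Middleware[], model: Handler): Handler {
  return middlewares.reduceRight<Handler>(
    (next, mw) => (req) => mw(req, next),
    model,
  );
}

const trace: string[] = [];

// Creates a middleware that records when it runs, before and after `next`.
const tag = (name: string): Middleware => (req, next) => {
  trace.push(`${name}:before`);
  const resp = next(req);
  trace.push(`${name}:after`);
  return resp;
};

const run = compose([tag('logger'), tag('retry')], (req) => {
  trace.push('model');
  return `echo ${req}`;
});

const result = run('Hello');
console.log(trace.join(' > '));
// prints "logger:before > retry:before > model > retry:after > logger:after"
```

<p>The first entry is the first to see the request and the last to see the response, which is why a logger placed first records the final outcome after everything inside it has run.</p>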

<h2 id="the-importance-of-middleware-for-production-agents">The importance of middleware for production agents</h2>

<p>Genkit Middleware is one of those features that does not look flashy in a changelog but quietly fixes a lot of real-world friction. It pushes Genkit closer to a “batteries-included” framework for production agents:</p>

<ul>
  <li>Cross-cutting concerns are no longer copy-pasted across flows.</li>
  <li>Safety-critical behavior (approvals, sandboxes, fallbacks) is declarative.</li>
  <li>The <code class="language-plaintext highlighter-rouge">model</code> / <code class="language-plaintext highlighter-rouge">tool</code> / <code class="language-plaintext highlighter-rouge">generate</code> split gives you precise control without forcing you to monkey-patch.</li>
  <li>The middleware contract is small enough that the community can ship plugins that interoperate.</li>
</ul>

<p>If you maintain any non-trivial Genkit application, the upgrade is a no-brainer. Drop in <code class="language-plaintext highlighter-rouge">retry</code> and <code class="language-plaintext highlighter-rouge">fallback</code> first, and you will probably see incidents caused by transient provider errors drop off within the week. Then start writing your own middlewares for the things that are unique to your domain.</p>

<h2 id="conclusion">Conclusion</h2>

<p>Middleware turns Genkit’s <code class="language-plaintext highlighter-rouge">generate()</code> from “a function you call” into “a pipeline you compose”. The official <code class="language-plaintext highlighter-rouge">@genkit-ai/middleware</code> package covers the most common production needs (filesystem access, skills, tool approval, retries, fallbacks), and <code class="language-plaintext highlighter-rouge">generateMiddleware</code> makes writing your own a 20-line affair instead of a refactor.</p>

<p>For next steps, take a look at:</p>

<ul>
  <li><a href="https://genkit.dev/docs/js/middleware/">Genkit Middleware documentation</a></li>
  <li><a href="https://github.com/genkit-ai/genkit/tree/main/js/plugins/middleware">Genkit middleware source on GitHub</a></li>
  <li><a href="https://genkit.dev/docs/js/flows/">Genkit flows</a> — middleware composes especially well with typed flows</li>
  <li><a href="https://genkit.dev/docs/js/tool-calling/">Tool calling</a> and <a href="https://genkit.dev/docs/js/interrupts/">Interrupts</a> — the foundation that <code class="language-plaintext highlighter-rouge">toolApproval</code> builds on</li>
</ul>

<p>Happy hacking, and may your fallback models always be cheaper than your primary one.</p>]]></content><author><name>Xavier Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><category term="genkit" /><category term="genkit" /><category term="javascript" /><category term="typescript" /><category term="middleware" /><category term="gemini" /><summary type="html"><![CDATA[A deep dive into the new Genkit middleware system for JavaScript/TypeScript: built-in middleware (filesystem, skills, toolApproval, retry, fallback), how to build your own with `generateMiddleware`, and the new `model`/`tool`/`generate` interception hooks.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://xavidop.me/assets/img/blog/post-headers/genkit-middleware.png" /><media:content medium="image" url="https://xavidop.me/assets/img/blog/post-headers/genkit-middleware.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry xml:lang="en"><title type="html">Vercel AI SDK Middleware vs Genkit Middleware: a Hands-On Comparison (English)</title><link href="https://xavidop.me/genkit/2026-05-13-vercel-ai-sdk-vs-genkit-middleware/" rel="alternate" type="text/html" title="Vercel AI SDK Middleware vs Genkit Middleware: a Hands-On Comparison (English)" /><published>2026-05-13T00:00:00+00:00</published><updated>2026-05-14T10:59:44+00:00</updated><id>https://xavidop.me/genkit/vercel-ai-sdk-vs-genkit-middleware</id><content type="html" xml:base="https://xavidop.me/genkit/2026-05-13-vercel-ai-sdk-vs-genkit-middleware/"><![CDATA[<ol class="no_toc" id="markdown-toc">
  <li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li>
  <li><a href="#tldr" id="markdown-toc-tldr">TL;DR</a></li>
  <li><a href="#the-mental-model" id="markdown-toc-the-mental-model">The mental model</a>    <ol>
      <li><a href="#vercel-ai-sdk-wrap-the-model" id="markdown-toc-vercel-ai-sdk-wrap-the-model">Vercel AI SDK: wrap the model</a></li>
      <li><a href="#genkit-opt-in-per-call" id="markdown-toc-genkit-opt-in-per-call">Genkit: opt in per call</a></li>
    </ol>
  </li>
  <li><a href="#the-hooks-side-by-side" id="markdown-toc-the-hooks-side-by-side">The hooks side by side</a>    <ol>
      <li><a href="#vercel-ai-sdk" id="markdown-toc-vercel-ai-sdk">Vercel AI SDK</a></li>
      <li><a href="#genkit" id="markdown-toc-genkit">Genkit</a></li>
    </ol>
  </li>
  <li><a href="#built-ins-head-to-head" id="markdown-toc-built-ins-head-to-head">Built-ins, head to head</a>    <ol>
      <li><a href="#vercel-ai-sdk-1" id="markdown-toc-vercel-ai-sdk-1">Vercel AI SDK</a></li>
      <li><a href="#genkit-1" id="markdown-toc-genkit-1">Genkit</a></li>
    </ol>
  </li>
  <li><a href="#composition" id="markdown-toc-composition">Composition</a>    <ol>
      <li><a href="#vercel-ai-sdk-2" id="markdown-toc-vercel-ai-sdk-2">Vercel AI SDK</a></li>
      <li><a href="#genkit-2" id="markdown-toc-genkit-2">Genkit</a></li>
    </ol>
  </li>
  <li><a href="#per-request-metadata" id="markdown-toc-per-request-metadata">Per-request metadata</a></li>
  <li><a href="#streaming" id="markdown-toc-streaming">Streaming</a></li>
  <li><a href="#tools-and-agents" id="markdown-toc-tools-and-agents">Tools and agents</a></li>
  <li><a href="#observability-and-tracing" id="markdown-toc-observability-and-tracing">Observability and tracing</a></li>
  <li><a href="#a-concrete-decision-guide" id="markdown-toc-a-concrete-decision-guide">A concrete decision guide</a></li>
  <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ol>

<h2 id="introduction">Introduction</h2>

<p>Two of the most popular Gen AI frameworks in JavaScript/TypeScript, <strong>Vercel AI SDK</strong> and <strong>Genkit</strong>, both ship a middleware system to extend their model calls with cross-cutting behavior: logging, caching, RAG, retries, fallbacks, guardrails, tool approval, etc.</p>

<p>On the surface they look very similar. In practice, they sit at different abstraction levels and embody different philosophies. This article puts them side by side using their official docs as the source of truth:</p>

<ul>
  <li><a href="https://ai-sdk.dev/docs/ai-sdk-core/middleware">Vercel AI SDK — Language Model Middleware</a></li>
  <li><a href="https://genkit.dev/docs/js/middleware/">Genkit — Middleware</a></li>
</ul>

<p>I’ll walk through the API surface, the built-ins, how composition works, how you write your own, and finish with a concrete decision guide.</p>

<blockquote>
  <p>One scope note before we dive in: <strong>Vercel AI SDK is JavaScript/TypeScript only</strong>, while <strong>Genkit is a multi-language framework</strong> with official SDKs for <strong>JavaScript/TypeScript</strong> (primary, stable), <strong>Go</strong>, <strong>Python</strong> (preview) and <strong>Dart/Flutter</strong> (preview), plus a community-maintained <strong>Java</strong> SDK. The middleware comparison below is JS/TS-to-JS/TS, but if you also need to share patterns with Go, Python, Dart or Java services, that is a Genkit-only conversation.</p>
</blockquote>

<h2 id="tldr">TL;DR</h2>

<table>
  <thead>
    <tr>
      <th>Topic</th>
      <th>Vercel AI SDK</th>
      <th>Genkit</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Primitive</td>
      <td><code class="language-plaintext highlighter-rouge">wrapLanguageModel({ model, middleware })</code> returns a wrapped model</td>
      <td><code class="language-plaintext highlighter-rouge">use: [...]</code> array on each <code class="language-plaintext highlighter-rouge">generate()</code> call</td>
    </tr>
    <tr>
      <td>Granularity</td>
      <td>Wraps the <strong>language model</strong> (<code class="language-plaintext highlighter-rouge">doGenerate</code> / <code class="language-plaintext highlighter-rouge">doStream</code>)</td>
      <td>Wraps the <strong>model</strong>, <strong>tool execution</strong>, and <strong>generation loop</strong></td>
    </tr>
    <tr>
      <td>Hooks</td>
      <td><code class="language-plaintext highlighter-rouge">transformParams</code>, <code class="language-plaintext highlighter-rouge">wrapGenerate</code>, <code class="language-plaintext highlighter-rouge">wrapStream</code></td>
      <td><code class="language-plaintext highlighter-rouge">model</code>, <code class="language-plaintext highlighter-rouge">tool</code>, <code class="language-plaintext highlighter-rouge">generate</code></td>
    </tr>
    <tr>
      <td>Where to put it</td>
      <td>At model construction (provider-agnostic, model-level)</td>
      <td>At call site (per-request, declarative)</td>
    </tr>
    <tr>
      <td>Built-ins</td>
      <td>Reasoning extraction, JSON extraction, simulated streaming, default settings, tool input examples</td>
      <td>Filesystem, Skills, Tool approval, Retry, Fallback</td>
    </tr>
    <tr>
      <td>Streaming</td>
      <td>First-class (separate <code class="language-plaintext highlighter-rouge">wrapStream</code>)</td>
      <td>Handled inside the <code class="language-plaintext highlighter-rouge">model</code> hook</td>
    </tr>
    <tr>
      <td>Per-request metadata</td>
      <td><code class="language-plaintext highlighter-rouge">providerOptions</code> namespaced by middleware name</td>
      <td>Direct config object passed when invoking the middleware</td>
    </tr>
    <tr>
      <td>Distribution</td>
      <td>Provider-agnostic, follows <code class="language-plaintext highlighter-rouge">LanguageModelV3Middleware</code> spec</td>
      <td>Follows the <code class="language-plaintext highlighter-rouge">generateMiddleware</code> contract (with Zod config schemas)</td>
    </tr>
  </tbody>
</table>

<p>Both are good. They are designed for different mental models, and if your project already mixes the two SDKs you can use both middleware systems in the same codebase.</p>

<h2 id="the-mental-model">The mental model</h2>

<h3 id="vercel-ai-sdk-wrap-the-model">Vercel AI SDK: wrap the model</h3>

<p>In Vercel’s world, middleware is a <strong>decorator over the language model itself</strong>. You take a model, wrap it, and the result is “still a model”. It plugs into <code class="language-plaintext highlighter-rouge">generateText</code>, <code class="language-plaintext highlighter-rouge">streamText</code>, <code class="language-plaintext highlighter-rouge">generateObject</code>, etc. without those functions even knowing middleware exists.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">wrapLanguageModel</span><span class="p">,</span> <span class="nx">streamText</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">ai</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">wrappedLanguageModel</span> <span class="o">=</span> <span class="nf">wrapLanguageModel</span><span class="p">({</span>
  <span class="na">model</span><span class="p">:</span> <span class="nx">yourModel</span><span class="p">,</span>
  <span class="na">middleware</span><span class="p">:</span> <span class="nx">yourLanguageModelMiddleware</span><span class="p">,</span>
<span class="p">});</span>

<span class="kd">const</span> <span class="nx">result</span> <span class="o">=</span> <span class="nf">streamText</span><span class="p">({</span>
  <span class="na">model</span><span class="p">:</span> <span class="nx">wrappedLanguageModel</span><span class="p">,</span>
  <span class="na">prompt</span><span class="p">:</span> <span class="dl">'</span><span class="s1">What cities are in the United States?</span><span class="dl">'</span><span class="p">,</span>
<span class="p">});</span>
</code></pre></div></div>

<p>This is very clean: the middleware travels with the model and is transparent to the rest of your code.</p>

<h3 id="genkit-opt-in-per-call">Genkit: opt in per call</h3>

<p>Genkit takes the opposite stance. Instead of wrapping the model, you pass middlewares as part of each <code class="language-plaintext highlighter-rouge">generate()</code> call via the <code class="language-plaintext highlighter-rouge">use:</code> array, and you can intercept three different phases of the pipeline: the <strong>model</strong>, the <strong>tools</strong>, and the high-level <strong>generate</strong> loop.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">response</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
  <span class="na">model</span><span class="p">:</span> <span class="nx">googleAI</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">gemini-flash-latest</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">prompt</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Hello</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">use</span><span class="p">:</span> <span class="p">[</span><span class="nf">retry</span><span class="p">({</span> <span class="na">maxRetries</span><span class="p">:</span> <span class="mi">3</span> <span class="p">}),</span> <span class="nf">loggerMiddleware</span><span class="p">({</span> <span class="na">verbose</span><span class="p">:</span> <span class="kc">true</span> <span class="p">})],</span>
<span class="p">});</span>
</code></pre></div></div>

<p>Trade-off: it is more explicit (you see exactly what runs at each call site), at the cost of being noisier when you want a global behavior. Both styles are easy to wrap in a small helper.</p>
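<p>As a rough illustration of such a helper, here is a minimal sketch that prepends a default stack to every call’s options. The <code class="language-plaintext highlighter-rouge">Middleware</code> and <code class="language-plaintext highlighter-rouge">CallOptions</code> types and the <code class="language-plaintext highlighter-rouge">withDefaultMiddleware</code> function are hypothetical stand-ins defined locally, not part of Genkit or the AI SDK:</p>

```typescript
// Hypothetical local types: stand-ins for your SDK's real middleware and
// call-option types. Nothing here is imported from Genkit or the AI SDK.
type Middleware = { name: string };
type CallOptions = { prompt: string; use?: Middleware[] };

// Returns a function that prepends a default stack to each call's options.
function withDefaultMiddleware(defaults: Middleware[]) {
  return (options: CallOptions): CallOptions => ({
    ...options,
    // Defaults are placed first; per-call entries are appended after them.
    use: [...defaults, ...(options.use ?? [])],
  });
}

const applyDefaults = withDefaultMiddleware([{ name: 'logger' }, { name: 'retry' }]);
const finalOptions = applyDefaults({ prompt: 'Hello', use: [{ name: 'toolApproval' }] });
const stackNames = (finalOptions.use ?? []).map((m) => m.name);
console.log(stackNames.join(' > '));
```

<p>In a real project the result would be passed to the SDK’s own call (for Genkit, the options object given to <code class="language-plaintext highlighter-rouge">generate()</code>), with the placeholder types replaced by the SDK’s types.</p>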

<h2 id="the-hooks-side-by-side">The hooks side by side</h2>

<h3 id="vercel-ai-sdk">Vercel AI SDK</h3>

<p>Three hooks, all centered on the language model contract:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">transformParams</code> — mutate the request before it hits <code class="language-plaintext highlighter-rouge">doGenerate</code> or <code class="language-plaintext highlighter-rouge">doStream</code>.</li>
  <li><code class="language-plaintext highlighter-rouge">wrapGenerate</code> — wrap the non-streaming call, observe and modify the result.</li>
  <li><code class="language-plaintext highlighter-rouge">wrapStream</code> — wrap the streaming call. You typically pipe the stream through a <code class="language-plaintext highlighter-rouge">TransformStream</code> to inspect or rewrite chunks.</li>
</ul>

<p>Example: the canonical logging middleware, including streaming:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="kd">type</span> <span class="p">{</span>
  <span class="nx">LanguageModelV3Middleware</span><span class="p">,</span>
  <span class="nx">LanguageModelV3StreamPart</span><span class="p">,</span>
<span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@ai-sdk/provider</span><span class="dl">'</span><span class="p">;</span>

<span class="k">export</span> <span class="kd">const</span> <span class="nx">yourLogMiddleware</span><span class="p">:</span> <span class="nx">LanguageModelV3Middleware</span> <span class="o">=</span> <span class="p">{</span>
  <span class="na">wrapGenerate</span><span class="p">:</span> <span class="k">async </span><span class="p">({</span> <span class="nx">doGenerate</span><span class="p">,</span> <span class="nx">params</span> <span class="p">})</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="nx">console</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="dl">'</span><span class="s1">doGenerate called</span><span class="dl">'</span><span class="p">,</span> <span class="nx">params</span><span class="p">);</span>
    <span class="kd">const</span> <span class="nx">result</span> <span class="o">=</span> <span class="k">await</span> <span class="nf">doGenerate</span><span class="p">();</span>
    <span class="nx">console</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="dl">'</span><span class="s1">generated content:</span><span class="dl">'</span><span class="p">,</span> <span class="nx">result</span><span class="p">.</span><span class="nx">content</span><span class="p">);</span>
    <span class="k">return</span> <span class="nx">result</span><span class="p">;</span>
  <span class="p">},</span>

  <span class="na">wrapStream</span><span class="p">:</span> <span class="k">async </span><span class="p">({</span> <span class="nx">doStream</span><span class="p">,</span> <span class="nx">params</span> <span class="p">})</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="p">{</span> <span class="nx">stream</span><span class="p">,</span> <span class="p">...</span><span class="nx">rest</span> <span class="p">}</span> <span class="o">=</span> <span class="k">await</span> <span class="nf">doStream</span><span class="p">();</span>
    <span class="kd">let</span> <span class="nx">generatedText</span> <span class="o">=</span> <span class="dl">''</span><span class="p">;</span>

    <span class="kd">const</span> <span class="nx">transformStream</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">TransformStream</span><span class="o">&lt;</span>
      <span class="nx">LanguageModelV3StreamPart</span><span class="p">,</span>
      <span class="nx">LanguageModelV3StreamPart</span>
    <span class="o">&gt;</span><span class="p">({</span>
      <span class="nf">transform</span><span class="p">(</span><span class="nx">chunk</span><span class="p">,</span> <span class="nx">controller</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">if </span><span class="p">(</span><span class="nx">chunk</span><span class="p">.</span><span class="kd">type</span> <span class="o">===</span> <span class="dl">'</span><span class="s1">text-delta</span><span class="dl">'</span><span class="p">)</span> <span class="nx">generatedText</span> <span class="o">+=</span> <span class="nx">chunk</span><span class="p">.</span><span class="nx">delta</span><span class="p">;</span>
        <span class="nx">controller</span><span class="p">.</span><span class="nf">enqueue</span><span class="p">(</span><span class="nx">chunk</span><span class="p">);</span>
      <span class="p">},</span>
      <span class="nf">flush</span><span class="p">()</span> <span class="p">{</span>
        <span class="nx">console</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="dl">'</span><span class="s1">stream finished:</span><span class="dl">'</span><span class="p">,</span> <span class="nx">generatedText</span><span class="p">);</span>
      <span class="p">},</span>
    <span class="p">});</span>

    <span class="k">return</span> <span class="p">{</span> <span class="na">stream</span><span class="p">:</span> <span class="nx">stream</span><span class="p">.</span><span class="nf">pipeThrough</span><span class="p">(</span><span class="nx">transformStream</span><span class="p">),</span> <span class="p">...</span><span class="nx">rest</span> <span class="p">};</span>
  <span class="p">},</span>
<span class="p">};</span>
</code></pre></div></div>

<h3 id="genkit">Genkit</h3>

<p>Three hooks too, but cutting along <strong>execution phase</strong> instead of stream vs non-stream:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">model</code> — wraps the model invocation. Streaming and non-streaming go through the same hook.</li>
  <li><code class="language-plaintext highlighter-rouge">tool</code> — wraps tool execution. This has no equivalent in Vercel’s spec (tools are not part of the language-model middleware contract there).</li>
  <li><code class="language-plaintext highlighter-rouge">generate</code> — wraps the entire high-level generation loop, including tool calling and output parsing.</li>
</ul>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">generateMiddleware</span><span class="p">,</span> <span class="nx">z</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkit</span><span class="dl">'</span><span class="p">;</span>

<span class="k">export</span> <span class="kd">const</span> <span class="nx">loggerMiddleware</span> <span class="o">=</span> <span class="nf">generateMiddleware</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">loggerMiddleware</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">description</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Logs requests and responses</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">configSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span> <span class="na">verbose</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">boolean</span><span class="p">().</span><span class="nf">optional</span><span class="p">()</span> <span class="p">}),</span>
  <span class="p">},</span>
  <span class="p">({</span> <span class="nx">config</span> <span class="p">})</span> <span class="o">=&gt;</span> <span class="p">({</span>
    <span class="na">model</span><span class="p">:</span> <span class="k">async </span><span class="p">(</span><span class="nx">req</span><span class="p">,</span> <span class="nx">ctx</span><span class="p">,</span> <span class="nx">next</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
      <span class="k">if </span><span class="p">(</span><span class="nx">config</span><span class="p">?.</span><span class="nx">verbose</span><span class="p">)</span> <span class="nx">console</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="dl">'</span><span class="s1">Request:</span><span class="dl">'</span><span class="p">,</span> <span class="nx">JSON</span><span class="p">.</span><span class="nf">stringify</span><span class="p">(</span><span class="nx">req</span><span class="p">));</span>
      <span class="kd">const</span> <span class="nx">resp</span> <span class="o">=</span> <span class="k">await</span> <span class="nf">next</span><span class="p">(</span><span class="nx">req</span><span class="p">,</span> <span class="nx">ctx</span><span class="p">);</span>
      <span class="k">if </span><span class="p">(</span><span class="nx">config</span><span class="p">?.</span><span class="nx">verbose</span><span class="p">)</span> <span class="nx">console</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="dl">'</span><span class="s1">Response:</span><span class="dl">'</span><span class="p">,</span> <span class="nx">JSON</span><span class="p">.</span><span class="nf">stringify</span><span class="p">(</span><span class="nx">resp</span><span class="p">));</span>
      <span class="k">return</span> <span class="nx">resp</span><span class="p">;</span>
    <span class="p">},</span>
  <span class="p">})</span>
<span class="p">);</span>
</code></pre></div></div>

<p>Two important differences:</p>

<ol>
  <li><strong>Tools are first-class citizens</strong> in Genkit middleware. You can intercept tool execution itself, which is exactly what powers <code class="language-plaintext highlighter-rouge">toolApproval</code>. In Vercel AI SDK, tool gating typically lives in your application code or inside <code class="language-plaintext highlighter-rouge">prepareStep</code> (formerly <code class="language-plaintext highlighter-rouge">experimental_prepareStep</code>) for agents, not in the middleware spec.</li>
  <li><strong>Genkit middleware factories carry a Zod config schema</strong>, so misconfigured middleware fails fast with a clear error. Vercel middleware is a plain object that conforms to <code class="language-plaintext highlighter-rouge">LanguageModelV3Middleware</code>; configuration is your responsibility.</li>
</ol>

<h2 id="built-ins-head-to-head">Built-ins, head to head</h2>

<h3 id="vercel-ai-sdk-1">Vercel AI SDK</h3>

<p>Centered on adapting the model contract to real-world quirks:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">extractReasoningMiddleware</code> — pulls <code class="language-plaintext highlighter-rouge">&lt;think&gt;...&lt;/think&gt;</code> blocks (and similar) out of the text and surfaces them as a <code class="language-plaintext highlighter-rouge">reasoning</code> property. Crucial for DeepSeek R1 and friends.</li>
  <li><code class="language-plaintext highlighter-rouge">extractJsonMiddleware</code> — strips Markdown code fences from JSON outputs so <code class="language-plaintext highlighter-rouge">Output.object()</code> keeps working with chatty models.</li>
  <li><code class="language-plaintext highlighter-rouge">simulateStreamingMiddleware</code> — fakes a streaming interface on top of a non-streaming model, so your UI code stays consistent.</li>
  <li><code class="language-plaintext highlighter-rouge">defaultSettingsMiddleware</code> — pins default settings (temperature, max output tokens, provider options).</li>
  <li><code class="language-plaintext highlighter-rouge">addToolInputExamplesMiddleware</code> — for providers that don’t support <code class="language-plaintext highlighter-rouge">inputExamples</code> natively, serializes them into the tool description.</li>
</ul>

<p>The theme is clear: <strong>smooth over differences between providers</strong>. Vercel runs an SDK that has to talk to dozens of model providers, so its middleware library reflects that.</p>
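<p>To make the reasoning-extraction idea concrete, here is a minimal sketch of the text-splitting step only. It is not the implementation of <code class="language-plaintext highlighter-rouge">extractReasoningMiddleware</code>; the <code class="language-plaintext highlighter-rouge">splitReasoning</code> and <code class="language-plaintext highlighter-rouge">tag</code> helpers are hypothetical names, and the sketch assumes the model emits well-formed, non-nested tags:</p>

```typescript
// Builds an opening or closing tag string for the given tag name.
const tag = (name: string, closing = false): string =>
  '<' + (closing ? '/' : '') + name + '>';

// Splits tagged reasoning blocks out of a model's text output.
// Assumption: blocks are not nested; an unterminated block is left in the text.
function splitReasoning(text: string, tagName = 'think') {
  const open = tag(tagName);
  const close = tag(tagName, true);
  const reasoning: string[] = [];
  let visible = '';
  let cursor = 0;

  while (cursor < text.length) {
    const start = text.indexOf(open, cursor);
    if (start === -1) break;
    const end = text.indexOf(close, start + open.length);
    if (end === -1) break;
    visible += text.slice(cursor, start);
    reasoning.push(text.slice(start + open.length, end).trim());
    cursor = end + close.length;
  }

  // Whatever follows the last complete block is ordinary output.
  visible += text.slice(cursor);
  return { reasoning: reasoning.join('\n'), text: visible.trim() };
}

const sample = tag('think') + '2 + 2 = 4' + tag('think', true) + 'The answer is 4.';
const split = splitReasoning(sample);
console.log(split.reasoning, '|', split.text);
```

<p>In a custom middleware, a function like this would be applied to the model’s text inside the hook that sees the response, with the reasoning attached to the result separately from the visible text.</p>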

<h3 id="genkit-1">Genkit</h3>

<p>Centered on production patterns and agentic behavior:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">filesystem</code> — sandboxed file tools (<code class="language-plaintext highlighter-rouge">list_files</code>, <code class="language-plaintext highlighter-rouge">read_file</code>, <code class="language-plaintext highlighter-rouge">write_file</code>, <code class="language-plaintext highlighter-rouge">search_and_replace</code>) restricted to a root directory.</li>
  <li><code class="language-plaintext highlighter-rouge">skills</code> — auto-injects <code class="language-plaintext highlighter-rouge">SKILL.md</code> files into the system prompt and exposes a <code class="language-plaintext highlighter-rouge">use_skill</code> tool.</li>
  <li><code class="language-plaintext highlighter-rouge">toolApproval</code> — human-in-the-loop gating for tool calls, with first-class <code class="language-plaintext highlighter-rouge">ToolInterruptError</code> and resume support.</li>
  <li><code class="language-plaintext highlighter-rouge">retry</code> — exponential backoff with jitter for transient errors.</li>
  <li><code class="language-plaintext highlighter-rouge">fallback</code> — automatic switch to a different model on failure.</li>
</ul>

<p>Theme: <strong>production hardening and agent ergonomics</strong>. Genkit assumes you’ll plug in your own providers but want batteries for retry/fallback/approval/skills.</p>

<p>Notice the very small overlap. If you wanted “extract reasoning” in Genkit you’d write it as a custom middleware (10-20 lines). If you wanted “retry with backoff” in Vercel AI SDK, same thing. Each ecosystem chose to ship the middlewares its users were asking for the most.</p>
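<p>For the retry case, the core of a hand-written middleware is the delay calculation. The sketch below uses a capped exponential backoff with “full jitter”; this is one common strategy, not necessarily the one Genkit’s <code class="language-plaintext highlighter-rouge">retry</code> implements, and <code class="language-plaintext highlighter-rouge">backoffDelayMs</code> with its default values is a hypothetical helper:</p>

```typescript
// Delay before retry number `attempt` (0-based), using "full jitter":
// a random value between 0 and min(capMs, baseMs * 2^attempt).
// `random` is injectable so the function can be exercised deterministically.
function backoffDelayMs(
  attempt: number,
  baseMs = 250,
  capMs = 8000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Pinning the random source to 1 exposes the ceiling of each attempt.
const ceilings = [0, 1, 2, 3, 4, 5, 6].map((attempt) =>
  backoffDelayMs(attempt, 250, 8000, () => 1),
);
console.log(ceilings.join(', '));
```

<p>A retry wrapper would call this between attempts, sleep for the returned number of milliseconds, and give up after a fixed number of retries.</p>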

<h2 id="composition">Composition</h2>

<h3 id="vercel-ai-sdk-2">Vercel AI SDK</h3>

<p>You pass an array to <code class="language-plaintext highlighter-rouge">wrapLanguageModel</code>. The order is “outermost first”: the first middleware in the array runs outside the second one.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">wrappedLanguageModel</span> <span class="o">=</span> <span class="nf">wrapLanguageModel</span><span class="p">({</span>
  <span class="na">model</span><span class="p">:</span> <span class="nx">yourModel</span><span class="p">,</span>
  <span class="na">middleware</span><span class="p">:</span> <span class="p">[</span><span class="nx">firstMiddleware</span><span class="p">,</span> <span class="nx">secondMiddleware</span><span class="p">],</span>
<span class="p">});</span>
<span class="c1">// applied as: firstMiddleware(secondMiddleware(yourModel))</span>
</code></pre></div></div>

<p>Composition is <strong>static</strong>. The wrapped model is a long-lived value. Great for “configure once at startup” scenarios.</p>
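<p>The “outermost first” ordering is easier to see with a toy model. The sketch below does not use the AI SDK at all: <code class="language-plaintext highlighter-rouge">Model</code>, <code class="language-plaintext highlighter-rouge">Wrapper</code>, <code class="language-plaintext highlighter-rouge">tracing</code> and <code class="language-plaintext highlighter-rouge">wrapAll</code> are simplified, hypothetical stand-ins that only reproduce the nesting order described above:</p>

```typescript
// Hypothetical, simplified shapes: a "model" maps a prompt to text,
// and a "middleware" wraps one model into another.
type Model = (prompt: string) => string;
type Wrapper = (inner: Model) => Model;

const trace: string[] = [];

// Builds a wrapper that records when it is entered and exited.
const tracing = (name: string): Wrapper => (inner) => (prompt) => {
  trace.push(name + ':before');
  const result = inner(prompt);
  trace.push(name + ':after');
  return result;
};

// Fold from the right so the first wrapper in the array ends up outermost.
function wrapAll(model: Model, wrappers: Wrapper[]): Model {
  return wrappers.reduceRight(
    (acc: Model, wrapper: Wrapper) => wrapper(acc),
    model,
  );
}

const baseModel: Model = (prompt) => {
  trace.push('model');
  return 'echo: ' + prompt;
};

const wrappedModel = wrapAll(baseModel, [tracing('first'), tracing('second')]);
const reply = wrappedModel('hi');
console.log(trace.join(' -> '));
```

<p>The first wrapper sees the request first and the response last, which is why ordering matters when, for example, a logger should observe what an inner middleware changed.</p>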

<h3 id="genkit-2">Genkit</h3>

<p>You pass an array to <code class="language-plaintext highlighter-rouge">use:</code> on every call. Order is the same onion model: outer middlewares wrap inner ones.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">response</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
  <span class="na">model</span><span class="p">:</span> <span class="nx">googleAI</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">gemini-pro-latest</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">prompt</span><span class="p">:</span> <span class="nx">userPrompt</span><span class="p">,</span>
  <span class="na">use</span><span class="p">:</span> <span class="p">[</span>
    <span class="nf">loggerMiddleware</span><span class="p">({</span> <span class="na">verbose</span><span class="p">:</span> <span class="kc">false</span> <span class="p">}),</span>
    <span class="nf">retry</span><span class="p">({</span> <span class="na">maxRetries</span><span class="p">:</span> <span class="mi">3</span> <span class="p">}),</span>
    <span class="nf">fallback</span><span class="p">({</span>
      <span class="na">models</span><span class="p">:</span> <span class="p">[</span><span class="nx">googleAI</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">gemini-flash-latest</span><span class="dl">'</span><span class="p">)],</span>
      <span class="na">statuses</span><span class="p">:</span> <span class="p">[</span><span class="dl">'</span><span class="s1">RESOURCE_EXHAUSTED</span><span class="dl">'</span><span class="p">],</span>
    <span class="p">}),</span>
    <span class="nf">toolApproval</span><span class="p">({</span> <span class="na">approved</span><span class="p">:</span> <span class="p">[</span><span class="dl">'</span><span class="s1">searchDocs</span><span class="dl">'</span><span class="p">]</span> <span class="p">}),</span>
  <span class="p">],</span>
<span class="p">});</span>
</code></pre></div></div>

<p>Composition is <strong>dynamic</strong>. You can change the stack per request based on user, tenant, A/B test, environment, etc. without rebuilding model instances.</p>

<h2 id="per-request-metadata">Per-request metadata</h2>

<p>A common need is to pass per-request context (user id, tenant, trace id…) into a middleware.</p>

<p><strong>Vercel AI SDK</strong> uses <code class="language-plaintext highlighter-rouge">providerOptions</code>, namespaced by the middleware name:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">export</span> <span class="kd">const</span> <span class="nx">yourLogMiddleware</span><span class="p">:</span> <span class="nx">LanguageModelV3Middleware</span> <span class="o">=</span> <span class="p">{</span>
  <span class="na">wrapGenerate</span><span class="p">:</span> <span class="k">async </span><span class="p">({</span> <span class="nx">doGenerate</span><span class="p">,</span> <span class="nx">params</span> <span class="p">})</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="nx">console</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="dl">'</span><span class="s1">METADATA</span><span class="dl">'</span><span class="p">,</span> <span class="nx">params</span><span class="p">?.</span><span class="nx">providerOptions</span><span class="p">?.</span><span class="nx">yourLogMiddleware</span><span class="p">);</span>
    <span class="k">return</span> <span class="nf">doGenerate</span><span class="p">();</span>
  <span class="p">},</span>
<span class="p">};</span>

<span class="k">await</span> <span class="nf">generateText</span><span class="p">({</span>
  <span class="na">model</span><span class="p">:</span> <span class="nf">wrapLanguageModel</span><span class="p">({</span> <span class="na">model</span><span class="p">:</span> <span class="dl">'</span><span class="s1">anthropic/claude-sonnet-4.5</span><span class="dl">'</span><span class="p">,</span> <span class="na">middleware</span><span class="p">:</span> <span class="nx">yourLogMiddleware</span> <span class="p">}),</span>
  <span class="na">prompt</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Invent a new holiday and describe its traditions.</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">providerOptions</span><span class="p">:</span> <span class="p">{</span>
    <span class="na">yourLogMiddleware</span><span class="p">:</span> <span class="p">{</span> <span class="na">hello</span><span class="p">:</span> <span class="dl">'</span><span class="s1">world</span><span class="dl">'</span> <span class="p">},</span>
  <span class="p">},</span>
<span class="p">});</span>
</code></pre></div></div>

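<p><strong>Genkit</strong> is more direct: middleware is invoked as a factory, so you pass config when you instantiate it for that call. The snippet below assumes <code class="language-plaintext highlighter-rouge">requestId</code> has been added to the middleware’s <code class="language-plaintext highlighter-rouge">configSchema</code>; the <code class="language-plaintext highlighter-rouge">loggerMiddleware</code> defined earlier only declares <code class="language-plaintext highlighter-rouge">verbose</code>:</p>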

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">use</span><span class="p">:</span> <span class="p">[</span><span class="nf">loggerMiddleware</span><span class="p">({</span> <span class="na">verbose</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span> <span class="na">requestId</span><span class="p">:</span> <span class="dl">'</span><span class="s1">abc-123</span><span class="dl">'</span> <span class="p">})],</span>
</code></pre></div></div>

<p>Both work, but the Genkit config is typed and validated against the Zod config schema, so a missing or mistyped field surfaces as a clear error, while Vercel’s <code class="language-plaintext highlighter-rouge">providerOptions</code> is more loosely typed.</p>

<h2 id="streaming">Streaming</h2>

<p>Vercel AI SDK has <strong>separate hooks</strong> for streaming and non-streaming, which is honest because the two have very different semantics. You almost always need to pipe the stream through a <code class="language-plaintext highlighter-rouge">TransformStream</code>.</p>

<p>Genkit folds streaming into the same <code class="language-plaintext highlighter-rouge">model</code> hook. The middleware sees the request and the response, and the underlying engine handles whether it was streamed. This is more ergonomic when your middleware doesn’t need per-chunk logic, but if you do need to inspect chunks (say, for guardrails on partial output), you’ll need to drop down to lower-level APIs.</p>
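<p>To show what “per-chunk logic” means in practice, here is a minimal sketch of a guard that checks the accumulated text after every delta. It is framework-independent: <code class="language-plaintext highlighter-rouge">createChunkGuard</code> is a hypothetical helper, and the banned-word check is only a placeholder for a real guardrail. In Vercel’s <code class="language-plaintext highlighter-rouge">wrapStream</code> it would be called from the <code class="language-plaintext highlighter-rouge">transform</code> callback for each <code class="language-plaintext highlighter-rouge">text-delta</code> chunk:</p>

```typescript
// Hypothetical helper: accumulates streamed text deltas and reports whether
// the text seen so far still passes a caller-supplied check.
function createChunkGuard(isAllowed: (textSoFar: string) => boolean) {
  let textSoFar = '';
  let blocked = false;

  return {
    // Returns true if this delta may be forwarded downstream.
    // Once a check fails, every later delta is rejected as well.
    accept(delta: string): boolean {
      if (blocked) return false;
      textSoFar += delta;
      blocked = !isAllowed(textSoFar);
      return !blocked;
    },
  };
}

// Placeholder policy: stop forwarding once the word SECRET appears.
const guard = createChunkGuard((text) => !text.includes('SECRET'));
const forwarded = ['The code ', 'is SEC', 'RET-42', ' ok'].filter((delta) =>
  guard.accept(delta),
);
console.log(forwarded);
```

<p>Note the limitation this exposes: the partial <code class="language-plaintext highlighter-rouge">SEC</code> was already forwarded before the violation became visible. A stricter guard would buffer a few characters before releasing them, which is exactly the kind of behavior that needs a chunk-level hook.</p>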

<h2 id="tools-and-agents">Tools and agents</h2>

<p>This is where the two systems diverge most.</p>

<p>In <strong>Vercel AI SDK</strong>, tools live above the middleware layer. The <code class="language-plaintext highlighter-rouge">LanguageModelV3Middleware</code> spec sits at the model level; agentic loops are handled by higher-level abstractions like <code class="language-plaintext highlighter-rouge">generateText</code>, <code class="language-plaintext highlighter-rouge">streamText</code> and the agent APIs (<code class="language-plaintext highlighter-rouge">prepareStep</code>, formerly <code class="language-plaintext highlighter-rouge">experimental_prepareStep</code>, etc.).</p>

<p>In <strong>Genkit</strong>, the <code class="language-plaintext highlighter-rouge">tool</code> and <code class="language-plaintext highlighter-rouge">generate</code> hooks make tool execution and the agent loop first-class targets for middleware. This is what enables clean implementations of:</p>

<ul>
  <li>Tool approval (<code class="language-plaintext highlighter-rouge">toolApproval</code>)</li>
  <li>Tool sandboxing (<code class="language-plaintext highlighter-rouge">filesystem</code>)</li>
  <li>Agent-level skill injection (<code class="language-plaintext highlighter-rouge">skills</code>)</li>
</ul>

<p>If you build a lot of agents with tool calling, Genkit’s middleware surface is genuinely more expressive.</p>

<h2 id="observability-and-tracing">Observability and tracing</h2>

<p>Both frameworks are observable, but in different ways:</p>

<ul>
  <li><strong>Genkit</strong> has a built-in Developer UI that shows traces of every <code class="language-plaintext highlighter-rouge">generate()</code> call, including middleware, tools, and nested flows. Middleware automatically participates in traces.</li>
  <li><strong>Vercel AI SDK</strong> is OpenTelemetry-friendly and integrates naturally with Vercel’s <code class="language-plaintext highlighter-rouge">@vercel/otel</code>, AI Gateway and Vercel Observability. You can wire your middleware into spans yourself.</li>
</ul>

<p>If “open the Dev UI and see what every middleware did” is high on your list, Genkit wins. If you already live inside the Vercel platform, the AI SDK plays beautifully with the rest of the stack.</p>

<h2 id="a-concrete-decision-guide">A concrete decision guide</h2>

<p>I would reach for <strong>Vercel AI SDK middleware</strong> when:</p>

<ul>
  <li>I want a very small, well-defined surface area: tweak params, wrap generate, wrap stream.</li>
  <li>I am bouncing between many model providers and need normalization (reasoning extraction, JSON cleanup, default settings).</li>
  <li>I prefer “configure the model once at startup” over “compose per call”.</li>
  <li>I am already deep in the Vercel ecosystem (Next.js, AI Gateway, Vercel Observability).</li>
  <li>Streaming-aware behavior with per-chunk transforms is a hard requirement.</li>
</ul>

<p>I would reach for <strong>Genkit middleware</strong> when:</p>

<ul>
  <li>I am building agents with tool calling, and I need tool approval, sandboxing or skill injection.</li>
  <li>I want production-grade <code class="language-plaintext highlighter-rouge">retry</code> and <code class="language-plaintext highlighter-rouge">fallback</code> behavior out of the box.</li>
  <li>I want to compose middleware <strong>per request</strong> based on user, tenant or experiment.</li>
  <li>I value the Developer UI and rich tracing over my entire pipeline.</li>
  <li>I want typed, validated middleware configuration via Zod.</li>
</ul>
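
<p>The “compose per request” point is easier to see in code. The following is a simplified, synchronous sketch in plain TypeScript, not Genkit’s API: <code class="language-plaintext highlighter-rouge">GenerateRequest</code>, <code class="language-plaintext highlighter-rouge">Middleware</code>, <code class="language-plaintext highlighter-rouge">compose</code>, <code class="language-plaintext highlighter-rouge">addPrefix</code> and <code class="language-plaintext highlighter-rouge">middlewareFor</code> are hypothetical names, and the “model” is just an echo function.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Hypothetical request and handler shapes, kept synchronous to stay short.
interface GenerateRequest {
  prompt: string;
  tenant: string;
}

interface GenerateHandler {
  (req: GenerateRequest): string;
}

// A middleware takes the next handler and returns a wrapped handler.
interface Middleware {
  (next: GenerateHandler): GenerateHandler;
}

// reduceRight makes the first middleware in the list the outermost wrapper.
function compose(middlewares: Middleware[], base: GenerateHandler): GenerateHandler {
  return middlewares.reduceRight(function wrap(next, middleware) {
    return middleware(next);
  }, base);
}

// Toy middleware: prepends a marker to the prompt before calling the next handler.
function addPrefix(prefix: string): Middleware {
  return function middleware(next: GenerateHandler) {
    return function handler(req: GenerateRequest) {
      return next({ prompt: `${prefix} ${req.prompt}`, tenant: req.tenant });
    };
  };
}

// The stack is chosen per request; here the "beta" tenant gets one extra middleware.
function middlewareFor(tenant: string): Middleware[] {
  const stack = [addPrefix("[safe]")];
  if (tenant === "beta") {
    stack.push(addPrefix("[experiment]"));
  }
  return stack;
}

// Stand-in for a model call: returns the prompt it received.
function echoModel(req: GenerateRequest) {
  return req.prompt;
}

function handle(req: GenerateRequest) {
  return compose(middlewareFor(req.tenant), echoModel)(req);
}
</code></pre></div></div>

<p>Real middleware would be asynchronous and would wrap an actual model call, but the decision of which stack to build still happens once per incoming request, which is the property this bullet is about.</p>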

<p>And honestly, in many real systems you might use both: Vercel AI SDK on the front-end / edge for chat UX, and Genkit on the backend for orchestration, tool calling and agentic flows. The middleware contracts are scoped, so you won’t get the kind of conflicts you might expect.</p>

<h2 id="conclusion">Conclusion</h2>

<p>Vercel AI SDK middleware and Genkit middleware are two well-thought-out answers to the same question, asked from different vantage points.</p>

<ul>
  <li>Vercel AI SDK treats middleware as a <strong>provider abstraction layer</strong>: you adapt and normalize what providers give you.</li>
  <li>Genkit treats middleware as a <strong>production-pipeline composition layer</strong>: you orchestrate models, tools and agentic loops with reusable building blocks.</li>
</ul>

<p>Pick the one that matches the shape of your problem, and don’t be afraid to mix them. The best part of 2026 in JS/TS Gen AI is that both are mature, both are open source, and both let you write your own middleware in a handful of lines.</p>

<p>Further reading:</p>

<ul>
  <li><a href="https://ai-sdk.dev/docs/ai-sdk-core/middleware">Vercel AI SDK — Language Model Middleware</a></li>
  <li><a href="https://genkit.dev/docs/js/middleware/">Genkit Middleware documentation</a></li>
  <li><a href="/genkit/2026-04-16-top-jsts-genai-frameworks-2026/">Top JS/TS Gen AI Frameworks for 2026</a></li>
  <li><a href="https://github.com/genkit-ai/genkit">Genkit GitHub repository</a></li>
  <li><a href="https://github.com/vercel/ai">Vercel AI SDK GitHub repository</a></li>
</ul>]]></content><author><name>Xavier Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><category term="genkit" /><category term="genkit" /><category term="vercel-ai-sdk" /><category term="middleware" /><category term="typescript" /><summary type="html"><![CDATA[A side-by-side comparison of the two leading middleware systems in the JS/TS Gen AI ecosystem: Vercel AI SDK's `wrapLanguageModel` and Genkit's `generateMiddleware`. APIs, mental model, built-ins, composition, observability and when to pick each.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://xavidop.me/assets/img/blog/post-headers/genkit-vs-vercel-middleware.png" /><media:content medium="image" url="https://xavidop.me/assets/img/blog/post-headers/genkit-vs-vercel-middleware.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry xml:lang="es"><title type="html">Introduccion a Genkit con JavaScript y TypeScript (Guia paso a paso)</title><link href="https://xavidop.me/genkit/2026-05-06-genkit-jsts-introduccion/" rel="alternate" type="text/html" title="Introduccion a Genkit con JavaScript y TypeScript (Guia paso a paso)" /><published>2026-05-06T00:00:00+00:00</published><updated>2026-05-06T04:23:37+00:00</updated><id>https://xavidop.me/genkit/genkit-jsts-introduccion</id><content type="html" xml:base="https://xavidop.me/genkit/2026-05-06-genkit-jsts-introduccion/"><![CDATA[<ol class="no_toc" id="markdown-toc">
  <li><a href="#introduccion" id="markdown-toc-introduccion">Introduction</a></li>
  <li><a href="#que-es-genkit-y-por-que-usarlo" id="markdown-toc-que-es-genkit-y-por-que-usarlo">What is Genkit and why use it</a></li>
  <li><a href="#prerrequisitos" id="markdown-toc-prerrequisitos">Prerequisites</a></li>
  <li><a href="#1-crear-el-proyecto-typescript" id="markdown-toc-1-crear-el-proyecto-typescript">1) Create the TypeScript project</a></li>
  <li><a href="#2-instalar-genkit" id="markdown-toc-2-instalar-genkit">2) Install Genkit</a></li>
  <li><a href="#3-configurar-la-api-key" id="markdown-toc-3-configurar-la-api-key">3) Configure the API key</a></li>
  <li><a href="#4-crear-tu-primer-flow-con-genkit" id="markdown-toc-4-crear-tu-primer-flow-con-genkit">4) Create your first flow with Genkit</a></li>
  <li><a href="#5-ejecutar-la-aplicacion" id="markdown-toc-5-ejecutar-la-aplicacion">5) Run the application</a></li>
  <li><a href="#6-probar-en-la-developer-ui" id="markdown-toc-6-probar-en-la-developer-ui">6) Try it in the Developer UI</a></li>
  <li><a href="#script-opcional-en-packagejson" id="markdown-toc-script-opcional-en-packagejson">Optional script in package.json</a></li>
  <li><a href="#errores-tipicos-al-empezar" id="markdown-toc-errores-tipicos-al-empezar">Common errors when getting started</a></li>
  <li><a href="#siguientes-pasos-recomendados" id="markdown-toc-siguientes-pasos-recomendados">Recommended next steps</a></li>
  <li><a href="#conclusiones" id="markdown-toc-conclusiones">Conclusions</a></li>
</ol>

<h2 id="introduccion">Introduction</h2>

<p>If you come from the Node.js ecosystem and want to build generative AI features without wrestling with too much infrastructure, <strong>Genkit</strong> is one of the best options right now.</p>

<p>In this guide you will create your first project with <strong>JavaScript/TypeScript</strong>, define a <strong>typed flow</strong>, run it locally, and try it in the <strong>Developer UI</strong>.</p>

<p>This introduction is based on the official Genkit guide for JS/TS:
<a href="https://genkit.dev/docs/js/get-started/">Get started with Genkit</a></p>

<h2 id="que-es-genkit-y-por-que-usarlo">What is Genkit and why use it</h2>

<p>Genkit is an open source framework for building AI applications, with a very practical development layer for the backend:</p>

<ol>
  <li>Define inputs and outputs with schemas (Zod).</li>
  <li>Create reusable, typed flows.</li>
  <li>Test everything locally with traces in the Developer UI.</li>
  <li>Switch providers or models with less friction.</li>
</ol>

<h2 id="prerrequisitos">Prerequisites</h2>

<p>Before you start, you need:</p>

<ol>
  <li>Node.js <strong>v20 or later</strong>.</li>
  <li>npm.</li>
  <li>A Gemini API key from <a href="https://aistudio.google.com/apikey">Google AI Studio</a>.</li>
</ol>

<h2 id="1-crear-el-proyecto-typescript">1) Create the TypeScript project</h2>

<p>From the terminal:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">mkdir </span>my-genkit-app
<span class="nb">cd </span>my-genkit-app

npm init <span class="nt">-y</span>
npm pkg <span class="nb">set type</span><span class="o">=</span>module

npm <span class="nb">install</span> <span class="nt">-D</span> typescript tsx
npx tsc <span class="nt">--init</span>

<span class="nb">mkdir </span>src
<span class="nb">touch </span>src/index.ts
</code></pre></div></div>

<p>With this you have the base of a TS project on Node.</p>

<h2 id="2-instalar-genkit">2) Install Genkit</h2>

<p>First install the CLI (required for the Developer UI):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm <span class="nb">install</span> <span class="nt">-g</span> genkit-cli
</code></pre></div></div>

<p>Then install the project dependencies:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm <span class="nb">install </span>genkit @genkit-ai/google-genai
</code></pre></div></div>

<h2 id="3-configurar-la-api-key">3) Configure the API key</h2>

<p>Export the environment variable in your shell:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">export </span><span class="nv">GEMINI_API_KEY</span><span class="o">=</span>&lt;your_api_key&gt;
</code></pre></div></div>

<p>If you use <code class="language-plaintext highlighter-rouge">zsh</code>, you can also add it to your <code class="language-plaintext highlighter-rouge">.zshrc</code> so you do not have to repeat this step every time.</p>

<h2 id="4-crear-tu-primer-flow-con-genkit">4) Create your first flow with Genkit</h2>

<p>Edit <code class="language-plaintext highlighter-rouge">src/index.ts</code> and paste in this example:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">googleAI</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@genkit-ai/google-genai</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">genkit</span><span class="p">,</span> <span class="nx">z</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkit</span><span class="dl">'</span><span class="p">;</span>

<span class="c1">// Initialize Genkit with the Google AI plugin and a default model.</span>
<span class="kd">const</span> <span class="nx">ai</span> <span class="o">=</span> <span class="nf">genkit</span><span class="p">({</span>
  <span class="na">plugins</span><span class="p">:</span> <span class="p">[</span><span class="nf">googleAI</span><span class="p">()],</span>
  <span class="na">model</span><span class="p">:</span> <span class="nx">googleAI</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">gemini-3-pro</span><span class="dl">'</span><span class="p">,</span> <span class="p">{</span>
    <span class="na">temperature</span><span class="p">:</span> <span class="mf">0.8</span><span class="p">,</span>
  <span class="p">}),</span>
<span class="p">});</span>

<span class="kd">const</span> <span class="nx">RecipeInputSchema</span> <span class="o">=</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
  <span class="na">ingredient</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">describe</span><span class="p">(</span><span class="dl">'</span><span class="s1">Main ingredient or cuisine type</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">dietaryRestrictions</span><span class="p">:</span> <span class="nx">z</span>
    <span class="p">.</span><span class="nf">string</span><span class="p">()</span>
    <span class="p">.</span><span class="nf">optional</span><span class="p">()</span>
    <span class="p">.</span><span class="nf">describe</span><span class="p">(</span><span class="dl">'</span><span class="s1">Dietary restrictions, if any</span><span class="dl">'</span><span class="p">),</span>
<span class="p">});</span>

<span class="kd">const</span> <span class="nx">RecipeSchema</span> <span class="o">=</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
  <span class="na">title</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="na">description</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="na">prepTime</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="na">cookTime</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="na">servings</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">number</span><span class="p">(),</span>
  <span class="na">ingredients</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">()),</span>
  <span class="na">instructions</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">()),</span>
  <span class="na">tips</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">()).</span><span class="nf">optional</span><span class="p">(),</span>
<span class="p">});</span>

<span class="k">export</span> <span class="kd">const</span> <span class="nx">recipeGeneratorFlow</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">defineFlow</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">recipeGeneratorFlow</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">inputSchema</span><span class="p">:</span> <span class="nx">RecipeInputSchema</span><span class="p">,</span>
    <span class="na">outputSchema</span><span class="p">:</span> <span class="nx">RecipeSchema</span><span class="p">,</span>
  <span class="p">},</span>
  <span class="k">async </span><span class="p">(</span><span class="nx">input</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="nx">prompt</span> <span class="o">=</span> <span class="s2">`Create a recipe with the following requirements:\nMain ingredient: </span><span class="p">${</span><span class="nx">input</span><span class="p">.</span><span class="nx">ingredient</span><span class="p">}</span><span class="s2">\nDietary restrictions: </span><span class="p">${</span><span class="nx">input</span><span class="p">.</span><span class="nx">dietaryRestrictions</span> <span class="o">||</span> <span class="dl">'</span><span class="s1">none</span><span class="dl">'</span><span class="p">}</span><span class="s2">`</span><span class="p">;</span>

    <span class="kd">const</span> <span class="p">{</span> <span class="nx">output</span> <span class="p">}</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
      <span class="nx">prompt</span><span class="p">,</span>
      <span class="na">output</span><span class="p">:</span> <span class="p">{</span> <span class="na">schema</span><span class="p">:</span> <span class="nx">RecipeSchema</span> <span class="p">},</span>
    <span class="p">});</span>

    <span class="k">if </span><span class="p">(</span><span class="o">!</span><span class="nx">output</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">throw</span> <span class="k">new</span> <span class="nc">Error</span><span class="p">(</span><span class="dl">'</span><span class="s1">Failed to generate recipe</span><span class="dl">'</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="k">return</span> <span class="nx">output</span><span class="p">;</span>
  <span class="p">}</span>
<span class="p">);</span>

<span class="k">async</span> <span class="kd">function</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
  <span class="kd">const</span> <span class="nx">recipe</span> <span class="o">=</span> <span class="k">await</span> <span class="nf">recipeGeneratorFlow</span><span class="p">({</span>
    <span class="na">ingredient</span><span class="p">:</span> <span class="dl">'</span><span class="s1">aguacate</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">dietaryRestrictions</span><span class="p">:</span> <span class="dl">'</span><span class="s1">vegetariana</span><span class="dl">'</span><span class="p">,</span>
  <span class="p">});</span>

  <span class="nx">console</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="nx">recipe</span><span class="p">);</span>
<span class="p">}</span>

<span class="nf">main</span><span class="p">().</span><span class="k">catch</span><span class="p">(</span><span class="nx">console</span><span class="p">.</span><span class="nx">error</span><span class="p">);</span>
</code></pre></div></div>

<p>This example shows three key Genkit concepts:</p>

<ol>
  <li><strong>Input/output schemas with Zod</strong> to avoid ambiguous responses.</li>
  <li><strong>Reusable flow</strong> that you can later expose as an API.</li>
  <li><strong>Structured output</strong> from the model (not just free text).</li>
</ol>

<h2 id="5-ejecutar-la-aplicacion">5) Run the application</h2>

<p>Run the example:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npx tsx src/index.ts
</code></pre></div></div>

<p>If all goes well, you will see a JSON object with the generated recipe in the console.</p>

<h2 id="6-probar-en-la-developer-ui">6) Try it in the Developer UI</h2>

<p>Start the Genkit development UI:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>genkit start <span class="nt">--</span> npx tsx <span class="nt">--watch</span> src/index.ts
</code></pre></div></div>

<p>By default it will be available at:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">http://localhost:4000</code></li>
</ul>

<p>Inside the UI:</p>

<ol>
  <li>Select <code class="language-plaintext highlighter-rouge">recipeGeneratorFlow</code>.</li>
  <li>Enter an input like this one:</li>
</ol>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"ingredient"</span><span class="p">:</span><span class="w"> </span><span class="s2">"aguacate"</span><span class="p">,</span><span class="w">
  </span><span class="nl">"dietaryRestrictions"</span><span class="p">:</span><span class="w"> </span><span class="s2">"vegetariana"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<ol start="3">
  <li>Click <code class="language-plaintext highlighter-rouge">Run</code>.</li>
</ol>

<p>You will see the structured output and the execution trace, which help you debug prompts and timings.</p>

<h2 id="script-opcional-en-packagejson">Optional script in package.json</h2>

<p>So you do not have to remember the long command, you can add this script:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"scripts"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"genkit:ui"</span><span class="p">:</span><span class="w"> </span><span class="s2">"genkit start -- npx tsx --watch src/index.ts"</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>And run it with:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm run genkit:ui
</code></pre></div></div>

<h2 id="errores-tipicos-al-empezar">Common errors when getting started</h2>

<ol>
  <li><code class="language-plaintext highlighter-rouge">GEMINI_API_KEY</code> not set.</li>
  <li>Using Node &lt; 20.</li>
  <li>Trying to run the UI without <code class="language-plaintext highlighter-rouge">genkit-cli</code> installed globally.</li>
</ol>

<h2 id="siguientes-pasos-recomendados">Recommended next steps</h2>

<p>Once you have this first flow working, I recommend continuing in this order:</p>

<ol>
  <li><a href="https://genkit.dev/docs/js/devtools/">Developer tools</a></li>
  <li><a href="https://genkit.dev/docs/js/models/">Generating content</a></li>
  <li><a href="https://genkit.dev/docs/js/flows/">Creating flows</a></li>
  <li><a href="https://genkit.dev/docs/js/tool-calling/">Tool calling</a></li>
  <li><a href="https://genkit.dev/docs/js/dotprompt/">Dotprompt</a></li>
</ol>

<h2 id="conclusiones">Conclusions</h2>

<p>Genkit in JS/TS lets you go from idea to working prototype in very little time, while keeping the code clean, typed, and easy to evolve.</p>

<p>If you work in Node.js and want to build AI features seriously (but without over-engineering), this stack is a very solid bet.</p>]]></content><author><name>Xavier Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><category term="genkit" /><category term="genkit" /><category term="javascript" /><category term="typescript" /><category term="gemini" /><category term="gcp" /><summary type="html"><![CDATA[Aprende a empezar con Genkit en JS/TS desde cero: instalacion, primer flow, ejecucion local y uso de la Developer UI.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://xavidop.me/assets/img/blog/post-headers/genkit-jsts-introduccion.png" /><media:content medium="image" url="https://xavidop.me/assets/img/blog/post-headers/genkit-jsts-introduccion.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry xml:lang="en"><title type="html">Stop Using Python for Your Gen AI Apps, Use Go and Genkit Instead (English)</title><link href="https://xavidop.me/genkit/2026-05-04-stop-using-python-genai-use-genkit-go/" rel="alternate" type="text/html" title="Stop Using Python for Your Gen AI Apps, Use Go and Genkit Instead (English)" /><published>2026-05-04T00:00:00+00:00</published><updated>2026-05-05T15:04:11+00:00</updated><id>https://xavidop.me/genkit/stop-using-python-genai-use-genkit-go</id><content type="html" xml:base="https://xavidop.me/genkit/2026-05-04-stop-using-python-genai-use-genkit-go/"><![CDATA[<ol class="no_toc" id="markdown-toc">
  <li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li>
  <li><a href="#why-python-hurts-in-production-gen-ai" id="markdown-toc-why-python-hurts-in-production-gen-ai">Why Python Hurts in Production Gen AI</a>    <ol>
      <li><a href="#concurrency-is-a-constant-fight" id="markdown-toc-concurrency-is-a-constant-fight">Concurrency is a constant fight</a></li>
      <li><a href="#cold-starts-and-memory-footprint" id="markdown-toc-cold-starts-and-memory-footprint">Cold starts and memory footprint</a></li>
      <li><a href="#dependency-hell-is-worse-for-ai" id="markdown-toc-dependency-hell-is-worse-for-ai">Dependency hell is worse for AI</a></li>
      <li><a href="#types-are-optional-and-it-shows" id="markdown-toc-types-are-optional-and-it-shows">Types are optional, and it shows</a></li>
      <li><a href="#deployment-is-a-packaging-exercise" id="markdown-toc-deployment-is-a-packaging-exercise">Deployment is a packaging exercise</a></li>
      <li><a href="#the-performance-ceiling-is-real" id="markdown-toc-the-performance-ceiling-is-real">The performance ceiling is real</a></li>
    </ol>
  </li>
  <li><a href="#go-is-the-best-language-for-agentic-coders" id="markdown-toc-go-is-the-best-language-for-agentic-coders">Go Is the Best Language for Agentic Coders</a>    <ol>
      <li><a href="#strong-static-typing-closes-the-feedback-loop" id="markdown-toc-strong-static-typing-closes-the-feedback-loop">Strong, static typing closes the feedback loop</a></li>
      <li><a href="#there-is-usually-one-obvious-way-to-do-something" id="markdown-toc-there-is-usually-one-obvious-way-to-do-something">There is usually one obvious way to do something</a></li>
      <li><a href="#tooling-is-built-for-machines-not-just-humans" id="markdown-toc-tooling-is-built-for-machines-not-just-humans">Tooling is built for machines, not just humans</a></li>
      <li><a href="#and-then-genkit-go-takes-it-one-level-further" id="markdown-toc-and-then-genkit-go-takes-it-one-level-further">And then Genkit Go takes it one level further</a></li>
    </ol>
  </li>
  <li><a href="#why-genkit-go-specifically" id="markdown-toc-why-genkit-go-specifically">Why Genkit Go Specifically</a></li>
  <li><a href="#what-we-are-going-to-build" id="markdown-toc-what-we-are-going-to-build">What We Are Going to Build</a></li>
  <li><a href="#prerequisites" id="markdown-toc-prerequisites">Prerequisites</a>    <ol>
      <li><a href="#install-the-genkit-cli" id="markdown-toc-install-the-genkit-cli">Install the Genkit CLI</a></li>
    </ol>
  </li>
  <li><a href="#set-up-the-project" id="markdown-toc-set-up-the-project">Set Up the Project</a></li>
  <li><a href="#the-code-a-single-maingo" id="markdown-toc-the-code-a-single-maingo">The Code: a Single <code class="language-plaintext highlighter-rouge">main.go</code></a></li>
  <li><a href="#run-it" id="markdown-toc-run-it">Run It</a></li>
  <li><a href="#test-it-visually-with-the-developer-ui" id="markdown-toc-test-it-visually-with-the-developer-ui">Test It Visually with the Developer UI</a></li>
  <li><a href="#deploying-it" id="markdown-toc-deploying-it">Deploying It</a></li>
  <li><a href="#but-what-about" id="markdown-toc-but-what-about">“But What About…”</a></li>
  <li><a href="#wrapping-up" id="markdown-toc-wrapping-up">Wrapping Up</a></li>
  <li><a href="#references" id="markdown-toc-references">References</a></li>
</ol>

<h2 id="introduction">Introduction</h2>

<p>For the last few years, every Gen AI tutorial, framework, and “hello world” has assumed one thing: that you are writing Python. It made sense at the start. The research community lives in Python, the model providers ship Python SDKs first, and the notebook culture is hard to beat for prototyping. But there is a quiet, important shift happening in 2026: the teams actually shipping AI features at scale are increasingly moving their <strong>production</strong> Gen AI workloads off Python, and onto languages built for services.</p>

<p>Go is at the center of that shift. And <strong><a href="https://genkit.dev/docs/go/get-started/">Genkit Go</a></strong>, the Go flavor of Google’s open-source Gen AI framework, is the cleanest path I have seen to build production-ready AI services in Go: typed flows, structured output, built-in HTTP serving, observability, and a Developer UI, all from a single binary.</p>

<p>This article is two things at once. First, an honest argument about why Python is a poor fit for production Gen AI services. Second, a hands-on getting-started with Genkit Go so you can replace that Python microservice this week.</p>

<h2 id="why-python-hurts-in-production-gen-ai">Why Python Hurts in Production Gen AI</h2>

<p>Python is great for research and prototyping. But Gen AI applications are not really “AI code”; they are mostly <strong>I/O-heavy network services</strong> that happen to call a model. And that is exactly where Python struggles.</p>

<h3 id="concurrency-is-a-constant-fight">Concurrency is a constant fight</h3>

<p>Gen AI workloads are dominated by long, concurrent network calls: streaming completions, tool calls, embedding requests, vector DB lookups, MCP servers. Go’s goroutines and channels were literally designed for this. In Python you have a choice between three uncomfortable options: threads (limited by the GIL), <code class="language-plaintext highlighter-rouge">asyncio</code> (which infects your entire codebase and breaks the moment one library is sync), or multiprocessing (heavy, awkward, and unfriendly to shared state). None of them feel native. All of them leak through your abstractions.</p>

<h3 id="cold-starts-and-memory-footprint">Cold starts and memory footprint</h3>

<p>A Python AI service typically pulls in <code class="language-plaintext highlighter-rouge">pydantic</code>, <code class="language-plaintext highlighter-rouge">httpx</code>, an SDK or two, and a tokenizer. You are easily looking at 200 to 400 MB of resident memory and several seconds of cold start before you serve a single request. A Go binary doing the same job is one statically linked file, tens of MB of RAM, and starts in milliseconds. On Cloud Run, Lambda, Azure Functions, or any autoscaling platform, this difference is not a micro-optimization; it is the difference between a service that scales to zero gracefully and one that does not.</p>

<h3 id="dependency-hell-is-worse-for-ai">Dependency hell is worse for AI</h3>

<p><code class="language-plaintext highlighter-rouge">pip</code>, <code class="language-plaintext highlighter-rouge">poetry</code>, <code class="language-plaintext highlighter-rouge">uv</code>, <code class="language-plaintext highlighter-rouge">conda</code>, <code class="language-plaintext highlighter-rouge">venv</code>, <code class="language-plaintext highlighter-rouge">requirements.txt</code>, <code class="language-plaintext highlighter-rouge">pyproject.toml</code>. Pin a Torch version, break a transitive dep. Upgrade an SDK, break Pydantic v1 vs v2. Every Python AI repo I have inherited has spent at least a day fixing the environment before running a single prompt. Go’s module system, with a single <code class="language-plaintext highlighter-rouge">go.mod</code> and <code class="language-plaintext highlighter-rouge">go.sum</code>, is boring, reproducible, and just works.</p>

<h3 id="types-are-optional-and-it-shows">Types are optional, and it shows</h3>

<p>Structured output, tool calling, and MCP all rely on <strong>schemas</strong>. In Python, the schema lives in Pydantic models, in docstrings, in comments, and sometimes in your head. In Go, the schema <strong>is</strong> the struct. The compiler enforces it. Genkit picks it up automatically via JSON schema tags. A flow whose code does not match its declared input and output structs will not compile.</p>

<h3 id="deployment-is-a-packaging-exercise">Deployment is a packaging exercise</h3>

<p>Python deployments are Dockerfiles full of system packages, base images that drift, and “works on my machine” surprises. Go deploys as a single static binary. <code class="language-plaintext highlighter-rouge">FROM scratch</code>, copy the binary, done. For AI services that need to run on Cloud Run, on Kubernetes, on the edge, or as a sidecar, that is a massive operational win.</p>

<h3 id="the-performance-ceiling-is-real">The performance ceiling is real</h3>

<p>Yes, the heavy lifting happens on the model provider’s GPUs. But your service still has to parse tokens off a streaming response, fan out tool calls, merge results, enforce timeouts, and push telemetry, <strong>per request, at concurrency</strong>. Go does that work an order of magnitude more efficiently than CPython, and without you having to think about it.</p>

<blockquote>
  <p>None of this means Python is wrong for <strong>research</strong>. It means Python is the wrong default for the <strong>service</strong> that exposes that research to your users.</p>
</blockquote>

<h2 id="go-is-the-best-language-for-agentic-coders">Go Is the Best Language for Agentic Coders</h2>

<p>There is one more reason to pick Go in 2026 that did not really exist two years ago: <strong>agentic coders</strong>. Tools like Claude Code, Cursor’s agent mode, GitHub Copilot’s agent, Gemini Code Assist, Codex, Aider, and the growing ecosystem of autonomous coding agents are now a real part of how software gets written. And it turns out that <strong>Go is the language they thrive in</strong>.</p>

<p>Why? It comes down to three properties of the language that align almost perfectly with how an LLM-based agent reasons about code:</p>

<h3 id="strong-static-typing-closes-the-feedback-loop">Strong, static typing closes the feedback loop</h3>

<p>Agentic coders work in a tight loop: write code, compile, read the error, fix, repeat. Go’s compiler is fast, strict, and brutally honest. When an agent generates a wrong call, the compiler tells it exactly what is wrong and where, in seconds. In Python, the same mistake might only surface at runtime, three layers deep, with a stack trace that requires the agent to spend tokens reasoning about dynamic behavior. Strong typing turns “guess and pray” into “verify and continue”.</p>

<h3 id="there-is-usually-one-obvious-way-to-do-something">There is usually one obvious way to do something</h3>

<p>Python has at least four HTTP clients, three async paradigms, two type systems, and an opinion war about every major design decision. An agent has to choose, and choices cost tokens and increase the chance of going off the rails. Go is famously opinionated: one formatter (<code class="language-plaintext highlighter-rouge">gofmt</code>), one module system, one idiomatic way to handle errors, one standard layout. Less surface area means less ambiguity, which means <strong>less token consumption and more correct code per iteration</strong>.</p>

<h3 id="tooling-is-built-for-machines-not-just-humans">Tooling is built for machines, not just humans</h3>

<p><code class="language-plaintext highlighter-rouge">go build</code>, <code class="language-plaintext highlighter-rouge">go vet</code>, <code class="language-plaintext highlighter-rouge">gopls</code>, and <code class="language-plaintext highlighter-rouge">staticcheck</code> report problems with precise file, line, and column positions in a consistent format, and <code class="language-plaintext highlighter-rouge">go test -json</code> emits machine-readable events. Agents can read both directly without heuristics. Combine that with <code class="language-plaintext highlighter-rouge">go doc</code> and the standard library being uniformly documented, and you give an agent a self-describing environment it can navigate without hallucinating.</p>
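<p>As an illustration of how little glue that takes, the sketch below extracts failing tests from a <code class="language-plaintext highlighter-rouge">go test -json</code> stream. The event fields used (<code class="language-plaintext highlighter-rouge">Action</code>, <code class="language-plaintext highlighter-rouge">Package</code>, <code class="language-plaintext highlighter-rouge">Test</code>) are a subset of what the tool emits; the <code class="language-plaintext highlighter-rouge">failedTests</code> helper and the sample events are hand-written for this example.</p>

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// testEvent mirrors a subset of the fields emitted by `go test -json`.
type testEvent struct {
	Action  string
	Package string
	Test    string
}

// failedTests scans a `go test -json` stream (one JSON object per line) and
// returns "package.Test" for every test-level event whose action is "fail".
// Package-level fail events carry no test name and are skipped.
func failedTests(stream string) ([]string, error) {
	var failed []string
	sc := bufio.NewScanner(strings.NewReader(stream))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		var ev testEvent
		if err := json.Unmarshal([]byte(line), &ev); err != nil {
			return nil, fmt.Errorf("bad event %q: %w", line, err)
		}
		if ev.Action == "fail" && ev.Test != "" {
			failed = append(failed, ev.Package+"."+ev.Test)
		}
	}
	return failed, sc.Err()
}

func main() {
	// Hand-written sample events, not captured from a real run.
	sample := `{"Action":"run","Package":"example/recipes","Test":"TestParse"}
{"Action":"fail","Package":"example/recipes","Test":"TestParse"}
{"Action":"pass","Package":"example/recipes","Test":"TestPrompt"}
{"Action":"fail","Package":"example/recipes"}`
	fmt.Println(failedTests(sample))
}
```

<p>An agent that gets back a precise list of failing test names can go straight to the relevant code instead of spending tokens interpreting free-form output.</p>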

<h3 id="and-then-genkit-go-takes-it-one-level-further">And then Genkit Go takes it one level further</h3>

<p>Genkit Go leans into the same properties:</p>

<ul>
  <li>Flow inputs and outputs are <strong>Go structs</strong>: the schema is the type. An agent generating a new flow knows exactly what shape the data has, because the compiler will reject anything else.</li>
  <li>The API surface is small and consistent: <code class="language-plaintext highlighter-rouge">genkit.Init</code>, <code class="language-plaintext highlighter-rouge">genkit.DefineFlow</code>, <code class="language-plaintext highlighter-rouge">genkit.DefineTool</code>, <code class="language-plaintext highlighter-rouge">genkit.GenerateData</code>, <code class="language-plaintext highlighter-rouge">genkit.Handler</code>. There is one obvious way to define a flow, one obvious way to expose it, one obvious way to call a model.</li>
  <li>Tool definitions are typed end-to-end, so an agent writing a new tool gets compile-time guarantees that its signature matches what the runtime expects.</li>
</ul>

<p>The net effect is that an agentic coder pointed at a Genkit Go codebase will produce <strong>more correct code, in fewer iterations, with fewer tokens</strong> than the same agent pointed at an equivalent Python codebase. In a world where you are increasingly going to be the reviewer of agent-generated code rather than the author, that compounds fast.</p>

<h2 id="why-genkit-go-specifically">Why Genkit Go Specifically</h2>

<p>If you accept the premise that Go is the better runtime for Gen AI services, the next question is: which framework? You can absolutely call the Gemini, OpenAI, or Anthropic SDKs directly from Go. But you will quickly end up rebuilding the same primitives every Genkit user already has for free.</p>

<p>Here is what Genkit Go gives you out of the box, and what you would otherwise have to write yourself:</p>

<table>
  <thead>
    <tr>
      <th>Feature</th>
      <th>Without Genkit</th>
      <th>With Genkit Go</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Call a model</td>
      <td>Hand-rolled HTTP client per provider, manual JSON, manual streaming</td>
      <td><code class="language-plaintext highlighter-rouge">genkit.Generate(...)</code>, one call, multi-provider</td>
    </tr>
    <tr>
      <td>Structured output</td>
      <td>Parse raw JSON, custom unmarshaling, validate by hand</td>
      <td><code class="language-plaintext highlighter-rouge">genkit.GenerateData[MyStruct]</code>, typed Go struct returned</td>
    </tr>
    <tr>
      <td>Expose as an API</td>
      <td><code class="language-plaintext highlighter-rouge">net/http</code> boilerplate per endpoint, request/response wiring</td>
      <td><code class="language-plaintext highlighter-rouge">genkit.Handler(flow)</code>, auto HTTP endpoint</td>
    </tr>
    <tr>
      <td>Tool calling</td>
      <td>Parse function call payloads, dispatch, re-submit</td>
      <td><code class="language-plaintext highlighter-rouge">genkit.DefineTool(...)</code>, automatic execution loop</td>
    </tr>
    <tr>
      <td>Observability</td>
      <td>Wire OpenTelemetry, define spans, ship metrics</td>
      <td>Built-in tracing, metrics, latency, zero config</td>
    </tr>
    <tr>
      <td>Local dev</td>
      <td><code class="language-plaintext highlighter-rouge">curl</code>, Postman, manual harnesses</td>
      <td><strong>Genkit Developer UI</strong>, visual flow runner, traces, prompt playground</td>
    </tr>
    <tr>
      <td>Multi-provider</td>
      <td>Different SDKs, different auth, different schemas</td>
      <td>Unified plugin interface (Google AI, Vertex, OpenAI, Anthropic, Bedrock, Azure, Ollama, …)</td>
    </tr>
  </tbody>
</table>

<p>It is the same philosophy as <a href="/genkit/gcp/2026-02-10-genkit-java-101/">Genkit Java</a> and the JavaScript flavor I covered in <a href="/genkit/2026-04-16-top-jsts-genai-frameworks-2026/">my 2026 JS/TS Gen AI frameworks comparison</a>: a thin, opinionated, cloud-agnostic layer that turns “AI logic” into a <strong>typed function</strong> you can call, test, deploy, and observe.</p>

<h2 id="what-we-are-going-to-build">What We Are Going to Build</h2>

<p>A Go service exposing a single AI flow that generates a structured <strong>recipe</strong> from a main ingredient and optional dietary restrictions. It will:</p>

<ul>
  <li>Accept a typed <code class="language-plaintext highlighter-rouge">RecipeInput</code> as input.</li>
  <li>Call <strong>Gemini 3 Pro</strong> via the Google AI plugin.</li>
  <li>Return a strongly-typed <code class="language-plaintext highlighter-rouge">Recipe</code> struct (no manual JSON parsing).</li>
  <li>Be served as an HTTP endpoint on <code class="language-plaintext highlighter-rouge">:3400</code>.</li>
  <li>Be testable visually in the <strong>Genkit Developer UI</strong>.</li>
</ul>

<p>All in <strong>a single <code class="language-plaintext highlighter-rouge">main.go</code> file</strong>. No web framework. No code generation. Just Go.</p>

<h2 id="prerequisites">Prerequisites</h2>

<ul>
  <li><strong>Go 1.24+</strong> (<a href="https://go.dev/doc/install">install</a>)</li>
  <li><strong>Node.js 18+</strong> (only required for the Genkit CLI / Developer UI)</li>
  <li>A <strong>Google GenAI API key</strong> (free, no credit card, from <a href="https://aistudio.google.com/apikey">Google AI Studio</a>)</li>
</ul>

<h3 id="install-the-genkit-cli">Install the Genkit CLI</h3>

<p>The Genkit CLI is your local companion for running and inspecting flows in the Developer UI:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl <span class="nt">-sL</span> cli.genkit.dev | bash
</code></pre></div></div>

<p>Verify it:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>genkit <span class="nt">--version</span>
</code></pre></div></div>

<h2 id="set-up-the-project">Set Up the Project</h2>

<p>Create a fresh module:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">mkdir </span>genkit-go-recipes <span class="o">&amp;&amp;</span> <span class="nb">cd </span>genkit-go-recipes
go mod init example/genkit-go-recipes
</code></pre></div></div>

<p>Install the Genkit Go package:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>go get github.com/firebase/genkit/go
</code></pre></div></div>

<p>Set your API key:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">export </span><span class="nv">GEMINI_API_KEY</span><span class="o">=</span>&lt;your API key&gt;
</code></pre></div></div>

<h2 id="the-code-a-single-maingo">The Code: a Single <code class="language-plaintext highlighter-rouge">main.go</code></h2>

<p>Create <code class="language-plaintext highlighter-rouge">main.go</code> with the following content. This is the entire service.</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">package</span> <span class="n">main</span>

<span class="k">import</span> <span class="p">(</span>
    <span class="s">"context"</span>
    <span class="s">"encoding/json"</span>
    <span class="s">"fmt"</span>
    <span class="s">"log"</span>
    <span class="s">"net/http"</span>

    <span class="s">"github.com/firebase/genkit/go/ai"</span>
    <span class="s">"github.com/firebase/genkit/go/genkit"</span>
    <span class="s">"github.com/firebase/genkit/go/plugins/googlegenai"</span>
    <span class="s">"github.com/firebase/genkit/go/plugins/server"</span>
<span class="p">)</span>

<span class="c">// Input schema, picked up automatically by Genkit and the Dev UI.</span>
<span class="k">type</span> <span class="n">RecipeInput</span> <span class="k">struct</span> <span class="p">{</span>
    <span class="n">Ingredient</span>          <span class="kt">string</span> <span class="s">`json:"ingredient" jsonschema:"description=Main ingredient or cuisine type"`</span>
    <span class="n">DietaryRestrictions</span> <span class="kt">string</span> <span class="s">`json:"dietaryRestrictions,omitempty" jsonschema:"description=Any dietary restrictions"`</span>
<span class="p">}</span>

<span class="c">// Output schema, returned directly by the model as a typed Go struct.</span>
<span class="k">type</span> <span class="n">Recipe</span> <span class="k">struct</span> <span class="p">{</span>
    <span class="n">Title</span>        <span class="kt">string</span>   <span class="s">`json:"title"`</span>
    <span class="n">Description</span>  <span class="kt">string</span>   <span class="s">`json:"description"`</span>
    <span class="n">PrepTime</span>     <span class="kt">string</span>   <span class="s">`json:"prepTime"`</span>
    <span class="n">CookTime</span>     <span class="kt">string</span>   <span class="s">`json:"cookTime"`</span>
    <span class="n">Servings</span>     <span class="kt">int</span>      <span class="s">`json:"servings"`</span>
    <span class="n">Ingredients</span>  <span class="p">[]</span><span class="kt">string</span> <span class="s">`json:"ingredients"`</span>
    <span class="n">Instructions</span> <span class="p">[]</span><span class="kt">string</span> <span class="s">`json:"instructions"`</span>
    <span class="n">Tips</span>         <span class="p">[]</span><span class="kt">string</span> <span class="s">`json:"tips,omitempty"`</span>
<span class="p">}</span>

<span class="k">func</span> <span class="n">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">ctx</span> <span class="o">:=</span> <span class="n">context</span><span class="o">.</span><span class="n">Background</span><span class="p">()</span>

    <span class="c">// Initialize Genkit with the Google AI plugin and a default model.</span>
    <span class="n">g</span> <span class="o">:=</span> <span class="n">genkit</span><span class="o">.</span><span class="n">Init</span><span class="p">(</span><span class="n">ctx</span><span class="p">,</span>
        <span class="n">genkit</span><span class="o">.</span><span class="n">WithPlugins</span><span class="p">(</span><span class="o">&amp;</span><span class="n">googlegenai</span><span class="o">.</span><span class="n">GoogleAI</span><span class="p">{}),</span>
        <span class="n">genkit</span><span class="o">.</span><span class="n">WithDefaultModel</span><span class="p">(</span><span class="s">"googleai/gemini-3-pro"</span><span class="p">),</span>
    <span class="p">)</span>

    <span class="c">// Define a typed flow: (*RecipeInput) -&gt; (*Recipe, error)</span>
    <span class="n">recipeGeneratorFlow</span> <span class="o">:=</span> <span class="n">genkit</span><span class="o">.</span><span class="n">DefineFlow</span><span class="p">(</span><span class="n">g</span><span class="p">,</span> <span class="s">"recipeGeneratorFlow"</span><span class="p">,</span>
        <span class="k">func</span><span class="p">(</span><span class="n">ctx</span> <span class="n">context</span><span class="o">.</span><span class="n">Context</span><span class="p">,</span> <span class="n">input</span> <span class="o">*</span><span class="n">RecipeInput</span><span class="p">)</span> <span class="p">(</span><span class="o">*</span><span class="n">Recipe</span><span class="p">,</span> <span class="kt">error</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">dietary</span> <span class="o">:=</span> <span class="n">input</span><span class="o">.</span><span class="n">DietaryRestrictions</span>
            <span class="k">if</span> <span class="n">dietary</span> <span class="o">==</span> <span class="s">""</span> <span class="p">{</span>
                <span class="n">dietary</span> <span class="o">=</span> <span class="s">"none"</span>
            <span class="p">}</span>

            <span class="n">prompt</span> <span class="o">:=</span> <span class="n">fmt</span><span class="o">.</span><span class="n">Sprintf</span><span class="p">(</span><span class="s">`Create a recipe with the following requirements:
                Main ingredient: %s
                Dietary restrictions: %s`</span><span class="p">,</span> <span class="n">input</span><span class="o">.</span><span class="n">Ingredient</span><span class="p">,</span> <span class="n">dietary</span><span class="p">)</span>

            <span class="c">// Structured generation: Gemini returns a Recipe directly.</span>
            <span class="n">recipe</span><span class="p">,</span> <span class="n">_</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">genkit</span><span class="o">.</span><span class="n">GenerateData</span><span class="p">[</span><span class="n">Recipe</span><span class="p">](</span><span class="n">ctx</span><span class="p">,</span> <span class="n">g</span><span class="p">,</span>
                <span class="n">ai</span><span class="o">.</span><span class="n">WithPrompt</span><span class="p">(</span><span class="n">prompt</span><span class="p">),</span>
            <span class="p">)</span>
            <span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
                <span class="k">return</span> <span class="no">nil</span><span class="p">,</span> <span class="n">fmt</span><span class="o">.</span><span class="n">Errorf</span><span class="p">(</span><span class="s">"failed to generate recipe: %w"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
            <span class="p">}</span>
            <span class="k">return</span> <span class="n">recipe</span><span class="p">,</span> <span class="no">nil</span>
        <span class="p">},</span>
    <span class="p">)</span>

    <span class="c">// Smoke-test the flow once at boot.</span>
    <span class="n">recipe</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">recipeGeneratorFlow</span><span class="o">.</span><span class="n">Run</span><span class="p">(</span><span class="n">ctx</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">RecipeInput</span><span class="p">{</span>
        <span class="n">Ingredient</span><span class="o">:</span>          <span class="s">"avocado"</span><span class="p">,</span>
        <span class="n">DietaryRestrictions</span><span class="o">:</span> <span class="s">"vegetarian"</span><span class="p">,</span>
    <span class="p">})</span>
    <span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
        <span class="n">log</span><span class="o">.</span><span class="n">Fatalf</span><span class="p">(</span><span class="s">"could not generate recipe: %v"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
    <span class="p">}</span>
    <span class="n">out</span><span class="p">,</span> <span class="n">_</span> <span class="o">:=</span> <span class="n">json</span><span class="o">.</span><span class="n">MarshalIndent</span><span class="p">(</span><span class="n">recipe</span><span class="p">,</span> <span class="s">""</span><span class="p">,</span> <span class="s">"  "</span><span class="p">)</span>
    <span class="n">fmt</span><span class="o">.</span><span class="n">Println</span><span class="p">(</span><span class="s">"Sample recipe generated:"</span><span class="p">)</span>
    <span class="n">fmt</span><span class="o">.</span><span class="n">Println</span><span class="p">(</span><span class="kt">string</span><span class="p">(</span><span class="n">out</span><span class="p">))</span>

    <span class="c">// Expose the flow as an HTTP endpoint.</span>
    <span class="n">mux</span> <span class="o">:=</span> <span class="n">http</span><span class="o">.</span><span class="n">NewServeMux</span><span class="p">()</span>
    <span class="n">mux</span><span class="o">.</span><span class="n">HandleFunc</span><span class="p">(</span><span class="s">"POST /recipeGeneratorFlow"</span><span class="p">,</span> <span class="n">genkit</span><span class="o">.</span><span class="n">Handler</span><span class="p">(</span><span class="n">recipeGeneratorFlow</span><span class="p">))</span>

    <span class="n">log</span><span class="o">.</span><span class="n">Println</span><span class="p">(</span><span class="s">"Starting server on http://localhost:3400"</span><span class="p">)</span>
    <span class="n">log</span><span class="o">.</span><span class="n">Println</span><span class="p">(</span><span class="s">"Flow available at: POST http://localhost:3400/recipeGeneratorFlow"</span><span class="p">)</span>
    <span class="n">log</span><span class="o">.</span><span class="n">Fatal</span><span class="p">(</span><span class="n">server</span><span class="o">.</span><span class="n">Start</span><span class="p">(</span><span class="n">ctx</span><span class="p">,</span> <span class="s">"127.0.0.1:3400"</span><span class="p">,</span> <span class="n">mux</span><span class="p">))</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Take a moment to count what is <strong>not</strong> in this file:</p>

<ul>
  <li>No web framework.</li>
  <li>No JSON parsing of the model output.</li>
  <li>No manual OpenTelemetry setup.</li>
  <li>No request/response DTO duplication.</li>
  <li>No Dockerfile yet (we will not need much).</li>
</ul>

<p>The struct <strong>is</strong> the contract. The flow <strong>is</strong> the endpoint. The compiler enforces both.</p>

<h2 id="run-it">Run It</h2>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>go run <span class="nb">.</span>
</code></pre></div></div>

<p>You should see a structured recipe printed as JSON, then the server logging that it is listening on <code class="language-plaintext highlighter-rouge">:3400</code>.</p>

<p>In another terminal, hit it with <code class="language-plaintext highlighter-rouge">curl</code>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl <span class="nt">-X</span> POST <span class="s2">"http://localhost:3400/recipeGeneratorFlow"</span> <span class="se">\</span>
  <span class="nt">-H</span> <span class="s2">"Content-Type: application/json"</span> <span class="se">\</span>
  <span class="nt">-d</span> <span class="s1">'{"data": {"ingredient": "tomato", "dietaryRestrictions": "vegan"}}'</span>
</code></pre></div></div>

<p>You will get back a fully structured JSON recipe. That is it: you have a production-shaped Gen AI microservice in one file.</p>
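<p>The same call can be made from another Go service. The sketch below wraps the input in the <code class="language-plaintext highlighter-rouge">{"data": ...}</code> envelope used by the <code class="language-plaintext highlighter-rouge">curl</code> command and assumes the handler answers with a <code class="language-plaintext highlighter-rouge">{"result": ...}</code> envelope, so check that against the response you actually receive. <code class="language-plaintext highlighter-rouge">callRecipeFlow</code> and <code class="language-plaintext highlighter-rouge">newStubServer</code> are hypothetical helpers, and the stub stands in for the real service so the example runs without an API key.</p>

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type RecipeInput struct {
	Ingredient          string `json:"ingredient"`
	DietaryRestrictions string `json:"dietaryRestrictions,omitempty"`
}

// Recipe is trimmed to two fields for brevity.
type Recipe struct {
	Title    string `json:"title"`
	Servings int    `json:"servings"`
}

// callRecipeFlow posts the input wrapped in {"data": ...} and unwraps
// {"result": ...}. The response envelope is an assumption about the wire format.
func callRecipeFlow(baseURL string, in RecipeInput) (*Recipe, error) {
	body, err := json.Marshal(map[string]any{"data": in})
	if err != nil {
		return nil, err
	}
	resp, err := http.Post(baseURL+"/recipeGeneratorFlow", "application/json", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	var envelope struct {
		Result Recipe `json:"result"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&envelope); err != nil {
		return nil, err
	}
	return &envelope.Result, nil
}

// newStubServer fakes the service so the sketch runs without a model or API key.
func newStubServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var req struct {
			Data RecipeInput `json:"data"`
		}
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(map[string]any{
			"result": Recipe{Title: req.Data.Ingredient + " salad", Servings: 2},
		})
	}))
}

func main() {
	srv := newStubServer()
	defer srv.Close()

	// Against the real service, pass "http://localhost:3400" instead of srv.URL.
	recipe, err := callRecipeFlow(srv.URL, RecipeInput{Ingredient: "tomato", DietaryRestrictions: "vegan"})
	fmt.Println(recipe, err)
}
```

<p>Because the client reuses the same struct shapes as the flow, a field rename on either side shows up as a compile error or a decode failure rather than as silently missing data.</p>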

<h2 id="test-it-visually-with-the-developer-ui">Test It Visually with the Developer UI</h2>

<p>The Genkit Developer UI is one of the strongest reasons to adopt Genkit, regardless of language. It gives you a local web app to run flows, inspect traces, tweak prompts, and debug tool calls.</p>

<p>From the project root:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>genkit start <span class="nt">--</span> go run <span class="nb">.</span>
</code></pre></div></div>

<p>Open <a href="http://localhost:4000">http://localhost:4000</a>, pick <code class="language-plaintext highlighter-rouge">recipeGeneratorFlow</code>, paste:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"ingredient"</span><span class="p">:</span><span class="w"> </span><span class="s2">"avocado"</span><span class="p">,</span><span class="w">
  </span><span class="nl">"dietaryRestrictions"</span><span class="p">:</span><span class="w"> </span><span class="s2">"vegetarian"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>Click <strong>Run</strong>. You will see the typed output and a full trace of the model call: tokens, latency, prompt, response. This is the kind of inner loop Python frameworks are still catching up on.</p>

<h2 id="deploying-it">Deploying It</h2>

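<p>Because it is Go, deployment is almost anticlimactic. One thing to change first: the listing above binds to <code class="language-plaintext highlighter-rouge">127.0.0.1:3400</code>, which is not reachable from outside a container, so switch the address to <code class="language-plaintext highlighter-rouge">:3400</code> before building the image. A minimal <code class="language-plaintext highlighter-rouge">Dockerfile</code>:</p>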

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">FROM</span><span class="w"> </span><span class="s">golang:1.24</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="s">build</span>
<span class="k">WORKDIR</span><span class="s"> /src</span>
<span class="k">COPY</span><span class="s"> . .</span>
<span class="k">RUN </span><span class="nv">CGO_ENABLED</span><span class="o">=</span>0 go build <span class="nt">-o</span> /out/server .

<span class="k">FROM</span><span class="s"> gcr.io/distroless/static</span>
<span class="k">COPY</span><span class="s"> --from=build /out/server /server</span>
<span class="k">ENV</span><span class="s"> PORT=3400</span>
<span class="k">EXPOSE</span><span class="s"> 3400</span>
<span class="k">ENTRYPOINT</span><span class="s"> ["/server"]</span>
</code></pre></div></div>

<p>That is your entire production image. Deploy it to <strong>Cloud Run</strong>, <strong>Cloud Run Jobs</strong>, <strong>Kubernetes</strong>, <strong>AWS Lambda</strong> (via container image), <strong>Azure Container Apps</strong>, or any platform that runs containers. No Python runtime to vendor. No <code class="language-plaintext highlighter-rouge">pip install</code> at build time. No virtual environment. Just a binary.</p>

<p>If you want to see the same pattern applied to other clouds and languages, I have already covered:</p>

<ul>
  <li><a href="/genkit/2026-03-20-genkit-aws-lambda-bedrock/">Genkit + AWS Lambda + Bedrock</a></li>
  <li><a href="/genkit/2026-03-20-genkit-azure-function-ai-foundry/">Genkit + Azure Functions + AI Foundry</a></li>
  <li><a href="/genkit/gcp/2026-02-10-genkit-java-101/">Genkit Java 101</a></li>
</ul>

<p>Genkit Go fits the same mold, with the smallest runtime footprint of all of them.</p>

<h2 id="but-what-about">“But What About…”</h2>

<p>A few honest objections worth addressing.</p>

<ul>
  <li><strong>“All the cool research libraries are in Python.”</strong> True. Keep them in Python, behind a small Python service that does only the research-y bit. Put your <strong>product surface</strong> (the part your users actually call) in Go. That separation is healthy.</li>
  <li><strong>“My team only knows Python.”</strong> Go is famously the easiest “real” backend language to learn. A Python developer can be productive in Go in days, and Genkit’s API surface is small enough that the learning curve is mostly Go itself, not the framework.</li>
  <li><strong>“What about LangChain / LlamaIndex features?”</strong> Most of what those frameworks give you (flows, tools, RAG, prompts, evaluation, observability) Genkit Go gives you too, with a fraction of the surface area and without the abstraction tax. See my <a href="/genkit/2026-04-16-top-jsts-genai-frameworks-2026/">2026 frameworks comparison</a> for the long version.</li>
  <li><strong>“Is Genkit Go production-ready?”</strong> It powers Gen AI features at Google and a growing list of companies. The Go SDK shares the same core philosophy and plugin model as the JS and Java SDKs. It is stable enough to bet on, and the iteration speed is high.</li>
</ul>

<h2 id="wrapping-up">Wrapping Up</h2>

<p>Python earned its place as the language of AI <strong>research</strong>. It did not earn its place as the language of AI <strong>services</strong>. Those are different problems with different constraints, and the constraints of production services (concurrency, footprint, deployment, types, observability) all favor Go.</p>

<p>Genkit Go is the framework that finally makes that switch painless. You get a typed, observable, multi-provider Gen AI service in one file, one binary, and one deploy. If you are still maintaining a Python microservice whose only job is to call an LLM and return structured JSON, you are paying a tax you do not need to pay.</p>

<p>Try it on your next flow. Replace one Python service. See how much smaller the resulting system is, in code, in memory, and in operational surface area.</p>

<h2 id="references">References</h2>

<ul>
  <li><a href="https://genkit.dev/docs/go/get-started/">Genkit Go, Get Started</a></li>
  <li><a href="https://genkit.dev/docs/go/flows/">Genkit Go, Flows</a></li>
  <li><a href="https://genkit.dev/docs/go/tool-calling/">Genkit Go, Tool Calling</a></li>
  <li><a href="https://genkit.dev/docs/go/deployment/cloud-run/">Genkit Go, Deployment on Cloud Run</a></li>
  <li><a href="https://github.com/firebase/genkit">Genkit GitHub</a></li>
</ul>]]></content><author><name>Xavier Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><category term="genkit" /><category term="genkit" /><category term="gcp" /><category term="go" /><summary type="html"><![CDATA[Python has dominated the Gen AI conversation, but it is not the only, nor the best, option for production. Here is why Go (and Genkit Go in particular) is a stronger bet for serious AI services in 2026.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://xavidop.me/assets/img/blog/post-headers/genkit-go-getting-started.png" /><media:content medium="image" url="https://xavidop.me/assets/img/blog/post-headers/genkit-go-getting-started.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry xml:lang="en"><title type="html">Top Gen AI Frameworks for Java in 2026: A Hands-On Comparison</title><link href="https://xavidop.me/genkit/2026-04-16-top-java-genai-frameworks-2026/" rel="alternate" type="text/html" title="Top Gen AI Frameworks for Java in 2026: A Hands-On Comparison" /><published>2026-04-16T00:00:00+00:00</published><updated>2026-05-06T04:23:37+00:00</updated><id>https://xavidop.me/genkit/top-java-genai-frameworks-2026</id><content type="html" xml:base="https://xavidop.me/genkit/2026-04-16-top-java-genai-frameworks-2026/"><![CDATA[<ol class="no_toc" id="markdown-toc">
  <li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li>
  <li><a href="#genkit-java" id="markdown-toc-genkit-java">Genkit Java</a>    <ol>
      <li><a href="#history-and-direction" id="markdown-toc-history-and-direction">History and Direction</a></li>
      <li><a href="#what-makes-genkit-java-stand-out" id="markdown-toc-what-makes-genkit-java-stand-out">What Makes Genkit Java Stand Out</a>        <ol>
          <li><a href="#vanilla-generation" id="markdown-toc-vanilla-generation">Vanilla Generation</a></li>
          <li><a href="#typed-flows--observable-pipelines" id="markdown-toc-typed-flows--observable-pipelines">Typed Flows — Observable Pipelines</a></li>
          <li><a href="#tools-and-agents" id="markdown-toc-tools-and-agents">Tools and Agents</a></li>
          <li><a href="#the-dev-ui--same-power-as-typescript" id="markdown-toc-the-dev-ui--same-power-as-typescript">The Dev UI — Same Power as TypeScript</a></li>
          <li><a href="#provider-support" id="markdown-toc-provider-support">Provider Support</a></li>
          <li><a href="#pros-and-cons" id="markdown-toc-pros-and-cons">Pros and Cons</a></li>
        </ol>
      </li>
    </ol>
  </li>
  <li><a href="#spring-ai" id="markdown-toc-spring-ai">Spring AI</a>    <ol>
      <li><a href="#history-and-direction-1" id="markdown-toc-history-and-direction-1">History and Direction</a></li>
      <li><a href="#what-makes-spring-ai-stand-out" id="markdown-toc-what-makes-spring-ai-stand-out">What Makes Spring AI Stand Out</a>        <ol>
          <li><a href="#structured-output" id="markdown-toc-structured-output">Structured Output</a></li>
          <li><a href="#rag-with-advisors" id="markdown-toc-rag-with-advisors">RAG with Advisors</a></li>
          <li><a href="#observability" id="markdown-toc-observability">Observability</a></li>
          <li><a href="#broad-vector-store-and-model-support" id="markdown-toc-broad-vector-store-and-model-support">Broad Vector Store and Model Support</a></li>
          <li><a href="#pros-and-cons-1" id="markdown-toc-pros-and-cons-1">Pros and Cons</a></li>
        </ol>
      </li>
    </ol>
  </li>
  <li><a href="#langchain4j" id="markdown-toc-langchain4j">LangChain4j</a>    <ol>
      <li><a href="#history-and-direction-2" id="markdown-toc-history-and-direction-2">History and Direction</a></li>
      <li><a href="#what-makes-langchain4j-stand-out" id="markdown-toc-what-makes-langchain4j-stand-out">What Makes LangChain4j Stand Out</a>        <ol>
          <li><a href="#memory-and-streaming" id="markdown-toc-memory-and-streaming">Memory and Streaming</a></li>
          <li><a href="#rag-pipeline" id="markdown-toc-rag-pipeline">RAG Pipeline</a></li>
          <li><a href="#two-abstraction-levels" id="markdown-toc-two-abstraction-levels">Two Abstraction Levels</a></li>
          <li><a href="#pros-and-cons-2" id="markdown-toc-pros-and-cons-2">Pros and Cons</a></li>
        </ol>
      </li>
    </ol>
  </li>
  <li><a href="#google-adk-java" id="markdown-toc-google-adk-java">Google ADK Java</a>    <ol>
      <li><a href="#history-and-direction-3" id="markdown-toc-history-and-direction-3">History and Direction</a></li>
      <li><a href="#adk-javas-position-agent-only-enterprise-grade" id="markdown-toc-adk-javas-position-agent-only-enterprise-grade">ADK Java’s Position: Agent-Only, Enterprise-Grade</a>        <ol>
          <li><a href="#multi-agent-orchestration" id="markdown-toc-multi-agent-orchestration">Multi-Agent Orchestration</a></li>
          <li><a href="#vertex-ai-lock-in" id="markdown-toc-vertex-ai-lock-in">Vertex AI Lock-In</a></li>
          <li><a href="#pros-and-cons-3" id="markdown-toc-pros-and-cons-3">Pros and Cons</a></li>
        </ol>
      </li>
    </ol>
  </li>
  <li><a href="#head-to-head-comparison" id="markdown-toc-head-to-head-comparison">Head-to-Head Comparison</a>    <ol>
      <li><a href="#developer-experience" id="markdown-toc-developer-experience">Developer Experience</a></li>
      <li><a href="#abstraction-levels" id="markdown-toc-abstraction-levels">Abstraction Levels</a></li>
      <li><a href="#observability-1" id="markdown-toc-observability-1">Observability</a></li>
      <li><a href="#framework-neutrality" id="markdown-toc-framework-neutrality">Framework Neutrality</a></li>
      <li><a href="#java-ecosystem-fit" id="markdown-toc-java-ecosystem-fit">Java Ecosystem Fit</a></li>
    </ol>
  </li>
  <li><a href="#which-framework-should-you-choose" id="markdown-toc-which-framework-should-you-choose">Which Framework Should You Choose?</a></li>
  <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ol>

<h2 id="introduction">Introduction</h2>

<p>Java has always been a serious language for production systems, and in 2026, the Generative AI ecosystem has finally caught up. For years, Java developers watched from the sidelines as Python and TypeScript accumulated framework after framework for building LLM-powered applications. Today, the picture is very different. Java has multiple mature, actively maintained AI frameworks, each with its own philosophy and trade-offs.</p>

<p>This article covers the four frameworks I have personally used to ship Java AI applications: <strong>Genkit Java</strong>, <strong>Spring AI</strong>, <strong>LangChain4j</strong>, and <strong>Google ADK Java</strong>. Each one represents a meaningfully different bet on what a Java AI framework should be, and understanding those differences will save you from picking the wrong tool.</p>

<hr />

<h2 id="genkit-java">Genkit Java</h2>

<h3 id="history-and-direction">History and Direction</h3>

<p>Genkit started life as a TypeScript-first framework launched by Google at I/O 2024. The Java SDK arrived as a community effort, built and maintained by developers within the Google ecosystem who wanted to bring the developer experience Genkit had established in TypeScript to Java. As of 2026, <strong>Genkit Java is unofficial</strong>: it is not an official Google product, but it is actively maintained, follows the core Genkit design closely, and ships its own plugin ecosystem.</p>

<p>The framework’s first stable release landed in early 2026 after months of preview use. Its ambition mirrors the TypeScript SDK’s: bring Genkit’s multi-level abstractions (vanilla generation, typed flows, agents), its broad provider-neutral plugin model, and, crucially, the <strong>Genkit Developer UI</strong> to Java developers. The Java SDK ships with Spring Boot and Jetty server plugins, making it a natural fit for teams that already run Java services in production. The Javadoc and architecture are clean, idiomatic Java. This does not feel like a port; it feels designed for the language.</p>

<p>The direction is clear: maintain parity with the TypeScript Genkit SDK’s abstractions while embracing Java idioms (builder patterns, typed schemas via Java classes, annotation-free configuration). Support for evaluation, MCP (Model Context Protocol), RAG with pgvector and Pinecone, and multi-agent patterns is already in place.</p>

<h3 id="what-makes-genkit-java-stand-out">What Makes Genkit Java Stand Out</h3>

<p>Like its TypeScript counterpart, Genkit Java provides <strong>three levels of abstraction in a single SDK</strong>: direct model calls, typed flows (observable pipelines), and agents. This is unique in the Java AI space: no other Java framework gives you all three in one coherent API.</p>

<p><strong>Supported languages:</strong> Java 21+ (primary). Deploys to Spring Boot, Jetty, or Firebase Cloud Functions.</p>

<h4 id="vanilla-generation">Vanilla Generation</h4>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">com.google.genkit.Genkit</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">com.google.genkit.ai.GenerateOptions</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">com.google.genkit.plugins.googlegenai.GoogleGenAIPlugin</span><span class="o">;</span>

<span class="nc">Genkit</span> <span class="n">genkit</span> <span class="o">=</span> <span class="nc">Genkit</span><span class="o">.</span><span class="na">builder</span><span class="o">()</span>
    <span class="o">.</span><span class="na">plugin</span><span class="o">(</span><span class="nc">GoogleGenAIPlugin</span><span class="o">.</span><span class="na">create</span><span class="o">())</span>
    <span class="o">.</span><span class="na">build</span><span class="o">();</span>

<span class="nc">String</span> <span class="n">text</span> <span class="o">=</span> <span class="n">genkit</span><span class="o">.</span><span class="na">generate</span><span class="o">(</span><span class="nc">GenerateOptions</span><span class="o">.</span><span class="na">builder</span><span class="o">()</span>
    <span class="o">.</span><span class="na">model</span><span class="o">(</span><span class="s">"googleai/gemini-flash-latest"</span><span class="o">)</span>
    <span class="o">.</span><span class="na">prompt</span><span class="o">(</span><span class="s">"Explain the CAP theorem in two sentences."</span><span class="o">)</span>
    <span class="o">.</span><span class="na">build</span><span class="o">()).</span><span class="na">getText</span><span class="o">();</span>

<span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="n">text</span><span class="o">);</span>
</code></pre></div></div>

<h4 id="typed-flows--observable-pipelines">Typed Flows — Observable Pipelines</h4>

<p>Flows are the heart of Genkit Java. They wrap your AI logic in a named, typed, traceable unit that is automatically exposed as an HTTP endpoint and visible in the Dev UI.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">com.google.genkit.Genkit</span><span class="o">;</span>
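<span class="kn">import</span> <span class="nn">com.google.genkit.ai.GenerateOptions</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">com.google.genkit.flow.FlowOptions</span><span class="o">;</span>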
<span class="kn">import</span> <span class="nn">com.google.genkit.plugins.googlegenai.GoogleGenAIPlugin</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">com.google.genkit.plugins.jetty.JettyPlugin</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">com.google.genkit.plugins.jetty.JettyPluginOptions</span><span class="o">;</span>

<span class="n">record</span> <span class="nf">TranslateRequest</span><span class="o">(</span><span class="nc">String</span> <span class="n">text</span><span class="o">,</span> <span class="nc">String</span> <span class="n">targetLanguage</span><span class="o">)</span> <span class="o">{}</span>
<span class="n">record</span> <span class="nf">TranslateResponse</span><span class="o">(</span><span class="nc">String</span> <span class="n">translation</span><span class="o">,</span> <span class="nc">String</span> <span class="n">detectedLanguage</span><span class="o">)</span> <span class="o">{}</span>

<span class="nc">JettyPlugin</span> <span class="n">jetty</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">JettyPlugin</span><span class="o">(</span><span class="nc">JettyPluginOptions</span><span class="o">.</span><span class="na">builder</span><span class="o">().</span><span class="na">port</span><span class="o">(</span><span class="mi">8080</span><span class="o">).</span><span class="na">build</span><span class="o">());</span>

<span class="nc">Genkit</span> <span class="n">genkit</span> <span class="o">=</span> <span class="nc">Genkit</span><span class="o">.</span><span class="na">builder</span><span class="o">()</span>
    <span class="o">.</span><span class="na">plugin</span><span class="o">(</span><span class="nc">GoogleGenAIPlugin</span><span class="o">.</span><span class="na">create</span><span class="o">())</span>
    <span class="o">.</span><span class="na">plugin</span><span class="o">(</span><span class="n">jetty</span><span class="o">)</span>
    <span class="o">.</span><span class="na">build</span><span class="o">();</span>

<span class="n">genkit</span><span class="o">.</span><span class="na">defineFlow</span><span class="o">(</span>
    <span class="nc">FlowOptions</span><span class="o">.&lt;</span><span class="nc">TranslateRequest</span><span class="o">,</span> <span class="nc">TranslateResponse</span><span class="o">&gt;</span><span class="n">builder</span><span class="o">()</span>
        <span class="o">.</span><span class="na">name</span><span class="o">(</span><span class="s">"translateText"</span><span class="o">)</span>
        <span class="o">.</span><span class="na">inputClass</span><span class="o">(</span><span class="nc">TranslateRequest</span><span class="o">.</span><span class="na">class</span><span class="o">)</span>
        <span class="o">.</span><span class="na">outputClass</span><span class="o">(</span><span class="nc">TranslateResponse</span><span class="o">.</span><span class="na">class</span><span class="o">)</span>
        <span class="o">.</span><span class="na">build</span><span class="o">(),</span>
    <span class="o">(</span><span class="n">ctx</span><span class="o">,</span> <span class="n">request</span><span class="o">)</span> <span class="o">-&gt;</span> <span class="o">{</span>
        <span class="kt">var</span> <span class="n">response</span> <span class="o">=</span> <span class="n">genkit</span><span class="o">.</span><span class="na">generate</span><span class="o">(</span><span class="nc">GenerateOptions</span><span class="o">.</span><span class="na">builder</span><span class="o">()</span>
            <span class="o">.</span><span class="na">model</span><span class="o">(</span><span class="s">"googleai/gemini-flash-latest"</span><span class="o">)</span>
            <span class="o">.</span><span class="na">prompt</span><span class="o">(</span><span class="s">"Translate '%s' to %s. Return JSON with 'translation' and 'detectedLanguage'."</span>
                <span class="o">.</span><span class="na">formatted</span><span class="o">(</span><span class="n">request</span><span class="o">.</span><span class="na">text</span><span class="o">(),</span> <span class="n">request</span><span class="o">.</span><span class="na">targetLanguage</span><span class="o">()))</span>
            <span class="o">.</span><span class="na">outputClass</span><span class="o">(</span><span class="nc">TranslateResponse</span><span class="o">.</span><span class="na">class</span><span class="o">)</span>
            <span class="o">.</span><span class="na">build</span><span class="o">());</span>
        <span class="k">return</span> <span class="n">response</span><span class="o">.</span><span class="na">getOutput</span><span class="o">(</span><span class="nc">TranslateResponse</span><span class="o">.</span><span class="na">class</span><span class="o">);</span>
    <span class="o">}</span>
<span class="o">);</span>

<span class="n">jetty</span><span class="o">.</span><span class="na">start</span><span class="o">();</span>
</code></pre></div></div>
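<p>To make the idea of a "named, typed, traceable unit" concrete, here is a minimal plain-Java sketch of the concept. It is an illustration only: <code class="language-plaintext highlighter-rouge">TypedFlowSketch</code>, <code class="language-plaintext highlighter-rouge">Flow</code>, and <code class="language-plaintext highlighter-rouge">TraceEntry</code> are hypothetical names, not Genkit Java types, and the model call is replaced by a stand-in lambda so the example runs offline.</p>

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only: none of these types come from Genkit Java.
final class TypedFlowSketch {

    // One recorded invocation: which flow ran, its input and output, and how long it took.
    record TraceEntry(String flowName, Object input, Object output, Duration elapsed) {}

    // A named, typed unit of work that records a trace entry on every call.
    static final class Flow<I, O> {
        private final String name;
        private final Class<I> inputClass;
        private final Class<O> outputClass;
        private final Function<I, O> body;
        private final List<TraceEntry> traces = new ArrayList<>();

        Flow(String name, Class<I> inputClass, Class<O> outputClass, Function<I, O> body) {
            this.name = name;
            this.inputClass = inputClass;
            this.outputClass = outputClass;
            this.body = body;
        }

        // Checks the input type at runtime, runs the body, and stores a trace entry.
        O run(Object rawInput) {
            I input = inputClass.cast(rawInput); // ClassCastException on a wrong input type
            long start = System.nanoTime();
            O output = outputClass.cast(body.apply(input));
            Duration elapsed = Duration.ofNanos(System.nanoTime() - start);
            traces.add(new TraceEntry(name, input, output, elapsed));
            return output;
        }

        List<TraceEntry> traces() {
            return List.copyOf(traces);
        }
    }

    record TranslateRequest(String text, String targetLanguage) {}
    record TranslateResponse(String translation, String detectedLanguage) {}

    // Stand-in for a model call so the sketch runs without any provider.
    static Flow<TranslateRequest, TranslateResponse> translateFlow() {
        return new Flow<>("translateText", TranslateRequest.class, TranslateResponse.class,
            req -> new TranslateResponse("[" + req.targetLanguage() + "] " + req.text(), "en"));
    }

    public static void main(String[] args) {
        var flow = translateFlow();
        System.out.println(flow.run(new TranslateRequest("Hello", "es")));
        System.out.println(flow.traces().size() + " trace(s) recorded");
    }
}
```

<p>The point of the sketch is the shape: because every call goes through one typed entry point, a framework can attach tracing, validation, and HTTP exposure without touching the flow body.</p>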

<h4 id="tools-and-agents">Tools and Agents</h4>

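<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">com.google.genkit.ai.GenerateOptions</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">com.google.genkit.ai.tool.ToolDefinition</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">java.util.List</span><span class="o">;</span>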

<span class="kt">var</span> <span class="n">weatherTool</span> <span class="o">=</span> <span class="n">genkit</span><span class="o">.</span><span class="na">defineTool</span><span class="o">(</span>
    <span class="nc">ToolDefinition</span><span class="o">.&lt;</span><span class="nc">String</span><span class="o">,</span> <span class="nc">String</span><span class="o">&gt;</span><span class="n">builder</span><span class="o">()</span>
        <span class="o">.</span><span class="na">name</span><span class="o">(</span><span class="s">"getWeather"</span><span class="o">)</span>
        <span class="o">.</span><span class="na">description</span><span class="o">(</span><span class="s">"Returns current weather for a city."</span><span class="o">)</span>
        <span class="o">.</span><span class="na">inputClass</span><span class="o">(</span><span class="nc">String</span><span class="o">.</span><span class="na">class</span><span class="o">)</span>
        <span class="o">.</span><span class="na">outputClass</span><span class="o">(</span><span class="nc">String</span><span class="o">.</span><span class="na">class</span><span class="o">)</span>
        <span class="o">.</span><span class="na">build</span><span class="o">(),</span>
    <span class="o">(</span><span class="n">ctx</span><span class="o">,</span> <span class="n">city</span><span class="o">)</span> <span class="o">-&gt;</span> <span class="s">"Sunny, 24°C in "</span> <span class="o">+</span> <span class="n">city</span>
<span class="o">);</span>

<span class="c1">// Use the tool inside a flow or agent</span>
<span class="kt">var</span> <span class="n">result</span> <span class="o">=</span> <span class="n">genkit</span><span class="o">.</span><span class="na">generate</span><span class="o">(</span><span class="nc">GenerateOptions</span><span class="o">.</span><span class="na">builder</span><span class="o">()</span>
    <span class="o">.</span><span class="na">model</span><span class="o">(</span><span class="s">"googleai/gemini-flash-latest"</span><span class="o">)</span>
    <span class="o">.</span><span class="na">prompt</span><span class="o">(</span><span class="s">"What's the weather like in Tokyo?"</span><span class="o">)</span>
    <span class="o">.</span><span class="na">tools</span><span class="o">(</span><span class="nc">List</span><span class="o">.</span><span class="na">of</span><span class="o">(</span><span class="n">weatherTool</span><span class="o">))</span>
    <span class="o">.</span><span class="na">build</span><span class="o">());</span>
</code></pre></div></div>

<h4 id="the-dev-ui--same-power-as-typescript">The Dev UI — Same Power as TypeScript</h4>

<p>One of Genkit Java’s most compelling features is that the <strong>same Genkit Developer UI</strong> used by the TypeScript SDK works directly with Java applications. You install the Genkit CLI (Node.js-based) and start your Java app through it:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm <span class="nb">install</span> <span class="nt">-g</span> genkit
genkit start <span class="nt">--</span> mvn <span class="nb">exec</span>:java
</code></pre></div></div>

<p>The Dev UI opens at <code class="language-plaintext highlighter-rouge">http://localhost:4000</code> and gives you:</p>
<ul>
  <li><strong>Flow runner</strong> — execute any flow interactively with custom inputs and inspect typed outputs.</li>
  <li><strong>Trace explorer</strong> — full OpenTelemetry traces for every <code class="language-plaintext highlighter-rouge">generate</code> and flow call, showing latency, token counts, and exact prompts.</li>
  <li><strong>Model playground</strong> — test any registered model directly.</li>
  <li><strong>Tool testing</strong> — stub and test tools in isolation.</li>
  <li><strong>Dotprompt editor</strong> — edit <code class="language-plaintext highlighter-rouge">.prompt</code> files live with variable injection.</li>
</ul>

<p>This is the single biggest advantage Genkit Java has over every other Java AI framework: a zero-config, local developer UI that replaces the need for LangSmith or Grafana during development.</p>

<h4 id="provider-support">Provider Support</h4>

<p>Genkit Java ships plugins for: <strong>Google GenAI (Gemini)</strong>, <strong>OpenAI</strong>, <strong>Anthropic (Claude)</strong>, <strong>AWS Bedrock</strong>, <strong>Azure AI Foundry</strong>, <strong>Ollama</strong>, <strong>xAI (Grok)</strong>, <strong>DeepSeek</strong>, <strong>Cohere</strong>, <strong>Mistral</strong>, and <strong>Groq</strong>. All of them are accessed through the same <code class="language-plaintext highlighter-rouge">genkit.generate()</code> interface.</p>

<p>Vector store plugins cover: Firebase Firestore, Weaviate, PostgreSQL (pgvector), Pinecone, and a local in-memory store.</p>

<h4 id="pros-and-cons">Pros and Cons</h4>

<table>
  <thead>
    <tr>
      <th>✅ Pros</th>
      <th>❌ Cons</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Best-in-class Dev UI with local trace explorer</td>
      <td>Unofficial/community-maintained (not a Google product)</td>
    </tr>
    <tr>
      <td>Multi-level abstractions: vanilla, flows, agents</td>
      <td>Artifacts on GitHub Packages (requires auth to pull)</td>
    </tr>
    <tr>
      <td>Broad provider support (11 model plugins) behind one API</td>
      <td>Java 21+ required</td>
    </tr>
    <tr>
      <td>Spring Boot and Jetty deployment plugins</td>
      <td>Smaller community than LangChain4j or Spring AI</td>
    </tr>
    <tr>
      <td>OpenTelemetry built in</td>
      <td>Still SNAPSHOT versioned (1.0.0-SNAPSHOT)</td>
    </tr>
    <tr>
      <td>Idiomatic Java with builder patterns</td>
      <td> </td>
    </tr>
  </tbody>
</table>

<hr />

<h2 id="spring-ai">Spring AI</h2>

<h3 id="history-and-direction-1">History and Direction</h3>

<p>Spring AI was announced by the Spring team (now part of Broadcom) in mid-2023 and reached its 1.0 GA release in May 2025. It is the most enterprise-grade option in this comparison, built by the same team that maintains Spring Framework, Spring Boot, and Spring Data, which together underpin a vast proportion of the world’s Java server-side applications.</p>

<p>The founding premise of Spring AI is that AI integration in Java applications should feel like every other Spring integration: auto-configured, testable, portable, and production-ready out of the box. The project draws inspiration from LangChain and LlamaIndex but explicitly avoids being a port; it is designed from the ground up to be idiomatic Spring. If you have written Spring applications, Spring AI will feel immediately familiar: <code class="language-plaintext highlighter-rouge">@Autowired</code> AI clients, Spring Boot starters, <code class="language-plaintext highlighter-rouge">application.properties</code> configuration, and <code class="language-plaintext highlighter-rouge">Advisor</code> patterns that mirror Spring’s existing interception model.</p>

<p>Spring AI’s direction through 2025 and into 2026 has been to deepen its observability story (Micrometer-native metrics and traces), expand its <code class="language-plaintext highlighter-rouge">ChatClient</code> fluent API, and ship more vector store integrations. The framework is now the de facto standard for teams that are already invested in the Spring ecosystem and want to add AI capabilities without introducing a foreign dependency philosophy.</p>

<h3 id="what-makes-spring-ai-stand-out">What Makes Spring AI Stand Out</h3>

<p>Spring AI’s killer feature is <strong>Spring Boot integration depth</strong>. No other framework on this list integrates AI capabilities as seamlessly into an existing application framework as Spring AI does with Spring Boot. Auto-configuration, conditional beans, health indicators, and Actuator endpoints for AI metrics: everything a Spring developer expects is applied to AI.</p>

<p><strong>Supported languages:</strong> Java (primary). Also supports Kotlin (via Spring’s Kotlin DSL). Runs anywhere Spring Boot runs: embedded Tomcat, Jetty, Undertow, GraalVM native images.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// application.properties</span>
<span class="c1">// spring.ai.openai.api-key=${OPENAI_API_KEY}</span>
<span class="c1">// spring.ai.openai.chat.options.model=gpt-4o</span>

<span class="kn">import</span> <span class="nn">org.springframework.ai.chat.client.ChatClient</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">org.springframework.web.bind.annotation.*</span><span class="o">;</span>

<span class="nd">@RestController</span>
<span class="kd">public</span> <span class="kd">class</span> <span class="nc">ChatController</span> <span class="o">{</span>

    <span class="kd">private</span> <span class="kd">final</span> <span class="nc">ChatClient</span> <span class="n">chatClient</span><span class="o">;</span>

    <span class="kd">public</span> <span class="nf">ChatController</span><span class="o">(</span><span class="nc">ChatClient</span><span class="o">.</span><span class="na">Builder</span> <span class="n">builder</span><span class="o">)</span> <span class="o">{</span>
        <span class="k">this</span><span class="o">.</span><span class="na">chatClient</span> <span class="o">=</span> <span class="n">builder</span><span class="o">.</span><span class="na">build</span><span class="o">();</span>
    <span class="o">}</span>

    <span class="nd">@GetMapping</span><span class="o">(</span><span class="s">"/chat"</span><span class="o">)</span>
    <span class="kd">public</span> <span class="nc">String</span> <span class="nf">chat</span><span class="o">(</span><span class="nd">@RequestParam</span> <span class="nc">String</span> <span class="n">message</span><span class="o">)</span> <span class="o">{</span>
        <span class="k">return</span> <span class="n">chatClient</span><span class="o">.</span><span class="na">prompt</span><span class="o">()</span>
            <span class="o">.</span><span class="na">user</span><span class="o">(</span><span class="n">message</span><span class="o">)</span>
            <span class="o">.</span><span class="na">call</span><span class="o">()</span>
            <span class="o">.</span><span class="na">content</span><span class="o">();</span>
    <span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>

<h4 id="structured-output">Structured Output</h4>

<p>Spring AI’s <code class="language-plaintext highlighter-rouge">BeanOutputConverter</code> maps model responses directly to Java POJOs, using the class schema to generate format instructions automatically.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">org.springframework.ai.chat.client.ChatClient</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">org.springframework.ai.converter.BeanOutputConverter</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">java.util.List</span><span class="o">;</span>

<span class="n">record</span> <span class="nf">MovieReview</span><span class="o">(</span><span class="nc">String</span> <span class="n">title</span><span class="o">,</span> <span class="kt">int</span> <span class="n">rating</span><span class="o">,</span> <span class="nc">String</span> <span class="n">summary</span><span class="o">,</span> <span class="nc">List</span><span class="o">&lt;</span><span class="nc">String</span><span class="o">&gt;</span> <span class="n">pros</span><span class="o">)</span> <span class="o">{}</span>

<span class="nc">BeanOutputConverter</span><span class="o">&lt;</span><span class="nc">MovieReview</span><span class="o">&gt;</span> <span class="n">converter</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">BeanOutputConverter</span><span class="o">&lt;&gt;(</span><span class="nc">MovieReview</span><span class="o">.</span><span class="na">class</span><span class="o">);</span>

<span class="nc">MovieReview</span> <span class="n">review</span> <span class="o">=</span> <span class="n">chatClient</span><span class="o">.</span><span class="na">prompt</span><span class="o">()</span>
    <span class="o">.</span><span class="na">user</span><span class="o">(</span><span class="n">u</span> <span class="o">-&gt;</span> <span class="n">u</span><span class="o">.</span><span class="na">text</span><span class="o">(</span><span class="s">"Review the movie Inception. {format}"</span><span class="o">)</span>
        <span class="o">.</span><span class="na">param</span><span class="o">(</span><span class="s">"format"</span><span class="o">,</span> <span class="n">converter</span><span class="o">.</span><span class="na">getFormat</span><span class="o">()))</span>
    <span class="o">.</span><span class="na">call</span><span class="o">()</span>
    <span class="o">.</span><span class="na">entity</span><span class="o">(</span><span class="nc">MovieReview</span><span class="o">.</span><span class="na">class</span><span class="o">);</span>

<span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="n">review</span><span class="o">.</span><span class="na">title</span><span class="o">()</span> <span class="o">+</span> <span class="s">" — "</span> <span class="o">+</span> <span class="n">review</span><span class="o">.</span><span class="na">rating</span><span class="o">()</span> <span class="o">+</span> <span class="s">"/10"</span><span class="o">);</span>
</code></pre></div></div>
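<p>The idea of deriving format instructions from a class can be sketched with nothing but reflection. The example below is a simplified illustration, not how <code class="language-plaintext highlighter-rouge">BeanOutputConverter</code> is actually implemented; <code class="language-plaintext highlighter-rouge">FormatInstructionSketch</code> and <code class="language-plaintext highlighter-rouge">formatFor</code> are hypothetical names, and the instruction wording is my own.</p>

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only: this is not Spring AI's BeanOutputConverter.
final class FormatInstructionSketch {

    record MovieReview(String title, int rating, String summary, List<String> pros) {}

    // Lists each record component with its declared type, in declaration order,
    // and wraps the result in a plain-text instruction for the model.
    static String formatFor(Class<? extends Record> type) {
        String fields = Arrays.stream(type.getRecordComponents())
            .map(c -> "\"" + c.getName() + "\": <" + c.getGenericType().getTypeName() + ">")
            .collect(Collectors.joining(", "));
        return "Respond only with a JSON object of the form {" + fields + "}";
    }

    public static void main(String[] args) {
        System.out.println(formatFor(MovieReview.class));
    }
}
```

<p>A real converter would typically emit a proper JSON Schema and also parse the response back into the target type; the sketch only covers the "class in, instructions out" half.</p>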

<h4 id="rag-with-advisors">RAG with Advisors</h4>

<p>Spring AI’s <code class="language-plaintext highlighter-rouge">Advisors</code> API is one of its most elegant features. Advisors wrap <code class="language-plaintext highlighter-rouge">ChatClient</code> calls with cross-cutting concerns (RAG retrieval, chat memory, logging, guardrails) in a declarative, composable way.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">org.springframework.ai.chat.client.ChatClient</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">org.springframework.ai.chat.client.advisor.vectorstore.QuestionAnswerAdvisor</span><span class="o">;</span>
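<span class="kn">import</span> <span class="nn">org.springframework.ai.vectorstore.VectorStore</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">org.springframework.stereotype.Service</span><span class="o">;</span>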

<span class="nd">@Service</span>
<span class="kd">public</span> <span class="kd">class</span> <span class="nc">DocumentQAService</span> <span class="o">{</span>

    <span class="kd">private</span> <span class="kd">final</span> <span class="nc">ChatClient</span> <span class="n">chatClient</span><span class="o">;</span>

    <span class="kd">public</span> <span class="nf">DocumentQAService</span><span class="o">(</span><span class="nc">ChatClient</span><span class="o">.</span><span class="na">Builder</span> <span class="n">builder</span><span class="o">,</span> <span class="nc">VectorStore</span> <span class="n">vectorStore</span><span class="o">)</span> <span class="o">{</span>
        <span class="k">this</span><span class="o">.</span><span class="na">chatClient</span> <span class="o">=</span> <span class="n">builder</span>
            <span class="o">.</span><span class="na">defaultAdvisors</span><span class="o">(</span><span class="k">new</span> <span class="nc">QuestionAnswerAdvisor</span><span class="o">(</span><span class="n">vectorStore</span><span class="o">))</span>
            <span class="o">.</span><span class="na">build</span><span class="o">();</span>
    <span class="o">}</span>

    <span class="kd">public</span> <span class="nc">String</span> <span class="nf">answerQuestion</span><span class="o">(</span><span class="nc">String</span> <span class="n">question</span><span class="o">)</span> <span class="o">{</span>
        <span class="k">return</span> <span class="n">chatClient</span><span class="o">.</span><span class="na">prompt</span><span class="o">()</span>
            <span class="o">.</span><span class="na">user</span><span class="o">(</span><span class="n">question</span><span class="o">)</span>
            <span class="o">.</span><span class="na">call</span><span class="o">()</span>
            <span class="o">.</span><span class="na">content</span><span class="o">();</span>
    <span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>
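<p>Conceptually, an advisor chain is the decorator pattern applied to a model call. The following plain-Java sketch shows that composition under simplifying assumptions: <code class="language-plaintext highlighter-rouge">AdvisorChainSketch</code>, its <code class="language-plaintext highlighter-rouge">Advisor</code> interface, and the <code class="language-plaintext highlighter-rouge">retrieval</code>, <code class="language-plaintext highlighter-rouge">logging</code>, and <code class="language-plaintext highlighter-rouge">compose</code> helpers are hypothetical and are not Spring AI’s Advisor API; prompts and responses are reduced to strings.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.UnaryOperator;

// Hypothetical illustration only: these types are not Spring AI's Advisor API.
final class AdvisorChainSketch {

    // An "advisor" here is a decorator: it takes the next call and returns a wrapped call.
    interface Advisor extends UnaryOperator<Function<String, String>> {}

    // Prepends looked-up context to the prompt before delegating (a stand-in for RAG retrieval).
    static Advisor retrieval(Function<String, String> lookup) {
        return next -> prompt ->
            next.apply("Context: " + lookup.apply(prompt) + "\nQuestion: " + prompt);
    }

    // Records the prompt it receives and the response it gets back.
    static Advisor logging(List<String> log) {
        return next -> prompt -> {
            log.add("request: " + prompt);
            String response = next.apply(prompt);
            log.add("response: " + response);
            return response;
        };
    }

    // Wraps the model call so that the first advisor in the list is the outermost one.
    static Function<String, String> compose(Function<String, String> model, List<Advisor> advisors) {
        Function<String, String> call = model;
        for (int i = advisors.size() - 1; i >= 0; i--) {
            call = advisors.get(i).apply(call);
        }
        return call;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Function<String, String> fakeModel = prompt -> "echo(" + prompt + ")";
        var chat = compose(fakeModel, List.of(logging(log), retrieval(q -> "docs about " + q)));
        System.out.println(chat.apply("What is an advisor?"));
        log.forEach(System.out::println);
    }
}
```

<p>Ordering matters in this design: the logging advisor sees the user’s original question because it sits outside the retrieval advisor, which is the kind of decision an advisor chain makes explicit.</p>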

<h4 id="observability">Observability</h4>

<p>Spring AI ships with <strong>Micrometer</strong> integration out of the box. Every chat call generates spans (Spring Boot tracing) and metrics (prompt token count, completion token count, model latency) visible in any Micrometer-compatible backend: Prometheus, Grafana, Zipkin, or Datadog. There is no separate Dev UI; observability is handled by your existing Spring Boot infrastructure.</p>

<h4 id="broad-vector-store-and-model-support">Broad Vector Store and Model Support</h4>

<p>Spring AI supports 10+ model providers (OpenAI, Anthropic, Google Vertex AI, Amazon Bedrock, Azure OpenAI, Mistral, Ollama, Groq, and more) and 20+ vector stores (PGVector, Pinecone, Weaviate, Redis, Elasticsearch, MongoDB Atlas, Chroma, and more), which puts it among the broadest integration catalogs of any Java AI framework.</p>

<h4 id="pros-and-cons-1">Pros and Cons</h4>

<table>
  <thead>
    <tr>
      <th>✅ Pros</th>
      <th>❌ Cons</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Deepest Spring Boot integration, feels native</td>
      <td>No standalone Dev UI for flow inspection</td>
    </tr>
    <tr>
      <td>Micrometer-native observability</td>
      <td>Agent abstractions are less mature than LangChain4j</td>
    </tr>
    <tr>
      <td>Broad model and vector store integrations</td>
      <td>Advisors pattern has a learning curve</td>
    </tr>
    <tr>
      <td>Production-tested by the Spring ecosystem</td>
      <td>Heavier Spring context overhead for simple use cases</td>
    </tr>
    <tr>
      <td>GraalVM native image support</td>
      <td>No flow/pipeline abstraction like Genkit</td>
    </tr>
    <tr>
      <td>Idiomatic Java and Kotlin support</td>
      <td> </td>
    </tr>
  </tbody>
</table>

<hr />

<h2 id="langchain4j">LangChain4j</h2>

<h3 id="history-and-direction-2">History and Direction</h3>

<p>LangChain4j was started in early 2023 by a small community of Java developers who noticed that the LLM framework explosion happening in Python had no Java equivalent. Despite the name, the project is not a mechanical port of LangChain Python; it is a fusion of ideas from LangChain, Haystack, LlamaIndex, and original innovation, packaged in a way that makes sense for Java.</p>

<p>It grew quickly through 2023 and 2024, driven by its comprehensive integration list (20+ LLM providers, 30+ vector stores) and its clean two-level abstraction model: low-level primitives for maximum control and high-level <strong>AI Services</strong> for rapid development. The AI Services pattern, where you define an interface with annotations and LangChain4j implements it for you at runtime, became the framework’s signature feature and arguably the most Java-idiomatic approach to LLM integration in the ecosystem.</p>

<p>By 2025, LangChain4j had formal integrations with <strong>Quarkus</strong>, <strong>Spring Boot</strong>, <strong>Micronaut</strong>, and <strong>Helidon</strong>, covering every major Java application framework. The team’s direction in 2026 is focused on deepening agentic capabilities (multi-step tools, planning loops, MCP support) and improving the observability story, which has historically been a weaker point compared to Spring AI’s Micrometer integration or Genkit’s Dev UI.</p>

<h3 id="what-makes-langchain4j-stand-out">What Makes LangChain4j Stand Out</h3>

<p>LangChain4j’s <strong>AI Services</strong> pattern is its defining feature. Instead of writing imperative LLM call code, you declare an interface, annotate it with <code class="language-plaintext highlighter-rouge">@SystemMessage</code>, <code class="language-plaintext highlighter-rouge">@UserMessage</code>, and memory annotations, and LangChain4j generates the implementation. The result is AI code that reads like a Java service contract: clean, testable, and completely familiar to Java developers.</p>

<p><strong>Supported languages:</strong> Java (primary). Kotlin extensions available (coroutine-based async support). Integrates with Spring Boot, Quarkus, Micronaut, Helidon.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">dev.langchain4j.service.AiServices</span><span class="o">;</span>
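<span class="kn">import</span> <span class="nn">dev.langchain4j.service.SystemMessage</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">dev.langchain4j.service.UserMessage</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">dev.langchain4j.service.V</span><span class="o">;</span>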
<span class="kn">import</span> <span class="nn">dev.langchain4j.model.openai.OpenAiChatModel</span><span class="o">;</span>

<span class="kd">interface</span> <span class="nc">TranslationAssistant</span> <span class="o">{</span>
    <span class="nd">@SystemMessage</span><span class="o">(</span><span class="s">"You are a professional translator. Translate text accurately and naturally into {{language}}."</span><span class="o">)</span>
    <span class="nc">String</span> <span class="nf">translate</span><span class="o">(</span><span class="nd">@UserMessage</span> <span class="nc">String</span> <span class="n">text</span><span class="o">,</span> <span class="nd">@V</span><span class="o">(</span><span class="s">"language"</span><span class="o">)</span> <span class="nc">String</span> <span class="n">targetLanguage</span><span class="o">);</span>
<span class="o">}</span>

<span class="kt">var</span> <span class="n">model</span> <span class="o">=</span> <span class="nc">OpenAiChatModel</span><span class="o">.</span><span class="na">withApiKey</span><span class="o">(</span><span class="nc">System</span><span class="o">.</span><span class="na">getenv</span><span class="o">(</span><span class="s">"OPENAI_API_KEY"</span><span class="o">));</span>

<span class="nc">TranslationAssistant</span> <span class="n">assistant</span> <span class="o">=</span> <span class="nc">AiServices</span><span class="o">.</span><span class="na">builder</span><span class="o">(</span><span class="nc">TranslationAssistant</span><span class="o">.</span><span class="na">class</span><span class="o">)</span>
    <span class="o">.</span><span class="na">chatLanguageModel</span><span class="o">(</span><span class="n">model</span><span class="o">)</span>
    <span class="o">.</span><span class="na">build</span><span class="o">();</span>

<span class="nc">String</span> <span class="n">result</span> <span class="o">=</span> <span class="n">assistant</span><span class="o">.</span><span class="na">translate</span><span class="o">(</span><span class="s">"The quick brown fox jumps over the lazy dog"</span><span class="o">,</span> <span class="s">"Spanish"</span><span class="o">);</span>
<span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="n">result</span><span class="o">);</span>
</code></pre></div></div>
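
<p>To make the "interface in, implementation out" idea concrete, here is a minimal sketch of how a declared interface can be implemented at runtime with a JDK dynamic proxy. It is an illustration of the general technique only, not LangChain4j's actual implementation: the <code class="language-plaintext highlighter-rouge">SystemPrompt</code> annotation, the <code class="language-plaintext highlighter-rouge">Translator</code> interface, and the stubbed model function are all hypothetical names invented for this example.</p>

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.function.Function;

// Hypothetical stand-in for a framework annotation; not LangChain4j's @SystemMessage.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface SystemPrompt {
    String value();
}

// The "service contract": an annotated interface with no hand-written implementation.
interface Translator {
    @SystemPrompt("Translate the text into {{language}}.")
    String translate(String text, String language);
}

final class DeclarativeServiceSketch {

    // Creates a runtime implementation of the contract. The model is stubbed as a
    // plain function (assumption) so the sketch runs without any network access.
    @SuppressWarnings("unchecked")
    static <T> T create(Class<T> contract, Function<String, String> fakeModel) {
        InvocationHandler handler = (proxy, method, args) -> {
            SystemPrompt prompt = method.getAnnotation(SystemPrompt.class);
            if (prompt == null) {
                throw new UnsupportedOperationException("No prompt on " + method.getName());
            }
            // Fill the single template variable from the second argument,
            // then append the user text taken from the first argument.
            String system = prompt.value().replace("{{language}}", String.valueOf(args[1]));
            return fakeModel.apply(system + "\n" + args[0]);
        };
        return (T) Proxy.newProxyInstance(
                contract.getClassLoader(), new Class<?>[] {contract}, handler);
    }

    public static void main(String[] args) {
        // The fake model simply echoes the prompt it received.
        Translator translator = create(Translator.class, prompt -> "[model saw] " + prompt);
        System.out.println(translator.translate("Hello", "Spanish"));
    }
}
```

<p>The calling code only ever sees the interface, which is why services written this way are easy to mock in unit tests: swap the proxy for any other implementation of the same contract.</p>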

<h4 id="memory-and-streaming">Memory and Streaming</h4>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">dev.langchain4j.memory.chat.MessageWindowChatMemory</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">dev.langchain4j.service.MemoryId</span><span class="o">;</span>

<span class="kd">interface</span> <span class="nc">ConversationalAssistant</span> <span class="o">{</span>
    <span class="nd">@SystemMessage</span><span class="o">(</span><span class="s">"You are a helpful assistant."</span><span class="o">)</span>
    <span class="nc">String</span> <span class="nf">chat</span><span class="o">(</span><span class="nd">@MemoryId</span> <span class="nc">String</span> <span class="n">userId</span><span class="o">,</span> <span class="nd">@UserMessage</span> <span class="nc">String</span> <span class="n">message</span><span class="o">);</span>
<span class="o">}</span>

<span class="nc">ConversationalAssistant</span> <span class="n">assistant</span> <span class="o">=</span> <span class="nc">AiServices</span><span class="o">.</span><span class="na">builder</span><span class="o">(</span><span class="nc">ConversationalAssistant</span><span class="o">.</span><span class="na">class</span><span class="o">)</span>
    <span class="o">.</span><span class="na">chatLanguageModel</span><span class="o">(</span><span class="n">model</span><span class="o">)</span>
    <span class="o">.</span><span class="na">chatMemoryProvider</span><span class="o">(</span><span class="n">memoryId</span> <span class="o">-&gt;</span> <span class="nc">MessageWindowChatMemory</span><span class="o">.</span><span class="na">withMaxMessages</span><span class="o">(</span><span class="mi">20</span><span class="o">))</span>
    <span class="o">.</span><span class="na">build</span><span class="o">();</span>

<span class="c1">// Each userId gets its own isolated memory</span>
<span class="n">assistant</span><span class="o">.</span><span class="na">chat</span><span class="o">(</span><span class="s">"user-42"</span><span class="o">,</span> <span class="s">"My name is Alice."</span><span class="o">);</span>
<span class="nc">String</span> <span class="n">response</span> <span class="o">=</span> <span class="n">assistant</span><span class="o">.</span><span class="na">chat</span><span class="o">(</span><span class="s">"user-42"</span><span class="o">,</span> <span class="s">"What's my name?"</span><span class="o">);</span>
<span class="c1">// Returns: "Your name is Alice."</span>
</code></pre></div></div>
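
<p>The two ideas in that snippet, one memory per ID and a cap on retained messages, can be sketched in plain Java. This is a simplified model under stated assumptions, not how <code class="language-plaintext highlighter-rouge">MessageWindowChatMemory</code> is implemented: messages are plain strings, and the class name <code class="language-plaintext highlighter-rouge">WindowedMemorySketch</code> is hypothetical.</p>

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: one sliding window of messages per memory ID.
final class WindowedMemorySketch {
    private final int maxMessages;
    private final Map<String, Deque<String>> historyById = new HashMap<>();

    WindowedMemorySketch(int maxMessages) {
        if (maxMessages < 1) {
            throw new IllegalArgumentException("maxMessages must be at least 1");
        }
        this.maxMessages = maxMessages;
    }

    // Appends a message to the window for this ID, evicting the oldest if full.
    void add(String memoryId, String message) {
        Deque<String> window = historyById.computeIfAbsent(memoryId, id -> new ArrayDeque<>());
        window.addLast(message);
        while (window.size() > maxMessages) {
            window.removeFirst();
        }
    }

    // Returns an immutable snapshot, oldest first; empty for an unknown ID.
    List<String> messages(String memoryId) {
        return List.copyOf(historyById.getOrDefault(memoryId, new ArrayDeque<>()));
    }

    public static void main(String[] args) {
        WindowedMemorySketch memory = new WindowedMemorySketch(2);
        memory.add("user-42", "My name is Alice.");
        memory.add("user-42", "I live in Madrid.");
        memory.add("user-42", "What's my name?");
        memory.add("user-7", "Hello!");
        System.out.println(memory.messages("user-42"));
        System.out.println(memory.messages("user-7"));
    }
}
```

<p>Note the trade-off the window introduces: with a cap of two, the first message for <code class="language-plaintext highlighter-rouge">user-42</code> has already been evicted by the time the third arrives, so the window size has to be large enough for the context a conversation needs.</p>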

<h4 id="rag-pipeline">RAG Pipeline</h4>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">dev.langchain4j.data.document.loader.UrlDocumentLoader</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">dev.langchain4j.data.document.parser.TextDocumentParser</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">dev.langchain4j.data.document.splitter.DocumentSplitters</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">dev.langchain4j.data.segment.TextSegment</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">dev.langchain4j.model.openai.OpenAiEmbeddingModel</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">dev.langchain4j.store.embedding.EmbeddingStoreIngestor</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore</span><span class="o">;</span>

<span class="c1">// Load one document, then split, embed, and store it in a single ingestion step</span>
<span class="kt">var</span> <span class="n">document</span> <span class="o">=</span> <span class="nc">UrlDocumentLoader</span><span class="o">.</span><span class="na">load</span><span class="o">(</span><span class="s">"https://example.com/docs"</span><span class="o">,</span> <span class="k">new</span> <span class="nc">TextDocumentParser</span><span class="o">());</span>

<span class="kt">var</span> <span class="n">embeddingModel</span> <span class="o">=</span> <span class="nc">OpenAiEmbeddingModel</span><span class="o">.</span><span class="na">withApiKey</span><span class="o">(</span><span class="nc">System</span><span class="o">.</span><span class="na">getenv</span><span class="o">(</span><span class="s">"OPENAI_API_KEY"</span><span class="o">));</span>
<span class="kt">var</span> <span class="n">embeddingStore</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">InMemoryEmbeddingStore</span><span class="o">&lt;</span><span class="nc">TextSegment</span><span class="o">&gt;();</span>

<span class="nc">EmbeddingStoreIngestor</span><span class="o">.</span><span class="na">builder</span><span class="o">()</span>
    <span class="o">.</span><span class="na">documentSplitter</span><span class="o">(</span><span class="nc">DocumentSplitters</span><span class="o">.</span><span class="na">recursive</span><span class="o">(</span><span class="mi">500</span><span class="o">,</span> <span class="mi">50</span><span class="o">))</span>
    <span class="o">.</span><span class="na">embeddingModel</span><span class="o">(</span><span class="n">embeddingModel</span><span class="o">)</span>
    <span class="o">.</span><span class="na">embeddingStore</span><span class="o">(</span><span class="n">embeddingStore</span><span class="o">)</span>
    <span class="o">.</span><span class="na">build</span><span class="o">()</span>
    <span class="o">.</span><span class="na">ingest</span><span class="o">(</span><span class="n">document</span><span class="o">);</span>

<span class="c1">// Build RAG-enabled assistant</span>
<span class="kd">interface</span> <span class="nc">DocsAssistant</span> <span class="o">{</span>
    <span class="nc">String</span> <span class="nf">answer</span><span class="o">(</span><span class="nd">@UserMessage</span> <span class="nc">String</span> <span class="n">question</span><span class="o">);</span>
<span class="o">}</span>

<span class="kt">var</span> <span class="n">retriever</span> <span class="o">=</span> <span class="nc">EmbeddingStoreContentRetriever</span><span class="o">.</span><span class="na">builder</span><span class="o">()</span>
    <span class="o">.</span><span class="na">embeddingStore</span><span class="o">(</span><span class="n">embeddingStore</span><span class="o">)</span>
    <span class="o">.</span><span class="na">embeddingModel</span><span class="o">(</span><span class="n">embeddingModel</span><span class="o">)</span>
    <span class="o">.</span><span class="na">maxResults</span><span class="o">(</span><span class="mi">3</span><span class="o">)</span>
    <span class="o">.</span><span class="na">build</span><span class="o">();</span>

<span class="nc">DocsAssistant</span> <span class="n">assistant</span> <span class="o">=</span> <span class="nc">AiServices</span><span class="o">.</span><span class="na">builder</span><span class="o">(</span><span class="nc">DocsAssistant</span><span class="o">.</span><span class="na">class</span><span class="o">)</span>
    <span class="o">.</span><span class="na">chatLanguageModel</span><span class="o">(</span><span class="n">model</span><span class="o">)</span>
    <span class="o">.</span><span class="na">contentRetriever</span><span class="o">(</span><span class="n">retriever</span><span class="o">)</span>
    <span class="o">.</span><span class="na">build</span><span class="o">();</span>
</code></pre></div></div>
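
<p>Conceptually, the retrieval step boils down to "embed the question, then return the closest stored segments". The sketch below shows that idea with cosine similarity over hand-written vectors. It is an assumption-laden illustration rather than LangChain4j's retriever: the class <code class="language-plaintext highlighter-rouge">TopKRetrievalSketch</code> is hypothetical, the vectors are made up instead of coming from an embedding model, and all vectors are assumed to be non-zero and of equal length.</p>

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of top-k retrieval over an in-memory vector store.
final class TopKRetrievalSketch {

    private static final class Entry {
        final String text;
        final double[] vector;

        Entry(String text, double[] vector) {
            this.text = text;
            this.vector = vector;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void add(String text, double[] vector) {
        entries.add(new Entry(text, vector));
    }

    // Cosine similarity; assumes equal-length, non-zero vectors.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the texts of the maxResults entries most similar to the query.
    List<String> retrieve(double[] query, int maxResults) {
        return entries.stream()
                .sorted(Comparator.comparingDouble((Entry e) -> cosine(e.vector, query)).reversed())
                .limit(maxResults)
                .map(e -> e.text)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        TopKRetrievalSketch store = new TopKRetrievalSketch();
        store.add("Cats are small felines.", new double[] {1.0, 0.0});
        store.add("Dogs are loyal companions.", new double[] {0.9, 0.1});
        store.add("Tax returns are due in April.", new double[] {0.0, 1.0});
        System.out.println(store.retrieve(new double[] {1.0, 0.0}, 2));
    }
}
```

<p>The retrieved segments are what gets added to the prompt before the model is called, so <code class="language-plaintext highlighter-rouge">maxResults</code> is a direct trade-off between context quality and token cost.</p>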

<h4 id="two-abstraction-levels">Two Abstraction Levels</h4>

<p>LangChain4j explicitly offers two levels:</p>
<ul>
  <li><strong>Low level</strong> — <code class="language-plaintext highlighter-rouge">ChatModel</code>, <code class="language-plaintext highlighter-rouge">UserMessage</code>, <code class="language-plaintext highlighter-rouge">AiMessage</code>, <code class="language-plaintext highlighter-rouge">EmbeddingStore</code>: full control, more code.</li>
  <li><strong>High level</strong> — <code class="language-plaintext highlighter-rouge">AiServices</code>: declarative interfaces, minimal boilerplate.</li>
</ul>

<p>Genkit Java arrives at a similar split by different means. Where Genkit gives you flows and agents as pipeline concepts, LangChain4j uses interface-based AI Services as its high-level abstraction, which is very idiomatic in Java terms.</p>

<h4 id="pros-and-cons-2">Pros and Cons</h4>

<table>
  <thead>
    <tr>
      <th>✅ Pros</th>
      <th>❌ Cons</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>AI Services pattern is uniquely Java-idiomatic</td>
      <td>No built-in Dev UI or trace explorer</td>
    </tr>
    <tr>
      <td>Largest integration ecosystem (20+ models, 30+ stores)</td>
      <td>Observability requires external tooling (no Micrometer by default)</td>
    </tr>
    <tr>
      <td>Two clear abstraction levels (low and high)</td>
      <td>Agent capabilities still maturing (2026)</td>
    </tr>
    <tr>
      <td>Spring Boot, Quarkus, Micronaut, Helidon integrations</td>
      <td>Large number of modules can be overwhelming</td>
    </tr>
    <tr>
      <td>Kotlin coroutine support</td>
      <td>Less opinionated, more choices to make yourself</td>
    </tr>
    <tr>
      <td>Strong RAG tooling out of the box</td>
      <td> </td>
    </tr>
  </tbody>
</table>

<hr />

<h2 id="google-adk-java">Google ADK Java</h2>

<h3 id="history-and-direction-3">History and Direction</h3>

<p>Google ADK (Agent Development Kit) launched in 2025 as a Python-first agent framework targeting enterprise deployments on Google Cloud. Java was a late addition to the multi-language roadmap, with <strong>ADK Java 1.0</strong> shipping in early 2026 alongside ADK Go 1.0. The Java SDK arrival was significant: it signaled that Google views ADK as a serious enterprise runtime, not just a Python scripting tool.</p>

<p>ADK Java follows the same design philosophy as the Python SDK: everything is an agent, workflow, or tool. The framework is optimized for building reliable, evaluatable, production-grade multi-agent systems and deploying them to Google Cloud infrastructure, primarily <strong>Vertex AI Agent Engine</strong>, Cloud Run, and GKE. Like its Python counterpart, ADK Java carries the weight of Google Cloud gravity. The best developer experience, the smoothest deployment path, and the most mature observability story all assume you are running on GCP.</p>

<p>ADK Java 1.0 includes the full agent runtime (LLM agents, sequential/loop/parallel workflow agents), tool calling, MCP support, A2A (Agent-to-Agent) protocol, session/memory management, and streaming. The Java API closely mirrors the Python API in structure, which means the mental model transfers well, but also means the Java SDK carries a style that reflects Python-first design decisions.</p>

<h3 id="adk-javas-position-agent-only-enterprise-grade">ADK Java’s Position: Agent-Only, Enterprise-Grade</h3>

<p>Like its Python counterpart, ADK Java is an <strong>agent framework</strong>: it has no vanilla generation primitive or flow abstraction outside the agent model. Its raison d’être is spinning up reliable, evaluatable agents and deploying them at enterprise scale. If you are building a multi-agent system on Google Cloud and Java is your language of choice, ADK Java 1.0 is Google’s recommended path.</p>

<p><strong>Supported languages:</strong> Java (with ADK Java 1.0). Also: Python (primary), TypeScript, Go.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">com.google.adk.agents.LlmAgent</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">com.google.adk.tools.GoogleSearchTool</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">com.google.adk.runner.InMemoryRunner</span><span class="o">;</span>
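<span class="kn">import</span> <span class="nn">com.google.genai.types.Content</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">com.google.genai.types.Part</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">java.util.List</span><span class="o">;</span>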

<span class="kt">var</span> <span class="n">researchAgent</span> <span class="o">=</span> <span class="nc">LlmAgent</span><span class="o">.</span><span class="na">builder</span><span class="o">()</span>
    <span class="o">.</span><span class="na">name</span><span class="o">(</span><span class="s">"researcher"</span><span class="o">)</span>
    <span class="o">.</span><span class="na">model</span><span class="o">(</span><span class="s">"gemini-flash-latest"</span><span class="o">)</span>
    <span class="o">.</span><span class="na">instruction</span><span class="o">(</span><span class="s">"You help users research topics thoroughly and accurately."</span><span class="o">)</span>
    <span class="o">.</span><span class="na">tools</span><span class="o">(</span><span class="nc">List</span><span class="o">.</span><span class="na">of</span><span class="o">(</span><span class="k">new</span> <span class="nc">GoogleSearchTool</span><span class="o">()))</span>
    <span class="o">.</span><span class="na">build</span><span class="o">();</span>

<span class="kt">var</span> <span class="n">runner</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">InMemoryRunner</span><span class="o">(</span><span class="n">researchAgent</span><span class="o">);</span>

<span class="kt">var</span> <span class="n">session</span> <span class="o">=</span> <span class="n">runner</span><span class="o">.</span><span class="na">sessionService</span><span class="o">().</span><span class="na">createSession</span><span class="o">(</span>
    <span class="n">researchAgent</span><span class="o">.</span><span class="na">name</span><span class="o">(),</span> <span class="s">"user-1"</span>
<span class="o">).</span><span class="na">blockingGet</span><span class="o">();</span>

<span class="kt">var</span> <span class="n">userMessage</span> <span class="o">=</span> <span class="nc">Content</span><span class="o">.</span><span class="na">fromParts</span><span class="o">(</span><span class="nc">Part</span><span class="o">.</span><span class="na">fromText</span><span class="o">(</span>
    <span class="s">"What are the latest developments in fusion energy?"</span>
<span class="o">));</span>

<span class="c1">// runAsync expects the user ID the session was created for, not the agent name</span>
<span class="n">runner</span><span class="o">.</span><span class="na">runAsync</span><span class="o">(</span><span class="s">"user-1"</span><span class="o">,</span> <span class="n">session</span><span class="o">.</span><span class="na">id</span><span class="o">(),</span> <span class="n">userMessage</span><span class="o">)</span>
    <span class="o">.</span><span class="na">blockingForEach</span><span class="o">(</span><span class="n">event</span> <span class="o">-&gt;</span> <span class="o">{</span>
        <span class="k">if</span> <span class="o">(</span><span class="n">event</span><span class="o">.</span><span class="na">finalResponse</span><span class="o">())</span> <span class="o">{</span>
            <span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="n">event</span><span class="o">.</span><span class="na">stringifyContent</span><span class="o">());</span>
        <span class="o">}</span>
    <span class="o">});</span>
</code></pre></div></div>

<h4 id="multi-agent-orchestration">Multi-Agent Orchestration</h4>

<p>ADK Java’s multi-agent capabilities match the Python SDK’s, including sequential, parallel, and loop orchestration.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">com.google.adk.agents.SequentialAgent</span><span class="o">;</span>

<span class="kt">var</span> <span class="n">researcher</span> <span class="o">=</span> <span class="nc">LlmAgent</span><span class="o">.</span><span class="na">builder</span><span class="o">()</span>
    <span class="o">.</span><span class="na">name</span><span class="o">(</span><span class="s">"researcher"</span><span class="o">)</span>
    <span class="o">.</span><span class="na">model</span><span class="o">(</span><span class="s">"gemini-flash-latest"</span><span class="o">)</span>
    <span class="o">.</span><span class="na">instruction</span><span class="o">(</span><span class="s">"Research the given topic and provide key facts."</span><span class="o">)</span>
    <span class="o">.</span><span class="na">build</span><span class="o">();</span>

<span class="kt">var</span> <span class="n">writer</span> <span class="o">=</span> <span class="nc">LlmAgent</span><span class="o">.</span><span class="na">builder</span><span class="o">()</span>
    <span class="o">.</span><span class="na">name</span><span class="o">(</span><span class="s">"writer"</span><span class="o">)</span>
    <span class="o">.</span><span class="na">model</span><span class="o">(</span><span class="s">"gemini-flash-latest"</span><span class="o">)</span>
    <span class="o">.</span><span class="na">instruction</span><span class="o">(</span><span class="s">"Write a clear, well-structured article from the research provided."</span><span class="o">)</span>
    <span class="o">.</span><span class="na">build</span><span class="o">();</span>

<span class="kt">var</span> <span class="n">editor</span> <span class="o">=</span> <span class="nc">LlmAgent</span><span class="o">.</span><span class="na">builder</span><span class="o">()</span>
    <span class="o">.</span><span class="na">name</span><span class="o">(</span><span class="s">"editor"</span><span class="o">)</span>
    <span class="o">.</span><span class="na">model</span><span class="o">(</span><span class="s">"gemini-flash-latest"</span><span class="o">)</span>
    <span class="o">.</span><span class="na">instruction</span><span class="o">(</span><span class="s">"Polish and format the article for publication."</span><span class="o">)</span>
    <span class="o">.</span><span class="na">build</span><span class="o">();</span>

<span class="kt">var</span> <span class="n">pipeline</span> <span class="o">=</span> <span class="nc">SequentialAgent</span><span class="o">.</span><span class="na">builder</span><span class="o">()</span>
    <span class="o">.</span><span class="na">name</span><span class="o">(</span><span class="s">"contentPipeline"</span><span class="o">)</span>
    <span class="o">.</span><span class="na">subAgents</span><span class="o">(</span><span class="nc">List</span><span class="o">.</span><span class="na">of</span><span class="o">(</span><span class="n">researcher</span><span class="o">,</span> <span class="n">writer</span><span class="o">,</span> <span class="n">editor</span><span class="o">))</span>
    <span class="o">.</span><span class="na">build</span><span class="o">();</span>
</code></pre></div></div>
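
<p>The control flow behind a sequential pipeline can be sketched without any framework: run the stages in order and hand each one the result of the previous stage. This is a deliberately simplified model, assuming each stage is a plain string-to-string function; it ignores how ADK actually shares session context between sub-agents, and <code class="language-plaintext highlighter-rouge">SequentialPipelineSketch</code> is a hypothetical name.</p>

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical sketch of sequential orchestration with stubbed "agents".
final class SequentialPipelineSketch {

    // Runs every stage in order, feeding each stage the previous stage's output.
    static String run(List<UnaryOperator<String>> stages, String input) {
        String current = input;
        for (UnaryOperator<String> stage : stages) {
            current = stage.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        // Each stub wraps its input so the order of execution is visible.
        UnaryOperator<String> researcher = topic -> "facts(" + topic + ")";
        UnaryOperator<String> writer = facts -> "draft(" + facts + ")";
        UnaryOperator<String> editor = draft -> "final(" + draft + ")";

        System.out.println(run(List.of(researcher, writer, editor), "fusion energy"));
    }
}
```

<p>Parallel and loop orchestration differ only in the driver: a parallel driver would fan the same input out to every stage, and a loop driver would repeat the stages until a stop condition is met.</p>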

<h4 id="vertex-ai-lock-in">Vertex AI Lock-In</h4>

<p>ADK Java’s production deployment story is built around <strong>Vertex AI Agent Engine</strong> and Google Cloud. While you can run ADK Java locally (via the ADK CLI or directly) and deploy to Cloud Run or GKE independently, the managed evaluation tools, performance dashboards, and enterprise support all assume GCP. This is the clearest example in the Java AI space of a framework built to serve a platform rather than being platform-neutral.</p>

<h4 id="pros-and-cons-3">Pros and Cons</h4>

<table>
  <thead>
    <tr>
      <th>✅ Pros</th>
      <th>❌ Cons</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Official Google support with production SLA</td>
      <td>Tightly coupled to Vertex AI and GCP</td>
    </tr>
    <tr>
      <td>Best multi-agent orchestration in Java</td>
      <td>Agent-only framework, no vanilla generation or flows</td>
    </tr>
    <tr>
      <td>A2A protocol for agent interoperability</td>
      <td>Python-first design reflected in Java API style</td>
    </tr>
    <tr>
      <td>Full evaluation tools (user simulation, custom metrics)</td>
      <td>Requires GCP for full observability and deployment features</td>
    </tr>
    <tr>
      <td>Scales to enterprise on Google Cloud</td>
      <td>Youngest Java SDK (1.0 released 2026)</td>
    </tr>
    <tr>
      <td>Streaming support (Gemini Live API)</td>
      <td> </td>
    </tr>
  </tbody>
</table>

<hr />

<h2 id="head-to-head-comparison">Head-to-Head Comparison</h2>

<h3 id="developer-experience">Developer Experience</h3>

<table>
  <thead>
    <tr>
      <th>Framework</th>
      <th>DX Highlights</th>
      <th>Shortcomings</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Genkit Java</strong></td>
      <td>Dev UI for local tracing is unmatched. Idiomatic Java builder API.</td>
      <td>GitHub Packages auth friction; unofficial status</td>
    </tr>
    <tr>
      <td><strong>Spring AI</strong></td>
      <td>Feels native to any Spring Boot codebase. Zero-surprise API.</td>
      <td>No visual Dev UI; observability via Micrometer only</td>
    </tr>
    <tr>
      <td><strong>LangChain4j</strong></td>
      <td>AI Services pattern is the cleanest Java-native AI abstraction</td>
      <td>No Dev UI; agent features still maturing</td>
    </tr>
    <tr>
      <td><strong>ADK Java</strong></td>
      <td>Powerful multi-agent tooling. Official Google support.</td>
      <td>GCP-centric; Python-first style reflected in the Java API</td>
    </tr>
  </tbody>
</table>

<h3 id="abstraction-levels">Abstraction Levels</h3>

<p>Of the four frameworks compared here, Genkit Java is the only one that provides all three levels: <strong>vanilla generation</strong>, <strong>typed flows (pipelines)</strong>, and <strong>agents</strong>. Spring AI covers generation and a basic agent model via tools, but lacks a flow abstraction. LangChain4j provides two levels (low-level primitives and high-level AI Services) but is agent/service focused. ADK Java is agent-only.</p>

<h3 id="observability-1">Observability</h3>

<table>
  <thead>
    <tr>
      <th>Framework</th>
      <th>Local Dev</th>
      <th>Production</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Genkit Java</strong></td>
      <td>Dev UI with trace explorer</td>
      <td>OTEL-compatible export</td>
    </tr>
    <tr>
      <td><strong>Spring AI</strong></td>
      <td>Logs and Actuator endpoints</td>
      <td>Micrometer (Prometheus, Grafana, Datadog)</td>
    </tr>
    <tr>
      <td><strong>LangChain4j</strong></td>
      <td>Logging only</td>
      <td>Manual OTEL setup</td>
    </tr>
    <tr>
      <td><strong>ADK Java</strong></td>
      <td>ADK Web UI</td>
      <td>Cloud Trace + Vertex (GCP)</td>
    </tr>
  </tbody>
</table>

<h3 id="framework-neutrality">Framework Neutrality</h3>

<p>Genkit Java and LangChain4j are built to be provider-neutral: they support the major model providers and are not tied to any particular deployment infrastructure. Spring AI is similarly neutral on model providers, though it carries Spring’s opinionated application framework as a dependency, a worthwhile trade for most Java shops. ADK Java carries the heaviest platform dependency: its full value is unlocked on Google Cloud.</p>

<h3 id="java-ecosystem-fit">Java Ecosystem Fit</h3>

<table>
  <thead>
    <tr>
      <th>Framework</th>
      <th>Spring Boot</th>
      <th>Quarkus</th>
      <th>Micronaut</th>
      <th>Native Image</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Genkit Java</strong></td>
      <td>✅ Plugin</td>
      <td>❌</td>
      <td>❌</td>
      <td>❌</td>
    </tr>
    <tr>
      <td><strong>Spring AI</strong></td>
      <td>✅ Native</td>
      <td>❌</td>
      <td>❌</td>
      <td>✅ GraalVM</td>
    </tr>
    <tr>
      <td><strong>LangChain4j</strong></td>
      <td>✅ Module</td>
      <td>✅ Extension</td>
      <td>✅ Module</td>
      <td>Partial</td>
    </tr>
    <tr>
      <td><strong>ADK Java</strong></td>
      <td>❌</td>
      <td>❌</td>
      <td>❌</td>
      <td>❌</td>
    </tr>
  </tbody>
</table>

<hr />

<h2 id="which-framework-should-you-choose">Which Framework Should You Choose?</h2>

<p><strong>Choose Genkit Java if:</strong></p>
<ul>
  <li>You want to iterate on your AI fast and get feedback with less back and forth — Genkit was built from the ground up for powerful local tooling and observability, and the Dev UI is genuinely transformative.</li>
  <li>You need multiple abstraction levels (vanilla calls, typed flows, and agents) in one SDK.</li>
  <li>Provider neutrality matters: you need to swap or mix Gemini, Claude, OpenAI, and Bedrock.</li>
  <li>Your team also writes TypeScript and wants a consistent framework story across both stacks.</li>
</ul>

<p><strong>Choose Spring AI if:</strong></p>
<ul>
  <li>You are already running Spring Boot and want AI to feel like any other Spring integration.</li>
  <li>Micrometer-native metrics and traces plugging into your existing Prometheus/Grafana stack are a priority.</li>
  <li>You need the broadest model and vector store coverage with production-grade auto-configuration.</li>
  <li>GraalVM native images are a requirement for your deployment targets.</li>
</ul>

<p><strong>Choose LangChain4j if:</strong></p>
<ul>
  <li>You want the most Java-idiomatic high-level AI abstraction: interface-based AI Services with annotations.</li>
  <li>You need the largest integration ecosystem and don’t want to be tied to any application framework.</li>
  <li>Your team works across Spring Boot, Quarkus, Micronaut, and Helidon; LangChain4j is the most framework-agnostic of the four.</li>
  <li>RAG pipelines with rich document ingestion and retrieval are a core use case.</li>
</ul>

<p><strong>Choose ADK Java if:</strong></p>
<ul>
  <li>You are building enterprise-grade multi-agent systems and Google Cloud is your runtime.</li>
  <li>You need official Google support and SLA-backed infrastructure for agent deployment.</li>
  <li>Multi-agent orchestration (sequential, parallel, loop) and the A2A interoperability protocol matter.</li>
  <li>Your team is already using the ADK Python SDK and wants to extend to Java services.</li>
</ul>

<hr />

<h2 id="conclusion">Conclusion</h2>

<p>Java’s AI framework landscape in 2026 is surprisingly rich. The four frameworks covered here serve genuinely different needs, and unlike in the JavaScript world where Genkit, Vercel, Mastra, LangChain, and ADK overlap significantly, the Java options each occupy a clearer niche.</p>

<p>For <strong>enterprise Spring Boot teams</strong>, Spring AI is the obvious choice: zero friction, production-ready observability via Micrometer, and the broadest integration matrix. For <strong>teams that value developer experience above all</strong>, Genkit Java’s Dev UI is a category apart and worth the unofficial status trade-off. For <strong>framework-agnostic Java developers</strong> who want the most idiomatic Java AI service abstraction, LangChain4j’s AI Services pattern is hard to beat. And for <strong>Google Cloud enterprise workloads</strong> that need reliable multi-agent orchestration at scale, ADK Java 1.0 is where Google is putting its weight.</p>

<p>The most important thing is that you no longer have an excuse to reach for Python just because it has better AI tooling. Java’s time in generative AI has arrived.</p>

<hr />

<p><em>Last updated: April 2026. Framework versions referenced: Genkit Java 1.0.0-SNAPSHOT, Spring AI 1.x, LangChain4j 0.36.x, Google ADK Java 1.0.</em></p>]]></content><author><name>Xavier Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><category term="genkit" /><category term="genkit" /><summary type="html"><![CDATA[A practical, in-depth comparison of the top Generative AI frameworks for Java in 2026: Genkit Java, Spring AI, LangChain4j, and Google ADK (English)]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://xavidop.me/assets/img/blog/post-headers/top-java-genai-frameworks-2026.png" /><media:content medium="image" url="https://xavidop.me/assets/img/blog/post-headers/top-java-genai-frameworks-2026.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry xml:lang="en"><title type="html">Top JavaScript/TypeScript Gen AI Frameworks for 2026: A Hands-On Comparison</title><link href="https://xavidop.me/genkit/2026-04-16-top-jsts-genai-frameworks-2026/" rel="alternate" type="text/html" title="Top JavaScript/TypeScript Gen AI Frameworks for 2026: A Hands-On Comparison" /><published>2026-04-16T00:00:00+00:00</published><updated>2026-05-06T04:23:37+00:00</updated><id>https://xavidop.me/genkit/top-jsts-genai-frameworks-2026</id><content type="html" xml:base="https://xavidop.me/genkit/2026-04-16-top-jsts-genai-frameworks-2026/"><![CDATA[<ol class="no_toc" id="markdown-toc">
  <li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li>
  <li><a href="#genkit" id="markdown-toc-genkit">Genkit</a>    <ol>
      <li><a href="#history-and-direction" id="markdown-toc-history-and-direction">History and Direction</a></li>
      <li><a href="#what-makes-genkit-stand-out" id="markdown-toc-what-makes-genkit-stand-out">What Makes Genkit Stand Out</a>        <ol>
          <li><a href="#flows--composable-typed-pipelines" id="markdown-toc-flows--composable-typed-pipelines">Flows — Composable, Typed Pipelines</a></li>
          <li><a href="#agent-abstractions" id="markdown-toc-agent-abstractions">Agent Abstractions</a></li>
          <li><a href="#the-dev-ui--where-genkit-truly-shines" id="markdown-toc-the-dev-ui--where-genkit-truly-shines">The Dev UI — Where Genkit Truly Shines</a></li>
          <li><a href="#broad-model-support--provider-neutral-by-design" id="markdown-toc-broad-model-support--provider-neutral-by-design">Broad Model Support — Provider Neutral by Design</a></li>
          <li><a href="#pros-and-cons" id="markdown-toc-pros-and-cons">Pros and Cons</a></li>
        </ol>
      </li>
    </ol>
  </li>
  <li><a href="#vercel-ai-sdk" id="markdown-toc-vercel-ai-sdk">Vercel AI SDK</a>    <ol>
      <li><a href="#history-and-direction-1" id="markdown-toc-history-and-direction-1">History and Direction</a></li>
      <li><a href="#what-makes-the-vercel-ai-sdk-stand-out" id="markdown-toc-what-makes-the-vercel-ai-sdk-stand-out">What Makes the Vercel AI SDK Stand Out</a>        <ol>
          <li><a href="#structured-generation-and-agent-patterns" id="markdown-toc-structured-generation-and-agent-patterns">Structured Generation and Agent Patterns</a></li>
          <li><a href="#genkit-vs-vercel-ai-sdk--abstraction-levels" id="markdown-toc-genkit-vs-vercel-ai-sdk--abstraction-levels">Genkit vs. Vercel AI SDK — Abstraction Levels</a></li>
          <li><a href="#pros-and-cons-1" id="markdown-toc-pros-and-cons-1">Pros and Cons</a></li>
        </ol>
      </li>
    </ol>
  </li>
  <li><a href="#mastra" id="markdown-toc-mastra">Mastra</a>    <ol>
      <li><a href="#history-and-direction-2" id="markdown-toc-history-and-direction-2">History and Direction</a></li>
      <li><a href="#what-makes-mastra-stand-out" id="markdown-toc-what-makes-mastra-stand-out">What Makes Mastra Stand Out</a>        <ol>
          <li><a href="#workflows" id="markdown-toc-workflows">Workflows</a></li>
          <li><a href="#pros-and-cons-2" id="markdown-toc-pros-and-cons-2">Pros and Cons</a></li>
        </ol>
      </li>
    </ol>
  </li>
  <li><a href="#langchain" id="markdown-toc-langchain">LangChain</a>    <ol>
      <li><a href="#history-and-direction-3" id="markdown-toc-history-and-direction-3">History and Direction</a></li>
      <li><a href="#langchains-position-agent-first-platform-tied" id="markdown-toc-langchains-position-agent-first-platform-tied">LangChain’s Position: Agent-First, Platform-Tied</a>        <ol>
          <li><a href="#langsmith-observability" id="markdown-toc-langsmith-observability">LangSmith Observability</a></li>
          <li><a href="#pros-and-cons-3" id="markdown-toc-pros-and-cons-3">Pros and Cons</a></li>
        </ol>
      </li>
    </ol>
  </li>
  <li><a href="#google-adk-agent-development-kit" id="markdown-toc-google-adk-agent-development-kit">Google ADK (Agent Development Kit)</a>    <ol>
      <li><a href="#history-and-direction-4" id="markdown-toc-history-and-direction-4">History and Direction</a></li>
      <li><a href="#adks-position-agent-first-enterprise-grade" id="markdown-toc-adks-position-agent-first-enterprise-grade">ADK’s Position: Agent-First, Enterprise-Grade</a>        <ol>
          <li><a href="#multi-agent-systems" id="markdown-toc-multi-agent-systems">Multi-Agent Systems</a></li>
          <li><a href="#vertex-ai-lock-in" id="markdown-toc-vertex-ai-lock-in">Vertex AI Lock-In</a></li>
          <li><a href="#pros-and-cons-4" id="markdown-toc-pros-and-cons-4">Pros and Cons</a></li>
        </ol>
      </li>
    </ol>
  </li>
  <li><a href="#head-to-head-comparison" id="markdown-toc-head-to-head-comparison">Head-to-Head Comparison</a>    <ol>
      <li><a href="#developer-experience" id="markdown-toc-developer-experience">Developer Experience</a></li>
      <li><a href="#abstraction-levels" id="markdown-toc-abstraction-levels">Abstraction Levels</a></li>
      <li><a href="#observability" id="markdown-toc-observability">Observability</a></li>
      <li><a href="#language-support" id="markdown-toc-language-support">Language Support</a></li>
      <li><a href="#framework-neutrality" id="markdown-toc-framework-neutrality">Framework Neutrality</a></li>
      <li><a href="#idiom-and-code-style" id="markdown-toc-idiom-and-code-style">Idiom and Code Style</a></li>
    </ol>
  </li>
  <li><a href="#which-framework-should-you-choose" id="markdown-toc-which-framework-should-you-choose">Which Framework Should You Choose?</a></li>
  <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ol>

<h2 id="introduction">Introduction</h2>

<p>The Generative AI tooling ecosystem has exploded over the past two years. What started as a handful of Python libraries has grown into a rich, opinionated landscape of frameworks spanning multiple languages, deployment targets, and design philosophies. As a developer who has shipped production applications using all five of the frameworks covered in this article, <strong>Genkit</strong>, <strong>Vercel AI SDK</strong>, <strong>Mastra</strong>, <strong>LangChain</strong>, and <strong>Google ADK</strong>, I want to offer a practical, hands-on view of where each one excels, where each one falls short, and what I would reach for depending on the project I’m building.</p>

<p>This is not a benchmark post. Tokens per second and latency numbers go stale within weeks. Instead, this is a developer experience and architecture comparison, the kind of thing that matters when you’re deciding what framework will carry your product through 2026 and beyond.</p>

<p>A quick note on scope: all five frameworks are in active development and moving fast. Code samples in this article use the APIs as of <strong>April 2026</strong>.</p>

<hr />

<h2 id="genkit">Genkit</h2>

<h3 id="history-and-direction">History and Direction</h3>

<p>Genkit was announced by Google at Google I/O 2024 as an open-source framework designed to bring production-ready AI tooling to full-stack developers, regardless of their cloud provider. At the time, the JavaScript/TypeScript ecosystem lacked a coherent story for building AI-powered features with the kind of developer ergonomics you’d expect from, say, a Next.js app. Firebase’s team set out to fix that, building Genkit not as a proprietary Firebase product but as a cloud-agnostic SDK with first-class support for plugins.</p>

<p>By mid-2024, Genkit had already attracted a community plugin ecosystem covering AWS Bedrock, Azure OpenAI, Ollama, Cohere, and a growing list of vector stores. The framework reached its 1.0 milestone in late 2024 and shipped major expansions in 2025, most notably adding Python (preview), Go, and Dart (preview) SDKs alongside the primary TypeScript runtime. This multi-language vision is central to Genkit’s story: it aspires to be the framework you reach for no matter what stack you’re running. As of 2026, the Dart SDK has matured notably, making Genkit one of the very few AI frameworks with meaningful <strong>Flutter</strong> support, giving mobile developers a first-class path into generative AI that no other framework on this list can match. Genkit also has an unofficial, community-maintained Java SDK, which has been used in production but is not officially supported by the Genkit team.</p>

<p>The team’s declared direction is to deepen Genkit’s role as a full-stack AI layer: strong observability primitives baked into the runtime, composable workflow abstractions (flows), and an expanding model plugin ecosystem. The ambition is not just to be a bridge to a single model provider but to be the connective tissue that lets you swap providers, mix modalities, and trace every hop in your pipeline, all from one coherent API. Adding more capabilities to the Dev UI is also a major focus, with the goal of making it the best local development experience for AI applications, regardless of where they deploy.</p>

<h3 id="what-makes-genkit-stand-out">What Makes Genkit Stand Out</h3>

<p>Genkit occupies a unique position among the frameworks in this comparison: it is the only one that provides <strong>multiple levels of abstraction</strong> in a single, coherent API. You can call a model directly (vanilla generation), compose steps into a typed flow, or wire up a fully autonomous agent, and you can mix all three in the same application. Most other frameworks force you to choose a lane.</p>

<p><strong>Supported languages:</strong> TypeScript/JavaScript (primary, stable), Python (preview), Go, Dart/Flutter (preview)</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">genkit</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkit</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">googleAI</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@genkit-ai/google-genai</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">ai</span> <span class="o">=</span> <span class="nf">genkit</span><span class="p">({</span> <span class="na">plugins</span><span class="p">:</span> <span class="p">[</span><span class="nf">googleAI</span><span class="p">()]</span> <span class="p">});</span>

<span class="c1">// Vanilla generation — no abstraction needed</span>
<span class="kd">const</span> <span class="p">{</span> <span class="nx">text</span> <span class="p">}</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
  <span class="na">model</span><span class="p">:</span> <span class="nx">googleAI</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">gemini-flash-latest</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">prompt</span><span class="p">:</span> <span class="dl">'</span><span class="s1">What is the capital of France?</span><span class="dl">'</span><span class="p">,</span>
<span class="p">});</span>
<span class="nx">console</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="nx">text</span><span class="p">);</span>
</code></pre></div></div>

<h4 id="flows--composable-typed-pipelines">Flows — Composable, Typed Pipelines</h4>

<p>Flows are Genkit’s first-class pipeline primitive. They are strongly typed, observable end-to-end, and automatically traced in the Dev UI. You define them once and can invoke them from CLI, HTTP, or the Dev UI without any extra scaffolding.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">genkit</span><span class="p">,</span> <span class="nx">z</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkit</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">googleAI</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@genkit-ai/google-genai</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">ai</span> <span class="o">=</span> <span class="nf">genkit</span><span class="p">({</span> <span class="na">plugins</span><span class="p">:</span> <span class="p">[</span><span class="nf">googleAI</span><span class="p">()]</span> <span class="p">});</span>

<span class="kd">const</span> <span class="nx">summarizeFlow</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">defineFlow</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">summarizeArticle</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">inputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span> <span class="na">url</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">url</span><span class="p">()</span> <span class="p">}),</span>
    <span class="na">outputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span> <span class="na">summary</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span> <span class="na">keyPoints</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">())</span> <span class="p">}),</span>
  <span class="p">},</span>
  <span class="k">async </span><span class="p">({</span> <span class="nx">url</span> <span class="p">})</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="p">{</span> <span class="nx">output</span> <span class="p">}</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
      <span class="na">model</span><span class="p">:</span> <span class="nx">googleAI</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">gemini-flash-latest</span><span class="dl">'</span><span class="p">),</span>
      <span class="na">prompt</span><span class="p">:</span> <span class="s2">`Summarize the article at </span><span class="p">${</span><span class="nx">url</span><span class="p">}</span><span class="s2"> and list the key points.`</span><span class="p">,</span>
      <span class="na">output</span><span class="p">:</span> <span class="p">{</span>
        <span class="na">schema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span> <span class="na">summary</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span> <span class="na">keyPoints</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">())</span> <span class="p">}),</span>
      <span class="p">},</span>
    <span class="p">});</span>
    <span class="k">return</span> <span class="nx">output</span><span class="o">!</span><span class="p">;</span>
  <span class="p">}</span>
<span class="p">);</span>
</code></pre></div></div>
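
<p>To make the “define once, invoke from anywhere” idea concrete, here is a dependency-free sketch of what a named, validated pipeline step boils down to. It is not Genkit’s implementation: <code class="language-plaintext highlighter-rouge">defineStep</code>, <code class="language-plaintext highlighter-rouge">runByName</code>, and the hand-written <code class="language-plaintext highlighter-rouge">parseInput</code> validator are hypothetical names standing in for <code class="language-plaintext highlighter-rouge">defineFlow</code>, the CLI/HTTP entry points, and a Zod schema, and the summarizer is a synchronous stub where a real flow would await a model call.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Hypothetical, dependency-free sketch; not Genkit's actual implementation.
type SummaryInput = { url: string };
type SummaryOutput = { summary: string; keyPoints: string[] };
type Step = (raw: unknown) => SummaryOutput;

// Registry of named steps, so a CLI or HTTP handler can look one up by name.
const registry: { [name: string]: Step } = {};

// Stand-in for a Zod input schema: rejects anything that is not { url: string }.
function parseInput(raw: unknown): SummaryInput {
  const candidate = raw as { url?: unknown };
  if (typeof raw !== 'object' || raw === null || typeof candidate.url !== 'string') {
    throw new Error('Invalid input: expected { url: string }');
  }
  return { url: candidate.url };
}

// Wraps a handler so every invocation validates its input first, then registers it by name.
function defineStep(name: string, run: (input: SummaryInput) => SummaryOutput): Step {
  const step: Step = (raw) => run(parseInput(raw));
  registry[name] = step;
  return step;
}

const summarizeStep = defineStep('summarizeArticle', ({ url }) => ({
  // A real flow would call a model here; this stub only echoes the URL.
  summary: 'Stub summary of ' + url,
  keyPoints: ['stub key point'],
}));

// Entry point 1: call it like a normal function from application code.
const direct = summarizeStep({ url: 'https://example.com/post' });

// Entry point 2: look it up by name with a JSON payload, as a CLI or HTTP layer would.
function runByName(name: string, json: string): SummaryOutput {
  const step = registry[name];
  if (step === undefined) throw new Error('Unknown step: ' + name);
  return step(JSON.parse(json));
}
const viaName = runByName('summarizeArticle', '{"url":"https://example.com/post"}');

console.log(direct.summary === viaName.summary);
</code></pre></div></div>

<p>Both entry points go through the same validation and the same handler, which is the property that lets a single flow definition back a function call, a command-line run, and an HTTP endpoint.</p>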

<h4 id="agent-abstractions">Agent Abstractions</h4>

<p>For agents, Genkit uses <code class="language-plaintext highlighter-rouge">definePrompt</code> with tools and a system prompt to define specialized agents, along with tool calling via <code class="language-plaintext highlighter-rouge">defineTool</code> and conversation memory, all integrated with the same tracing and observability infrastructure that flows use. The agent model is deliberate: it gives you control over how much autonomy you hand over to the model.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">genkit</span><span class="p">,</span> <span class="nx">z</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkit</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">googleAI</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@genkit-ai/google-genai</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">ai</span> <span class="o">=</span> <span class="nf">genkit</span><span class="p">({</span> <span class="na">plugins</span><span class="p">:</span> <span class="p">[</span><span class="nf">googleAI</span><span class="p">()]</span> <span class="p">});</span>

<span class="kd">const</span> <span class="nx">weatherTool</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">defineTool</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">getWeather</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">description</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Returns current weather conditions for a given city.</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">inputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span> <span class="na">city</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">()</span> <span class="p">}),</span>
    <span class="na">outputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span> <span class="na">temperature</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">number</span><span class="p">(),</span> <span class="na">condition</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">()</span> <span class="p">}),</span>
  <span class="p">},</span>
  <span class="k">async </span><span class="p">({</span> <span class="nx">city</span> <span class="p">})</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="c1">// Real implementation would call a weather API</span>
    <span class="k">return</span> <span class="p">{</span> <span class="na">temperature</span><span class="p">:</span> <span class="mi">22</span><span class="p">,</span> <span class="na">condition</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Sunny</span><span class="dl">'</span> <span class="p">};</span>
  <span class="p">}</span>
<span class="p">);</span>

<span class="kd">const</span> <span class="nx">travelAgent</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">definePrompt</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">travelAdvisor</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">description</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Travel Advisor can help with trip planning and weather-based advice</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">model</span><span class="p">:</span> <span class="nx">googleAI</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">gemini-flash-latest</span><span class="dl">'</span><span class="p">),</span>
    <span class="na">tools</span><span class="p">:</span> <span class="p">[</span><span class="nx">weatherTool</span><span class="p">],</span>
    <span class="na">system</span><span class="p">:</span> <span class="dl">'</span><span class="s1">You are a helpful travel advisor. Use available tools to give accurate advice.</span><span class="dl">'</span><span class="p">,</span>
  <span class="p">}</span>
<span class="p">);</span>

<span class="c1">// Start a chat session with the agent</span>
<span class="kd">const</span> <span class="nx">chat</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">chat</span><span class="p">(</span><span class="nx">travelAgent</span><span class="p">);</span>
<span class="kd">const</span> <span class="nx">response</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">chat</span><span class="p">.</span><span class="nf">send</span><span class="p">(</span><span class="dl">'</span><span class="s1">Should I pack a jacket for my trip to Lisbon?</span><span class="dl">'</span><span class="p">);</span>
<span class="nx">console</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="nx">response</span><span class="p">.</span><span class="nx">text</span><span class="p">);</span>
</code></pre></div></div>
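
<p>How much autonomy you hand over is easiest to see in the loop that sits behind any tool-calling agent: the model either answers or asks for a tool, and the runtime decides how many rounds it allows. Below is a hypothetical, dependency-free sketch of that loop with a scripted stand-in for the model. The names <code class="language-plaintext highlighter-rouge">runAgent</code>, <code class="language-plaintext highlighter-rouge">scriptedModel</code>, and <code class="language-plaintext highlighter-rouge">maxTurns</code> are illustrative and do not describe Genkit’s internals.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Hypothetical sketch of a bounded tool-calling loop; not Genkit's implementation.
type ToolCall = { tool: string; input: { city: string } };
type ModelTurn = { kind: 'tool'; call: ToolCall } | { kind: 'final'; text: string };
type Weather = { temperature: number; condition: string };

// Tools the agent is allowed to use, keyed by name.
const tools: { [name: string]: (input: { city: string }) => Weather } = {
  getWeather: () => ({ temperature: 22, condition: 'Sunny' }),
};

// Scripted stand-in for a model: asks for the weather once, then answers.
function scriptedModel(toolResults: Weather[]): ModelTurn {
  if (toolResults.length === 0) {
    return { kind: 'tool', call: { tool: 'getWeather', input: { city: 'Lisbon' } } };
  }
  const { temperature, condition } = toolResults[0];
  return { kind: 'final', text: `It is ${temperature} degrees and ${condition.toLowerCase()}.` };
}

// maxTurns caps how many model rounds the agent may take before the runtime stops it.
function runAgent(model: (results: Weather[]) => ModelTurn, maxTurns: number): string {
  const results: Weather[] = [];
  for (let remaining = maxTurns; remaining > 0; remaining--) {
    const step = model(results);
    if (step.kind === 'final') return step.text;
    const tool = tools[step.call.tool];
    if (tool === undefined) throw new Error('Unknown tool: ' + step.call.tool);
    results.push(tool(step.call.input));
  }
  throw new Error('Stopped after ' + maxTurns + ' turns without a final answer');
}

console.log(runAgent(scriptedModel, 3));
</code></pre></div></div>

<p>With a limit of three turns the scripted model gets one tool round and then answers; with a limit of one, the loop stops it before it can reply. Tightening or loosening that cap, and the list of tools, is the control the paragraph above refers to.</p>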

<h4 id="the-dev-ui--where-genkit-truly-shines">The Dev UI — Where Genkit Truly Shines</h4>

<p>The <strong>Genkit Developer UI</strong> is, frankly, the killer feature. No other framework in this comparison comes close to what Genkit offers locally. You launch it with a single command:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npx genkit start
</code></pre></div></div>

<p>The Dev UI gives you:</p>

<ul>
  <li><strong>Flow runner</strong> — execute any flow with a custom input, inspect the typed output, and view the full execution trace.</li>
  <li><strong>Model playground</strong> — invoke any registered model directly, tweak prompt templates, compare outputs.</li>
  <li><strong>Tool testing</strong> — stub and test individual tools in isolation before wiring them into an agent.</li>
  <li><strong>Trace explorer</strong> — every <code class="language-plaintext highlighter-rouge">generate</code>, <code class="language-plaintext highlighter-rouge">flow</code>, and <code class="language-plaintext highlighter-rouge">agent</code> call is traced with latency breakdowns, token counts, and the exact prompts and completions sent to the model. This is OpenTelemetry-compatible telemetry, exportable to Cloud Trace, Langfuse, or any OTEL collector.</li>
  <li><strong>Dotprompt editor</strong> — Genkit’s <code class="language-plaintext highlighter-rouge">.prompt</code> files (Dotprompt) are editable live in the UI, with real-time preview and variable injection.</li>
  <li><strong>Session replay</strong> — replay any traced session end-to-end to reproduce bugs without re-running the full application.</li>
</ul>
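
<p>The trace explorer is easier to reason about once you picture what a trace is: a list of named spans, each with a duration and a few attributes. The sketch below is a hypothetical, dependency-free illustration of that idea, not Genkit’s telemetry code. The <code class="language-plaintext highlighter-rouge">traced</code> helper, the <code class="language-plaintext highlighter-rouge">Span</code> type, and the step names are invented for the example, and the steps are synchronous stubs where real model calls would be asynchronous.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Hypothetical sketch of span recording; not Genkit's telemetry implementation.
type Span = {
  name: string;
  durationMs: number;
  attributes: { [key: string]: string | number };
};

const spans: Span[] = [];

// Runs one step, measures how long it took, and records a span even if the step throws.
function traced(name: string, attributes: Span['attributes'], step: () => string): string {
  const start = Date.now();
  try {
    return step();
  } finally {
    spans.push({ name, durationMs: Date.now() - start, attributes });
  }
}

// Two stub steps standing in for a retrieval call and a model call.
const context = traced('retrieve', { documents: 1 }, () => 'Paris is the capital of France.');
const answer = traced('generate', { promptChars: context.length }, () => 'Paris');

// A latency breakdown is then just a walk over the recorded spans.
for (const span of spans) {
  console.log(span.name + ': ' + span.durationMs + ' ms', span.attributes);
}
console.log('answer:', answer);
</code></pre></div></div>

<p>A real tracing backend adds parent/child relationships, timestamps, and export, but the core record per step is the same shape, which is why the same data can be shown locally or shipped to an OpenTelemetry collector.</p>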

<p>This local observability loop collapses what normally requires a deployed tracing backend (LangSmith, Langfuse, Weave) into a zero-config experience that runs entirely offline. For development speed, this is enormous.</p>

<p>Vercel’s Developer Tool, by comparison, is a lightweight panel primarily for inspecting HTTP streaming responses. It doesn’t offer flow visualization, trace exploration, or tool testing. It’s functional but basic, the kind of thing you’d expect as a starting point, not a full developer experience.</p>

<h4 id="broad-model-support--provider-neutral-by-design">Broad Model Support — Provider Neutral by Design</h4>

<p>Genkit ships official plugins for <strong>Google AI (Gemini)</strong>, <strong>Google Vertex AI</strong>, <strong>OpenAI</strong>, <strong>Anthropic Claude</strong>, <strong>Cohere</strong>, <strong>Mistral</strong>, <strong>Ollama</strong> (local models), <strong>AWS Bedrock</strong>, and more. The community has extended this to xAI, DeepSeek, Perplexity, and Azure OpenAI. Every model, regardless of provider, is accessed through the same <code class="language-plaintext highlighter-rouge">ai.generate()</code> interface, and every call is automatically traced.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">genkit</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkit</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">anthropic</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkitx-anthropic</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">openAI</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkitx-openai</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">ai</span> <span class="o">=</span> <span class="nf">genkit</span><span class="p">({</span> <span class="na">plugins</span><span class="p">:</span> <span class="p">[</span><span class="nf">anthropic</span><span class="p">(),</span> <span class="nf">openAI</span><span class="p">()]</span> <span class="p">});</span>

<span class="c1">// Switch between providers without changing downstream code</span>
<span class="kd">const</span> <span class="p">{</span> <span class="na">text</span><span class="p">:</span> <span class="nx">claudeResponse</span> <span class="p">}</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
  <span class="na">model</span><span class="p">:</span> <span class="nx">anthropic</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">claude-sonnet-4-5</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">prompt</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Explain transformer attention in one paragraph.</span><span class="dl">'</span><span class="p">,</span>
<span class="p">});</span>

<span class="kd">const</span> <span class="p">{</span> <span class="na">text</span><span class="p">:</span> <span class="nx">gptResponse</span> <span class="p">}</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
  <span class="na">model</span><span class="p">:</span> <span class="nx">openAI</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">gpt-4o</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">prompt</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Explain transformer attention in one paragraph.</span><span class="dl">'</span><span class="p">,</span>
<span class="p">});</span>
</code></pre></div></div>

<h4 id="pros-and-cons">Pros and Cons</h4>

<table>
  <thead>
    <tr>
      <th>✅ Pros</th>
      <th>❌ Cons</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Best-in-class Dev UI with local tracing and flow visualization</td>
      <td>Dart/Python SDKs still in preview</td>
    </tr>
    <tr>
      <td>Multiple abstraction levels: vanilla, flows, and agents</td>
      <td>Smaller community than LangChain</td>
    </tr>
    <tr>
      <td>Truly provider-neutral with broad plugin ecosystem</td>
      <td>Some advanced patterns require deeper framework knowledge</td>
    </tr>
    <tr>
      <td>Strong Flutter/Dart support for mobile AI</td>
      <td> </td>
    </tr>
    <tr>
      <td>Idiomatic TypeScript API</td>
      <td> </td>
    </tr>
    <tr>
      <td>Firebase, Cloud Run, or self-hosted deployment</td>
      <td> </td>
    </tr>
    <tr>
      <td>OpenTelemetry-compatible observability built in</td>
      <td> </td>
    </tr>
  </tbody>
</table>

<hr />

<h2 id="vercel-ai-sdk">Vercel AI SDK</h2>

<h3 id="history-and-direction-1">History and Direction</h3>

<p>The Vercel AI SDK was born out of a practical need: Vercel builds the infrastructure that powers a large portion of the modern web, and as developers started shipping AI features inside Next.js apps in 2023, the friction of integrating streaming LLM responses into React was painfully apparent. Vercel released the initial AI SDK as an open-source library to standardize streaming, provider integration, and UI hooks across their ecosystem.</p>

<p>The SDK grew quickly, adding support for Vue, Svelte, SolidJS, and plain Node.js, but its DNA remains deeply tied to the Vercel and Next.js stack. Version 3 in 2024 introduced <code class="language-plaintext highlighter-rouge">streamUI</code>, which lets you stream React components as model output, a paradigm shift for building truly generative user interfaces. Version 4, released in late 2024, brought <code class="language-plaintext highlighter-rouge">generateObject</code> and <code class="language-plaintext highlighter-rouge">streamObject</code> with Zod schemas, structured output across all providers, and an expanded agent API. By 2026, AI SDK v6 has established itself as the go-to choice for teams that live in the Vercel/React ecosystem and want the lowest-friction path from a prompt to a production UI.</p>

<p>Vercel’s direction is clear: deeper integration between AI, edge compute, and the frontend. The AI Gateway, launched in 2025, acts as a provider proxy with load balancing and fallback, another layer of lock-in dressed as a convenience. The SDK is intentionally lower-level than Genkit or Mastra, favoring simplicity and composability over opinionated abstractions.</p>

<h3 id="what-makes-the-vercel-ai-sdk-stand-out">What Makes the Vercel AI SDK Stand Out</h3>

<p>The Vercel AI SDK’s greatest strength is its <strong>seamless integration with React and the web UI layer</strong>. The <code class="language-plaintext highlighter-rouge">useChat</code>, <code class="language-plaintext highlighter-rouge">useCompletion</code>, and <code class="language-plaintext highlighter-rouge">useObject</code> hooks wire directly into streaming AI responses with built-in state management, loading state, and error handling. If you’re building a Next.js app and want to add a chat interface or a streaming form, nothing gets you there faster.</p>

<p><strong>Supported languages:</strong> TypeScript/JavaScript (primary). Node.js, React, Next.js, Nuxt, SvelteKit, SolidStart, Expo (React Native).</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// app/api/chat/route.ts (Next.js App Router)</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">streamText</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">ai</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">openai</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@ai-sdk/openai</span><span class="dl">'</span><span class="p">;</span>

<span class="k">export</span> <span class="k">async</span> <span class="kd">function</span> <span class="nf">POST</span><span class="p">(</span><span class="nx">req</span><span class="p">:</span> <span class="nx">Request</span><span class="p">)</span> <span class="p">{</span>
  <span class="kd">const</span> <span class="p">{</span> <span class="nx">messages</span> <span class="p">}</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">req</span><span class="p">.</span><span class="nf">json</span><span class="p">();</span>

  <span class="kd">const</span> <span class="nx">result</span> <span class="o">=</span> <span class="k">await</span> <span class="nf">streamText</span><span class="p">({</span>
    <span class="na">model</span><span class="p">:</span> <span class="nf">openai</span><span class="p">(</span><span class="dl">'</span><span class="s1">gpt-4o</span><span class="dl">'</span><span class="p">),</span>
    <span class="nx">messages</span><span class="p">,</span>
  <span class="p">});</span>

  <span class="k">return</span> <span class="nx">result</span><span class="p">.</span><span class="nf">toDataStreamResponse</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// app/page.tsx — chat UI with one hook</span>
<span class="dl">'</span><span class="s1">use client</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">useChat</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@ai-sdk/react</span><span class="dl">'</span><span class="p">;</span>

<span class="k">export</span> <span class="k">default</span> <span class="kd">function</span> <span class="nf">Chat</span><span class="p">()</span> <span class="p">{</span>
  <span class="kd">const</span> <span class="p">{</span> <span class="nx">messages</span><span class="p">,</span> <span class="nx">input</span><span class="p">,</span> <span class="nx">handleInputChange</span><span class="p">,</span> <span class="nx">handleSubmit</span> <span class="p">}</span> <span class="o">=</span> <span class="nf">useChat</span><span class="p">();</span>

  <span class="k">return </span><span class="p">(</span>
    <span class="p">&lt;</span><span class="nt">div</span><span class="p">&gt;</span>
      <span class="si">{</span><span class="nx">messages</span><span class="p">.</span><span class="nf">map</span><span class="p">(</span><span class="nx">m</span> <span class="o">=&gt;</span> <span class="p">(</span>
        <span class="p">&lt;</span><span class="nt">div</span> <span class="na">key</span><span class="p">=</span><span class="si">{</span><span class="nx">m</span><span class="p">.</span><span class="nx">id</span><span class="si">}</span><span class="p">&gt;&lt;</span><span class="nt">b</span><span class="p">&gt;</span><span class="si">{</span><span class="nx">m</span><span class="p">.</span><span class="nx">role</span><span class="si">}</span>:<span class="p">&lt;/</span><span class="nt">b</span><span class="p">&gt;</span> <span class="si">{</span><span class="nx">m</span><span class="p">.</span><span class="nx">content</span><span class="si">}</span><span class="p">&lt;/</span><span class="nt">div</span><span class="p">&gt;</span>
      <span class="p">))</span><span class="si">}</span>
      <span class="p">&lt;</span><span class="nt">form</span> <span class="na">onSubmit</span><span class="p">=</span><span class="si">{</span><span class="nx">handleSubmit</span><span class="si">}</span><span class="p">&gt;</span>
        <span class="p">&lt;</span><span class="nt">input</span> <span class="na">value</span><span class="p">=</span><span class="si">{</span><span class="nx">input</span><span class="si">}</span> <span class="na">onChange</span><span class="p">=</span><span class="si">{</span><span class="nx">handleInputChange</span><span class="si">}</span> <span class="na">placeholder</span><span class="p">=</span><span class="s">"Say something..."</span> <span class="p">/&gt;</span>
        <span class="p">&lt;</span><span class="nt">button</span> <span class="na">type</span><span class="p">=</span><span class="s">"submit"</span><span class="p">&gt;</span>Send<span class="p">&lt;/</span><span class="nt">button</span><span class="p">&gt;</span>
      <span class="p">&lt;/</span><span class="nt">form</span><span class="p">&gt;</span>
    <span class="p">&lt;/</span><span class="nt">div</span><span class="p">&gt;</span>
  <span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<h4 id="structured-generation-and-agent-patterns">Structured Generation and Agent Patterns</h4>

<p>The SDK provides clean primitives for structured output and tool use, though the abstractions are deliberately minimal. You get <code class="language-plaintext highlighter-rouge">generateText</code>, <code class="language-plaintext highlighter-rouge">streamText</code>, <code class="language-plaintext highlighter-rouge">generateObject</code>, <code class="language-plaintext highlighter-rouge">streamObject</code>, and a simple <code class="language-plaintext highlighter-rouge">maxSteps</code> loop for agentic behavior. There is no high-level “flow” abstraction or graph; you compose these primitives yourself.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">generateObject</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">ai</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">openai</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@ai-sdk/openai</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">z</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">zod</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="p">{</span> <span class="nx">object</span> <span class="p">}</span> <span class="o">=</span> <span class="k">await</span> <span class="nf">generateObject</span><span class="p">({</span>
  <span class="na">model</span><span class="p">:</span> <span class="nf">openai</span><span class="p">(</span><span class="dl">'</span><span class="s1">gpt-4o</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">schema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
    <span class="na">recipe</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
      <span class="na">name</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
      <span class="na">ingredients</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span> <span class="na">name</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span> <span class="na">amount</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">()</span> <span class="p">})),</span>
      <span class="na">steps</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">()),</span>
    <span class="p">}),</span>
  <span class="p">}),</span>
  <span class="na">prompt</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Generate a recipe for a vegan chocolate cake.</span><span class="dl">'</span><span class="p">,</span>
<span class="p">});</span>
</code></pre></div></div>
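
<p>To make the <code class="language-plaintext highlighter-rouge">maxSteps</code> idea concrete, here is a minimal, self-contained sketch of what a bounded tool-calling loop does conceptually. It deliberately does not call the SDK: <code class="language-plaintext highlighter-rouge">runWithMaxSteps</code>, <code class="language-plaintext highlighter-rouge">ModelTurn</code>, <code class="language-plaintext highlighter-rouge">fakeModel</code>, and <code class="language-plaintext highlighter-rouge">weatherTools</code> are hypothetical names invented for illustration, and the SDK’s real loop will differ in detail.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Hypothetical types: a model turn is either a final answer or a tool request.
type ModelTurn =
  | { type: 'text'; text: string }
  | { type: 'tool-call'; toolName: string; args: Record&lt;string, string&gt; };

type Tool = (args: Record&lt;string, string&gt;) =&gt; Promise&lt;string&gt;;
type Model = (history: string[]) =&gt; Promise&lt;ModelTurn&gt;;

// Call the model repeatedly, feeding tool results back into the history,
// until it answers in plain text or the step budget runs out.
async function runWithMaxSteps(
  model: Model,
  tools: Record&lt;string, Tool&gt;,
  prompt: string,
  maxSteps: number,
): Promise&lt;{ text: string; steps: number }&gt; {
  const history = [`user: ${prompt}`];
  for (let step = 1; step &lt;= maxSteps; step++) {
    const turn = await model(history);
    if (turn.type === 'text') {
      return { text: turn.text, steps: step };
    }
    const tool = tools[turn.toolName];
    const result = tool ? await tool(turn.args) : `unknown tool: ${turn.toolName}`;
    history.push(`tool(${turn.toolName}): ${result}`);
  }
  // Budget exhausted without a final text answer.
  return { text: '', steps: maxSteps };
}

// Stub model (assumption): requests the weather once, then answers from the tool result.
const fakeModel: Model = async (history) =&gt; {
  const toolLine = history.find((line) =&gt; line.startsWith('tool(getWeather)'));
  return toolLine
    ? { type: 'text', text: `Answer based on ${toolLine}` }
    : { type: 'tool-call', toolName: 'getWeather', args: { city: 'Madrid' } };
};

const weatherTools: Record&lt;string, Tool&gt; = {
  getWeather: async ({ city }) =&gt; `sunny in ${city}`,
};
</code></pre></div></div>

<p>With this stub, <code class="language-plaintext highlighter-rouge">runWithMaxSteps(fakeModel, weatherTools, 'Weather in Madrid?', 5)</code> finishes in two steps: one tool call and one final answer. A budget of 1 stops after the tool call and returns an empty text, which is the safety net a step limit is meant to provide.</p>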

<h4 id="genkit-vs-vercel-ai-sdk--abstraction-levels">Genkit vs. Vercel AI SDK — Abstraction Levels</h4>

<p>Compared to Genkit, the Vercel AI SDK operates at a <strong>lower level of abstraction</strong>. This is by design: Vercel wants to give you sharp, composable tools, not an opinionated framework. The trade-off is that you assemble more boilerplate yourself. Want to trace a multi-step agent? Wire up OpenTelemetry manually. Want a typed pipeline? Build it yourself. Genkit bakes these in.</p>

<p>Conversely, Vercel’s <strong>deep UI integration</strong> (streaming RSC, <code class="language-plaintext highlighter-rouge">useChat</code>, and generative UI patterns) is something Genkit does not attempt to own. For Flutter-based applications, Genkit’s Dart SDK fills this role, but in the web domain, Vercel wins on integration depth.</p>
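
<p>As a rough illustration of the “build it yourself” side of that trade-off, a typed pipeline can be as small as a generic compose helper. This is a sketch in plain TypeScript, not an SDK feature: <code class="language-plaintext highlighter-rouge">StepFn</code>, <code class="language-plaintext highlighter-rouge">chain</code>, <code class="language-plaintext highlighter-rouge">outline</code>, and <code class="language-plaintext highlighter-rouge">summarize</code> are hypothetical names, and the two steps are stand-ins for real model calls.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// A step is just an async function from one type to the next.
type StepFn&lt;In, Out&gt; = (input: In) =&gt; Promise&lt;Out&gt;;

// Compose two steps. The compiler checks that the first step's
// output type matches the second step's input type.
function chain&lt;A, B, C&gt;(first: StepFn&lt;A, B&gt;, second: StepFn&lt;B, C&gt;): StepFn&lt;A, C&gt; {
  return async (input: A) =&gt; second(await first(input));
}

// Stand-ins for model calls (assumption): a real pipeline would call the SDK here.
const outline: StepFn&lt;string, string[]&gt; = async (topic) =&gt; [
  `Intro to ${topic}`,
  `Why ${topic} matters`,
];

const summarize: StepFn&lt;string[], { title: string; sections: number }&gt; = async (headings) =&gt; ({
  title: headings[0] ?? 'Untitled',
  sections: headings.length,
});

// The composed pipeline is typed end to end: string in, summary object out.
const writeArticle = chain(outline, summarize);
</code></pre></div></div>

<p>Calling <code class="language-plaintext highlighter-rouge">writeArticle('Genkit')</code> runs both steps in order. Tracing, retries, and a runner UI are the parts you would still have to add on top, which is where a framework with a built-in flow abstraction saves work.</p>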

<h4 id="pros-and-cons-1">Pros and Cons</h4>

<table>
  <thead>
    <tr>
      <th>✅ Pros</th>
      <th>❌ Cons</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Unmatched React/Next.js/Edge integration</td>
      <td>TypeScript/JavaScript only</td>
    </tr>
    <tr>
      <td>Minimal API surface, easy to learn</td>
      <td>No built-in flow or pipeline abstraction</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">useChat</code> / <code class="language-plaintext highlighter-rouge">useCompletion</code> hooks are best-in-class</td>
      <td>Developer Tool is basic (no trace explorer, no flow runner)</td>
    </tr>
    <tr>
      <td>Generative UI with RSC streaming</td>
      <td>Observability requires external tooling</td>
    </tr>
    <tr>
      <td>Broad provider support via official adapters</td>
      <td>Deeper use cases accumulate boilerplate quickly</td>
    </tr>
    <tr>
      <td>Idiomatic TypeScript throughout</td>
      <td>Vercel-ecosystem bias (AI Gateway, templates)</td>
    </tr>
  </tbody>
</table>

<hr />

<h2 id="mastra">Mastra</h2>

<h3 id="history-and-direction-2">History and Direction</h3>

<p>Mastra is the youngest framework in this comparison, founded in 2024 by members of the team behind Gatsby (Sam Bhagwat, Abhi Aiyer, and Shane Thomas). Coming from a background of developer experience tooling and static-site generation, Mastra’s founders approached AI framework design with a strong bias toward <strong>TypeScript ergonomics</strong>, workflow-first thinking, and integrated tooling. The name “Mastra” (Swahili for “master”) reflects the team’s ambition to be the definitive TypeScript-native AI orchestration layer.</p>

<p>Mastra reached public beta in late 2024 and gained significant traction in early 2025 among TypeScript developers frustrated with LangChain’s Python-ported patterns. The framework’s most distinctive feature, a built-in <strong>Studio UI</strong>, arrived in early 2025 and quickly became its marquee differentiator. Mastra Studio is a web-based visual interface for defining, testing, and running agents and workflows, accessible locally or in the cloud. By mid-2025, Mastra had secured seed funding and announced hosted cloud infrastructure for deploying Mastra agents directly from the Studio.</p>

<p>Mastra’s direction is firmly in the TypeScript/JavaScript ecosystem. The team has shown no signs of pursuing multi-language support; instead, they are doubling down on deep integrations with popular TypeScript meta-frameworks like Next.js, Astro, SvelteKit, and Hono. Think of Mastra as the opinionated, batteries-included agent framework for TypeScript developers who want to spin up production agents as fast as possible, without writing any platform glue.</p>

<h3 id="what-makes-mastra-stand-out">What Makes Mastra Stand Out</h3>

<p>Mastra is purpose-built for one thing: <strong>spinning up agents fast</strong>. It is an agent-only framework: you will not find vanilla model calls or a “flow” primitive. Everything in Mastra is modelled around agents, tools, memory, and workflows. If you know exactly what you need (an agent with memory and tool access), Mastra gets you there in fewer lines of code than any other framework here.</p>

<p><strong>Supported languages:</strong> TypeScript/JavaScript exclusively. Integrations with Next.js, Astro, SvelteKit, Hono, Express.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">Mastra</span><span class="p">,</span> <span class="nx">Agent</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@mastra/core</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">openai</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@ai-sdk/openai</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">researchAgent</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">Agent</span><span class="p">({</span>
  <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">researcher</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">model</span><span class="p">:</span> <span class="nf">openai</span><span class="p">(</span><span class="dl">'</span><span class="s1">gpt-4o</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">instructions</span><span class="p">:</span> <span class="s2">`You are a research assistant. 
    Find relevant information, synthesize key points, 
    and present clear, well-structured summaries.`</span><span class="p">,</span>
  <span class="na">tools</span><span class="p">:</span> <span class="p">{</span>
    <span class="c1">// Tools added here</span>
  <span class="p">},</span>
<span class="p">});</span>

<span class="kd">const</span> <span class="nx">mastra</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">Mastra</span><span class="p">({</span> <span class="na">agents</span><span class="p">:</span> <span class="p">{</span> <span class="nx">researchAgent</span> <span class="p">}</span> <span class="p">});</span>

<span class="kd">const</span> <span class="nx">response</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">mastra</span><span class="p">.</span><span class="nf">getAgent</span><span class="p">(</span><span class="dl">'</span><span class="s1">researchAgent</span><span class="dl">'</span><span class="p">).</span><span class="nf">generate</span><span class="p">([</span>
  <span class="p">{</span> <span class="na">role</span><span class="p">:</span> <span class="dl">'</span><span class="s1">user</span><span class="dl">'</span><span class="p">,</span> <span class="na">content</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Summarize the latest developments in quantum computing.</span><span class="dl">'</span> <span class="p">},</span>
<span class="p">]);</span>

<span class="nx">console</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="nx">response</span><span class="p">.</span><span class="nx">text</span><span class="p">);</span>
</code></pre></div></div>

<h4 id="workflows">Workflows</h4>

<p>Mastra’s workflow primitive lets you chain agent steps into typed, directed graphs, useful when you need a mix of deterministic logic and LLM reasoning.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">Workflow</span><span class="p">,</span> <span class="nx">Step</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@mastra/core</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">z</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">zod</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">contentPipeline</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">Workflow</span><span class="p">({</span>
  <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">contentPipeline</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">triggerSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span> <span class="na">topic</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">()</span> <span class="p">}),</span>
<span class="p">});</span>

<span class="nx">contentPipeline</span>
  <span class="p">.</span><span class="nf">step</span><span class="p">({</span>
    <span class="na">id</span><span class="p">:</span> <span class="dl">'</span><span class="s1">research</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">execute</span><span class="p">:</span> <span class="k">async </span><span class="p">({</span> <span class="nx">context</span> <span class="p">})</span> <span class="o">=&gt;</span> <span class="p">{</span>
      <span class="kd">const</span> <span class="p">{</span> <span class="nx">topic</span> <span class="p">}</span> <span class="o">=</span> <span class="nx">context</span><span class="p">.</span><span class="nx">triggerData</span><span class="p">;</span>
      <span class="c1">// Agent call to research the topic</span>
      <span class="k">return</span> <span class="p">{</span> <span class="na">research</span><span class="p">:</span> <span class="s2">`Key facts about </span><span class="p">${</span><span class="nx">topic</span><span class="p">}</span><span class="s2">`</span> <span class="p">};</span>
    <span class="p">},</span>
  <span class="p">})</span>
  <span class="p">.</span><span class="nf">then</span><span class="p">({</span>
    <span class="na">id</span><span class="p">:</span> <span class="dl">'</span><span class="s1">draft</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">execute</span><span class="p">:</span> <span class="k">async </span><span class="p">({</span> <span class="nx">context</span> <span class="p">})</span> <span class="o">=&gt;</span> <span class="p">{</span>
      <span class="kd">const</span> <span class="p">{</span> <span class="nx">research</span> <span class="p">}</span> <span class="o">=</span> <span class="nx">context</span><span class="p">.</span><span class="nf">getStepResult</span><span class="p">(</span><span class="dl">'</span><span class="s1">research</span><span class="dl">'</span><span class="p">);</span>
      <span class="c1">// Agent call to draft the article</span>
      <span class="k">return</span> <span class="p">{</span> <span class="na">draft</span><span class="p">:</span> <span class="s2">`Article draft using: </span><span class="p">${</span><span class="nx">research</span><span class="p">}</span><span class="s2">`</span> <span class="p">};</span>
    <span class="p">},</span>
  <span class="p">})</span>
  <span class="p">.</span><span class="nf">commit</span><span class="p">();</span>
</code></pre></div></div>

<h4 id="pros-and-cons-2">Pros and Cons</h4>

<table>
  <thead>
    <tr>
      <th>✅ Pros</th>
      <th>❌ Cons</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Fastest path to a production-ready agent in TypeScript</td>
      <td>Agent-only: no flows, no vanilla generation primitives</td>
    </tr>
    <tr>
      <td>Excellent Studio UI for visual workflow building</td>
      <td>TypeScript/JavaScript only</td>
    </tr>
    <tr>
      <td>Idiomatic TypeScript API with strong type inference</td>
      <td>Younger ecosystem, fewer plugins</td>
    </tr>
    <tr>
      <td>Good memory and tool-calling primitives</td>
      <td>Observability still maturing</td>
    </tr>
    <tr>
      <td>Integrates well with popular JS meta-frameworks</td>
      <td>No mobile/cross-platform story</td>
    </tr>
  </tbody>
</table>

<hr />

<h2 id="langchain">LangChain</h2>

<h3 id="history-and-direction-3">History and Direction</h3>

<p>LangChain is, by a significant margin, the most widely used AI framework in the world, but its story is complicated. Harrison Chase created LangChain in October 2022 as a Python library for chaining LLM calls, and it spread virally through the developer community in early 2023 as everyone scrambled to experiment with GPT-3 and GPT-4. Its key insight, that useful AI applications require structured chains of calls, retrieval augmentation, and tool integration, was correct and arrived at the right moment. GitHub stars and npm downloads shot to the top of every chart.</p>

<p>The JavaScript port, <code class="language-plaintext highlighter-rouge">langchain</code> on npm, arrived shortly after and has tracked the Python library closely in both API design and feature parity. This is the source of one of LangChain’s most persistent criticisms: the JavaScript SDK feels like <strong>Python idioms force-translated into TypeScript</strong>. Patterns like <code class="language-plaintext highlighter-rouge">BaseChain</code>, <code class="language-plaintext highlighter-rouge">runnable</code> pipelines with <code class="language-plaintext highlighter-rouge">.pipe()</code>, and the LCEL (LangChain Expression Language) make perfect sense coming from Python’s compositional patterns but feel unnatural to TypeScript developers accustomed to async/await and module-based composition.</p>

<p>LangChain the company raised $35M in 2023 and has since built a growing platform around <strong>LangSmith</strong> (observability and evaluation) and <strong>LangGraph</strong> (graph-based orchestration). This is where the tension lies: LangChain’s open-source SDK and LangSmith are designed to complement each other. Getting the best observability experience requires using LangSmith. While you can configure other backends, the seamless experience is on their platform. The framework is excellent and featureful, but its commercial direction is unmistakably pointed toward LangSmith adoption.</p>

<p>In 2025, LangChain reorganized its JavaScript library around a cleaner agent API (<code class="language-plaintext highlighter-rouge">createAgent</code>) and introduced Deep Agents, pre-built agent implementations with built-in context compression and subagent spawning. LangGraph remains the recommended framework for complex multi-step workflows, and LangSmith continues to be the best-in-class platform for production LLM observability.</p>

<h3 id="langchains-position-agent-first-platform-tied">LangChain’s Position: Agent-First, Platform-Tied</h3>

<p>LangChain is squarely an <strong>agent framework</strong>. Its sweet spot is spinning up capable agents quickly, particularly for teams coming from the Python AI ecosystem who want to move to or stay in JavaScript without losing the LangChain mental model. It is the most feature-complete framework here in terms of raw agent capabilities, RAG patterns, and integrations, but that breadth comes with complexity.</p>

<p><strong>Supported languages:</strong> Python (primary, feature-complete), JavaScript/TypeScript (JS port, near-parity). Note: the JS SDK carries Python-style patterns.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">createAgent</span><span class="p">,</span> <span class="nx">tool</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">langchain</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">ChatOpenAI</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@langchain/openai</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">z</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">zod</span><span class="dl">'</span><span class="p">;</span>

<span class="c1">// Tools are declared with the tool() helper and a Zod input schema</span>
<span class="kd">const</span> <span class="nx">getWeather</span> <span class="o">=</span> <span class="nf">tool</span><span class="p">(</span>
  <span class="c1">// Real implementation would call a weather API</span>
  <span class="k">async </span><span class="p">({</span> <span class="nx">city</span> <span class="p">})</span> <span class="o">=&gt;</span> <span class="s2">`It's always sunny in </span><span class="p">${</span><span class="nx">city</span><span class="p">}</span><span class="s2">!`</span><span class="p">,</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">get_weather</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">description</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Get weather for a given city.</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">schema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span> <span class="na">city</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">()</span> <span class="p">}),</span>
  <span class="p">},</span>
<span class="p">);</span>

<span class="kd">const</span> <span class="nx">model</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">ChatOpenAI</span><span class="p">({</span> <span class="na">model</span><span class="p">:</span> <span class="dl">'</span><span class="s1">gpt-4o</span><span class="dl">'</span><span class="p">,</span> <span class="na">temperature</span><span class="p">:</span> <span class="mi">0</span> <span class="p">});</span>

<span class="kd">const</span> <span class="nx">agent</span> <span class="o">=</span> <span class="nf">createAgent</span><span class="p">({</span>
  <span class="nx">model</span><span class="p">,</span>
  <span class="na">tools</span><span class="p">:</span> <span class="p">[</span><span class="nx">getWeather</span><span class="p">],</span>
  <span class="na">systemPrompt</span><span class="p">:</span> <span class="dl">'</span><span class="s1">You are a helpful assistant.</span><span class="dl">'</span><span class="p">,</span>
<span class="p">});</span>

<span class="kd">const</span> <span class="nx">result</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">agent</span><span class="p">.</span><span class="nf">invoke</span><span class="p">({</span>
  <span class="na">messages</span><span class="p">:</span> <span class="p">[{</span> <span class="na">role</span><span class="p">:</span> <span class="dl">'</span><span class="s1">user</span><span class="dl">'</span><span class="p">,</span> <span class="na">content</span><span class="p">:</span> <span class="dl">'</span><span class="s1">What is the weather in Madrid?</span><span class="dl">'</span> <span class="p">}],</span>
<span class="p">});</span>
<span class="nx">console</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="nx">result</span><span class="p">.</span><span class="nx">messages</span><span class="p">.</span><span class="nf">at</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">)?.</span><span class="nx">content</span><span class="p">);</span>
</code></pre></div></div>

<h4 id="langsmith-observability">LangSmith Observability</h4>

<p>LangSmith is LangChain’s answer to the observability problem. It provides trace visualization, dataset management, prompt versioning, and LLM evaluation, all polished and production-grade. The integration with LangChain is seamless: set <code class="language-plaintext highlighter-rouge">LANGSMITH_TRACING=true</code> and every run is captured automatically.</p>

<p>The catch is that LangSmith is a SaaS platform. Genkit’s Dev UI provides comparable local observability with zero cloud dependency. If you need hosted, team-scale observability, LangSmith is arguably the best option in the market. If you need local, zero-config development tracing, Genkit wins.</p>

<h4 id="pros-and-cons-3">Pros and Cons</h4>

<table>
  <thead>
    <tr>
      <th>✅ Pros</th>
      <th>❌ Cons</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Largest community and integration ecosystem</td>
      <td>JavaScript SDK feels like Python ported to TS</td>
    </tr>
    <tr>
      <td>LangSmith is best-in-class for production observability</td>
      <td>Tight coupling to LangSmith for full observability</td>
    </tr>
    <tr>
      <td>Feature-complete agent, RAG, and chain primitives</td>
      <td>Complex API surface, steep learning curve</td>
    </tr>
    <tr>
      <td>Excellent Python SDK for Python teams</td>
      <td>LangGraph required for complex graph workflows</td>
    </tr>
    <tr>
      <td>Deep Agents provide batteries-included patterns</td>
      <td>Heavy bundle size in browser/edge environments</td>
    </tr>
    <tr>
      <td>LangGraph for advanced workflow orchestration</td>
      <td>Commercial platform pressure</td>
    </tr>
  </tbody>
</table>

<hr />

<h2 id="google-adk-agent-development-kit">Google ADK (Agent Development Kit)</h2>

<h3 id="history-and-direction-4">History and Direction</h3>

<p>Google ADK was announced at Google Cloud Next 2025 as Google’s opinionated take on a production-grade agent framework, specifically targeting enterprise deployments on Google Cloud. Unlike Genkit, which is cloud-agnostic and full-stack, ADK was designed from day one around <strong>Vertex AI</strong> and Google Cloud’s agent infrastructure, including Agent Engine, Cloud Run, and GKE. It is the framework Google recommends when you’re building agents that will live in a Google Cloud environment at scale.</p>

<p>ADK’s initial release was Python-only, which told the story clearly: this was a framework for enterprise Python AI developers (data scientists, ML engineers, and cloud architects) who think in agents and workflows and are already committed to Google Cloud. The TypeScript, Go, and Java SDKs followed in 2025, with ADK Go 1.0 and ADK Java 1.0 shipping in early 2026. This multi-language expansion signals that Google is positioning ADK as more than a Python script runner; it wants to be the enterprise agent runtime for any Google Cloud workload.</p>

<p>ADK 2.0, released in 2026, brought significant refinements: graph-based workflow APIs, a visual Web UI builder, enhanced evaluation tooling (including user simulation and environment simulation for testing agents end-to-end), and deeper A2A (Agent-to-Agent) protocol support. The A2A protocol is an open standard that allows ADK agents to communicate with agents built on other frameworks, a meaningful interoperability effort in a fragmented ecosystem.</p>

<p>Google’s direction with ADK is unmistakable: this is enterprise AI infrastructure for Google Cloud customers. If your organization runs on GCP and needs reliable, scalable, observable agent deployments with enterprise support, ADK is Google’s answer. If you need to be cloud-agnostic, look elsewhere.</p>

<h3 id="adks-position-agent-first-enterprise-grade">ADK’s Position: Agent-First, Enterprise-Grade</h3>

<p>Like LangChain and Mastra, ADK is an <strong>agent-only framework</strong>: its reason for existing is to make building, evaluating, and deploying agents fast and reliable. Unlike Mastra (which targets indie developers and startups), ADK is purpose-built for enterprise scenarios: multi-agent systems, graph-based orchestration, agent evaluation at scale, and deployment to Google’s managed infrastructure.</p>

<p><strong>Supported languages:</strong> Python (primary, feature-complete), TypeScript/JavaScript, Go, Java. Note: the API design and documentation are heavily Python-first; TypeScript and other SDKs track but sometimes lag the Python feature set.</p>

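<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Python: ADK's primary language</span>
<span class="kn">import</span> <span class="n">asyncio</span>

<span class="kn">from</span> <span class="n">google.adk.agents</span> <span class="kn">import</span> <span class="n">Agent</span>
<span class="kn">from</span> <span class="n">google.adk.runners</span> <span class="kn">import</span> <span class="n">InMemoryRunner</span>
<span class="kn">from</span> <span class="n">google.adk.tools</span> <span class="kn">import</span> <span class="n">google_search</span>
<span class="kn">from</span> <span class="n">google.genai</span> <span class="kn">import</span> <span class="n">types</span>

<span class="n">research_agent</span> <span class="o">=</span> <span class="nc">Agent</span><span class="p">(</span>
    <span class="n">name</span><span class="o">=</span><span class="sh">"</span><span class="s">researcher</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">model</span><span class="o">=</span><span class="sh">"</span><span class="s">gemini-flash-latest</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">instruction</span><span class="o">=</span><span class="sh">"</span><span class="s">You help users research topics thoroughly and accurately.</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">tools</span><span class="o">=</span><span class="p">[</span><span class="n">google_search</span><span class="p">],</span>
<span class="p">)</span>

<span class="k">async</span> <span class="k">def</span> <span class="nf">main</span><span class="p">():</span>
    <span class="c1"># Run locally: a runner with an in-memory session drives the agent</span>
    <span class="n">runner</span> <span class="o">=</span> <span class="nc">InMemoryRunner</span><span class="p">(</span><span class="n">agent</span><span class="o">=</span><span class="n">research_agent</span><span class="p">,</span> <span class="n">app_name</span><span class="o">=</span><span class="sh">"</span><span class="s">research_app</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">session</span> <span class="o">=</span> <span class="k">await</span> <span class="n">runner</span><span class="p">.</span><span class="n">session_service</span><span class="p">.</span><span class="nf">create_session</span><span class="p">(</span><span class="n">app_name</span><span class="o">=</span><span class="sh">"</span><span class="s">research_app</span><span class="sh">"</span><span class="p">,</span> <span class="n">user_id</span><span class="o">=</span><span class="sh">"</span><span class="s">local_user</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">message</span> <span class="o">=</span> <span class="n">types</span><span class="p">.</span><span class="nc">Content</span><span class="p">(</span><span class="n">role</span><span class="o">=</span><span class="sh">"</span><span class="s">user</span><span class="sh">"</span><span class="p">,</span> <span class="n">parts</span><span class="o">=</span><span class="p">[</span><span class="n">types</span><span class="p">.</span><span class="nc">Part</span><span class="p">(</span><span class="n">text</span><span class="o">=</span><span class="sh">"</span><span class="s">What are the latest developments in fusion energy?</span><span class="sh">"</span><span class="p">)])</span>
    <span class="k">async</span> <span class="k">for</span> <span class="n">event</span> <span class="ow">in</span> <span class="n">runner</span><span class="p">.</span><span class="nf">run_async</span><span class="p">(</span><span class="n">user_id</span><span class="o">=</span><span class="sh">"</span><span class="s">local_user</span><span class="sh">"</span><span class="p">,</span> <span class="n">session_id</span><span class="o">=</span><span class="n">session</span><span class="p">.</span><span class="nb">id</span><span class="p">,</span> <span class="n">new_message</span><span class="o">=</span><span class="n">message</span><span class="p">):</span>
        <span class="k">if</span> <span class="n">event</span><span class="p">.</span><span class="nf">is_final_response</span><span class="p">()</span> <span class="ow">and</span> <span class="n">event</span><span class="p">.</span><span class="n">content</span> <span class="ow">and</span> <span class="n">event</span><span class="p">.</span><span class="n">content</span><span class="p">.</span><span class="n">parts</span><span class="p">:</span>
            <span class="nf">print</span><span class="p">(</span><span class="n">event</span><span class="p">.</span><span class="n">content</span><span class="p">.</span><span class="n">parts</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="n">text</span><span class="p">)</span>

<span class="n">asyncio</span><span class="p">.</span><span class="nf">run</span><span class="p">(</span><span class="nf">main</span><span class="p">())</span>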
</code></pre></div></div>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// TypeScript ADK</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">Agent</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@google/adk</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">googleSearch</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@google/adk/tools</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">researchAgent</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">Agent</span><span class="p">({</span>
  <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">researcher</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">model</span><span class="p">:</span> <span class="dl">'</span><span class="s1">gemini-flash-latest</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">instruction</span><span class="p">:</span> <span class="dl">'</span><span class="s1">You help users research topics thoroughly and accurately.</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">tools</span><span class="p">:</span> <span class="p">[</span><span class="nx">googleSearch</span><span class="p">],</span>
<span class="p">});</span>

<span class="kd">const</span> <span class="nx">result</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">researchAgent</span><span class="p">.</span><span class="nf">run</span><span class="p">(</span>
  <span class="dl">'</span><span class="s1">What are the latest developments in fusion energy?</span><span class="dl">'</span>
<span class="p">);</span>
<span class="nx">console</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="nx">result</span><span class="p">.</span><span class="nx">text</span><span class="p">);</span>
</code></pre></div></div>

<h4 id="multi-agent-systems">Multi-Agent Systems</h4>

<p>ADK’s multi-agent support is one of its strongest features. You can compose agents hierarchically, assign them different models, and let them collaborate via the A2A protocol.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="n">google.adk</span> <span class="kn">import</span> <span class="n">Agent</span>
<span class="kn">from</span> <span class="n">google.adk.agents</span> <span class="kn">import</span> <span class="n">SequentialAgent</span><span class="p">,</span> <span class="n">ParallelAgent</span>

<span class="n">researcher</span> <span class="o">=</span> <span class="nc">Agent</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="sh">"</span><span class="s">researcher</span><span class="sh">"</span><span class="p">,</span> <span class="n">model</span><span class="o">=</span><span class="sh">"</span><span class="s">gemini-flash-latest</span><span class="sh">"</span><span class="p">,</span> <span class="n">instruction</span><span class="o">=</span><span class="sh">"</span><span class="s">Research the topic.</span><span class="sh">"</span><span class="p">)</span>
<span class="n">writer</span> <span class="o">=</span> <span class="nc">Agent</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="sh">"</span><span class="s">writer</span><span class="sh">"</span><span class="p">,</span> <span class="n">model</span><span class="o">=</span><span class="sh">"</span><span class="s">gemini-pro-latest</span><span class="sh">"</span><span class="p">,</span> <span class="n">instruction</span><span class="o">=</span><span class="sh">"</span><span class="s">Write a clear article from the research.</span><span class="sh">"</span><span class="p">)</span>
<span class="n">editor</span> <span class="o">=</span> <span class="nc">Agent</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="sh">"</span><span class="s">editor</span><span class="sh">"</span><span class="p">,</span> <span class="n">model</span><span class="o">=</span><span class="sh">"</span><span class="s">gemini-flash-latest</span><span class="sh">"</span><span class="p">,</span> <span class="n">instruction</span><span class="o">=</span><span class="sh">"</span><span class="s">Polish and format the article.</span><span class="sh">"</span><span class="p">)</span>

<span class="n">content_pipeline</span> <span class="o">=</span> <span class="nc">SequentialAgent</span><span class="p">(</span>
    <span class="n">name</span><span class="o">=</span><span class="sh">"</span><span class="s">contentPipeline</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">agents</span><span class="o">=</span><span class="p">[</span><span class="n">researcher</span><span class="p">,</span> <span class="n">writer</span><span class="p">,</span> <span class="n">editor</span><span class="p">],</span>
<span class="p">)</span>

<span class="n">result</span> <span class="o">=</span> <span class="n">content_pipeline</span><span class="p">.</span><span class="nf">run</span><span class="p">(</span><span class="sh">"</span><span class="s">Write an article about the impact of quantum computing on cryptography.</span><span class="sh">"</span><span class="p">)</span>
</code></pre></div></div>
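
<p>To make the sequential pattern concrete, here is a minimal, framework-free TypeScript sketch of what a sequential pipeline does conceptually: each step receives the previous step’s output. The <code class="language-plaintext highlighter-rouge">Step</code> type and the <code class="language-plaintext highlighter-rouge">runSequential</code> helper are hypothetical names used only for illustration and are not part of ADK; the stub steps stand in for model-backed agents.</p>

```typescript
// Hypothetical sketch: not ADK's API. Each "agent" is reduced to a function
// from text to text so the data flow of a sequential pipeline is visible.
type Step = { name: string; run: (input: string) => string };

// Run the steps in order, feeding each step's output into the next one.
function runSequential(steps: Step[], input: string): string {
  return steps.reduce((text, step) => step.run(text), input);
}

// Stub steps standing in for the researcher, writer, and editor agents.
const researcherStep: Step = { name: 'researcher', run: (t) => `notes(${t})` };
const writerStep: Step = { name: 'writer', run: (t) => `draft(${t})` };
const editorStep: Step = { name: 'editor', run: (t) => `final(${t})` };

const article = runSequential([researcherStep, writerStep, editorStep], 'quantum');
```

<p>Real agents are asynchronous and exchange richer state than a single string, but the ordering guarantee is the same: the writer never runs before the researcher has finished.</p>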

<h4 id="vertex-ai-lock-in">Vertex AI Lock-In</h4>

<p>ADK’s evaluation, deployment, and production observability features lean heavily on <strong>Vertex AI Agent Engine</strong>, <strong>Cloud Trace</strong>, and Google’s managed infrastructure. You can run ADK locally and even deploy to Cloud Run or GKE independently, but to get the full ADK experience, including agent evaluation, performance dashboards, and managed scaling, you’re on Google Cloud. This is similar to how LangSmith is the intended observability backend for LangChain: technically optional, practically expected.</p>

<p>Frameworks like Genkit, Vercel AI SDK, and Mastra were designed from the ground up to be cloud-neutral. ADK and LangChain, by contrast, have strong ecosystem gravity toward their respective platforms.</p>

<h4 id="pros-and-cons-4">Pros and Cons</h4>

<table>
  <thead>
    <tr>
      <th>✅ Pros</th>
      <th>❌ Cons</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Enterprise-grade agent infrastructure</td>
      <td>Strongly tied to Vertex AI and Google Cloud</td>
    </tr>
    <tr>
      <td>Multi-language: Python, TypeScript, Go, Java</td>
      <td>Python-first: TS/Go/Java APIs lag in features</td>
    </tr>
    <tr>
      <td>Best-in-class multi-agent and A2A support</td>
      <td>Brings Python coding patterns to JS developers</td>
    </tr>
    <tr>
      <td>Graph-based workflows and evaluation tools</td>
      <td>Less suitable for cloud-agnostic deployments</td>
    </tr>
    <tr>
      <td>Direct integration with Google Search, Vertex Search</td>
      <td>Heavier setup and operational complexity</td>
    </tr>
    <tr>
      <td>Agent evaluation with user simulation</td>
      <td>Not a full-stack framework (agent-only)</td>
    </tr>
  </tbody>
</table>

<hr />

<h2 id="head-to-head-comparison">Head-to-Head Comparison</h2>

<h3 id="developer-experience">Developer Experience</h3>

<table>
  <thead>
    <tr>
      <th>Framework</th>
      <th>DX Highlights</th>
      <th>Shortcomings</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Genkit</strong></td>
      <td>Dev UI is unparalleled for local debugging. Idiomatic TypeScript. Multi-level abstractions.</td>
      <td>Less prescriptive, more choices to make upfront</td>
    </tr>
    <tr>
      <td><strong>Vercel AI SDK</strong></td>
      <td>Frictionless React/Next.js integration. Minimal API.</td>
      <td>Requires assembling boilerplate for complex scenarios</td>
    </tr>
    <tr>
      <td><strong>Mastra</strong></td>
      <td>Fastest path to a working agent. Great Studio UI.</td>
      <td>Agent-only, JS-only</td>
    </tr>
    <tr>
      <td><strong>LangChain</strong></td>
      <td>Vast documentation and community. Battle-tested patterns.</td>
      <td>Python idioms in TypeScript, complex API</td>
    </tr>
    <tr>
      <td><strong>ADK</strong></td>
      <td>Powerful multi-agent tooling. Strong eval story.</td>
      <td>GCP-centric, Python-first</td>
    </tr>
  </tbody>
</table>

<h3 id="abstraction-levels">Abstraction Levels</h3>

<p>Genkit is the only framework that gives you all three levels in one SDK: <strong>vanilla generation</strong>, <strong>typed flows (pipelines)</strong>, and <strong>agents</strong>. Vercel AI SDK lives at the lower end: it gives you clean generation and tool-calling primitives but no flow abstraction. Mastra, LangChain, and ADK are agent frameworks: they optimize for spinning up agents quickly but don’t offer a coherent story for when you just want to generate text or structure a pipeline without agent autonomy.</p>
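
<p>The difference between the three levels is easier to see side by side. The following is a rough, self-contained sketch; every name in it (<code class="language-plaintext highlighter-rouge">fakeModel</code>, <code class="language-plaintext highlighter-rouge">summaryFlow</code>, <code class="language-plaintext highlighter-rouge">agent</code>) is hypothetical and none of it is the API of Genkit or any other SDK. A canned function plays the role of the LLM.</p>

```typescript
// Hypothetical sketch of the three abstraction levels; none of these names
// come from Genkit or any other SDK. `fakeModel` stands in for an LLM call.
function fakeModel(prompt: string): string {
  return prompt.includes('weather') ? 'TOOL:getWeather' : `answer to "${prompt}"`;
}

// Level 1: vanilla generation. One prompt in, one completion out.
const completion = fakeModel('Summarize this paragraph');

// Level 2: a typed flow. A fixed pipeline with a declared input and output
// shape; the developer, not the model, decides the order of the steps.
interface SummaryInput { text: string }
interface SummaryOutput { summary: string; length: number }

function summaryFlow(input: SummaryInput): SummaryOutput {
  const summary = fakeModel(`Summarize: ${input.text}`);
  return { summary, length: summary.length };
}

// Level 3: an agent. The model decides at run time whether to call a tool,
// and the loop feeds the tool result back into the next model call.
const tools: Record<string, () => string> = {
  getWeather: () => 'sunny',
};

function agent(task: string, maxTurns = 3): string {
  let message = task;
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = fakeModel(message);
    if (!reply.startsWith('TOOL:')) return reply;
    const tool = tools[reply.slice('TOOL:'.length)];
    if (!tool) return `unknown tool requested: ${reply}`;
    message = `tool result: ${tool()}`;
  }
  return 'stopped: turn limit reached';
}
```

<p>The point of the sketch is the control flow: at levels 1 and 2 your code decides what happens next, while at level 3 the model does, bounded here by a turn limit.</p>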

<h3 id="observability">Observability</h3>

<table>
  <thead>
    <tr>
      <th>Framework</th>
      <th>Local Dev Observability</th>
      <th>Production Observability</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Genkit</strong></td>
      <td>Built-in Dev UI, trace explorer, Dotprompt editor</td>
      <td>OTEL-compatible, Cloud Trace, Langfuse</td>
    </tr>
    <tr>
      <td><strong>Vercel AI SDK</strong></td>
      <td>Basic Developer Panel</td>
      <td>OTEL, Vercel Observability (platform-tied)</td>
    </tr>
    <tr>
      <td><strong>Mastra</strong></td>
      <td>Studio UI for workflows</td>
      <td>Still maturing</td>
    </tr>
    <tr>
      <td><strong>LangChain</strong></td>
      <td>Minimal without LangSmith</td>
      <td>LangSmith (best-in-class, SaaS)</td>
    </tr>
    <tr>
      <td><strong>ADK</strong></td>
      <td>ADK Web UI</td>
      <td>Cloud Trace + Vertex (GCP-tied)</td>
    </tr>
  </tbody>
</table>

<h3 id="language-support">Language Support</h3>

<table>
  <thead>
    <tr>
      <th>Framework</th>
      <th>Primary</th>
      <th>Additional</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Genkit</strong></td>
      <td>TypeScript</td>
      <td>Python (preview), Go, Dart/Flutter (preview), Java (unofficial)</td>
    </tr>
    <tr>
      <td><strong>Vercel AI SDK</strong></td>
      <td>TypeScript</td>
      <td>Node.js runtimes, Edge</td>
    </tr>
    <tr>
      <td><strong>Mastra</strong></td>
      <td>TypeScript</td>
      <td>JS runtimes only</td>
    </tr>
    <tr>
      <td><strong>LangChain</strong></td>
      <td>Python</td>
      <td>TypeScript (near-parity, Python idioms)</td>
    </tr>
    <tr>
      <td><strong>ADK</strong></td>
      <td>Python</td>
      <td>TypeScript, Go, Java</td>
    </tr>
  </tbody>
</table>

<h3 id="framework-neutrality">Framework Neutrality</h3>

<p><strong>Genkit</strong>, <strong>Vercel AI SDK</strong>, and <strong>Mastra</strong> were built from the ground up to be provider-neutral. They support OpenAI, Anthropic, Google, and others through a unified API, and they deploy to any infrastructure.</p>
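
<p>In practice, “unified API” means application code depends on one interface and the provider becomes a configuration value. A minimal sketch of that idea follows; the <code class="language-plaintext highlighter-rouge">ModelProvider</code> interface and <code class="language-plaintext highlighter-rouge">generateWith</code> function are hypothetical, and the stub providers return canned text rather than calling any real API.</p>

```typescript
// Hypothetical sketch of a provider-neutral interface. The provider names are
// labels only; the stubs return canned text instead of calling real APIs.
interface ModelProvider {
  id: string;
  generate(prompt: string): string;
}

const providers: Record<string, ModelProvider> = {
  openai: { id: 'openai', generate: (p) => `[openai] ${p}` },
  anthropic: { id: 'anthropic', generate: (p) => `[anthropic] ${p}` },
  google: { id: 'google', generate: (p) => `[google] ${p}` },
};

// Application code depends only on the interface, so switching providers is a
// configuration change rather than a rewrite.
function generateWith(providerId: string, prompt: string): string {
  const provider = providers[providerId];
  if (!provider) throw new Error(`Unknown provider: ${providerId}`);
  return provider.generate(prompt);
}

const fromAnthropic = generateWith('anthropic', 'Hello');
const fromGoogle = generateWith('google', 'Hello');
```

<p>Each of the three frameworks exposes its own version of this seam; the sketch only shows why the calling code does not change when the provider does.</p>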

<p><strong>LangChain</strong> and <strong>ADK</strong> are platform-influenced. LangChain’s full power unlocks with LangSmith; ADK’s full power unlocks on Google Cloud. This is not a dealbreaker (both platforms are excellent), but it is an architectural commitment you should make consciously.</p>

<h3 id="idiom-and-code-style">Idiom and Code Style</h3>

<p>Genkit, Mastra, and Vercel AI SDK feel <strong>natively TypeScript</strong>: async/await everywhere, Zod schemas for validation, module-based composition, and no runtime class inheritance chains to navigate.</p>

<p>LangChain and ADK’s TypeScript SDKs carry the weight of their Python origins. You’ll find class-heavy APIs, <code class="language-plaintext highlighter-rouge">.pipe()</code> chains, and patterns that feel natural if you’ve written LangChain Python but unfamiliar if you’re coming from the TypeScript world. This is not a quality judgment; it’s a question of cultural fit.</p>

<hr />

<h2 id="which-framework-should-you-choose">Which Framework Should You Choose?</h2>

<p>After building with all five, here’s my honest take:</p>

<p><strong>Choose Genkit if:</strong></p>
<ul>
  <li>You want to iterate on your AI features quickly, with short feedback loops: Genkit was built from the ground up for powerful local tooling and observability.</li>
  <li>You need to mix vanilla generation, typed pipelines (flows), and agents in the same app.</li>
  <li>Provider neutrality is important now or likely to be important later.</li>
  <li>You’re building a Flutter/Dart mobile app and need AI capabilities.</li>
  <li>You want OpenTelemetry-compatible tracing without configuring a separate backend.</li>
</ul>

<p><strong>Choose Vercel AI SDK if:</strong></p>
<ul>
  <li>You’re building a React/Next.js app and want the lowest-friction path to streaming AI UI.</li>
  <li>Simplicity and minimal API surface matter more than built-in abstractions.</li>
  <li>You’re already on the Vercel platform and want native integration.</li>
  <li>Your use case maps well to the UI hooks (<code class="language-plaintext highlighter-rouge">useChat</code>, <code class="language-plaintext highlighter-rouge">useCompletion</code>, generative UI).</li>
</ul>

<p><strong>Choose Mastra if:</strong></p>
<ul>
  <li>You’re a TypeScript developer who wants to spin up a production agent as fast as possible.</li>
  <li>You want a clean, idiomatic TypeScript agent API without Python-ported patterns.</li>
  <li>The visual Studio UI for workflow design appeals to your team.</li>
  <li>You’re building in the Next.js/SvelteKit/Hono ecosystem.</li>
</ul>

<p><strong>Choose LangChain if:</strong></p>
<ul>
  <li>Your team is coming from the Python AI ecosystem and wants cross-language continuity.</li>
  <li>You need the broadest possible integration ecosystem (the most integrations of any framework).</li>
  <li>You’re investing in LangSmith for production observability and want a cohesive platform.</li>
  <li>LangGraph’s graph-based orchestration matches your workflow complexity.</li>
</ul>

<p><strong>Choose ADK if:</strong></p>
<ul>
  <li>You’re building enterprise-grade multi-agent systems on Google Cloud.</li>
  <li>Vertex AI’s infrastructure (Agent Engine, Cloud Trace, Vertex Search) is already in your stack.</li>
  <li>You need battle-tested multi-language support including Go and Java.</li>
  <li>Agent evaluation at scale (user simulation, custom metrics) is a core requirement.</li>
</ul>

<hr />

<h2 id="conclusion">Conclusion</h2>

<p>The Generative AI framework landscape in 2026 is not a winner-take-all market. Each of the five frameworks covered here has a legitimate use case, a growing community, and an active development team.</p>

<p>If I had to crown one framework as the most versatile choice for teams that haven’t already committed to a cloud platform, it would be <strong>Genkit</strong>. Its combination of multi-level abstractions, provider neutrality, and, above all, the Developer UI creates a development experience that genuinely accelerates iteration. The fact that it is expanding to Dart/Flutter, Python, and Go while keeping its TypeScript SDK as the best-in-class experience is a sign of a team thinking about the long game.</p>

<p>That said, none of these frameworks is going away. LangChain’s ecosystem depth, ADK’s enterprise footprint, Vercel’s UI ergonomics, and Mastra’s TypeScript-native speed all serve real needs. The most important thing is to make the choice deliberately, understanding what you’re trading when you pick a platform-tied framework, and what you’re gaining when you pick a more opinionated one.</p>

<p>Happy building.</p>

<hr />

<p><em>Last updated: April 2026. Framework versions referenced: Genkit 1.x, Vercel AI SDK 6.x, Mastra 0.x (latest), LangChain JS 0.3.x, Google ADK 2.0.</em></p>]]></content><author><name>Xavier Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><category term="genkit" /><category term="genkit" /><summary type="html"><![CDATA[A practical, in-depth comparison of the top Generative AI frameworks in 2026: Genkit, Vercel AI SDK, Mastra, LangChain, and Google ADK, from someone who has built with all of them. (English)]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://xavidop.me/assets/img/blog/post-headers/top-genai-frameworks-2026.png" /><media:content medium="image" url="https://xavidop.me/assets/img/blog/post-headers/top-genai-frameworks-2026.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry xml:lang="en"><title type="html">Running Genkit on AWS Lambda with Bedrock (English)</title><link href="https://xavidop.me/genkit/2026-03-20-genkit-aws-lambda-bedrock/" rel="alternate" type="text/html" title="Running Genkit on AWS Lambda with Bedrock (English)" /><published>2026-03-20T00:00:00+00:00</published><updated>2026-03-20T17:41:31+00:00</updated><id>https://xavidop.me/genkit/genkit-aws-lambda-bedrock</id><content type="html" xml:base="https://xavidop.me/genkit/2026-03-20-genkit-aws-lambda-bedrock/"><![CDATA[<ol class="no_toc" id="markdown-toc">
  <li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li>
  <li><a href="#why-genkit-on-aws" id="markdown-toc-why-genkit-on-aws">Why Genkit on AWS?</a></li>
  <li><a href="#prerequisites" id="markdown-toc-prerequisites">Prerequisites</a></li>
  <li><a href="#project-structure" id="markdown-toc-project-structure">Project Structure</a></li>
  <li><a href="#initializing-genkit-with-bedrock" id="markdown-toc-initializing-genkit-with-bedrock">Initializing Genkit with Bedrock</a></li>
  <li><a href="#defining-flows" id="markdown-toc-defining-flows">Defining Flows</a>    <ol>
      <li><a href="#story-generator-flow" id="markdown-toc-story-generator-flow">Story Generator Flow</a></li>
      <li><a href="#joke-flow-with-streaming" id="markdown-toc-joke-flow-with-streaming">Joke Flow with Streaming</a></li>
    </ol>
  </li>
  <li><a href="#wrapping-flows-as-lambda-handlers" id="markdown-toc-wrapping-flows-as-lambda-handlers">Wrapping Flows as Lambda Handlers</a></li>
  <li><a href="#the-genkit-dev-ui" id="markdown-toc-the-genkit-dev-ui">The Genkit Dev UI</a></li>
  <li><a href="#local-development-with-serverless-offline" id="markdown-toc-local-development-with-serverless-offline">Local Development with Serverless Offline</a></li>
  <li><a href="#deployment" id="markdown-toc-deployment">Deployment</a></li>
  <li><a href="#using-the-genkit-client-sdk" id="markdown-toc-using-the-genkit-client-sdk">Using the Genkit Client SDK</a></li>
  <li><a href="#resources" id="markdown-toc-resources">Resources</a></li>
  <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ol>

<h2 id="introduction">Introduction</h2>

<p>One of the most powerful things about <strong>Genkit</strong> is that it is cloud-agnostic. You are not locked into a single provider. In this post, we will explore how to run Genkit flows on <strong>AWS Lambda</strong> using the <strong>AWS Bedrock plugin</strong>, deploying a full AI-powered story and joke generator with streaming support, all managed by the <strong>Serverless Framework</strong>.</p>

<p>The project uses the <code class="language-plaintext highlighter-rouge">onCallGenkit</code> helper from the AWS Bedrock plugin, which wraps any Genkit flow as a Lambda handler automatically, handling CORS, request parsing, error formatting, and even streaming via Lambda Function URLs.</p>
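
<p>To give a feel for the boilerplate such a helper removes, here is a simplified sketch of a flow-to-handler wrapper covering CORS, request parsing, and error formatting. It is not the plugin’s implementation: <code class="language-plaintext highlighter-rouge">wrapFlow</code>, the event and result types, and the <code class="language-plaintext highlighter-rouge">{ result }</code> / <code class="language-plaintext highlighter-rouge">{ error }</code> response shape are all assumptions made for illustration, and streaming is left out.</p>

```typescript
// Hypothetical sketch of what a flow-to-Lambda wrapper has to do. This is not
// the plugin's implementation; the types below are simplified stand-ins for
// the HTTP event and result shapes a Lambda handler works with.
interface HttpEvent { httpMethod: string; body: string | null }
interface HttpResult { statusCode: number; headers: Record<string, string>; body: string }

const CORS_HEADERS: Record<string, string> = {
  'Access-Control-Allow-Origin': '*',
  'Access-Control-Allow-Headers': 'Content-Type',
  'Content-Type': 'application/json',
};

function wrapFlow<I, O>(flow: (input: I) => Promise<O>) {
  return async (event: HttpEvent): Promise<HttpResult> => {
    // Answer CORS preflight requests without running the flow.
    if (event.httpMethod === 'OPTIONS') {
      return { statusCode: 204, headers: CORS_HEADERS, body: '' };
    }
    try {
      // Request parsing: assume the flow input travels as JSON in the body.
      const input = JSON.parse(event.body ?? 'null') as I;
      const result = await flow(input);
      return { statusCode: 200, headers: CORS_HEADERS, body: JSON.stringify({ result }) };
    } catch (err) {
      // Error formatting: return a JSON error instead of crashing the handler.
      const message = err instanceof Error ? err.message : String(err);
      return { statusCode: 500, headers: CORS_HEADERS, body: JSON.stringify({ error: message }) };
    }
  };
}

// Usage with a stub flow standing in for a Genkit flow.
const echoHandler = wrapFlow(async (input: { name: string }) => `Hello, ${input.name}`);
```

<p>Writing and testing this for every flow is exactly the repetitive work a single helper call is meant to replace.</p>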

<h2 id="why-genkit-on-aws">Why Genkit on AWS?</h2>

<p>If your infrastructure lives on AWS, you might think Genkit is not for you. Think again. The community-maintained <a href="https://github.com/genkit-ai/aws-bedrock-js-plugin">AWS Bedrock plugin</a> brings first-class Bedrock support to Genkit, giving you:</p>

<ul>
  <li>Access to <strong>Amazon Nova</strong>, <strong>Anthropic Claude</strong>, and other Bedrock models</li>
  <li>The <code class="language-plaintext highlighter-rouge">onCallGenkit</code> helper for zero-boilerplate Lambda handlers</li>
  <li>Full compatibility with the <strong>Genkit Dev UI</strong> for local development</li>
  <li>Streaming support via Lambda Function URLs</li>
</ul>

<h2 id="prerequisites">Prerequisites</h2>

<ul>
  <li><strong>Node.js 20</strong> or later</li>
  <li><strong>AWS Account</strong> with AWS CLI configured and access to AWS Bedrock</li>
  <li><strong>Serverless Framework</strong> (installed as a dev dependency)</li>
  <li><strong>Genkit CLI</strong> installed globally</li>
</ul>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm <span class="nb">install</span> <span class="nt">-g</span> genkit-cli
</code></pre></div></div>

<h2 id="project-structure">Project Structure</h2>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>genkit-aws-lambda-bedrock/
├── src/
│   └── index.ts          # Genkit flows + Lambda handlers via onCallGenkit
├── serverless.yml        # Serverless Framework configuration
├── tsconfig.json         # TypeScript configuration
├── package.json          # Dependencies and scripts
└── README.md
</code></pre></div></div>

<h2 id="initializing-genkit-with-bedrock">Initializing Genkit with Bedrock</h2>

<p>The setup is minimal. Import the plugin, initialize Genkit, and pick your model:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">genkit</span><span class="p">,</span> <span class="nx">z</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkit</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">awsBedrock</span><span class="p">,</span> <span class="nx">amazonNovaProV1</span><span class="p">,</span> <span class="nx">onCallGenkit</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkitx-aws-bedrock</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">ai</span> <span class="o">=</span> <span class="nf">genkit</span><span class="p">({</span>
  <span class="na">plugins</span><span class="p">:</span> <span class="p">[</span><span class="nf">awsBedrock</span><span class="p">()],</span>
  <span class="na">model</span><span class="p">:</span> <span class="nf">amazonNovaProV1</span><span class="p">(),</span>
<span class="p">});</span>
</code></pre></div></div>

<p>That is it. Genkit is now configured to use <strong>Amazon Nova Pro</strong> via Bedrock. You can swap to <code class="language-plaintext highlighter-rouge">anthropicClaude35SonnetV2</code> or any other supported model with a single line change.</p>

<h2 id="defining-flows">Defining Flows</h2>

<p>Genkit flows are the core building block. Each flow has typed input and output schemas using <strong>Zod</strong>, making everything type-safe from end to end.</p>

<h3 id="story-generator-flow">Story Generator Flow</h3>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">StoryInputSchema</span> <span class="o">=</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
  <span class="na">topic</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">describe</span><span class="p">(</span><span class="dl">'</span><span class="s1">The main topic or theme for the story</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">style</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">optional</span><span class="p">().</span><span class="nf">describe</span><span class="p">(</span><span class="dl">'</span><span class="s1">Writing style (e.g., adventure, mystery, sci-fi)</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">length</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">enum</span><span class="p">([</span><span class="dl">'</span><span class="s1">short</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">medium</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">long</span><span class="dl">'</span><span class="p">]).</span><span class="k">default</span><span class="p">(</span><span class="dl">'</span><span class="s1">medium</span><span class="dl">'</span><span class="p">),</span>
<span class="p">});</span>

<span class="kd">const</span> <span class="nx">StoryOutputSchema</span> <span class="o">=</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
  <span class="na">title</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="na">genre</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="na">story</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="na">wordCount</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">number</span><span class="p">(),</span>
  <span class="na">themes</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">()),</span>
<span class="p">});</span>

<span class="kd">const</span> <span class="nx">storyGeneratorFlow</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">defineFlow</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">storyGeneratorFlow</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">inputSchema</span><span class="p">:</span> <span class="nx">StoryInputSchema</span><span class="p">,</span>
    <span class="na">outputSchema</span><span class="p">:</span> <span class="nx">StoryOutputSchema</span><span class="p">,</span>
  <span class="p">},</span>
  <span class="k">async </span><span class="p">(</span><span class="nx">input</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="nx">lengthMap</span> <span class="o">=</span> <span class="p">{</span> <span class="na">short</span><span class="p">:</span> <span class="dl">'</span><span class="s1">200-300</span><span class="dl">'</span><span class="p">,</span> <span class="na">medium</span><span class="p">:</span> <span class="dl">'</span><span class="s1">500-700</span><span class="dl">'</span><span class="p">,</span> <span class="na">long</span><span class="p">:</span> <span class="dl">'</span><span class="s1">1000-1500</span><span class="dl">'</span> <span class="p">};</span>
    <span class="kd">const</span> <span class="nx">wordCount</span> <span class="o">=</span> <span class="nx">lengthMap</span><span class="p">[</span><span class="nx">input</span><span class="p">.</span><span class="nx">length</span><span class="p">];</span>

    <span class="kd">const</span> <span class="nx">prompt</span> <span class="o">=</span> <span class="s2">`Create a creative </span><span class="p">${</span><span class="nx">input</span><span class="p">.</span><span class="nx">style</span> <span class="o">||</span> <span class="dl">'</span><span class="s1">fictional</span><span class="dl">'</span><span class="p">}</span><span class="s2"> story with the following requirements:
      Topic: </span><span class="p">${</span><span class="nx">input</span><span class="p">.</span><span class="nx">topic</span><span class="p">}</span><span class="s2">
      Length: </span><span class="p">${</span><span class="nx">wordCount</span><span class="p">}</span><span class="s2"> words
      
      Please provide a captivating story with a clear beginning, middle, and end.
      Include rich descriptions and engaging characters.`</span><span class="p">;</span>

    <span class="kd">const</span> <span class="p">{</span> <span class="nx">output</span> <span class="p">}</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
      <span class="nx">prompt</span><span class="p">,</span>
      <span class="na">output</span><span class="p">:</span> <span class="p">{</span> <span class="na">schema</span><span class="p">:</span> <span class="nx">StoryOutputSchema</span> <span class="p">},</span>
    <span class="p">});</span>

    <span class="k">if </span><span class="p">(</span><span class="o">!</span><span class="nx">output</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">throw</span> <span class="k">new</span> <span class="nc">Error</span><span class="p">(</span><span class="dl">'</span><span class="s1">Failed to generate story</span><span class="dl">'</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="k">return</span> <span class="nx">output</span><span class="p">;</span>
  <span class="p">}</span>
<span class="p">);</span>
</code></pre></div></div>

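<p>Notice how the <code class="language-plaintext highlighter-rouge">output</code> returned by <code class="language-plaintext highlighter-rouge">ai.generate</code> is a fully structured, typed object that matches <code class="language-plaintext highlighter-rouge">StoryOutputSchema</code>. No JSON parsing, no string manipulation. Genkit handles all of that for you.</p>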
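
<p>For comparison, this is roughly the manual work that structured output saves: parse the model’s raw text as JSON, then check every field before trusting it. The sketch is illustrative only. The field names mirror <code class="language-plaintext highlighter-rouge">StoryOutputSchema</code>, but <code class="language-plaintext highlighter-rouge">parseStory</code> is a hypothetical hand-rolled validator and not how Genkit or Zod work internally.</p>

```typescript
// Hypothetical sketch of the work a structured-output helper saves you:
// parse the model's raw text as JSON, then check it against the expected
// shape. The validator is hand-rolled for illustration only.
interface StoryOutput {
  title: string;
  genre: string;
  story: string;
  wordCount: number;
  themes: string[];
}

function parseStory(raw: string): StoryOutput | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null; // The model did not return valid JSON.
  }
  if (typeof value !== 'object' || value === null) return null;
  const { title, genre, story, wordCount, themes } = value as Record<string, unknown>;
  const valid =
    typeof title === 'string' &&
    typeof genre === 'string' &&
    typeof story === 'string' &&
    typeof wordCount === 'number' &&
    Array.isArray(themes) &&
    themes.every((t) => typeof t === 'string');
  return valid ? (value as StoryOutput) : null;
}
```

<p>With a schema passed to the generate call, the flow body can skip all of this and only needs to handle the case where no output is produced.</p>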

<h3 id="joke-flow-with-streaming">Joke Flow with Streaming</h3>

<p>Genkit also supports streaming responses. The <code class="language-plaintext highlighter-rouge">jokeStreamingFlow</code> uses <code class="language-plaintext highlighter-rouge">ai.generateStream</code> and <code class="language-plaintext highlighter-rouge">sendChunk</code> to emit text chunks as they arrive from the LLM:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">jokeStreamingFlow</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">defineFlow</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">jokeStreamingFlow</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">inputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
      <span class="na">subject</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">describe</span><span class="p">(</span><span class="dl">'</span><span class="s1">The subject to tell a joke about</span><span class="dl">'</span><span class="p">),</span>
    <span class="p">}),</span>
    <span class="na">outputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
      <span class="na">joke</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
      <span class="na">type</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">optional</span><span class="p">(),</span>
    <span class="p">}),</span>
    <span class="na">streamSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="p">},</span>
  <span class="k">async </span><span class="p">(</span><span class="nx">input</span><span class="p">,</span> <span class="p">{</span> <span class="nx">sendChunk</span> <span class="p">})</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="p">{</span> <span class="nx">stream</span><span class="p">,</span> <span class="nx">response</span> <span class="p">}</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generateStream</span><span class="p">({</span>
      <span class="na">prompt</span><span class="p">:</span> <span class="s2">`Tell me a funny joke about </span><span class="p">${</span><span class="nx">input</span><span class="p">.</span><span class="nx">subject</span><span class="p">}</span><span class="s2">. Make it clever and appropriate for all ages.`</span><span class="p">,</span>
      <span class="na">output</span><span class="p">:</span> <span class="p">{</span>
        <span class="na">schema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
          <span class="na">joke</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
          <span class="na">type</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">optional</span><span class="p">(),</span>
        <span class="p">}),</span>
      <span class="p">},</span>
    <span class="p">});</span>

    <span class="k">for</span> <span class="k">await </span><span class="p">(</span><span class="kd">const</span> <span class="nx">chunk</span> <span class="k">of</span> <span class="nx">stream</span><span class="p">)</span> <span class="p">{</span>
      <span class="nf">sendChunk</span><span class="p">(</span><span class="nx">chunk</span><span class="p">.</span><span class="nx">text</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="kd">const</span> <span class="nx">result</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">response</span><span class="p">;</span>
    <span class="k">return</span> <span class="nx">result</span><span class="p">.</span><span class="nx">output</span> <span class="o">||</span> <span class="p">{</span> <span class="na">joke</span><span class="p">:</span> <span class="nx">result</span><span class="p">.</span><span class="nx">text</span> <span class="p">};</span>
  <span class="p">}</span>
<span class="p">);</span>
</code></pre></div></div>
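
<p>To see the chunk-forwarding pattern in isolation, here is a minimal sketch that does not call Genkit at all. <code class="language-plaintext highlighter-rouge">fakeModelStream</code> and <code class="language-plaintext highlighter-rouge">forwardChunks</code> are hypothetical names invented for this illustration; the generator only stands in for the <code class="language-plaintext highlighter-rouge">stream</code> that <code class="language-plaintext highlighter-rouge">ai.generateStream</code> returns.</p>

```typescript
// Hypothetical stand-in for the model stream: yields text chunks one by one.
async function* fakeModelStream(parts: string[]) {
  for (const text of parts) {
    yield { text };
  }
}

// Forwards every chunk to the callback as it arrives, then returns the
// accumulated text, mirroring the loop-then-return shape of the flow above.
async function forwardChunks(
  stream: AsyncIterable<{ text: string }>,
  sendChunk: (chunk: string) => void
) {
  let full = '';
  for await (const chunk of stream) {
    sendChunk(chunk.text);
    full += chunk.text;
  }
  return { joke: full };
}
```

<p>Because the callback is injected, the same loop can be exercised in a unit test by collecting chunks into an array instead of sending them to a client.</p>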

<h2 id="wrapping-flows-as-lambda-handlers">Wrapping Flows as Lambda Handlers</h2>

<p>The <code class="language-plaintext highlighter-rouge">onCallGenkit</code> helper does the wiring. It wraps any Genkit flow in a Lambda handler and accepts optional CORS, debug, and streaming settings:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Simple flow handler</span>
<span class="k">export</span> <span class="kd">const</span> <span class="nx">jokeHandler</span> <span class="o">=</span> <span class="nf">onCallGenkit</span><span class="p">(</span><span class="nx">jokeFlow</span><span class="p">);</span>

<span class="c1">// With CORS and debug options</span>
<span class="k">export</span> <span class="kd">const</span> <span class="nx">storyGeneratorHandler</span> <span class="o">=</span> <span class="nf">onCallGenkit</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">cors</span><span class="p">:</span> <span class="p">{</span> <span class="na">origin</span><span class="p">:</span> <span class="dl">'</span><span class="s1">*</span><span class="dl">'</span><span class="p">,</span> <span class="na">methods</span><span class="p">:</span> <span class="p">[</span><span class="dl">'</span><span class="s1">POST</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">OPTIONS</span><span class="dl">'</span><span class="p">]</span> <span class="p">},</span>
    <span class="na">debug</span><span class="p">:</span> <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">NODE_ENV</span> <span class="o">!==</span> <span class="dl">'</span><span class="s1">production</span><span class="dl">'</span><span class="p">,</span>
  <span class="p">},</span>
  <span class="nx">storyGeneratorFlow</span>
<span class="p">);</span>

<span class="c1">// Streaming handler (requires Lambda Function URL with RESPONSE_STREAM)</span>
<span class="k">export</span> <span class="kd">const</span> <span class="nx">jokeStreamHandler</span> <span class="o">=</span> <span class="nf">onCallGenkit</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">streaming</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
    <span class="na">cors</span><span class="p">:</span> <span class="p">{</span> <span class="na">origin</span><span class="p">:</span> <span class="dl">'</span><span class="s1">*</span><span class="dl">'</span><span class="p">,</span> <span class="na">methods</span><span class="p">:</span> <span class="p">[</span><span class="dl">'</span><span class="s1">POST</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">OPTIONS</span><span class="dl">'</span><span class="p">]</span> <span class="p">},</span>
    <span class="na">debug</span><span class="p">:</span> <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">NODE_ENV</span> <span class="o">!==</span> <span class="dl">'</span><span class="s1">production</span><span class="dl">'</span><span class="p">,</span>
  <span class="p">},</span>
  <span class="nx">jokeStreamingFlow</span>
<span class="p">);</span>
</code></pre></div></div>
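
<p>Conceptually, a wrapper of this kind unwraps the request body, runs the flow, and wraps the result. The sketch below is not the plugin's implementation: <code class="language-plaintext highlighter-rouge">wrapFlow</code> and <code class="language-plaintext highlighter-rouge">HttpEvent</code> are hypothetical names, the event shape is heavily simplified, and the <code class="language-plaintext highlighter-rouge">{ data }</code> request and <code class="language-plaintext highlighter-rouge">{ result }</code> response envelopes are assumptions made for illustration.</p>

```typescript
// Hypothetical, simplified event shape (not the real AWS Lambda event type).
type HttpEvent = { body: string | null };

// Wraps a function so it can be invoked with an HTTP-style event.
// Assumes the request body looks like {"data": input} and replies with {"result": output}.
function wrapFlow(flow: (input: unknown) => unknown) {
  return async (event: HttpEvent) => {
    let input: unknown;
    try {
      // A malformed body is the caller's fault, so report it as a 400.
      input = (JSON.parse(event.body ?? '{}') as { data?: unknown }).data;
    } catch {
      return { statusCode: 400, body: JSON.stringify({ error: 'Invalid JSON body' }) };
    }
    try {
      const result = await flow(input);
      return { statusCode: 200, body: JSON.stringify({ result }) };
    } catch (err) {
      // Anything thrown by the flow itself is reported as a 500.
      const message = err instanceof Error ? err.message : 'Unknown error';
      return { statusCode: 500, body: JSON.stringify({ error: message }) };
    }
  };
}
```

<p>The real helper also handles CORS, debug output, and streaming, which this sketch leaves out on purpose.</p>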

<h2 id="the-genkit-dev-ui">The Genkit Dev UI</h2>

<p>This is where Genkit truly shines during development. Run:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm run genkit:ui
</code></pre></div></div>

<p>This starts the <strong>Genkit Developer UI</strong> at <code class="language-plaintext highlighter-rouge">http://localhost:4000</code>. From here, you can:</p>

<ul>
  <li><strong>Test any flow</strong> visually with different inputs, no cURL needed</li>
  <li><strong>View detailed traces</strong> of every AI generation, including prompts, model responses, and latency</li>
  <li><strong>Debug and optimize prompts</strong> interactively</li>
  <li><strong>Inspect streaming</strong> responses in real time</li>
</ul>

<p>The Dev UI is model-agnostic, so even though we are using AWS Bedrock, the same visual debugging experience applies. This is one of the biggest advantages of Genkit: a unified developer experience regardless of the underlying AI provider.</p>

<h2 id="local-development-with-serverless-offline">Local Development with Serverless Offline</h2>

<p>For testing the Lambda locally with a real HTTP endpoint:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm run dev
</code></pre></div></div>

<p>This starts a local server at <code class="language-plaintext highlighter-rouge">http://localhost:3000</code> that mimics API Gateway. Test it with:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl <span class="nt">-X</span> POST http://localhost:3000/generate <span class="se">\</span>
  <span class="nt">-H</span> <span class="s2">"Content-Type: application/json"</span> <span class="se">\</span>
  <span class="nt">-d</span> <span class="s1">'{
    "data": {
      "topic": "a robot learning to feel emotions",
      "style": "sci-fi",
      "length": "medium"
    }
  }'</span>
</code></pre></div></div>

<h2 id="deployment">Deployment</h2>

<p>Deploy to AWS with a single command:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm run deploy
</code></pre></div></div>

<p>After deployment, you will see output like:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>endpoints:
  POST - https://abc123.execute-api.us-east-1.amazonaws.com/generate
  POST - https://abc123.execute-api.us-east-1.amazonaws.com/joke
functions:
  storyGenerator: genkit-aws-lambda-bedrock-dev-storyGenerator
  jokeGenerator: genkit-aws-lambda-bedrock-dev-jokeGenerator
</code></pre></div></div>

<h2 id="using-the-genkit-client-sdk">Using the Genkit Client SDK</h2>

<p>You can also call your deployed flows from a frontend or another service using the <code class="language-plaintext highlighter-rouge">genkit/beta/client</code> SDK:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">runFlow</span><span class="p">,</span> <span class="nx">streamFlow</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkit/beta/client</span><span class="dl">'</span><span class="p">;</span>

<span class="c1">// Non-streaming</span>
<span class="kd">const</span> <span class="nx">result</span> <span class="o">=</span> <span class="k">await</span> <span class="nf">runFlow</span><span class="p">({</span>
  <span class="na">url</span><span class="p">:</span> <span class="dl">'</span><span class="s1">https://your-api-url.amazonaws.com/generate</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">input</span><span class="p">:</span> <span class="p">{</span> <span class="na">topic</span><span class="p">:</span> <span class="dl">'</span><span class="s1">space exploration</span><span class="dl">'</span><span class="p">,</span> <span class="na">style</span><span class="p">:</span> <span class="dl">'</span><span class="s1">sci-fi</span><span class="dl">'</span><span class="p">,</span> <span class="na">length</span><span class="p">:</span> <span class="dl">'</span><span class="s1">short</span><span class="dl">'</span> <span class="p">},</span>
<span class="p">});</span>

<span class="c1">// Streaming</span>
<span class="kd">const</span> <span class="nx">streamResult</span> <span class="o">=</span> <span class="nf">streamFlow</span><span class="p">({</span>
  <span class="na">url</span><span class="p">:</span> <span class="dl">'</span><span class="s1">https://your-api-url.amazonaws.com/joke-stream</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">input</span><span class="p">:</span> <span class="p">{</span> <span class="na">subject</span><span class="p">:</span> <span class="dl">'</span><span class="s1">TypeScript</span><span class="dl">'</span> <span class="p">},</span>
<span class="p">});</span>
<span class="k">for</span> <span class="k">await </span><span class="p">(</span><span class="kd">const</span> <span class="nx">chunk</span> <span class="k">of</span> <span class="nx">streamResult</span><span class="p">.</span><span class="nx">stream</span><span class="p">)</span> <span class="p">{</span>
  <span class="nx">process</span><span class="p">.</span><span class="nx">stdout</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="nx">chunk</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<h2 id="resources">Resources</h2>

<ul>
  <li><a href="https://genkit.dev/docs/">Genkit Documentation</a></li>
  <li><a href="https://github.com/genkit-ai/aws-bedrock-js-plugin">AWS Bedrock Plugin</a></li>
  <li><a href="https://www.serverless.com/framework/docs">Serverless Framework Documentation</a></li>
  <li><a href="https://aws.amazon.com/bedrock/">AWS Bedrock</a></li>
</ul>

<h2 id="conclusion">Conclusion</h2>

<p>Genkit makes it incredibly easy to build, test, and deploy AI-powered applications on AWS. The combination of typed flows, the Dev UI for visual debugging, and the <code class="language-plaintext highlighter-rouge">onCallGenkit</code> helper for zero-boilerplate Lambda handlers means you spend your time on AI logic, not infrastructure plumbing.</p>

<p>You can find the full code of this example in the <a href="https://github.com/xavidop/genkit-aws-lambda-bedrock">GitHub repository</a>.</p>

<p>Happy coding!</p>]]></content><author><name>Xavier Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><category term="genkit" /><category term="genkit" /><category term="gcp" /><category term="aws" /><summary type="html"><![CDATA[Deploy AI-powered flows to AWS Lambda using Genkit and the AWS Bedrock plugin]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://xavidop.me/assets/img/blog/post-headers/genkit-aws-lambda-bedrock.png" /><media:content medium="image" url="https://xavidop.me/assets/img/blog/post-headers/genkit-aws-lambda-bedrock.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry xml:lang="en"><title type="html">Running Genkit on Azure Functions with Azure OpenAI (English)</title><link href="https://xavidop.me/genkit/2026-03-20-genkit-azure-function-ai-foundry/" rel="alternate" type="text/html" title="Running Genkit on Azure Functions with Azure OpenAI (English)" /><published>2026-03-20T00:00:00+00:00</published><updated>2026-03-20T17:41:31+00:00</updated><id>https://xavidop.me/genkit/genkit-azure-function-ai-foundry</id><content type="html" xml:base="https://xavidop.me/genkit/2026-03-20-genkit-azure-function-ai-foundry/"><![CDATA[<ol class="no_toc" id="markdown-toc">
  <li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li>
  <li><a href="#why-genkit-on-azure" id="markdown-toc-why-genkit-on-azure">Why Genkit on Azure?</a></li>
  <li><a href="#prerequisites" id="markdown-toc-prerequisites">Prerequisites</a></li>
  <li><a href="#project-structure" id="markdown-toc-project-structure">Project Structure</a></li>
  <li><a href="#initializing-genkit-with-azure-openai" id="markdown-toc-initializing-genkit-with-azure-openai">Initializing Genkit with Azure OpenAI</a></li>
  <li><a href="#defining-flows" id="markdown-toc-defining-flows">Defining Flows</a>    <ol>
      <li><a href="#story-generator-flow" id="markdown-toc-story-generator-flow">Story Generator Flow</a></li>
      <li><a href="#streaming-joke-flow" id="markdown-toc-streaming-joke-flow">Streaming Joke Flow</a></li>
      <li><a href="#protected-summary-flow-with-api-key-auth" id="markdown-toc-protected-summary-flow-with-api-key-auth">Protected Summary Flow with API Key Auth</a></li>
    </ol>
  </li>
  <li><a href="#registering-flows-as-azure-functions" id="markdown-toc-registering-flows-as-azure-functions">Registering Flows as Azure Functions</a></li>
  <li><a href="#the-genkit-dev-ui" id="markdown-toc-the-genkit-dev-ui">The Genkit Dev UI</a></li>
  <li><a href="#local-development" id="markdown-toc-local-development">Local Development</a>    <ol>
      <li><a href="#run-with-azure-functions-core-tools" id="markdown-toc-run-with-azure-functions-core-tools">Run with Azure Functions Core Tools</a></li>
      <li><a href="#using-the-genkit-client-sdk" id="markdown-toc-using-the-genkit-client-sdk">Using the Genkit Client SDK</a></li>
    </ol>
  </li>
  <li><a href="#deployment" id="markdown-toc-deployment">Deployment</a></li>
  <li><a href="#resources" id="markdown-toc-resources">Resources</a></li>
  <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ol>

<h2 id="introduction">Introduction</h2>

<p>Genkit is not tied to any single cloud. In this post, we will explore how to run Genkit flows on <strong>Azure Functions</strong> powered by <strong>Azure OpenAI</strong>, building a story generator, a joke generator with streaming, and a protected summary endpoint with API key authentication, all using the <code class="language-plaintext highlighter-rouge">onCallGenkit</code> helper from the Azure OpenAI plugin.</p>

<p>This project shows how Genkit brings the same developer experience (typed flows, structured output, and the Dev UI) to the Azure ecosystem.</p>

<h2 id="why-genkit-on-azure">Why Genkit on Azure?</h2>

<p>Azure has a first-class AI offering through <strong>Azure OpenAI Service</strong> with GPT-4o and other powerful models. The <a href="https://github.com/genkit-ai/azure-foundry-js-plugin">Azure OpenAI plugin for Genkit</a> brings all of these models into the Genkit ecosystem, giving you:</p>

<ul>
  <li>Access to <strong>GPT-4o</strong>, <strong>GPT-3.5 Turbo</strong>, and other Azure OpenAI models</li>
  <li>The <code class="language-plaintext highlighter-rouge">onCallGenkit</code> helper for zero-boilerplate Azure Function handlers</li>
  <li>Built-in API key authentication via <code class="language-plaintext highlighter-rouge">requireApiKey</code></li>
  <li>Full compatibility with the <strong>Genkit Dev UI</strong> for local testing</li>
  <li>Streaming support via SSE (Server-Sent Events)</li>
</ul>

<h2 id="prerequisites">Prerequisites</h2>

<ul>
  <li><strong>Node.js 20</strong> or later</li>
  <li><strong>Azure Account</strong> with an Azure OpenAI resource deployed</li>
  <li><strong>Azure Functions Core Tools v4</strong></li>
  <li><strong>Genkit CLI</strong> installed globally</li>
</ul>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm <span class="nb">install</span> <span class="nt">-g</span> genkit-cli
</code></pre></div></div>

<h2 id="project-structure">Project Structure</h2>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>genkit-azure-function-ai-foundry/
├── src/
│   └── index.ts          # Main Azure Function handler with Genkit flows
├── host.json             # Azure Functions host configuration
├── local.settings.json   # Local config
├── .env                  # Environment variables
├── tsconfig.json         # TypeScript configuration
├── package.json          # Dependencies and scripts
└── README.md
</code></pre></div></div>

<h2 id="initializing-genkit-with-azure-openai">Initializing Genkit with Azure OpenAI</h2>

<p>Setting up Genkit with Azure OpenAI is straightforward:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">genkit</span><span class="p">,</span> <span class="nx">z</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkit</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span>
  <span class="nx">azureOpenAI</span><span class="p">,</span>
  <span class="nx">gpt4o</span><span class="p">,</span>
  <span class="nx">onCallGenkit</span><span class="p">,</span>
  <span class="nx">requireApiKey</span><span class="p">,</span>
<span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkitx-azure-openai</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="o">*</span> <span class="kd">as </span><span class="nx">dotenv</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">dotenv</span><span class="dl">'</span><span class="p">;</span>

<span class="nx">dotenv</span><span class="p">.</span><span class="nf">config</span><span class="p">();</span>

<span class="kd">const</span> <span class="nx">ai</span> <span class="o">=</span> <span class="nf">genkit</span><span class="p">({</span>
  <span class="na">plugins</span><span class="p">:</span> <span class="p">[</span>
    <span class="nf">azureOpenAI</span><span class="p">({</span>
      <span class="c1">// Reads from environment variables:</span>
      <span class="c1">// AZURE_OPENAI_ENDPOINT</span>
      <span class="c1">// AZURE_OPENAI_API_KEY</span>
      <span class="c1">// OPENAI_API_VERSION</span>
    <span class="p">}),</span>
  <span class="p">],</span>
  <span class="na">model</span><span class="p">:</span> <span class="nx">gpt4o</span><span class="p">,</span>
<span class="p">});</span>
</code></pre></div></div>

<p>The plugin reads your Azure credentials from environment variables. No manual HTTP clients, no JSON wrangling. Just configure and go.</p>
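
<p>Since everything hinges on those variables being set, it can help to fail fast at startup when one is missing. The following is a minimal sketch, not part of the plugin: <code class="language-plaintext highlighter-rouge">missingEnv</code> and <code class="language-plaintext highlighter-rouge">REQUIRED_ENV</code> are hypothetical names, and the variable list is simply copied from the comment in the snippet above.</p>

```typescript
// Variable names taken from the plugin configuration comment above.
const REQUIRED_ENV = ['AZURE_OPENAI_ENDPOINT', 'AZURE_OPENAI_API_KEY', 'OPENAI_API_VERSION'];

// Returns the names of required variables that are missing or blank.
// The environment is passed in explicitly so the check is easy to test;
// in a real app you would pass process.env.
function missingEnv(env: { [name: string]: string | undefined }): string[] {
  return REQUIRED_ENV.filter((name) => (env[name] ?? '').trim() === '');
}
```

<p>A typical use would be to call <code class="language-plaintext highlighter-rouge">missingEnv(process.env)</code> right after <code class="language-plaintext highlighter-rouge">dotenv.config()</code> and throw if the returned array is not empty.</p>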

<h2 id="defining-flows">Defining Flows</h2>

<h3 id="story-generator-flow">Story Generator Flow</h3>

<p>The story generator uses <strong>structured output</strong>, which means Genkit instructs the LLM to return a typed object matching your Zod schema directly:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">StoryInputSchema</span> <span class="o">=</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
  <span class="na">topic</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">describe</span><span class="p">(</span><span class="dl">'</span><span class="s1">The main topic or theme for the story</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">style</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">optional</span><span class="p">().</span><span class="nf">describe</span><span class="p">(</span><span class="dl">'</span><span class="s1">Writing style (e.g., adventure, mystery, sci-fi)</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">length</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">enum</span><span class="p">([</span><span class="dl">'</span><span class="s1">short</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">medium</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">long</span><span class="dl">'</span><span class="p">]).</span><span class="k">default</span><span class="p">(</span><span class="dl">'</span><span class="s1">medium</span><span class="dl">'</span><span class="p">),</span>
<span class="p">});</span>

<span class="kd">const</span> <span class="nx">StorySchema</span> <span class="o">=</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
  <span class="na">title</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="na">genre</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="na">story</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="na">wordCount</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">number</span><span class="p">(),</span>
  <span class="na">themes</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">()),</span>
<span class="p">});</span>

<span class="kd">const</span> <span class="nx">storyGeneratorFlow</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">defineFlow</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">storyGeneratorFlow</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">inputSchema</span><span class="p">:</span> <span class="nx">StoryInputSchema</span><span class="p">,</span>
    <span class="na">outputSchema</span><span class="p">:</span> <span class="nx">StorySchema</span><span class="p">,</span>
  <span class="p">},</span>
  <span class="k">async </span><span class="p">(</span><span class="nx">input</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="nx">lengthMap</span> <span class="o">=</span> <span class="p">{</span> <span class="na">short</span><span class="p">:</span> <span class="dl">'</span><span class="s1">200-300</span><span class="dl">'</span><span class="p">,</span> <span class="na">medium</span><span class="p">:</span> <span class="dl">'</span><span class="s1">500-700</span><span class="dl">'</span><span class="p">,</span> <span class="na">long</span><span class="p">:</span> <span class="dl">'</span><span class="s1">1000-1500</span><span class="dl">'</span> <span class="p">};</span>
    <span class="kd">const</span> <span class="nx">wordCount</span> <span class="o">=</span> <span class="nx">lengthMap</span><span class="p">[</span><span class="nx">input</span><span class="p">.</span><span class="nx">length</span><span class="p">];</span>

    <span class="kd">const</span> <span class="nx">prompt</span> <span class="o">=</span> <span class="s2">`Create a creative </span><span class="p">${</span><span class="nx">input</span><span class="p">.</span><span class="nx">style</span> <span class="o">||</span> <span class="dl">'</span><span class="s1">fictional</span><span class="dl">'</span><span class="p">}</span><span class="s2"> story with the following requirements:
      Topic: </span><span class="p">${</span><span class="nx">input</span><span class="p">.</span><span class="nx">topic</span><span class="p">}</span><span class="s2">
      Length: </span><span class="p">${</span><span class="nx">wordCount</span><span class="p">}</span><span class="s2"> words
      
      Please provide a captivating story with a clear beginning, middle, and end.
      Include rich descriptions and engaging characters.`</span><span class="p">;</span>

    <span class="kd">const</span> <span class="p">{</span> <span class="nx">output</span> <span class="p">}</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
      <span class="nx">prompt</span><span class="p">,</span>
      <span class="na">output</span><span class="p">:</span> <span class="p">{</span> <span class="na">schema</span><span class="p">:</span> <span class="nx">StorySchema</span> <span class="p">},</span>
    <span class="p">});</span>

    <span class="k">if </span><span class="p">(</span><span class="o">!</span><span class="nx">output</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">throw</span> <span class="k">new</span> <span class="nc">Error</span><span class="p">(</span><span class="dl">'</span><span class="s1">Failed to generate story</span><span class="dl">'</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="k">return</span> <span class="nx">output</span><span class="p">;</span>
  <span class="p">}</span>
<span class="p">);</span>
</code></pre></div></div>

<p>The returned <code class="language-plaintext highlighter-rouge">output</code> is a fully typed object that matches <code class="language-plaintext highlighter-rouge">StorySchema</code>. No <code class="language-plaintext highlighter-rouge">JSON.parse</code>, no manual deserialization.</p>
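
<p>One optional refinement is to pull the prompt construction out of the flow into a pure function, so it can be tested without calling the model. The sketch below assumes a hypothetical helper named <code class="language-plaintext highlighter-rouge">buildStoryPrompt</code> and a hypothetical <code class="language-plaintext highlighter-rouge">StoryLength</code> type; the word-count ranges are the ones used in the flow above, and the prompt is shortened to its first three lines.</p>

```typescript
// Word-count ranges copied from the flow above.
const lengthMap = { short: '200-300', medium: '500-700', long: '1000-1500' } as const;
type StoryLength = keyof typeof lengthMap;

// Hypothetical helper: builds the prompt string without touching the model.
function buildStoryPrompt(input: { topic: string; style?: string; length: StoryLength }): string {
  const style = input.style || 'fictional';
  return [
    `Create a creative ${style} story with the following requirements:`,
    `Topic: ${input.topic}`,
    `Length: ${lengthMap[input.length]} words`,
  ].join('\n');
}
```

<p>Typing <code class="language-plaintext highlighter-rouge">length</code> as a key of <code class="language-plaintext highlighter-rouge">lengthMap</code> means an unsupported value is rejected at compile time rather than producing <code class="language-plaintext highlighter-rouge">undefined</code> in the prompt.</p>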

<h3 id="streaming-joke-flow">Streaming Joke Flow</h3>

<p>The streaming flow uses <code class="language-plaintext highlighter-rouge">ai.generateStream</code> and emits chunks via <code class="language-plaintext highlighter-rouge">sendChunk</code>:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">jokeStreamingFlow</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">defineFlow</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">jokeStreamingFlow</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">inputSchema</span><span class="p">:</span> <span class="nx">JokeInputSchema</span><span class="p">,</span>
    <span class="na">outputSchema</span><span class="p">:</span> <span class="nx">JokeOutputSchema</span><span class="p">,</span>
    <span class="na">streamSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="p">},</span>
  <span class="k">async </span><span class="p">(</span><span class="nx">input</span><span class="p">,</span> <span class="p">{</span> <span class="nx">sendChunk</span> <span class="p">})</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="p">{</span> <span class="nx">stream</span><span class="p">,</span> <span class="nx">response</span> <span class="p">}</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generateStream</span><span class="p">({</span>
      <span class="na">prompt</span><span class="p">:</span> <span class="s2">`Tell me a long and funny joke about </span><span class="p">${</span><span class="nx">input</span><span class="p">.</span><span class="nx">subject</span><span class="p">}</span><span class="s2">`</span><span class="p">,</span>
    <span class="p">});</span>

    <span class="k">for</span> <span class="k">await </span><span class="p">(</span><span class="kd">const</span> <span class="nx">chunk</span> <span class="k">of</span> <span class="nx">stream</span><span class="p">)</span> <span class="p">{</span>
      <span class="nf">sendChunk</span><span class="p">(</span><span class="nx">chunk</span><span class="p">.</span><span class="nx">text</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="kd">const</span> <span class="nx">result</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">response</span><span class="p">;</span>
    <span class="k">return</span> <span class="p">{</span> <span class="na">joke</span><span class="p">:</span> <span class="nx">result</span><span class="p">.</span><span class="nx">text</span> <span class="p">};</span>
  <span class="p">}</span>
<span class="p">);</span>
</code></pre></div></div>

<h3 id="protected-summary-flow-with-api-key-auth">Protected Summary Flow with API Key Auth</h3>

<p>One of the unique features of the Azure OpenAI plugin is the built-in <code class="language-plaintext highlighter-rouge">requireApiKey</code> context provider. This lets you protect flows with API key authentication without writing any middleware:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">export</span> <span class="kd">const</span> <span class="nx">protectedHandler</span> <span class="o">=</span> <span class="nf">onCallGenkit</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">contextProvider</span><span class="p">:</span> <span class="nf">requireApiKey</span><span class="p">(</span>
      <span class="dl">'</span><span class="s1">X-API-Key</span><span class="dl">'</span><span class="p">,</span>
      <span class="c1">// Demo fallback only: set API_KEY in real deployments</span>
      <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">API_KEY</span> <span class="o">||</span> <span class="dl">'</span><span class="s1">demo-api-key</span><span class="dl">'</span>
    <span class="p">),</span>
    <span class="na">cors</span><span class="p">:</span> <span class="p">{</span>
      <span class="na">origin</span><span class="p">:</span> <span class="p">[</span><span class="dl">'</span><span class="s1">https://myapp.com</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">http://localhost:3000</span><span class="dl">'</span><span class="p">],</span>
      <span class="na">credentials</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
    <span class="p">},</span>
    <span class="na">onError</span><span class="p">:</span> <span class="k">async </span><span class="p">(</span><span class="nx">error</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">({</span>
      <span class="na">statusCode</span><span class="p">:</span> <span class="nx">error</span><span class="p">.</span><span class="nx">message</span><span class="p">.</span><span class="nf">includes</span><span class="p">(</span><span class="dl">'</span><span class="s1">Unauthorized</span><span class="dl">'</span><span class="p">)</span> <span class="p">?</span> <span class="mi">401</span> <span class="p">:</span> <span class="mi">500</span><span class="p">,</span>
      <span class="na">message</span><span class="p">:</span> <span class="nx">error</span><span class="p">.</span><span class="nx">message</span><span class="p">,</span>
    <span class="p">}),</span>
  <span class="p">},</span>
  <span class="nx">protectedSummaryFlow</span>
<span class="p">);</span>
</code></pre></div></div>
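
<p>To make the moving parts concrete, here is a minimal, self-contained sketch of what an API key check and the <code class="language-plaintext highlighter-rouge">onError</code> status mapping above amount to. It is not the plugin’s implementation: <code class="language-plaintext highlighter-rouge">checkApiKey</code>, <code class="language-plaintext highlighter-rouge">statusFor</code>, and <code class="language-plaintext highlighter-rouge">HeaderMap</code> are hypothetical names used only for illustration, and the sketch assumes a rejected request surfaces as an error whose message contains <code class="language-plaintext highlighter-rouge">Unauthorized</code>, as the handler above expects.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Hypothetical helpers for illustration; not part of the Azure OpenAI plugin.
type HeaderMap = { [name: string]: string | undefined };

// Look up a header case-insensitively and compare it with the expected key.
function checkApiKey(headers: HeaderMap, headerName: string, expected: string): void {
  const wanted = headerName.toLowerCase();
  for (const name of Object.keys(headers)) {
    if (name.toLowerCase() === wanted) {
      if (headers[name] === expected) {
        return; // key matches, the request may proceed
      }
      break;
    }
  }
  throw new Error('Unauthorized: missing or invalid API key');
}

// Same mapping as the onError handler above: 401 for auth failures, 500 otherwise.
function statusFor(error: Error): number {
  return error.message.includes('Unauthorized') ? 401 : 500;
}
</code></pre></div></div>

<p>Matching on the error message is simple but brittle. If the provider exposes a typed error or a status code, that is likely a more robust signal.</p>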

<h2 id="registering-flows-as-azure-functions">Registering Flows as Azure Functions</h2>

<p>The <code class="language-plaintext highlighter-rouge">onCallGenkit</code> helper wraps Genkit flows as Azure Function HTTP triggers:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// With CORS and debug</span>
<span class="k">export</span> <span class="kd">const</span> <span class="nx">storyGeneratorHandler</span> <span class="o">=</span> <span class="nf">onCallGenkit</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">cors</span><span class="p">:</span> <span class="p">{</span> <span class="na">origin</span><span class="p">:</span> <span class="dl">'</span><span class="s1">*</span><span class="dl">'</span> <span class="p">},</span>
    <span class="na">debug</span><span class="p">:</span> <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">NODE_ENV</span> <span class="o">!==</span> <span class="dl">'</span><span class="s1">production</span><span class="dl">'</span><span class="p">,</span>
  <span class="p">},</span>
  <span class="nx">storyGeneratorFlow</span>
<span class="p">);</span>

<span class="c1">// Simplest form</span>
<span class="k">export</span> <span class="kd">const</span> <span class="nx">jokeHandler</span> <span class="o">=</span> <span class="nf">onCallGenkit</span><span class="p">(</span><span class="nx">jokeFlow</span><span class="p">);</span>

<span class="c1">// Streaming with SSE</span>
<span class="k">export</span> <span class="kd">const</span> <span class="nx">jokeStreamHandler</span> <span class="o">=</span> <span class="nf">onCallGenkit</span><span class="p">(</span>
  <span class="p">{</span> <span class="na">streaming</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span> <span class="na">cors</span><span class="p">:</span> <span class="p">{</span> <span class="na">origin</span><span class="p">:</span> <span class="dl">'</span><span class="s1">*</span><span class="dl">'</span> <span class="p">}</span> <span class="p">},</span>
  <span class="nx">jokeStreamingFlow</span>
<span class="p">);</span>
</code></pre></div></div>

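<p>Together with the <code class="language-plaintext highlighter-rouge">protectedHandler</code> defined earlier, this gives you four Azure Function endpoints:</p>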

<table>
  <thead>
    <tr>
      <th>Flow</th>
      <th>Endpoint</th>
      <th>Auth</th>
      <th>Features</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>storyGeneratorFlow</td>
      <td>POST /api/storyGeneratorFlow</td>
      <td>anonymous</td>
      <td>CORS, debug</td>
    </tr>
    <tr>
      <td>jokeFlow</td>
      <td>POST /api/jokeFlow</td>
      <td>anonymous</td>
      <td>Simplest form</td>
    </tr>
    <tr>
      <td>jokeStreamingFlow</td>
      <td>POST /api/jokeStreamingFlow</td>
      <td>anonymous</td>
      <td>SSE streaming</td>
    </tr>
    <tr>
      <td>protectedSummaryFlow</td>
      <td>POST /api/protectedSummaryFlow</td>
      <td>API key</td>
      <td>Auth, CORS, custom error handler</td>
    </tr>
  </tbody>
</table>

<h2 id="the-genkit-dev-ui">The Genkit Dev UI</h2>

<p>Run the following command to start the Genkit Developer UI:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm run genkit:ui
</code></pre></div></div>

<p>This opens the Dev UI at <code class="language-plaintext highlighter-rouge">http://localhost:4000</code> where you can:</p>

<ul>
  <li><strong>Test all four flows</strong> with different inputs visually</li>
  <li><strong>View detailed traces</strong> of every AI call, including the prompt sent, model response, latency, and token usage</li>
  <li><strong>Debug streaming flows</strong> and watch chunks arrive in real time</li>
  <li><strong>Inspect structured outputs</strong> and verify the schema is being followed</li>
</ul>

<p>The Dev UI works identically whether you are using Azure OpenAI, Google Gemini, or AWS Bedrock. This unified experience is one of Genkit’s greatest strengths: you develop and debug the same way regardless of the backend.</p>

<h2 id="local-development">Local Development</h2>

<h3 id="run-with-azure-functions-core-tools">Run with Azure Functions Core Tools</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm run func:start
</code></pre></div></div>

<p>This starts a local server at <code class="language-plaintext highlighter-rouge">http://localhost:7071</code> that mimics the Azure Functions runtime:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl <span class="nt">-X</span> POST http://localhost:7071/api/storyGeneratorFlow <span class="se">\</span>
  <span class="nt">-H</span> <span class="s2">"Content-Type: application/json"</span> <span class="se">\</span>
  <span class="nt">-d</span> <span class="s1">'{
    "data": {
      "topic": "a robot learning to feel emotions",
      "style": "sci-fi",
      "length": "medium"
    }
  }'</span>
</code></pre></div></div>
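
<p>The same request can be sketched in TypeScript. The helper below only builds the request, so it runs without a server; <code class="language-plaintext highlighter-rouge">buildFlowRequest</code> is a hypothetical name, and the <code class="language-plaintext highlighter-rouge">/api/flowName</code> path and <code class="language-plaintext highlighter-rouge">data</code> envelope are taken from the curl call above. Sending it assumes Node 18 or later, where <code class="language-plaintext highlighter-rouge">fetch</code> is available globally.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Hypothetical helper for illustration; mirrors the curl call above.
function buildFlowRequest(baseUrl: string, flowName: string, input: unknown) {
  return {
    url: `${baseUrl}/api/${flowName}`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      // The flow input travels inside a "data" envelope, as in the curl example.
      body: JSON.stringify({ data: input }),
    },
  };
}

const request = buildFlowRequest('http://localhost:7071', 'storyGeneratorFlow', {
  topic: 'a robot learning to feel emotions',
  style: 'sci-fi',
  length: 'medium',
});

// With the local Functions host running:
// const response = await fetch(request.url, request.init);
// console.log(await response.json());
</code></pre></div></div>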

<h3 id="using-the-genkit-client-sdk">Using the Genkit Client SDK</h3>

<p>You can also call your flows using the <code class="language-plaintext highlighter-rouge">genkit/beta/client</code> SDK:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">runFlow</span><span class="p">,</span> <span class="nx">streamFlow</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkit/beta/client</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">result</span> <span class="o">=</span> <span class="k">await</span> <span class="nf">runFlow</span><span class="p">({</span>
  <span class="na">url</span><span class="p">:</span> <span class="dl">'</span><span class="s1">http://localhost:7071/api/jokeFlow</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">input</span><span class="p">:</span> <span class="p">{</span> <span class="na">subject</span><span class="p">:</span> <span class="dl">'</span><span class="s1">programming</span><span class="dl">'</span> <span class="p">},</span>
<span class="p">});</span>

<span class="c1">// Streaming</span>
<span class="kd">const</span> <span class="nx">streamResult</span> <span class="o">=</span> <span class="nf">streamFlow</span><span class="p">({</span>
  <span class="na">url</span><span class="p">:</span> <span class="dl">'</span><span class="s1">http://localhost:7071/api/jokeStreamingFlow</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">input</span><span class="p">:</span> <span class="p">{</span> <span class="na">subject</span><span class="p">:</span> <span class="dl">'</span><span class="s1">TypeScript</span><span class="dl">'</span> <span class="p">},</span>
<span class="p">});</span>
<span class="k">for</span> <span class="k">await </span><span class="p">(</span><span class="kd">const</span> <span class="nx">chunk</span> <span class="k">of</span> <span class="nx">streamResult</span><span class="p">.</span><span class="nx">stream</span><span class="p">)</span> <span class="p">{</span>
  <span class="nx">process</span><span class="p">.</span><span class="nx">stdout</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="nx">chunk</span><span class="p">);</span>
<span class="p">}</span>

<span class="c1">// With API key auth</span>
<span class="kd">const</span> <span class="nx">summary</span> <span class="o">=</span> <span class="k">await</span> <span class="nf">runFlow</span><span class="p">({</span>
  <span class="na">url</span><span class="p">:</span> <span class="dl">'</span><span class="s1">http://localhost:7071/api/protectedSummaryFlow</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">input</span><span class="p">:</span> <span class="p">{</span> <span class="na">text</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Your long text...</span><span class="dl">'</span><span class="p">,</span> <span class="na">maxLength</span><span class="p">:</span> <span class="mi">50</span> <span class="p">},</span>
  <span class="na">headers</span><span class="p">:</span> <span class="p">{</span> <span class="dl">'</span><span class="s1">X-API-Key</span><span class="dl">'</span><span class="p">:</span> <span class="dl">'</span><span class="s1">demo-api-key</span><span class="dl">'</span> <span class="p">},</span>
<span class="p">});</span>
</code></pre></div></div>

<h2 id="deployment">Deployment</h2>

<p>Build and deploy to Azure:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm run build
npm run deploy
</code></pre></div></div>

<p>Make sure to set your environment variables in Azure:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>az functionapp config appsettings <span class="nb">set</span> <span class="se">\</span>
  <span class="nt">--name</span> myFunctionAppName <span class="se">\</span>
  <span class="nt">--resource-group</span> myResourceGroup <span class="se">\</span>
  <span class="nt">--settings</span> <span class="se">\</span>
    <span class="nv">AZURE_OPENAI_ENDPOINT</span><span class="o">=</span><span class="s2">"https://your-resource-name.openai.azure.com/"</span> <span class="se">\</span>
    <span class="nv">AZURE_OPENAI_API_KEY</span><span class="o">=</span><span class="s2">"your-api-key-here"</span> <span class="se">\</span>
    <span class="nv">OPENAI_API_VERSION</span><span class="o">=</span><span class="s2">"2024-10-21"</span>
</code></pre></div></div>
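
<p>A missing app setting may only show up when a flow first calls the model, so a small startup check can help fail fast. The sketch below is an assumption about how you might do that, not something the plugin provides: <code class="language-plaintext highlighter-rouge">missingSettings</code> is a hypothetical helper, and the setting names are the ones from the command above.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Hypothetical startup check; the setting names come from the command above.
const REQUIRED_SETTINGS = [
  'AZURE_OPENAI_ENDPOINT',
  'AZURE_OPENAI_API_KEY',
  'OPENAI_API_VERSION',
];

type Env = { [name: string]: string | undefined };

// Return the names of required settings that are unset or empty.
function missingSettings(env: Env): string[] {
  return REQUIRED_SETTINGS.filter(function (name) {
    return !env[name];
  });
}

// Example: call once at startup and fail fast.
// const missing = missingSettings(process.env);
// if (missing.length !== 0) throw new Error(`Missing app settings: ${missing.join(', ')}`);
</code></pre></div></div>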

<h2 id="resources">Resources</h2>

<ul>
  <li><a href="https://genkit.dev/docs/">Genkit Documentation</a></li>
  <li><a href="https://github.com/genkit-ai/azure-foundry-js-plugin">Azure OpenAI Plugin</a></li>
  <li><a href="https://docs.microsoft.com/azure/azure-functions/">Azure Functions Documentation</a></li>
  <li><a href="https://azure.microsoft.com/services/cognitive-services/openai-service/">Azure OpenAI Service</a></li>
</ul>

<h2 id="conclusion">Conclusion</h2>

<p>Genkit brings a consistent, delightful developer experience to Azure Functions. The combination of typed flows with Zod schemas, structured LLM output, the Dev UI for visual debugging, and the <code class="language-plaintext highlighter-rouge">onCallGenkit</code> helper makes building AI-powered Azure Functions as straightforward as defining a function.</p>

<p>You can find the full code of this example in the <a href="https://github.com/xavidop/genkit-azure-function-ai-foundry">GitHub repository</a>.</p>

<p>Happy coding!</p>]]></content><author><name>Xavier Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><category term="genkit" /><category term="genkit" /><category term="gcp" /><category term="azure" /><summary type="html"><![CDATA[Deploy AI-powered flows to Azure Functions using Genkit and the Azure OpenAI plugin]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://xavidop.me/assets/img/blog/post-headers/genkit-azure-function-ai-foundry.png" /><media:content medium="image" url="https://xavidop.me/assets/img/blog/post-headers/genkit-azure-function-ai-foundry.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>