<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"><generator uri="https://jekyllrb.com/" version="4.3.3">Jekyll</generator><link href="https://xavidop.me/feed.xml" rel="self" type="application/atom+xml" /><link href="https://xavidop.me/" rel="alternate" type="text/html" hreflang="en" /><updated>2026-03-20T17:45:37+00:00</updated><id>https://xavidop.me/feed.xml</id><title type="html">Xavier Portilla Edo</title><subtitle>Personal Blog of Xavier Portilla Edo.
</subtitle><author><name>Xavi Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><entry xml:lang="en"><title type="html">Running Genkit on AWS Lambda with Bedrock (English)</title><link href="https://xavidop.me/genkit/2026-03-20-genkit-aws-lambda-bedrock/" rel="alternate" type="text/html" title="Running Genkit on AWS Lambda with Bedrock (English)" /><published>2026-03-20T00:00:00+00:00</published><updated>2026-03-20T17:41:31+00:00</updated><id>https://xavidop.me/genkit/genkit-aws-lambda-bedrock</id><content type="html" xml:base="https://xavidop.me/genkit/2026-03-20-genkit-aws-lambda-bedrock/"><![CDATA[<ol class="no_toc" id="markdown-toc">
  <li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li>
  <li><a href="#why-genkit-on-aws" id="markdown-toc-why-genkit-on-aws">Why Genkit on AWS?</a></li>
  <li><a href="#prerequisites" id="markdown-toc-prerequisites">Prerequisites</a></li>
  <li><a href="#project-structure" id="markdown-toc-project-structure">Project Structure</a></li>
  <li><a href="#initializing-genkit-with-bedrock" id="markdown-toc-initializing-genkit-with-bedrock">Initializing Genkit with Bedrock</a></li>
  <li><a href="#defining-flows" id="markdown-toc-defining-flows">Defining Flows</a>    <ol>
      <li><a href="#story-generator-flow" id="markdown-toc-story-generator-flow">Story Generator Flow</a></li>
      <li><a href="#joke-flow-with-streaming" id="markdown-toc-joke-flow-with-streaming">Joke Flow with Streaming</a></li>
    </ol>
  </li>
  <li><a href="#wrapping-flows-as-lambda-handlers" id="markdown-toc-wrapping-flows-as-lambda-handlers">Wrapping Flows as Lambda Handlers</a></li>
  <li><a href="#the-genkit-dev-ui" id="markdown-toc-the-genkit-dev-ui">The Genkit Dev UI</a></li>
  <li><a href="#local-development-with-serverless-offline" id="markdown-toc-local-development-with-serverless-offline">Local Development with Serverless Offline</a></li>
  <li><a href="#deployment" id="markdown-toc-deployment">Deployment</a></li>
  <li><a href="#using-the-genkit-client-sdk" id="markdown-toc-using-the-genkit-client-sdk">Using the Genkit Client SDK</a></li>
  <li><a href="#resources" id="markdown-toc-resources">Resources</a></li>
  <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ol>

<h2 id="introduction">Introduction</h2>

<p>One of the most powerful things about <strong>Genkit</strong> is that it is cloud-agnostic. You are not locked into a single provider. In this post, we will explore how to run Genkit flows on <strong>AWS Lambda</strong> using the <strong>AWS Bedrock plugin</strong>, deploying a full AI-powered story and joke generator with streaming support, all managed by the <strong>Serverless Framework</strong>.</p>

<p>The project uses the <code class="language-plaintext highlighter-rouge">onCallGenkit</code> helper from the AWS Bedrock plugin, which wraps any Genkit flow as a Lambda handler automatically, handling CORS, request parsing, error formatting, and even streaming via Lambda Function URLs.</p>

<h2 id="why-genkit-on-aws">Why Genkit on AWS?</h2>

<p>If your infrastructure lives on AWS, you might think Genkit is not for you. Think again. The community-maintained <a href="https://github.com/genkit-ai/aws-bedrock-js-plugin">AWS Bedrock plugin</a> brings first-class Bedrock support to Genkit, giving you:</p>

<ul>
  <li>Access to <strong>Amazon Nova</strong>, <strong>Anthropic Claude</strong>, and other Bedrock models</li>
  <li>The <code class="language-plaintext highlighter-rouge">onCallGenkit</code> helper for zero-boilerplate Lambda handlers</li>
  <li>Full compatibility with the <strong>Genkit Dev UI</strong> for local development</li>
  <li>Streaming support via Lambda Function URLs</li>
</ul>

<h2 id="prerequisites">Prerequisites</h2>

<ul>
  <li><strong>Node.js 20</strong> or later</li>
  <li><strong>AWS Account</strong> with AWS CLI configured and access to AWS Bedrock</li>
  <li><strong>Serverless Framework</strong> (installed as dev dependency)</li>
  <li><strong>Genkit CLI</strong> installed globally</li>
</ul>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm <span class="nb">install</span> <span class="nt">-g</span> genkit-cli
</code></pre></div></div>

<h2 id="project-structure">Project Structure</h2>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>genkit-aws-lambda-bedrock/
├── src/
│   └── index.ts          # Genkit flows + Lambda handlers via onCallGenkit
├── serverless.yml        # Serverless Framework configuration
├── tsconfig.json         # TypeScript configuration
├── package.json          # Dependencies and scripts
└── README.md
</code></pre></div></div>

<h2 id="initializing-genkit-with-bedrock">Initializing Genkit with Bedrock</h2>

<p>The setup is minimal. Import the plugin, initialize Genkit, and pick your model:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">genkit</span><span class="p">,</span> <span class="nx">z</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkit</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">awsBedrock</span><span class="p">,</span> <span class="nx">amazonNovaProV1</span><span class="p">,</span> <span class="nx">onCallGenkit</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkitx-aws-bedrock</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">ai</span> <span class="o">=</span> <span class="nf">genkit</span><span class="p">({</span>
  <span class="na">plugins</span><span class="p">:</span> <span class="p">[</span><span class="nf">awsBedrock</span><span class="p">()],</span>
  <span class="na">model</span><span class="p">:</span> <span class="nf">amazonNovaProV1</span><span class="p">(),</span>
<span class="p">});</span>
</code></pre></div></div>

<p>That is it. Genkit is now configured to use <strong>Amazon Nova Pro</strong> via Bedrock. You can swap to <code class="language-plaintext highlighter-rouge">anthropicClaude35SonnetV2</code> or any other supported model with a single line change.</p>

<h2 id="defining-flows">Defining Flows</h2>

<p>Genkit flows are the core building block. Each flow has typed input and output schemas using <strong>Zod</strong>, making everything type-safe from end to end.</p>

<h3 id="story-generator-flow">Story Generator Flow</h3>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">StoryInputSchema</span> <span class="o">=</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
  <span class="na">topic</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">describe</span><span class="p">(</span><span class="dl">'</span><span class="s1">The main topic or theme for the story</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">style</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">optional</span><span class="p">().</span><span class="nf">describe</span><span class="p">(</span><span class="dl">'</span><span class="s1">Writing style (e.g., adventure, mystery, sci-fi)</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">length</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">enum</span><span class="p">([</span><span class="dl">'</span><span class="s1">short</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">medium</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">long</span><span class="dl">'</span><span class="p">]).</span><span class="k">default</span><span class="p">(</span><span class="dl">'</span><span class="s1">medium</span><span class="dl">'</span><span class="p">),</span>
<span class="p">});</span>

<span class="kd">const</span> <span class="nx">StoryOutputSchema</span> <span class="o">=</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
  <span class="na">title</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="na">genre</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="na">story</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="na">wordCount</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">number</span><span class="p">(),</span>
  <span class="na">themes</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">()),</span>
<span class="p">});</span>

<span class="kd">const</span> <span class="nx">storyGeneratorFlow</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">defineFlow</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">storyGeneratorFlow</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">inputSchema</span><span class="p">:</span> <span class="nx">StoryInputSchema</span><span class="p">,</span>
    <span class="na">outputSchema</span><span class="p">:</span> <span class="nx">StoryOutputSchema</span><span class="p">,</span>
  <span class="p">},</span>
  <span class="k">async </span><span class="p">(</span><span class="nx">input</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="nx">lengthMap</span> <span class="o">=</span> <span class="p">{</span> <span class="na">short</span><span class="p">:</span> <span class="dl">'</span><span class="s1">200-300</span><span class="dl">'</span><span class="p">,</span> <span class="na">medium</span><span class="p">:</span> <span class="dl">'</span><span class="s1">500-700</span><span class="dl">'</span><span class="p">,</span> <span class="na">long</span><span class="p">:</span> <span class="dl">'</span><span class="s1">1000-1500</span><span class="dl">'</span> <span class="p">};</span>
    <span class="kd">const</span> <span class="nx">wordCount</span> <span class="o">=</span> <span class="nx">lengthMap</span><span class="p">[</span><span class="nx">input</span><span class="p">.</span><span class="nx">length</span><span class="p">];</span>

    <span class="kd">const</span> <span class="nx">prompt</span> <span class="o">=</span> <span class="s2">`Create a creative </span><span class="p">${</span><span class="nx">input</span><span class="p">.</span><span class="nx">style</span> <span class="o">||</span> <span class="dl">'</span><span class="s1">fictional</span><span class="dl">'</span><span class="p">}</span><span class="s2"> story with the following requirements:
      Topic: </span><span class="p">${</span><span class="nx">input</span><span class="p">.</span><span class="nx">topic</span><span class="p">}</span><span class="s2">
      Length: </span><span class="p">${</span><span class="nx">wordCount</span><span class="p">}</span><span class="s2"> words
      
      Please provide a captivating story with a clear beginning, middle, and end.
      Include rich descriptions and engaging characters.`</span><span class="p">;</span>

    <span class="kd">const</span> <span class="p">{</span> <span class="nx">output</span> <span class="p">}</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
      <span class="nx">prompt</span><span class="p">,</span>
      <span class="na">output</span><span class="p">:</span> <span class="p">{</span> <span class="na">schema</span><span class="p">:</span> <span class="nx">StoryOutputSchema</span> <span class="p">},</span>
    <span class="p">});</span>

    <span class="k">if </span><span class="p">(</span><span class="o">!</span><span class="nx">output</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">throw</span> <span class="k">new</span> <span class="nc">Error</span><span class="p">(</span><span class="dl">'</span><span class="s1">Failed to generate story</span><span class="dl">'</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="k">return</span> <span class="nx">output</span><span class="p">;</span>
  <span class="p">}</span>
<span class="p">);</span>
</code></pre></div></div>

<p>Notice how <code class="language-plaintext highlighter-rouge">ai.generate</code> returns a fully structured, typed object. No JSON parsing, no string manipulation. Genkit handles all of that for you.</p>

<h3 id="joke-flow-with-streaming">Joke Flow with Streaming</h3>

<p>Genkit also supports streaming responses. The <code class="language-plaintext highlighter-rouge">jokeStreamingFlow</code> uses <code class="language-plaintext highlighter-rouge">ai.generateStream</code> and <code class="language-plaintext highlighter-rouge">sendChunk</code> to emit text chunks as they arrive from the LLM:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">jokeStreamingFlow</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">defineFlow</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">jokeStreamingFlow</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">inputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
      <span class="na">subject</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">describe</span><span class="p">(</span><span class="dl">'</span><span class="s1">The subject to tell a joke about</span><span class="dl">'</span><span class="p">),</span>
    <span class="p">}),</span>
    <span class="na">outputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
      <span class="na">joke</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
      <span class="na">type</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">optional</span><span class="p">(),</span>
    <span class="p">}),</span>
    <span class="na">streamSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="p">},</span>
  <span class="k">async </span><span class="p">(</span><span class="nx">input</span><span class="p">,</span> <span class="nx">sendChunk</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="p">{</span> <span class="nx">stream</span><span class="p">,</span> <span class="nx">response</span> <span class="p">}</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generateStream</span><span class="p">({</span>
      <span class="na">prompt</span><span class="p">:</span> <span class="s2">`Tell me a funny joke about </span><span class="p">${</span><span class="nx">input</span><span class="p">.</span><span class="nx">subject</span><span class="p">}</span><span class="s2">. Make it clever and appropriate for all ages.`</span><span class="p">,</span>
      <span class="na">output</span><span class="p">:</span> <span class="p">{</span>
        <span class="na">schema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
          <span class="na">joke</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
          <span class="na">type</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">optional</span><span class="p">(),</span>
        <span class="p">}),</span>
      <span class="p">},</span>
    <span class="p">});</span>

    <span class="k">for</span> <span class="k">await </span><span class="p">(</span><span class="kd">const</span> <span class="nx">chunk</span> <span class="k">of</span> <span class="nx">stream</span><span class="p">)</span> <span class="p">{</span>
      <span class="nf">sendChunk</span><span class="p">(</span><span class="nx">chunk</span><span class="p">.</span><span class="nx">text</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="kd">const</span> <span class="nx">result</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">response</span><span class="p">;</span>
    <span class="k">return</span> <span class="nx">result</span><span class="p">.</span><span class="nx">output</span> <span class="o">||</span> <span class="p">{</span> <span class="na">joke</span><span class="p">:</span> <span class="nx">result</span><span class="p">.</span><span class="nx">text</span> <span class="p">};</span>
  <span class="p">}</span>
<span class="p">);</span>
</code></pre></div></div>

<h2 id="wrapping-flows-as-lambda-handlers">Wrapping Flows as Lambda Handlers</h2>

<p>The <code class="language-plaintext highlighter-rouge">onCallGenkit</code> helper is where the magic happens. It transforms any Genkit flow into a production-ready Lambda handler:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Simple flow handler</span>
<span class="k">export</span> <span class="kd">const</span> <span class="nx">jokeHandler</span> <span class="o">=</span> <span class="nf">onCallGenkit</span><span class="p">(</span><span class="nx">jokeFlow</span><span class="p">);</span>

<span class="c1">// With CORS and debug options</span>
<span class="k">export</span> <span class="kd">const</span> <span class="nx">storyGeneratorHandler</span> <span class="o">=</span> <span class="nf">onCallGenkit</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">cors</span><span class="p">:</span> <span class="p">{</span> <span class="na">origin</span><span class="p">:</span> <span class="dl">'</span><span class="s1">*</span><span class="dl">'</span><span class="p">,</span> <span class="na">methods</span><span class="p">:</span> <span class="p">[</span><span class="dl">'</span><span class="s1">POST</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">OPTIONS</span><span class="dl">'</span><span class="p">]</span> <span class="p">},</span>
    <span class="na">debug</span><span class="p">:</span> <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">NODE_ENV</span> <span class="o">!==</span> <span class="dl">'</span><span class="s1">production</span><span class="dl">'</span><span class="p">,</span>
  <span class="p">},</span>
  <span class="nx">storyGeneratorFlow</span>
<span class="p">);</span>

<span class="c1">// Streaming handler (requires Lambda Function URL with RESPONSE_STREAM)</span>
<span class="k">export</span> <span class="kd">const</span> <span class="nx">jokeStreamHandler</span> <span class="o">=</span> <span class="nf">onCallGenkit</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">streaming</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
    <span class="na">cors</span><span class="p">:</span> <span class="p">{</span> <span class="na">origin</span><span class="p">:</span> <span class="dl">'</span><span class="s1">*</span><span class="dl">'</span><span class="p">,</span> <span class="na">methods</span><span class="p">:</span> <span class="p">[</span><span class="dl">'</span><span class="s1">POST</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">OPTIONS</span><span class="dl">'</span><span class="p">]</span> <span class="p">},</span>
    <span class="na">debug</span><span class="p">:</span> <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">NODE_ENV</span> <span class="o">!==</span> <span class="dl">'</span><span class="s1">production</span><span class="dl">'</span><span class="p">,</span>
  <span class="p">},</span>
  <span class="nx">jokeStreamingFlow</span>
<span class="p">);</span>
</code></pre></div></div>
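<p>The post does not show <code class="language-plaintext highlighter-rouge">serverless.yml</code> itself, so here is a minimal sketch of how these exported handlers might be wired up. The handler paths, IAM statements, and the <code class="language-plaintext highlighter-rouge">invokeMode: RESPONSE_STREAM</code> setting are assumptions based on typical Serverless Framework usage, not the repository's actual file:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>service: genkit-aws-lambda-bedrock

provider:
  name: aws
  runtime: nodejs20.x
  iam:
    role:
      statements:
        # The Lambda role needs permission to call Bedrock models
        - Effect: Allow
          Action:
            - bedrock:InvokeModel
            - bedrock:InvokeModelWithResponseStream
          Resource: "*"

functions:
  storyGenerator:
    handler: src/index.storyGeneratorHandler   # assumed path; depends on your bundler setup
    events:
      - httpApi:
          path: /generate
          method: post
  jokeGenerator:
    handler: src/index.jokeHandler
    events:
      - httpApi:
          path: /joke
          method: post
  jokeStream:
    handler: src/index.jokeStreamHandler
    url:
      invokeMode: RESPONSE_STREAM              # Function URL in streaming mode
</code></pre></div></div>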

<h2 id="the-genkit-dev-ui">The Genkit Dev UI</h2>

<p>This is where Genkit truly shines during development. Run:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm run genkit:ui
</code></pre></div></div>
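<p>The script definitions are not shown in the post; a plausible <code class="language-plaintext highlighter-rouge">package.json</code> sketch, including the <code class="language-plaintext highlighter-rouge">dev</code> and <code class="language-plaintext highlighter-rouge">deploy</code> scripts used later, might look like this (the exact commands are assumptions):</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
  "scripts": {
    "genkit:ui": "genkit start -- npx tsx --watch src/index.ts",
    "dev": "serverless offline",
    "deploy": "serverless deploy"
  }
}
</code></pre></div></div>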

<p>This starts the <strong>Genkit Developer UI</strong> at <code class="language-plaintext highlighter-rouge">http://localhost:4000</code>. From here, you can:</p>

<ul>
  <li><strong>Test any flow</strong> visually with different inputs, no cURL needed</li>
  <li><strong>View detailed traces</strong> of every AI generation, including prompts, model responses, and latency</li>
  <li><strong>Debug and optimize prompts</strong> interactively</li>
  <li><strong>Inspect streaming</strong> responses in real time</li>
</ul>

<p>The Dev UI is model-agnostic, so even though we are using AWS Bedrock, the same visual debugging experience applies. This is one of the biggest advantages of Genkit: a unified developer experience regardless of the underlying AI provider.</p>

<h2 id="local-development-with-serverless-offline">Local Development with Serverless Offline</h2>

<p>For testing the Lambda locally with a real HTTP endpoint:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm run dev
</code></pre></div></div>

<p>This starts a local server at <code class="language-plaintext highlighter-rouge">http://localhost:3000</code> that mimics API Gateway. Test it with:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl <span class="nt">-X</span> POST http://localhost:3000/generate <span class="se">\</span>
  <span class="nt">-H</span> <span class="s2">"Content-Type: application/json"</span> <span class="se">\</span>
  <span class="nt">-d</span> <span class="s1">'{
    "data": {
      "topic": "a robot learning to feel emotions",
      "style": "sci-fi",
      "length": "medium"
    }
  }'</span>
</code></pre></div></div>
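<p>Note that the request body uses the callable <code class="language-plaintext highlighter-rouge">data</code> envelope. Assuming the handler mirrors that protocol on the way out, as Firebase-style callable functions do, the response should wrap the flow's typed output in a <code class="language-plaintext highlighter-rouge">result</code> field, roughly:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
  "result": {
    "title": "...",
    "genre": "science fiction",
    "story": "...",
    "wordCount": 630,
    "themes": ["empathy", "self-discovery"]
  }
}
</code></pre></div></div>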

<h2 id="deployment">Deployment</h2>

<p>Deploy to AWS with a single command:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm run deploy
</code></pre></div></div>

<p>After deployment, you will see output like:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>endpoints:
  POST - https://abc123.execute-api.us-east-1.amazonaws.com/generate
  POST - https://abc123.execute-api.us-east-1.amazonaws.com/joke
functions:
  storyGenerator: genkit-aws-lambda-bedrock-dev-storyGenerator
  jokeGenerator: genkit-aws-lambda-bedrock-dev-jokeGenerator
</code></pre></div></div>

<h2 id="using-the-genkit-client-sdk">Using the Genkit Client SDK</h2>

<p>You can also call your deployed flows from a frontend or another service using the <code class="language-plaintext highlighter-rouge">genkit/beta/client</code> SDK:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">runFlow</span><span class="p">,</span> <span class="nx">streamFlow</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkit/beta/client</span><span class="dl">'</span><span class="p">;</span>

<span class="c1">// Non-streaming</span>
<span class="kd">const</span> <span class="nx">result</span> <span class="o">=</span> <span class="k">await</span> <span class="nf">runFlow</span><span class="p">({</span>
  <span class="na">url</span><span class="p">:</span> <span class="dl">'</span><span class="s1">https://your-api-url.amazonaws.com/generate</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">input</span><span class="p">:</span> <span class="p">{</span> <span class="na">topic</span><span class="p">:</span> <span class="dl">'</span><span class="s1">space exploration</span><span class="dl">'</span><span class="p">,</span> <span class="na">style</span><span class="p">:</span> <span class="dl">'</span><span class="s1">sci-fi</span><span class="dl">'</span><span class="p">,</span> <span class="na">length</span><span class="p">:</span> <span class="dl">'</span><span class="s1">short</span><span class="dl">'</span> <span class="p">},</span>
<span class="p">});</span>

<span class="c1">// Streaming (renamed to avoid redeclaring `result` from above)</span>
<span class="kd">const</span> <span class="nx">streamResponse</span> <span class="o">=</span> <span class="nf">streamFlow</span><span class="p">({</span>
  <span class="na">url</span><span class="p">:</span> <span class="dl">'</span><span class="s1">https://your-api-url.amazonaws.com/joke-stream</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">input</span><span class="p">:</span> <span class="p">{</span> <span class="na">subject</span><span class="p">:</span> <span class="dl">'</span><span class="s1">TypeScript</span><span class="dl">'</span> <span class="p">},</span>
<span class="p">});</span>
<span class="k">for</span> <span class="k">await </span><span class="p">(</span><span class="kd">const</span> <span class="nx">chunk</span> <span class="k">of</span> <span class="nx">streamResponse</span><span class="p">.</span><span class="nx">stream</span><span class="p">)</span> <span class="p">{</span>
  <span class="nx">process</span><span class="p">.</span><span class="nx">stdout</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="nx">chunk</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<h2 id="resources">Resources</h2>

<ul>
  <li><a href="https://genkit.dev/docs/">Genkit Documentation</a></li>
  <li><a href="https://github.com/genkit-ai/aws-bedrock-js-plugin">AWS Bedrock Plugin</a></li>
  <li><a href="https://www.serverless.com/framework/docs">Serverless Framework Documentation</a></li>
  <li><a href="https://aws.amazon.com/bedrock/">AWS Bedrock</a></li>
</ul>

<h2 id="conclusion">Conclusion</h2>

<p>Genkit makes it incredibly easy to build, test, and deploy AI-powered applications on AWS. The combination of typed flows, the Dev UI for visual debugging, and the <code class="language-plaintext highlighter-rouge">onCallGenkit</code> helper for zero-boilerplate Lambda handlers means you spend your time on AI logic, not infrastructure plumbing.</p>

<p>You can find the full code of this example in the <a href="https://github.com/xavidop/genkit-aws-lambda-bedrock">GitHub repository</a>.</p>

<p>Happy coding!</p>]]></content><author><name>Xavier Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><category term="genkit" /><category term="gcp" /><category term="aws" /><summary type="html"><![CDATA[Deploy AI-powered flows to AWS Lambda using Genkit and the AWS Bedrock plugin]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://xavidop.me/assets/img/blog/post-headers/genkit-aws-lambda-bedrock.png" /><media:content medium="image" url="https://xavidop.me/assets/img/blog/post-headers/genkit-aws-lambda-bedrock.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry xml:lang="en"><title type="html">Running Genkit on Azure Functions with Azure OpenAI (English)</title><link href="https://xavidop.me/genkit/2026-03-20-genkit-azure-function-ai-foundry/" rel="alternate" type="text/html" title="Running Genkit on Azure Functions with Azure OpenAI (English)" /><published>2026-03-20T00:00:00+00:00</published><updated>2026-03-20T17:41:31+00:00</updated><id>https://xavidop.me/genkit/genkit-azure-function-ai-foundry</id><content type="html" xml:base="https://xavidop.me/genkit/2026-03-20-genkit-azure-function-ai-foundry/"><![CDATA[<ol class="no_toc" id="markdown-toc">
  <li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li>
  <li><a href="#why-genkit-on-azure" id="markdown-toc-why-genkit-on-azure">Why Genkit on Azure?</a></li>
  <li><a href="#prerequisites" id="markdown-toc-prerequisites">Prerequisites</a></li>
  <li><a href="#project-structure" id="markdown-toc-project-structure">Project Structure</a></li>
  <li><a href="#initializing-genkit-with-azure-openai" id="markdown-toc-initializing-genkit-with-azure-openai">Initializing Genkit with Azure OpenAI</a></li>
  <li><a href="#defining-flows" id="markdown-toc-defining-flows">Defining Flows</a>    <ol>
      <li><a href="#story-generator-flow" id="markdown-toc-story-generator-flow">Story Generator Flow</a></li>
      <li><a href="#streaming-joke-flow" id="markdown-toc-streaming-joke-flow">Streaming Joke Flow</a></li>
      <li><a href="#protected-summary-flow-with-api-key-auth" id="markdown-toc-protected-summary-flow-with-api-key-auth">Protected Summary Flow with API Key Auth</a></li>
    </ol>
  </li>
  <li><a href="#registering-flows-as-azure-functions" id="markdown-toc-registering-flows-as-azure-functions">Registering Flows as Azure Functions</a></li>
  <li><a href="#the-genkit-dev-ui" id="markdown-toc-the-genkit-dev-ui">The Genkit Dev UI</a></li>
  <li><a href="#local-development" id="markdown-toc-local-development">Local Development</a>    <ol>
      <li><a href="#run-with-azure-functions-core-tools" id="markdown-toc-run-with-azure-functions-core-tools">Run with Azure Functions Core Tools</a></li>
      <li><a href="#using-the-genkit-client-sdk" id="markdown-toc-using-the-genkit-client-sdk">Using the Genkit Client SDK</a></li>
    </ol>
  </li>
  <li><a href="#deployment" id="markdown-toc-deployment">Deployment</a></li>
  <li><a href="#resources" id="markdown-toc-resources">Resources</a></li>
  <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ol>

<h2 id="introduction">Introduction</h2>

<p>Genkit is not tied to any single cloud. In this post, we will explore how to run Genkit flows on <strong>Azure Functions</strong> powered by <strong>Azure OpenAI</strong>. We will build a story generator, a streaming joke generator, and a protected summary endpoint with API key authentication, all using the <code class="language-plaintext highlighter-rouge">onCallGenkit</code> helper from the Azure OpenAI plugin.</p>

<p>This project shows how Genkit brings the same great developer experience (typed flows, structured output, the Dev UI) to the Azure ecosystem.</p>

<h2 id="why-genkit-on-azure">Why Genkit on Azure?</h2>

<p>Azure has a first-class AI offering through <strong>Azure OpenAI Service</strong> with GPT-4o and other powerful models. The <a href="https://github.com/genkit-ai/azure-foundry-js-plugin">Azure OpenAI plugin for Genkit</a> brings all of these models into the Genkit ecosystem, giving you:</p>

<ul>
  <li>Access to <strong>GPT-4o</strong>, <strong>GPT-3.5 Turbo</strong>, and other Azure OpenAI models</li>
  <li>The <code class="language-plaintext highlighter-rouge">onCallGenkit</code> helper for zero-boilerplate Azure Function handlers</li>
  <li>Built-in API key authentication via <code class="language-plaintext highlighter-rouge">requireApiKey</code></li>
  <li>Full compatibility with the <strong>Genkit Dev UI</strong> for local testing</li>
  <li>Streaming support via SSE (Server-Sent Events)</li>
</ul>

<h2 id="prerequisites">Prerequisites</h2>

<ul>
  <li><strong>Node.js 20</strong> or later</li>
  <li><strong>Azure Account</strong> with an Azure OpenAI resource deployed</li>
  <li><strong>Azure Functions Core Tools v4</strong></li>
  <li><strong>Genkit CLI</strong> installed globally</li>
</ul>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm <span class="nb">install</span> <span class="nt">-g</span> genkit-cli
</code></pre></div></div>

<h2 id="project-structure">Project Structure</h2>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>genkit-azure-function-ai-foundry/
├── src/
│   └── index.ts          # Main Azure Function handler with Genkit flows
├── host.json             # Azure Functions host configuration
├── local.settings.json   # Local config
├── .env                  # Environment variables
├── tsconfig.json         # TypeScript configuration
├── package.json          # Dependencies and scripts
└── README.md
</code></pre></div></div>

<h2 id="initializing-genkit-with-azure-openai">Initializing Genkit with Azure OpenAI</h2>

<p>Setting up Genkit with Azure OpenAI is straightforward:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">genkit</span><span class="p">,</span> <span class="nx">z</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkit</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span>
  <span class="nx">azureOpenAI</span><span class="p">,</span>
  <span class="nx">gpt4o</span><span class="p">,</span>
  <span class="nx">onCallGenkit</span><span class="p">,</span>
  <span class="nx">requireApiKey</span><span class="p">,</span>
<span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkitx-azure-openai</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="o">*</span> <span class="kd">as </span><span class="nx">dotenv</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">dotenv</span><span class="dl">'</span><span class="p">;</span>

<span class="nx">dotenv</span><span class="p">.</span><span class="nf">config</span><span class="p">();</span>

<span class="kd">const</span> <span class="nx">ai</span> <span class="o">=</span> <span class="nf">genkit</span><span class="p">({</span>
  <span class="na">plugins</span><span class="p">:</span> <span class="p">[</span>
    <span class="nf">azureOpenAI</span><span class="p">({</span>
      <span class="c1">// Reads from environment variables:</span>
      <span class="c1">// AZURE_OPENAI_ENDPOINT</span>
      <span class="c1">// AZURE_OPENAI_API_KEY</span>
      <span class="c1">// OPENAI_API_VERSION</span>
    <span class="p">}),</span>
  <span class="p">],</span>
  <span class="na">model</span><span class="p">:</span> <span class="nx">gpt4o</span><span class="p">,</span>
<span class="p">});</span>
</code></pre></div></div>

<p>The plugin reads your Azure credentials from environment variables. No manual HTTP clients, no JSON wrangling. Just configure and go.</p>
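<p>For reference, a minimal <code class="language-plaintext highlighter-rouge">.env</code> sketch with placeholder values (the variable names match what the plugin reads, and the same values are set in Azure in the Deployment section below):</p>

```shell
# .env (placeholders - do not commit real keys)
AZURE_OPENAI_ENDPOINT="https://your-resource-name.openai.azure.com/"
AZURE_OPENAI_API_KEY="your-api-key-here"
OPENAI_API_VERSION="2024-10-21"
```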

<h2 id="defining-flows">Defining Flows</h2>

<h3 id="story-generator-flow">Story Generator Flow</h3>

<p>The story generator uses <strong>structured output</strong>, which means Genkit instructs the LLM to return a typed object matching your Zod schema directly:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">StoryInputSchema</span> <span class="o">=</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
  <span class="na">topic</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">describe</span><span class="p">(</span><span class="dl">'</span><span class="s1">The main topic or theme for the story</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">style</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">optional</span><span class="p">().</span><span class="nf">describe</span><span class="p">(</span><span class="dl">'</span><span class="s1">Writing style (e.g., adventure, mystery, sci-fi)</span><span class="dl">'</span><span class="p">),</span>
  <span class="na">length</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">enum</span><span class="p">([</span><span class="dl">'</span><span class="s1">short</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">medium</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">long</span><span class="dl">'</span><span class="p">]).</span><span class="k">default</span><span class="p">(</span><span class="dl">'</span><span class="s1">medium</span><span class="dl">'</span><span class="p">),</span>
<span class="p">});</span>

<span class="kd">const</span> <span class="nx">StorySchema</span> <span class="o">=</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
  <span class="na">title</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="na">genre</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="na">story</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="na">wordCount</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">number</span><span class="p">(),</span>
  <span class="na">themes</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">()),</span>
<span class="p">});</span>

<span class="kd">const</span> <span class="nx">storyGeneratorFlow</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">defineFlow</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">storyGeneratorFlow</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">inputSchema</span><span class="p">:</span> <span class="nx">StoryInputSchema</span><span class="p">,</span>
    <span class="na">outputSchema</span><span class="p">:</span> <span class="nx">StorySchema</span><span class="p">,</span>
  <span class="p">},</span>
  <span class="k">async </span><span class="p">(</span><span class="nx">input</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="nx">lengthMap</span> <span class="o">=</span> <span class="p">{</span> <span class="na">short</span><span class="p">:</span> <span class="dl">'</span><span class="s1">200-300</span><span class="dl">'</span><span class="p">,</span> <span class="na">medium</span><span class="p">:</span> <span class="dl">'</span><span class="s1">500-700</span><span class="dl">'</span><span class="p">,</span> <span class="na">long</span><span class="p">:</span> <span class="dl">'</span><span class="s1">1000-1500</span><span class="dl">'</span> <span class="p">};</span>
    <span class="kd">const</span> <span class="nx">wordCount</span> <span class="o">=</span> <span class="nx">lengthMap</span><span class="p">[</span><span class="nx">input</span><span class="p">.</span><span class="nx">length</span><span class="p">];</span>

    <span class="kd">const</span> <span class="nx">prompt</span> <span class="o">=</span> <span class="s2">`Create a creative </span><span class="p">${</span><span class="nx">input</span><span class="p">.</span><span class="nx">style</span> <span class="o">||</span> <span class="dl">'</span><span class="s1">fictional</span><span class="dl">'</span><span class="p">}</span><span class="s2"> story with the following requirements:
      Topic: </span><span class="p">${</span><span class="nx">input</span><span class="p">.</span><span class="nx">topic</span><span class="p">}</span><span class="s2">
      Length: </span><span class="p">${</span><span class="nx">wordCount</span><span class="p">}</span><span class="s2"> words
      
      Please provide a captivating story with a clear beginning, middle, and end.
      Include rich descriptions and engaging characters.`</span><span class="p">;</span>

    <span class="kd">const</span> <span class="p">{</span> <span class="nx">output</span> <span class="p">}</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
      <span class="nx">prompt</span><span class="p">,</span>
      <span class="na">output</span><span class="p">:</span> <span class="p">{</span> <span class="na">schema</span><span class="p">:</span> <span class="nx">StorySchema</span> <span class="p">},</span>
    <span class="p">});</span>

    <span class="k">if </span><span class="p">(</span><span class="o">!</span><span class="nx">output</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">throw</span> <span class="k">new</span> <span class="nc">Error</span><span class="p">(</span><span class="dl">'</span><span class="s1">Failed to generate story</span><span class="dl">'</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="k">return</span> <span class="nx">output</span><span class="p">;</span>
  <span class="p">}</span>
<span class="p">);</span>
</code></pre></div></div>

<p>The response is a fully typed <code class="language-plaintext highlighter-rouge">StorySchema</code> object. No <code class="language-plaintext highlighter-rouge">JSON.parse</code>, no manual deserialization.</p>
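<p>To illustrate what a fully typed result buys you, here is a small standalone sketch. The <code class="language-plaintext highlighter-rouge">Story</code> type mirrors <code class="language-plaintext highlighter-rouge">StorySchema</code>, and <code class="language-plaintext highlighter-rouge">summarize</code> is a hypothetical helper, not part of the project:</p>

```typescript
// Hypothetical consumer of the flow's typed result.
// The Story type mirrors StorySchema; no JSON.parse is needed because
// Genkit hands back an object already validated against the schema.
type Story = {
  title: string;
  genre: string;
  story: string;
  wordCount: number;
  themes: string[];
};

// A helper that can rely on every field being present and correctly typed.
function summarize(s: Story): string {
  return `${s.title} (${s.genre}, ${s.wordCount} words, themes: ${s.themes.join(', ')})`;
}

// Sample data standing in for a flow response.
const sample: Story = {
  title: 'Iron Heart',
  genre: 'sci-fi',
  story: 'A robot learns to feel...',
  wordCount: 640,
  themes: ['empathy', 'identity'],
};

console.log(summarize(sample));
```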

<h3 id="streaming-joke-flow">Streaming Joke Flow</h3>

<p>The streaming flow uses <code class="language-plaintext highlighter-rouge">ai.generateStream</code> and emits chunks via <code class="language-plaintext highlighter-rouge">sendChunk</code>:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">jokeStreamingFlow</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">defineFlow</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">jokeStreamingFlow</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">inputSchema</span><span class="p">:</span> <span class="nx">JokeInputSchema</span><span class="p">,</span>
    <span class="na">outputSchema</span><span class="p">:</span> <span class="nx">JokeOutputSchema</span><span class="p">,</span>
    <span class="na">streamSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="p">},</span>
  <span class="k">async </span><span class="p">(</span><span class="nx">input</span><span class="p">,</span> <span class="p">{</span> <span class="nx">sendChunk</span> <span class="p">})</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="p">{</span> <span class="nx">stream</span><span class="p">,</span> <span class="nx">response</span> <span class="p">}</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generateStream</span><span class="p">({</span>
      <span class="na">prompt</span><span class="p">:</span> <span class="s2">`Tell me a long and funny joke about </span><span class="p">${</span><span class="nx">input</span><span class="p">.</span><span class="nx">subject</span><span class="p">}</span><span class="s2">`</span><span class="p">,</span>
    <span class="p">});</span>

    <span class="k">for</span> <span class="k">await </span><span class="p">(</span><span class="kd">const</span> <span class="nx">chunk</span> <span class="k">of</span> <span class="nx">stream</span><span class="p">)</span> <span class="p">{</span>
      <span class="nf">sendChunk</span><span class="p">(</span><span class="nx">chunk</span><span class="p">.</span><span class="nx">text</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="kd">const</span> <span class="nx">result</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">response</span><span class="p">;</span>
    <span class="k">return</span> <span class="p">{</span> <span class="na">joke</span><span class="p">:</span> <span class="nx">result</span><span class="p">.</span><span class="nx">text</span> <span class="p">};</span>
  <span class="p">}</span>
<span class="p">);</span>
</code></pre></div></div>

<h3 id="protected-summary-flow-with-api-key-auth">Protected Summary Flow with API Key Auth</h3>

<p>One of the unique features of the Azure OpenAI plugin is the built-in <code class="language-plaintext highlighter-rouge">requireApiKey</code> context provider. This lets you protect flows with API key authentication without writing any middleware:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">export</span> <span class="kd">const</span> <span class="nx">protectedHandler</span> <span class="o">=</span> <span class="nf">onCallGenkit</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">contextProvider</span><span class="p">:</span> <span class="nf">requireApiKey</span><span class="p">(</span>
      <span class="dl">'</span><span class="s1">X-API-Key</span><span class="dl">'</span><span class="p">,</span>
      <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">API_KEY</span> <span class="o">||</span> <span class="dl">'</span><span class="s1">demo-api-key</span><span class="dl">'</span>
    <span class="p">),</span>
    <span class="na">cors</span><span class="p">:</span> <span class="p">{</span>
      <span class="na">origin</span><span class="p">:</span> <span class="p">[</span><span class="dl">'</span><span class="s1">https://myapp.com</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">http://localhost:3000</span><span class="dl">'</span><span class="p">],</span>
      <span class="na">credentials</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
    <span class="p">},</span>
    <span class="na">onError</span><span class="p">:</span> <span class="k">async </span><span class="p">(</span><span class="nx">error</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">({</span>
      <span class="na">statusCode</span><span class="p">:</span> <span class="nx">error</span><span class="p">.</span><span class="nx">message</span><span class="p">.</span><span class="nf">includes</span><span class="p">(</span><span class="dl">'</span><span class="s1">Unauthorized</span><span class="dl">'</span><span class="p">)</span> <span class="p">?</span> <span class="mi">401</span> <span class="p">:</span> <span class="mi">500</span><span class="p">,</span>
      <span class="na">message</span><span class="p">:</span> <span class="nx">error</span><span class="p">.</span><span class="nx">message</span><span class="p">,</span>
    <span class="p">}),</span>
  <span class="p">},</span>
  <span class="nx">protectedSummaryFlow</span>
<span class="p">);</span>
</code></pre></div></div>

<h2 id="registering-flows-as-azure-functions">Registering Flows as Azure Functions</h2>

<p>The <code class="language-plaintext highlighter-rouge">onCallGenkit</code> helper wraps Genkit flows as Azure Function HTTP triggers:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// With CORS and debug</span>
<span class="k">export</span> <span class="kd">const</span> <span class="nx">storyGeneratorHandler</span> <span class="o">=</span> <span class="nf">onCallGenkit</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">cors</span><span class="p">:</span> <span class="p">{</span> <span class="na">origin</span><span class="p">:</span> <span class="dl">'</span><span class="s1">*</span><span class="dl">'</span> <span class="p">},</span>
    <span class="na">debug</span><span class="p">:</span> <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">NODE_ENV</span> <span class="o">!==</span> <span class="dl">'</span><span class="s1">production</span><span class="dl">'</span><span class="p">,</span>
  <span class="p">},</span>
  <span class="nx">storyGeneratorFlow</span>
<span class="p">);</span>

<span class="c1">// Simplest form</span>
<span class="k">export</span> <span class="kd">const</span> <span class="nx">jokeHandler</span> <span class="o">=</span> <span class="nf">onCallGenkit</span><span class="p">(</span><span class="nx">jokeFlow</span><span class="p">);</span>

<span class="c1">// Streaming with SSE</span>
<span class="k">export</span> <span class="kd">const</span> <span class="nx">jokeStreamHandler</span> <span class="o">=</span> <span class="nf">onCallGenkit</span><span class="p">(</span>
  <span class="p">{</span> <span class="na">streaming</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span> <span class="na">cors</span><span class="p">:</span> <span class="p">{</span> <span class="na">origin</span><span class="p">:</span> <span class="dl">'</span><span class="s1">*</span><span class="dl">'</span> <span class="p">}</span> <span class="p">},</span>
  <span class="nx">jokeStreamingFlow</span>
<span class="p">);</span>
</code></pre></div></div>

<p>This gives you four Azure Function endpoints:</p>

<table>
  <thead>
    <tr>
      <th>Flow</th>
      <th>Endpoint</th>
      <th>Auth</th>
      <th>Features</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>storyGeneratorFlow</td>
      <td>POST /api/storyGeneratorFlow</td>
      <td>anonymous</td>
      <td>CORS, debug</td>
    </tr>
    <tr>
      <td>jokeFlow</td>
      <td>POST /api/jokeFlow</td>
      <td>anonymous</td>
      <td>Simplest form</td>
    </tr>
    <tr>
      <td>jokeStreamingFlow</td>
      <td>POST /api/jokeStreamingFlow</td>
      <td>anonymous</td>
      <td>SSE streaming</td>
    </tr>
    <tr>
      <td>protectedSummaryFlow</td>
      <td>POST /api/protectedSummaryFlow</td>
      <td>API key</td>
      <td>Auth, CORS, custom error handler</td>
    </tr>
  </tbody>
</table>

<h2 id="the-genkit-dev-ui">The Genkit Dev UI</h2>

<p>Run the following command to start the Genkit Developer UI:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm run genkit:ui
</code></pre></div></div>

<p>This opens the Dev UI at <code class="language-plaintext highlighter-rouge">http://localhost:4000</code> where you can:</p>

<ul>
  <li><strong>Test all four flows</strong> with different inputs visually</li>
  <li><strong>View detailed traces</strong> of every AI call, including the prompt sent, model response, latency, and token usage</li>
  <li><strong>Debug streaming flows</strong> and watch chunks arrive in real time</li>
  <li><strong>Inspect structured outputs</strong> and verify the schema is being followed</li>
</ul>

<p>The Dev UI works identically whether you are using Azure OpenAI, Google Gemini, or AWS Bedrock. This unified experience is one of Genkit’s greatest strengths: you develop and debug the same way regardless of the backend.</p>

<h2 id="local-development">Local Development</h2>

<h3 id="run-with-azure-functions-core-tools">Run with Azure Functions Core Tools</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm run func:start
</code></pre></div></div>

<p>This starts a local server at <code class="language-plaintext highlighter-rouge">http://localhost:7071</code> that mimics the Azure Functions runtime:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl <span class="nt">-X</span> POST http://localhost:7071/api/storyGeneratorFlow <span class="se">\</span>
  <span class="nt">-H</span> <span class="s2">"Content-Type: application/json"</span> <span class="se">\</span>
  <span class="nt">-d</span> <span class="s1">'{
    "data": {
      "topic": "a robot learning to feel emotions",
      "style": "sci-fi",
      "length": "medium"
    }
  }'</span>
</code></pre></div></div>
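<p>The endpoint replies with a JSON envelope that carries the flow's output under a <code class="language-plaintext highlighter-rouge">result</code> field. A small sketch of unwrapping it (the field name is assumed from the Genkit flow HTTP protocol; the sample body below is illustrative, not real output):</p>

```typescript
// Sketch: unwrapping the JSON envelope returned by a Genkit flow endpoint.
// The "result" field name is assumed from the Genkit flow HTTP protocol.
interface FlowResponse<T> {
  result: T;
}

// Parses a raw response body and returns the typed flow output.
function unwrap<T>(body: string): T {
  const parsed = JSON.parse(body) as FlowResponse<T>;
  return parsed.result;
}

// Illustrative response body (not actual model output).
const body = '{"result":{"title":"Iron Heart","wordCount":640}}';
const story = unwrap<{ title: string; wordCount: number }>(body);
console.log(story.title); // Iron Heart
```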

<h3 id="using-the-genkit-client-sdk">Using the Genkit Client SDK</h3>

<p>You can also call your flows using the <code class="language-plaintext highlighter-rouge">genkit/beta/client</code> SDK:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">runFlow</span><span class="p">,</span> <span class="nx">streamFlow</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkit/beta/client</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">result</span> <span class="o">=</span> <span class="k">await</span> <span class="nf">runFlow</span><span class="p">({</span>
  <span class="na">url</span><span class="p">:</span> <span class="dl">'</span><span class="s1">http://localhost:7071/api/jokeFlow</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">input</span><span class="p">:</span> <span class="p">{</span> <span class="na">subject</span><span class="p">:</span> <span class="dl">'</span><span class="s1">programming</span><span class="dl">'</span> <span class="p">},</span>
<span class="p">});</span>

<span class="c1">// Streaming</span>
<span class="kd">const</span> <span class="nx">streamResult</span> <span class="o">=</span> <span class="nf">streamFlow</span><span class="p">({</span>
  <span class="na">url</span><span class="p">:</span> <span class="dl">'</span><span class="s1">http://localhost:7071/api/jokeStreamingFlow</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">input</span><span class="p">:</span> <span class="p">{</span> <span class="na">subject</span><span class="p">:</span> <span class="dl">'</span><span class="s1">TypeScript</span><span class="dl">'</span> <span class="p">},</span>
<span class="p">});</span>
<span class="k">for</span> <span class="k">await </span><span class="p">(</span><span class="kd">const</span> <span class="nx">chunk</span> <span class="k">of</span> <span class="nx">streamResult</span><span class="p">.</span><span class="nx">stream</span><span class="p">)</span> <span class="p">{</span>
  <span class="nx">process</span><span class="p">.</span><span class="nx">stdout</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="nx">chunk</span><span class="p">);</span>
<span class="p">}</span>

<span class="c1">// With API key auth</span>
<span class="kd">const</span> <span class="nx">protectedResult</span> <span class="o">=</span> <span class="k">await</span> <span class="nf">runFlow</span><span class="p">({</span>
  <span class="na">url</span><span class="p">:</span> <span class="dl">'</span><span class="s1">http://localhost:7071/api/protectedSummaryFlow</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">input</span><span class="p">:</span> <span class="p">{</span> <span class="na">text</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Your long text...</span><span class="dl">'</span><span class="p">,</span> <span class="na">maxLength</span><span class="p">:</span> <span class="mi">50</span> <span class="p">},</span>
  <span class="na">headers</span><span class="p">:</span> <span class="p">{</span> <span class="dl">'</span><span class="s1">X-API-Key</span><span class="dl">'</span><span class="p">:</span> <span class="dl">'</span><span class="s1">demo-api-key</span><span class="dl">'</span> <span class="p">},</span>
<span class="p">});</span>
</code></pre></div></div>

<h2 id="deployment">Deployment</h2>

<p>Build and deploy to Azure:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm run build
npm run deploy
</code></pre></div></div>

<p>Make sure to set your environment variables in Azure:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>az functionapp config appsettings <span class="nb">set</span> <span class="se">\</span>
  <span class="nt">--name</span> myFunctionAppName <span class="se">\</span>
  <span class="nt">--resource-group</span> myResourceGroup <span class="se">\</span>
  <span class="nt">--settings</span> <span class="se">\</span>
    <span class="nv">AZURE_OPENAI_ENDPOINT</span><span class="o">=</span><span class="s2">"https://your-resource-name.openai.azure.com/"</span> <span class="se">\</span>
    <span class="nv">AZURE_OPENAI_API_KEY</span><span class="o">=</span><span class="s2">"your-api-key-here"</span> <span class="se">\</span>
    <span class="nv">OPENAI_API_VERSION</span><span class="o">=</span><span class="s2">"2024-10-21"</span>
</code></pre></div></div>

<h2 id="resources">Resources</h2>

<ul>
  <li><a href="https://genkit.dev/docs/">Genkit Documentation</a></li>
  <li><a href="https://github.com/genkit-ai/azure-foundry-js-plugin">Azure OpenAI Plugin</a></li>
  <li><a href="https://docs.microsoft.com/azure/azure-functions/">Azure Functions Documentation</a></li>
  <li><a href="https://azure.microsoft.com/services/cognitive-services/openai-service/">Azure OpenAI Service</a></li>
</ul>

<h2 id="conclusion">Conclusion</h2>

<p>Genkit brings a consistent, delightful developer experience to Azure Functions. The combination of typed flows with Zod schemas, structured LLM output, the Dev UI for visual debugging, and the <code class="language-plaintext highlighter-rouge">onCallGenkit</code> helper makes building AI-powered Azure Functions as straightforward as defining a function.</p>

<p>You can find the full code of this example in the <a href="https://github.com/xavidop/genkit-azure-function-ai-foundry">GitHub repository</a>.</p>

<p>Happy coding!</p>]]></content><author><name>Xavier Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><category term="genkit" /><category term="genkit" /><category term="gcp" /><category term="azure" /><summary type="html"><![CDATA[Deploy AI-powered flows to Azure Functions using Genkit and the Azure OpenAI plugin]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://xavidop.me/assets/img/blog/post-headers/genkit-azure-function-ai-foundry.png" /><media:content medium="image" url="https://xavidop.me/assets/img/blog/post-headers/genkit-azure-function-ai-foundry.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry xml:lang="en"><title type="html">Building a Perplexity-like CLI with Genkit, Gemini, and Tavily (English)</title><link href="https://xavidop.me/genkit/gcp/2026-03-20-genkit-perplexity-cli/" rel="alternate" type="text/html" title="Building a Perplexity-like CLI with Genkit, Gemini, and Tavily (English)" /><published>2026-03-20T00:00:00+00:00</published><updated>2026-03-20T17:41:31+00:00</updated><id>https://xavidop.me/genkit/gcp/genkit-perplexity-cli</id><content type="html" xml:base="https://xavidop.me/genkit/gcp/2026-03-20-genkit-perplexity-cli/"><![CDATA[<ol class="no_toc" id="markdown-toc">
  <li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li>
  <li><a href="#features" id="markdown-toc-features">Features</a></li>
  <li><a href="#prerequisites" id="markdown-toc-prerequisites">Prerequisites</a></li>
  <li><a href="#project-structure" id="markdown-toc-project-structure">Project Structure</a></li>
  <li><a href="#how-it-works" id="markdown-toc-how-it-works">How It Works</a></li>
  <li><a href="#initializing-genkit" id="markdown-toc-initializing-genkit">Initializing Genkit</a></li>
  <li><a href="#defining-a-tool-web-search" id="markdown-toc-defining-a-tool-web-search">Defining a Tool: Web Search</a></li>
  <li><a href="#creating-the-chat-agent" id="markdown-toc-creating-the-chat-agent">Creating the Chat Agent</a></li>
  <li><a href="#the-interactive-chat-loop" id="markdown-toc-the-interactive-chat-loop">The Interactive Chat Loop</a></li>
  <li><a href="#genkit-dev-ui-for-agent-debugging" id="markdown-toc-genkit-dev-ui-for-agent-debugging">Genkit Dev UI for Agent Debugging</a></li>
  <li><a href="#example-session" id="markdown-toc-example-session">Example Session</a></li>
  <li><a href="#setup--development" id="markdown-toc-setup--development">Setup &amp; Development</a></li>
  <li><a href="#resources" id="markdown-toc-resources">Resources</a></li>
  <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ol>

<h2 id="introduction">Introduction</h2>

<p>What if you could build your own <strong>Perplexity AI</strong>, right in your terminal? In this post, we will walk through an interactive command-line chat tool that searches the web and generates comprehensive AI-powered answers with cited sources, all built with <strong>Genkit</strong>, <strong>Gemini 3 Pro</strong>, and the <strong>Tavily Search API</strong>.</p>

<p>This project showcases some of Genkit’s most powerful features: <strong>AI agents</strong>, <strong>tool calling</strong>, <strong>chat sessions with persistent history</strong>, and <strong>prompt definitions</strong>, all wired together in a clean TypeScript codebase.</p>

<h2 id="features">Features</h2>

<ul>
  <li>Web search powered by Tavily API</li>
  <li>AI-generated comprehensive answers using Genkit with Gemini 3 Pro</li>
  <li>Interactive chat mode with persistent conversation history</li>
  <li>AI agents with tool-calling capabilities</li>
  <li>Cited sources with URLs</li>
  <li>Beautiful terminal output with colors and spinners</li>
</ul>

<h2 id="prerequisites">Prerequisites</h2>

<ul>
  <li><strong>Node.js 18</strong> or higher</li>
  <li><strong>TypeScript</strong> (installed as dev dependency)</li>
  <li><strong>Tavily API key</strong> (get it from <a href="https://tavily.com/">tavily.com</a>)</li>
  <li><strong>Google AI API key</strong> (get it from <a href="https://aistudio.google.com/app/apikey">Google AI Studio</a>)</li>
</ul>
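<p>Both keys are read from the environment at startup. As a small illustrative sketch (the <code class="language-plaintext highlighter-rouge">requireEnv</code> helper is an assumption for this post, not part of the project), failing fast on a missing key can look like this:</p>

```typescript
// Illustrative helper (not from the project): fail fast when a required
// API key is missing from the environment.
function requireEnv(
  name: string,
  env: Record<string, string | undefined> = process.env
): string {
  const value = env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage at startup (names match the .env file shown later):
// const TavilyApiKey = requireEnv('TAVILY_API_KEY');
// const GeminiApiKey = requireEnv('GOOGLE_API_KEY');
```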

<h2 id="project-structure">Project Structure</h2>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>perplexity-cli/
├── index.ts             # Main entry point with interactive chat interface
├── src/
│   ├── search.ts        # Tavily search integration
│   └── agent.ts         # Genkit AI agent with tool definitions
├── tsconfig.json        # TypeScript configuration
├── package.json
├── .env                 # Environment variables
└── README.md
</code></pre></div></div>

<h2 id="how-it-works">How It Works</h2>

<p>The architecture follows a clean flow:</p>

<ol>
  <li><strong>Interactive Session</strong>: The CLI starts an interactive chat session with persistent conversation history</li>
  <li><strong>Query Processing</strong>: When you ask a question, the AI agent decides if it needs current web information</li>
  <li><strong>Tool Calling</strong>: If needed, the AI agent automatically calls the web search tool via Tavily</li>
  <li><strong>Search</strong>: Tavily searches the web for relevant, up-to-date information</li>
  <li><strong>Generate</strong>: Genkit with Gemini 3 Pro analyzes the search results and conversation context to generate a comprehensive answer</li>
  <li><strong>Display</strong>: The answer is displayed in the terminal with proper formatting and source citations</li>
</ol>

<h2 id="initializing-genkit">Initializing Genkit</h2>

<p>The main entry point sets up Genkit with the Google AI plugin:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">genkit</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkit/beta</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">googleAI</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@genkit-ai/googleai</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">tavily</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@tavily/core</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">client</span> <span class="o">=</span> <span class="nf">tavily</span><span class="p">({</span> <span class="na">apiKey</span><span class="p">:</span> <span class="nx">TavilyApiKey</span> <span class="p">});</span>
<span class="kd">const</span> <span class="nx">ai</span> <span class="o">=</span> <span class="nf">genkit</span><span class="p">({</span>
  <span class="na">plugins</span><span class="p">:</span> <span class="p">[</span><span class="nf">googleAI</span><span class="p">({</span> <span class="na">apiKey</span><span class="p">:</span> <span class="nx">GeminiApiKey</span> <span class="p">})],</span>
<span class="p">});</span>
</code></pre></div></div>

<p>Notice we are using <code class="language-plaintext highlighter-rouge">genkit/beta</code> here. This gives us access to the chat API, which is key for maintaining conversation history across multiple interactions.</p>

<h2 id="defining-a-tool-web-search">Defining a Tool: Web Search</h2>

<p>One of Genkit’s most powerful features is <strong>tool calling</strong>. You define tools with typed schemas, and the LLM decides when to invoke them. Here is the web search tool:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">z</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">zod</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">searchWeb</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">./search.js</span><span class="dl">'</span><span class="p">;</span>

<span class="k">export</span> <span class="kd">function</span> <span class="nf">createSearchTool</span><span class="p">(</span><span class="nx">ai</span><span class="p">:</span> <span class="nx">GenkitBeta</span><span class="p">,</span> <span class="nx">client</span><span class="p">:</span> <span class="nx">TavilyClient</span><span class="p">)</span> <span class="p">{</span>
  <span class="k">return</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">defineTool</span><span class="p">(</span>
    <span class="p">{</span>
      <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">searchWeb</span><span class="dl">'</span><span class="p">,</span>
      <span class="na">description</span><span class="p">:</span>
        <span class="dl">'</span><span class="s1">Search the web for current information to answer user queries. </span><span class="dl">'</span> <span class="o">+</span>
        <span class="dl">'</span><span class="s1">Use this when you need up-to-date or factual information.</span><span class="dl">'</span><span class="p">,</span>
      <span class="na">inputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
        <span class="na">query</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">describe</span><span class="p">(</span><span class="dl">'</span><span class="s1">The search query to look up</span><span class="dl">'</span><span class="p">),</span>
      <span class="p">}),</span>
      <span class="na">outputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">describe</span><span class="p">(</span><span class="dl">'</span><span class="s1">Search results with titles, URLs, and content</span><span class="dl">'</span><span class="p">),</span>
    <span class="p">},</span>
    <span class="k">async </span><span class="p">(</span><span class="nx">input</span><span class="p">:</span> <span class="p">{</span> <span class="nl">query</span><span class="p">:</span> <span class="kr">string</span> <span class="p">})</span> <span class="o">=&gt;</span> <span class="p">{</span>
      <span class="kd">const</span> <span class="nx">searchResults</span> <span class="o">=</span> <span class="k">await</span> <span class="nf">searchWeb</span><span class="p">(</span><span class="nx">client</span><span class="p">,</span> <span class="nx">input</span><span class="p">.</span><span class="nx">query</span><span class="p">,</span> <span class="mi">5</span><span class="p">);</span>

      <span class="kd">const</span> <span class="nx">formattedResults</span> <span class="o">=</span> <span class="nx">searchResults</span><span class="p">.</span><span class="nx">results</span>
        <span class="p">.</span><span class="nf">map</span><span class="p">((</span><span class="nx">result</span><span class="p">,</span> <span class="nx">index</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
          <span class="k">return</span> <span class="s2">`[</span><span class="p">${</span><span class="nx">index</span> <span class="o">+</span> <span class="mi">1</span><span class="p">}</span><span class="s2">] </span><span class="p">${</span><span class="nx">result</span><span class="p">.</span><span class="nx">title</span><span class="p">}</span><span class="s2">\nURL: </span><span class="p">${</span><span class="nx">result</span><span class="p">.</span><span class="nx">url</span><span class="p">}</span><span class="s2">\nContent: </span><span class="p">${</span><span class="nx">result</span><span class="p">.</span><span class="nx">content</span><span class="p">}</span><span class="s2">\n`</span><span class="p">;</span>
        <span class="p">})</span>
        <span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="dl">'</span><span class="se">\n</span><span class="dl">'</span><span class="p">);</span>

      <span class="k">return</span> <span class="nx">formattedResults</span><span class="p">;</span>
    <span class="p">}</span>
  <span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The tool is defined with:</p>
<ul>
  <li>A <strong>name</strong> and <strong>description</strong> that help the LLM understand when to use it</li>
  <li>A typed <strong>input schema</strong> (what the LLM needs to provide)</li>
  <li>A typed <strong>output schema</strong> (what the tool returns)</li>
  <li>An <strong>implementation</strong> that calls the external Tavily API</li>
</ul>

<p>Genkit handles the entire tool-calling loop: the LLM decides it needs web results, Genkit invokes the tool, feeds the results back to the LLM, and the LLM generates the final answer.</p>
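<p>To make that orchestration concrete, here is a deliberately simplified, self-contained simulation of the loop. Everything in it (<code class="language-plaintext highlighter-rouge">ModelTurn</code>, <code class="language-plaintext highlighter-rouge">runToolLoop</code>, the turn shapes) is an illustrative assumption for this post, not Genkit's actual types or implementation:</p>

```typescript
// Simplified, illustrative simulation of a tool-calling loop.
// ModelTurn and runToolLoop are assumptions, not Genkit APIs.
type ModelTurn =
  | { type: 'toolRequest'; name: string; input: string }
  | { type: 'text'; text: string };

async function runToolLoop(
  model: (history: string[]) => Promise<ModelTurn>,
  tools: Record<string, (input: string) => Promise<string>>
): Promise<string> {
  const history: string[] = [];
  for (;;) {
    const turn = await model(history);
    // The model returned a final text answer: we are done.
    if (turn.type === 'text') return turn.text;
    // The model asked for a tool: run it and feed the result back.
    const tool = tools[turn.name];
    if (!tool) throw new Error(`Unknown tool: ${turn.name}`);
    history.push(await tool(turn.input));
  }
}
```

In the real project this loop lives entirely inside Genkit; you only define the tool and the prompt.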

<h2 id="creating-the-chat-agent">Creating the Chat Agent</h2>

<p>The chat agent combines the search tool with a prompt definition and Genkit’s chat API:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">export</span> <span class="kd">function</span> <span class="nf">createChatAgent</span><span class="p">(</span>
  <span class="nx">ai</span><span class="p">:</span> <span class="nx">GenkitBeta</span><span class="p">,</span>
  <span class="nx">client</span><span class="p">:</span> <span class="nx">TavilyClient</span><span class="p">,</span>
  <span class="nx">model</span><span class="p">:</span> <span class="nx">ModelReference</span><span class="o">&lt;</span><span class="k">typeof</span> <span class="nx">GeminiConfigSchema</span><span class="o">&gt;</span>
<span class="p">)</span> <span class="p">{</span>
  <span class="kd">const</span> <span class="nx">searchTool</span> <span class="o">=</span> <span class="nf">createSearchTool</span><span class="p">(</span><span class="nx">ai</span><span class="p">,</span> <span class="nx">client</span><span class="p">);</span>

  <span class="kd">const</span> <span class="nx">searchPrompt</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">definePrompt</span><span class="p">({</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">searchPrompt</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">description</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Prompt that answers user queries using web search results.</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">input</span><span class="p">:</span> <span class="p">{</span>
      <span class="na">schema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
        <span class="na">query</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">describe</span><span class="p">(</span><span class="dl">'</span><span class="s1">The user query to be answered using web search results</span><span class="dl">'</span><span class="p">),</span>
      <span class="p">}),</span>
    <span class="p">},</span>
    <span class="na">tools</span><span class="p">:</span> <span class="p">[</span><span class="nx">searchTool</span><span class="p">],</span>
    <span class="na">prompt</span><span class="p">:</span> <span class="s2">`You are a helpful AI assistant that provides comprehensive and accurate answers based on web search results.

User Query: {{query}}

Instructions:
1. Provide a comprehensive answer based on the search results
2. Synthesize information from multiple sources when relevant
3. Be factual and cite specific sources using [1], [2], etc. notation
4. If the search results don't contain enough information, acknowledge this
5. Keep the answer clear and well-structured
6. Use markdown formatting for better readability
7. Always use the searchWeb tool when you need to look up current information
8. Add a section at the end titled "Sources" listing the URLs of the references used

Answer:`</span><span class="p">,</span>
  <span class="p">});</span>

  <span class="k">return</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">chat</span><span class="p">(</span><span class="nx">searchPrompt</span><span class="p">,</span> <span class="p">{</span> <span class="na">model</span><span class="p">:</span> <span class="nx">model</span> <span class="p">});</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Key things to notice:</p>
<ul>
  <li><strong><code class="language-plaintext highlighter-rouge">ai.definePrompt</code></strong> creates a reusable prompt with typed input, tools, and instructions</li>
  <li><strong><code class="language-plaintext highlighter-rouge">ai.chat</code></strong> creates a chat session with <strong>persistent memory</strong>, so follow-up questions have full context from previous interactions</li>
  <li>The prompt instructs the LLM to use the search tool and cite sources</li>
</ul>

<h2 id="the-interactive-chat-loop">The Interactive Chat Loop</h2>

<p>The main loop is a simple readline interface that sends each query to the chat agent:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">chat</span> <span class="o">=</span> <span class="nf">createChatAgent</span><span class="p">(</span><span class="nx">ai</span><span class="p">,</span> <span class="nx">client</span><span class="p">,</span> <span class="nx">googleAI</span><span class="p">.</span><span class="nf">model</span><span class="p">(</span><span class="dl">'</span><span class="s1">gemini-3-pro-preview</span><span class="dl">'</span><span class="p">));</span>

<span class="nx">rl</span><span class="p">.</span><span class="nf">on</span><span class="p">(</span><span class="dl">'</span><span class="s1">line</span><span class="dl">'</span><span class="p">,</span> <span class="k">async </span><span class="p">(</span><span class="nx">line</span><span class="p">:</span> <span class="kr">string</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
  <span class="kd">const</span> <span class="nx">query</span> <span class="o">=</span> <span class="nx">line</span><span class="p">.</span><span class="nf">trim</span><span class="p">();</span>

  <span class="kd">let</span> <span class="nx">spinner</span> <span class="o">=</span> <span class="nf">ora</span><span class="p">(</span><span class="dl">'</span><span class="s1">Thinking...</span><span class="dl">'</span><span class="p">).</span><span class="nf">start</span><span class="p">();</span>

  <span class="kd">const</span> <span class="p">{</span> <span class="nx">text</span> <span class="p">}</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">chat</span><span class="p">.</span><span class="nf">send</span><span class="p">(</span><span class="nx">query</span><span class="p">);</span>

  <span class="nx">spinner</span><span class="p">.</span><span class="nf">succeed</span><span class="p">(</span><span class="dl">'</span><span class="s1">Response generated</span><span class="dl">'</span><span class="p">);</span>
  <span class="nx">console</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="nx">chalk</span><span class="p">.</span><span class="nf">white</span><span class="p">(</span><span class="dl">'</span><span class="se">\n</span><span class="dl">'</span> <span class="o">+</span> <span class="nx">text</span> <span class="o">+</span> <span class="dl">'</span><span class="se">\n</span><span class="dl">'</span><span class="p">));</span>
<span class="p">});</span>
</code></pre></div></div>

<p>Each call to <code class="language-plaintext highlighter-rouge">chat.send()</code> automatically includes the full conversation history. The agent decides whether to search the web or answer from context, and Genkit handles the tool-calling orchestration behind the scenes.</p>
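<p>The session banner also advertises <code class="language-plaintext highlighter-rouge">exit</code> and <code class="language-plaintext highlighter-rouge">quit</code> commands. A small helper like this (illustrative, not the project's exact code) can short-circuit the loop before the line ever reaches the agent:</p>

```typescript
// Illustrative helper (not the project's exact code): detect the "exit" /
// "quit" commands advertised in the session banner.
function isExitCommand(line: string): boolean {
  const cmd = line.trim().toLowerCase();
  return cmd === 'exit' || cmd === 'quit';
}

// In the readline handler, before calling chat.send():
// if (isExitCommand(line)) { rl.close(); return; }
```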

<h2 id="genkit-dev-ui-for-agent-debugging">Genkit Dev UI for Agent Debugging</h2>

<p>Even though this is a CLI tool, you can still use the <strong>Genkit Dev UI</strong> to debug and test the agent. The Dev UI is invaluable for:</p>

<ul>
  <li><strong>Inspecting tool calls</strong>: See exactly when the agent decides to call the search tool, what query it constructs, and what results come back</li>
  <li><strong>Viewing traces</strong>: Every interaction generates a detailed trace showing the full chain of LLM calls, tool invocations, and final responses</li>
  <li><strong>Testing prompts</strong>: Tweak the system prompt and test with different queries without restarting the CLI</li>
  <li><strong>Monitoring latency</strong>: See how long each step takes, from tool calls to LLM generation</li>
</ul>

<p>To start the Dev UI alongside your project:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npx genkit start <span class="nt">--</span> npx tsx index.ts
</code></pre></div></div>

<p>This runs your CLI with the Genkit Dev UI available at <code class="language-plaintext highlighter-rouge">http://localhost:4000</code>.</p>

<h2 id="example-session">Example Session</h2>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>╔════════════════════════════════════════════════════╗
║   Welcome to Perplexity CLI - Interactive Mode     ║
╚════════════════════════════════════════════════════╝
Type your questions and get AI-powered answers with sources!
Chat <span class="nb">history </span>is maintained during this session.
Commands: <span class="nb">exit</span>, quit, or press Ctrl+C to leave

💬 Ask a question <span class="o">(</span>or <span class="nb">type</span> <span class="s2">"exit"</span> to quit<span class="o">)</span>: What is quantum computing?
✓ Response generated

Quantum computing is a revolutionary approach to computation that leverages
the principles of quantum mechanics. Unlike classical computers that use bits
<span class="o">(</span>0s and 1s<span class="o">)</span>, quantum computers use quantum bits or <span class="s2">"qubits"</span> that can exist
<span class="k">in </span>multiple states simultaneously...

Sources:
<span class="o">[</span>1] https://example.com/quantum-basics
<span class="o">[</span>2] https://example.com/quantum-intro

💬 Ask a question <span class="o">(</span>or <span class="nb">type</span> <span class="s2">"exit"</span> to quit<span class="o">)</span>: How does it differ from classical computing?
✓ Response generated

<span class="o">[</span>AI continues the conversation with context from previous question]
</code></pre></div></div>

<h2 id="setup--development">Setup &amp; Development</h2>

<ol>
  <li>Clone and install dependencies:
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git clone https://github.com/xavidop/perplexity-cli.git
<span class="nb">cd </span>perplexity-cli
npm <span class="nb">install</span>
</code></pre></div>    </div>
  </li>
  <li>Create a <code class="language-plaintext highlighter-rouge">.env</code> file with your API keys:
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">TAVILY_API_KEY</span><span class="o">=</span>your_tavily_api_key_here
<span class="nv">GOOGLE_API_KEY</span><span class="o">=</span>your_google_api_key_here
</code></pre></div>    </div>
  </li>
  <li>Run in development mode:
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm run dev
</code></pre></div>    </div>
  </li>
  <li>Build for production:
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm run build
npm start
</code></pre></div>    </div>
  </li>
</ol>

<h2 id="resources">Resources</h2>

<ul>
  <li><a href="https://genkit.dev/docs/">Genkit Documentation</a></li>
  <li><a href="https://genkit.dev/docs/plugins/google-genai/">Google AI Plugin</a></li>
  <li><a href="https://tavily.com/">Tavily API</a></li>
  <li><a href="https://ai.google.dev/">Gemini Models</a></li>
</ul>

<h2 id="conclusion">Conclusion</h2>

<p>This project demonstrates some of Genkit’s most exciting capabilities: <strong>AI agents</strong> that autonomously decide when to use tools, <strong>tool calling</strong> with typed schemas, <strong>chat sessions</strong> with persistent history, and <strong>prompt definitions</strong> that keep your AI logic clean and reusable. The Genkit Dev UI ties it all together by giving you full visibility into every decision the agent makes.</p>

<p>You can find the full code of this example in the <a href="https://github.com/xavidop/perplexity-cli">GitHub repository</a>.</p>

<p>Happy coding!</p>]]></content><author><name>Xavier Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><category term="genkit" /><category term="gcp" /><category term="genkit" /><category term="gcp" /><summary type="html"><![CDATA[An interactive CLI chat tool with AI-powered web search using Genkit agents and tool calling]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://xavidop.me/assets/img/blog/post-headers/genkit-perplexity-cli.png" /><media:content medium="image" url="https://xavidop.me/assets/img/blog/post-headers/genkit-perplexity-cli.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry xml:lang="en"><title type="html">It Has Never Been This Easy to Build Gen AI Features in Java (English)</title><link href="https://xavidop.me/genkit/gcp/2026-02-10-genkit-java-101/" rel="alternate" type="text/html" title="It Has Never Been This Easy to Build Gen AI Features in Java (English)" /><published>2026-02-10T00:00:00+00:00</published><updated>2026-03-20T17:41:31+00:00</updated><id>https://xavidop.me/genkit/gcp/genkit-java-101</id><content type="html" xml:base="https://xavidop.me/genkit/gcp/2026-02-10-genkit-java-101/"><![CDATA[<ol class="no_toc" id="markdown-toc">
  <li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li>
  <li><a href="#why-genkit-java" id="markdown-toc-why-genkit-java">Why Genkit Java?</a></li>
  <li><a href="#what-were-building" id="markdown-toc-what-were-building">What We’re Building</a></li>
  <li><a href="#prerequisites" id="markdown-toc-prerequisites">Prerequisites</a>    <ol>
      <li><a href="#install-the-genkit-cli" id="markdown-toc-install-the-genkit-cli">Install the Genkit CLI</a></li>
    </ol>
  </li>
  <li><a href="#project-structure" id="markdown-toc-project-structure">Project Structure</a></li>
  <li><a href="#getting-started" id="markdown-toc-getting-started">Getting Started</a>    <ol>
      <li><a href="#1-clone-and-set-your-api-key" id="markdown-toc-1-clone-and-set-your-api-key">1. Clone and Set Your API Key</a></li>
      <li><a href="#2-run-with-the-genkit-dev-ui-recommended" id="markdown-toc-2-run-with-the-genkit-dev-ui-recommended">2. Run with the Genkit Dev UI (Recommended)</a></li>
      <li><a href="#3-or-run-directly-without-dev-ui" id="markdown-toc-3-or-run-directly-without-dev-ui">3. Or Run Directly (Without Dev UI)</a></li>
    </ol>
  </li>
  <li><a href="#the-code-its-stupidly-simple" id="markdown-toc-the-code-its-stupidly-simple">The Code, It’s Stupidly Simple</a>    <ol>
      <li><a href="#step-1-define-typed-inputoutput-classes" id="markdown-toc-step-1-define-typed-inputoutput-classes">Step 1: Define Typed Input/Output Classes</a></li>
      <li><a href="#step-2-initialize-genkit" id="markdown-toc-step-2-initialize-genkit">Step 2: Initialize Genkit</a></li>
      <li><a href="#step-3-define-a-flow-with-typed-classes-and-structured-output" id="markdown-toc-step-3-define-a-flow-with-typed-classes-and-structured-output">Step 3: Define a Flow with Typed Classes and Structured Output</a></li>
    </ol>
  </li>
  <li><a href="#the-genkit-dev-ui---your-ai-playground" id="markdown-toc-the-genkit-dev-ui---your-ai-playground">The Genkit Dev UI - Your AI Playground</a>    <ol>
      <li><a href="#what-can-you-do-in-the-dev-ui" id="markdown-toc-what-can-you-do-in-the-dev-ui">What Can You Do in the Dev UI?</a></li>
    </ol>
  </li>
  <li><a href="#deploying-to-google-cloud-run" id="markdown-toc-deploying-to-google-cloud-run">Deploying to Google Cloud Run</a>    <ol>
      <li><a href="#step-by-step-deployment" id="markdown-toc-step-by-step-deployment">Step-by-Step Deployment</a></li>
      <li><a href="#why-jib" id="markdown-toc-why-jib">Why Jib?</a></li>
    </ol>
  </li>
  <li><a href="#available-flows--api-examples" id="markdown-toc-available-flows--api-examples">Available Flows &amp; API Examples</a>    <ol>
      <li><a href="#translate-text" id="markdown-toc-translate-text">Translate Text</a></li>
    </ol>
  </li>
  <li><a href="#what-genkit-gives-you-for-free" id="markdown-toc-what-genkit-gives-you-for-free">What Genkit Gives You for Free</a>    <ol>
      <li><a href="#observability-zero-config" id="markdown-toc-observability-zero-config">Observability (Zero Config)</a></li>
      <li><a href="#plugin-ecosystem" id="markdown-toc-plugin-ecosystem">Plugin Ecosystem</a></li>
      <li><a href="#type-safety" id="markdown-toc-type-safety">Type Safety</a></li>
    </ol>
  </li>
  <li><a href="#whats-next" id="markdown-toc-whats-next">What’s Next?</a></li>
  <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ol>

<h2 id="introduction">Introduction</h2>

<p>Building generative AI applications in Java used to be a complex, boilerplate-heavy endeavor. You’d wrestle with raw HTTP clients, hand-craft JSON payloads, parse streaming responses, manage API keys, and stitch together observability, all before writing a single line of actual AI logic. Those days are over.</p>

<p><strong><a href="https://github.com/genkit-ai/genkit-java">Genkit Java</a></strong> is an open-source framework that makes building AI-powered applications in Java as straightforward as defining a function. Pair it with <strong>Google’s Gemini</strong> models and <strong>Google Cloud Run</strong>, and you can go from zero to a production-deployed generative AI service in minutes, not days.</p>

<p>This is a complete, working example. Clone it, set your API key, and run.</p>

<h2 id="why-genkit-java">Why Genkit Java?</h2>

<p>If you’re a Java developer, you’ve probably watched the Gen AI revolution unfold mostly in Python and TypeScript. The tooling, the frameworks, the tutorials, all skewed toward those ecosystems. Java developers were left to either build everything from scratch or use verbose, low-level SDKs.</p>

<p>Genkit Java changes that. Here’s what makes it different:</p>

<table>
  <thead>
    <tr>
      <th>Feature</th>
      <th>Without Genkit</th>
      <th>With Genkit</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Call Gemini</td>
      <td>Manual HTTP client, JSON parsing, error handling</td>
      <td><code class="language-plaintext highlighter-rouge">genkit.generate(...)</code>, one method call</td>
    </tr>
    <tr>
      <td>Expose as API</td>
      <td>Set up Spring Boot, write controllers, handle serialization</td>
      <td><code class="language-plaintext highlighter-rouge">genkit.defineFlow(...)</code>, auto-exposed as HTTP endpoint</td>
    </tr>
    <tr>
      <td>Structured output</td>
      <td>Parse raw JSON strings, deserialize manually</td>
      <td><code class="language-plaintext highlighter-rouge">outputClass(MyClass.class)</code>, Gemini returns typed Java objects</td>
    </tr>
    <tr>
      <td>Tool calling</td>
      <td>Parse function call responses, execute tools, re-submit</td>
      <td>Define tools with <code class="language-plaintext highlighter-rouge">genkit.defineTool(...)</code>, automatic execution</td>
    </tr>
    <tr>
      <td>Observability</td>
      <td>Manual OpenTelemetry setup, custom spans, metrics</td>
      <td>Built-in tracing, metrics, and latency tracking, zero config</td>
    </tr>
    <tr>
      <td>Dev/test your flows</td>
      <td>cURL, Postman, write test harnesses</td>
      <td><strong>Genkit Dev UI</strong>, visual, interactive, built-in</td>
    </tr>
  </tbody>
</table>

<h2 id="what-were-building">What We’re Building</h2>

<p>A Java application with a translation AI flow powered by Gemini via Genkit, showcasing:</p>

<ul>
  <li><strong>Typed flow inputs</strong>, <code class="language-plaintext highlighter-rouge">TranslateRequest</code> class with <code class="language-plaintext highlighter-rouge">@JsonProperty</code> annotations as the flow input</li>
  <li><strong>Structured LLM output</strong>, Gemini returns a <code class="language-plaintext highlighter-rouge">TranslateResponse</code> Java object directly (no manual JSON parsing)</li>
  <li><strong>Typed flow outputs</strong>, The flow returns a fully typed <code class="language-plaintext highlighter-rouge">TranslateResponse</code> to the caller</li>
</ul>

<p>All of this in <strong>a single Java file + two model classes</strong>. No Spring Boot. No annotation soup. No XML configuration. Just clean, readable, type-safe code.</p>

<h2 id="prerequisites">Prerequisites</h2>

<ul>
  <li><strong>Java 21+</strong> (<a href="https://adoptium.net/">Eclipse Temurin</a> recommended)</li>
  <li><strong>Maven 3.6+</strong></li>
  <li><strong>Node.js 18+</strong> (for the Genkit CLI)</li>
  <li>A <strong>Google GenAI API key</strong> (free from <a href="https://aistudio.google.com/">Google AI Studio</a>)</li>
  <li><strong>Google Cloud SDK</strong> (only for Cloud Run deployment)</li>
</ul>

<h3 id="install-the-genkit-cli">Install the Genkit CLI</h3>

<p>The Genkit CLI is your command-line companion for developing and testing AI flows. Install it globally:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm <span class="nb">install</span> <span class="nt">-g</span> genkit
</code></pre></div></div>

<p>Verify the installation:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>genkit <span class="nt">--version</span>
</code></pre></div></div>

<p>The CLI is what powers the Dev UI and provides a seamless development experience, more on that below.</p>

<h2 id="project-structure">Project Structure</h2>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>genkit-java-getting-started/
├── src/
│   └── main/
│       ├── java/
│       │   └── com/example/
│       │       ├── App.java                # ← The main application
│       │       ├── TranslateRequest.java   # ← Typed flow input
│       │       └── TranslateResponse.java  # ← Typed flow + LLM output
│       └── resources/
│           └── logback.xml                 # Logging configuration
├── pom.xml                                 # Maven config with Genkit + Jib
├── run.sh                                  # Quick-start script
└── README.md                               # This article
</code></pre></div></div>

<h2 id="getting-started">Getting Started</h2>

<h3 id="1-clone-and-set-your-api-key">1. Clone and Set Your API Key</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git clone https://github.com/xavidop/genkit-java-getting-started.git
<span class="nb">cd </span>genkit-java-getting-started

<span class="nb">export </span><span class="nv">GOOGLE_API_KEY</span><span class="o">=</span>your-api-key-here
</code></pre></div></div>

<h3 id="2-run-with-the-genkit-dev-ui-recommended">2. Run with the Genkit Dev UI (Recommended)</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>genkit start <span class="nt">--</span> mvn compile <span class="nb">exec</span>:java
</code></pre></div></div>

<p>That’s it. One command. Your AI-powered Java server is running on <code class="language-plaintext highlighter-rouge">http://localhost:8080</code>, and the <strong>Genkit Dev UI</strong> is available at <code class="language-plaintext highlighter-rouge">http://localhost:4000</code>.</p>

<p class="figure"><img src="/assets/img/blog/tutorials/genkit-java-getting-started/devui.png" alt="The Genkit Dev UI" class="lead" data-width="800" data-height="100" />
The Genkit Dev UI</p>

<h3 id="3-or-run-directly-without-dev-ui">3. Or Run Directly (Without Dev UI)</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mvn compile <span class="nb">exec</span>:java
</code></pre></div></div>

<h2 id="the-code-its-stupidly-simple">The Code: It’s Stupidly Simple</h2>

<h3 id="step-1-define-typed-inputoutput-classes">Step 1: Define Typed Input/Output Classes</h3>

<p>Instead of using raw <code class="language-plaintext highlighter-rouge">Map</code> or <code class="language-plaintext highlighter-rouge">String</code>, define proper Java classes with Jackson annotations. Genkit uses these annotations to generate JSON schemas that tell Gemini exactly what structure to return.</p>

<p><strong><code class="language-plaintext highlighter-rouge">TranslateRequest.java</code></strong>, the flow input:</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">com.fasterxml.jackson.annotation.JsonProperty</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">com.fasterxml.jackson.annotation.JsonPropertyDescription</span><span class="o">;</span>

<span class="cm">/**
 * Input for the translate flow.
 */</span>
<span class="kd">public</span> <span class="kd">class</span> <span class="nc">TranslateRequest</span> <span class="o">{</span>

    <span class="nd">@JsonProperty</span><span class="o">(</span><span class="n">required</span> <span class="o">=</span> <span class="kc">true</span><span class="o">)</span>
    <span class="nd">@JsonPropertyDescription</span><span class="o">(</span><span class="s">"The text to translate"</span><span class="o">)</span>
    <span class="kd">private</span> <span class="nc">String</span> <span class="n">text</span><span class="o">;</span>

    <span class="nd">@JsonProperty</span><span class="o">(</span><span class="n">required</span> <span class="o">=</span> <span class="kc">true</span><span class="o">)</span>
    <span class="nd">@JsonPropertyDescription</span><span class="o">(</span><span class="s">"The target language (e.g., Spanish, French, Japanese)"</span><span class="o">)</span>
    <span class="kd">private</span> <span class="nc">String</span> <span class="n">language</span><span class="o">;</span>

    <span class="kd">public</span> <span class="nf">TranslateRequest</span><span class="o">()</span> <span class="o">{}</span>

    <span class="kd">public</span> <span class="nf">TranslateRequest</span><span class="o">(</span><span class="nc">String</span> <span class="n">text</span><span class="o">,</span> <span class="nc">String</span> <span class="n">language</span><span class="o">)</span> <span class="o">{</span>
        <span class="k">this</span><span class="o">.</span><span class="na">text</span> <span class="o">=</span> <span class="n">text</span><span class="o">;</span>
        <span class="k">this</span><span class="o">.</span><span class="na">language</span> <span class="o">=</span> <span class="n">language</span><span class="o">;</span>
    <span class="o">}</span>

    <span class="kd">public</span> <span class="nc">String</span> <span class="nf">getText</span><span class="o">()</span> <span class="o">{</span>
        <span class="k">return</span> <span class="n">text</span><span class="o">;</span>
    <span class="o">}</span>

    <span class="kd">public</span> <span class="kt">void</span> <span class="nf">setText</span><span class="o">(</span><span class="nc">String</span> <span class="n">text</span><span class="o">)</span> <span class="o">{</span>
        <span class="k">this</span><span class="o">.</span><span class="na">text</span> <span class="o">=</span> <span class="n">text</span><span class="o">;</span>
    <span class="o">}</span>

    <span class="kd">public</span> <span class="nc">String</span> <span class="nf">getLanguage</span><span class="o">()</span> <span class="o">{</span>
        <span class="k">return</span> <span class="n">language</span><span class="o">;</span>
    <span class="o">}</span>

    <span class="kd">public</span> <span class="kt">void</span> <span class="nf">setLanguage</span><span class="o">(</span><span class="nc">String</span> <span class="n">language</span><span class="o">)</span> <span class="o">{</span>
        <span class="k">this</span><span class="o">.</span><span class="na">language</span> <span class="o">=</span> <span class="n">language</span><span class="o">;</span>
    <span class="o">}</span>

    <span class="nd">@Override</span>
    <span class="kd">public</span> <span class="nc">String</span> <span class="nf">toString</span><span class="o">()</span> <span class="o">{</span>
        <span class="k">return</span> <span class="nc">String</span><span class="o">.</span><span class="na">format</span><span class="o">(</span><span class="s">"TranslateRequest{text='%s', language='%s'}"</span><span class="o">,</span> <span class="n">text</span><span class="o">,</span> <span class="n">language</span><span class="o">);</span>
    <span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>

<p><strong><code class="language-plaintext highlighter-rouge">TranslateResponse.java</code></strong>, the flow output <em>and</em> the LLM structured output:</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">com.fasterxml.jackson.annotation.JsonProperty</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">com.fasterxml.jackson.annotation.JsonPropertyDescription</span><span class="o">;</span>

<span class="cm">/**
 * Structured output for the translate flow.
 */</span>
<span class="kd">public</span> <span class="kd">class</span> <span class="nc">TranslateResponse</span> <span class="o">{</span>

    <span class="nd">@JsonProperty</span><span class="o">(</span><span class="n">required</span> <span class="o">=</span> <span class="kc">true</span><span class="o">)</span>
    <span class="nd">@JsonPropertyDescription</span><span class="o">(</span><span class="s">"The original text that was translated"</span><span class="o">)</span>
    <span class="kd">private</span> <span class="nc">String</span> <span class="n">originalText</span><span class="o">;</span>

    <span class="nd">@JsonProperty</span><span class="o">(</span><span class="n">required</span> <span class="o">=</span> <span class="kc">true</span><span class="o">)</span>
    <span class="nd">@JsonPropertyDescription</span><span class="o">(</span><span class="s">"The translated text"</span><span class="o">)</span>
    <span class="kd">private</span> <span class="nc">String</span> <span class="n">translatedText</span><span class="o">;</span>

    <span class="nd">@JsonProperty</span><span class="o">(</span><span class="n">required</span> <span class="o">=</span> <span class="kc">true</span><span class="o">)</span>
    <span class="nd">@JsonPropertyDescription</span><span class="o">(</span><span class="s">"The target language"</span><span class="o">)</span>
    <span class="kd">private</span> <span class="nc">String</span> <span class="n">language</span><span class="o">;</span>

    <span class="kd">public</span> <span class="nf">TranslateResponse</span><span class="o">()</span> <span class="o">{}</span>

    <span class="kd">public</span> <span class="nf">TranslateResponse</span><span class="o">(</span><span class="nc">String</span> <span class="n">originalText</span><span class="o">,</span> <span class="nc">String</span> <span class="n">translatedText</span><span class="o">,</span> <span class="nc">String</span> <span class="n">language</span><span class="o">)</span> <span class="o">{</span>
        <span class="k">this</span><span class="o">.</span><span class="na">originalText</span> <span class="o">=</span> <span class="n">originalText</span><span class="o">;</span>
        <span class="k">this</span><span class="o">.</span><span class="na">translatedText</span> <span class="o">=</span> <span class="n">translatedText</span><span class="o">;</span>
        <span class="k">this</span><span class="o">.</span><span class="na">language</span> <span class="o">=</span> <span class="n">language</span><span class="o">;</span>
    <span class="o">}</span>

    <span class="kd">public</span> <span class="nc">String</span> <span class="nf">getOriginalText</span><span class="o">()</span> <span class="o">{</span>
        <span class="k">return</span> <span class="n">originalText</span><span class="o">;</span>
    <span class="o">}</span>

    <span class="kd">public</span> <span class="kt">void</span> <span class="nf">setOriginalText</span><span class="o">(</span><span class="nc">String</span> <span class="n">originalText</span><span class="o">)</span> <span class="o">{</span>
        <span class="k">this</span><span class="o">.</span><span class="na">originalText</span> <span class="o">=</span> <span class="n">originalText</span><span class="o">;</span>
    <span class="o">}</span>

    <span class="kd">public</span> <span class="nc">String</span> <span class="nf">getTranslatedText</span><span class="o">()</span> <span class="o">{</span>
        <span class="k">return</span> <span class="n">translatedText</span><span class="o">;</span>
    <span class="o">}</span>

    <span class="kd">public</span> <span class="kt">void</span> <span class="nf">setTranslatedText</span><span class="o">(</span><span class="nc">String</span> <span class="n">translatedText</span><span class="o">)</span> <span class="o">{</span>
        <span class="k">this</span><span class="o">.</span><span class="na">translatedText</span> <span class="o">=</span> <span class="n">translatedText</span><span class="o">;</span>
    <span class="o">}</span>

    <span class="kd">public</span> <span class="nc">String</span> <span class="nf">getLanguage</span><span class="o">()</span> <span class="o">{</span>
        <span class="k">return</span> <span class="n">language</span><span class="o">;</span>
    <span class="o">}</span>

    <span class="kd">public</span> <span class="kt">void</span> <span class="nf">setLanguage</span><span class="o">(</span><span class="nc">String</span> <span class="n">language</span><span class="o">)</span> <span class="o">{</span>
        <span class="k">this</span><span class="o">.</span><span class="na">language</span> <span class="o">=</span> <span class="n">language</span><span class="o">;</span>
    <span class="o">}</span>

    <span class="nd">@Override</span>
    <span class="kd">public</span> <span class="nc">String</span> <span class="nf">toString</span><span class="o">()</span> <span class="o">{</span>
        <span class="k">return</span> <span class="nc">String</span><span class="o">.</span><span class="na">format</span><span class="o">(</span>
            <span class="s">"TranslateResponse{originalText='%s', translatedText='%s', language='%s'}"</span><span class="o">,</span>
            <span class="n">originalText</span><span class="o">,</span> <span class="n">translatedText</span><span class="o">,</span> <span class="n">language</span><span class="o">);</span>
    <span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">@JsonPropertyDescription</code> annotations are key: Genkit passes them to Gemini as part of the JSON schema, so the model knows exactly what each field means.</p>
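<p>For illustration, the JSON schema derived from <code class="language-plaintext highlighter-rouge">TranslateResponse</code> looks roughly like this (an approximation; the exact schema Genkit emits may differ in shape and metadata):</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
  "type": "object",
  "properties": {
    "originalText": { "type": "string", "description": "The original text that was translated" },
    "translatedText": { "type": "string", "description": "The translated text" },
    "language": { "type": "string", "description": "The target language" }
  },
  "required": ["originalText", "translatedText", "language"]
}
</code></pre></div></div>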

<h3 id="step-2-initialize-genkit">Step 2: Initialize Genkit</h3>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nc">Genkit</span> <span class="n">genkit</span> <span class="o">=</span> <span class="nc">Genkit</span><span class="o">.</span><span class="na">builder</span><span class="o">()</span>
    <span class="o">.</span><span class="na">options</span><span class="o">(</span><span class="nc">GenkitOptions</span><span class="o">.</span><span class="na">builder</span><span class="o">()</span>
        <span class="o">.</span><span class="na">devMode</span><span class="o">(</span><span class="kc">true</span><span class="o">)</span>
        <span class="o">.</span><span class="na">reflectionPort</span><span class="o">(</span><span class="mi">3100</span><span class="o">)</span>
        <span class="o">.</span><span class="na">build</span><span class="o">())</span>
    <span class="o">.</span><span class="na">plugin</span><span class="o">(</span><span class="nc">GoogleGenAIPlugin</span><span class="o">.</span><span class="na">create</span><span class="o">())</span>
    <span class="o">.</span><span class="na">plugin</span><span class="o">(</span><span class="n">jetty</span><span class="o">)</span>
    <span class="o">.</span><span class="na">build</span><span class="o">();</span>
</code></pre></div></div>

<p>That’s the entire setup. The <code class="language-plaintext highlighter-rouge">GoogleGenAIPlugin</code> reads your <code class="language-plaintext highlighter-rouge">GOOGLE_API_KEY</code> automatically. The <code class="language-plaintext highlighter-rouge">JettyPlugin</code> (the <code class="language-plaintext highlighter-rouge">jetty</code> variable passed to the builder, constructed earlier in <code class="language-plaintext highlighter-rouge">App.java</code>) handles HTTP. Genkit wires everything together.</p>

<h3 id="step-3-define-a-flow-with-typed-classes-and-structured-output">Step 3: Define a Flow with Typed Classes and Structured Output</h3>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">genkit</span><span class="o">.</span><span class="na">defineFlow</span><span class="o">(</span>
    <span class="s">"translate"</span><span class="o">,</span>
    <span class="nc">TranslateRequest</span><span class="o">.</span><span class="na">class</span><span class="o">,</span>     <span class="c1">// ← typed input</span>
    <span class="nc">TranslateResponse</span><span class="o">.</span><span class="na">class</span><span class="o">,</span>    <span class="c1">// ← typed output</span>
    <span class="o">(</span><span class="n">ctx</span><span class="o">,</span> <span class="n">request</span><span class="o">)</span> <span class="o">-&gt;</span> <span class="o">{</span>
        <span class="nc">String</span> <span class="n">prompt</span> <span class="o">=</span> <span class="nc">String</span><span class="o">.</span><span class="na">format</span><span class="o">(</span>
            <span class="s">"Translate the following text to %s.\n\nText: %s"</span><span class="o">,</span>
            <span class="n">request</span><span class="o">.</span><span class="na">getLanguage</span><span class="o">(),</span> <span class="n">request</span><span class="o">.</span><span class="na">getText</span><span class="o">()</span>
        <span class="o">);</span>

        <span class="k">return</span> <span class="n">genkit</span><span class="o">.</span><span class="na">generate</span><span class="o">(</span>
            <span class="nc">GenerateOptions</span><span class="o">.&lt;</span><span class="nc">TranslateResponse</span><span class="o">&gt;</span><span class="n">builder</span><span class="o">()</span>
                <span class="o">.</span><span class="na">model</span><span class="o">(</span><span class="s">"googleai/gemini-3-flash-preview"</span><span class="o">)</span>
                <span class="o">.</span><span class="na">prompt</span><span class="o">(</span><span class="n">prompt</span><span class="o">)</span>
                <span class="o">.</span><span class="na">outputClass</span><span class="o">(</span><span class="nc">TranslateResponse</span><span class="o">.</span><span class="na">class</span><span class="o">)</span>  <span class="c1">// ← Gemini returns a typed object!</span>
                <span class="o">.</span><span class="na">config</span><span class="o">(</span><span class="nc">GenerationConfig</span><span class="o">.</span><span class="na">builder</span><span class="o">()</span>
                    <span class="o">.</span><span class="na">temperature</span><span class="o">(</span><span class="mf">0.1</span><span class="o">)</span>
                    <span class="o">.</span><span class="na">build</span><span class="o">())</span>
                <span class="o">.</span><span class="na">build</span><span class="o">()</span>
        <span class="o">);</span>
    <span class="o">}</span>
<span class="o">);</span>
</code></pre></div></div>

<p>Look at what’s happening here:</p>

<ol>
  <li><strong><code class="language-plaintext highlighter-rouge">TranslateRequest.class</code></strong> as the flow input: Genkit automatically deserializes incoming JSON into a <code class="language-plaintext highlighter-rouge">TranslateRequest</code> object. No <code class="language-plaintext highlighter-rouge">Map.get()</code> casting.</li>
  <li><strong><code class="language-plaintext highlighter-rouge">TranslateResponse.class</code></strong> as the flow output: the flow returns a typed object, serialized automatically to JSON for the HTTP response.</li>
  <li><strong><code class="language-plaintext highlighter-rouge">outputClass(TranslateResponse.class)</code></strong> on the <code class="language-plaintext highlighter-rouge">generate</code> call: this is the magic. Genkit sends the JSON schema derived from <code class="language-plaintext highlighter-rouge">TranslateResponse</code> to Gemini, and Gemini returns structured JSON that Genkit deserializes into a <code class="language-plaintext highlighter-rouge">TranslateResponse</code> object. No <code class="language-plaintext highlighter-rouge">response.getText()</code> + manual parsing.</li>
</ol>

<p>That single <code class="language-plaintext highlighter-rouge">defineFlow</code> call:</p>
<ul>
  <li>Registers the flow in Genkit’s internal registry</li>
  <li>Exposes it as a <code class="language-plaintext highlighter-rouge">POST /api/flows/translate</code> HTTP endpoint</li>
  <li>Makes it visible in the Dev UI</li>
  <li>Adds full OpenTelemetry tracing automatically</li>
  <li>Tracks token usage, latency, and error rates</li>
</ul>

<p>Compare that to writing a Spring Boot controller + service + DTO + config + exception handler for the same functionality.</p>

<h2 id="the-genkit-dev-ui---your-ai-playground">The Genkit Dev UI - Your AI Playground</h2>

<p>This is where Genkit truly shines for development. When you run with <code class="language-plaintext highlighter-rouge">genkit start</code>, the CLI launches a <strong>visual Dev UI</strong> at <code class="language-plaintext highlighter-rouge">http://localhost:4000</code>.</p>

<h3 id="what-can-you-do-in-the-dev-ui">What Can You Do in the Dev UI?</h3>

<ul>
  <li><strong>Browse all flows</strong>: see every flow you’ve registered, like <code class="language-plaintext highlighter-rouge">translate</code>, with its typed input/output schemas.</li>
  <li><strong>Run flows interactively</strong>: fill in a <code class="language-plaintext highlighter-rouge">TranslateRequest</code> JSON, click “Run”, and see the <code class="language-plaintext highlighter-rouge">TranslateResponse</code> instantly. No cURL needed.</li>
  <li><strong>Inspect traces</strong>: every flow execution is traced. See exactly which model was called, what the input/output was, how long it took, and how many tokens were used.</li>
  <li><strong>View registered models &amp; tools</strong>: see all available Gemini models and any tools you’ve defined.</li>
  <li><strong>Test tool calling</strong>: watch Gemini decide to call your tools in real time.</li>
  <li><strong>Manage datasets &amp; evaluations</strong>: create test datasets and evaluate your AI outputs.</li>
</ul>

<h2 id="deploying-to-google-cloud-run">Deploying to Google Cloud Run</h2>

<p>The project uses <strong><a href="https://github.com/GoogleContainerTools/jib">Jib</a></strong> to build and push container images directly from Maven, <strong>no Dockerfile and no Docker daemon required</strong>. Jib is configured in the <code class="language-plaintext highlighter-rouge">pom.xml</code> and builds optimized, layered container images.</p>
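<p>As a sketch, the Jib section of the <code class="language-plaintext highlighter-rouge">pom.xml</code> looks something like this (the plugin version and image name here are illustrative; check the repository for the exact values):</p>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&lt;plugin&gt;
  &lt;groupId&gt;com.google.cloud.tools&lt;/groupId&gt;
  &lt;artifactId&gt;jib-maven-plugin&lt;/artifactId&gt;
  &lt;version&gt;3.4.0&lt;/version&gt;
  &lt;configuration&gt;
    &lt;to&gt;
      &lt;!-- Overridable on the command line with -Djib.to.image=... --&gt;
      &lt;image&gt;genkit-java-app&lt;/image&gt;
    &lt;/to&gt;
  &lt;/configuration&gt;
&lt;/plugin&gt;
</code></pre></div></div>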

<h3 id="step-by-step-deployment">Step-by-Step Deployment</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Set your GCP project</span>
<span class="nb">export </span><span class="nv">PROJECT_ID</span><span class="o">=</span><span class="si">$(</span>gcloud config get-value project<span class="si">)</span>
<span class="nb">export </span><span class="nv">REGION</span><span class="o">=</span>us-central1

<span class="c"># Build the container image and push it to Google Container Registry</span>
<span class="c"># No Docker needed, Jib does it all from Maven!</span>
mvn compile jib:build <span class="nt">-Djib</span>.to.image<span class="o">=</span>gcr.io/<span class="nv">$PROJECT_ID</span>/genkit-java-app

<span class="c"># Deploy to Cloud Run</span>
gcloud run deploy genkit-java-app <span class="se">\</span>
  <span class="nt">--image</span> gcr.io/<span class="nv">$PROJECT_ID</span>/genkit-java-app <span class="se">\</span>
  <span class="nt">--region</span> <span class="nv">$REGION</span> <span class="se">\</span>
  <span class="nt">--platform</span> managed <span class="se">\</span>
  <span class="nt">--allow-unauthenticated</span> <span class="se">\</span>
  <span class="nt">--set-env-vars</span> <span class="s2">"GOOGLE_API_KEY=</span><span class="nv">$GOOGLE_API_KEY</span><span class="s2">"</span> <span class="se">\</span>
  <span class="nt">--memory</span> 512Mi <span class="se">\</span>
  <span class="nt">--cpu</span> 1
</code></pre></div></div>

<p>Two commands. No Docker. Your Java GenAI application is now live on a globally-distributed, auto-scaling, serverless platform.</p>
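<p>Once the deployment finishes, you can grab the service URL and call the flow directly (using the service name and region from the commands above):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SERVICE_URL=$(gcloud run services describe genkit-java-app \
  --region $REGION --format 'value(status.url)')

curl -X POST $SERVICE_URL/api/flows/translate \
  -H 'Content-Type: application/json' \
  -d '{"text": "Hello from Cloud Run", "language": "Spanish"}'
</code></pre></div></div>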

<h3 id="why-jib">Why Jib?</h3>

<ul>
  <li><strong>No Dockerfile</strong>: the container image is built directly from your Maven project</li>
  <li><strong>No Docker daemon</strong>: doesn’t require Docker installed or running on your machine</li>
  <li><strong>Fast rebuilds</strong>: separates dependencies, classes, and resources into layers, so only changed layers are rebuilt</li>
  <li><strong>Reproducible</strong>: builds are deterministic and don’t depend on the local Docker environment</li>
  <li><strong>Direct push</strong>: sends the image straight to GCR/Artifact Registry without a local <code class="language-plaintext highlighter-rouge">docker push</code></li>
</ul>

<p>You can also build a local Docker image (requires Docker running) with:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mvn compile jib:dockerBuild <span class="nt">-Djib</span>.to.image<span class="o">=</span>genkit-java-app
</code></pre></div></div>
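<p>You can then run that local image the usual way (same port and API key as earlier in this article):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker run -p 8080:8080 -e GOOGLE_API_KEY=$GOOGLE_API_KEY genkit-java-app
</code></pre></div></div>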

<h2 id="available-flows--api-examples">Available Flows &amp; API Examples</h2>

<p>Once the server is running, test the translate flow:</p>

<h3 id="translate-text">Translate Text</h3>

<p>Send a <code class="language-plaintext highlighter-rouge">TranslateRequest</code> JSON object and receive a structured <code class="language-plaintext highlighter-rouge">TranslateResponse</code>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl <span class="nt">-X</span> POST http://localhost:8080/api/flows/translate <span class="se">\</span>
  <span class="nt">-H</span> <span class="s1">'Content-Type: application/json'</span> <span class="se">\</span>
  <span class="nt">-d</span> <span class="s1">'{"text": "Building AI applications has never been easier", "language": "Spanish"}'</span>
</code></pre></div></div>

<p>Example response (a <code class="language-plaintext highlighter-rouge">TranslateResponse</code> object):</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"originalText"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Building AI applications has never been easier"</span><span class="p">,</span><span class="w">
  </span><span class="nl">"translatedText"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Construir aplicaciones de IA nunca ha sido tan fácil"</span><span class="p">,</span><span class="w">
  </span><span class="nl">"language"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Spanish"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>Try other languages:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># French</span>
curl <span class="nt">-X</span> POST http://localhost:8080/api/flows/translate <span class="se">\</span>
  <span class="nt">-H</span> <span class="s1">'Content-Type: application/json'</span> <span class="se">\</span>
  <span class="nt">-d</span> <span class="s1">'{"text": "Genkit makes Java AI development simple", "language": "French"}'</span>

<span class="c"># Japanese</span>
curl <span class="nt">-X</span> POST http://localhost:8080/api/flows/translate <span class="se">\</span>
  <span class="nt">-H</span> <span class="s1">'Content-Type: application/json'</span> <span class="se">\</span>
  <span class="nt">-d</span> <span class="s1">'{"text": "Hello world", "language": "Japanese"}'</span>
</code></pre></div></div>

<p>Notice how the response is always a structured JSON object, not a raw string. That’s the power of <code class="language-plaintext highlighter-rouge">outputClass(TranslateResponse.class)</code>: Gemini returns structured data that Genkit deserializes into your Java class automatically.</p>

<h2 id="what-genkit-gives-you-for-free">What Genkit Gives You for Free</h2>

<p>When you use Genkit, you’re not just getting a wrapper around API calls. You get a production-grade framework:</p>

<h3 id="observability-zero-config">Observability (Zero Config)</h3>

<p>Every flow execution is automatically traced with OpenTelemetry:</p>
<ul>
  <li><strong>Latency tracking</strong> per flow, per model call</li>
  <li><strong>Token usage</strong> (input/output/thinking tokens)</li>
  <li><strong>Error rates</strong> and failure tracking</li>
  <li><strong>Span hierarchy</strong> showing the full execution path</li>
</ul>

<h3 id="plugin-ecosystem">Plugin Ecosystem</h3>

<p>Need to swap Gemini for another model? Change one line:</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Switch from Gemini to OpenAI</span>
<span class="o">.</span><span class="na">plugin</span><span class="o">(</span><span class="nc">OpenAIPlugin</span><span class="o">.</span><span class="na">create</span><span class="o">())</span>

<span class="c1">// Or use Anthropic Claude</span>
<span class="o">.</span><span class="na">plugin</span><span class="o">(</span><span class="nc">AnthropicPlugin</span><span class="o">.</span><span class="na">create</span><span class="o">())</span>

<span class="c1">// Or run locally with Ollama</span>
<span class="o">.</span><span class="na">plugin</span><span class="o">(</span><span class="nc">OllamaPlugin</span><span class="o">.</span><span class="na">create</span><span class="o">())</span>
</code></pre></div></div>

<p>Genkit supports 10+ model providers, vector databases (Pinecone, Weaviate, PostgreSQL), Firebase integration, and more.</p>

<h3 id="type-safety">Type Safety</h3>

<p>This is where Genkit really shines for Java developers. Flows, generate calls, and even LLM responses are fully typed:</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// The flow takes a TranslateRequest and returns a TranslateResponse</span>
<span class="n">genkit</span><span class="o">.</span><span class="na">defineFlow</span><span class="o">(</span><span class="s">"translate"</span><span class="o">,</span> <span class="nc">TranslateRequest</span><span class="o">.</span><span class="na">class</span><span class="o">,</span> <span class="nc">TranslateResponse</span><span class="o">.</span><span class="na">class</span><span class="o">,</span> <span class="o">...);</span>

<span class="c1">// The LLM returns a TranslateResponse directly, no string parsing</span>
<span class="n">genkit</span><span class="o">.</span><span class="na">generate</span><span class="o">(</span>
    <span class="nc">GenerateOptions</span><span class="o">.&lt;</span><span class="nc">TranslateResponse</span><span class="o">&gt;</span><span class="n">builder</span><span class="o">()</span>
        <span class="o">.</span><span class="na">outputClass</span><span class="o">(</span><span class="nc">TranslateResponse</span><span class="o">.</span><span class="na">class</span><span class="o">)</span>
        <span class="o">.</span><span class="na">build</span><span class="o">()</span>
<span class="o">);</span>
</code></pre></div></div>

<p>Genkit derives JSON schemas from your <code class="language-plaintext highlighter-rouge">@JsonProperty</code> and <code class="language-plaintext highlighter-rouge">@JsonPropertyDescription</code> annotations and sends them to Gemini, so the model returns structured data that maps directly to your Java classes. No <code class="language-plaintext highlighter-rouge">Object</code> casting, no <code class="language-plaintext highlighter-rouge">response.getText()</code> + <code class="language-plaintext highlighter-rouge">objectMapper.readValue()</code>, no runtime surprises.</p>

<h2 id="whats-next">What’s Next?</h2>

<p>This getting-started project covers the fundamentals. Genkit Java can do much more:</p>

<ul>
  <li><strong>RAG</strong>: Retrieval-Augmented Generation with vector stores (Firestore, Pinecone, pgvector, Weaviate)</li>
  <li><strong>Multi-agent orchestration</strong>: coordinate multiple AI agents</li>
  <li><strong>Chat sessions</strong>: multi-turn conversations with session persistence</li>
  <li><strong>Evaluations</strong>: RAGAS-style metrics to measure your AI output quality</li>
  <li><strong>MCP Integration</strong>: connect to Model Context Protocol servers</li>
  <li><strong>Spring Boot</strong>: use the Spring plugin instead of Jetty for existing Spring apps</li>
  <li><strong>Firebase</strong>: deploy as Cloud Functions with Firestore vector search</li>
</ul>

<p>Explore the <a href="https://github.com/genkit-ai/genkit-java">full Genkit Java documentation</a> and the <a href="https://github.com/genkit-ai/genkit-java/tree/main/samples">samples directory</a> to dive deeper.</p>

<h2 id="conclusion">Conclusion</h2>

<p>As you can see, Genkit Java and Gemini let you build powerful generative AI applications with very little code. The combination of typed inputs/outputs, structured LLM responses, built-in observability, and straightforward deployment makes Genkit Java a compelling way to build GenAI features in Java.</p>

<p>You can find the full code of this example in the <a href="https://github.com/xavidop/genkit-java-getting-started">GitHub repository</a>.</p>

<p>Happy coding!</p>]]></content><author><name>Xavier Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><category term="genkit" /><category term="gcp" /><category term="genkit" /><category term="gcp" /><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://xavidop.me/assets/img/blog/post-headers/genkit-java-getting-started.png" /><media:content medium="image" url="https://xavidop.me/assets/img/blog/post-headers/genkit-java-getting-started.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry xml:lang="en"><title type="html">Genkit in Node, Building a Weather Service with AI Integration (English)</title><link href="https://xavidop.me/genkit/gcp/2025-02-28-genkit-node-tool/" rel="alternate" type="text/html" title="Genkit in Node, Building a Weather Service with AI Integration (English)" /><published>2025-02-28T00:00:00+00:00</published><updated>2026-03-20T17:41:31+00:00</updated><id>https://xavidop.me/genkit/gcp/genkit-node-tool</id><content type="html" xml:base="https://xavidop.me/genkit/gcp/2025-02-28-genkit-node-tool/"><![CDATA[<ol class="no_toc" id="markdown-toc">
  <li><a href="#overview" id="markdown-toc-overview">Overview</a></li>
  <li><a href="#prerequisites" id="markdown-toc-prerequisites">Prerequisites</a></li>
  <li><a href="#technical-deep-dive" id="markdown-toc-technical-deep-dive">Technical Deep Dive</a>    <ol>
      <li><a href="#ai-configuration" id="markdown-toc-ai-configuration">AI Configuration</a></li>
      <li><a href="#weather-tool-implementation" id="markdown-toc-weather-tool-implementation">Weather Tool Implementation</a></li>
      <li><a href="#ai-flow-definition" id="markdown-toc-ai-flow-definition">AI Flow Definition</a></li>
      <li><a href="#express-server-configuration" id="markdown-toc-express-server-configuration">Express Server Configuration</a></li>
    </ol>
  </li>
  <li><a href="#full-code" id="markdown-toc-full-code">Full Code</a></li>
  <li><a href="#setup--development" id="markdown-toc-setup--development">Setup &amp; Development</a></li>
  <li><a href="#dependencies" id="markdown-toc-dependencies">Dependencies</a>    <ol>
      <li><a href="#core-dependencies" id="markdown-toc-core-dependencies">Core Dependencies</a></li>
      <li><a href="#development-dependencies" id="markdown-toc-development-dependencies">Development Dependencies</a></li>
    </ol>
  </li>
  <li><a href="#project-configuration" id="markdown-toc-project-configuration">Project Configuration</a></li>
  <li><a href="#license" id="markdown-toc-license">License</a></li>
  <li><a href="#resources" id="markdown-toc-resources">Resources</a></li>
  <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ol>

<h2 id="overview">Overview</h2>

<p>This project demonstrates how to build an AI-enhanced weather service using Genkit, TypeScript, the OpenWeather API, and GitHub Models. The application showcases modern Node.js patterns and AI integration techniques.</p>

<h2 id="prerequisites">Prerequisites</h2>
<p>Before you begin, ensure you have the following:</p>
<ol>
  <li>Node.js installed on your machine.</li>
  <li>A GitHub account and a personal access token for GitHub Models.</li>
  <li>An OpenWeatherAPI key for fetching weather data.</li>
  <li>Genkit CLI installed on your machine.</li>
</ol>

<h2 id="technical-deep-dive">Technical Deep Dive</h2>

<h3 id="ai-configuration">AI Configuration</h3>
<p>The core AI instance is initialized with Genkit and the GitHub plugin. In this case we use the OpenAI o3-mini model available through GitHub Models:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">ai</span> <span class="o">=</span> <span class="nf">genkit</span><span class="p">({</span>
  <span class="na">plugins</span><span class="p">:</span> <span class="p">[</span>
    <span class="nf">github</span><span class="p">({</span> <span class="na">githubToken</span><span class="p">:</span> <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">GITHUB_TOKEN</span> <span class="p">}),</span>
  <span class="p">],</span>
  <span class="na">model</span><span class="p">:</span> <span class="nx">openAIO3Mini</span><span class="p">,</span>
<span class="p">});</span>
</code></pre></div></div>

<h3 id="weather-tool-implementation">Weather Tool Implementation</h3>
<p>The application defines a custom weather tool using Zod schema validation:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">getWeather</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">defineTool</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">getWeather</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">description</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Gets the current weather in a given location</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">inputSchema</span><span class="p">:</span> <span class="nx">weatherToolInputSchema</span><span class="p">,</span>
    <span class="na">outputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="p">},</span>
  <span class="k">async </span><span class="p">(</span><span class="nx">input</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>

    <span class="kd">const</span> <span class="nx">weather</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">OpenWeatherAPI</span><span class="p">({</span>
        <span class="na">key</span><span class="p">:</span> <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">OPENWEATHER_API_KEY</span><span class="p">,</span>
        <span class="na">units</span><span class="p">:</span> <span class="dl">"</span><span class="s2">metric</span><span class="dl">"</span>
    <span class="p">})</span>

    <span class="kd">const</span> <span class="nx">data</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">weather</span><span class="p">.</span><span class="nf">getCurrent</span><span class="p">({</span><span class="na">locationName</span><span class="p">:</span> <span class="nx">input</span><span class="p">.</span><span class="nx">location</span><span class="p">});</span>

    <span class="k">return</span> <span class="s2">`The current weather in </span><span class="p">${</span><span class="nx">input</span><span class="p">.</span><span class="nx">location</span><span class="p">}</span><span class="s2"> is: </span><span class="p">${</span><span class="nx">data</span><span class="p">.</span><span class="nx">weather</span><span class="p">.</span><span class="nx">temp</span><span class="p">.</span><span class="nx">cur</span><span class="p">}</span><span class="s2"> Degrees in Celsius`</span><span class="p">;</span>
  <span class="p">}</span>
<span class="p">);</span>
</code></pre></div></div>

<h3 id="ai-flow-definition">AI Flow Definition</h3>
<p>The service exposes an AI flow that processes weather requests:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">helloFlow</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">defineFlow</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">helloFlow</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">inputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span> <span class="na">location</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">()</span> <span class="p">}),</span>
    <span class="na">outputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="p">},</span>
  <span class="k">async </span><span class="p">(</span><span class="nx">input</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="nx">response</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
      <span class="na">tools</span><span class="p">:</span> <span class="p">[</span><span class="nx">getWeather</span><span class="p">],</span>
      <span class="na">prompt</span><span class="p">:</span> <span class="s2">`What's the weather in </span><span class="p">${</span><span class="nx">input</span><span class="p">.</span><span class="nx">location</span><span class="p">}</span><span class="s2">?`</span>
    <span class="p">});</span>
    <span class="k">return</span> <span class="nx">response</span><span class="p">.</span><span class="nx">text</span><span class="p">;</span>
  <span class="p">}</span>
<span class="p">);</span>
</code></pre></div></div>

<h3 id="express-server-configuration">Express Server Configuration</h3>
<p>The application uses <code class="language-plaintext highlighter-rouge">startFlowServer</code> from the Genkit Express plugin to expose the flows as HTTP endpoints, matching the full code below:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nf">startFlowServer</span><span class="p">({</span>
  <span class="na">flows</span><span class="p">:</span> <span class="p">[</span><span class="nx">helloFlow</span><span class="p">],</span>
<span class="p">});</span>
</code></pre></div></div>

<h2 id="full-code">Full Code</h2>

<p>The full code for the weather service is as follows:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/* eslint-disable  @typescript-eslint/no-explicit-any */</span>

<span class="k">import</span> <span class="p">{</span> <span class="nx">genkit</span><span class="p">,</span> <span class="nx">z</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkit</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">startFlowServer</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@genkit-ai/express</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">openAIO3Mini</span><span class="p">,</span> <span class="nx">github</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">genkitx-github</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span><span class="nx">OpenWeatherAPI</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">openweather-api-node</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="nx">dotenv</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">dotenv</span><span class="dl">'</span><span class="p">;</span>

<span class="nx">dotenv</span><span class="p">.</span><span class="nf">config</span><span class="p">();</span>

<span class="kd">const</span> <span class="nx">ai</span> <span class="o">=</span> <span class="nf">genkit</span><span class="p">({</span>
  <span class="na">plugins</span><span class="p">:</span> <span class="p">[</span>
    <span class="nf">github</span><span class="p">({</span> <span class="na">githubToken</span><span class="p">:</span> <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">GITHUB_TOKEN</span> <span class="p">}),</span>
  <span class="p">],</span>
  <span class="na">model</span><span class="p">:</span> <span class="nx">openAIO3Mini</span><span class="p">,</span>
<span class="p">});</span>

<span class="kd">const</span> <span class="nx">weatherToolInputSchema</span> <span class="o">=</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span> 
  <span class="na">location</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">describe</span><span class="p">(</span><span class="dl">'</span><span class="s1">The location to get the current weather for</span><span class="dl">'</span><span class="p">)</span>
<span class="p">});</span>

<span class="kd">const</span> <span class="nx">getWeather</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">defineTool</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">getWeather</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">description</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Gets the current weather in a given location</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">inputSchema</span><span class="p">:</span> <span class="nx">weatherToolInputSchema</span><span class="p">,</span>
    <span class="na">outputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="p">},</span>
  <span class="k">async </span><span class="p">(</span><span class="nx">input</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>

    <span class="kd">const</span> <span class="nx">weather</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">OpenWeatherAPI</span><span class="p">({</span>
        <span class="na">key</span><span class="p">:</span> <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">OPENWEATHER_API_KEY</span><span class="p">,</span>
        <span class="na">units</span><span class="p">:</span> <span class="dl">"</span><span class="s2">metric</span><span class="dl">"</span>
    <span class="p">})</span>

    <span class="kd">const</span> <span class="nx">data</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">weather</span><span class="p">.</span><span class="nf">getCurrent</span><span class="p">({</span><span class="na">locationName</span><span class="p">:</span> <span class="nx">input</span><span class="p">.</span><span class="nx">location</span><span class="p">});</span>

    <span class="k">return</span> <span class="s2">`The current weather in </span><span class="p">${</span><span class="nx">input</span><span class="p">.</span><span class="nx">location</span><span class="p">}</span><span class="s2"> is: </span><span class="p">${</span><span class="nx">data</span><span class="p">.</span><span class="nx">weather</span><span class="p">.</span><span class="nx">temp</span><span class="p">.</span><span class="nx">cur</span><span class="p">}</span><span class="s2"> Degrees in Celsius`</span><span class="p">;</span>
  <span class="p">}</span>
<span class="p">);</span>

<span class="kd">const</span> <span class="nx">helloFlow</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">defineFlow</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">helloFlow</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">inputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span> <span class="na">location</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">()</span> <span class="p">}),</span>
    <span class="na">outputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="p">},</span>
  <span class="k">async </span><span class="p">(</span><span class="nx">input</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>

    <span class="kd">const</span> <span class="nx">response</span>  <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
      <span class="na">tools</span><span class="p">:</span> <span class="p">[</span><span class="nx">getWeather</span><span class="p">],</span>
      <span class="na">prompt</span><span class="p">:</span> <span class="s2">`What's the weather in </span><span class="p">${</span><span class="nx">input</span><span class="p">.</span><span class="nx">location</span><span class="p">}</span><span class="s2">?`</span>
    <span class="p">});</span>

    <span class="k">return</span> <span class="nx">response</span><span class="p">.</span><span class="nx">text</span><span class="p">;</span>
  <span class="p">}</span>
<span class="p">);</span>

<span class="nf">startFlowServer</span><span class="p">({</span>
  <span class="na">flows</span><span class="p">:</span> <span class="p">[</span><span class="nx">helloFlow</span><span class="p">]</span>
<span class="p">});</span>
</code></pre></div></div>

<h2 id="setup--development">Setup &amp; Development</h2>

<ol>
  <li>Install dependencies:
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm <span class="nb">install</span>
</code></pre></div>    </div>
  </li>
  <li>Configure environment variables (for example in a <code class="language-plaintext highlighter-rouge">.env</code> file, which is loaded via <code class="language-plaintext highlighter-rouge">dotenv</code>):
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">GITHUB_TOKEN</span><span class="o">=</span>your_token
<span class="nv">OPENWEATHER_API_KEY</span><span class="o">=</span>your_key
</code></pre></div>    </div>
  </li>
  <li>Start development server:
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm run genkit:start
</code></pre></div>    </div>
  </li>
  <li>To run the project in debug mode and set breakpoints, you can run:
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm run genkit:start:debug
</code></pre></div>    </div>
    <p>And then launch the debugger in your IDE. See the <code class="language-plaintext highlighter-rouge">.vscode/launch.json</code> file for the configuration.</p>
  </li>
  <li>If you want to build the project, you can run:
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm run build
</code></pre></div>    </div>
  </li>
  <li>Run the project in production mode:
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm run start:production
</code></pre></div>    </div>
  </li>
</ol>

<h2 id="dependencies">Dependencies</h2>

<h3 id="core-dependencies">Core Dependencies</h3>
<ul>
  <li><code class="language-plaintext highlighter-rouge">genkit</code>: ^1.0.5</li>
  <li><code class="language-plaintext highlighter-rouge">@genkit-ai/express</code>: ^1.0.5</li>
  <li><code class="language-plaintext highlighter-rouge">openweather-api-node</code>: ^3.1.5</li>
  <li><code class="language-plaintext highlighter-rouge">genkitx-github</code>: ^1.13.1</li>
  <li><code class="language-plaintext highlighter-rouge">dotenv</code>: ^16.4.7</li>
</ul>

<h3 id="development-dependencies">Development Dependencies</h3>
<ul>
  <li><code class="language-plaintext highlighter-rouge">tsx</code>: ^4.19.2</li>
  <li><code class="language-plaintext highlighter-rouge">typescript</code>: ^5.7.2</li>
</ul>

<h2 id="project-configuration">Project Configuration</h2>

<ul>
  <li>Uses ES Modules (<code class="language-plaintext highlighter-rouge">"type": "module"</code>)</li>
  <li>TypeScript with <code class="language-plaintext highlighter-rouge">NodeNext</code> module resolution</li>
  <li>Output directory: lib</li>
  <li>Full TypeScript support with type definitions</li>
</ul>

<h2 id="license">License</h2>

<p>Apache 2.0</p>

<h2 id="resources">Resources</h2>

<ul>
  <li><a href="https://genkit.dev/">genkit</a></li>
  <li><a href="https://github.com/marketplace/models">GitHub Models</a></li>
  <li><a href="https://genkit.dev/docs/frameworks/express/">Genkit Express Plugin</a></li>
</ul>

<h2 id="conclusion">Conclusion</h2>

<p>This project demonstrates how to build an AI-powered weather service with Genkit in Node.js, combining a typed flow, a custom tool backed by the OpenWeather API, and a flow server for deployment.</p>

<p>You can find the full code of this example in the <a href="https://github.com/xavidop/genkit-node-tool-example">GitHub repository</a>.</p>

<p>Happy coding!</p>]]></content><author><name>Xavier Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><category term="genkit" /><category term="gcp" /><category term="firebase" /><category term="genkit" /><category term="gcp" /><summary type="html"><![CDATA[Building a weather service using Genkit in Node.js with AI integration]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://xavidop.me/assets/img/blog/post-headers/firebase-genkit-node-tool.png" /><media:content medium="image" url="https://xavidop.me/assets/img/blog/post-headers/firebase-genkit-node-tool.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry xml:lang="en"><title type="html">How Can You Link LLM Agent Metrics to Business Success (English)</title><link href="https://xavidop.me/gcp/2025-01-16-llm-metrics/" rel="alternate" type="text/html" title="How Can You Link LLM Agent Metrics to Business Success (English)" /><published>2025-01-16T00:00:00+00:00</published><updated>2025-01-16T19:25:40+00:00</updated><id>https://xavidop.me/gcp/llm-metrics</id><content type="html" xml:base="https://xavidop.me/gcp/2025-01-16-llm-metrics/"><![CDATA[<ol class="no_toc" id="markdown-toc">
  <li><a href="#evaluating-ai-agents-powered-by-llms" id="markdown-toc-evaluating-ai-agents-powered-by-llms">Evaluating AI Agents Powered by LLMs</a>    <ol>
      <li><a href="#key-metrics-for-llm-driven-ai-agents" id="markdown-toc-key-metrics-for-llm-driven-ai-agents">Key Metrics for LLM-Driven AI Agents</a>        <ol>
          <li><a href="#1-interaction-metrics" id="markdown-toc-1-interaction-metrics">1. Interaction Metrics</a></li>
          <li><a href="#2-intent-usage-metrics" id="markdown-toc-2-intent-usage-metrics">2. Intent Usage Metrics</a></li>
          <li><a href="#3-goal-achievement-metrics" id="markdown-toc-3-goal-achievement-metrics">3. Goal Achievement Metrics</a></li>
          <li><a href="#4-conversation-flow-metrics" id="markdown-toc-4-conversation-flow-metrics">4. Conversation Flow Metrics</a></li>
          <li><a href="#5-llm-specific-metrics" id="markdown-toc-5-llm-specific-metrics">5. LLM-Specific Metrics</a>            <ol>
              <li><a href="#a-llm-performance" id="markdown-toc-a-llm-performance">a. LLM Performance</a></li>
              <li><a href="#b-llm-safety" id="markdown-toc-b-llm-safety">b. LLM Safety</a></li>
              <li><a href="#c-llm-hallucination" id="markdown-toc-c-llm-hallucination">c. LLM Hallucination</a></li>
              <li><a href="#d-evaluation-frameworks" id="markdown-toc-d-evaluation-frameworks">d. Evaluation Frameworks</a></li>
            </ol>
          </li>
        </ol>
      </li>
      <li><a href="#combining-metrics-for-insights" id="markdown-toc-combining-metrics-for-insights">Combining Metrics for Insights</a></li>
      <li><a href="#tools-for-evaluation" id="markdown-toc-tools-for-evaluation">Tools for Evaluation</a></li>
      <li><a href="#whats-next" id="markdown-toc-whats-next">What’s Next?</a></li>
    </ol>
  </li>
</ol>

<h1 id="evaluating-ai-agents-powered-by-llms">Evaluating AI Agents Powered by LLMs</h1>

<p>The rise of AI agents powered by Large Language Models (LLMs) has transformed the way we interact with technology. However, effectively evaluating these agents requires a comprehensive approach that goes beyond traditional metrics. This guide outlines key categories and specific metrics essential for understanding and improving LLM-driven AI agent performance.</p>

<h2 id="key-metrics-for-llm-driven-ai-agents">Key Metrics for LLM-Driven AI Agents</h2>

<h3 id="1-interaction-metrics">1. Interaction Metrics</h3>
<ul>
  <li><strong>Number of Interactions</strong>: Tracks usage frequency (e.g., daily interactions, returning users). High numbers indicate user trust and the agent’s value.</li>
  <li><strong>Session Duration</strong>: Measures the average time users spend interacting in a single session. Longer durations suggest deeper engagement but may also indicate inefficiencies.</li>
  <li><strong>Conversation Turns</strong>: Counts user-agent exchanges per session. A balance is crucial – too few might indicate limited depth, while too many could suggest inefficiency.</li>
</ul>

<h3 id="2-intent-usage-metrics">2. Intent Usage Metrics</h3>
<ul>
  <li><strong>Top Intents Used</strong>: Identifies the most frequent user intents (e.g., FAQs, bookings). This information helps guide resource allocation and prioritization of enhancements.</li>
  <li><strong>Intent Completion Rate</strong>: Measures the percentage of intents successfully handled by the agent without human intervention.</li>
</ul>

<p>These metrics are particularly relevant when enhancing a classic NLU system with LLMs using RAG-style techniques over NLU artifacts. These techniques leverage:</p>
<ul>
  <li>Intents</li>
  <li>Descriptions of intents</li>
  <li>User utterances</li>
  <li>User conversation history</li>
  <li>Current context of the conversation</li>
</ul>
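<p>As a sketch of how those artifacts might be assembled into an intent-classification prompt, consider the following. The types and field names here are illustrative, not from any specific framework:</p>

```typescript
// Illustrative shape of an intent definition used for RAG-style NLU
interface IntentDef {
  name: string;
  description: string;
  sampleUtterances: string[];
}

// Assemble a classification prompt from the intent catalog, the
// conversation history, and the current utterance.
function buildNluPrompt(
  intents: IntentDef[],
  history: string[],
  utterance: string,
): string {
  const catalog = intents
    .map((i) => `- ${i.name}: ${i.description} (e.g. "${i.sampleUtterances[0]}")`)
    .join("\n");
  return [
    "Classify the user's utterance into one of these intents:",
    catalog,
    `Conversation so far: ${history.join(" | ")}`,
    `Utterance: "${utterance}"`,
  ].join("\n");
}
```

<p>Because the prompt embeds descriptions and sample utterances, the intent-usage metrics above double as a feedback loop: intents with low completion rates are candidates for better descriptions or more samples.</p>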

<h3 id="3-goal-achievement-metrics">3. Goal Achievement Metrics</h3>
<ul>
  <li><strong>Happy Path Achievement</strong>: Tracks how smoothly users reach desired outcomes with minimal friction.</li>
  <li><strong>Task Success Rate (TSR)</strong>: Measures the percentage of interactions that successfully achieve user goals without retries or human intervention.</li>
</ul>

<h3 id="4-conversation-flow-metrics">4. Conversation Flow Metrics</h3>
<ul>
  <li><strong>Funnel Analysis</strong>: Examines user journeys within a conversation, identifying drop-offs and common pathways.</li>
  <li><strong>Turn Efficiency</strong>: Evaluates the number of exchanges required to complete a task. Fewer turns generally indicate higher efficiency.</li>
</ul>
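<p>Several of the metrics above reduce to simple ratios over session logs. A minimal sketch, assuming a hypothetical log shape:</p>

```typescript
// Minimal session log shape (illustrative, not a real framework type)
interface SessionLog {
  turns: number;          // user-agent exchanges in the session
  goalAchieved: boolean;  // did the user reach the desired outcome?
  humanHandoff: boolean;  // was human intervention needed?
}

// Task Success Rate: share of sessions that met the goal without a handoff
function taskSuccessRate(logs: SessionLog[]): number {
  if (logs.length === 0) return 0;
  const ok = logs.filter((s) => s.goalAchieved && !s.humanHandoff).length;
  return ok / logs.length;
}

// Turn efficiency: average number of turns in successful sessions
function avgTurnsToSuccess(logs: SessionLog[]): number {
  const done = logs.filter((s) => s.goalAchieved);
  if (done.length === 0) return 0;
  return done.reduce((sum, s) => sum + s.turns, 0) / done.length;
}
```

<p>Tracking both together guards against a common failure mode: a rising TSR achieved only by dragging users through ever-longer conversations.</p>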

<h3 id="5-llm-specific-metrics">5. LLM-Specific Metrics</h3>
<h4 id="a-llm-performance">a. LLM Performance</h4>
<ul>
  <li><strong>Response Accuracy</strong>: Assessed through human evaluation and automated benchmarks using pre-labeled datasets.</li>
  <li><strong>Latency</strong>: Measures the average response time of the LLM. Low latency is crucial for user satisfaction.</li>
</ul>

<h4 id="b-llm-safety">b. LLM Safety</h4>
<ul>
  <li><strong>Safety Detection</strong>: Monitors the model’s ability to detect and mitigate harmful or inappropriate content.</li>
  <li><strong>Inappropriate Response Rate</strong>: Measures the frequency of unsafe or irrelevant responses.</li>
</ul>

<h4 id="c-llm-hallucination">c. LLM Hallucination</h4>
<ul>
  <li><strong>Hallucination Rate</strong>: Tracks how often the LLM generates inaccurate or fabricated information.</li>
  <li><strong>Critical Hallucination Impact (CHI)</strong>: Identifies instances where hallucinations lead to severe misunderstandings or negative outcomes.</li>
</ul>

<h4 id="d-evaluation-frameworks">d. Evaluation Frameworks</h4>
<ul>
  <li><strong>Human Ratings</strong>: Experts evaluate responses for relevance, coherence, and correctness.</li>
  <li><strong>Automated Evaluations</strong>: Metrics like BLEU, ROUGE, and BERTScore compare generated responses to ground truth answers.</li>
</ul>
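<p>As a toy illustration of what overlap-based scores like BLEU measure, here is a clipped unigram precision. Real BLEU adds higher-order n-grams and a brevity penalty; this only shows the core idea:</p>

```typescript
// Clipped unigram precision: fraction of candidate tokens that also
// appear in the reference, with matches capped by reference counts.
function unigramPrecision(candidate: string, reference: string): number {
  const candTokens = candidate.toLowerCase().split(/\s+/).filter(Boolean);
  const refCounts = new Map<string, number>();
  for (const t of reference.toLowerCase().split(/\s+/).filter(Boolean)) {
    refCounts.set(t, (refCounts.get(t) ?? 0) + 1);
  }
  let matched = 0;
  for (const t of candTokens) {
    const left = refCounts.get(t) ?? 0;
    if (left > 0) {
      matched++;
      refCounts.set(t, left - 1); // clip: each reference token matches once
    }
  }
  return candTokens.length === 0 ? 0 : matched / candTokens.length;
}
```

<p>The clipping step is what stops a degenerate response like "the the the" from scoring highly against a reference containing "the" once.</p>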

<h2 id="combining-metrics-for-insights">Combining Metrics for Insights</h2>
<p>By leveraging these metrics, you can gain valuable insights into the performance of your LLM agents and how users interact with them:</p>
<ul>
  <li><strong>Interaction and intent metrics</strong> provide insights into user engagement and core use cases.</li>
  <li><strong>Goal achievement and flow metrics</strong> refine user journeys and identify areas for improvement.</li>
  <li><strong>LLM-specific evaluations</strong> ensure safety, reliability, and accuracy.</li>
</ul>

<p>These metrics enable faster iteration: identifying gaps, improving current features, and introducing new capabilities.</p>

<h2 id="tools-for-evaluation">Tools for Evaluation</h2>
<p>Platforms like <strong>Voiceflow</strong> offer built-in evaluation metrics for AI agents, while advanced analytics tools like <strong>Feedback Intelligence</strong> can enhance your insights further.</p>

<h2 id="whats-next">What’s Next?</h2>
<p>In an upcoming article, we’ll explore a hands-on example of building an LLM agent, deploying it to production, and iterating on improvements using the metrics outlined above. Stay tuned!</p>]]></content><author><name>Xavier Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><category term="gcp" /><category term="googlecloud" /><summary type="html"><![CDATA[The rise of AI agents powered by Large Language Models (LLMs) has revolutionized how we interact with technology. However, effectively evaluating these agents requires a multifaceted approach that goes beyond traditional metrics]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://xavidop.me/assets/img/blog/post-headers/llm-metrics.jpg" /><media:content medium="image" url="https://xavidop.me/assets/img/blog/post-headers/llm-metrics.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry xml:lang="en"><title type="html">NLU powered by LLMs using genkit with GitHub Models (English)</title><link href="https://xavidop.me/genkit/gcp/2024-11-11-genkit-nlu/" rel="alternate" type="text/html" title="NLU powered by LLMs using genkit with GitHub Models (English)" /><published>2024-11-11T00:00:00+00:00</published><updated>2026-03-20T17:41:31+00:00</updated><id>https://xavidop.me/genkit/gcp/genkit-nlu</id><content type="html" xml:base="https://xavidop.me/genkit/gcp/2024-11-11-genkit-nlu/"><![CDATA[<ol class="no_toc" id="markdown-toc">
  <li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li>
  <li><a href="#setup" id="markdown-toc-setup">Setup</a>    <ol>
      <li><a href="#open-genkit-ui" id="markdown-toc-open-genkit-ui">Open Genkit UI</a></li>
      <li><a href="#run-the-firebase-emulator" id="markdown-toc-run-the-firebase-emulator">Run the Firebase emulator</a></li>
    </ol>
  </li>
  <li><a href="#configuration" id="markdown-toc-configuration">Configuration</a></li>
  <li><a href="#development" id="markdown-toc-development">Development</a>    <ol>
      <li><a href="#linting" id="markdown-toc-linting">Linting</a></li>
      <li><a href="#building" id="markdown-toc-building">Building</a></li>
    </ol>
  </li>
  <li><a href="#code-explanation" id="markdown-toc-code-explanation">Code Explanation</a></li>
  <li><a href="#prompt-definition" id="markdown-toc-prompt-definition">Prompt Definition</a></li>
  <li><a href="#usage" id="markdown-toc-usage">Usage</a>    <ol>
      <li><a href="#intents" id="markdown-toc-intents">Intents</a></li>
      <li><a href="#entities" id="markdown-toc-entities">Entities</a></li>
      <li><a href="#example" id="markdown-toc-example">Example</a></li>
    </ol>
  </li>
  <li><a href="#deploy" id="markdown-toc-deploy">Deploy</a></li>
  <li><a href="#resources" id="markdown-toc-resources">Resources</a></li>
  <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ol>

<h2 id="introduction">Introduction</h2>

<p>This project implements a Natural Language Understanding (NLU) flow using genkit AI and Firebase Functions. The NLU flow detects intents and extracts entities from a given text input.</p>

<p>This project uses GitHub Models via the Genkit GitHub Models plugin.</p>

<p>This project uses the following technologies:</p>
<ol>
  <li>Firebase Functions</li>
  <li>genkit</li>
  <li>GitHub Models</li>
</ol>

<p>This project uses the following Node.js Packages:</p>
<ol>
  <li><code class="language-plaintext highlighter-rouge">@genkit-ai/firebase</code>: Genkit Firebase SDK to be able to use Genkit in Firebase Functions</li>
  <li><code class="language-plaintext highlighter-rouge">genkitx-github</code>: Genkit GitHub plugin to be able to use GitHub Models in Genkit</li>
  <li><code class="language-plaintext highlighter-rouge">genkit</code>: Genkit AI Core SDK</li>
</ol>

<h2 id="setup">Setup</h2>

<ol>
  <li>Clone this repository: <a href="https://github.com/xavidop/genkit-nlu">GitHub repository</a>.</li>
  <li>Run <code class="language-plaintext highlighter-rouge">npm install</code> to install the dependencies in the functions folder</li>
  <li>Run <code class="language-plaintext highlighter-rouge">firebase login</code> to login to your Firebase account</li>
  <li>Install genkit-cli by running <code class="language-plaintext highlighter-rouge">npm install -g genkit</code></li>
</ol>

<p>This repo requires Node.js version 20.</p>

<h3 id="open-genkit-ui">Open Genkit UI</h3>

<p>Go to the functions folder and run <code class="language-plaintext highlighter-rouge">npm run genkit:start</code> to open the Genkit UI. The UI will be available at <code class="language-plaintext highlighter-rouge">http://localhost:4000</code>.</p>

<p class="figure"><img src="/assets/img/blog/tutorials/firebase-genkit-ollama/genaikitui.png" alt="Full-width image" class="lead" data-width="800" data-height="100" />
genkit UI</p>

<h3 id="run-the-firebase-emulator">Run the Firebase emulator</h3>

<p>To run the function locally, run <code class="language-plaintext highlighter-rouge">firebase emulators:start --inspect-functions</code>.</p>

<p>The emulator will be available at <code class="language-plaintext highlighter-rouge">http://localhost:4001</code></p>

<h2 id="configuration">Configuration</h2>

<ol>
  <li>
    <p>Ensure you have the necessary Firebase configuration files (<code class="language-plaintext highlighter-rouge">firebase.json</code>, <code class="language-plaintext highlighter-rouge">.firebaserc</code>).</p>
  </li>
  <li>
    <p>Update the <code class="language-plaintext highlighter-rouge">nlu/intents.yml</code> and <code class="language-plaintext highlighter-rouge">nlu/entities.yml</code> files with your intents and entities.</p>
  </li>
</ol>

<h2 id="development">Development</h2>

<h3 id="linting">Linting</h3>

<p>Run ESLint to check for code quality issues:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm run lint
</code></pre></div></div>

<h3 id="building">Building</h3>

<p>Compile the TypeScript code:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npm run build
</code></pre></div></div>

<h2 id="code-explanation">Code Explanation</h2>
<ul>
  <li>Configuration: The <code class="language-plaintext highlighter-rouge">genkit</code> function sets up the Genkit environment with the GitHub plugin, points the prompt directory at <code class="language-plaintext highlighter-rouge">prompts</code>, and sets GPT-4o as the default model. The log level is then set to “debug”.</li>
</ul>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">ai</span> <span class="o">=</span> <span class="nf">genkit</span><span class="p">({</span>
  <span class="na">plugins</span><span class="p">:</span> <span class="p">[</span><span class="nf">github</span><span class="p">()],</span>
  <span class="na">promptDir</span><span class="p">:</span> <span class="dl">'</span><span class="s1">prompts</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">model</span><span class="p">:</span> <span class="nx">openAIGpt4o</span>
<span class="p">});</span>
<span class="nx">logger</span><span class="p">.</span><span class="nf">setLogLevel</span><span class="p">(</span><span class="dl">'</span><span class="s1">debug</span><span class="dl">'</span><span class="p">);</span>
</code></pre></div></div>
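<p>For reference, the configuration above assumes imports along these lines. This is a sketch only — the exact export names and module paths depend on the <code class="language-plaintext highlighter-rouge">genkit</code> and <code class="language-plaintext highlighter-rouge">genkitx-github</code> versions listed in <code class="language-plaintext highlighter-rouge">package.json</code>:</p>

```typescript
// Assumed imports; verify against the package versions you have installed.
import { genkit } from "genkit";
import { logger } from "genkit/logging";
import { github, openAIGpt4o } from "genkitx-github";
```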

<ul>
  <li>Flow Definition: The <code class="language-plaintext highlighter-rouge">nluFlow</code> flow is defined using the <code class="language-plaintext highlighter-rouge">ai.defineFlow</code> function.
    <ul>
      <li>Configuration: The flow is named <code class="language-plaintext highlighter-rouge">nluFlow</code> and has input and output schemas defined using zod. The input schema expects an object with a text string, and the output schema is a string. The flow does not require authentication (noAuth).</li>
      <li>nluFlow: The flow processes the input:
        <ul>
          <li>Schema Definition: Defines an <code class="language-plaintext highlighter-rouge">nluOutput</code> schema with intent and entities.</li>
          <li>Prompt Reference: Gets a reference to the “nlu” dotprompt file.</li>
          <li>File Reading: Reads <code class="language-plaintext highlighter-rouge">intents.yml</code> and <code class="language-plaintext highlighter-rouge">entities.yml</code> files.</li>
          <li>Prompt Generation: Uses the <code class="language-plaintext highlighter-rouge">nluPrompt</code> to generate output based on the input text and the read intents and entities.</li>
          <li>Return Output: Returns the generated output with type <code class="language-plaintext highlighter-rouge">nluOutput</code>.</li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">export</span> <span class="kd">const</span> <span class="nx">nluFlow</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">defineFlow</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">nluFlow</span><span class="dl">"</span><span class="p">,</span>
    <span class="na">inputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span><span class="na">text</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">()}),</span>
    <span class="na">outputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
    <span class="na">authPolicy</span><span class="p">:</span> <span class="nf">noAuth</span><span class="p">(),</span> <span class="c1">// Not requiring authentication.</span>
  <span class="p">},</span>
  <span class="k">async </span><span class="p">(</span><span class="nx">toDetect</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="nx">nluOutput</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">defineSchema</span><span class="p">(</span>
      <span class="dl">"</span><span class="s2">nluOutput</span><span class="dl">"</span><span class="p">,</span>
      <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
        <span class="na">intent</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
        <span class="na">entities</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">map</span><span class="p">(</span><span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">()),</span>
      <span class="p">}),</span>
    <span class="p">);</span>

    <span class="kd">const</span> <span class="nx">nluPrompt</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nx">prompt</span><span class="o">&lt;</span>
                        <span class="nx">z</span><span class="p">.</span><span class="nx">ZodTypeAny</span><span class="p">,</span> <span class="c1">// Input schema</span>
                        <span class="k">typeof</span> <span class="nx">nluOutput</span><span class="p">,</span> <span class="c1">// Output schema</span>
                        <span class="nx">z</span><span class="p">.</span><span class="nx">ZodTypeAny</span> <span class="c1">// Custom options schema</span>
                      <span class="o">&gt;</span><span class="p">(</span><span class="dl">"</span><span class="s2">nlu</span><span class="dl">"</span><span class="p">);</span>

    <span class="kd">const</span> <span class="nx">intents</span> <span class="o">=</span> <span class="nf">readFileSync</span><span class="p">(</span><span class="dl">'</span><span class="s1">nlu/intents.yml</span><span class="dl">'</span><span class="p">,</span><span class="dl">'</span><span class="s1">utf8</span><span class="dl">'</span><span class="p">);</span>
    <span class="kd">const</span> <span class="nx">entities</span> <span class="o">=</span> <span class="nf">readFileSync</span><span class="p">(</span><span class="dl">'</span><span class="s1">nlu/entities.yml</span><span class="dl">'</span><span class="p">,</span><span class="dl">'</span><span class="s1">utf8</span><span class="dl">'</span><span class="p">);</span>

    <span class="kd">const</span> <span class="nx">result</span> <span class="o">=</span> <span class="k">await</span> <span class="nf">nluPrompt</span><span class="p">({</span>
        <span class="na">intents</span><span class="p">:</span> <span class="nx">intents</span><span class="p">,</span>
        <span class="na">entities</span><span class="p">:</span> <span class="nx">entities</span><span class="p">,</span>
        <span class="na">user_input</span><span class="p">:</span> <span class="nx">toDetect</span><span class="p">.</span><span class="nx">text</span><span class="p">,</span>
    <span class="p">});</span>

    <span class="k">return</span> <span class="nx">JSON</span><span class="p">.</span><span class="nf">stringify</span><span class="p">(</span><span class="nx">result</span><span class="p">.</span><span class="nx">output</span><span class="p">);</span>
  <span class="p">},</span>
<span class="p">);</span>

<span class="k">export</span> <span class="kd">const</span> <span class="nx">nluFunction</span> <span class="o">=</span> <span class="nf">onCallGenkit</span><span class="p">({</span>
  <span class="na">authPolicy</span><span class="p">:</span> <span class="p">()</span> <span class="o">=&gt;</span> <span class="kc">true</span><span class="p">,</span> <span class="c1">// Allow all users to call this function. Not recommended for production.</span>
<span class="p">},</span> <span class="nx">nluFlow</span><span class="p">);</span>
</code></pre></div></div>

<h2 id="prompt-definition">Prompt Definition</h2>

<p>This <code class="language-plaintext highlighter-rouge">nlu.prompt</code> file defines a prompt for a Natural Language Understanding (NLU) model. Here’s a breakdown of its components:</p>

<ol>
  <li><strong>Model Specification</strong>:
    <div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">model</span><span class="pi">:</span> <span class="s">github/gpt-4o</span>
</code></pre></div>    </div>
    <p>This specifies the LLM model to be used, in this case, <code class="language-plaintext highlighter-rouge">github/gpt-4o</code>.</p>
  </li>
  <li><strong>Input Schema</strong>:
    <div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">input</span><span class="pi">:</span>
  <span class="na">schema</span><span class="pi">:</span>
    <span class="na">intents</span><span class="pi">:</span> <span class="s">string</span>
    <span class="na">entities</span><span class="pi">:</span> <span class="s">string</span>
    <span class="na">user_input</span><span class="pi">:</span> <span class="s">string</span>
</code></pre></div>    </div>
    <p>This defines the input schema for the prompt. It expects three string inputs:</p>
    <ul>
      <li><code class="language-plaintext highlighter-rouge">intents</code>: A string representing the intents.</li>
      <li><code class="language-plaintext highlighter-rouge">entities</code>: A string representing the entities.</li>
      <li><code class="language-plaintext highlighter-rouge">user_input</code>: A string representing the user’s input text.</li>
    </ul>
  </li>
  <li><strong>Output Specification</strong>:
    <div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">output</span><span class="pi">:</span>
  <span class="na">format</span><span class="pi">:</span> <span class="s">json</span>
  <span class="na">schema</span><span class="pi">:</span> <span class="s">nluOutput</span>
</code></pre></div>    </div>
    <p>This defines the output format and schema. The output will be in JSON format and should conform to the <code class="language-plaintext highlighter-rouge">nluOutput</code> schema.</p>
  </li>
  <li><strong>Prompt Text</strong>:
    <div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>---
model: github/gpt-4o
input:
  schema:
    intents: string
    entities: string
    user_input: string
output:
  format: json
  schema: nluOutput
---
You are a NLU that detects intents and extract entities from a given text.

you have these intents and utterances:
{{intents}}
You also have these entities:
{{entities}}

The user says: {{user_input}}
Please specify the intent detected and the entity detected
</code></pre></div>    </div>

    <p>This is the actual prompt text that will be used by the model. It provides context and instructions to the model:</p>
    <ul>
      <li>It describes the role of the model as an NLU system.</li>
      <li>It includes placeholders (<code class="language-plaintext highlighter-rouge">{{intents}}</code>, <code class="language-plaintext highlighter-rouge">{{entities}}</code>, <code class="language-plaintext highlighter-rouge">{{user_input}}</code>) that will be replaced with the actual input values.</li>
      <li>It asks the model to specify the detected intent and entity based on the provided user input.</li>
    </ul>
  </li>
</ol>

<h2 id="usage">Usage</h2>

<p>The main NLU flow is defined in index.ts. It reads intents and entities from YAML files and uses a prompt defined in <code class="language-plaintext highlighter-rouge">nlu.prompt</code> to generate responses.</p>

<h3 id="intents">Intents</h3>
<p>The intents are defined in the <code class="language-plaintext highlighter-rouge">nlu/intents.yml</code> file. Each intent has a name and a list of training phrases.</p>

<p>As an example, the following intent is defined in the <code class="language-plaintext highlighter-rouge">nlu/intents.yml</code> file:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">order_shoes</span><span class="pi">:</span>
  <span class="na">utterances</span><span class="pi">:</span> 
    <span class="pi">-</span> <span class="s">I want a pair of shoes from {shoes_brand}</span>
    <span class="pi">-</span> <span class="s">a shoes from {shoes_brand}</span>
</code></pre></div></div>
<p>The format is as follows:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">intent-name</span><span class="pi">:</span>
  <span class="na">utterances</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">training phrase </span><span class="m">1</span>
    <span class="pi">-</span> <span class="s">training phrase </span><span class="m">2</span>
    <span class="pi">-</span> <span class="s">...</span>
</code></pre></div></div>

<h3 id="entities">Entities</h3>
<p>The entities are defined in the <code class="language-plaintext highlighter-rouge">nlu/entities.yml</code> file. Each entity has a name and a list of synonyms.</p>

<p>As an example, the following entity is defined in the <code class="language-plaintext highlighter-rouge">nlu/entities.yml</code> file:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">shoes_brand</span><span class="pi">:</span>
  <span class="na">examples</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">Puma</span>
    <span class="pi">-</span> <span class="s">Nike</span>
</code></pre></div></div>
<p>The format is as follows:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">entity-name</span><span class="pi">:</span>
  <span class="na">examples</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">synonym </span><span class="m">1</span>
    <span class="pi">-</span> <span class="s">synonym </span><span class="m">2</span>
    <span class="pi">-</span> <span class="s">...</span>
</code></pre></div></div>
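<p>Once parsed from YAML, the two files map onto small TypeScript shapes. The sketch below (all type and function names are hypothetical, not part of the project) shows one way to cross-check that every placeholder used in an utterance has a matching entity definition:</p>

```typescript
// Hypothetical shapes of nlu/intents.yml and nlu/entities.yml after YAML parsing.
interface IntentsFile {
  [intentName: string]: { utterances: string[] };
}
interface EntitiesFile {
  [entityName: string]: { examples: string[] };
}

// Collect every entity placeholder such as {shoes_brand} used in the utterances.
function placeholders(intents: IntentsFile): string[] {
  const found = new Set<string>();
  const re = /\{(\w+)\}/g;
  for (const intent of Object.values(intents)) {
    for (const utterance of intent.utterances) {
      let m: RegExpExecArray | null;
      while ((m = re.exec(utterance)) !== null) {
        found.add(m[1]);
      }
    }
  }
  return Array.from(found);
}

// Report placeholders that have no matching entity definition.
function undefinedEntities(intents: IntentsFile, entities: EntitiesFile): string[] {
  return placeholders(intents).filter((name) => !(name in entities));
}

const intents: IntentsFile = {
  order_shoes: {
    utterances: [
      "I want a pair of shoes from {shoes_brand}",
      "a shoes from {shoes_brand}",
    ],
  },
};
const entities: EntitiesFile = { shoes_brand: { examples: ["Puma", "Nike"] } };
```

<p>Running such a check before deploying catches typos like an utterance referencing <code class="language-plaintext highlighter-rouge">{shoe_brand}</code> while the entity file defines <code class="language-plaintext highlighter-rouge">shoes_brand</code>.</p>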

<h3 id="example">Example</h3>

<p>To trigger the NLU flow, send a request with the following structure:</p>
<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"text"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Your input text here"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>The response will be a JSON object with the following structure:</p>
<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"intent"</span><span class="p">:</span><span class="w"> </span><span class="s2">"intent-name"</span><span class="p">,</span><span class="w">
  </span><span class="nl">"entities"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"entity-name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"entity-value"</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
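<p>With the emulator running, the callable function can be invoked over HTTP. The endpoint below is an assumption — the exact URL, with your project id and region, is printed by <code class="language-plaintext highlighter-rouge">firebase emulators:start</code> — but the callable envelope, which wraps the payload in a <code class="language-plaintext highlighter-rouge">data</code> field, is the standard Firebase convention:</p>

```typescript
// Build the HTTP options for a Firebase callable function.
// Callable functions expect the JSON payload wrapped in a "data" field.
function buildRequest(text: string): { method: string; headers: Record<string, string>; body: string } {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ data: { text } }),
  };
}

// Hypothetical endpoint: substitute the project id and region printed
// by the Firebase emulator on startup.
const url = "http://localhost:4001/your-project-id/us-central1/nluFunction";

async function detectIntent(text: string): Promise<string> {
  const res = await fetch(url, buildRequest(text));
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  const body: { result: string } = await res.json();
  // The flow returns its output as a JSON string under "result".
  return body.result;
}
```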

<h2 id="deploy">Deploy</h2>

<p>To deploy the function, run <code class="language-plaintext highlighter-rouge">firebase deploy --only functions</code>.</p>

<h2 id="resources">Resources</h2>

<ul>
  <li><a href="https://genkit.dev/">genkit</a></li>
  <li><a href="https://github.com/marketplace/models">GitHub Models</a></li>
  <li><a href="https://firebase.google.com/docs/functions">Firebase Functions</a></li>
</ul>

<h2 id="conclusion">Conclusion</h2>

<p>As you can see, it is very easy to use Genkit and GitHub Models in Firebase Functions. You can use this example as a starting point to create your own functions using Genkit and GitHub Models.</p>

<p>You can find the full code of this example in the <a href="https://github.com/xavidop/genkit-nlu">GitHub repository</a></p>

<p>Happy coding!</p>]]></content><author><name>Xavier Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><category term="genkit" /><category term="gcp" /><category term="firebase" /><category term="genkit" /><category term="gcp" /><summary type="html"><![CDATA[The NLU powered by LLMs that just works]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://xavidop.me/assets/img/blog/post-headers/firebase-genkit-nlu.png" /><media:content medium="image" url="https://xavidop.me/assets/img/blog/post-headers/firebase-genkit-nlu.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry xml:lang="en"><title type="html">The Evolution of Conversational AI. Blending Determinism with Dynamism (English)</title><link href="https://xavidop.me/azure/gcp/2024-08-26-future-of-agents/" rel="alternate" type="text/html" title="The Evolution of Conversational AI. Blending Determinism with Dynamism (English)" /><published>2024-08-26T00:00:00+00:00</published><updated>2024-08-26T10:06:06+00:00</updated><id>https://xavidop.me/azure/gcp/future-of-agents</id><content type="html" xml:base="https://xavidop.me/azure/gcp/2024-08-26-future-of-agents/"><![CDATA[<ol class="no_toc" id="markdown-toc">
  <li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li>
  <li><a href="#the-need-for-determinism-in-conversational-ai" id="markdown-toc-the-need-for-determinism-in-conversational-ai">The Need for Determinism in Conversational AI</a></li>
  <li><a href="#the-role-of-dynamism-and-creativity" id="markdown-toc-the-role-of-dynamism-and-creativity">The Role of Dynamism and Creativity</a></li>
  <li><a href="#the-synergy-of-determinism-and-dynamism" id="markdown-toc-the-synergy-of-determinism-and-dynamism">The Synergy of Determinism and Dynamism</a></li>
  <li><a href="#large-language-models-and-the-future-of-conversational-ai" id="markdown-toc-large-language-models-and-the-future-of-conversational-ai">Large Language Models and the Future of Conversational AI</a></li>
  <li><a href="#resources" id="markdown-toc-resources">Resources</a></li>
  <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ol>

<h2 id="introduction">Introduction</h2>

<p>Conversational AI agents have come a long way from their early days of simple, scripted interactions. With the explosion of large language models (LLMs) like GPT-3, Gemini and beyond, the landscape of human-computer interaction is undergoing a significant transformation.</p>

<p>These AI agents are increasingly expected to mimic human-like interactions, which demands a delicate balance between deterministic (convergent) workflows and dynamic, creative responses (divergent). This dual approach is redefining how these agents function across various domains, including education, customer service, and personal assistance.</p>

<h2 id="the-need-for-determinism-in-conversational-ai">The Need for Determinism in Conversational AI</h2>

<p>At the core of any effective, complex conversational AI agent lies a structured, deterministic flow. Determinism ensures that the agent follows a predefined path (the happy path), making decisions based on set criteria, providing predictable outcomes, and accomplishing the goals of that specific agent. This is particularly crucial in educational contexts, such as language learning, where the agent must track a user’s progress and deliver content in a logical sequence.</p>

<p>Let’s consider an AI agent designed to teach Spanish to non-Spanish speakers. The learning process must be carefully structured to ensure that the student builds their knowledge incrementally. The agent needs to assess where the student is in their learning journey and determine the next appropriate step, whether it’s introducing new vocabulary, reinforcing grammar rules, or practicing conversation skills. In this scenario, the agent operates within a deterministic framework, governed by <strong>workflows</strong> that handle the control of its main actions.</p>

<p>Platforms like Voiceflow, Langflow, and Dialogflow CX are instrumental in building these deterministic workflows. They allow developers to design conversational experiences where each interaction is mapped out in advance. This ensures that the AI agent can reliably guide the user through the learning process, providing a sense of progression and achievement. In this deterministic mode, the agent is less about simulating human conversation and more about delivering structured, purposeful instruction.</p>

<h2 id="the-role-of-dynamism-and-creativity">The Role of Dynamism and Creativity</h2>

<p>However, not all interactions can or should be rigidly scripted. Human conversations are inherently dynamic, filled with nuances, unexpected questions, and shifts in context. To address this, modern conversational AI agents are increasingly incorporating LLMs to introduce a level of creativity and natural interaction that deterministic workflows cannot provide.</p>

<p>Returning to the example of the Spanish language learning agent, while the core lessons may be delivered through a deterministic flow, there comes a point where the conversation might need to become more fluid and adaptable. For instance, after completing a structured lesson, the agent could transition to a more open-ended conversation managed by an LLM. Here, the AI could engage the learner in a casual chat in Spanish, responding to any topic the learner introduces, providing explanations, and correcting mistakes in a manner that feels more like conversing with a native speaker than interacting with a machine.</p>

<p>This dynamic mode is powered by the LLM’s ability to generate responses that are contextually relevant and varied, offering a more natural and engaging user experience. It allows the AI to handle the unpredictable nature of human interaction, filling in gaps where deterministic systems might falter.</p>

<h2 id="the-synergy-of-determinism-and-dynamism">The Synergy of Determinism and Dynamism</h2>

<p>The future of conversational AI lies in the seamless integration of these two approaches. Deterministic workflows provide the structure necessary for achieving specific goals, ensuring that users are guided efficiently and effectively through complex processes. At the same time, the dynamism introduced by LLMs brings the interaction to life, making it more engaging and human-like.</p>

<p>In practical terms, this means designing AI agents that can transition smoothly between these modes. For example, an agent could start a conversation with a user in a deterministic mode, collecting necessary information and performing specific tasks. Once these tasks are complete, the agent could switch to a more dynamic mode, allowing the conversation to flow naturally and adapt to the user’s needs and preferences.</p>
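<p>In code, this hand-off can be as simple as a guard that keeps the agent in its deterministic workflow until the required information has been collected. A minimal sketch, with all names hypothetical:</p>

```typescript
type Mode = "deterministic" | "dynamic";

interface AgentState {
  mode: Mode;
  slots: Record<string, string>; // information collected so far
  required: string[];            // slots the deterministic phase must fill
}

// Stay in the deterministic workflow until every required slot is filled,
// then hand the conversation off to the open-ended LLM phase.
function nextMode(state: AgentState): Mode {
  const missing = state.required.filter((slot) => !(slot in state.slots));
  return missing.length > 0 ? "deterministic" : "dynamic";
}

const collecting: AgentState = {
  mode: "deterministic",
  slots: { topic: "ordering food" },
  required: ["topic", "level"],
};
const ready: AgentState = {
  mode: "deterministic",
  slots: { topic: "ordering food", level: "B1" },
  required: ["topic", "level"],
};
```

<p>The deterministic phase fills <code class="language-plaintext highlighter-rouge">slots</code>; once <code class="language-plaintext highlighter-rouge">nextMode</code> returns <code class="language-plaintext highlighter-rouge">dynamic</code>, control passes to the LLM-driven, open-ended conversation.</p>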

<p>The integration of these approaches can significantly enhance the user experience, making AI agents not just tools for completing tasks, but also companions capable of engaging, meaningful interactions. As these technologies continue to evolve, we can expect conversational AI to become even more sophisticated, blending the predictability of deterministic workflows with the creativity and adaptability of LLMs to create truly human-like interactions.</p>

<h2 id="large-language-models-and-the-future-of-conversational-ai">Large Language Models and the Future of Conversational AI</h2>

<p>Large Language Models have revolutionized the field of conversational AI, enabling agents to generate text that is indistinguishable from human writing. Beyond that, LLMs have been used to create a wide range of agents, from chatbots to assistants, tutors, and more. By giving them instructions and data, they can perform a wide range of tasks: for example, detecting the correctness, sentiment, or aggressiveness of a text. This is what we call <strong>Instructional AI</strong>, a new way to interact with LLMs and make them perform specific tasks in conversational AI applications.</p>

<p>An example has been developed to showcase these capabilities. This project implements a simple language correctness detector that detects grammatical errors, sentiment, aggressiveness, and provides solutions for the errors in the text. The project will be part of a bigger conversational AI agent that helps users improve their language skills.</p>

<p>This is an input/output example of the project:
<strong>Input:</strong></p>
<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{ "language": "Spanish", "text": "Yo soy enfadado" }
</code></pre></div></div>

<p><strong>Output:</strong></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
  result: {
    sentiment: 'angry',
    aggressiveness: 2,
    correctness: 7,
    errors: [
      "The correct form of the verb 'estar' should be used instead of 'ser' when expressing emotions or states."
    ],
    solution: 'Yo estoy enfadado',
    language: 'Spanish'
  }
}
</code></pre></div></div>

<p>The project can be found <a href="https://github.com/xavidop/langchain-language-correctness-detector">here</a>.</p>
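<p>The example output suggests a small result type that a larger agent could consume. Below is a sketch inferred from the output above — the type name, the 0–10 correctness scale, and the acceptance threshold are assumptions, not part of the project:</p>

```typescript
// Shape of the detector's result, inferred from the example output above.
interface CorrectnessResult {
  sentiment: string;
  aggressiveness: number; // low = calm
  correctness: number;    // grammar score; 0-10 scale assumed
  errors: string[];
  solution: string;
  language: string;
}

// Hypothetical helper: treat the text as acceptable above a chosen threshold.
function isAcceptable(result: CorrectnessResult, threshold = 8): boolean {
  return result.correctness >= threshold && result.errors.length === 0;
}

const sample: CorrectnessResult = {
  sentiment: "angry",
  aggressiveness: 2,
  correctness: 7,
  errors: [
    "The correct form of the verb 'estar' should be used instead of 'ser' " +
      "when expressing emotions or states.",
  ],
  solution: "Yo estoy enfadado",
  language: "Spanish",
};
```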

<h2 id="resources">Resources</h2>

<ul>
  <li><a href="https://medium.com/google-cloud/generative-ai-is-new-and-exciting-but-conversation-design-principles-are-forever-193371489f99">Generative AI is new and exciting but conversation design principles are forever from Alessia Sachi</a></li>
</ul>

<h2 id="conclusion">Conclusion</h2>

<p>The explosion of LLMs has brought a new dimension to conversational AI, making it possible to create agents that interact with humans in ways that are increasingly natural and engaging. However, the complexity of human interaction requires more than just creativity; it also demands structure and determinism. By combining deterministic workflows with the dynamic capabilities of LLMs, developers can create AI agents that not only perform tasks efficiently but also offer a level of interaction that feels genuinely human. Moreover, Instructional AI is a new way to interact with LLMs and make them perform specific tasks in conversational AI applications.</p>

<p>This blend of structure and creativity is the key to the next generation of conversational AI, where agents can guide, teach, and interact in ways that are both predictable and profoundly engaging.</p>

<p>Happy coding!</p>]]></content><author><name>Xavier Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><category term="azure" /><category term="gcp" /><category term="azure" /><category term="gcp" /><summary type="html"><![CDATA[Learn how the future of conversational AI lies in the seamless integration of deterministic workflows and dynamic, creative responses.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://xavidop.me/assets/img/blog/post-headers/future-of-agents.png" /><media:content medium="image" url="https://xavidop.me/assets/img/blog/post-headers/future-of-agents.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry xml:lang="en"><title type="html">Langchain Language Correctness Detector (English)</title><link href="https://xavidop.me/azure/2024-08-25-langchain-language-correctness-detector/" rel="alternate" type="text/html" title="Langchain Language Correctness Detector (English)" /><published>2024-08-25T00:00:00+00:00</published><updated>2024-11-11T09:04:23+00:00</updated><id>https://xavidop.me/azure/langchain-language-correctness-detector</id><content type="html" xml:base="https://xavidop.me/azure/2024-08-25-langchain-language-correctness-detector/"><![CDATA[<ol class="no_toc" id="markdown-toc">
  <li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li>
  <li><a href="#features" id="markdown-toc-features">Features</a></li>
  <li><a href="#stack-used" id="markdown-toc-stack-used">Stack Used</a></li>
  <li><a href="#installation" id="markdown-toc-installation">Installation</a></li>
  <li><a href="#usage" id="markdown-toc-usage">Usage</a></li>
  <li><a href="#code-explanation" id="markdown-toc-code-explanation">Code Explanation</a>    <ol>
      <li><a href="#imports-and-environment-setup" id="markdown-toc-imports-and-environment-setup">Imports and Environment Setup</a></li>
      <li><a href="#system-template-and-schema-definition" id="markdown-toc-system-template-and-schema-definition">System Template and Schema Definition</a></li>
      <li><a href="#prompt-template-and-model-selection" id="markdown-toc-prompt-template-and-model-selection">Prompt Template and Model Selection</a></li>
      <li><a href="#main-function" id="markdown-toc-main-function">Main Function</a></li>
    </ol>
  </li>
  <li><a href="#prompts-used-for-detecting-correctness" id="markdown-toc-prompts-used-for-detecting-correctness">Prompts Used for Detecting Correctness</a></li>
  <li><a href="#examples" id="markdown-toc-examples">Examples</a>    <ol>
      <li><a href="#openai" id="markdown-toc-openai">OpenAI</a></li>
      <li><a href="#gemini" id="markdown-toc-gemini">Gemini</a></li>
    </ol>
  </li>
  <li><a href="#license" id="markdown-toc-license">License</a></li>
  <li><a href="#contributing" id="markdown-toc-contributing">Contributing</a></li>
  <li><a href="#resources" id="markdown-toc-resources">Resources</a></li>
  <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ol>

<h2 id="introduction">Introduction</h2>

<p>This project implements a simple Langchain language correctness detector that detects grammatical errors, sentiment, aggressiveness, and provides solutions for the errors in the text.</p>

<h2 id="features">Features</h2>

<ul>
  <li>Detects grammatical errors in the text.</li>
  <li>Analyzes the sentiment of the text.</li>
  <li>Measures the aggressiveness of the text.</li>
  <li>Provides solutions for the detected errors.</li>
</ul>

<h2 id="stack-used">Stack Used</h2>

<ul>
  <li><strong>Node.js</strong>: JavaScript runtime environment.</li>
  <li><strong>TypeScript</strong>: Typed superset of JavaScript.</li>
  <li><strong>Langchain</strong>: Language processing library.</li>
  <li><strong>OpenAI API</strong>: For language model capabilities.</li>
  <li><strong>Google Cloud</strong>: For additional language processing services.</li>
</ul>

<h2 id="installation">Installation</h2>

<ol>
  <li>Clone the repository:
    <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code> git clone https://github.com/xavidop/langchain-example.git
 <span class="nb">cd </span>langchain-example
</code></pre></div>    </div>
  </li>
  <li>Install the dependencies:
    <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code> yarn <span class="nb">install</span>
</code></pre></div>    </div>
  </li>
  <li>Create a <code class="language-plaintext highlighter-rouge">.env</code> file in the root directory and add your OpenAI API key and Google Application credentials:
    <pre><code class="language-env"> OPENAI_API_KEY="your-openai-api-key"
 GOOGLE_APPLICATION_CREDENTIALS=credentials.json
 LLM_PROVIDER='OPENAI'
</code></pre>
  </li>
</ol>

<h2 id="usage">Usage</h2>

<ol>
  <li>Build the project:
    <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code> yarn run build
</code></pre></div>    </div>
  </li>
  <li>Start the application:
    <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code> yarn start
</code></pre></div>    </div>
  </li>
  <li>For development, you can use:
    <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code> yarn run dev
</code></pre></div>    </div>
  </li>
</ol>

<h2 id="code-explanation">Code Explanation</h2>

<h3 id="imports-and-environment-setup">Imports and Environment Setup</h3>
<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">ChatOpenAI</span><span class="p">,</span> <span class="nx">ChatOpenAICallOptions</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">"</span><span class="s2">@langchain/openai</span><span class="dl">"</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">ChatVertexAI</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">"</span><span class="s2">@langchain/google-vertexai</span><span class="dl">"</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">ChatPromptTemplate</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">"</span><span class="s2">@langchain/core/prompts</span><span class="dl">"</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">z</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">"</span><span class="s2">zod</span><span class="dl">"</span><span class="p">;</span>
<span class="k">import</span> <span class="o">*</span> <span class="kd">as </span><span class="nx">dotenv</span> <span class="k">from</span> <span class="dl">"</span><span class="s2">dotenv</span><span class="dl">"</span><span class="p">;</span>

<span class="c1">// Load environment variables from .env file</span>
<span class="nx">dotenv</span><span class="p">.</span><span class="nf">config</span><span class="p">();</span>
</code></pre></div></div>
<ul>
  <li><strong>Imports:</strong> The code imports necessary modules from Langchain, Zod for schema validation, and dotenv for environment variable management.</li>
  <li><strong>Environment Setup:</strong> Loads environment variables from a <code class="language-plaintext highlighter-rouge">.env</code> file.</li>
</ul>

<h3 id="system-template-and-schema-definition">System Template and Schema Definition</h3>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">systemTemplate</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">You are an expert in {language}, you have to detect grammar problems in sentences</span><span class="dl">"</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">classificationSchema</span> <span class="o">=</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span>
  <span class="na">sentiment</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">enum</span><span class="p">([</span><span class="dl">"</span><span class="s2">happy</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">neutral</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">sad</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">angry</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">frustrated</span><span class="dl">"</span><span class="p">]).</span><span class="nf">describe</span><span class="p">(</span><span class="dl">"</span><span class="s2">The sentiment of the text</span><span class="dl">"</span><span class="p">),</span>
  <span class="na">aggressiveness</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">number</span><span class="p">().</span><span class="nf">int</span><span class="p">().</span><span class="nf">min</span><span class="p">(</span><span class="mi">1</span><span class="p">).</span><span class="nf">max</span><span class="p">(</span><span class="mi">10</span><span class="p">).</span><span class="nf">describe</span><span class="p">(</span><span class="dl">"</span><span class="s2">How aggressive the text is on a scale from 1 to 10</span><span class="dl">"</span><span class="p">),</span>
  <span class="na">correctness</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">number</span><span class="p">().</span><span class="nf">int</span><span class="p">().</span><span class="nf">min</span><span class="p">(</span><span class="mi">1</span><span class="p">).</span><span class="nf">max</span><span class="p">(</span><span class="mi">10</span><span class="p">).</span><span class="nf">describe</span><span class="p">(</span><span class="dl">"</span><span class="s2">How grammatically correct the sentence is on a scale from 1 to 10</span><span class="dl">"</span><span class="p">),</span>
  <span class="na">errors</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">()).</span><span class="nf">describe</span><span class="p">(</span><span class="dl">"</span><span class="s2">The errors in the text. Specify the proper way to write the text and where it is wrong. Explain it in a human-readable way. Write each error in a separate string</span><span class="dl">"</span><span class="p">),</span>
  <span class="na">solution</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">describe</span><span class="p">(</span><span class="dl">"</span><span class="s2">The solution to the errors in the text. Write the solution in {language}</span><span class="dl">"</span><span class="p">),</span>
  <span class="na">language</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">().</span><span class="nf">describe</span><span class="p">(</span><span class="dl">"</span><span class="s2">The language the text is written in</span><span class="dl">"</span><span class="p">),</span>
<span class="p">});</span>
</code></pre></div></div>
<ul>
  <li><strong>System Template:</strong> Defines a template for the system message, indicating the language and the task of detecting grammar problems.</li>
  <li><strong>Classification Schema:</strong> Uses Zod to define a schema for the expected output, including sentiment, aggressiveness, correctness, errors, solution, and language.</li>
</ul>
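<p>To see what the schema enforces at runtime, here is a plain-TypeScript sketch (hypothetical, not part of the project) with an <code class="language-plaintext highlighter-rouge">isClassification</code> type guard mirroring the same checks Zod performs when parsing the model's structured output:</p>

```typescript
// Hypothetical plain-TypeScript mirror of the Zod classificationSchema,
// showing the shape the model's structured output must satisfy.
interface Classification {
  sentiment: "happy" | "neutral" | "sad" | "angry" | "frustrated";
  aggressiveness: number; // integer, 1-10
  correctness: number;    // integer, 1-10
  errors: string[];
  solution: string;
  language: string;
}

// Runtime check roughly equivalent to classificationSchema.safeParse(value).success
function isClassification(value: unknown): value is Classification {
  const v = value as Classification;
  const intInRange = (n: unknown) =>
    typeof n === "number" && Number.isInteger(n) && n >= 1 && n <= 10;
  return (
    typeof v === "object" && v !== null &&
    ["happy", "neutral", "sad", "angry", "frustrated"].includes(v.sentiment) &&
    intInRange(v.aggressiveness) &&
    intInRange(v.correctness) &&
    Array.isArray(v.errors) && v.errors.every((e) => typeof e === "string") &&
    typeof v.solution === "string" &&
    typeof v.language === "string"
  );
}
```

<p>In the real code Zod builds this validation (plus the <code class="language-plaintext highlighter-rouge">describe()</code> hints the LLM uses) from the schema declaration, so you never write a guard like this by hand.</p>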

<h3 id="prompt-template-and-model-selection">Prompt Template and Model Selection</h3>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">promptTemplate</span> <span class="o">=</span> <span class="nx">ChatPromptTemplate</span><span class="p">.</span><span class="nf">fromMessages</span><span class="p">([</span>
  <span class="p">[</span><span class="dl">"</span><span class="s2">system</span><span class="dl">"</span><span class="p">,</span> <span class="nx">systemTemplate</span><span class="p">],</span>
  <span class="p">[</span><span class="dl">"</span><span class="s2">user</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">{text}</span><span class="dl">"</span><span class="p">],</span>
<span class="p">]);</span>

<span class="kd">let</span> <span class="nx">model</span><span class="p">:</span> <span class="kr">any</span><span class="p">;</span>
<span class="k">if </span><span class="p">(</span><span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">LLM_PROVIDER</span> <span class="o">==</span> <span class="dl">"</span><span class="s2">OPENAI</span><span class="dl">"</span><span class="p">)</span> <span class="p">{</span>
  <span class="nx">model</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">ChatOpenAI</span><span class="p">({</span> 
    <span class="na">model</span><span class="p">:</span> <span class="dl">"</span><span class="s2">gpt-4</span><span class="dl">"</span><span class="p">,</span>
    <span class="na">temperature</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
  <span class="p">});</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
  <span class="nx">model</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">ChatVertexAI</span><span class="p">({</span> 
    <span class="na">model</span><span class="p">:</span> <span class="dl">"</span><span class="s2">gemini-1.5-pro-001</span><span class="dl">"</span><span class="p">,</span>
    <span class="na">temperature</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
  <span class="p">});</span>
<span class="p">}</span>
</code></pre></div></div>
<ul>
  <li><strong>Prompt Template:</strong> Creates a prompt template using the system message and user input.</li>
  <li><strong>Model Selection:</strong> Selects the language model based on the <code class="language-plaintext highlighter-rouge">LLM_PROVIDER</code> environment variable. It can be either OpenAI’s GPT-4 or Google’s Gemini 1.5 Pro on Vertex AI.</li>
</ul>

<h3 id="main-function">Main Function</h3>
<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">export</span> <span class="kd">const</span> <span class="nx">run</span> <span class="o">=</span> <span class="k">async </span><span class="p">()</span> <span class="o">=&gt;</span> <span class="p">{</span>
  <span class="kd">const</span> <span class="nx">llmWithStructuredOutput</span> <span class="o">=</span> <span class="nx">model</span><span class="p">.</span><span class="nf">withStructuredOutput</span><span class="p">(</span><span class="nx">classificationSchema</span><span class="p">,</span> <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">extractor</span><span class="dl">"</span><span class="p">,</span>
  <span class="p">});</span>

  <span class="kd">const</span> <span class="nx">chain</span> <span class="o">=</span> <span class="nx">promptTemplate</span><span class="p">.</span><span class="nf">pipe</span><span class="p">(</span><span class="nx">llmWithStructuredOutput</span><span class="p">);</span>

  <span class="kd">const</span> <span class="nx">result</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">chain</span><span class="p">.</span><span class="nf">invoke</span><span class="p">({</span> <span class="na">language</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Spanish</span><span class="dl">"</span><span class="p">,</span> <span class="na">text</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Yo soy enfadado</span><span class="dl">"</span> <span class="p">});</span>

  <span class="nx">console</span><span class="p">.</span><span class="nf">log</span><span class="p">({</span> <span class="nx">result</span> <span class="p">});</span>
<span class="p">};</span>

<span class="nf">run</span><span class="p">();</span>
</code></pre></div></div>

<ul>
  <li><strong>Structured Output:</strong> Configures the model to use the defined classification schema.</li>
  <li><strong>Pipeline:</strong> Creates a pipeline by combining the prompt template and the structured output model.</li>
  <li><strong>Invocation:</strong> Invokes the pipeline with a sample text in Spanish, and logs the result.</li>
</ul>
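<p>Conceptually, <code class="language-plaintext highlighter-rouge">.pipe()</code> is function composition: the prompt template's output becomes the model's input. Here is a minimal sketch of that idea, simplified to synchronous functions with a fake model standing in for the real LLM call (none of this is Langchain's actual implementation):</p>

```typescript
// Simplified, synchronous stand-in for Langchain's Runnable.pipe():
// each stage maps its input to the next stage's input.
type Stage<I, O> = (input: I) => O;

function pipeStages<I, M, O>(first: Stage<I, M>, second: Stage<M, O>): Stage<I, O> {
  return (input) => second(first(input));
}

// Stage 1: a toy prompt builder (stands in for promptTemplate)
const buildPrompt: Stage<{ language: string; text: string }, string> = ({ language, text }) =>
  `You are an expert in ${language}. Check this text: ${text}`;

// Stage 2: a fake model that "extracts" the language back out
// (stands in for the structured-output model call)
const fakeModel: Stage<string, { language: string }> = (prompt) => ({
  language: prompt.includes("Spanish") ? "Spanish" : "unknown",
});

const chain = pipeStages(buildPrompt, fakeModel);
const result = chain({ language: "Spanish", text: "Yo soy enfadado" });
// result.language === "Spanish"
```

<p>The real chain works the same way, except the stages are asynchronous and the second stage actually calls the LLM.</p>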

<h2 id="prompts-used-for-detecting-correctness">Prompts Used for Detecting Correctness</h2>

<p>The following prompts are used to detect the correctness of the text:</p>

<ol>
  <li><strong>Grammatical Errors</strong>:
    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> "Please check the following text for grammatical errors: {text}"
</code></pre></div>    </div>
  </li>
  <li><strong>Sentiment Analysis</strong>:
    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> "Analyze the sentiment of the following text: {text}"
</code></pre></div>    </div>
  </li>
  <li><strong>Aggressiveness Detection</strong>:
    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> "Measure the aggressiveness of the following text: {text}"
</code></pre></div>    </div>
  </li>
  <li><strong>Error Solutions</strong>:
    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> "Provide solutions for the errors found in the following text: {text}"
</code></pre></div>    </div>
  </li>
</ol>
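<p>In all of these prompts, placeholders such as <code class="language-plaintext highlighter-rouge">{text}</code> and <code class="language-plaintext highlighter-rouge">{language}</code> are substituted before the message reaches the model. As a rough illustration of that substitution step (a hypothetical helper, not Langchain's actual implementation):</p>

```typescript
// Hypothetical stand-in for the {placeholder} substitution that
// ChatPromptTemplate performs: unknown placeholders are left untouched.
function fillTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (match, key) =>
    key in values ? values[key] : match,
  );
}

const filled = fillTemplate(
  "Analyze the sentiment of the following {language} text: {text}",
  { language: "Spanish", text: "Yo soy enfadado" },
);
// "Analyze the sentiment of the following Spanish text: Yo soy enfadado"
```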

<h2 id="examples">Examples</h2>

<p>This project can be used with different language models to detect language correctness. Here are some examples using OpenAI and Gemini models.</p>

<h3 id="openai">OpenAI</h3>

<p>With OpenAI’s GPT-4 model, the system can detect grammatical errors, sentiment, and aggressiveness in the text.</p>

<p>Input:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{ language: "Spanish", text: "Yo soy enfadado" }
</code></pre></div></div>

<p>Output:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
  result: {
    sentiment: 'angry',
    aggressiveness: 2,
    correctness: 7,
    errors: [
      "The correct form of the verb 'estar' should be used instead of 'ser' when expressing emotions or states."
    ],
    solution: 'Yo estoy enfadado',
    language: 'Spanish'
  }
}
</code></pre></div></div>

<h3 id="gemini">Gemini</h3>

<p>With Google’s Vertex AI Gemini model, the output is quite similar:</p>

<p>Input:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{ language: "Spanish", text: "Yo soy enfadado" }
</code></pre></div></div>

<p>Output:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
  result: {
    sentiment: 'angry',
    aggressiveness: 1,
    correctness: 8,
    errors: [
      'The correct grammar is "estoy enfadado" because "ser" is used for permanent states and "estar" is used for temporary states. In this case, being angry is a temporary state.'
    ],
    solution: 'Estoy enfadado',
    language: 'Spanish'
  }
}
</code></pre></div></div>

<h2 id="license">License</h2>

<p>This project is licensed under the Apache License, Version 2.0. See the <a href="https://www.apache.org/licenses/LICENSE-2.0">LICENSE</a> file for more details.</p>

<h2 id="contributing">Contributing</h2>

<p>Contributions are welcome! Please open an issue or submit a pull request for any changes.</p>

<h2 id="resources">Resources</h2>

<ul>
  <li><a href="https://langchain.com/">Langchain</a></li>
  <li><a href="https://cloud.google.com/vertex-ai">Vertex AI</a></li>
  <li><a href="https://beta.openai.com/">Open AI API</a></li>
</ul>

<h2 id="conclusion">Conclusion</h2>

<p>This project demonstrates how to use Langchain to detect language correctness using different language models. By combining the system template, classification schema, prompt template, and language model, you can create a powerful language processing system. OpenAI and Gemini models provide accurate results for detecting grammatical errors, sentiment, and aggressiveness in the text.</p>

<p>You can find the full code of this example in the <a href="https://github.com/xavidop/langchain-language-correctness-detector">GitHub repository</a></p>

<p>Happy coding!</p>]]></content><author><name>Xavier Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><category term="azure" /><category term="azure" /><category term="gcp" /><summary type="html"><![CDATA[Learn how to detect grammatical errors, sentiment, and aggressiveness in text using Langchain and OpenAI or Google Cloud language models.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://xavidop.me/assets/img/blog/post-headers/langchain-language-correctness-detector.png" /><media:content medium="image" url="https://xavidop.me/assets/img/blog/post-headers/langchain-language-correctness-detector.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry xml:lang="en"><title type="html">Genkit with Gemma using Ollama (English)</title><link href="https://xavidop.me/genkit/gcp/2024-05-24-genkit-ollama/" rel="alternate" type="text/html" title="Genkit with Gemma using Ollama (English)" /><published>2024-05-24T00:00:00+00:00</published><updated>2026-03-20T17:41:31+00:00</updated><id>https://xavidop.me/genkit/gcp/genkit-ollama</id><content type="html" xml:base="https://xavidop.me/genkit/gcp/2024-05-24-genkit-ollama/"><![CDATA[<ol class="no_toc" id="markdown-toc">
  <li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li>
  <li><a href="#setup" id="markdown-toc-setup">Setup</a>    <ol>
      <li><a href="#open-genkit-ui" id="markdown-toc-open-genkit-ui">Open Genkit UI</a></li>
      <li><a href="#run-the-firebase-emulator" id="markdown-toc-run-the-firebase-emulator">Run the Firebase emulator</a></li>
      <li><a href="#run-gemma-with-ollama" id="markdown-toc-run-gemma-with-ollama">Run Gemma with Ollama</a></li>
    </ol>
  </li>
  <li><a href="#code-explanation" id="markdown-toc-code-explanation">Code explanation</a></li>
  <li><a href="#invoke-the-function-locally" id="markdown-toc-invoke-the-function-locally">Invoke the function locally</a></li>
  <li><a href="#deploy" id="markdown-toc-deploy">Deploy</a></li>
  <li><a href="#resources" id="markdown-toc-resources">Resources</a></li>
  <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ol>

<h2 id="introduction">Introduction</h2>

<p>This is a simple example of a Firebase function that uses Genkit and Ollama to translate any text to Spanish.</p>

<p>This project uses the following technologies:</p>
<ol>
  <li>Firebase Functions</li>
  <li>Genkit</li>
  <li>Ollama</li>
</ol>

<p>This project uses the following Node.js Packages:</p>
<ol>
  <li><code class="language-plaintext highlighter-rouge">genkitx-ollama</code>: Genkit Ollama plugin to be able to use Ollama in Genkit</li>
  <li><code class="language-plaintext highlighter-rouge">genkit</code>: Genkit AI Core SDK</li>
</ol>

<h2 id="setup">Setup</h2>

<ol>
  <li>Clone this repository: <a href="https://github.com/xavidop/firebase-genkit-ollama">GitHub repository</a>.</li>
  <li>Run <code class="language-plaintext highlighter-rouge">npm install</code> to install the dependencies in the functions folder</li>
  <li>Run <code class="language-plaintext highlighter-rouge">firebase login</code> to login to your Firebase account</li>
  <li>Install genkit-cli by running <code class="language-plaintext highlighter-rouge">npm install -g genkit</code></li>
</ol>

<p>This repo is intended to be used with Node.js version 20.</p>

<h3 id="open-genkit-ui">Open Genkit UI</h3>

<p>Go to the functions folder and run <code class="language-plaintext highlighter-rouge">npm run genkit:start</code> to open the Genkit UI. The UI will be available at <code class="language-plaintext highlighter-rouge">http://localhost:4000</code>.</p>

<p class="figure"><img src="/assets/img/blog/tutorials/firebase-genkit-ollama/genaikitui.png" alt="Full-width image" class="lead" data-width="800" data-height="100" />
Genkit UI</p>

<h3 id="run-the-firebase-emulator">Run the Firebase emulator</h3>

<p>To run the function locally, run <code class="language-plaintext highlighter-rouge">firebase emulators:start --inspect-functions</code>.</p>

<p>The emulator will be available at <code class="language-plaintext highlighter-rouge">http://localhost:4001</code></p>

<h3 id="run-gemma-with-ollama">Run Gemma with Ollama</h3>

<p>You will need to install Ollama by running <code class="language-plaintext highlighter-rouge">brew install ollama</code> and then run <code class="language-plaintext highlighter-rouge">ollama run gemma</code> to start the Ollama server running the Gemma LLM.</p>

<h2 id="code-explanation">Code explanation</h2>

<p>The code is in the <code class="language-plaintext highlighter-rouge">functions/index.ts</code> file. The function is called <code class="language-plaintext highlighter-rouge">translatorFlow</code> and it uses the Genkit SDK to translate any given text to Spanish.</p>

<p>First, we have to configure the Genkit SDK with the Ollama plugin:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">ai</span> <span class="o">=</span> <span class="nf">genkit</span><span class="p">({</span>
  <span class="na">plugins</span><span class="p">:</span> <span class="p">[</span>
    <span class="nf">ollama</span><span class="p">({</span>
      <span class="na">models</span><span class="p">:</span> <span class="p">[{</span> <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">gemma</span><span class="dl">'</span> <span class="p">}],</span>
      <span class="na">serverAddress</span><span class="p">:</span> <span class="dl">'</span><span class="s1">http://127.0.0.1:11434</span><span class="dl">'</span><span class="p">,</span> <span class="c1">// default ollama local address</span>
    <span class="p">}),</span>
  <span class="p">]</span>
<span class="p">});</span>
<span class="nx">logger</span><span class="p">.</span><span class="nf">setLogLevel</span><span class="p">(</span><span class="dl">'</span><span class="s1">debug</span><span class="dl">'</span><span class="p">);</span>
</code></pre></div></div>

<p>Then, we define the function; in Genkit these are called Flows. A Flow is a function with some additional characteristics: they are strongly typed, streamable, locally and remotely callable, and fully observable. Genkit provides CLI and Developer UI tooling for working with flows (running, debugging, etc.):</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">export</span> <span class="kd">const</span> <span class="nx">translatorFlow</span> <span class="o">=</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">defineFlow</span><span class="p">(</span>
  <span class="p">{</span>
    <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">translatorFlow</span><span class="dl">"</span><span class="p">,</span>
    <span class="na">inputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">object</span><span class="p">({</span> <span class="na">text</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">()</span> <span class="p">}),</span>
    <span class="na">outputSchema</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="nf">string</span><span class="p">(),</span>
  <span class="p">},</span>
  <span class="k">async </span><span class="p">(</span><span class="nx">toTranslate</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="nx">prompt</span> <span class="o">=</span>
      <span class="s2">`Translate this </span><span class="p">${</span><span class="nx">toTranslate</span><span class="p">.</span><span class="nx">text</span><span class="p">}</span><span class="s2"> to Spanish. Autodetect the language.`</span><span class="p">;</span>

    <span class="kd">const</span> <span class="nx">llmResponse</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">ai</span><span class="p">.</span><span class="nf">generate</span><span class="p">({</span>
      <span class="na">model</span><span class="p">:</span> <span class="dl">'</span><span class="s1">ollama/gemma</span><span class="dl">'</span><span class="p">,</span>
      <span class="na">prompt</span><span class="p">:</span> <span class="nx">prompt</span><span class="p">,</span>
      <span class="na">config</span><span class="p">:</span> <span class="p">{</span>
        <span class="na">temperature</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span>
      <span class="p">},</span>
    <span class="p">});</span>

    <span class="k">return</span> <span class="nx">llmResponse</span><span class="p">.</span><span class="nx">text</span><span class="p">;</span>
  <span class="p">}</span>
<span class="p">);</span>

<span class="k">export</span> <span class="kd">const</span> <span class="nx">translatedFunction</span> <span class="o">=</span> <span class="nf">onCallGenkit</span><span class="p">({</span>
  <span class="na">authPolicy</span><span class="p">:</span> <span class="p">()</span> <span class="o">=&gt;</span> <span class="kc">true</span><span class="p">,</span> <span class="c1">// Allow all users to call this function. Not recommended for production.</span>
<span class="p">},</span> <span class="nx">translatorFlow</span><span class="p">);</span>
</code></pre></div></div>

<p>As you can see above, we use Zod to define the input and output schemas of the flow. We then call the Genkit SDK&#8217;s <code class="language-plaintext highlighter-rouge">generate</code> function to produce the translation.</p>
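<p>To illustrate what the schemas buy us, here is a minimal, dependency-free sketch of the same validate-then-generate shape. Note this is not the actual Genkit or Zod API: <code class="language-plaintext highlighter-rouge">parseInput</code>, <code class="language-plaintext highlighter-rouge">fakeGenerate</code>, and <code class="language-plaintext highlighter-rouge">translatorFlowSketch</code> are hypothetical stand-ins.</p>

```typescript
// Minimal stand-in for the flow's input contract, { text: string }.
type TranslateInput = { text: string };

// Hypothetical validator doing roughly what z.object({ text: z.string() })
// does at the flow boundary: reject anything that does not match the schema.
function parseInput(raw: unknown): TranslateInput {
  if (
    typeof raw !== 'object' ||
    raw === null ||
    typeof (raw as { text?: unknown }).text !== 'string'
  ) {
    throw new Error('Invalid input: expected { text: string }');
  }
  return raw as TranslateInput;
}

// Stand-in for ai.generate(): returns a canned string instead of calling a model.
async function fakeGenerate(prompt: string): Promise<string> {
  return `model output for: ${prompt}`;
}

// Same shape as translatorFlow: validate the input, build the prompt, generate.
async function translatorFlowSketch(raw: unknown): Promise<string> {
  const input = parseInput(raw);
  const prompt = `Translate this ${input.text} to Spanish. Autodetect the language.`;
  return fakeGenerate(prompt);
}
```

<p>The point of the schema step is that a malformed payload fails fast at the boundary, before any model call is made.</p>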

<h2 id="invoke-the-function-locally">Invoke the function locally</h2>

<p>Now you can invoke the function by running <code class="language-plaintext highlighter-rouge">genkit flow:run translatorFlow '{"text":"hi"}'</code> in the terminal.</p>

<p>You can also call the function with curl by running <code class="language-plaintext highlighter-rouge">curl -X POST -H "Content-Type: application/json" -d '{"data": { "text": "hi" }}' http://127.0.0.1:5001/&lt;firebase-project&gt;/&lt;region&gt;/translatedFunction</code> in the terminal (callable functions expect a POST request with a JSON body).</p>

<p>For example:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;</span> curl <span class="nt">-X</span> GET <span class="nt">-H</span> <span class="s2">"Content-Type: application/json"</span> <span class="nt">-d</span> <span class="s1">'{"data": { "text": "hi" }}'</span> http://127.0.0.1:5001/action-helloworld/us-central1/translatedFunction
<span class="o">{</span><span class="s2">"result"</span>:<span class="s2">"Hola</span><span class="se">\n\n</span><span class="s2">The translation of </span><span class="se">\"</span><span class="s2">hi</span><span class="se">\"</span><span class="s2"> to Spanish is </span><span class="se">\"</span><span class="s2">Hola</span><span class="se">\"</span><span class="s2">."</span><span class="o">}</span>
</code></pre></div></div>

<p>You can also use Postman or any other HTTP client to send a POST request to the function:</p>

<p class="figure"><img src="/assets/img/blog/tutorials/firebase-genkit-ollama/postman.png" alt="Full-width image" class="lead" data-width="800" data-height="100" />
Postman Request</p>

<h2 id="deploy">Deploy</h2>

<p>To deploy the function, run <code class="language-plaintext highlighter-rouge">firebase deploy --only functions</code>. Before deploying, remember to change the <code class="language-plaintext highlighter-rouge">serverAddress</code> in the Ollama plugin configuration to the URL of a publicly reachable Ollama server, since the deployed function cannot reach the local instance on your machine.</p>
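<p>One way to avoid hard-coding the address is to read it from an environment variable. A small sketch, where the variable name <code class="language-plaintext highlighter-rouge">OLLAMA_SERVER_ADDRESS</code> is an assumption of this example, not a Genkit convention:</p>

```typescript
// Hypothetical helper: prefer an environment variable (assumed name
// OLLAMA_SERVER_ADDRESS) and fall back to the local Ollama default.
function resolveOllamaAddress(env: Record<string, string | undefined>): string {
  return env.OLLAMA_SERVER_ADDRESS ?? 'http://127.0.0.1:11434';
}

// In the plugin configuration this would become something like:
//   ollama({
//     models: [{ name: 'gemma' }],
//     serverAddress: resolveOllamaAddress(process.env),
//   })
```

<p>That way the same code runs locally against <code class="language-plaintext highlighter-rouge">127.0.0.1:11434</code> and in production against whatever server the deployment environment provides.</p>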

<h2 id="resources">Resources</h2>

<ul>
  <li><a href="https://genkit.dev/">genkit</a></li>
  <li><a href="https://ollama.com/">Ollama</a></li>
  <li><a href="https://firebase.google.com/docs/functions">Firebase Functions</a></li>
</ul>

<h2 id="conclusion">Conclusion</h2>

<p>As you can see, running Genkit with Ollama on Firebase Functions is straightforward. You can use this example as a starting point for building your own flows with Genkit and Ollama.</p>

<p>You can find the full code of this example in the <a href="https://github.com/xavidop/firebase-genkit-ollama">GitHub repository</a>.</p>

<p>Happy coding!</p>]]></content><author><name>Xavier Portilla Edo</name><email>xavierportillaedo@gmail.com</email></author><category term="genkit" /><category term="gcp" /><category term="firebase" /><category term="genkit" /><category term="gcp" /><summary type="html"><![CDATA[Firebase project that uses the Gen AI Kit with Gemma using Ollama]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://xavidop.me/assets/img/blog/post-headers/firebase-genkit-ollama.png" /><media:content medium="image" url="https://xavidop.me/assets/img/blog/post-headers/firebase-genkit-ollama.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>