Xavier Portilla Edo

Firebase GenKit with Gemma using Ollama (English)

2024-05-24T00:00:00+00:00

Introduction
Setup
Code explanation
Invoke the function locally
Deploy
Resources
Conclusion

Introduction

This is a simple example of a Firebase function that uses Genkit and Ollama to translate any test to Spanish.

This project uses the following technologies:

Firebase Functions
Firebase Genkit
Ollama

This project uses the following Node.js Packages:

@genkit-ai/firebase: Genkit Firebase SDK to be able to use Genkit in Firebase Functions
genkitx-ollama: Genkit Ollama plugin to be able to use Ollama in Genkit
@genkit-ai/ai, @genkit-ai/core and @genkit-ai/flow: Genkit AI Core SDK
@genkit-ai/dotprompt: Plugin to use DotPrompt in Genkit

Setup

Clone this repository: GitHub repository.
Run npm install to install the dependencies in the functions folder
Run firebase login to login to your Firebase account
Install genkit-cli by running npm install -g genkit

This repo is supposed to be used with NodeJS version 20.

Run the Firebase emulator

To run the function locally, run GENKIT_ENV=dev firebase emulators:start --inspect-functions.

The emulator will be available at http://localhost:4000

Open Genkit UI

Go to the functions folder and run genkit start --attach http://localhost:3100 --port 4001 to open the Genkit UI. The UI will be available at http://localhost:4001.

Firebase Genkit UI

Run Gemma with Ollama

You will need to install Ollama by running brew install ollama and then run ollama run gemma to start the Ollama server running the Gemma LLM.

Code explanation

The code is in the functions/index.ts file. The function is called translatorFlow and it uses the Genkit SDK to translate any given text to Spanish.

First, we have to configure the Genkit SDK with the Ollama plugin:

configureGenkit({
  plugins: [
    firebase(),
    ollama({
      models: [{ name: 'gemma' }],
      serverAddress: 'http://127.0.0.1:11434', // default ollama local address
    }),
  ],
  logLevel: "debug",
  enableTracingAndMetrics: true,
});

Then, we define the function, in the Gen AI Kit they call it Flows. A Flow is a function with some additional characteristics: they are strongly typed, streamable, locally and remotely callable, and fully observable. Firebase Genkit provides CLI and Developer UI tooling for working with flows (running, debugging, etc):

export const translatorFlow = onFlow(
  {
    name: "translatorFlow",
    inputSchema: z.object({ text: z.string() }),
    outputSchema: z.string(),
    authPolicy: noAuth(), // Not requiring authentication, but you can change this. It is highly recommended to require authentication for production use cases.
  },
  async (toTranslate) => {
    const prompt =
      `Translate this ${toTranslate.text} to Spanish. Autodetect the language.`;

    const llmResponse = await generate({
      model: 'ollama/gemma',
      prompt: prompt,
      config: {
        temperature: 1,
      },
    });

    return llmResponse.text();
  }
);

As we saw above, we use Zod to define the input and output schema of the function. We also use the generate function from the Genkit SDK to generate the translation.

We also have disabled the authentication for this function, but you can change this by changing the authPolicy property:

firebaseAuth((user) => {
  if (!user.email_verified) {
    throw new Error('Verified email required to run flow');
  }
});

For the example above you will need to import the firebaseAuth function from the @genkit-ai/firebase/auth package:

import { firebaseAuth } from '@genkit-ai/firebase/auth';

Invoke the function locally

Now you can invoke the function by running genkit flow:run translatorFlow '{"text":"hi"}' in the terminal.

You can also make a curl command by running curl -X GET -H "Content-Type: application/json" -d '{"data": { "text": "hi" }}' http://127.0.0.1:5001///translatorFlow in the terminal.

For example:

> curl -X GET -H "Content-Type: application/json" -d '{"data": { "text": "hi" }}' http://127.0.0.1:5001/action-helloworld/us-central1/translatorFlow
{"result":"Hola\n\nThe translation of \"hi\" to Spanish is \"Hola\"."}

You can also use Postman or any other tool to make a GET request to the function:

Postman Request

Deploy

To deploy the function, run firebase deploy --only functions. You will need to change the ollama URL in the function to the URL of the Ollama server.

Resources

Conclusion

As you can see, it is very easy to use Genkit and Ollama in Firebase Functions. You can use this example as a starting point to create your own functions using Genkit and Ollama.

You can find the full code of this example in the GitHub repository

Happy coding!

Integrate Alexa with Voiceflow (English)

2023-11-21T00:00:00+00:00

Integrate Alexa with Voiceflow

Integrate Alexa with Voiceflow

Alexa has a lot of capabilities, but it is not easy to create a complex conversation. Voiceflow is a tool that allows you to create complex conversations with Alexa without writing code. This integration allows you to create a conversation in Voiceflow and then deploy it to Alexa.

Because of that, In this repository you will find a simple example of how to integrate Alexa with Voiceflow using the Alexa Skills Kit SDK for Node.js and calling the Voiceflow’s Dialog Manager API.

Prerequisites

You need to have an account on Voiceflow
You need to have an account on Alexa Developer
Node.js and npm/yarn installed on your computer

Voiceflow Project

On Voiceflow, you will need to create a project and create a conversation. You can follow the Voiceflow Quick Start to create a simple conversation. On Voiceflow the only thing that you have to care about is to design the conversation.

In this example, we are going to create a simple conversation that asks the user for information about pokemons. The conversation will be like this:

Voiceflow Conversation

NLU

Voiceflow has a built-in NLU, since we are going to call Voiceflow using the Dialog Manager API, we will need to design our NLU on Voiceflow and on Alexa.

Following the example, we are going to create an intent called info_intent and a slot called pokemon that will be filled with the name of the pokemon that the user wants to know about:

Voiceflow NLU

Dialog Manager API

The Dialog Manager API is a REST API that allows you to interact with Voiceflow. You can find the documentation here.

The DM API automatically creates and manages the conversation state. Identical requests to the DM API may produce different responses depending on your diagram’s logic and the previous request that the API received.

The DM API endpoints is:

https://general-runtime.voiceflow.com/state/user/{userID}/interact

There are different types of requests that can be sent. To see a list of all request types, check out the documentation for the action field below.

To start a conversation, you should send a launch request. Then, to pass in your user’s response, you should send a text request. If you have your own NLU matching, then you may want to directly send an intent request.

Here you have an example of a request:

curl --request POST \
     --url 'https://general-runtime.voiceflow.com/state/user/{userID}/interact?logs=off' \
     --header 'accept: application/json' \
     --header 'content-type: application/json' \
     --header 'Authorization: VF.DM.96ds3423ds9423fs87492fds79792gf343' \
     --data '
{
  "action": {
    "type": "launch"
  },
  "config": {
    "tts": false,
    "stripSSML": true,
    "stopAll": true,
    "excludeTypes": [
      "block",
      "debug",
      "flow"
    ]
  }
}
'

As you can see, you need to pass the userID and the Authorization header. The userID is the user id that you want to interact with. The Authorization header is the API key that you can find on the Voiceflow project settings.

You can find the Voiceflow project that I used for this example in voiceflow/project.vf.

Alexa Skill

To create an Alexa Skill you need to go to Alexa Developer and create a new skill. You can follow the Alexa Developer Console Quick Start to create a simple skill.

NLU

We will need to replicate the Voiceflow NLU (intents and entities) in our Alexa Skill:

Alexa NLU

As you can see, we are using the SearchQuery type. This type is used to get the user input and send it directly to Voiceflow. You can find more information about this type here.

Lambda Code

The Alexa Skill Code is going to be generic, that means that this Alexa Skill Code can be used with any Voiceflow project. To do that, we are going to implement a Lambda function that will call the Voiceflow Dialog Manager API. We are going to use the Alexa Skills Kit SDK for Node.js and Axios to call the API.

We will need to touch only 2 handlers, the LaunchRequestHandler and the ListenerIntentHandler. The LaunchRequestHandler will be used to start the conversation and the ListenerIntentHandler will be used to send the user input to Voiceflow.

Let’s start with the LaunchRequestHandler:

const LaunchRequestHandler = {
    canHandle(handlerInput) {
        return Alexa.getRequestType(handlerInput.requestEnvelope) === 'LaunchRequest';
    },
    async handle(handlerInput) {

        let chatID = Alexa.getUserId(handlerInput.requestEnvelope).substring(0, 8);
        const messages = await utils.interact(chatID, {type: "launch"});

        return handlerInput.responseBuilder
            .speak(messages.join(" "))
            .reprompt(messages.join(" "))
            .getResponse();
    }
};

This Handler is called when the skill is launched. We are going to get the user id and call the Voiceflow Dialog Manager API with the launch action. Then, we are going to return the response.

The following interactions are going to be handled by the ListenerIntentHandler:

const ListenerIntentHandler = {
    canHandle(handlerInput) {
        return Alexa.getRequestType(handlerInput.requestEnvelope) === 'IntentRequest'
    },
    async handle(handlerInput) {

        let chatID = Alexa.getUserId(handlerInput.requestEnvelope).substring(0, 8);
        const intent = Alexa.getIntentName(handlerInput.requestEnvelope);
        const entitiesDetected = utils.alexaDetectedEntities(handlerInput.requestEnvelope);

        const request = { 
            type: "intent", 
            payload: { 
                intent: {
                    name: intent
                },
                entities: entitiesDetected
            }
        };

        const messages = await utils.interact(chatID, request);

        return handlerInput.responseBuilder
            .speak(messages.join(" "))
            .reprompt(messages.join(" "))
            .getResponse();
    }
};

This Handler is called when the user says something. We are going to get the user input and call the Voiceflow Dialog Manager API with the intent action. Since the NLU Inferece is done by Alexa, we need to get the detected entities and the detected intents and send them to Voiceflow. Then, we are going to return the response.

To get the detected entities, we are going to use the following function:

module.exports.alexaDetectedEntities = function alexaDetectedEntities(alexaRequest) {
    let entities = [];
    const entitiesDetected = alexaRequest.request.intent.slots;
    for ( const entity of Object.values(entitiesDetected)) {
        entities.push({
            name: entity.name,
            value: entity.value
        });
    }
    return entities;
}

You can find the code of this function in lambda/utils.js.

Finally we have to make sure that we add the handlers to the skill:

exports.handler = Alexa.SkillBuilders.custom()
    .addRequestHandlers(
        LaunchRequestHandler,
        ListenerIntentHandler,
        HelpIntentHandler,
        CancelAndStopIntentHandler,
        FallbackIntentHandler,
        SessionEndedRequestHandler,
        IntentReflectorHandler)
    .addErrorHandlers(
        ErrorHandler)
    .withCustomUserAgent('sample/hello-world/v1.2')
    .lambda();

In the handlers above you can see that we are using a function called utils.interact. This function is going to call the Voiceflow Dialog Manager API. You can find the code of this function in lambda/utils.js:

const axios = require('axios');

const VF_API_KEY = "VF.DM.96ds3423ds9423fs87492fds79792gf343";

module.exports.interact = async function interact(chatID, request) {
    let messages = [];
    console.log(`request: `+JSON.stringify(request));

    const response = await axios({
        method: "POST",
        url: `https://general-runtime.voiceflow.com/state/user/${chatID}/interact`,
        headers: {
            Authorization: VF_API_KEY
        },
        data: {
            request
        }

    });

    for (const trace of response.data) {
        switch (trace.type) {
            case "text":
            case "speak":
                {
                    // remove break lines
                    messages.push(this.filter(trace.payload.message));
                    break;
                }
            case "end":
                {
                    messages.push("Bye!");
                    break;
                }
        }
    }

    console.log(`response: `+messages.join(","));
    return messages;
};

This function is going to return an array of messages. We are going to use this array to build the response. We have also added some code to remove the break lines and weird characters:

module.exports.filter = function filter(string) {
    string = string.replace(/\&#39;/g, '\'')
    string = string.replace(/(<([^>]+)>)/ig, "")
    string = string.replace(/\&/g, ' and ')
    string = string.replace(/[&\\#,+()$~%*?<>{}]/g, '')
    string = string.replace(/\s+/g, ' ').trim()
    string = string.replace(/ +(?= )/g,'')

	return string;
};

With this code, we have finished the Alexa Skill. You can find the code of the Lambda function in lambda/index.js.

Testing

Once you have created the Alexa Skill and the Voiceflow project, you can test it. To test it, you can use the Alexa Simulator or you can use a real device.

Following the example we were using, you can test the Alexa Skill with the following sentences to request information about pokemons:

Testing

Conclusion

As you can see, it is very easy to integrate Alexa with Voiceflow. You can create complex conversations with Voiceflow and then deploy them to Alexa. So your focus will be on the conversation and not on the code!

I hope you have enjoyed this tutorial.

You can find the code of this tutorial here.

Happy coding!

Podcast “Voztech”. Alexa Widgets, Dialogflow CX Playbooks y Últimas novedades ChatGPT

2023-10-31T00:00:00+00:00

Contenido

Contenido

En este décimo capítulo de la segunda temporada hablamos sobre las novedades dentro del mundo de Alexa, Dialogflow CX y OpenAI de este pasado mes. Hablamos largo y tendido de los novedades en el ecosistema Alexa:

APLWidgets
Account Linking usando voz
A/B Testing en voz
Rutinas en tus Skills

Posteriormente, hablamos sobre las novedades de Dialogflow CX:

CX Playgrounds, Qué son y cómo funcionan
Componentes reusables para Asistentes bancarios
Mejoras en detección de habla y reconocimiento del NLU

Finalmente, hablamos sobre las últimas novedades de ChatGPT y OpenAI:

Dalle-E 3 integrado en ChatGPT
Reconocimiento de imágenes
Interacción via voz

¿Te lo vas a perder? ¡Espero que te guste este episodio!

Podcast “Voztech”. Alexa LLM, Dialogflow CX Call Companion y Meta Connect recap

2023-09-29T00:00:00+00:00

Contenido

Contenido

En este noveno capítulo de la segunda temporada hablamos sobre las novedades dentro del mundo de Alexa, Dialogflow CX y Meta de este pasado mes. Hablamos largo y tendido de los novedades en el ecosistema Alexa:

Integración de las Custom Triggers para las Rutinas de Alexa.
Alexa Office Hours
Alexa LLM

Posteriormente, hablamos sobre las novedades de Dialogflow CX:

La nueva función FILTER
Export recursivo de Flows
Call Companion

Finalmente, hablamos sobre las últimas novedades del Meta Connect:

Meta AI
AI Stickers
Image Editing With AI
Quest 3 y Ray-Ban Meta smart glasses

¿Te lo vas a perder? ¡Espero que te guste este episodio!

Podcast “Voztech”. Alexa Skills ISP tips, IA Generetiva en Dialogflow CX y Voiceflow y Google Cloud Next

2023-08-27T00:00:00+00:00

Contenido

Contenido

En este octavo capítulo de la segunda temporada hablamos sobre las novedades dentro del mundo de Alexa, Dialogflow CX y Voiceflow de este pasado mes.

Hablamos largo y tendido de los novedades en el ecosistema Alexa:

Nuevo Alexa Slot AMAZON.AlphaNumeric.
Nueva Versión de APL 2023.2
Cómo utiliza labworks.io los ISP y las pruebas A/B para hacer crecer un negocio con Alexa.
4 tips para crear o mejorar tu skill con ISP.
Como Tinychef crea experiencia únicas
Monoceros Lab Spotlight!

Posteriormente, hablamos sobre las novedades de Dialogflow CX:

Nuevas Integraciones (Slack, telegram, Google Chat, etc.)
Nuevas analiticas
Intent split y Intent merge
Intent suggestions
IA Generativa en Dialogflow CX: Gen AI Fallback, Knowledge Base y Knowledge Base handler y Generators

Finalmente, hablamos sobre las últimas novedades de Vocieflow:

AI Blocks
Knowledge Base y Knowledge Base API
Gen AI Fallback
Autgeneracion de variantes, intents, slots

¿Te lo vas a perder? ¡Espero que te guste este episodio!

Podcast “Voztech”. AWS X-Ray en una Alexa Skill y Google IO Connect Amsterdam

2023-06-30T00:00:00+00:00

Contenido

Contenido

En este séptimo capítulo de la segunda temporada hablamos sobre las novedades dentro del mundo de Alexa y Dialogflow CX de este pasado mes. Hablamos largo y tendido de los novedades en el ecosistema Alexa:

Como configurar AWS X-RAY en tu Alexa Skill.
New Devs & Coffee series.
Tips para traducir y escalar tu Alexa Skill a diferentes idiomas, regiones y paises.
Nuevos Alexa Champions

Posteriormente, hablamos sobre las novedades de Dialogflow CX:

Flow-scoped parameters
New System Function.

Finalmente, os cuento mi experencia en el Google IO Connect Amsterdam:

Conocí a Kristopher Overholt, devrel en conversational AI en persona!
Las charlas y workshops a los que asistí.
Fuí nominado a best contributor of the year.

¿Te lo vas a perder? ¡Espero que te guste este episodio!

Podcast “Voztech”. Como idear una Alexa Skill y Especial Google IO

2023-05-25T00:00:00+00:00

Contenido

Contenido

En este sexto capítulo de la segunda temporada hablamos sobre las novedades dentro del mundo de Alexa y Dialogflow CX de este pasado mes. Hablamos largo y tendido de los novedades en el ecosistema Alexa:

Como configurar el entorno de desarrollo.
Como idear una Alexa Skill.

Finalmente hablamos sobre las novedades del Google I/O:

Palm 2.
Bard.
Generative Studio en Vertex AI.
GenAPP Builder.
PaLM API y MakerSuite.

¿Te lo vas a perder? ¡Espero que te guste este episodio!

Podcast “Voztech”. Amazon Chime en tus Alexa Skills y Flexible Webhooks en Dialogflow CX

2023-04-27T00:00:00+00:00

Contenido

Contenido

En este quinto capítulo de la segunda temporada hablamos sobre las novedades dentro del mundo de Alexa y Dialogflow CX de este pasado mes. Hablamos largo y tendido de los novedades en el ecosistema Alexa:

Alexa Widgets finalmente están disponibles para todos los desarrolladores. Explicamos que es el Alexa Radio Kit y su consola no-code. Integración de las Amazon Chime con Alexa Skills para hacer llamadas directamente desde tus dispositivos Alexa Finalmente hablamos sobre las novedades de Dialogflow CX:

Que son los flexible Webhooks y como usarlo en Dialogflow CX. Importar y Exportar intents. Importar y Exportar training phrases. ¿Te lo vas a perder? ¡Espero que te guste este episodio!

Podcast “Voztech”. Design Guide para nuestras Alexa Skills y Sentiment Analysis en Dialogflow CX

2023-03-30T00:00:00+00:00

Contenido

Contenido

En este cuarto capítulo de la segunda temporada hablamos sobre las novedades dentro del mundo de Alexa y Dialogflow CX de este pasado mes. Hablamos largo y tendido de los novedades en el ecosistema Alexa:

Consejos de depuración y rendimiento de la Web API de Alexa.
Alexa Design Guide
Monetización de tu Alexa Skill

Finalmente hablamos sobre las novedades de Dialogflow CX:

Que es el Sentiment Analysis y como usarlo en Dialogflow CX.
Que es un Flow y el cambio en el numero de flows por agente (de 20 a 50).
System functions, como y donde usarlas. ¿Te lo vas a perder? ¡Espero que te guste este episodio!

Podcast “Voztech”. Novedades y design tips para nuestras Alexa Skills y testing en Dialogflow CX

2023-02-25T00:00:00+00:00

Contenido

Contenido

En este tercer capítulo de la segunda temporada hablamos sobre las novedades dentro del mundo de Alexa y Dialogflow CX de este pasado mes. Hablamos largo y tendido de los novedades en el ecosistema Alexa:

Las 5 preguntas que te tienes que hacer si eres nuevo en el mundo de desarrollo de Alexa Skills.
5 cosas que toda aplicación de voz debe hacer
Nueva versión APL 2023.1

Finalmente hablamos sobre las novedades de Dialogflow CX y como testear nuestros agentes en la consola:

Entities disponibles en nuevos locales e idiomas disponibles en nuevas regiones de GCP.
Importar Flow.
Dialogflow CX Simulator en la consola.

¿Te lo vas a perder? ¡Espero que te guste este episodio!

Ya está disponible en todas las plataformas de podcasts.