LangChain Language Correctness Detector (English)
Learn how to detect grammatical errors, sentiment, and aggressiveness in text using LangChain with OpenAI or Google Cloud language models.
- Introduction
- Features
- Stack Used
- Installation
- Usage
- Code Explanation
- Prompts Used for Detecting Correctness
- Examples
- License
- Contributing
- Resources
- Conclusion
Introduction
This project implements a simple LangChain language correctness detector that finds grammatical errors, analyzes sentiment, rates aggressiveness, and proposes corrections for the errors it finds.
Features
- Detects grammatical errors in the text.
- Analyzes the sentiment of the text.
- Measures the aggressiveness of the text.
- Provides solutions for the detected errors.
Stack Used
- Node.js: JavaScript runtime environment.
- TypeScript: Typed superset of JavaScript.
- LangChain: Framework for building applications on top of language models.
- OpenAI API: For language model capabilities.
- Google Cloud: For additional language processing services.
Installation
- Clone the repository:
git clone https://github.com/xavidop/langchain-example.git
cd langchain-example
- Install the dependencies:
yarn install
- Create a .env file in the root directory and add your OpenAI API key and Google Application credentials:
OPENAI_API_KEY="your-openai-api-key"
GOOGLE_APPLICATION_CREDENTIALS=credentials.json
LLM_PROVIDER='OPENAI'
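A quick sanity check on these variables can catch a missing key before the first model call. The sketch below is a hypothetical helper, not part of the repository, and it assumes that `OPENAI_API_KEY` is the only variable the OpenAI path needs and `GOOGLE_APPLICATION_CREDENTIALS` the only one the Vertex AI path needs; other authentication setups may differ.

```typescript
// Hypothetical helper for illustration; the repository does not ship it.
type Env = Record<string, string | undefined>;

// Returns the names of variables that appear to be missing for the
// provider selected by LLM_PROVIDER.
function missingEnvVars(env: Env): string[] {
  // Mirror the app's branch: "OPENAI" selects OpenAI, anything else Vertex AI.
  const required =
    env["LLM_PROVIDER"] === "OPENAI"
      ? ["OPENAI_API_KEY"]
      : ["GOOGLE_APPLICATION_CREDENTIALS"];
  // Treat both undefined and empty strings as missing.
  return required.filter((name) => !env[name]);
}

// Example with an in-memory object standing in for process.env.
console.log(missingEnvVars({ LLM_PROVIDER: "OPENAI" }));
```

In a real run you would probably pass `process.env` after `dotenv.config()` and stop early if the returned list is not empty.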
Usage
- Build the project:
yarn run build
- Start the application:
yarn start
- For development, you can use:
yarn run dev
Code Explanation
Imports and Environment Setup
import { ChatOpenAI } from "@langchain/openai";
import { ChatVertexAI } from "@langchain/google-vertexai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { z } from "zod";
import * as dotenv from "dotenv";
// Load environment variables from .env file
dotenv.config();
- Imports: The code imports the necessary modules from LangChain, Zod for schema validation, and dotenv for environment variable management.
- Environment Setup: Loads environment variables from a .env file.
System Template and Schema Definition
const systemTemplate = "You are an expert in {language}, you have to detect grammar problems in sentences";
const classificationSchema = z.object({
  sentiment: z.enum(["happy", "neutral", "sad", "angry", "frustrated"]).describe("The sentiment of the text"),
  aggressiveness: z.number().int().min(1).max(10).describe("How aggressive the text is on a scale from 1 to 10"),
  correctness: z.number().int().min(1).max(10).describe("How the sentence is correct grammatically on a scale from 1 to 10"),
  errors: z.array(z.string()).describe("The errors in the text. Specify the proper way to write the text and where it is wrong. Explain it in a human-readable way. Write each error in a separate string"),
  solution: z.string().describe("The solution to the errors in the text. Write the solution in {language}"),
  language: z.string().describe("The language the text is written in"),
});
- System Template: Defines a template for the system message, indicating the language and the task of detecting grammar problems.
- Classification Schema: Uses Zod to define a schema for the expected output, including sentiment, aggressiveness, correctness, errors, solution, and language.
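To make the shape of the expected output concrete, here is a minimal sketch of the same fields written as a plain TypeScript interface with a hand-written check. It deliberately does not use Zod so that it stays dependency-free; `Classification`, `isScore`, and `isClassification` are hypothetical names introduced only for this illustration, and in the project itself the Zod schema does this job.

```typescript
// Plain-TypeScript mirror of the fields described above (no Zod involved).
const SENTIMENTS = ["happy", "neutral", "sad", "angry", "frustrated"] as const;

interface Classification {
  sentiment: (typeof SENTIMENTS)[number];
  aggressiveness: number; // integer from 1 to 10
  correctness: number; // integer from 1 to 10
  errors: string[];
  solution: string;
  language: string;
}

// Hypothetical helper: true when n is an integer between 1 and 10 inclusive.
const isScore = (n: unknown): n is number =>
  typeof n === "number" && Number.isInteger(n) && n >= 1 && n <= 10;

// Hypothetical guard: checks an unknown value against the shape above.
function isClassification(value: unknown): value is Classification {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const sentiment = v["sentiment"];
  const errors = v["errors"];
  return (
    typeof sentiment === "string" &&
    (SENTIMENTS as readonly string[]).includes(sentiment) &&
    isScore(v["aggressiveness"]) &&
    isScore(v["correctness"]) &&
    Array.isArray(errors) &&
    errors.every((e) => typeof e === "string") &&
    typeof v["solution"] === "string" &&
    typeof v["language"] === "string"
  );
}
```

A value with `aggressiveness: 11` or `sentiment: "bored"` would be rejected by this guard, which matches the constraints the schema declares.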
Prompt Template and Model Selection
const promptTemplate = ChatPromptTemplate.fromMessages([
  ["system", systemTemplate],
  ["user", "{text}"],
]);
let model: any;
// Strict comparison: only the exact string "OPENAI" selects OpenAI
if (process.env.LLM_PROVIDER === "OPENAI") {
  model = new ChatOpenAI({
    model: "gpt-4",
    temperature: 0,
  });
} else {
  model = new ChatVertexAI({
    model: "gemini-1.5-pro-001",
    temperature: 0,
  });
}
- Prompt Template: Creates a prompt template using the system message and user input.
- Model Selection: Selects the language model based on the LLM_PROVIDER environment variable: OpenAI’s GPT-4 when it is set to OPENAI, and Google’s Gemini 1.5 Pro on Vertex AI for any other value.
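The branching rule can be isolated from the LangChain classes to make its fall-through behavior explicit. This is a sketch only: `ModelConfig` and `pickModelConfig` are hypothetical names, and the function returns a plain settings object rather than constructing `ChatOpenAI` or `ChatVertexAI`.

```typescript
// Hypothetical types and helper, for illustration only.
interface ModelConfig {
  provider: "openai" | "vertexai";
  model: string;
  temperature: number;
}

function pickModelConfig(llmProvider: string | undefined): ModelConfig {
  if (llmProvider === "OPENAI") {
    return { provider: "openai", model: "gpt-4", temperature: 0 };
  }
  // Any other value, including undefined or a typo, falls through to Vertex AI.
  return { provider: "vertexai", model: "gemini-1.5-pro-001", temperature: 0 };
}

console.log(pickModelConfig("OPENAI").model);
console.log(pickModelConfig(undefined).model);
```

One consequence worth noting is that a misspelled value such as `openai` in lowercase silently selects Vertex AI; a stricter variant could throw on unrecognized values instead.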
Main Function
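export const run = async () => {
  // Ask the model to return data that matches classificationSchema
  const llmWithStructuredOutput = model.withStructuredOutput(classificationSchema, {
    name: "extractor",
  });
  // pipe() returns the chain synchronously, so no await is needed here
  const chain = promptTemplate.pipe(llmWithStructuredOutput);
  // Fill {language} and {text}, call the model, and log the parsed result
  const result = await chain.invoke({ language: "Spanish", text: "Yo soy enfadado" });
  console.log({ result });
};
run();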
- Structured Output: Configures the model to use the defined classification schema.
- Pipeline: Creates a pipeline by combining the prompt template and the structured output model.
- Invocation: Invokes the pipeline with a sample text in Spanish, and logs the result.
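To see what the invocation does with its `language` and `text` arguments, the following sketch imitates the placeholder substitution with plain string handling. It is a simplified stand-in, not LangChain's implementation: `formatMessages` and the abbreviated system template are assumptions made for the example.

```typescript
// Simplified stand-in for the prompt template's substitution step.
// This is NOT LangChain's code, only an illustration of the idea.
type Role = "system" | "user";
type Message = { role: Role; content: string };

// Abbreviated system template, shortened for the example.
const templates: [Role, string][] = [
  ["system", "You are an expert in {language}."],
  ["user", "{text}"],
];

function formatMessages(vars: Record<string, string | undefined>): Message[] {
  return templates.map(([role, template]) => ({
    role,
    // Replace each {name} with its value; unknown names are left untouched.
    content: template.replace(/\{(\w+)\}/g, (match, name: string) => vars[name] ?? match),
  }));
}

console.log(formatMessages({ language: "Spanish", text: "Yo soy enfadado" }));
```

The two resulting messages are what the model receives before the structured-output step constrains its reply to the schema.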
Prompts Used for Detecting Correctness
The following prompts are used to detect the correctness of the text:
- Grammatical Errors:
"Please check the following text for grammatical errors: {text}"
- Sentiment Analysis:
"Analyze the sentiment of the following text: {text}"
- Aggressiveness Detection:
"Measure the aggressiveness of the following text: {text}"
- Error Solutions:
"Provide solutions for the errors found in the following text: {text}"
Examples
This project can be used with different language models to detect language correctness. Here are some examples using OpenAI and Gemini models.
OpenAI
With OpenAI’s GPT-4 model, the system can detect grammatical errors, sentiment, and aggressiveness in the text.
Input:
{ language: "Spanish", text: "Yo soy enfadado" }
Output:
{
  result: {
    sentiment: 'angry',
    aggressiveness: 2,
    correctness: 7,
    errors: [
      "The correct form of the verb 'estar' should be used instead of 'ser' when expressing emotions or states."
    ],
    solution: 'Yo estoy enfadado',
    language: 'Spanish'
  }
}
Gemini
With Google’s Vertex AI Gemini model, the output is quite similar:
Input:
{ language: "Spanish", text: "Yo soy enfadado" }
Output:
{
  result: {
    sentiment: 'angry',
    aggressiveness: 1,
    correctness: 8,
    errors: [
      'The correct grammar is "estoy enfadado" because "ser" is used for permanent states and "estar" is used for temporary states. In this case, being angry is a temporary state.'
    ],
    solution: 'Estoy enfadado',
    language: 'Spanish'
  }
}
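Once a result object like the ones above is available, it can be turned into a short human-readable report. The sketch below is one possible way to do that; `DetectorResult` and `toReport` are hypothetical names, and the sample data is copied from the OpenAI output shown earlier.

```typescript
// Hypothetical shape matching the fields of the logged result.
interface DetectorResult {
  sentiment: string;
  aggressiveness: number;
  correctness: number;
  errors: string[];
  solution: string;
  language: string;
}

// Builds one line per field, with each error on its own numbered line.
function toReport(result: DetectorResult): string {
  const lines = [
    `Language: ${result.language}`,
    `Sentiment: ${result.sentiment} (aggressiveness ${result.aggressiveness}/10)`,
    `Correctness: ${result.correctness}/10`,
    ...result.errors.map((error, i) => `Error ${i + 1}: ${error}`),
    `Suggested fix: ${result.solution}`,
  ];
  return lines.join("\n");
}

// Sample data taken from the OpenAI example output.
const sample: DetectorResult = {
  sentiment: "angry",
  aggressiveness: 2,
  correctness: 7,
  errors: [
    "The correct form of the verb 'estar' should be used instead of 'ser' when expressing emotions or states.",
  ],
  solution: "Yo estoy enfadado",
  language: "Spanish",
};

console.log(toReport(sample));
```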
License
This project is licensed under the Apache License, Version 2.0. See the LICENSE file for more details.
Contributing
Contributions are welcome! Please open an issue or submit a pull request for any changes.
Resources
Conclusion
This project demonstrates how to use LangChain to detect language correctness with different language models. Combining the system template, classification schema, prompt template, and a language model yields a compact language-checking pipeline. In the examples above, both GPT-4 and Gemini flagged the ser/estar error and proposed equivalent corrections, differing only slightly in their aggressiveness and correctness scores.
You can find the full code of this example in the GitHub repository: https://github.com/xavidop/langchain-example
Happy coding!