Part 6: How to chat with SQL database with Langchain and OpenAI API
Previously in part four , I wrote on how to do Question and Answer on YouTube videos using OpenAI API in Python.
In this article, I will write about how to do question and answer on SQL database with OpenAI API using langchain.
DATASET
I am using data of pincodes of India from the following link
SQL Database
First I load into a sqlite database using following code. First download the csv file from above link and run the following code.
import sqlite3
# Connect to database
conn = sqlite3.connect('pincode.db')
cursor = conn.cursor()
#Create table
query = '''
CREATE TABLE Postal_Offices (
CircleName VARCHAR(255),
RegionName VARCHAR(255),
DivisionName VARCHAR(255),
OfficeName VARCHAR(255),
Pincode INTEGER,
OfficeType VARCHAR(255),
Delivery VARCHAR(255),
District VARCHAR(255),
StateName VARCHAR(255)
);
'''
cursor.execute(query)
#Read csv
import pandas as pd
df = pd.read_csv('Pincode_30052019.csv',encoding='ISO-8859-1')
df.columns =['CircleName', 'RegionName', 'DivisionName', 'OfficeName', 'Pincode',
'OfficeType', 'Delivery', 'District', 'StateName']
#Import the csv into database
df.to_sql('Postal_Offices', conn, if_exists='append', index=False)
conn.close()
Our CSV is loaded into SQL database. We can check and verify sql database by running queries.
LANGCHAIN
Next we first set our OpenAPI key
import os
os.environ["OPENAI_API_KEY"] = api_key
Then we import SqlDatabase Chain from Langchain library and initialize it with our pincode database
from langchain.llms import OpenAI
from langchain.utilities import SQLDatabase
from langchain_experimental.sql import SQLDatabaseChain
db = SQLDatabase.from_uri("sqlite:///pincode.db")
llm = OpenAI(temperature=0, verbose=True)
db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)
SQLDatabaseChain first sends our query to GPT 3.5 turbo API and converts it into SQL query. It runs the SQL Query and sends the output to GPT 3.5 turbo API and outputs the response of GPT 3.5 turbo.
It converted our natural language query into SQL Query and used GPT 3.5 turbo API to convert output of SQL into natural language again.
Let us see some other examples
We see SQLDatabaseChain is able to handle other types of queries too.
If we don’t want to permit Langchain to execute SQL query on its own then we can use create_sql_query_chain. Example is given below
from langchain.chains import create_sql_query_chain
from langchain.chat_models import ChatOpenAI
chain = create_sql_query_chain(ChatOpenAI(temperature=0), db)
response = chain.invoke({"question": "What is address of pincode 800020"})
print(response)
cursor.execute(response)
print(cursor.fetchall())
Tips for better performance
- We could pass which tables to use and sample rows to Langchain
db = SQLDatabase.from_uri(
"sqlite:///dbname.db",
include_tables=[
"Tablename"
], # we include only one table to save tokens in the prompt :)
sample_rows_in_table_info=2,
)
2. We can pass examples of inputs and outputs in prompt
Conclusion
We can chat with SQL database using OpenAI API. We could improve response by giving sample outputs in prompt. We can further extend our chain by integrating with another OpenAI call to generate plots in Python.
If you liked my article, please clap and subscribe.