Part 6: How to chat with SQL database with Langchain and OpenAI API

Rohit Raj
3 min readDec 17, 2023

--

Previously in part four , I wrote on how to do Question and Answer on YouTube videos using OpenAI API in Python.

In this article, I will write about how to do question and answer on SQL database with OpenAI API using langchain.

DATASET

I am using data of pincodes of India from the following link

SQL Database

First I load into a sqlite database using following code. First download the csv file from above link and run the following code.

import sqlite3
# Connect to database
conn = sqlite3.connect('pincode.db')
cursor = conn.cursor()

#Create table
query = '''
CREATE TABLE Postal_Offices (
CircleName VARCHAR(255),
RegionName VARCHAR(255),
DivisionName VARCHAR(255),
OfficeName VARCHAR(255),
Pincode INTEGER,
OfficeType VARCHAR(255),
Delivery VARCHAR(255),
District VARCHAR(255),
StateName VARCHAR(255)
);

'''
cursor.execute(query)
#Read csv
import pandas as pd
df = pd.read_csv('Pincode_30052019.csv',encoding='ISO-8859-1')
df.columns =['CircleName', 'RegionName', 'DivisionName', 'OfficeName', 'Pincode',
'OfficeType', 'Delivery', 'District', 'StateName']

#Import the csv into database
df.to_sql('Postal_Offices', conn, if_exists='append', index=False)
conn.close()

Our CSV is loaded into SQL database. We can check and verify sql database by running queries.

LANGCHAIN

Next we first set our OpenAPI key

import os 
os.environ["OPENAI_API_KEY"] = api_key

Then we import SqlDatabase Chain from Langchain library and initialize it with our pincode database

from langchain.llms import OpenAI
from langchain.utilities import SQLDatabase
from langchain_experimental.sql import SQLDatabaseChain
db = SQLDatabase.from_uri("sqlite:///pincode.db")
llm = OpenAI(temperature=0, verbose=True)
db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)

SQLDatabaseChain first sends our query to GPT 3.5 turbo API and converts it into SQL query. It runs the SQL Query and sends the output to GPT 3.5 turbo API and outputs the response of GPT 3.5 turbo.

It converted our natural language query into SQL Query and used GPT 3.5 turbo API to convert output of SQL into natural language again.

Let us see some other examples

We see SQLDatabaseChain is able to handle other types of queries too.

If we don’t want to permit Langchain to execute SQL query on its own then we can use create_sql_query_chain. Example is given below

from langchain.chains import create_sql_query_chain
from langchain.chat_models import ChatOpenAI

chain = create_sql_query_chain(ChatOpenAI(temperature=0), db)
response = chain.invoke({"question": "What is address of pincode 800020"})
print(response)
cursor.execute(response)
print(cursor.fetchall())

Tips for better performance

  1. We could pass which tables to use and sample rows to Langchain
db = SQLDatabase.from_uri(
"sqlite:///dbname.db",
include_tables=[
"Tablename"
], # we include only one table to save tokens in the prompt :)
sample_rows_in_table_info=2,
)

2. We can pass examples of inputs and outputs in prompt

Conclusion

We can chat with SQL database using OpenAI API. We could improve response by giving sample outputs in prompt. We can further extend our chain by integrating with another OpenAI call to generate plots in Python.

If you liked my article, please clap and subscribe.

--

--

Rohit Raj
Rohit Raj

Written by Rohit Raj

Studied at IIT Madras and IIM Indore. Love Data Science

Responses (1)