SQL Queries From Natural Text in Python | OpenAI SQL Translate

Structured Query Language is the most popular language used for working with databases. There are different versions of SQL like MYSQL, PostgreSQL, Oracle, SQL lite etc. It is capable of performing all the operations that involve interaction with databases and is the most demanding skill in the IT industry.

What if we find a way to automatically generate the SQL queries?

Creating SQL queries from natural language has been an area of discussion for a long time. With advancements in the field of artificial intelligence, it has now become smart enough to create complex SQL queries from natural language. This makes accessing and manipulating data in databases very easy even for non-technical people.

In this article, we will use python code to convert text descriptions to SQL queries.

Text-to-SQL

Text to SQL is a process of converting a query in natural language into an SQL query.

How can Text-to-SQL be Useful?

Structured Query Language abbreviated as SQL is used to manage data present in a Relational Database Management System (RDBMS). RDBMS is a type of database management system in which data is stored in a tabular format.

NoSQL databases have more flexible functionality compared to traditional RDBMS but they still cannot support transactions. Therefore all industries are still dependent on RDBMS for storing transactional-related data.

Text-to-SQL query systems can create accurate SQL queries and ease the process of working with databases for non-technical people.

OpenAI SQL Translate

OpenAI has been working on natural language processing and consistently developing industry revolutionary tools. Have you heard about ChatGPT? It is the best chatbot available today. Learn about it from the complete guide on using OpenAI ChatGPT.

OpenAI SQL Translate is another tool by OpenAI that can be used to create efficient SQL queries from textual descriptions. It works well even with very complex SQL queries.

In the next section, we will create queries using prompt text using python code.

Python Code

import openai

openai.api_key = API-Key
response = openai.Completion.create(
    model='code-davinci-002',
    prompt='### SQL tables, with their properties: Left Join two tables Employee and Department on Employee_id Column',
    temperature=0,
    max_tokens=150,
    top_p=1.0,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    stop=['#', ';']
)

OpenAI Completion Parameters

Below is a detailed description of all the parameters used in the above code:

1. model 

  • Id of the model to be used.

2. prompt

  • Text description of the SQL query to be generated.

3. temperature

  • It defines the amount of risk the model will take.
  • Higher the temperature value, the more the risk.
  • More risk helps in creating more creative applications.
  • Low risk will create applications with well-defined answers.

4. max_tokens

  • The maximum number of tokens to be generated in the completion.

5. top_p

  • The model considers the result of the tokens with top_p probability mass.
  • For example, top_p=0.2 means the model will only consider the tokens comprising the top 20% probability mass.

6. frequency_penalty

  • Value lies between -2 and +2.
  • A positive value decreases the model’s likelihood to repeat the same line.
  • Positive values penalize new tokens based on their existing frequency in the text.

7. presence_penalty

  • The value lies between -2 and +2.
  • A positive value increases the model’s likelihood to talk about a new topic.

8. stop

  • The returned text will not contain a stop sequence.
  • The maximum length of the sequence can be 4.

Related Read: OpenAI Completion API

response['choices'][0]['text']

Output

SELECT * FROM Employee LEFT JOIN Department ON Employee.Employee_id = Department.Employee_id

Isn’t the result amazing? Let’s try SQL Translate from OpenAI to create more queries. Below are three prompts (natural text query) and their corresponding output.

Prompt 1

Write a query to fetch the EmpFname from the EmployeeInfo table in the upper case and use the ALIAS name as EmpName.

Output 1

SELECT UPPER(EmpFname) AS EmpName FROM EmployeeInfo

Prompt 2

Write a query to find the names of employees that begins with ‘S’.

Output 2

SELECT * FROM employees WHERE first_name LIKE ‘S%’

Prompt 3

Write a query to fetch details of all employees excluding the employees with first names, “Sanjay” and “Sonia” from the EmployeeInfo table.

Output 3

SELECT * FROM EmployeeInfo WHERE FirstName NOT IN (‘Sanjay’, ‘Sonia’)

OpenAI Playground

OpenAI provides a playground that lets you ask anything from an AI bot. It is free of cost, the only requirement is to make an account on OpenAI’s website.

In the below screenshot, you can see SQL query generated from the text on OpenAI’s platform.

Try OpenAI Playground from here.

End Notes

Thank you for reading this article till the end. I hope you found it informative.

Feel free to give your feedback or ask for any query in the comment section below. Happy Learning 🙂