Creating a Summarization Tool with OpenAI’s GPT-3: A Step-by-Step Guide

Sajid Hasan Sifat
8 min read · Jan 28, 2023

Unleashing the power of GPT-3 to automatically summarize large paragraphs with ease

Image from https://openai.com/

Introduction

Summarization is the process of reducing a large piece of text to its most important points. This can be useful for a variety of applications, such as news summarization, document summarization, and more. In this article, we will show you how to use the OpenAI API to create a summarization tool that can automatically summarize large pieces of text.

Setting up the OpenAI API

To use the OpenAI API, you will need to sign up for an account on the OpenAI website (https://beta.openai.com/signup/). Once you have created an account and logged in, you can generate an API key from the settings page. You will also need to add a payment method to your account in order to use the GPT-3 API.
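Try to keep the key out of your source code. One common pattern is to export it as an environment variable and read it in Python; the variable name OPENAI_API_KEY below is a convention the openai library also recognizes, and this is a minimal sketch rather than a required setup:

import os
import openai

# Assumes you have exported the key in your shell first, e.g.
#   export OPENAI_API_KEY="your-key-here"
openai.api_key = os.environ["OPENAI_API_KEY"]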

Installing the OpenAI library

To use the OpenAI API in your Python code, you will need to install the openai library. You can do this by running the following command in your command line:

pip install openai
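The code in this article uses the openai.Completion interface from the pre-1.0 versions of the library. If pip installs a newer release that no longer exposes it, you can pin an older version (this pin is my suggestion, not an OpenAI requirement):

pip install "openai<1.0"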

Creating the summarization tool

Once you have set up the OpenAI API and installed the openai library, you can start creating your summarization tool. Here is an example of how you can use the GPT-3 API to summarize a large paragraph.

Here is a sample paragraph (the code that follows refers back to this text):

"Up to the 1980s, most natural language processing systems were based on complex sets of hand-written rules. Starting in the late 1980s, however, there was a revolution in natural language processing with the introduction of machine learning algorithms for language processing. This was due to both the steady increase in computational power (see Moore’s law) and the gradual lessening of the dominance of Chomskyan theories of linguistics (e.g. transformational grammar), whose theoretical underpinnings discouraged the sort of corpus linguistics that underlies the machine-learning approach to language processing.[3] Some of the earliest-used machine learning algorithms, such as decision trees, produced systems of hard if-then rules similar to existing hand-written rules. However, part-of-speech tagging introduced the use of hidden Markov models to natural language processing, and increasingly, research has focused on statistical models, which make soft, probabilistic decisions based on attaching real-valued weights to the features making up the input data. The cache language models upon which many speech recognition systems now rely are examples of such statistical models. Such models are generally more robust when given unfamiliar input, especially input that contains errors (as is very common for real-world data), and produce more reliable results when integrated into a larger system comprising multiple subtasks.Many of the notable early successes occurred in the field of machine translation, due especially to work at IBM Research, where successively more complicated statistical models were developed. These systems were able to take advantage of existing multilingual textual corpora that had been produced by the Parliament of Canada and the European Union as a result of laws calling for the translation of all governmental proceedings into all official languages of the corresponding systems of government. However, most other systems depended on corpora specifically developed for the tasks implemented by these systems, which was (and often continues to be) a major limitation in the success of these systems. As a result, a great deal of research has gone into methods of more effectively learning from limited amounts of data. Recent research has increasingly focused on unsupervised and semi-supervised learning algorithms. Such algorithms can learn from data that has not been hand-annotated with the desired answers or using a combination of annotated and non-annotated data. Generally, this task is much more difficult than supervised learning, and typically produces less accurate results for a given amount of input data. However, there is an enormous amount of non-annotated data available (including, among other things, the entire content of the World Wide Web), which can often make up for the inferior results if the algorithm used has a low enough time complexity to be practical.In the 2010s, representation learning and deep neural network-style machine learning methods became widespread in natural language processing, due in part to a flurry of results showing that such techniques[4][5] can achieve state-of-the-art results in many natural language tasks, for example in language modeling,[6] parsing,[7][8] and many others. 
Popular techniques include the use of word embeddings to capture semantic properties of words, and an increase in end-to-end learning of a higher-level task (e.g., question answering) instead of relying on a pipeline of separate intermediate tasks (e.g., part-of-speech tagging and dependency parsing). In some areas, this shift has entailed substantial changes in how NLP systems are designed, such that deep neural network-based approaches may be viewed as a new paradigm distinct from statistical natural language processing. For instance, the term neural machine translation (NMT) emphasizes the fact that deep learning-based approaches to machine translation directly learn sequence-to-sequence transformations, obviating the need for intermediate steps such as word alignment and language modeling that was used in statistical machine translation (SMT)."
And here is the code that sends this paragraph to the API:

import openai

# Use your own API key
openai.api_key = "YOUR_API_KEY"

# The sample paragraph shown above, shortened here so it is not repeated in full
text = """Up to the 1980s, most natural language processing systems were based on complex sets of hand-written rules. Starting in the late 1980s, however, there was a revolution in natural language processing with the introduction of machine learning algorithms for language processing. [...] For instance, the term neural machine translation (NMT) emphasizes the fact that deep learning-based approaches to machine translation directly learn sequence-to-sequence transformations, obviating the need for intermediate steps such as word alignment and language modeling that was used in statistical machine translation (SMT)."""

# Use the GPT-3 engine to summarize the text
# (this calls the legacy Completions API of the openai library, version < 1.0)
response = openai.Completion.create(
    engine="text-davinci-002",
    prompt=f"Please summarize the following text: {text}",
    temperature=0.5,
    max_tokens=150,
)

# Print the summary
print(response.choices[0].text)

Output

The text discusses the history of natural language processing, with a focus on the shift from hand-written rules to machine learning algorithms in the late 1980s. This shift was due to both the increasing computational power of computers and the lessening dominance of Chomskyan theories of linguistics. The machine learning approach to language processing is more robust when given unfamiliar input, and produces more reliable results when integrated into a larger system. However, most machine learning systems for natural language processing depend on corpora specifically developed for the tasks they are designed to perform, which can be a major limitation.

This script uses the openai.Completion.create() function to send a request to the GPT-3 API along with the text to be summarized. The engine parameter specifies which GPT-3 model to use, the prompt parameter carries the instruction together with the text to be summarized, the temperature parameter controls how random the output is, and the max_tokens parameter caps the length of the generated summary.
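If you plan to reuse this in a larger script, it can help to wrap the call in a small function. The sketch below is one way to do that with the same legacy openai library; the function name summarize and its default arguments are my own choices, not part of the OpenAI API:

import openai

openai.api_key = "YOUR_API_KEY"

def summarize(text, model="text-davinci-002", max_tokens=150, temperature=0.5):
    """Return a short summary of `text` using the GPT-3 Completions API."""
    response = openai.Completion.create(
        engine=model,
        prompt=f"Please summarize the following text: {text}",
        temperature=temperature,
        max_tokens=max_tokens,
    )
    # The API returns a list of choices; take the first one and strip whitespace
    return response.choices[0].text.strip()

# Example usage:
# print(summarize(open("article.txt").read()))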

Conclusion

In this article, we showed you how to use the OpenAI API to create a summarization tool that can automatically summarize large pieces of text. The OpenAI API is a powerful tool for a variety of natural language processing tasks, and it is easy to use with the openai library. However, it is worth noting that while GPT-3 can produce a good summary, it might also include irrelevant information or generate a summary that is not coherent with the original text, depending on the complexity and type of the input. It is therefore important to review and check the generated summary before using it.
