This repository contains code to load a GPT-2 model, generate text, and run a Twitter bot that interacts with Twitter users when it is mentioned.
I fine-tuned two small GPT-2 models (124M parameters each) to generate Eminem lyrics and storytelling lyrics. The following samples are outputs of these models:
Eminem Bot Lyrics (@rap_god_bot)
Music Storytelling Bot Lyrics (@musicstorytell)
    sudo apt-get update
    sudo apt-get install python3
    sudo apt install python3-pip
Open the file run_twitter_bot.py and set your model file path and Twitter user information.
Modify the authentication files to match your user and keys. See the section below for more information on creating a Twitter developer account.
In 2019, OpenAI released a new set of language generation models: GPT-2. These large-scale unsupervised language models can generate coherent paragraphs of text while achieving state-of-the-art performance on many language modeling benchmarks.
For storage and memory reasons, I decided to fine-tune the smallest one (124M), even though the 355M model was generating more diverse outputs.
These are some resources that I used:
To download all the song lyrics that I used to fine-tune the GPT-2 model, I used a great library called LyricsGenius. This package offers a really clean interface to the Genius API and makes downloading lyrics easy.
    import lyricsgenius

    genius = lyricsgenius.Genius("my_client_access_token_here")
    artist = genius.search_artist("Eminem", max_songs=10, sort="title")
    print(artist.songs)
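Once the songs are downloaded, they need to be turned into a single text file for fine-tuning. The sketch below is one way to do that, not necessarily the approach used in this repository: it joins the lyrics with GPT-2's `<|endoftext|>` token as a document separator. The helper name `build_training_corpus` and the `FakeSong` stand-in are hypothetical; the code only assumes each song exposes `title` and `lyrics` attributes, as the objects in `artist.songs` do.

```python
from collections import namedtuple

# GPT-2's end-of-text token, commonly used to separate training documents
END_OF_TEXT = "<|endoftext|>"


def build_training_corpus(songs, separator=END_OF_TEXT):
    """Join the lyrics of several songs into one training string.

    `songs` is any iterable of objects with `title` and `lyrics`
    attributes. Songs with empty lyrics are skipped.
    """
    documents = []
    for song in songs:
        lyrics = (song.lyrics or "").strip()
        if lyrics:
            documents.append(lyrics)
    return f"\n{separator}\n".join(documents)


# Hypothetical stand-in for downloaded songs, so the sketch runs without an API token
FakeSong = namedtuple("FakeSong", ["title", "lyrics"])
demo_songs = [
    FakeSong("First", "line one\nline two\n"),
    FakeSong("Empty", ""),
    FakeSong("Second", "another line"),
]
corpus = build_training_corpus(demo_songs)
print(corpus)
```

With real data, `build_training_corpus(artist.songs)` could be written to a text file and passed to whichever fine-tuning script you use.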
Tweepy is an easy-to-use Python library for accessing and interacting with the Twitter API. Getting started is as simple as:
    import tweepy

    # Authenticate to Twitter
    auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
    auth.set_access_token("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")

    # Create API object
    api = tweepy.API(auth)

    # Create a tweet
    api.update_status("Hello Tweepy")
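Replying when the bot is mentioned can be sketched as a small polling step. This is a rough illustration, not the code in run_twitter_bot.py: the function `reply_to_mentions` and the `generate_reply` callback (which would wrap the GPT-2 model) are hypothetical names. It assumes an object with Tweepy's `mentions_timeline` and `update_status` methods, such as the `api` object above; whether those endpoints are available depends on your Tweepy version and Twitter API access level. A fake API is used here so the sketch runs without credentials.

```python
from types import SimpleNamespace


def reply_to_mentions(api, generate_reply, since_id=None):
    """Reply once to each new mention and return the newest tweet ID seen."""
    kwargs = {} if since_id is None else {"since_id": since_id}
    newest_id = since_id
    # Mentions arrive newest first; answer them oldest first
    for tweet in reversed(list(api.mentions_timeline(**kwargs))):
        api.update_status(
            status=generate_reply(tweet.text),
            in_reply_to_status_id=tweet.id,
            auto_populate_reply_metadata=True,
        )
        newest_id = tweet.id if newest_id is None else max(newest_id, tweet.id)
    return newest_id


class FakeAPI:
    """Hypothetical stand-in for tweepy.API, for illustration only."""

    def __init__(self, mentions):
        self.mentions = mentions
        self.sent = []

    def mentions_timeline(self, since_id=None):
        return [t for t in self.mentions if since_id is None or t.id > since_id]

    def update_status(self, status, in_reply_to_status_id=None,
                      auto_populate_reply_metadata=False):
        self.sent.append((in_reply_to_status_id, status))


fake_api = FakeAPI([
    SimpleNamespace(id=12, text="@rap_god_bot drop a verse"),
    SimpleNamespace(id=11, text="@rap_god_bot hello"),
])
last_seen = reply_to_mentions(fake_api, lambda text: "reply to: " + text)
print(last_seen, fake_api.sent)
```

Passing the returned ID back in as `since_id` on the next poll keeps the bot from answering the same mention twice.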
I deployed the models on Amazon EC2 Free Tier instances. Even though these were the smallest GPT-2 models, they did not fit in RAM. The solution I opted for was to create swap space on the system.
Swap is space on a disk that is used when physical RAM is full. When a Linux system runs out of RAM, inactive pages are moved from RAM to the swap space.
I used the following commands to allocate 2 GB of swap:
    # Create a file which will be used for swap
    sudo fallocate -l 2G /swapfile

    # Set the correct permissions
    sudo chmod 600 /swapfile

    # Set up a Linux swap area
    sudo mkswap /swapfile

    # Enable the swap
    sudo swapon /swapfile

    # Verify the swap status
    sudo swapon --show
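To see from Python how much of the swap is actually in use while the model is loaded, one option is to read `/proc/meminfo`, which Linux exposes as plain text. The following is a minimal sketch; the helper names `parse_meminfo` and `swap_summary` are hypothetical, and the sample numbers are illustrative rather than measured on the EC2 instance.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of integers (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_summary(meminfo):
    """Return total and used swap in kB from a parsed meminfo dict."""
    total = meminfo.get("SwapTotal", 0)
    free = meminfo.get("SwapFree", 0)
    return {"total_kb": total, "used_kb": total - free}


# Illustrative sample; on a real machine use open("/proc/meminfo").read()
SAMPLE = """MemTotal:        1009140 kB
MemFree:           72340 kB
SwapTotal:       2097148 kB
SwapFree:        1572860 kB
HugePages_Total:       0
"""
summary = swap_summary(parse_meminfo(SAMPLE))
print(summary)
```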
Check this blogpost for more information.
I wrote a function that creates an image and draws text on it using PIL. The `text` argument is an iterable of strings, each drawn on its own line.
    import numpy as np
    from PIL import Image, ImageDraw, ImageFont


    def print_text_in_image(text, font_path='Pillow/Tests/fonts/FreeMono.ttf',
                            image_color=(255, 255, 225)):
        # Create a blank image filled with the background color; the RGB
        # tuple broadcasts over the last (channel) axis
        image = np.uint8(np.ones((1100, 1000, 3)) * image_color)

        # Create a PIL Image (RGB is inferred from the uint8 H x W x 3 array)
        pil_image = Image.fromarray(image)
        font = ImageFont.truetype(font_path, 40)

        # Get a drawing context
        d = ImageDraw.Draw(pil_image)

        # Margins
        vertical_coord = 50
        horizontal_margin = 50

        # Draw each sentence in black, one per line
        for sentence in text:
            d.text((horizontal_margin, vertical_coord), sentence, font=font, fill=(0, 0, 0))
            vertical_coord += 40

        return pil_image
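Because `print_text_in_image` draws one string per line and does no wrapping, generated text has to be split into lines that fit the 1000x1100 canvas first. The helper below is a hypothetical sketch of that step using the standard `textwrap` module. The limits are estimates, not values from the repository: about 37 characters per line assumes a monospaced glyph roughly 24 px wide at size 40 inside the 50 px margins, and 25 lines assumes the 40 px line step shown above.

```python
import textwrap


def split_into_lines(generated_text, max_chars=37, max_lines=25):
    """Split text into lines short enough to draw, keeping existing line breaks."""
    lines = []
    for paragraph in generated_text.splitlines():
        # textwrap.wrap returns [] for blank paragraphs, so empty lines are dropped
        lines.extend(textwrap.wrap(paragraph, width=max_chars))
    # Discard anything that would run past the bottom of the image
    return lines[:max_lines]


sample = "first short line\n" + "word " * 12
lines = split_into_lines(sample)
print(lines)
```

The result can be passed directly as the `text` argument, for example `print_text_in_image(split_into_lines(generated_text))`.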
I wanted an automatic e-mail notification every time the bot goes down. I ended up using SMTP_SSL and a Gmail account:
    import ssl
    import json
    import smtplib


    class EmailSender:
        def __init__(self, authentication_json_path):
            with open(authentication_json_path, 'r') as f:
                authentication_params = json.load(f)

            self.password = authentication_params['PASSWORD']
            self.email = authentication_params['EMAIL']

            # Port for SSL
            self.port = 465

            # Create a secure SSL context
            self.context = ssl.create_default_context()

        def send_email(self, email_receiver, message):
            with smtplib.SMTP_SSL("smtp.gmail.com", self.port, context=self.context) as server:
                server.login(self.email, self.password)
                server.sendmail(self.email, email_receiver, message)
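`sendmail` transmits the message string as-is, so a plain string without headers arrives with no subject. The sketch below shows one way to build a message with headers and to trigger the notification when the bot crashes. The names `build_alert_message` and `run_with_alert` are hypothetical, the addresses are placeholders, and `send_email` is assumed to have the same `(email_receiver, message)` signature as `EmailSender.send_email` above; a list is used in its place here so the example runs without a Gmail account.

```python
from email.message import EmailMessage


def build_alert_message(sender, receiver, subject, body):
    """Return an e-mail with From/To/Subject headers as a string."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = receiver
    msg["Subject"] = subject
    msg.set_content(body)
    return msg.as_string()


def run_with_alert(run_bot, send_email, sender, receiver):
    """Run `run_bot`; if it raises, e-mail the error and re-raise it."""
    try:
        run_bot()
    except Exception as error:
        message = build_alert_message(
            sender, receiver, "Twitter bot is down",
            f"The bot stopped with: {error!r}",
        )
        send_email(receiver, message)
        raise


# Demo with a bot that crashes and a fake sender that records messages
sent = []


def crashing_bot():
    raise RuntimeError("boom")


try:
    run_with_alert(crashing_bot, lambda receiver, message: sent.append((receiver, message)),
                   "bot@example.com", "me@example.com")
except RuntimeError:
    print(sent[0][1])
```

In a real deployment, `send_email` would be the bound method of an `EmailSender` instance.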
More information about how to set up an e-mail service can be found here.