Load the data, get all unique characters, create variables to go from characters to their assigned indices and from indices to characters, and then encode text into integers:
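A minimal sketch of this step, assuming NumPy is used for the encoding. The tiny inline string stands in for the loaded corpus, and the file path in the comment is hypothetical:

```python
import numpy as np

# Stand-in for the loaded corpus; in practice this would come from a file,
# e.g. text = open('shakespeare.txt', 'r').read()  (hypothetical path)
text = "Hello my name is"

# All unique characters, sorted for a stable ordering
vocab = sorted(set(text))

# Mappings in both directions: character -> index, index -> character
char_to_ind = {char: ind for ind, char in enumerate(vocab)}
ind_to_char = np.array(vocab)

# Encode the full text as an array of integer indices
encoded_text = np.array([char_to_ind[c] for c in text])
```

Decoding is then just indexing back through `ind_to_char`.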
```python
import tensorflow as tf

seq_len = 120
total_num_seq = len(text) // (seq_len + 1)

char_dataset = tf.data.Dataset.from_tensor_slices(encoded_text)
sequences = char_dataset.batch(seq_len + 1, drop_remainder=True)

def create_seq_targets(seq):
    input_txt = seq[:-1]   # e.g. "Hello my nam"
    target_txt = seq[1:]   # e.g. "ello my name"
    return input_txt, target_txt

dataset = sequences.map(create_seq_targets)

batch_size = 128
buffer_size = 10000
dataset = dataset.shuffle(buffer_size).batch(batch_size, drop_remainder=True)
```
Create the model:
```python
vocab_size = len(vocab)
embed_dim = 64      # same order of magnitude as the vocab size
rnn_neurons = 1026
```
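A plausible sketch of a model built from these hyperparameters. The specific layer choices (a stateful GRU, logits from a plain `Dense` layer) are assumptions about a typical char-RNN setup, not necessarily the course's exact code:

```python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, GRU, Dense

def create_model(vocab_size, embed_dim, rnn_neurons, batch_size):
    model = Sequential([
        # Fixed batch size so the recurrent layer can keep state across batches
        tf.keras.Input(batch_shape=(batch_size, None)),
        Embedding(vocab_size, embed_dim),
        GRU(rnn_neurons,
            return_sequences=True,   # one prediction per time step
            stateful=True),
        Dense(vocab_size),           # logits over the character vocabulary
    ])
    model.compile(
        optimizer='adam',
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
    return model
```

`from_logits=True` matters here: the final layer has no softmax, so the loss must apply it internally.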
Load the model:
Training this model consumed a lot of resources, so the class provided an already-saved, pre-trained model and we loaded it to see what it could do. Instead of loading the whole model, we created a new instance of the untrained model and loaded only the weights from the trained one. Finally, we needed to build the model.
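A sketch of this load-weights-then-build pattern. The filename, the assumed `vocab_size` of 84, and the helper's exact architecture are all assumptions; the only hard requirement is that the architecture matches the trained model's so the weights fit. The first half stands in for the pre-trained weights file the class provided:

```python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, GRU, Dense

vocab_size, embed_dim, rnn_neurons = 84, 64, 1026  # vocab_size of 84 is assumed

def build_model(batch_size):
    # Must match the trained model's architecture exactly for the weights to fit
    return Sequential([
        tf.keras.Input(batch_shape=(batch_size, None)),
        Embedding(vocab_size, embed_dim),
        GRU(rnn_neurons, return_sequences=True, stateful=True),
        Dense(vocab_size),
    ])

# Stand-in for the class's pre-trained model: save its weights to disk
trained = build_model(batch_size=128)
trained.save_weights('shakespeare_gen.weights.h5')  # hypothetical filename

# Fresh untrained instance with batch_size=1, so we can feed one sequence at a time
model = build_model(batch_size=1)
model.load_weights('shakespeare_gen.weights.h5')
model.build(tf.TensorShape([1, None]))
```

The point of rebuilding with `batch_size=1` is generation: the stateful GRU bakes the batch size into its state, so a model trained with batches of 128 can't directly predict on a single seed sequence.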