
I have a list of sentences and a list of their ideal embeddings as 25-dimensional vectors. I am trying to use a neural network to generate new encodings, but I am struggling: the model runs fine, yet its output makes no sense, and it doesn't even accurately reproduce the training data!

import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences


# Tokenization (sentence_list is my list of raw sentences)
tokenizer = Tokenizer()
tokenizer.fit_on_texts(sentence_list)
sequences = tokenizer.texts_to_sequences(sentence_list)

# Pad all sequences to the same length
max_sequence_length = max(len(seq) for seq in sequences)
padded_sequences = pad_sequences(sequences, maxlen=max_sequence_length)


# Assuming your vectors are 25-dimensional
input_dim = 25

# Define encoder
input_vec = Input(shape=(max_sequence_length,))
encoded = Dense(25, activation='tanh')(input_vec)   # 25-dimensional encoding
encoder = Model(input_vec, encoded)

# Define decoder
decoded = Dense(input_dim, activation='sigmoid')(encoded)
autoencoder = Model(input_vec, decoded)

# Compile model
autoencoder.compile(optimizer=Adam(), loss='mse')

# Train the model (combined_vectors_clean holds the 25-dimensional target embeddings)
autoencoder.fit(padded_sequences, combined_vectors_clean,
                epochs=10,
                batch_size=32,
                shuffle=True,
                validation_split=0.2)

As far as I can tell, there's nothing wrong with my input and my labels, so what am I missing?

1 Answer


I think you should try a different activation function on the output layer. Embedding vectors may contain negative values, but your model's sigmoid output is constrained to (0, 1), so it can never reproduce them.
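
For example, here is a minimal sketch of that change, reusing the names from your snippet (a linear output can produce any real value; tanh would also work if your target embeddings happen to lie in [-1, 1]):

# Linear activation so the decoder can output negative values
# (and values above 1), matching the range of the target embeddings.
decoded = Dense(input_dim, activation='linear')(encoded)
autoencoder = Model(input_vec, decoded)
autoencoder.compile(optimizer=Adam(), loss='mse')

With a linear output and MSE loss, the network is free to match targets anywhere on the real line.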
