Abstract: This paper uses the TF-IDF algorithm to extract high-frequency keywords, an LDA topic model to extract topic words, and TF-IDF-vectorized text to calculate topic similarity.
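
The two TF-IDF steps named in the abstract, keyword ranking and similarity between TF-IDF vectors, can be illustrated with a minimal sketch. It rests on assumptions not stated in the paper: the toy documents, whitespace tokenization, the unsmoothed idf form log(N/df), and cosine similarity as the similarity measure are all illustrative choices, and the helper names `tfidf_vectors`, `top_keywords`, and `cosine_similarity` are hypothetical. The LDA step is omitted because it would normally rely on an external library.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: tf-idf weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        # Assumed weighting: relative term frequency times log(N / df).
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def top_keywords(vector, k=3):
    """Rank terms by descending weight; ties are broken alphabetically."""
    ranked = sorted(vector.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors (assumed similarity measure)."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy corpus, invented for illustration only.
docs = [
    "topic model extracts topic words".split(),
    "keyword extraction uses term frequency".split(),
    "topic similarity uses vectorized text".split(),
]
vectors = tfidf_vectors(docs)

for i, vec in enumerate(vectors):
    print(i, top_keywords(vec))
print(cosine_similarity(vectors[0], vectors[2]))
```

In this sketch a term that occurs in every document receives weight zero, so it contributes neither to the keyword ranking nor to the similarity score. Documents that share no terms have similarity 0.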