https://github.com/facebookresearch/faiss/
Article: Top-k similarity search over massive text collections: a first look at the faiss library
Chinese: https://github.com/liqima/faiss_note
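To make the top-k similarity task concrete, here is a minimal brute-force sketch in plain Python. It is not faiss code: it scans every vector exhaustively, which is the baseline that a library such as faiss speeds up with optimized index structures. The function names and the toy vectors are assumptions made for illustration.

```python
import heapq
import math


def normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def top_k_similar(query, corpus, k):
    """Return the k (score, index) pairs with the highest cosine similarity."""
    q = normalize(query)
    scored = (
        (sum(a * b for a, b in zip(q, normalize(doc))), idx)
        for idx, doc in enumerate(corpus)
    )
    # heapq.nlargest keeps only k candidates in memory while scanning.
    return heapq.nlargest(k, scored)


# Toy 3-dimensional "sentence vectors" (made up for this example).
corpus_vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
results = top_k_similar([1.0, 0.0, 0.0], corpus_vectors, k=2)
top_indices = [idx for _, idx in results]
```

The exhaustive scan costs one dot product per corpus vector for every query, which is why approximate or compressed indexes become attractive once the corpus grows large.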
Siamese Recurrent Architectures for Learning Sentence Similarity - MIT2016
Code: https://github.com/LuJunru/Sentences_Pair_Similarity_Calculation_Siamese_LSTM (Keras)
Code: Sentence similarity computation based on Siamese LSTM (Keras)
Code: https://github.com/eliorc/Medium/blob/master/MaLSTM.ipynb (Keras)
Article: How to predict Quora Question Pairs using Siamese Manhattan LSTM - 2017
Chinese: Siamese Recurrent Architectures for Learning Sentence Similarity
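The Manhattan LSTM (MaLSTM) from the paper above scores a sentence pair by applying exp(-L1 distance) to the final hidden states of the two weight-sharing LSTM encoders. The sketch below shows only that similarity function. The LSTM encoder itself is omitted, and the hidden vectors are made-up numbers for illustration.

```python
import math


def manhattan_similarity(h_left, h_right):
    """exp(-L1 distance) between two equal-length encodings; result is in (0, 1]."""
    if len(h_left) != len(h_right):
        raise ValueError("encodings must have the same length")
    l1 = sum(abs(a - b) for a, b in zip(h_left, h_right))
    return math.exp(-l1)


# Hypothetical final hidden states of the two encoders.
same = manhattan_similarity([0.2, -0.5, 0.1], [0.2, -0.5, 0.1])
different = manhattan_similarity([0.2, -0.5, 0.1], [-0.8, 0.5, 0.1])
```

Identical encodings give a score of 1, and the score decays toward 0 as the encodings move apart, so it can be regressed directly against a similarity label rescaled to [0, 1].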
Learning Text Similarity with Siamese Recurrent Networks - Netherlands2016
Code:
https://github.com/likejazz/Siamese-LSTM (Keras)
https://github.com/eliorc/Medium/blob/master/MaLSTM.ipynb (Keras)
https://github.com/dhwajraj/deep-siamese-text-similarity (Tensorflow)
https://github.com/vishnumani2009/siamese-text-similarity (Tensorflow)
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks - Germany2019
For details, see 07-Pretrained_Model.md
【Great】https://github.com/RandolphVI/Text-Pairs-Relation-Classification (Tensorflow)
Text Pairs (Sentence Level) Classification (Similarity Modeling) Based on Neural Network
Models: ABCNN, ANN, CNN, CRNN, FastText, HAN, RCNN, RNN, SANN
YAO:
A rich set of models, with visualizations of the model structures. Still to be read.
https://github.com/Vincent131499/TextSim_cn_finetune (Tensorflow)
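Fine-tunes pretrained language models (BERT, RoBERTa, XLBert, etc.) to compute the similarity between two Chinese texts (reformulated as a sentence-pair classification task)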
https://github.com/yanqiangmiffy/sentence-similarity (Keras)
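Question similarity: given two sentences in which customer-service users describe their issue, decide algorithmically whether they express the same meaning.
YAO: It mentions 5 text similarity competitions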
https://github.com/liuhuanyong/SentenceSimilarity
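Sentence similarity computation based on Tongyici Cilin (a synonym thesaurus), HowNet, fingerprints, character/word vectors, and the vector space model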
https://github.com/ashengtx/CilinSimilarity
Word similarity computation based on Tongyici Cilin
https://github.com/BiLiangLtd/WordSimilarity
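Word similarity computation based on the HIT (Harbin Institute of Technology) extended edition of Tongyici Cilin
Article: Word similarity computation based on the extended edition of Tongyici Cilin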
https://github.com/PengboLiu/Doc2Vec-Document-Similarity
Computes text similarity with Doc2Vec
https://github.com/cjymz886/sentence-similarity
Experiments with and compares four sentence/text similarity methods: cosine, cosine+idf, bm25, jaccard
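Three of the four measures named above (jaccard, cosine, cosine+idf) fit in a few lines of standard-library Python. This is a minimal sketch and not the repository's code: the whitespace tokenization, the smoothed idf variant, and the example sentences are all assumptions. BM25 is left out to keep the example short.

```python
import math
from collections import Counter


def jaccard(tokens_a, tokens_b):
    """Size of the token-set intersection divided by the size of the union."""
    a, b = set(tokens_a), set(tokens_b)
    return len(a & b) / len(a | b) if a | b else 0.0


def cosine(tokens_a, tokens_b, idf=None):
    """Cosine of term-count vectors; if idf is given, counts are scaled by idf."""
    weight = (lambda t: idf.get(t, 0.0)) if idf else (lambda t: 1.0)
    va = {t: c * weight(t) for t, c in Counter(tokens_a).items()}
    vb = {t: c * weight(t) for t, c in Counter(tokens_b).items()}
    dot = sum(w * vb.get(t, 0.0) for t, w in va.items())
    norm = math.sqrt(sum(w * w for w in va.values())) * math.sqrt(
        sum(w * w for w in vb.values())
    )
    return dot / norm if norm else 0.0


def build_idf(docs):
    """Smoothed idf, log((N + 1) / (df + 1)) + 1 (one common variant)."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    return {t: math.log((n + 1) / (d + 1)) + 1.0 for t, d in df.items()}


# Toy corpus, tokenized on whitespace (made up for this example).
docs = [
    "how do i reset my password".split(),
    "how can i reset my password".split(),
    "where is my order".split(),
]
idf = build_idf(docs)
j = jaccard(docs[0], docs[1])          # 5 shared tokens out of 7 distinct
c = cosine(docs[0], docs[1])           # unweighted term counts
c_idf = cosine(docs[0], docs[1], idf)  # rare terms weigh more
```

In this toy corpus the only tokens that differ between the first two sentences ("do" and "can") are also the rarest ones, so the idf-weighted cosine comes out lower than the unweighted one.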
https://github.com/liuhuanyong/SiameseSentenceSimilarity
SiameseSentenceSimilarity: a personal implementation of a similar-sentence detection model based on a Siamese BiLSTM, with training and test datasets provided.
https://github.com/fssqawj/SentenceSim
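Chinese short-sentence similarity. A 2016 project with fairly traditional methods: HowNet-based, one-hot vector model, Word2Vec-based, HIT SDP-based, a fusion of these methods, and LSTM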
https://github.com/Leputa/CIKM-AnalytiCup-2018 (Tensorflow)
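CIKM AnalytiCup 2018 – AliMe chatbot cross-lingual short-text matching competition – Rank 12 solution
Decide whether two questions in different languages have the same meaning.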
https://github.com/ziweipolaris/atec2018-nlp (Keras, PyTorch)
ATEC2018 NLP task: decide whether two questions have the same meaning
https://github.com/zake7749/CIKM-AnalytiCup-2018 (Tensorflow & Keras)
[ACM-CIKM] 2nd place solution at CIKM AnalytiCup 2018, a task for determining short text similarities
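2018 ATEC Ant Financial NLP intelligent customer service competition
Given two sentences in which customer-service users describe their issue, judge how similar the questions are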
https://github.com/zle1992/atec (Keras)
Rank 16/2631
https://github.com/Lapis-Hong/atec-nlp (PyTorch)
Kaggle: Quora Question Pairs
Determine whether question pairs are duplicates or not
https://github.com/HouJP/kaggle-quora-question-pairs (TextNet)
Rank 4