ChenglongChen/tensorflow-DSMM


tensorflow-DSMM

An ongoing project implementing various Deep Semantic Matching Models (DSMM). These models are widely used for:

  • duplicate detection
  • sentence similarity
  • question answering
  • search relevance
  • ...

Quickstart

Data

This project was developed against the data format used in the 3rd Magic Mirror Cup competition (第三届魔镜杯大赛).

See /data/DATA.md for a description of the data format and prepare your data accordingly. Place your data in the data directory, which currently also holds a toy dataset.

To run a quick demo, download the data from the competition link above. Downloading requires registration.

Demo

python src/main.py

Supported Models

Representation based methods

  • DSSM style models
    • DSSM: use FastText as encoder
    • CDSSM: use TextCNN as encoder
    • RDSSM: use TextRNN/TextBiRNN as encoder
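
As a rough illustration of the representation-based idea above, the sketch below encodes each sentence independently and then compares the two vectors with cosine similarity. It is not this repository's code: the encoder is a simplified FastText-style average of word embeddings, and `TOY_EMBEDDINGS`, `fasttext_encode`, `cosine`, and `dssm_score` are hypothetical names introduced only for this example.

```python
import math

# Hypothetical toy embedding table; a real model would learn these vectors.
TOY_EMBEDDINGS = {
    "how": [1.0, 0.0],
    "are": [0.0, 1.0],
    "you": [1.0, 1.0],
}


def fasttext_encode(tokens, embeddings, dim):
    """Average the embeddings of known tokens (simplified FastText-style encoder)."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim  # no known tokens: fall back to a zero vector
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def dssm_score(tokens_a, tokens_b, embeddings=TOY_EMBEDDINGS, dim=2):
    """Encode both sentences separately, then compare the two representations."""
    vec_a = fasttext_encode(tokens_a, embeddings, dim)
    vec_b = fasttext_encode(tokens_b, embeddings, dim)
    return cosine(vec_a, vec_b)


print(dssm_score(["how", "are"], ["how", "are", "you"]))
```

Swapping `fasttext_encode` for a convolutional or recurrent encoder would give the CDSSM and RDSSM variants in spirit; the matching step stays the same.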

Interaction based methods

  • MatchPyramid style models
    • MatchPyramid: use identity/cosine similarity/dot product as match matrix
    • General MatchPyramid: use match matrices based on various embeddings and various match scores
      • word embeddings
        • original word embedding
        • compressed word embedding
        • contextual word embedding (use an encoder to encode contextual information)
      • match score
        • identity
        • cosine similarity/dot product
        • element-wise product
        • element-wise concatenation
  • BCNN style models
    • BCNN
    • ABCNN1
    • ABCNN2
    • ABCNN3
  • ESIM
  • DecAtt (Decomposable Attention)
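
To make the match matrix used by the MatchPyramid-style models more concrete, here is a minimal sketch of three of the match scores listed above (identity, dot product, cosine similarity). It assumes each sentence is given as a list of tokens or a list of word vectors; the function names are hypothetical and this is not the repository's implementation, which would operate on batched tensors.

```python
import math


def identity_matrix(tokens_a, tokens_b):
    """Entry (i, j) is 1.0 when token i of A equals token j of B, else 0.0."""
    return [[1.0 if a == b else 0.0 for b in tokens_b] for a in tokens_a]


def dot_matrix(emb_a, emb_b):
    """Entry (i, j) is the dot product of word vector i of A and word vector j of B."""
    return [[sum(x * y for x, y in zip(u, v)) for v in emb_b] for u in emb_a]


def cosine_matrix(emb_a, emb_b):
    """Dot-product matrix computed on L2-normalised word vectors."""
    def unit(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v] if norm > 0 else list(v)

    return dot_matrix([unit(u) for u in emb_a], [unit(v) for v in emb_b])


print(identity_matrix(["a", "b"], ["b", "a", "c"]))
print(cosine_matrix([[2.0, 0.0]], [[0.0, 3.0], [5.0, 0.0]]))
```

The resulting matrix has one row per word in the first sentence and one column per word in the second, so it can be treated like a single-channel image and passed to convolution and pooling layers.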

Building Blocks

Encoder layers

  • FastText
  • TimeDistributed Dense Projection
  • TextCNN (Gated CNN and also Residual Gated CNN)
  • TextRNN/TextBiRNN with GRU and LSTM cell

Attention layers

  • mean/max/min pooling
  • scalar-based and vector-based attention
  • self and context attention
  • multi-head attention
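
The pooling and scalar-based attention layers above can be sketched as follows. This is an illustrative reading, not the repository's code: "scalar-based" is assumed here to mean one softmax weight per time step, and the `query` vector (which a real model would learn) and all function names are hypothetical.

```python
import math


def mean_pool(states):
    """Average each dimension over time steps."""
    return [sum(col) / len(states) for col in zip(*states)]


def max_pool(states):
    """Take the maximum of each dimension over time steps."""
    return [max(col) for col in zip(*states)]


def min_pool(states):
    """Take the minimum of each dimension over time steps."""
    return [min(col) for col in zip(*states)]


def scalar_attention_pool(states, query):
    """Weight each time step by softmax(state . query), then sum the states."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in states]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [sum(w * h[i] for w, h in zip(weights, states))
              for i in range(len(states[0]))]
    return pooled, weights


states = [[1.0, 4.0], [3.0, 0.0]]  # two time steps, two hidden dimensions
print(mean_pool(states), max_pool(states), min_pool(states))
print(scalar_attention_pool(states, query=[1.0, 0.0]))
```

A vector-based variant would, under the same assumption, compute a separate weight for every hidden dimension at each time step instead of a single scalar per step.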

Acknowledgments

This project draws inspiration from the following projects:
