Lifelong Topic Modeling with Knowledge-Enhanced Adversarial Network
Abstract
Lifelong topic modeling has attracted much attention in natural language processing (NLP), since it can accumulate knowledge learned from past tasks for use in future tasks. However, existing lifelong topic models often require complex derivations or exploit only part of the contextual information. In this study, we propose a knowledge-enhanced adversarial neural topic model (KATM) and extend it to LKATM for lifelong topic modeling. KATM employs a knowledge extractor that encourages the generator to learn interpretable document representations and retrieves knowledge from the generated documents. LKATM incorporates knowledge from the previously trained KATM into the current model, allowing it to learn from prior models without catastrophic forgetting. Experiments on four benchmark text streams validate the effectiveness of KATM and LKATM in topic discovery and document classification.
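The lifelong-learning idea in the abstract can be made concrete with a minimal sketch. The function names, shapes, and the exact loss below are illustrative assumptions, not the paper's actual objective; the core anti-forgetting mechanism — penalizing the current model for drifting away from a previously trained model's topic-word distributions via a KL-divergence term — can be expressed as:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the vocabulary axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def distillation_penalty(prev_logits, curr_logits, eps=1e-12):
    """KL(prev || curr) averaged over topics: a lifelong-learning term
    that penalizes the current model for drifting away from the
    previously learned topic-word distributions."""
    p = softmax(prev_logits)  # topic-word distributions of the old model
    q = softmax(curr_logits)  # topic-word distributions of the new model
    kl = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)
    return float(np.mean(kl))

# Hypothetical shapes: 5 topics over a 100-word vocabulary.
rng = np.random.default_rng(0)
prev = rng.normal(size=(5, 100))
print(distillation_penalty(prev, prev))  # identical models -> zero penalty
print(distillation_penalty(prev, prev + rng.normal(size=(5, 100))))  # drift -> positive
```

In practice such a term would be added to the current model's training loss with a weighting coefficient, so that knowledge from the prior KATM is retained while new topics are learned.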
Acknowledgements
The research described in this paper was supported by the National Natural Science Foundation of China (61972426), Guangdong Basic and Applied Basic Research Foundation (2020A1515010536), the Hong Kong Research Grants Council (project no. PolyU 11204919), and an internal research grant from the Hong Kong Polytechnic University (project 1.9B0V).
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Zhang, X., Rao, Y. & Li, Q. Lifelong topic modeling with knowledge-enhanced adversarial network. World Wide Web 25, 219–238 (2022). https://doi.org/10.1007/s11280-021-00984-2
Keywords
- Neural topic modeling
- Lifelong learning
- Knowledge distillation
Source: https://link.springer.com/article/10.1007/s11280-021-00984-2