International Journal of Academic Research in Business and Social Sciences

Identifying Customer Preference Factors in Front-Warehouse Fresh E-commerce: A BERTopic and Sentiment Analysis Approach

Open access
In recent years, front-warehouse fresh e-commerce has rapidly expanded in China, with platforms such as Dingdong Maicai meeting consumer demand for high-frequency and immediate grocery purchases through instant delivery. However, the perishability of goods, reliance on cold-chain logistics, and price sensitivity create divergent consumer experiences. Drawing on 14,118 user reviews segmented into 25,297 sentence-level units, this study applies BERTopic to extract themes and employs a RoBERTa-based sentiment classification model to identify polarity. A total of 38 valid topics were identified and consolidated into seven aspects: freshness, taste, packaging, delivery, price, customer service, and image–text mismatch. Results indicate that freshness and packaging are the primary sources of negative sentiment, while price and delivery attract both positive and negative attention. Overall, consumer evaluations are predominantly positive, yet product deterioration, damaged packaging, and inconsistencies between product descriptions and reality elicit notable dissatisfaction. This study not only provides data-driven evidence for understanding consumer preferences but also offers practical implications for platforms to optimize product management and operational decision-making.
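The two bookkeeping steps around the models, splitting reviews into sentence-level units and rolling topic-level polarity up to aspects, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the punctuation rule, the function names, and the `TOPIC_TO_ASPECT` mapping are hypothetical, and the topic ids and polarity labels are assumed to come from the BERTopic and RoBERTa stages, which are not reproduced here.

```python
import re
from collections import Counter, defaultdict

# Assumed segmentation rule: split on common Chinese and Western
# sentence-ending punctuation and on newlines.
_SENTENCE_END = re.compile(r"[。！？!?；;\n]+")


def split_sentences(review):
    """Split one review into non-empty sentence-level units."""
    return [part.strip() for part in _SENTENCE_END.split(review) if part.strip()]


# Hypothetical topic-id -> aspect mapping. The study consolidates 38 topics
# into seven aspects; only a few illustrative entries are shown here.
TOPIC_TO_ASPECT = {0: "freshness", 1: "packaging", 2: "delivery", 3: "price"}


def negative_share_by_aspect(labelled_units):
    """Return {aspect: share of negative units}.

    labelled_units: iterable of (topic_id, polarity) pairs, where polarity is
    "positive" or "negative". Units whose topic id is not in the mapping
    (for example outliers) are skipped.
    """
    counts = defaultdict(Counter)
    for topic_id, polarity in labelled_units:
        aspect = TOPIC_TO_ASPECT.get(topic_id)
        if aspect is None:
            continue
        counts[aspect][polarity] += 1
    return {aspect: c["negative"] / sum(c.values()) for aspect, c in counts.items()}
```

Given toy input such as `[(0, "negative"), (0, "positive"), (1, "negative")]`, `negative_share_by_aspect` reports the fraction of negative units per aspect, which is the kind of comparison the abstract draws between freshness, packaging, price, and delivery.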
Han, X., Latif, H. A., & Puah, C.-H. (2025). Identifying Customer Preference Factors in Front-Warehouse Fresh E-commerce: A BERTopic and Sentiment Analysis Approach. International Journal of Academic Research in Business and Social Sciences, 15(9), 987–1007.