Time: Saturday, June 6, 2026, 15:00-17:00
Venue: Room 304, Boyuan Building, Tiancizhuang Campus
Speaker: Chao Yue (晁越), Postdoctoral Fellow, Xiamen University
Abstract:
Multi-resolution optimal subsampling (MROSS) has recently emerged as a powerful technique for binary linear classification with massive data. However, extending this approach to multi-class softmax regression is nontrivial, because the decision boundary is no longer a single hyperplane, the likelihood involves class probabilities, and the parameterization depends on identifiability constraints. In this paper, we propose Optimal Subsampling for Softmax Regression via Multi-resolution Partitioning (OSSRMP). Building on the core idea of MROSS, OSSRMP develops a multi-class subsampling framework that tailors multi-resolution partitioning, Rao-Blackwellized estimating equations, and optimal subsampling probabilities to softmax regression. Based on OSSRMP, we further develop a model-constraint-invariant, prediction-oriented subsampling strategy through mean squared prediction error (MSPE) minimization, together with a Poisson subsampling extension to alleviate memory constraints in massive data applications. Theoretically, we rigorously establish consistency, asymptotic normality under mild regularity conditions, and valid sandwich variance estimation for the proposed estimators. The analysis explicitly accounts for the partition estimated from a pilot subsample, Rao-Blackwellization, prediction-oriented sampling, and Poisson sampling. Extensive simulation studies and a real data analysis further demonstrate that OSSRMP achieves competitive estimation and prediction accuracy with substantially reduced computational cost.
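Speaker Bio:
Chao Yue is a postdoctoral fellow at the Wang Yanan Institute for Studies in Economics, Xiamen University, with research interests in distributed learning and massive data analysis. Chao has published more than 10 papers in statistics and machine learning journals, including Journal of Computational and Graphical Statistics, Statistics and Computing, Applied Mathematical Modelling, ACM Transactions on Knowledge Discovery from Data, and Journal of Statistical Planning and Inference, and has received funding from the Fujian Province Outstanding Postdoctoral Support Program and the national funding program for postdoctoral researchers.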
Host: Zhang Yuanyuan (张园园)