Research on Click-through-rate Prediction Model Based on Deep Learning
【Author】 Lu Yangyang (陆洋洋)
【Author information】 South China University of Technology, Software Engineering (professional degree), 2020, Master's
【Abstract】 With the advance of mobile internet and cloud computing, online advertisements and news have proliferated, making it difficult for users to pick out target information directly from large volumes of complex, high-dimensional data. To address this problem, many deep-learning-based recommendation algorithms have been proposed and have achieved breakthrough results in practical applications. In advertisement recommendation, however, data features are usually multi-domain, multi-type, and weakly correlated. Existing mainstream models (e.g., xDeepFM) combine features explicitly or implicitly to learn more information, but important issues remain: (1) high-order implicit feature combinations are usually realized with a feed-forward neural network, whose capacity to express non-linear feature combinations is insufficient, which limits performance; (2) existing models that combine features explicitly reach only second order, while the order of the high-order implicit feature combinations in an ensemble model cannot be determined, so the effect of feature combinations of different orders on click-through-rate prediction cannot be determined either. To address insufficient feature combination, this thesis adds a logarithmic transformation layer in front of the feed-forward neural network module of xDeepFM and thereby proposes the adaptive extreme deep factorization machine model (AxDeepFM). The feed-forward network of the proposed model can learn more patterns of feature representation from useful feature combinations of different orders. AxDeepFM achieves AUC values of 0.8301, 0.7872, and 0.7821 on the MovieLens 20M, Avazu, and Criteo datasets, respectively, which demonstrates the effectiveness of the model. To study the influence of feature combinations of different orders on click-through-rate prediction, this thesis further replaces the feed-forward neural network in AxDeepFM with the AFM module, a second-order feature combination module. This turns the combined implicit-and-explicit feature combination model into a purely explicit high-order feature combination model, named the Attention Factorization Machine and Compression Interaction Network model (AFM&CIN). Every module of this model combines features explicitly, and changing the number of CIN layers yields explicit feature combinations of a determined order. AFM&CIN achieves AUC values of 0.8241, 0.7858, and 0.7887 on the MovieLens 20M, Avazu, and Criteo datasets, respectively, which verifies its effectiveness on real-world datasets. In addition, the number of feature interaction layers in AFM&CIN is set to 11, 15, 19, and 23 in turn to explore how explicit feature combinations of different orders affect click-through-rate prediction.
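The abstract does not give the exact form of the logarithmic transformation layer or of the CIN layers, so the following NumPy sketch is only an illustration under stated assumptions. It assumes the log layer follows the logarithmic-neuron formulation used in the Adaptive Factorization Network (each output is the exponential of a weighted sum of log-embeddings, so the weights act as learnable exponents), and that a CIN layer follows the published xDeepFM definition. The function names `log_transform_layer` and `cin_layer`, the shapes, and the `eps` clamp are hypothetical choices made for this example, not details taken from the thesis.

```python
import numpy as np


def log_transform_layer(embeddings, weights, eps=1e-7):
    """Assumed log-transformation layer (AFN-style logarithmic neurons).

    embeddings: (num_fields, dim) field embedding vectors.
    weights:    (num_fields, num_neurons); weights[i, j] is the exponent
                applied to field i in output neuron j.
    Returns:    (num_neurons, dim), where row j equals the element-wise
                product over fields of |embeddings[i]| ** weights[i, j].
    """
    # The logarithm needs strictly positive inputs, so take absolute
    # values and clamp them away from zero (an assumption of this sketch).
    positive = np.maximum(np.abs(embeddings), eps)
    return np.exp(weights.T @ np.log(positive))


def cin_layer(x_prev, x0, weights):
    """One compressed interaction network (CIN) layer as defined for xDeepFM.

    x_prev:  (h_prev, dim) feature maps of the previous layer.
    x0:      (num_fields, dim) original field embeddings.
    weights: (h_next, h_prev, num_fields) layer parameters.
    Returns: (h_next, dim); each row is a weighted sum of the element-wise
             products x_prev[i] * x0[j] over all pairs (i, j).
    """
    return np.einsum("hij,id,jd->hd", weights, x_prev, x0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = rng.uniform(0.1, 1.0, size=(4, 8))  # 4 fields, embedding size 8

    # Log layer: 3 neurons, each a learnable-order combination of the fields.
    log_out = log_transform_layer(x0, rng.normal(size=(4, 3)))

    # Stacking CIN layers raises the explicit interaction order by one each time.
    x1 = cin_layer(x0, x0, rng.normal(size=(5, 4, 4)))  # order-2 terms
    x2 = cin_layer(x1, x0, rng.normal(size=(5, 5, 4)))  # order-3 terms
    print(log_out.shape, x1.shape, x2.shape)
```

In this sketch, each stacked CIN layer multiplies the previous feature maps by the original embeddings once more. This is why fixing the number of CIN layers fixes the highest explicit interaction order. The log layer, by contrast, lets real-valued weights set the order of each combination.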
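- 【Online publication contributor】 South China University of Technology 【Online publication year and issue】 2021, Issue 02
- 【Classification codes】 TP18; TP391.3
- 【Citation count】 6
- 【Download count】 389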