Research on Click-Through Rate Prediction Methods for Display Advertising

Updated: 2023-04-06 06:11:01


Abstract

With the rapid development of the Internet, online advertising plays a pivotal role in daily life and has become the most popular way for advertisers to promote brands and market products. Accurate click-through rate (CTR) prediction is the most important part of online advertising. Improving the accuracy of CTR estimation not only benefits advertisers but also improves the user experience.

Many traditional CTR prediction methods, such as logistic regression, have been applied to advertising CTR prediction systems with good results and have been deployed at large scale in industry. Recently, deep learning has achieved great success in many areas of natural language processing and computer vision, such as textual entailment, text summarization, and image generation. Meanwhile, a number of deep learning models have been applied to personalized recommender systems and CTR prediction, and their model structures are similar: they reduce the dimensionality of the features through vectorization, use nonlinear operations to extract feature combinations, and model the nonlinear relationship between the features and the click-through rate with a neural network. This thesis covers the following three main aspects:

(1) A CTR model based on ensemble learning over multiple traditional machine learning models. We first perform manual feature engineering on two large-scale real-world display advertising datasets and extract high-order combination features with GBDT. We then estimate CTR with mature machine learning models such as logistic regression and factorization machines, and apply ensemble learning on top of these single models. Finally, we compute the results of the ensemble learning method.

(2) A CTR model based on advanced deep learning models. We use deep neural networks and recurrent neural networks for CTR prediction. We attempt to combine the features extracted through feature engineering and build the input of the deep neural network through feature hashing and feature concatenation. Finally, we compute the results of the advanced deep learning models.

(3) A CTR model based on a Multi-Embedding deep model. We propose a novel CTR prediction model, the Multi-Embedding Deep Model. We implement traditional multi-embedding deep models based on deep neural networks and on convolutional neural networks, as well as bilinear multi-embedding deep models based on deep neural networks and on convolutional neural networks, in which we use a bilinear matrix for feature interactions instead of factorization machines. We design a system that addresses the cold-start problem on static datasets by combining a clustering method with a method for marking rare embedding vectors. We evaluate the proposed model on iPinYou and Avazu, two large-scale real-world display advertising datasets. Experimental results show that the model effectively improves the performance of CTR estimation.
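
To make the contrast in aspect (3) concrete, the sketch below compares a factorization-machine style interaction, which scores two feature embeddings by their inner product, with a bilinear interaction that places a matrix W between them. This excerpt does not give the thesis's exact formulation, so the form e_i^T W e_j, the function names, the embedding values, and the matrix W are all illustrative assumptions rather than the author's implementation. In a real model, W and the embeddings would be learned.

```python
def fm_interaction(e_i, e_j):
    """FM-style interaction: the inner product <e_i, e_j>."""
    return sum(a * b for a, b in zip(e_i, e_j))


def bilinear_interaction(e_i, w, e_j):
    """Bilinear interaction e_i^T W e_j, with W given as a list of rows."""
    # Compute W e_j first, then take the inner product with e_i.
    w_ej = [sum(w_rc * x for w_rc, x in zip(row, e_j)) for row in w]
    return sum(a * b for a, b in zip(e_i, w_ej))


def identity(k):
    """k x k identity matrix as a list of rows."""
    return [[1.0 if r == c else 0.0 for c in range(k)] for r in range(k)]


# Hypothetical 3-dimensional embeddings for two feature fields.
e_user = [1.0, 2.0, 0.0]
e_ad = [0.5, -1.0, 3.0]

# Hypothetical interaction matrix (would be a learned parameter).
w = [
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]

fm_score = fm_interaction(e_user, e_ad)                  # -1.5
bilinear_score = bilinear_interaction(e_user, w, e_ad)   # 6.5
identity_score = bilinear_interaction(e_user, identity(3), e_ad)  # -1.5
```

With W set to the identity matrix, the bilinear score reduces to the inner product, which is one way to read the bilinear form as a generalization of the FM-style interaction: off-diagonal entries of W let different embedding dimensions interact with each other.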

Keywords: online advertising, click-through rate, deep learning, convolutional neural network, bilinear

Contents

Abstract (Chinese)

Abstract (English)

Chapter 1 Introduction

1.1 Background, Purpose, and Significance of the Research

1.1.1 Background of the Research

1.1.2 Purpose and Significance of the Research

1.2 Current State of Research in China and Abroad

1.2.1 State of Research on Machine Learning-Based CTR Prediction Models

1.2.2 State of Research on Deep Learning-Based CTR Prediction Models

1.3 Datasets and Problem Definition

1.3.1 Dataset Description

1.3.2 Problem Definition of CTR Prediction

1.3.3 Evaluation Metrics for CTR Prediction

1.3.4 Baseline System Selection

1.4 Main Research Content of This Thesis

1.5 Organization of This Thesis

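Chapter 2 CTR Prediction Based on Model Ensembling

2.1 Introduction

2.2 Single-Model CTR Prediction

2.2.1 GBDT High-Order Feature Combination Model

2.2.2 FM CTR Prediction Model

2.2.3 FFM CTR Prediction Model

2.3 CTR Prediction with Ensemble Learning

2.3.1 Ensembling Strong Models

2.3.2 Machine Learning Meta-Algorithms

2.4 CTR Prediction Model Based on Model Ensembling

2.5 Experimental Results and Analysis

2.5.1 Model Parameter Settings

2.5.2 Comparative Analysis of Experimental Results

2.6 Chapter Summary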

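Chapter 3 CTR Prediction Based on Deep Learning

3.1 Introduction

3.2 CTR Prediction Based on Traditional Deep Models

3.2.1 Activation Functions

3.2.2 Dropout

3.2.3 Batch Normalization

3.2.4 Backpropagation Algorithm

3.2.5 CTR Prediction Model Based on a Traditional Deep Neural Network

3.3 CTR Prediction Based on Recurrent Neural Networks

3.3.1 Recurrent Neural Networks

3.3.2 Long Short-Term Memory Networks

3.3.3 Gated Recurrent Units

3.3.4 Bidirectional Recurrent Neural Networks

3.3.5 Backpropagation Through Time

3.3.6 CTR Prediction Model Based on Recurrent Neural Networks

3.4 CTR Prediction Model Combining Shallow and Deep Features

3.5 Experimental Results and Analysis

3.5.1 Model Parameter Settings

3.5.2 Comparative Analysis of Experimental Results

3.6 Chapter Summary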

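Chapter 4 CTR Prediction Based on Multi-Embedding

4.1 Introduction

4.2 Convolutional Neural Network Techniques

4.2.1 Convolutional Layer

4.2.2 Pooling Layer

4.3 Bilinear Feature Combination

4.4 Cold-Start Problem Model

4.5 CTR Prediction Models Based on Traditional Multi-Embedding

4.5.1 Traditional Multi-Embedding CTR Prediction Model Based on Deep Neural Networks

4.5.2 Traditional Multi-Embedding CTR Prediction Model Based on Convolutional Neural Networks

4.6 CTR Prediction Models Based on Bilinear Multi-Embedding

4.6.1 Bilinear Multi-Embedding CTR Prediction Model Based on Deep Neural Networks

4.6.2 Bilinear Multi-Embedding CTR Prediction Model Based on Convolutional Neural Networks

4.7 Experimental Results and Analysis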

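4.7.1 Model Parameter Settings

4.7.2 Comparative Analysis of Experimental Results

4.8 Chapter Summary

Conclusion

References

Papers Published and Other Achievements During the Master's Program

Harbin Institute of Technology Dissertation Declaration of Originality and Usage Rights

Acknowledgments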

Source: https://www.bwwdw.com/article/48fl.html
