Abstract

The hashtags that a user attaches to a micro-video are those that, in his/her mind, best describe the semantics of the micro-video's content. At the same time, hashtags are widely used to facilitate various micro-video retrieval scenarios (e.g., search, browsing, and categorization). Despite their importance, numerous micro-videos lack hashtags or carry inaccurate or incomplete ones. In light of this, hashtag recommendation, which suggests a list of hashtags to a user who wants to annotate a post, becomes a crucial research problem. However, little attention has been paid to micro-video hashtag recommendation, mainly for the following three reasons: 1) the lack of a benchmark dataset; 2) the temporal and multi-modal characteristics of micro-videos; and 3) hashtag sparsity and long-tail distributions. In this paper, we recommend hashtags for micro-videos by presenting a novel multiview representation interactive embedding model with graph-based information propagation. It boosts the performance of micro-video hashtag recommendation by jointly considering sequential feature learning, video-user-hashtag interaction, and hashtag correlations. Extensive experiments on a constructed dataset demonstrate that our proposed method outperforms state-of-the-art baselines. As a side contribution, we have released our dataset and code to facilitate research in this community.

Figure 1: Overview of the proposed model for hashtag recommendation. It first models the video, user, and hashtag modules and then integrates them into an interactive embedding model to exploit their interactions. In addition, we design a propagation mechanism in the hashtag embedding module based on hashtag correlations.

Figure 2: Illustration of the propagation mechanism in the hashtag embedding module. A graph G is built over the hashtag representations, where each node denotes a hashtag. Stacked GCNs are learned over the graph to map the initial hashtag representations {e(h_1), e(h_2), ..., e(h_H)} to new representations {e′(h_1), e′(h_2), ..., e′(h_H)} with knowledge encoded.
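
The propagation in Figure 2 can be illustrated with a minimal sketch of stacked graph convolution layers. It assumes the commonly used symmetric normalization with self-loops, H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W), which is not confirmed by the caption; the function names, the toy adjacency matrix, the embedding size, and the random weights are all hypothetical and stand in for quantities that would be learned or built from hashtag correlations.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Return D^(-1/2) (A + I) D^(-1/2) for adjacency matrix A (assumed normalization)."""
    n = len(adj)
    # Self-loops let each hashtag keep part of its own representation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(a_norm, feats, weight, activate=True):
    """One propagation step: aggregate neighbor features, then apply a linear map."""
    out = matmul(matmul(a_norm, feats), weight)
    if activate:
        out = [[max(0.0, v) for v in row] for row in out]  # ReLU
    return out


def propagate(adj, feats, weights):
    """Stack GCN layers to map e(h_i) to e'(h_i); the last layer has no ReLU."""
    a_norm = normalize_adjacency(adj)
    h = feats
    for k, w in enumerate(weights):
        h = gcn_layer(a_norm, h, w, activate=(k < len(weights) - 1))
    return h


# Toy example: 3 hashtags in a chain (h_1 - h_2 - h_3), 4-dimensional embeddings.
rng = random.Random(0)
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(3)]
# Two stacked layers with random (untrained) weights, for illustration only.
weights = [[[rng.gauss(0, 0.5) for _ in range(4)] for _ in range(4)] for _ in range(2)]
new_feats = propagate(adj, feats, weights)
```

Each row of `new_feats` plays the role of e′(h_i): with two layers, a hashtag's new representation mixes information from hashtags up to two hops away in the graph.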

Figure 3: Illustration of the interactive embedding model based on neural network.