
1 Introduction

In daily life, services of many kinds (e.g., Web services, APIs, mobile apps) are used in all walks of life and continue to grow enormously. They exist in various software systems and are invoked by users to meet their needs. In particular, open source repositories on social coding sites can be treated as one kind of service provided by developers: users explore functional code modules among a massive number of repositories to build their own applications. In this paper, we focus on this special kind of service, i.e., the software service.

As is widely known to programmers, GitHub is a large-scale open source hosting platform popular with developers from all over the world. Numerous users look for particular code services and use them to accelerate the construction of their complex applications. As of April 2017, GitHub reported having almost 20 million users and 57 million repositories [6], making it the largest host of source code in the world [7]. These large-scale software services have undoubtedly increased the difficulty of finding target code services. However, there is no personalized repository recommender for users: they can only find the software services they are interested in by browsing popular repositories or by following other developers. Thus, service recommendation has become of practical importance.

To better offer users the software services they are interested in, it is crucial to design customized recommendation lists for different users.

Traditional recommendation methods mainly include collaborative filtering and content-based approaches [4, 20, 22]. Jyun-Yu et al. [10] extended the one-class collaborative filtering approach (OCCF) [18] and proposed the Language-Regularized Matrix Factorization model (LRMF) based on Bayesian Personalized Ranking (BPRMF) [19]. Tadej et al. [16] constructed a network of projects and users on GitHub and then used link prediction to generate personalized recommendation lists. Xu et al. [25] considered not only user behavior but also the content of repositories to improve recommendation accuracy. In our work, we take the programming language preferences of users and repositories into account as an important input.

In addition to these traditional approaches, a number of other methods have been applied to recommendation, including deep learning, tensor factorization, and factorization machines. These more advanced methods can further improve the quality of recommenders.

The past few decades have witnessed the tremendous success of deep learning in many application domains such as computer vision, speech recognition and object detection [2, 13, 21]. Deep learning models are mostly composed of multiple processing layers and use the backpropagation algorithm to update the internal parameters that compute the representation in each layer. Trained in this way, the models are able to learn representations of data with multiple levels of abstraction [13].

Deep learning has also gained increasing attention in recommender systems [28]. Owing to its ability to capture non-linear relationships and to learn better representations of sparse high-dimensional vectors, deep learning achieves good performance in rating prediction for recommendation.

Most recently, Google proposed the Wide&Deep model [1], which jointly trains wide linear models and deep neural networks and has been applied to Google Play. Guo et al. [8] constructed a new neural network architecture that uses deep learning for feature learning in factorization machines for recommendation. Similarly, He et al. [9] explored the use of deep neural networks for collaborative filtering (CF). Another extension is CCCFNet (Cross-domain Content-boosted Collaborative Filtering neural Network) [14], which combines CF and content-based filtering in a unified framework; the authors further embed this cross-domain model into a multi-view neural network to benefit from hidden representation learning.

In this paper, we explore an emerging direction that combines repository recommendation with neural networks. We first introduce the recommendation framework and then implement it as a concrete recommender for GitHub repositories.

Our contributions are listed as follows:

  1. We propose PNCF, a general preference-based neural collaborative filtering recommender framework for repository recommendation.

  2. Because language preference has a great influence on repository recommendation, we develop LR-PNCF, an instantiation of the PNCF framework with language preference for GitHub repository recommendation. In this model, we extract the language preferences of users and repositories and use them as one of the inputs.

  3. Experimental results show that the proposed model outperforms existing state-of-the-art models for repository recommendation.

The remainder of this paper is organized as follows. In Sect. 2, we give an overview of the service recommendation framework. In Sect. 3, we present an instantiation of the PNCF framework for GitHub repository recommendation with language preference. In Sect. 4, we evaluate the model on a real-world dataset and compare it with other methods. Section 5 concludes the paper and outlines future work.

2 Framework Overview

We first present a general framework called PNCF, short for preference-based neural collaborative filtering recommender model. We adopt a multi-layer representation of the overall structure, shown in Fig. 1, where the model is divided into four layers: Input layer, Embedding layer, Interaction layer, and Output layer.

Fig. 1. Preference-based neural collaborative filtering recommender framework

In the Input layer, we obtain n sparse features from users and items respectively, denoted as \(\mathbf {x_{u_1}},\mathbf {x_{u_2}},...,\mathbf {x_{u_n}}\) for user u and \(\mathbf {x_{i_1}},\mathbf {x_{i_2}},...,\mathbf {x_{i_n}}\) for item i. These feature vectors of user u or item i are separately compressed into low-dimensional vectors by the Embedding layer and then merged into a concatenated vector, named the user concatenated vector \(c_{u}\) or the item concatenated vector \(c_{i}\) respectively. This process preserves shared or similar features in the vectors before interaction training, so that users or items with the same features share the same components of their concatenated vectors, which enhances the similarity between similar users or items.

After merging, the user concatenated vector \(c_{u}\) and the item concatenated vector \(c_{i}\) are fed into a neural network in the Interaction layer to model the interaction \(\phi \) between user u and item i. Finally, the score \(\hat{y}\) is produced by the Output layer, which normalizes the result.

Overall, the PNCF framework can be formulated as follows.

$$\begin{aligned} c_{u} = [\varphi {(x_{u_1})},\varphi {(x_{u_2})},...,\varphi {(x_{u_n})}] \end{aligned}$$
(1)
$$\begin{aligned} c_{i} = [\varphi {(x_{i_1})},\varphi {(x_{i_2})},...,\varphi {(x_{i_n})}] \end{aligned}$$
(2)
$$\begin{aligned} \hat{y} = f_{out}(\phi {(c_{u},c_{i})}) \end{aligned}$$
(3)

After calculating the output scores of the candidate repositories, we sort them into a ranked list and select the top-k items to recommend to user u.
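
To make the data flow concrete, the following minimal Python sketch mirrors Eqs. (1)-(3) together with the top-k selection step. The helper names, the dot-product interaction and the sigmoid output are illustrative placeholders, not the only instantiation the framework allows; the embedding functions are assumed to be already trained.

```python
import numpy as np

def concat_embeddings(feature_vectors, embed_fns):
    """Eqs. (1)/(2): embed each sparse feature and concatenate the results."""
    return np.concatenate([phi(x) for phi, x in zip(embed_fns, feature_vectors)])

def score(c_u, c_i):
    """Eq. (3): a dot-product interaction followed by a sigmoid output."""
    return 1.0 / (1.0 + np.exp(-np.dot(c_u, c_i)))

def recommend_top_k(c_u, candidate_vectors, k=10):
    """Score every candidate item for user u and return the indices of the top-k."""
    scores = np.array([score(c_u, c_i) for c_i in candidate_vectors])
    return np.argsort(-scores)[:k]
```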

3 Recommendation Model

In this section, we develop LR-PNCF, an instantiation of the PNCF framework for GitHub repository recommendation with language preference. Here, our ultimate goal is to recommend suitable repositories on GitHub. Users have many features, such as language preference, various operations on repositories, or auxiliary information about followed users, and so do repositories. Since users on GitHub are more likely to look for programs written in the languages they use, language preference is taken heavily into consideration. Additionally, the proposed model strengthens the similarity between similar users or items by appending the processed feature vectors to the end of the corresponding latent vectors, which results in closer embedding vectors for similar users or items. As described above, we divide the model into four layers and detail the implementation of each layer below.

3.1 Input Layer

The proposed recommendation model is a sequential model, whose layers are arranged in sequence. The first layer is the Input layer. It consists of two kinds of input for each user or repository: the identity and the language preference vector (Fig. 2).

Fig. 2. LR-PNCF, an instantiation of PNCF framework with language preference

For the identity input, we define u and i as the identities of the user and the repository. As usual, these two identities are one-hot encoded and are used to obtain the corresponding latent vectors from the embedding matrix during training.

For the language preference input of a user or repository, we use \( \mathbf {p}_u=\{p_u^1,p_u^2,...,p_u^Y\} \) and \( \mathbf {p}_i=\{p_i^1,p_i^2,...,p_i^Y\} \) to model the language preference of user u and repository i. Since both users and repositories have many-to-many relationships with languages, we use preference vectors to capture this language information. Let M and N be the numbers of users and repositories and Y the number of languages; Y determines the length of the preference vectors, \(|\mathbf {p}_u|=Y \) and \(|\mathbf {p}_i|=Y \). We define the preference matrix \(\mathbf {P} \in \mathbb {R} ^{(M+N) \times Y}\). For the preference vector \( \mathbf {p}_a=\{p_a^1,p_a^2,...,p_a^Y\} \) of a user or repository a, we have

$$\begin{aligned} p_{a}^{j}=\left\{ \begin{aligned} 1&\text {, if the}\,\, j^{th}\,\,\text {language is favored by}\,a \\ 0&\text {, otherwise} \end{aligned} \right. \end{aligned}$$
(4)

For example, if there are 5 languages in total and user a is keen on languages 1, 2 and 5, the language preference vector of user a is encoded as \(p_{a} = (1,1,0,0,1)\). In this way, we convert the language preferences of users and repositories into the corresponding preference vectors.
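
As a quick illustration of Eq. (4), the snippet below builds such a multi-hot preference vector; the helper name and the 1-based language indexing are our own conventions for the example.

```python
def encode_preference(favored_languages, num_languages):
    """Build the multi-hot language preference vector p_a of Eq. (4)."""
    p = [0] * num_languages
    for lang in favored_languages:          # languages are indexed 1..Y
        p[lang - 1] = 1
    return p

print(encode_preference({1, 2, 5}, 5))      # -> [1, 1, 0, 0, 1]
```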

3.2 Embedding Layer

Above the Input layer is the Embedding layer. This layer aims at learning a dense vector that summarizes the profile of each user or repository. In this layer, we process the two kinds of input, the identity input and the preference vector, into one dense vector. Since the identity input is a binarized sparse vector with one-hot encoding, we pick out the corresponding latent vectors from the embedding matrix \(\mathbf {E} \in \mathbb {R} ^{(M+N) \times D}\), denoted as \(\mathbf e_u\) and \(\mathbf e_i\), where D is the dimension of the latent vectors. These two latent vectors learn the context of the latent factor model using the information of the user-item rating matrix [9, 12]:

$$\begin{aligned} \mathop {\min }_{e^*} \sum _{(u,i)}{(y_{ui}- \mathbf {e_i}^T\mathbf {e_u})^2+\lambda (||\mathbf {e_i}||^2+||\mathbf {e_u}||^2)} \end{aligned}$$
(5)

To be more specific, these latent vectors are initialized randomly following the normal distribution and then trained to optimize the loss function with the rating information.

The language preference input vector is a sparse, high-dimensional categorical feature, so we convert it into a low-dimensional, dense real-valued vector through a feed-forward neural network. The language preference vectors obtained from the Input layer are fed into the input layer of this feed-forward network:

$$\begin{aligned} \mathbf p_u^{(0)} = \mathbf {p}_u ,\ \mathbf p_i^{(0)} = \mathbf {p}_i \end{aligned}$$
(6)

Each hidden layer in this neural network performs:

$$\begin{aligned} \mathbf p^{(l+1)} = f( \mathbf W^{(l)} \times \mathbf p^{(l)} + \mathbf b^{(l)} ) \end{aligned}$$
(7)

where \( f(\cdot )\) is a specified activation function (often ReLU), and \(\mathbf p^{(l)}, \mathbf b^{(l)}\) and \( \mathbf W^{(l)} \) are respectively the output, bias and weights at the l-th layer.

To enhance the representation of the embedding vector of each user or repository, we merge the latent vector and the preference vector into one concatenated vector. We formulate the concatenation as follows, with \(\mathbf a_u\) and \(\mathbf a_i\) for user u and repository i respectively:

$$\begin{aligned} \mathbf a_u =[\mathbf e_u,\mathbf {p}_u^{(l)}],\ \mathbf a_i = [\mathbf e_i,\mathbf {p}_i^{(l)}] \end{aligned}$$
(8)
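
A possible realization of this layer is sketched below in PyTorch. It is not the authors' code: splitting the single embedding matrix \(\mathbf {E}\) into separate user and item tables, sharing one feed-forward network between user and repository preference vectors, and the particular layer sizes are all assumptions made for readability.

```python
import torch
import torch.nn as nn

class EmbeddingLayer(nn.Module):
    """Sketch of Eqs. (6)-(8): ID embeddings plus a densified preference vector."""

    def __init__(self, num_users, num_items, num_langs, latent_dim=8):
        super().__init__()
        self.user_emb = nn.Embedding(num_users, latent_dim)   # e_u
        self.item_emb = nn.Embedding(num_items, latent_dim)   # e_i
        # Feed-forward network that densifies the multi-hot preference vector (Eq. 7)
        self.pref_mlp = nn.Sequential(
            nn.Linear(num_langs, latent_dim), nn.ReLU(),
            nn.Linear(latent_dim, latent_dim), nn.ReLU(),
            nn.Linear(latent_dim, latent_dim), nn.ReLU(),
        )

    def forward(self, u, i, p_u, p_i):
        a_u = torch.cat([self.user_emb(u), self.pref_mlp(p_u)], dim=-1)  # Eq. (8)
        a_i = torch.cat([self.item_emb(i), self.pref_mlp(p_i)], dim=-1)
        return a_u, a_i
```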

3.3 Interaction Layer

The purpose of the Interaction layer is to model the interaction between the user and item embedding vectors produced by the Embedding layer. We use f(.) to model this interaction:

$$\begin{aligned} \hat{q}_{ui} = f(\mathbf a_u,\mathbf a_i) \end{aligned}$$
(9)

where \(a_u\) and \(a_i\) are the embedding vectors of user u and repository i.

In most research, f(.) is represented by the inner product of the input vectors plus biases of the corresponding user and item. In this paper, we use GMF (Generalized Matrix Factorization) to model the interaction between users and repositories. Let \( \hat{q}_{ui} \) denote the output of the Interaction layer, computed as the inner product of \( \mathbf {a} _{u} \) and \( \mathbf {a} _{i} \):

$$\begin{aligned} \hat{q}_{ui} = \mathbf {a} _{u}^T \mathbf {a} _{i} = \sum _{j=1}^L a_{u}^j a_{i}^j \end{aligned}$$
(10)

where L is the dimension of the embedding vectors. The interaction function can be replaced by other common, classical or more complicated neural networks. In this paper, we use the simplest interaction function since it works well.
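
For completeness, a one-line sketch of this GMF-style interaction (Eq. 10) on batched embedding vectors; any other interaction network could be dropped in instead.

```python
import torch

def interaction(a_u: torch.Tensor, a_i: torch.Tensor) -> torch.Tensor:
    """Eq. (10): inner product of the concatenated embedding vectors, per batch row."""
    return (a_u * a_i).sum(dim=-1)   # \hat{q}_{ui} = a_u^T a_i
```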

3.4 Output Layer

A problem arises when we directly use the output of the Interaction layer as the rating prediction. The rating in Eq. 10 is a linear combination of the user and item embedding factors (plus bias), which may fail to capture the non-linear or more complex structure implied in the embedding vectors. Research in various fields, such as natural language processing, shows that representation learning is enhanced by non-linear transformations.

Let \(\hat{y}_{ui}\) denote the final output of the neural network. Since \(\hat{y}_{ui}\) is interpreted as the probability that user u likes repository i, we use the sigmoid function to limit its value to [0,1]:

$$\begin{aligned} \hat{y}_{ui} = \sigma (\hat{q}_{ui})\ ,\ \sigma (x) = \frac{1}{1+e^{-x}} \end{aligned}$$
(11)

where \(\hat{y}_{ui}\) indicates the preference degree of user u for repository i, which can also be regarded as the ranking score. In this way, we map the user and item embedding vectors to real-valued ratings. Based on the scores of the candidate repositories, one can sort all the repositories and generate the final ranked list.

3.5 Model Training

To learn the model parameters, many learning-to-rank algorithms can fit into the above framework. Different approaches model the learning-to-rank process in different ways; they are usually divided into point-wise, pair-wise and list-wise approaches [15]. In this paper, we use a point-wise approach to train our model. Among point-wise approaches, ranking is commonly modeled as regression, classification or ordinal regression. The most common loss is the mean squared error, which is widely used for regression in machine learning. However, it is usually better to use the cross-entropy error given the structure of the neural network [11]. Moreover, this formula is essentially the negative log-likelihood when the parameters are trained with probabilistic methods [9].

Let \(\hat{y}_{ui}\) be the output of our model and \(y_{ui}\) the desired output, where \(y_{ui}=1\) indicates that repository i is favoured by user u and \(y_{ui}=0\) otherwise. The cross-entropy loss function is defined as follows:

$$\begin{aligned} \mathop {\min }_{\varTheta } - \frac{1}{n} \sum _{(u,i)}{ [y_{ui}\ln \hat{y}_{ui} + (1-y_{ui})\ln (1-\hat{y}_{ui})] } \end{aligned}$$
(12)

where \(\varTheta \) denotes the parameters of the model and n denotes the number of samples. The above loss function can be minimized by updating the parameters with optimization techniques such as stochastic gradient descent.
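
A condensed point-wise training loop under Eq. (12) is sketched below. The model class, the sample_batch() helper (returning user/item ids, preference vectors and 0/1 labels at a 1:1 positive-to-negative ratio) and the Adam optimizer are hypothetical choices for illustration, not details taken from the paper.

```python
import torch
import torch.nn as nn

def train(model, sample_batch, num_epochs=100, lr=1e-3):
    criterion = nn.BCELoss()                          # binary cross-entropy, Eq. (12)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for epoch in range(num_epochs):
        u, i, p_u, p_i, y = sample_batch()            # y is a float tensor of 0/1 labels
        y_hat = model(u, i, p_u, p_i)                 # sigmoid output in [0, 1]
        loss = criterion(y_hat, y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```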

Note that the point-wise approach has its limitations: since it does not consider the relationship between any two items, the corresponding loss function does not reflect the influence of the positions of items in the ranked list.

4 Experiment

In this section, we evaluate the proposed model for GitHub recommendation on a subset of a real-world GitHub dataset. We first give a brief introduction to the dataset used in the experiment, then describe the two evaluation metrics and the evaluation protocol. Subsequently, we describe the baselines and their parameter settings, including those of the proposed LR-PNCF, and finally present the results. The experimental results demonstrate a clear improvement over competitive baselines.

4.1 Dataset Description

There are over one million users and three million public repositories on GitHub. We screened out users who have more than 10 repositories and randomly drew a subset containing about 3982 users and 4987 repositories. On GitHub, users can take multiple actions on public repositories; for example, they may fork, watch or contribute code to repositories they are interested in. Therefore, in our paper, we consider that a user prefers a repository if the user forks, watches or contributes code to it. After processing, the dataset contains about 179370 ratings, and the user-item matrix has a density of nearly 0.9% (i.e., it is highly sparse).

4.2 Evaluation and Metrics

For the experiments, we apply a leave-one-out scheme to evaluate the recommendation performance [9, 19]. Leave-one-out validation selects one item per user as the validation set and uses the remaining observations as the training set. For more efficient verification, we randomly select 100 unobserved items and rank them together with the test item to generate the recommendation list for each user [5, 9]. We assume that the user prefers observed repositories over all non-observed ones [19]; e.g., if user u interacts with repository \(i_2\) but not with repository \(i_1\), we consider that user u prefers \(i_2\) over \(i_1\). A higher score or rank of the test repository indicates better performance of the recommender.

To judge the ranked list quantitatively, we adopt two common metrics to evaluate the performance of ranking: Hit Ratio (HR) and Normalized Discounted Cumulative Gain (NDCG).

Hit Ratio is defined below:

$$\begin{aligned} HR@k =\left\{ \begin{aligned} 1&\text {, }rank_{i}\,\,\le \,\,\text {k}\\ 0&\text {, otherwise} \end{aligned} \right. \end{aligned}$$
(13)

where \(rank_{i}\) denotes the rank of the test repository i in the ranked list.

NDCG is the ratio of DCG to iDCG, where iDCG is the ideal discounted cumulative gain. DCG is computed by summing the "gains" along the ranked list with a logarithmic discount factor, and quantifies the usefulness of an item based on its position in the result list. It is defined as:

$$\begin{aligned} DCG@K = \sum _{i=1}^{K}{\frac{2^{r(i)}-1}{\log _2(i+1)}} \end{aligned}$$
(14)

where r(i) denotes the relevance (desired rating) of the repository at position i in the recommendation list R.

Intuitively, HR measures whether the test repository appears in the ranked list, while NDCG measures the quality of the recommendation by focusing on the position of the test repository.
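
Under this leave-one-out protocol each ranked list contains exactly one relevant (test) repository, so both metrics simplify: iDCG equals 1 and NDCG@k reduces to \(1/\log_2(rank+1)\). The small sketch below (helper names are our own) computes both from the 1-based rank of the test repository.

```python
import math

def hit_ratio_at_k(rank: int, k: int) -> float:
    """Eq. (13): 1 if the test repository appears in the top-k list, else 0."""
    return 1.0 if rank <= k else 0.0

def ndcg_at_k(rank: int, k: int) -> float:
    """NDCG@k with a single relevant item at the given rank."""
    return 1.0 / math.log2(rank + 1) if rank <= k else 0.0

# Example: the test repository ranked 3rd in a top-10 list.
print(hit_ratio_at_k(3, 10), ndcg_at_k(3, 10))        # -> 1.0 0.5
```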

4.3 Comparing Methods

We compare our recommendation approach with several other recommendation algorithms. These methods, except for PopN, are related to matrix factorization and are often used for repository recommendation:

  1. PopN: This non-personalized recommender pushes the most popular repositories to users, rating items by a popularity measure such as the number of forks, contributions or watches.

  2. GMF [12, 17]: The basic matrix factorization model, using only the user-repository matrix for recommendation.

  3. NCF [9]: A general framework named NCF, short for Neural network-based Collaborative Filtering. We use one of its instances, NeuMF, in the following experiments.

  4. BPRMF [19]: A matrix factorization model trained with the generic optimization criterion BPR-OPT, which directly optimizes a ranking measure. BPRMF is one of the common methods for ranking in the OCCF setting.

  5. LRMF [10]: One of the state-of-the-art approaches for GitHub recommendation, short for Language-Regularized Matrix Factorization. It is based on matrix factorization and is regularized by the relationships between users' programming language preferences.

In the experiments, we set the optimal parameters for all validated methods either according to the corresponding references or based on our own experiments. All methods are run for 100 epochs. All embedding matrices are randomly initialized with a uniform distribution within the interval [0, 1]. The ratio of positive to negative samples is 1:1 during training.

For the BPRMF method, we also train in batches for 100 epochs. In each batch, we randomly generate a certain number of triples \((u,repo_i,repo_j)\) as input, where u denotes user u, \(repo_i\) denotes a repository observed by user u and \(repo_j\) denotes a non-observed repository. For the LRMF method, we apply bootstrapping-based stochastic gradient descent according to the original paper. Following their experimental setting, we set \(c=100\), which means that we randomly select 100 authors for each user to calculate the language preference regularization. For the LR-PNCF model, the dimension of the preference vectors \(p_u^{(0)}\) and \(p_i^{(0)}\) in the following experiments is set to 100. We use three layers in the feed-forward neural network to densify the preference vector, where the number of neurons is the same as the dimension of the embedding vector \(\mathbf {e}\). Next, we evaluate the performance of the above models via the quantitative indicators NDCG and HR.

4.4 Experimental Results

In this section, we compare the performance of the models above using HR and NDCG for top-K recommendation. Figure 3 and Fig. 4 show the best HR and NDCG achieved by each competitive approach. Since the performance of PopN is too weak to be competitive, we omit its HR and NDCG from the comparison. The experiments focus on two questions: first, whether the proposed approach performs better than the comparative approaches; second, how the performance of the proposed method is affected by the dimension of the latent vectors and the length of the recommendation list. For simplicity, we denote the dimension of the latent vectors as dim and the length of the recommendation list as topk. We carried out experiments with dim set to 8, 16 and 32 under fixed \( topk = 10 \), and with topk set to 10, 20 and 30 under fixed \( dim = 8 \); all other variables remain the same. In each experiment, we record the best HR and NDCG observed during evaluation. The results are shown in Fig. 3 and Fig. 4. For more numerical detail, we also report the best HR and NDCG over the 100 epochs in Table 1 and Table 2.

Fig. 3. Best HR for different dimensions of the latent vectors and different recommendation list sizes

From Fig. 3 and Fig. 4, we can see that the proposed LR-PNCF model achieves better HR and NDCG than the other competitive methods. BPRMF and LRMF perform worse on both metrics, while LR-PNCF, as well as GMF and NCF, keeps relatively high scores under different assignments of topk and dim. GMF and NCF behave better than BPRMF and LRMF. Among all methods, LR-PNCF outperforms the comparative methods in both HR and NDCG.

As shown in Table 1 and Table 2, LR-PNCF achieves higher HR and NDCG: the maximum HR and NDCG are increased by up to 13.9% and 18.3% compared with NCF, and by 12.1% and 8.7% compared with GMF, under the same settings of dim and topk. Comparing the performance differences between the models, the gaps are most significant when dim and topk take smaller values; in particular, as topk increases, the differences shrink.

Fig. 4. Best NDCG for different dimensions of the latent vectors and different recommendation list sizes

Additionally, to evaluate how performance is affected by the length of the recommendation list, we record HR and NDCG for \(topk = 10, 20, 30 \) in Table 1. It can be clearly seen that HR and NDCG increase as topk grows. This is understandable: with a fixed total number of sorted items in evaluation, the target test item appears more easily in a longer recommendation list. Regarding the influence of the dimension of the latent vectors, we record HR and NDCG for \(dim = 8, 16, 32 \) in Table 2. HR and NDCG fluctuate slightly as dim increases, which may be a sign of overfitting.

Overall, we conclude that our model achieves better performance than the other competitive methods.

Table 1. The best HR and NDCG for different topk.
Table 2. The best HR and NDCG for different dim.

5 Related Work

In recent years, there has been a variety of research work on service discovery and recommendation, but only a handful of it addresses GitHub recommendation. Previous work can be mainly classified into three types: semantics-based, CF-based and network-based. Semantics-based approaches mostly consider the semantic similarity of web services, defining the similarity between two web services according to particular semantic aspects [24]. CF-based approaches assume that the target user has similar friends or that the target service has similar services, without considering exceptional scenarios [17, 18]. Network-based approaches construct an information network from relations in the historical data [16].

Lately, as a special kind of software service, GitHub repositories have become a new object for recommender systems. Matrix factorization represents a feasible solution for software service discovery and has fostered various CF-based approaches, e.g., Probabilistic Matrix Factorization (PMF) [17] and One-Class Collaborative Filtering (OCCF) [18]. Inspired by OCCF, Jyun-Yu et al. [10] proposed Language-Regularized Matrix Factorization (LRMF), which utilizes users' programming language preferences; it improves on matrix factorization based on Bayesian Personalized Ranking (BPRMF) [19] and achieves better performance on a real-world GitHub dataset. Tadej et al. [16] constructed a network from contributions in GitHub data and then used link prediction to generate personalized recommendation lists. Xu et al. [25] proposed a recommendation system named REPERSP, which is built on both developer behavior and the content of each project.

With less feature engineering, deep neural networks can generalize better to unseen feature combinations through low-dimensional dense embeddings learned for sparse features. Recently, much work has explored deep learning to improve recommendation performance. For the rating matrix, collaborative deep learning has been extended by coupling deep learning for content information with collaborative filtering [23, 27].

Google presented the Wide & Deep learning framework [1] to achieve both memorization and generalization in one model: wide linear models memorize sparse feature interactions using cross-product feature transformations, while deep neural networks generalize to unseen feature combinations through low-dimensional embeddings. To avoid expert feature engineering on the input of the wide part, Guo et al. [8] proposed an end-to-end model called DeepFM that combines factorization machines with deep learning for recommendation. Wang et al. [23] proposed a hierarchical Bayesian model called collaborative deep learning (CDL), which jointly performs deep representation learning for content information and collaborative filtering for the rating (feedback) matrix. Dong et al. [3] extended the stacked denoising autoencoder to integrate additional side information into the inputs and presented a hybrid collaborative filtering model with a deep structure for recommender systems. Similarly, He et al. [9] proposed the NCF model, short for Neural Collaborative Filtering, which uses a neural architecture combining matrix factorization and a multi-layer perceptron to learn an arbitrary interaction function from data. To make full use of both explicit ratings and implicit feedback, Xue et al. [26] proposed Deep Matrix Factorization (DMF) models that map users and items into low-dimensional vectors, where both the input matrix and the loss function consider explicit ratings and implicit feedback. CCCFNet [14], namely Cross-domain Content-boosted Collaborative Filtering Neural Network, extends a unified framework with content-based filtering.

6 Conclusion and Future Work

In this paper, we present PNCF, a general preference-based neural collaborative filtering recommender framework, and develop an instantiation of PNCF for GitHub repository recommendation with language preference. For clarity, the model is divided into four layers: Input layer, Embedding layer, Interaction layer, and Output layer, and we explain the implementation of each layer in detail. Additionally, we extract the language preferences of users and repositories to further improve recommendation performance.

The results of extensive experiments show that our method significantly outperforms existing state-of-the-art repository recommender models. In this work, we combine deep learning with collaborative filtering to further enhance the effectiveness of traditional recommender system approaches.

In the future, we will add more user and repository features to the Input layer, such as users' actions on repositories, rather than using only the important but single language preference. Moreover, since the point-wise approach does not consider the relationship between any two items, we will study pair-wise learners for PNCF instead.