Download our paper
Television report

Problem

What if a successful company starts to receive a torrent of low ratings (one or two stars) for its mobile apps from multiple users within a short period of time, say one month? Is this legitimate evidence that the apps have declined in quality, or an intentional plan, carried out through lockstep behavior, to steal market share through defamation? In the case of a systematic attack on one's reputation, it might not be possible to manually distinguish legitimate from fraudulent interactions within the huge universe of possible user-product recommendations. Previous works have addressed this issue, but none of them took into account the context, modeling, and scale that we consider in this paper.

Algorithm

Here, we propose a novel method, Online-Recommendation Fraud ExcLuder (ORFEL), to detect defamation and/or illegitimate promotion of online products by using vertex-centric asynchronous parallel processing of bipartite (users-products) graphs. Our results demonstrate both efficacy and efficiency: over 95% of potential attacks were detected, and ORFEL was at least two orders of magnitude faster than the state of the art. Our main contributions are: (1) a new algorithmic solution; (2) a scalable approach; and (3) a novel context and modeling of the problem, which now addresses both defamation and illegitimate promotion.

We extend prior work by considering weighted graphs: the weight of each user-product interaction carries semantics that correspond to defamation or to illegitimate promotion, in domains ranging from social networks to e-commerce recommendations.
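To make the weighted bipartite model concrete, the following is a minimal sketch and not ORFEL's actual implementation. It stores each user-product interaction as an edge whose weight is the star rating, then splits the edges by semantics. The names `Rating`, `split_by_semantics`, and `adjacency` are hypothetical, the low threshold of two stars follows the "one or two stars" example above, and the high threshold of four stars is an assumption made only for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Rating(NamedTuple):
    """One weighted edge of the bipartite users-products graph (hypothetical type)."""
    user: str
    product: str
    stars: int  # edge weight, assumed to be on a 1..5 scale


LOW_STARS = 2   # "one or two stars", as in the problem statement
HIGH_STARS = 4  # assumption: ratings of 4 or more count as promotion candidates


def split_by_semantics(ratings):
    """Split edges into defamation and promotion candidates by weight."""
    defamation = [r for r in ratings if r.stars <= LOW_STARS]
    promotion = [r for r in ratings if r.stars >= HIGH_STARS]
    return defamation, promotion


def adjacency(edges):
    """Build a product -> set-of-users adjacency map from an edge list."""
    adj = defaultdict(set)
    for r in edges:
        adj[r.product].add(r.user)
    return dict(adj)


ratings = [
    Rating("u1", "appA", 1),
    Rating("u2", "appA", 2),
    Rating("u3", "appA", 5),
    Rating("u1", "appB", 1),
    Rating("u2", "appB", 1),
    Rating("u4", "appB", 3),
]

defamation, promotion = split_by_semantics(ratings)
low_by_product = adjacency(defamation)
```

In this toy data, users `u1` and `u2` give low ratings to both products, which is the kind of shared pattern that lockstep detection looks for. Detecting such groups at scale is the part that ORFEL addresses and that this sketch leaves out.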

Our work addresses relevant Web 2.0 issues, potentially increasing the credibility of online recommendations and preventing losses to both customers and vendors.

Results

We evaluated ORFEL's performance on two aspects, efficacy and efficiency:

For efficacy, we test whether the algorithm is capable of finding at least 95% of the attacks (frauds) in the dataset.
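The efficacy criterion can be read as a detection rate (recall) over the known attacks. The sketch below assumes that attacks carry identifiers and that the set of true attacks in the dataset is known. The function name `detection_rate` and the identifiers are hypothetical and do not come from the paper.

```python
def detection_rate(detected, known_attacks):
    """Fraction of known attacks that appear among the detected ones."""
    known = set(known_attacks)
    if not known:
        raise ValueError("known_attacks must not be empty")
    return len(known & set(detected)) / len(known)


# Toy example: 4 known attacks, 2 of them found, plus 1 false alarm.
rate = detection_rate({"a1", "a2", "x9"}, {"a1", "a2", "a3", "a4"})
meets_target = rate >= 0.95  # the 95% target stated in the text
```

False alarms such as `x9` do not affect this measure. It only counts how many of the known attacks were recovered.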

For efficiency, we measure the runtime of the algorithm as a function of the graph size.
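A runtime-versus-size measurement of this kind can be sketched with a small timing harness. This is only an illustration under stated assumptions: the graphs are random synthetic edge lists, graph size is taken to be the number of edges, and `count_low_ratings` is a trivial stand-in for the detector, not ORFEL itself. All function names here are hypothetical.

```python
import random
import time


def random_bipartite_edges(n_edges, n_users, n_products, seed=0):
    """Generate (user, product, stars) edges with a fixed seed for repeatability."""
    rng = random.Random(seed)
    return [
        (rng.randrange(n_users), rng.randrange(n_products), rng.randint(1, 5))
        for _ in range(n_edges)
    ]


def time_over_sizes(detector, sizes, n_users=1000, n_products=100):
    """Run detector on graphs of increasing edge count; return (size, seconds) pairs."""
    results = []
    for n_edges in sizes:
        edges = random_bipartite_edges(n_edges, n_users, n_products)
        start = time.perf_counter()
        detector(edges)
        results.append((n_edges, time.perf_counter() - start))
    return results


def count_low_ratings(edges):
    """Placeholder detector: counts edges with one or two stars."""
    return sum(1 for _, _, stars in edges if stars <= 2)


timings = time_over_sizes(count_low_ratings, [1_000, 10_000, 100_000])
```

Plotting the resulting pairs on log-log axes is a common way to read off how runtime scales with the number of edges.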