A lot of people ask us about the technology on which Feeds 2.0's powerful personalization engine is based. Even though the Feeds 2.0 personalization algorithms are proprietary and patentable, we believe we can elaborate on the principles behind them since, after all, they build on state-of-the-art techniques in information retrieval and machine learning.
The Feeds 2.0 personalization engine is based on the principle of text categorization. Text categorization is the process of assigning documents to one or more existing categories according to the concepts present in their text. Organizing text into categories lets users narrow the scope of a search submitted to an information retrieval system (e.g. a search engine), explore a collection of documents, and find information relevant to their needs without prior knowledge of the keywords that describe each topic.
You can think of the process of personalizing individual posts coming from various feed sources as a text categorization task. In this case there are just two categories: interesting and not-interesting posts. For each individual user, Feeds 2.0 assigns every new post to that user's interesting or not-interesting group.
In general, the text categorization task can be carried out by machine learning algorithms or computational intelligence techniques. Examples include artificial neural networks (feedforward networks or Self-Organizing Maps (SOM)) and more traditional machine learning algorithms such as C4.5 decision trees, PART decision rules, and Naive Bayes or Markov classifiers.
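To make the two-category task concrete, here is a minimal sketch of one of the traditional algorithms named above, a multinomial Naive Bayes classifier with add-one smoothing. It is only an illustration and not the Feeds 2.0 algorithm: the class name, the tokenizer, and the tiny training set are all assumptions made up for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training posts
        self.vocab = set()           # every word seen during training

    def train(self, posts):
        """Learn word statistics from an iterable of (text, label) pairs."""
        for text, label in posts:
            self.doc_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for word in tokenize(text):
                counts[word] += 1
                self.vocab.add(word)

    def log_score(self, text, label):
        """Return log P(label) plus the sum of log P(word | label)."""
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        counts = self.word_counts[label]
        denom = sum(counts.values()) + len(self.vocab)
        for word in tokenize(text):
            if word in self.vocab:  # words never seen in training are skipped
                score += math.log((counts[word] + 1) / denom)
        return score

    def classify(self, text):
        """Pick the label with the highest log score."""
        return max(self.doc_counts, key=lambda label: self.log_score(text, label))


# Hypothetical training posts for a single user (illustrative only).
TRAINING_POSTS = [
    ("neural networks for text classification", "interesting"),
    ("machine learning improves search engines", "interesting"),
    ("celebrity gossip and fashion trends", "not-interesting"),
    ("fashion week celebrity photos", "not-interesting"),
]

model = NaiveBayes()
model.train(TRAINING_POSTS)
print(model.classify("new machine learning networks"))
print(model.classify("celebrity fashion photos"))
```

In this sketch each user would get their own model, trained only on the posts that user has marked, which mirrors the per-user interesting and not-interesting groups described earlier.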
When the best performance of each algorithm is compared in terms of classification error, experimental results have shown that artificial neural networks are good classifiers for text categorization problems. In general, feedforward networks stand out as the best classifiers, and SOM networks usually perform better than traditional machine learning algorithms.
Feeds 2.0 uses a unique combination of the principles of the above techniques. In particular, it uses advanced statistical natural language processing and feature selection techniques as well as proprietary artificial neural network classifiers. Other factors are also taken into account, for example the sources a particular user likes, or the authors and topics they are interested in. This combination provides an advanced computational intelligence framework that gives the Feeds 2.0 personalization engine a classification accuracy close to 100% for the individual categories of each user.
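The actual classifiers are proprietary, so the following is only a rough sketch of how such a combination could fit together, under several assumptions of our own choosing for this example: feature selection is approximated by ranking words on how differently they occur in liked and disliked posts, the neural network is reduced to a single logistic unit (the simplest feedforward network), and the "sources a user likes" factor is one extra input. All function names, sources, and posts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def select_features(posts, top_k):
    """Keep the top_k words whose document frequency differs most between classes."""
    liked, disliked = Counter(), Counter()
    for text, _source, label in posts:
        target = liked if label == 1 else disliked
        target.update(set(tokenize(text)))  # count each word once per post
    words = set(liked) | set(disliked)
    # Sort by descending frequency gap; ties are broken alphabetically.
    ranked = sorted(words, key=lambda w: (-abs(liked[w] - disliked[w]), w))
    return ranked[:top_k]


def vectorize(text, source, vocab, liked_sources):
    """Binary word features plus one input flagging a liked source."""
    tokens = set(tokenize(text))
    features = [1.0 if word in tokens else 0.0 for word in vocab]
    features.append(1.0 if source in liked_sources else 0.0)
    return features


def predict(weights, bias, features):
    """Sigmoid output of a single logistic unit, in the range (0, 1)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=200, rate=0.5):
    """Fit the unit with stochastic gradient descent on the log loss."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = predict(weights, bias, features) - label
            bias -= rate * error
            weights = [w - rate * error * x for w, x in zip(weights, features)]
    return weights, bias


# Hypothetical history for one user: (text, source, 1 = interesting, 0 = not).
POSTS = [
    ("neural networks for text classification", "ml-blog", 1),
    ("machine learning improves search engines", "ml-blog", 1),
    ("text classification with neural networks", "research-feed", 1),
    ("celebrity gossip and fashion trends", "gossip-site", 0),
    ("fashion week celebrity photos", "gossip-site", 0),
    ("celebrity fashion search trends", "gossip-site", 0),
]
LIKED_SOURCES = {"ml-blog"}  # assumed to come from the user's preferences

VOCAB = select_features(POSTS, top_k=4)
EXAMPLES = [(vectorize(t, s, VOCAB, LIKED_SOURCES), y) for t, s, y in POSTS]
WEIGHTS, BIAS = train(EXAMPLES)

new_post = vectorize("networks for classification of feeds", "ml-blog", VOCAB, LIKED_SOURCES)
print("interesting" if predict(WEIGHTS, BIAS, new_post) > 0.5 else "not-interesting")
```

A toy like this says nothing about real-world accuracy; it only shows how selected text features and a user preference signal can feed the same classifier.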
In our next post we will elaborate more on the Feeds 2.0 Recommendation feature.