Web personalization is the process of tailoring a website's content to the needs of each user. Most websites today personalize their content in this way. For instance, popular platforms such as YouTube and Netflix present different content to different users. If a user is interested in sports videos, they will be shown sports-related videos.
In this post, we discuss web personalization with special reference to research. The following topics are covered:
Page Contents
- Introduction
- Generations of the Web
- Web Mining
- Web Mining Taxonomy
- Web Personalization
- Research Gap
Introduction to Web Personalization
With the large amount of information available on the web, extracting the desired information has become increasingly difficult. This is where semantic web and personalization techniques play an important role in mining valuable information. Web personalization implies the delivery of dynamic and personalized content, such as textual elements, links, product recommendations, and advertisements, customized to the needs or interests of a particular user or a segment of users. The critical goal of an information retrieval system is to provide the user with relevant web documents based on their past behaviour. On receiving documents from the recommender system, users not only satisfy their requirements but also provide implicit feedback to the system.
Five Generations of the Web:
- Web 1.0
- Web 2.0
- Web 3.0
- Web 4.0
- Web 5.0
Web 1.0 is also known as the read-only web, where users could only read information from the web. In this first generation, users could not communicate with information providers or with other users on the web.
Web 2.0 is also known as the read-write web. In this generation, users can read as well as write on the web. Examples of the second generation of the web are blogs, Twitter, and Facebook.
Web 3.0 is popularly known as the semantic web. It introduced the concepts of the semantic web and personalization.
Web 4.0 and Web 5.0 are based on artificial intelligence.
Defining Web Mining
When data mining techniques are applied to information gathered from the web, the whole process is referred to as web mining.
In other words, web mining is the use of data mining techniques to extract useful information from the web. It is divided into three categories: Web Content Mining, Web Structure Mining, and Web Usage Mining.
Taxonomy of Web Mining
Web Content Mining is the scanning and mining of text, images, and videos from web pages. It is closely related to data mining.
Web Structure Mining identifies the relationships between web pages and how the pages are interlinked. Link-analysis algorithms used for it include:
- PageRank Algorithm
- HITS Algorithm
- Weighted PageRank Algorithm
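To make the first of these algorithms concrete, here is a minimal sketch of PageRank computed by power iteration. The four-page site, the `pagerank` function name, the damping factor of 0.85, and the fixed iteration count are all assumptions chosen for illustration; a production system would typically use a tested graph library instead.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        # Every page shares its current rank equally among the pages it links to.
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


# Hypothetical four-page site (illustrative data only).
links = {
    "home": ["about", "products"],
    "about": ["home"],
    "products": ["home", "contact"],
    "contact": ["home"],
}

ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{page}: {score:.3f}")
```

In this toy graph every other page links back to "home", so it ends up with the highest score. The scores always sum to 1, which makes them easy to compare across pages.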
Web Usage Mining reveals a user's identity and browsing behaviour on the web. A user can be identified by IP address, cookies, or login authentication. It involves three steps:
- Data Collection and Preprocessing
- Pattern Discovery
- Pattern Discovery
- Pattern Analysis
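The three steps above can be sketched on a toy access log. This is only an illustration under stated assumptions: the log records, the 30-minute session timeout, and the function names `build_sessions` and `frequent_transitions` are hypothetical, and users are identified by IP address alone.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Hypothetical, simplified log records: (ip, timestamp, url).
RAW_LOG = [
    ("10.0.0.1", "2024-01-01 10:00:00", "/home"),
    ("10.0.0.1", "2024-01-01 10:02:00", "/sports"),
    ("10.0.0.2", "2024-01-01 10:03:00", "/home"),
    ("10.0.0.1", "2024-01-01 10:05:00", "/sports/football"),
    ("10.0.0.2", "2024-01-01 10:04:00", "/news"),
    ("10.0.0.1", "2024-01-01 11:30:00", "/home"),
    ("10.0.0.1", "2024-01-01 11:31:00", "/sports"),
]


def build_sessions(log, timeout=timedelta(minutes=30)):
    """Data collection and preprocessing: group requests per IP, split on inactivity."""
    by_user = defaultdict(list)
    for ip, stamp, url in log:
        by_user[ip].append((datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"), url))
    sessions = []
    for ip, visits in by_user.items():
        visits.sort()  # chronological order per user
        current, last_time = [], None
        for when, url in visits:
            # A long gap since the previous request starts a new session.
            if last_time is not None and when - last_time > timeout:
                sessions.append((ip, current))
                current = []
            current.append(url)
            last_time = when
        sessions.append((ip, current))
    return sessions


def frequent_transitions(sessions):
    """Pattern discovery: count page-to-page transitions inside each session."""
    counts = Counter()
    for _, pages in sessions:
        counts.update(zip(pages, pages[1:]))
    return counts


sessions = build_sessions(RAW_LOG)
transitions = frequent_transitions(sessions)

# Pattern analysis: keep only transitions observed at least twice.
for (src, dst), count in transitions.most_common():
    if count >= 2:
        print(f"{src} -> {dst}: {count}")
```

Identifying users by IP address is the weakest of the three identification methods mentioned above, since several people can share one address; cookies or login data would give cleaner sessions.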
Defining Web Personalization
Web personalization is divided into three main phases:
- Learning: Implicit Learning and Explicit Learning.
- Matching: Collaborative Filtering, Content Based Filtering, Rule Based, Hybrid Filtering.
- Recommendation.
Collaborative Filtering: The basic assumption of CF is that people who had similar tastes in the past will also have similar tastes in the future. One of its earliest definitions is “collaboration between people to help one another perform filtering by recording their reactions to documents they read”.
Content-Based Filtering: CBF is based on the assumption that people who liked items with certain attributes in the past will like the same kind of items in the future as well. It uses item features to compare items with user profiles and provide recommendations. Recommendation quality is limited by the features selected to describe the recommended items.
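As a rough illustration of this assumption, the sketch below implements a simple user-based variant of collaborative filtering: it measures how similar two users are with cosine similarity over the items both have rated, then predicts ratings for unseen items as a similarity-weighted average. The ratings data, the user and movie names, and the `recommend` function are hypothetical, and cosine similarity is only one of several possible similarity measures.

```python
from math import sqrt

# Hypothetical explicit ratings on a 1-5 scale; a missing key means "not rated".
ratings = {
    "alice": {"Inception": 5, "Titanic": 1, "Matrix": 4},
    "bob": {"Inception": 4, "Titanic": 2, "Matrix": 5, "Avatar": 5, "Notebook": 2},
    "carol": {"Inception": 1, "Titanic": 5, "Avatar": 2, "Notebook": 5},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=2):
    """Predict ratings for unseen items as a similarity-weighted average."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


for item, score in recommend("alice", ratings):
    print(f"{item}: predicted rating {score:.2f}")
```

Here alice's past ratings resemble bob's far more than carol's, so bob's opinion of the movies she has not seen carries more weight in the prediction.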
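A minimal sketch of content-based filtering follows, assuming each item is described by a small hand-made feature vector. The item names, the three features (sports, news, music), and the helper functions are hypothetical. The user profile is simply the average of the feature vectors of the items the user liked.

```python
from math import sqrt

# Hypothetical item features: weights over (sports, news, music).
ITEM_FEATURES = {
    "football_highlights": [1.0, 0.0, 0.0],
    "tennis_final": [0.9, 0.1, 0.0],
    "election_report": [0.0, 1.0, 0.0],
    "concert_live": [0.0, 0.0, 1.0],
    "sports_news_daily": [0.6, 0.4, 0.0],
}


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_profile(liked_items):
    """User profile: the average feature vector of the liked items."""
    vectors = [ITEM_FEATURES[item] for item in liked_items]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def recommend(liked_items, top_n=2):
    """Rank items the user has not seen by similarity to the user profile."""
    profile = build_profile(liked_items)
    candidates = {
        item: cosine(profile, features)
        for item, features in ITEM_FEATURES.items()
        if item not in liked_items
    }
    return sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


for item, score in recommend(["football_highlights"]):
    print(f"{item}: similarity {score:.2f}")
```

Because the profile is built only from what the user already liked, a user who liked one sports video is offered more sports content and nothing else, which illustrates how the chosen features bound the quality of the recommendations.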
Research Gap in Web Recommendation:
- Despite their huge success, recommendation techniques suffer from several limitations. The challenges associated with Collaborative Filtering are the cold start problem, sparsity, and scalability.
- Content-Based Filtering also exhibits problems such as overspecialization, difficulties in content extraction, and data sparsity. These problems reduce the quality of the recommendations both techniques provide.
- Moreover, to perform CBF, the data must first be structured (preprocessed), as most of the available data is in unstructured form.
- Selecting a movie often requires users to perform numerous operations when faced with vast resources from online movie platforms.
- The recommendations of a content-based system are based on individual information and ignore contributions from other users.
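The sparsity and cold start problems listed above can be measured on a toy rating matrix. The sketch below is illustrative only: the users, the catalog, and the function names are hypothetical, and "cold start" is taken here to mean a user or item with no ratings at all.

```python
# Hypothetical ratings; "u3" is a new user and "m5" a new item.
ratings = {
    "u1": {"m1": 5, "m2": 3},
    "u2": {"m2": 4},
    "u3": {},
    "u4": {"m1": 2, "m3": 5, "m4": 1},
}
catalog = ["m1", "m2", "m3", "m4", "m5"]


def sparsity(ratings, catalog):
    """Fraction of user-item cells that hold no rating."""
    filled = sum(len(user_ratings) for user_ratings in ratings.values())
    return 1.0 - filled / (len(ratings) * len(catalog))


def cold_start_users(ratings, min_ratings=1):
    """Users with too little history for collaborative filtering to match."""
    return [user for user, rated in ratings.items() if len(rated) < min_ratings]


def cold_start_items(ratings, catalog, min_ratings=1):
    """Items that too few users have rated to be recommended by others' tastes."""
    counts = {item: 0 for item in catalog}
    for rated in ratings.values():
        for item in rated:
            counts[item] += 1
    return [item for item, count in counts.items() if count < min_ratings]


print(f"sparsity: {sparsity(ratings, catalog):.0%}")
print("cold start users:", cold_start_users(ratings))
print("cold start items:", cold_start_items(ratings, catalog))
```

Even in this tiny example most of the matrix is empty, and real catalogs with many more items per user are usually far sparser.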