Project SAMVIDHA
Personalized Content Access and Presentation
The objective of this project is to select appropriate material for presentation to a user based on the user's profile, interests and capabilities. A large volume of information is available in electronic form. To improve the absorption and assimilation of information, our personalization system uses domain-specific knowledge to identify the materials most relevant to the requirements of a given user at a given time.

The application concerns Internet access for children in rural schools in India. The need to select the most relevant information is greater here because bandwidth in these schools is low. Because students in different grades understand different topics at different levels, we need to carefully select the content most appropriate for each student.

The goal of this project is to develop technology for personalized content access, and the immediate application objective is to make the wealth of information on the Internet accessible to rural schools. The main barriers are the high cost of Internet access, the difficulty of letting users specify their requirements naturally, and the difficulty of finding relevant material for them. This project aims to break the accessibility barrier by bridging the gap between the user and well-formed queries, identifying relevant information, and packaging a small quantity of relevant material in response to a user's information requirements.

Our objective is to identify the relevant aspects of domain knowledge and the structure of the user profile needed for this, and to develop technology that uses this knowledge in retrieving, analyzing and presenting appropriate content. The research aims at modeling the information requirements of an individual in certain domains, as well as modeling domain knowledge and the relationships within the domain.
This knowledge is used to select and present appropriate material that is relevant to the needs of a given user. Most schools in rural India cannot afford a high-bandwidth connection to the Internet. Our system aims to provide offline Internet access to the schools, as in the TeK project, where Internet access is carried out through email via a central server. This project extends the TeK project by adding other features:
• Accessibility
The users are provided with an offline Internet browser, which they use to submit search queries. These queries are transmitted via email to a central server that has a high-bandwidth connection. The central server is responsible for query processing, content retrieval, content analysis, filtering and packaging. It packages content appropriate for the user, and this response is emailed to the user at the school.
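As an illustration of the email-based exchange described above, here is a minimal sketch of how a query could be wrapped in an email on the school side and unpacked on the server side. The JSON body, the `SEARCH-QUERY` subject marker, the field names and the addresses are all assumptions made for this example; the actual message format used by the project is not specified here.

```python
import json
from email import message_from_string, policy
from email.message import EmailMessage


def build_query_email(user_id, query, sender, server):
    """Package one search query as an email addressed to the central server.

    The JSON layout and the subject marker are hypothetical.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = server
    msg["Subject"] = "SEARCH-QUERY"  # hypothetical marker for query mails
    msg.set_content(json.dumps({"user": user_id, "query": query}))
    return msg


def parse_query_email(raw):
    """Recover the user id and query from the raw text of a query email."""
    msg = message_from_string(raw, policy=policy.default)
    return json.loads(msg.get_content())
```

On the school side, the result of `build_query_email(...).as_string()` would be queued until a connection is available; on the server side, `parse_query_email` turns the received text back into a dictionary that the query processor can use.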
The system needs to represent domain knowledge of the subjects of interest and the profiles of the different users. Domain knowledge for each subject is stored as an ontology of concepts, the relationships between the concepts, and the list of words indicative of each concept. The user requirements are stored as a set of topics. Each topic is a collection of concepts and their associated importance with respect to the topic. The common requirements of a group can be taken as their curriculum content. Individual variations among students are captured by the user profile, which records each student's individual interests and capabilities.

The system works as follows. The users in schools are provided with a few terminals with an offline browser to enter their queries. Domain knowledge and the user profile may be used to help students formulate better and more focused queries. The queries are transferred to the central server when a connection takes place. The server retrieves results by directing the queries to search engines. The relevance calculator analyses the sites returned and estimates their relevance to the particular user based on domain knowledge and user profile information. The clusterer clusters the relevant web pages. The server decides whether some sites should be explored further to get more relevant pages. The site downloader explores some sites by best-first crawling and retrieves further pages of interest. The presentation system selects a set of pages and packages them appropriately for presentation to the user. The package is emailed to the user.

The technology developed is general and can be used to provide personalized access to information to any individual, provided we have domain knowledge and a way of learning the user profile. We would like to test this on the typical information access needs of typical rural people. We would also like to work on the automated acquisition of domain knowledge and user profile information.
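The relevance calculation and the best-first exploration of a site can be sketched as follows. This is only an illustrative sketch, not the project's implementation: the concept words, topic weights, profile multipliers, the scoring formula (weighted fraction of indicative words) and the rule of prioritizing a link by the relevance of the page it was found on are all assumptions, and the site is a small in-memory dictionary standing in for real page downloads.

```python
import heapq
import re

# Hypothetical domain knowledge: each concept and the words indicative of it.
CONCEPT_WORDS = {
    "photosynthesis": {"photosynthesis", "chlorophyll", "sunlight"},
    "plant_parts": {"leaf", "root", "stem"},
}
# Hypothetical topic: concepts with their importance for the topic.
TOPIC = {"photosynthesis": 1.0, "plant_parts": 0.5}
# Hypothetical user profile: per-concept interest multipliers (default 1.0).
PROFILE = {"plant_parts": 2.0}

# Toy site: page name -> (page text, names of linked pages).
SITE = {
    "home": ("Welcome to our science pages", ["plants", "sports"]),
    "plants": ("Leaf and chlorophyll use sunlight for photosynthesis", ["leaf"]),
    "sports": ("Cricket scores and football news", ["cricket"]),
    "leaf": ("The leaf is where photosynthesis happens", []),
    "cricket": ("Cricket match report", []),
}


def relevance(text, topic=TOPIC, profile=PROFILE, concept_words=CONCEPT_WORDS):
    """Weighted count of concept-indicative words, divided by text length."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    score = 0.0
    for concept, importance in topic.items():
        hits = sum(1 for w in words if w in concept_words[concept])
        score += importance * profile.get(concept, 1.0) * hits
    return score / len(words)


def best_first_crawl(site, start, max_pages=3):
    """Visit up to max_pages pages, following links from relevant pages first."""
    # heapq is a min-heap, so priorities are stored negated.
    frontier = [(0.0, start)]
    seen = {start}
    visited = []
    while frontier and len(visited) < max_pages:
        _, page = heapq.heappop(frontier)
        text, links = site[page]  # stands in for downloading the page
        score = relevance(text)
        visited.append((page, score))
        for link in links:
            if link in site and link not in seen:
                seen.add(link)
                # Assumed heuristic: a link inherits its parent's relevance.
                heapq.heappush(frontier, (-score, link))
    return visited
```

Calling `best_first_crawl(SITE, "home")` visits `home`, then `plants`, then `leaf`, and never reaches the sports pages within the three-page budget, which is the kind of saving that matters when the final package must be small.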
Sponsor: Media Lab Asia
Principal Investigator: Prof Sudeshna Sarkar
Co-Investigator: Anupam Basu