Ontological Technologies for User Modeling

Sergey Sosnovsky

sas15@7372b8920b4c2e3f56276314

State-of-the-Art Paper

Submitted to the Information Science PhD. Committee of the School of Information Sciences, University of Pittsburgh as Part of Requirements for the Comprehensive Examinations

November 29, 2007

Last modified: March 13, 2008

Table of Contents

Ontological Technologies for User Modeling
Abstract
Introduction
1 User Modeling
1.1 Definition
1.2 History of User Modeling
1.3 User Modeling Dimensions
1.3.1 Knowledge, Beliefs, Skills, and Background
1.3.2 Interests and Preferences
1.3.3 Goals, Plans, Tasks and Needs
1.3.4 Demographic Information
1.3.5 Emotional State
1.3.6 Context
1.4 User Model Representation
1.4.1 Overlay User Modeling
1.4.2 Keyword-based User Modeling
1.4.3 Stereotype User Modeling
1.4.4 Other Technologies for User Model Representation
1.5 User Model Elicitation
1.5.1 User Model Inference based on Rich Feedback
1.5.2 Unobtrusive User Modeling
1.5.3 Open, Editable and Interactive User Modeling
2 Web-Ontologies
2.1 Formal Ontologies
2.2 Ontologies on the Semantic Web
2.3 Ontology Representation and Development
2.3.1 Resource Description Framework (RDF) and RDF-Schema
2.3.2 Web Ontology Language
2.3.3 Other Ontology Languages
2.3.4 Other Semantic Web Representation Technologies
2.3.5 Ontology Development Tools
2.4 Ontology Mapping
2.4.1 Ontology Mapping based on a Reference Ontology
2.4.2 Ontology Mapping based on Lexical Information
2.4.3 Ontology Mapping based on Ontology Structure
2.4.4 Ontology Mapping based on Instance Corpora
2.5 Ontology Visualization
2.5.1 Visualization of Ontologies
2.5.2 Ontology-based Information Visualization
2.6 Ontology-based Knowledge Acquisition
2.6.1 Ontology-based Annotation
2.6.2 Ontology Learning
3 Ontologies Meet User Modeling
3.1 Ontologies for User Model Representation
3.1.1 Ontology-based Overlay User Modeling
3.1.2 Personal Ontology Views
3.1.3 Ontologies of User Profiles
3.1.4 Ontologies for Stereotype User Modeling
3.1.5 User Modeling Based on Lexical Ontologies
3.2 Ontologies for User Model Elicitation
3.2.1 Ontology-based Population of User Model
3.2.2 Ontology-based Learning of User Model Structure
3.2.3 Ontology-based Open, Editable and Interactive User Modeling
3.3 Ontology-based User Model Interoperability
3.3.1 Integration of Different User Modeling Approaches
3.3.2 Resolution of Domain Discrepancies
3.3.3 Resolution of User Profile Discrepancies
3.3.4 Resolution of Scale Discrepancies
4 Experiment Design
4.1 Introduction
4.2 Ontology Mapping for Student Models Translation
4.2.1 Domain Ontologies and Content Modeling
4.2.2 Ontology Mapping Algorithm
4.3 Experiment
4.3.1 Data Collection
4.3.2 Student Model Validation
4.3.3 Student Model Translation Based on Automatic Ontology Mapping
4.4 Discussion
4.5 Conclusions
References
Appendix I: Pre-quiz on C Programming Knowledge
Appendix II: Target Set of C Concepts
Appendix III: Extra-Credit Java Questions
Appendix IV: Target Set of Java Concepts

Abstract

This paper brings together research from two different fields, user modeling and Web-ontologies, in an attempt to demonstrate how recent semantic trends in Web development can be combined with modern user modeling technologies. Over the last several years, a number of user-adaptive systems have exploited ontologies for semantics representation, automatic knowledge acquisition, domain and user model visualization, and the creation of interoperable and reusable architectural solutions. Before discussing these projects, we first give an overview of the underlying user modeling and ontological technologies. As an example of a project employing ontology-based user modeling, we present an experiment design for the translation of overlay student models between related domains by means of ontology mapping.

Copyright (c) 2007 Sergey Sosnovsky. Last modified 5/12/2008.

Introduction

Among the promising directions of World Wide Web (WWW) development, we can name two paradigms that share the ultimate goal of providing users with more efficient access to online information and services: the Adaptive Web (AW) and the Semantic Web (SW). The first aims at organizing different kinds of personalized information experiences, from adaptive e-shopping to online intelligent tutoring. The second proposes a set of technologies for meaningful (semantically enriched) description and automatic discovery of data and knowledge on the Web.

Bringing these two fields together would, in our opinion, result in a mutually beneficial synergy. AW challenges can introduce interesting application areas for SW technologies, while the SW vision, in turn, can open new perspectives for AW research.

Such collaboration also looks promising because AW and SW have common roots in several areas of Artificial Intelligence (AI), such as Knowledge Representation, Agent-based Systems, and Natural Language Processing.

In this review we do not attempt to describe all aspects of this technology fusion; instead, we concentrate on analyzing novel projects operating on the border of two core subfields of AW and SW: user modeling and Web-ontologies.

User modeling is a much older area of research than the AW and the WWW itself. The first adaptive systems modeling users were developed as long ago as the 1970s. Nevertheless, user modeling remains an active field to this day and constitutes one of the major and, arguably, the most challenging parts of creating adaptive Web applications. The precision of the modeling assumptions about a user in many respects defines the effectiveness of adaptive systems in general. An incorrect interpretation of a user leads to wrong adaptive decisions, which may result in the user’s frustration, loss of trust, decreased motivation to use the system, etc. Adequate representation of knowledge about a user, effective elicitation of user-related information, and utilization of this information for organizing coherent and meaningful adaptation are crucial factors for the success of AW systems.

The notion of a Web-ontology is central for the SW initiative. Automatic discovery of information on the Web and its machine-based interpretation are not possible unless the information is semantically-enriched with metadata providing the shared meaning of the content.

Ontologies are the instruments to design and convey such meaning. Ontologies have been studied for many years, first by philosophers and logicians, and later by researchers in the fields of Artificial Intelligence (AI) and Knowledge Representation (KR). The SW, however, gave a new spin to this study; nowadays, the field of Web-ontologies is attracting considerable interest. The Semantic Web Activity of the World Wide Web Consortium (W3C) leads and supports these efforts by developing a set of standards for ontology representation and processing on the Web [W3C, 2001b].

Recently, a lot of research activity has been generated on the border of user modeling and Web-ontologies. Ultimately, both disciplines attempt to model real-world phenomena qualitatively: ontologies model a particular area of knowledge, while user modeling captures the internal state of a human user. Many user modeling approaches exploit the content-based characteristics of a user (the user’s knowledge, interests, etc.) and hence can directly benefit from the high-quality domain models provided by ontologies. Besides, as the majority of user modeling projects have been deployed on the Web, and Web-ontologies are becoming the de facto standard for WWW-based KR, the cooperation between user modeling and Web-ontologies seems inevitable. The following paragraphs provide a brief outline of the rest of this paper.

The first chapter gives an overview of user modeling. It starts with the definition and historical background of the field. The main part of the chapter is devoted to the description of the most important technologies of user model representation and elicitation. The list, though, does not claim to be complete. Several cutting-edge problems, such as user modeling in virtual reality, collaborative filtering, and the privacy and security issues of user modeling, are not covered.

The set of technologies is chosen to demonstrate the applicability of ontologies for user modeling.

The second chapter analyzes the ontological technologies recently developed in the framework of the SW initiative. It first discusses the origins of ontological research and the requirements for Web-ontologies, and then summarizes different aspects of Web-ontologies, including ontology representation, ontology acquisition, ontology-based annotation, and ontology mapping. Some important topics, such as ontology reasoning and querying, ontology versioning, ontology validation and evaluation, and ontology robustness and scalability, are currently in the focus of SW research but are out of the scope of this review, as they are not directly related to the problems of user modeling and adaptive Web technologies.

The third and central chapter examines a set of projects that merge user modeling and ontological technologies. Such features of Web-ontologies as a controlled vocabulary of terms, available formats for sharable knowledge representation, a well-defined structure of domain knowledge, and support for logical reasoning have been appreciated by user modeling researchers and utilized for developing adaptive Web systems. The chapter attempts to draw connections between certain technologies, such as open user modeling and ontology visualization, overlay user modeling and ontology representation, and unobtrusive user modeling and ontology learning. As this area of research is rather young, the chapter may seem unbalanced. Some of its sections are more detailed than others, and for some problems solutions have not yet been proposed. More research effort is needed for ontology-based user modeling technologies to mature.

Finally, chapter 4 proposes an experiment design for the translation of overlay models of student knowledge between related domains based on domain ontology mapping. If two learning domains share a sufficient volume of knowledge, like C and Java programming, it may be possible to mediate between the overlay user models of a student collected in these domains.

Consequently, if an adaptive educational system does not know about a student’s knowledge of Java, but has access to the model of his/her C programming knowledge, it could use that model to populate the relevant parts of the student model for the Java domain. With these two models implemented as overlays over C and Java ontologies, we can apply ontology mapping techniques to identify the set of relevant concepts that can be used as a basis for user model translation. The quality of the automatic translation can be evaluated by comparison with a manual translation. The proposed subjects for the experiment are students of an introductory Java course.

1 User Modeling

The field of user modeling has more than 30 years of research behind it. It has come a long way from lab projects [Carbonell, 1970] to the development of dedicated commercial user modeling servers supporting millions of users (see [Fink & Kobsa, 2000] for a comprehensive review). This chapter provides a brief overview of the evolution of efforts in this field and lists the most important user modeling technologies developed over the years. It focuses especially on the user modeling trends that can benefit from integration with the field of Web-ontologies.

1.1 Definition

Many authors have proposed their explanations for the concepts of user model and user modeling. By restating [Wahlster & Kobsa, 1989] to adjust to a broader class of user-adaptive systems we can define a user model as a

“…knowledge source in an intelligent system which contains assumptions on different aspects of the user that may be relevant to the system’s adaptive behavior. These assumptions must be separable from the rest of system’s knowledge.”

Consequently, user modeling is the field of Information Science dealing with elicitation, representation, and utilization of user models.

1.2 History of User Modeling

As a research field User Modeling originated from several areas of Information Science and Artificial Intelligence. Among its main ancestors are: Intelligent Computer Aided Instruction [Carbonell, 1970], HCI (see [Fisher, 2001] for a review), Expert Systems [Clancey & Letsinger, 1981], Computer Games [Burton & Brown, 1976] and Natural Language Systems [Perrault et al., 1978].

Table 1 provides a historical perspective of the field by listing four main stages of user modeling development along with the distinctive features of every stage. The stages’ titles and time periods are rather conventional. They reflect the main trends of the field’s development and are not meant to set exact boundaries.

Table 1: History of User Modeling Development

Time | Stage | Features
1970s | Beginning | First user-adaptive systems appear. Major technologies are defined: overlay user modeling and stereotype user modeling. User modeling research stays in the lab. Adaptive systems often do not make a clear distinction between the functionality of user modeling components and other parts of the system.
1980s | Maturity | User modeling separates from its ancestors as a research field. First dedicated user modeling systems and shells are developed: [Finin & Drager, 1986]. First user modeling workshop (UM’1986), and first book ([Kobsa & Wahlster, 1989]).
1990s | WWW | Web-era in all areas of IS including user modeling. WWW becomes the major infrastructure for information and service delivery. Multiple adaptive systems appear on the Web. Web-based user modeling servers. Commercial user modeling products: [Horvitz et al., 1998], [Fink & Kobsa, 2000].
2000s | New trends | Two main initiatives in Web development, the Social Web (Web 2.0) and the Semantic Web, influence the development of user modeling technologies on the Web. Another influential trend, due to the spread of mobile devices and wireless technologies, is user modeling for pervasive and ubiquitous applications.

The rest of this chapter analyzes the main technologies, most of which were developed during the first three periods, while chapter 3 describes user modeling projects influenced by one of the modern trends, namely the Semantic Web.

1.3 User Modeling Dimensions

The core idea of adaptation is based on the assumption that differences in some user characteristics should influence the individual utility of the service/information provided; hence, if the system’s behavior is tailored according to these characteristics, the value of the system will increase. Some adaptive systems store individual information only for a single characteristic, while others model users along multiple dimensions. It is hardly possible to list all the dimensions utilized by all user modeling systems over the years. This section describes the ones that are most important from our point of view.

1.3.1 Knowledge, Beliefs, Skills, and Background

These characteristics are especially important for adaptive systems that model students. Among all kinds of user-adaptive systems, adaptive educational systems (AES) have the longest history of research, and, arguably, this class of adaptive systems is the most diverse and numerous. To name a few, it includes ICAI systems (e.g. Scholar [Carbonell, 1970]), Cognitive Tutors (e.g. Lisp-Tutor [Corbett & Anderson, 1994]), and Adaptive Educational Hypermedia Systems (e.g. Interbook [Brusilovsky et al., 1996]). For any of these systems, the student’s knowledge is the main characteristic defining the system’s adaptivity.

The most popular approach to modeling student knowledge relies on a fine-grained conceptual structure of the learning domain; thus the aggregate knowledge model consists of knowledge assessments for particular domain concepts. Different systems handle the evidence of incorrect knowledge or misconceptions differently. Some AESs simply accumulate this negative evidence in the knowledge assessment for the corresponding concepts [Galeev et al., 1994]; other systems support a special bug model [Vassileva, 1997], or mark the concept as having a misconception/bug [Mabbott et al., 2004]. Some authors argue that the student model of an AES should not directly differentiate between correct and incorrect knowledge of a concept and should instead treat both as student beliefs [Self, 1988]. Intelligent tutoring systems (ITS) in general, and ACT-R cognitive tutors in particular, support the modeling of cognitive skills [Corbett & Anderson, 1994], which represent procedural (“how-to”) knowledge, as opposed to concepts, which represent declarative (“what-is”) knowledge. Recently, some attempts have been made to reflect students’ meta-cognitive skills (general problem-solving skill, help-seeking strategy, self-assessment ability, etc.) [Roll et al., 2005].

Another relevant characteristic of a user is her/his background. Traditionally, background is defined as relevant experience gained outside the system before the user started working with it. Unlike knowledge models, a background model is usually static (it does not change over time) and coarse-grained. Most typically, it is represented as a single parameter with several possible levels (e.g. “novice”, “advanced”, “expert”) [Horvitz et al., 1998] or as a set of stereotypes [Boyle & Encarnacion, 1994].

1.3.2 Interests and Preferences

These two characteristics have often been used as synonyms, especially in the context of AW systems. Several classes of adaptive systems have been developed to assist information harvesting on the Web (adaptive recommenders [Pazzani & Billsus, 2007], adaptive search engines [Micarelli et al., 2007], adaptive browser agents [Lieberman, 1995]). The user characteristics most utilized by such systems are the user’s information preferences or information interests. When modeling user preferences/interests, systems typically distinguish between long-term and short-term models. Long-term interests are relatively stable; they evolve slowly over the entire period of the user’s work with the system, or are provided by the user explicitly in the form of general categories. A short-term model of interests is dynamic and usually populated only during a single session, reflecting the user’s current information task. After the session is over, the influence of the short-term model on the system’s adaptation ends. Sometimes systems need to model user preferences that are not related to the main task of the system, e.g. a preference for a certain type of interface or a certain language [Kay & Kummerfeld, 1994].

1.3.3 Goals, Plans, Tasks and Needs

Modeling of a user’s goals and plans has been widely used in adaptive dialog systems. Knowing what situation a user is trying to achieve (the goal) and what sequence of actions s/he is going to take on the way to the desired state of affairs (the plan) is essential for such systems [Kass & Finin, 1988]. Examples of such goals are “To search for certain information”, “To express an opinion”, and “To get help”. Very close to these modeling dimensions is the concept of a task or a need. Modeling a user’s information need or information-seeking task is a very popular approach employed by different adaptive systems on the Web. Thus, for adaptive recommender systems, [McNee et al., 2006] argue that recommendations that do not take into account the user’s information task/need are likely to be meaningless and useless, or even harmful, since they would eventually lead to a lack of trust in the system. Adaptive educational systems generally model a student’s goal/task from a different perspective. For them, the ultimate goal of the student is certain: to learn the material. Therefore, AESs do not try to recognize the general goal of a user; instead, they try to find the best strategy leading to the most efficient learning (see, for example, [Conati et al., 1997]).

1.3.4 Demographic Information

For some kinds of adaptive systems, it is vital to be aware of the demographic characteristics of a user, from very basic features, like gender, age or native language, to more complex socio-cultural parameters, such as level of formal education and family income. Adaptation to demographic information is widely used in adaptive e-commerce systems [Bowne, 2000] and personalized ubiquitous applications [Fink & Kobsa, 2002]. However, it has also been recognized that modeling a user’s demographics can be important in educational settings [Desimone, 1999].

1.3.5 Emotional State

An adaptive system capable of modeling certain user emotions obtains one more source for properly tailoring and justifying its behavior. For example, an AES that recognizes a decrease in a student’s motivation can change the content or the difficulty level of the problems, or switch to another learning activity. Recently, a new class of technologies combined under the name “affective computing” has gained a lot of interest in the field of user modeling [Picard & Klein, 2002]. Many developers of AESs have come to understand that, to provide adequate adaptation, it is not enough to model the cognitive state of a learner; the system must also recognize her/his emotional state. For example, [Rodrigo et al., 2007] have found that such emotions as boredom and confusion lead students to off-task behavior, namely gaming the adaptive system.

One of the main challenges in affective computing is how to recognize different kinds of user’s emotions [Picard, 2003]. Some progress towards this goal has been made in the area of educational games [Conati, 2002], dialog systems [Litman & Forbes-Riley, 2004] and multimedia environments [Bianchi-Berthouze & Lisetti, 2002].

1.3.6 Context

User context modeling is becoming an important direction of research as interest grows in the development of adaptive applications for pervasive and ubiquitous settings. The context of a user is a very broad concept; it can include any information about the user’s location, time, physical and social environment, the device being used, etc. Currently, the most popular class of applications adapting to the user’s context comprises various kinds of personalized guides and tours, both indoor and outdoor. For example, the GUIDE system [Cheverst et al., 2000] creates an adaptive tour of the city of Lancaster by utilizing the current location of the user (to navigate to the most appropriate city attraction) and the time of day (to match the hours when the attractions are open to visitors). [Eisenstein et al., 2000] propose an architecture for adapting information presentation to different kinds of mobile interfaces. [Heckmann et al., 2005b] suggest a framework for modeling various kinds of contexts, such as being late for an airplane vs. sitting in a chair and waiting for boarding.

1.4 User Model Representation

The previous section answered the question “What can be modeled in the adaptive system?” Next, we demonstrate how the listed characteristics of a user can be modeled. Two classic approaches are discussed: overlay user modeling and user modeling via stereotypes. In addition, we describe the keyword-based representation of user models, which became popular as Web-based adaptive information retrieval technologies matured. Finally, we briefly summarize other important technologies for user model representation, which are less relevant to the main topic of this survey.

1.4.1 Overlay User Modeling

This is the oldest approach to user model representation. Traditionally, it was employed by different kinds of AES for modeling student knowledge as a subset of domain expert knowledge (see Figure 1). Carr and Goldstein in their classic paper [Carr & Goldstein, 1977] provided the first definition of overlay user modeling and coined the term itself:

"Overlay modeling is a technique for describing a student's problem solving skills in terms of a modular program designed to be an expert for the given domain. The model is an overlay on the expert program in that it consists of a set of hypotheses regarding the student's familiarity with the skills employed by the expert."

Figure 1. Overlay Model of Student knowledge (from [Beck et al., 1996])

However, the main idea of decomposing the domain of discourse into elementary components and using them to evaluate a student’s knowledge was first proposed in the Scholar system [Carbonell, 1970]. These components have been named in various ways: topics, knowledge elements, learning outcomes, and, most widely used, concepts. A concept represents an atomic piece of declarative domain knowledge, coherent and semantically complete. An aggregate of concepts forms the domain model. Some systems use simple domain models represented as sets (or vectors) of concepts [Galeev et al., 1994]; others employ more sophisticated models based on networks of concepts [Matthews & Biswas, 1986]. The overlay user model of conceptual knowledge relies on the domain model as a template and, essentially, consists of a set of concept-value pairs, where a value represents an assessment of the modeled characteristic for this particular concept. The value could be a binary entity (“knows” / “does not know”), a categorical variable (“low”, “average”, “high”), a probability (e.g. that a user knows this concept), or a numeric parameter on an arbitrary scale. Knowledge is not the only user characteristic that can be modeled as a domain overlay; some systems have applied this approach to representing user preferences or interests (e.g. [De Bra et al., 2003]).
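The concept-value structure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the class name `OverlayModel`, the four C-programming concepts, the default of 0.0 for concepts without evidence, and the thresholds used for the categorical scale are all hypothetical and are not taken from any of the cited systems.

```python
from dataclasses import dataclass, field


@dataclass
class OverlayModel:
    """Concept-value pairs laid over a fixed domain model (hypothetical sketch)."""

    domain: frozenset                          # concept names from the domain model
    values: dict = field(default_factory=dict)  # concept -> probability of knowing it

    def update(self, concept, probability):
        """Record the estimated probability that the user knows a concept."""
        if concept not in self.domain:
            raise KeyError(f"{concept!r} is not part of the domain model")
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must lie in [0, 1]")
        self.values[concept] = probability

    def get(self, concept):
        """Return the stored estimate; concepts without evidence default to 0.0 (an assumption)."""
        if concept not in self.domain:
            raise KeyError(f"{concept!r} is not part of the domain model")
        return self.values.get(concept, 0.0)

    def category(self, concept, low=0.33, high=0.66):
        """Convert the fine-grained value to a coarser categorical scale (thresholds are arbitrary)."""
        value = self.get(concept)
        if value < low:
            return "low"
        if value < high:
            return "average"
        return "high"


# Hypothetical domain model: a flat set of C-programming concepts.
c_domain = frozenset({"variable", "loop", "function", "pointer"})
student = OverlayModel(c_domain)
student.update("loop", 0.8)
student.update("pointer", 0.2)
print({concept: student.category(concept) for concept in sorted(c_domain)})
```

The `category` method illustrates how a detailed overlay can be collapsed into a coarser-grained scale; the reverse conversion is not possible without additional evidence.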

The benefit of the overlay user model is its precision and flexibility. Fine-grained concept-based modeling allows systems to adjust their adaptive interventions on a very detailed level. An overlay model is capable of dynamically and precisely reflecting the evolution of a user’s characteristics, which is especially important for AES. The detailed nature of the model enables relatively easy conversion to other, coarser-grained models. Among the drawbacks, we can name the necessity of developing an accurate and formal domain model, which is a very challenging task for some domains.

1.4.2 Keyword-based User Modeling

Keyword-based user modeling originated in the area of information filtering and information retrieval [Belkin & Croft, 1992]. In these fields, the content of a document is traditionally represented as a vector of terms (or keywords) extracted from the text. A generic task of textual information retrieval is based on matching a document term vector against the query issued by a user. In the case of information filtering, the document model is matched against a profile model (e.g. a model of spam e-mail). Adaptive information retrieval and filtering applications aggregate the user’s history of launched queries, accessed documents, or rejected e-mails in the form of a keyword vector and use this vector for tailoring the future retrieval/filtering process to the user’s information-seeking idiosyncrasies.

To a certain extent, this approach can be considered a shallow version of overlay user modeling. It also utilizes elements of the domain representation as a frame of reference to express user characteristics on the atomic level. However, instead of concepts modeling domain semantics, this technique uses keywords/terms found in the content. It became very popular in the context of adaptive information retrieval on the Web [Brusilovsky & Tasso, 2004]. Many adaptive Web systems model user information interests or needs as vectors of keywords extracted from the documents users have browsed or requested (sometimes such vectors are enriched with tf*idf values), e.g. Letizia [Lieberman, 1995], WebMate [Chen & Sycara, 1998], NewsDude [Billsus & Pazzani, 1999], etc. Some systems model user interests as networks of keywords instead of plain lists. Nodes in the networks represent keywords, and arcs usually connect keywords that co-occurred in the browsed content; examples include ifWeb [Asnicar & Tasso, 1997] and PIN [Tan & Teo, 1998].
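A minimal sketch of a keyword-vector profile follows. It assumes the common tf × log(N/df) form of tf*idf weighting, a naive whitespace tokenizer, and cosine similarity for matching; the text does not state which variants the cited systems used, and the four-document corpus and all function names are invented for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and keep purely alphabetic tokens (a deliberately naive tokenizer)."""
    return [word for word in text.lower().split() if word.isalpha()]


def tf_idf_vectors(documents):
    """Return one {term: weight} dict per document, with weight = tf * log(N / df)."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for doc in tokenized for term in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def build_profile(browsed_vectors):
    """Aggregate the vectors of browsed documents into a single keyword profile."""
    profile = {}
    for vector in browsed_vectors:
        for term, weight in vector.items():
            profile[term] = profile.get(term, 0.0) + weight
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical corpus: the user has browsed the first two documents.
corpus = [
    "ontology mapping for semantic web",
    "semantic web ontology languages",
    "ontology learning from web text",
    "football match results and scores",
]
vectors = tf_idf_vectors(corpus)
profile = build_profile(vectors[:2])
ranked = sorted(range(2, len(corpus)),
                key=lambda i: cosine(profile, vectors[i]),
                reverse=True)
print([corpus[i] for i in ranked])
```

Note that the profile only captures shared surface terms: a document about the same topic written with different vocabulary would score zero, which is the shallowness discussed in this section.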

A big advantage of this approach is the automatic modeling of content based on well-developed IR techniques of text analysis. This not only reduces the effort of creating the system, but also opens opportunities for open-corpus adaptation. However, keywords support only shallow content models. To remedy problems like homonymy (multiple meanings of a word) and synonymy (multiple words expressing the same meaning), elaborate natural language (NL) technologies are required. Keyword-based modeling is not able to represent the true meaning of the content. Instead, it relies on statistical regularities within the text and provides a framework for retrieving statistically close documents.

1.4.3 Stereotype User Modeling

The ultimate goal of an adaptive system is to adjust its behavior as closely as possible to the idiosyncrasies of individual users. However, for some tasks it is possible to identify typical categories of a system’s users who are characterized by similar sets of features, use the system in a similar way, and expect similar outcomes from it. Such categories “constituting strong points of commonalities” [Kay, 1994] among users are called stereotypes.

An adaptive system relying on stereotype-based modeling does not update every single facet of the user model directly. Instead, it utilizes a stock of preset stereotype profiles. Whenever the system receives evidence that a user is characterized by a certain stereotype, the entire user model is updated with the information from this stereotype profile. A user can be described by one stereotype or by a combination of several orthogonal stereotypes. A popular form of stereotype-based user modeling is a linear set of categories representing typical levels of user proficiency. For example, [Chin, 1989] describes the KNOME system, which models users’ expertise in the UNIX operating system on one of four levels (novice, beginner, intermediate, and expert). Often, however, stereotypes form a hierarchy, where more specific stereotypes can inherit some information from their parents. For example, Figure 2 shows part of the stereotype hierarchy used by the classic system GRUNDY [Rich, 1979a, 1979b].

Figure 2. Stereotype Hierarchy in GRUNDY (from [Rich, 1979b])

Stereotype-based user modeling is advantageous when a system must infer a great deal of modeling information from little evidence about a user. However, for modeling fine-grained characteristics of individual users (e.g. the knowledge level of a particular concept), more precise overlay models should be employed. Several systems have applied a combination of these two approaches. Stereotypical user models have been used to seed the overlay model (or parts of it) of a new user with little history; once the system has collected new evidence, the modeling continues in the overlay manner [de Rosis et al., 1993]. The KNOME system mentioned above adjusts individual user models by explicitly representing knowledge only for those concepts where the user differs from the stereotype to which he or she belongs.
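The combination of a stereotype default with individual exceptions can be sketched as follows. The sketch is loosely inspired by the description of KNOME above but is not its actual implementation; the stereotype names, the UNIX concepts, and all numeric values are hypothetical.

```python
# Hypothetical stereotype profiles: concept -> assumed probability of knowing it.
STEREOTYPES = {
    "novice": {"ls": 0.6, "pipe": 0.1, "awk": 0.0},
    "expert": {"ls": 1.0, "pipe": 0.9, "awk": 0.7},
}


class StereotypedUser:
    """User model that stores only deviations from the assigned stereotype."""

    def __init__(self, stereotype):
        self.stereotype = stereotype
        self.overrides = {}  # individual evidence that differs from the stereotype

    def observe(self, concept, value):
        """Keep an explicit value only where it differs from the stereotype default."""
        if STEREOTYPES[self.stereotype].get(concept) == value:
            self.overrides.pop(concept, None)
        else:
            self.overrides[concept] = value

    def knowledge(self, concept):
        """Individual evidence takes precedence; otherwise fall back on the stereotype."""
        if concept in self.overrides:
            return self.overrides[concept]
        return STEREOTYPES[self.stereotype][concept]

    def reclassify(self, stereotype):
        """Switch stereotype; all defaults change at once, overrides are kept."""
        self.stereotype = stereotype


user = StereotypedUser("novice")
user.observe("awk", 0.8)      # surprising evidence for a novice
print(user.knowledge("pipe"), user.knowledge("awk"))
user.reclassify("expert")     # a single update changes every default
print(user.knowledge("pipe"), user.knowledge("awk"))
```

Storing only the exceptions keeps the individual model small while still letting a single reclassification update many facets at once, which is the trade-off between the two approaches described above.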

1.4.4 Other Technologies for User Model Representation

The three described approaches do not constitute the entire spectrum of user model representation. We have chosen them due to the existence of the projects employing the synergy between these technologies and the Web-ontologies. At the same time, several more approaches should be mentioned.

Constraint-based user modeling has been successfully used for implementing ITSs [Mitrovic et al., 2001]. It is based on the theory of learning from performance errors by [Ohlsson, 1996]. The student’s knowledge in the domain of study is modeled by a set of constraints. Every constraint represents an acceptable set of equivalent problem states. A violated constraint indicates an error. Constraint-based student modeling has several advantages. Probably the most important one is computational simplicity: the constraint-based approach reduces the student-modeling task to matching the problem state against the constraint conditions.
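The matching step can be illustrated with a minimal sketch. It assumes that each constraint is expressed as a pair of a relevance condition and a satisfaction condition, as in Ohlsson's formulation. The fraction-addition domain, the state fields and the constraint names are hypothetical and serve only as an illustration, not as the representation used by any particular ITS.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, int]  # hypothetical problem state: field name -> value


@dataclass
class Constraint:
    name: str
    relevant: Callable[[State], bool]   # when does the constraint apply?
    satisfied: Callable[[State], bool]  # what must hold if it applies?


def violated(constraints: List[Constraint], state: State) -> List[str]:
    """Return names of constraints that are relevant to the state but not satisfied."""
    return [c.name for c in constraints if c.relevant(state) and not c.satisfied(state)]


# Illustrative constraints for adding two fractions a/b + c/d = n/m.
CONSTRAINTS = [
    Constraint(
        "same-denominator-sum",
        relevant=lambda s: s["b"] == s["d"],
        satisfied=lambda s: s["m"] == s["b"] and s["n"] == s["a"] + s["c"],
    ),
    Constraint(
        "nonzero-denominator",
        relevant=lambda s: True,
        satisfied=lambda s: s["m"] != 0,
    ),
]

# A student answers 1/4 + 2/4 = 3/8 (the denominators were added as well).
answer = {"a": 1, "b": 4, "c": 2, "d": 4, "n": 3, "m": 8}
errors = violated(CONSTRAINTS, answer)  # only "same-denominator-sum" is violated
```

The student model in such a sketch could simply be the history of violated and satisfied constraints; no reasoning about how the student arrived at the state is required.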

One of the dominant adaptive technologies on the Web nowadays is collaborative filtering. Unlike most of the user modeling technologies mentioned before, it relies on modeling users in terms of their relationships with other users. The typical collaborative user model is based on a vector of ratings that a user has provided for particular items. Adaptive systems employing collaborative filtering utilize this vector to discover a similar user (a user rating the same items similarly) and to recommend new items rated highly by this like-minded user [Konstan et al., 1997].
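A minimal sketch of this idea follows. It assumes cosine similarity over co-rated items as the similarity measure and a single nearest neighbour as the source of recommendations; real systems typically use more neighbours and other measures. The users, items and ratings are invented for illustration.

```python
from math import sqrt
from typing import Dict, List

Ratings = Dict[str, Dict[str, float]]  # user -> {item: rating}


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity computed over the items both users have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(u[i] ** 2 for i in common))
    norm_v = sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(ratings: Ratings, user: str, min_rating: float = 4.0) -> List[str]:
    """Recommend unseen items that the most similar user rated highly."""
    others = [u for u in ratings if u != user]
    neighbour = max(others, key=lambda u: cosine(ratings[user], ratings[u]))
    seen = ratings[user]
    candidates = [i for i, r in ratings[neighbour].items()
                  if i not in seen and r >= min_rating]
    return sorted(candidates, key=lambda i: -ratings[neighbour][i])


# Hypothetical rating vectors on a 1-5 scale.
RATINGS: Ratings = {
    "alice": {"A": 5, "B": 1, "C": 4},
    "bob":   {"A": 5, "B": 1, "C": 5, "D": 5, "E": 2},
    "carol": {"A": 1, "B": 5, "D": 1, "E": 5},
}

suggestions = recommend(RATINGS, "alice")  # bob is the like-minded user
```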

Bayesian Networks are yet another very popular formalism for representing different aspects of user models. Multiple systems have used Bayesian Networks to model the relations between different components/dimensions of a user model, such as emotions, goals and knowledge (e.g. [Zhou & Conati, 2003]). Other systems have used Bayesian Networks to implement an overlay user model with internal inference capabilities, where every node represents a domain concept and links stand for the concept relations, e.g. [de Rosis et al., 1992]. [Reye, 1996] used an extension of this approach, where every concept node is expanded into a smaller network of evidence nodes influencing the knowledge of the concept. What distinguishes Bayesian Networks is that they provide both the representation and the inference framework. Section 1.5.1.2 of this review discusses the application of Bayesian Networks for eliciting user modeling assumptions.

In e-commerce applications it is often effective to model a customer without deriving any explicit modeling assumptions about him, but rather by identifying certain statistical regularities that can be utilized for building effective selling strategies [Agrawal et al., 1993]. A user model in this case can contain a filtered set of transactions matched against an association rule of items bought together, or satisfying some linear pattern of buyers’ behavior, or belonging to a cluster of similar buyers. Recently this approach has been employed for modeling Web users as well. The transactional logs of users on the Web are analyzed to issue adaptive recommendations of information items [Baumgarten et al., 2000].
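To make the notion of an association rule concrete, the following sketch computes the two standard measures of a rule "antecedent implies consequent", support and confidence, over a set of purchase transactions. The transactions are invented, and the function names are our own.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(transactions: List[Transaction], itemset: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(transactions: List[Transaction],
               antecedent: FrozenSet[str],
               consequent: FrozenSet[str]) -> float:
    """Estimated P(consequent in basket | antecedent in basket)."""
    base = support(transactions, antecedent)
    return support(transactions, antecedent | consequent) / base if base else 0.0


# Hypothetical purchase transactions.
TRANSACTIONS = [
    frozenset({"bread", "butter"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread", "milk"}),
    frozenset({"milk"}),
]

rule_support = support(TRANSACTIONS, frozenset({"bread", "butter"}))
rule_confidence = confidence(TRANSACTIONS, frozenset({"bread"}), frozenset({"butter"}))
```

A customer whose current basket matches the antecedent of a rule with high confidence could then be offered the items of its consequent.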

1.5 User Model Elicitation

There are two principal ways for an adaptive system to obtain user modeling information: to ask a user directly or to derive it from the user’s activity with the system. Many systems use a combination of these approaches, as the natural flow of the “user-system” communication (defined by the task a user tries to achieve with the system’s help) often necessitates active user feedback. For example, a student solving problems in an ITS has to provide intermediate and final answers, based on which the system can infer his knowledge of target concepts. A user involved in a dialog with an adaptive help system may express her/his interest in a certain topic by asking about it. Other systems, however, rely only on their ability to mine knowledge about a user from the logs of her/his actions by applying a set of machine learning techniques. If the main task of a user does not imply any direct input to the system, such an approach is preferable, as it does not interfere with the task-related activity and does not increase the user’s cognitive load.

Finally, some adaptive systems purposely and actively involve the user in the modeling process. They may allow or even encourage a user to directly modify her/his user model. Most often this modeling approach is used in AES. The potential benefits of such student involvement are learning by reflection and trust in the system’s judgments.

In this section we briefly describe these elicitation approaches. When discussing editable user models, we also cover open and scrutable user modeling, which follow the same HCI principles although they are not used directly for model elicitation.

1.5.1 User Model Inference based on Rich Feedback

Generally speaking, adaptive systems populate user models based on the user activity and the built-in inference mechanisms used to resolve uncertainty and deduce modeling assumptions from this activity. For some systems the feedback users provide while performing their task is rich and informative; others rely mostly on the analysis of transactional logs of the user’s mouse clicks. Both the “rich feedback” systems and the “log mining” systems often use similar technologies coming from the fields of machine learning and probabilistic reasoning to “comprehend” the user actions. However, there are several important differences. While the first class of systems traditionally uses knowledge representation technologies and relies strongly on semantically rich overlay or stereotype user models, systems from the second class usually utilize information retrieval techniques and model the user by keyword vectors or simply by sequences of accessed documents. This subsection concentrates on the first kind of systems and on how they elicit user modeling information. The systems of the second kind are described in the next subsection.

1.5.1.1 Model Inference in Natural Language Systems

The most numerous and typical examples of adaptive systems receiving rich feedback from a user come from the broad class of NL systems. Intelligent dialog systems dominated the field of user modeling and user-adaptive applications in the 1980s, just as general NL systems dominated the area of AI applications. User modeling was called upon to help such systems recognize the goals, plans and beliefs of a user and to provide a basis for more justified and robust dialog actions. Typically the interface of a dialog system was text-based: a user was able to enter utterances in a free manner, either asking the system a question or answering the system’s qualifying question. Both the user’s answers and questions were used to gradually construct the user model; the system’s responses were often driven by the “intention” to refine some of the modeling assumptions. For example, Figure 3 shows an extract from a typical dialog between a user and the system Grundy [Rich, 1979b], which plays the role of a librarian and helps the user to choose an appropriate book. Utterances in UPPER CASE are those made by the system, short answers in lower case are from the user, and detailed annotations aligned right explain the system’s internal decision process. To save space the original dialog has been slightly shortened, though it still shows how Grundy performs the inference based on the user’s answers.

Figure 3. Dialog Aimed at Eliciting the User Model (modified from [Rich, 1979b])

Grundy exploits stereotype-based user modeling. Typically, in stereotype systems the process of applying a particular stereotype is controlled by specific rules named triggers. If an event occurs that matches the trigger condition (a user has provided certain evidence, or the application of a previous stereotype has changed some user model facet to a certain value), the trigger fires and associates the controlled stereotype with the user model.
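The trigger mechanism can be sketched as follows. This is an illustrative simplification and not Grundy's actual implementation: the stereotypes, the facet names and the rule that a newly applied stereotype simply overwrites facet values are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class UserModel:
    evidence: Set[str] = field(default_factory=set)         # facts the user provided
    facets: Dict[str, float] = field(default_factory=dict)  # facet -> current value
    active: List[str] = field(default_factory=list)         # applied stereotypes


@dataclass
class Stereotype:
    name: str
    facets: Dict[str, float]             # stereotypical facet values
    trigger: Callable[[UserModel], bool]  # condition that activates the stereotype


def apply_triggers(model: UserModel, stereotypes: List[Stereotype]) -> UserModel:
    """Fire triggers repeatedly until no further stereotype can be activated."""
    changed = True
    while changed:
        changed = False
        for s in stereotypes:
            if s.name not in model.active and s.trigger(model):
                model.active.append(s.name)
                model.facets.update(s.facets)  # assumption: later values overwrite
                changed = True
    return model


# Hypothetical stereotypes: the second one is triggered by a facet value
# that the first one sets, not by direct evidence from the user.
STEREOTYPES = [
    Stereotype("sports_fan", {"likes_thrills": 0.8},
               trigger=lambda m: "athletic" in m.evidence),
    Stereotype("adventure_reader", {"likes_adventure_books": 0.9},
               trigger=lambda m: m.facets.get("likes_thrills", 0.0) >= 0.7),
]

model = apply_triggers(UserModel(evidence={"athletic"}), STEREOTYPES)
```

A single piece of evidence thus leads to two stereotypes being applied, which is the "great deal of modeling information from little evidence" effect described in Section 1.4.3.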

To elicit individual user models, intelligent dialog systems of the early user modeling age usually followed one of two ways: they either asked the user a fishing question, or utilized rules, predicate logic and other classic AI reasoning techniques, especially those used for plan recognition. Many of the reasoning components and ad-hoc heuristics developed along this line of research were very much domain-dependent and could not be generalized to other tasks. The most detailed source on the early work in this area is provided by [Wahlster & Kobsa, 1989]. [Zukerman & Litman, 2001] analyzed some of the more recent efforts to employ user modeling for NL applications. Their paper concentrated not solely on text-based intelligent dialog systems, but also on NL generation, NL understanding, language learning assistants, etc., including multimedia systems as well.

Starting in the late 1980s, user modeling researchers began to use a number of powerful technologies for the numerical assessment of uncertainty, such as Bayesian Networks [Pearl, 1988], Fuzzy Logic [Zadeh, 1983] and the Dempster-Shafer theory of evidence [Shafer, 1976]. The capability to estimate and manage uncertainty is very important for user modeling, since user modeling always takes place under conditions of insufficient information and, consequently, uncertainty about the actual state of a user. In 1995 Anthony Jameson published his seminal overview covering the first decade of efforts in this area [Jameson, 1995]. He analyzed several dozen systems in terms of how they use uncertainty management for user modeling and provided arguments in favor of using this methodology in adaptive system design.

1.5.1.2 Model Inference in Adaptive Educational Systems

The second large class of adaptive systems relying on both rich user feedback and their own inference capabilities is constituted by various AESs and especially by ITSs. Developers of ITSs have extensively used technologies for reasoning under uncertainty. A typical input given by a user of an ITS is an answer to a problem or a certain step towards the problem solution. Based on this information, as well as on the domain model and the problem description, the ITS employs its reasoning mechanism to update the user model, which afterwards results in adaptive system behavior (presenting the next problem of optimal difficulty, generating remedial hints, suggesting to take a test or review a tutorial).

Various approaches have been applied for deducing user characteristics. The mechanism known as knowledge tracing, implemented in Lisp-Tutor, uses simple Bayesian inference to calculate the posterior probability that the user has mastered a certain skill based on the prior probability for this skill, the probability of guessing and the evidence provided by the user (a correct or incorrect solution of the corresponding problem) [Corbett & Anderson, 1994]. A similar approach is implemented in ITSs developed with the help of the authoring tool MONAP-PLUS [Galeev et al., 1994]. There, the distribution of probabilities that a student’s knowledge of a particular skill is on a certain level likewise depends on the prior and the correctness of the current answer.
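The knowledge tracing update can be sketched in a few lines. The sketch follows the standard formulation of the technique, which, in addition to the prior and the guess probability named above, uses a slip probability and a learning (transition) probability; these two parameters, all numeric values and the names below are assumptions of the example rather than details given in this review.

```python
from dataclasses import dataclass


@dataclass
class SkillParams:
    p_guess: float  # P(correct answer | skill not mastered)
    p_slip: float   # P(incorrect answer | skill mastered)
    p_learn: float  # P(skill becomes mastered after a practice opportunity)


def update_mastery(p_known: float, correct: bool, params: SkillParams) -> float:
    """Bayesian update of P(skill mastered) after one observed answer."""
    if correct:
        num = p_known * (1 - params.p_slip)
        den = num + (1 - p_known) * params.p_guess
    else:
        num = p_known * params.p_slip
        den = num + (1 - p_known) * (1 - params.p_guess)
    posterior = num / den if den else p_known
    # The student may also have learned the skill during this opportunity.
    return posterior + (1 - posterior) * params.p_learn


# Hypothetical parameter values for one skill.
PARAMS = SkillParams(p_guess=0.2, p_slip=0.1, p_learn=0.1)

after_correct = update_mastery(0.3, True, PARAMS)
after_incorrect = update_mastery(0.3, False, PARAMS)
```

With these values a correct answer raises the estimate above the prior of 0.3 and an incorrect answer lowers it, which is the behavior the tutor relies on when deciding whether more practice on the skill is needed.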

A generalization of this methodology is provided by Bayesian Networks. Figure 4 visualizes a simplified version of the inference mechanism implemented in the Ipsometer system [Jameson, 1992]. The three nodes represent three variables that the system tries to model and the relations between them. The general user’s expertise and the difficulty of a concept influence the user’s knowledge of this concept. White and grey bars represent probability distributions for each variable before and after the system has gained the evidence (the user has attempted a relevant problem). The figure represents both the predictive and the diagnostic inference supplied by Bayesian Networks. When no evidence is available, the system is still able to predict some probability distribution for the knowledge node (“knows” is more probable) based on the facts that the concept is probably easy and the user’s expertise is generally higher than novice. After evidence is received, not only does the knowledge node change its distribution, but the diagnostic inference also propagates the changes of probabilities to the cause nodes (the expertise node and the difficulty node).

Figure 4. Prediction and Interpretation of the User's Knowledge of a Concept with a small BN (from [Jameson, 1995])
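The two directions of inference in such a three-node network can be reproduced with a small enumeration sketch. All probability values below are invented and are not those of Ipsometer; as a further simplification, the evidence is entered directly on the knowledge node instead of through an observed problem attempt.

```python
from itertools import product
from typing import Dict

# Hypothetical priors for the two cause nodes.
P_EXPERTISE = {"novice": 0.2, "intermediate": 0.5, "expert": 0.3}
P_DIFFICULTY = {"easy": 0.7, "hard": 0.3}
# Hypothetical conditional table: P(knows = True | expertise, difficulty).
P_KNOWS = {
    ("novice", "easy"): 0.5, ("novice", "hard"): 0.1,
    ("intermediate", "easy"): 0.8, ("intermediate", "hard"): 0.4,
    ("expert", "easy"): 0.95, ("expert", "hard"): 0.7,
}


def joint(expertise: str, difficulty: str, knows: bool) -> float:
    """Joint probability of one full assignment of the three variables."""
    p = P_KNOWS[(expertise, difficulty)]
    return P_EXPERTISE[expertise] * P_DIFFICULTY[difficulty] * (p if knows else 1 - p)


def p_knows() -> float:
    """Predictive inference: P(knows = True) before any evidence."""
    return sum(joint(e, d, True) for e, d in product(P_EXPERTISE, P_DIFFICULTY))


def posterior_expertise(knows: bool) -> Dict[str, float]:
    """Diagnostic inference: P(expertise | knows)."""
    scores = {e: sum(joint(e, d, knows) for d in P_DIFFICULTY) for e in P_EXPERTISE}
    total = sum(scores.values())
    return {e: s / total for e, s in scores.items()}


def posterior_difficulty(knows: bool) -> Dict[str, float]:
    """Diagnostic inference: P(difficulty | knows)."""
    scores = {d: sum(joint(e, d, knows) for e in P_EXPERTISE) for d in P_DIFFICULTY}
    total = sum(scores.values())
    return {d: s / total for d, s in scores.items()}


prior_knows = p_knows()
expertise_after_failure = posterior_expertise(False)
difficulty_after_failure = posterior_difficulty(False)
```

With these numbers "knows" is the more probable state a priori, while evidence that the user does not know the concept shifts probability towards "novice" and towards "hard", mirroring the change from white to grey bars described above.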

Similar mechanisms have been applied by such systems as Olae [Martin & VanLehn, 1995] and Andes [Conati et al., 1997]; however, these systems also used Bayesian Networks for representing the problem solution graph and for student plan recognition. The Bayesian Network in the Epi-Umod system [de Rosis et al., 1992] modeled the user’s knowledge as an overlay model over the network of domain concepts. Conditional probabilities represented the mutual effects of knowledge of related concepts; hence, the propagation of the user’s knowledge over this network took into account the prerequisite-outcome relationships between them.

Although Bayesian Networks have been the most popular technology for modeling user features under uncertainty in ITSs (and other kinds of adaptive systems), there are examples of other technologies being used for this purpose. [Petrushin & Sinitsa, 1993] developed a system applying the Dempster-Shafer approach to infer a student’s misconceptions and level of expertise based on the answers to test questions. [Capuano et al., 2000] presented the ITS ABITS, which models a student’s knowledge of domain concepts with vectors of fuzzy numbers that narrow as evidence is provided by the corresponding test questions. [Beal et al., 2007] applied Hidden Markov Models to represent an unobservable user model parameter such as the student’s engagement with an ITS.

1.5.2 Unobtrusive User Modeling

One of the goals of adaptation is to assist a human user in performing her/his task in the most efficient way. This implies that, if possible, an adaptive system should not interfere with the main task of the user and should not demand from her/him extra effort to maintain effective adaptation. One of the motives for that is the intention to reduce the user’s cognitive load and minimize off-task activity. Another reason is provided by experiments demonstrating that users are unwilling to provide feedback. For example, [Carroll & Rosson, 1987] observed that users of software systems are usually unlikely to undertake any additional activity, even if it results in a rewarding behavior of the system. Hence, one of the challenges in user modeling nowadays is to obtain user models in an unobtrusive way, which does not require from a user additional, non-task-related activity.

The exponential growth of information and users on the Web, as well as the transfer of many services and activities to electronic and online form, has created the need for applications that unobtrusively support users by navigating them to the desired information or filtering out non-relevant pages, documents, products, etc. At the same time, the recent advancement in machine learning and data mining technologies and the dramatic improvement in the computational and networking capabilities of modern computers provide broad opportunities for developing such applications. Most of the research in this field has been generated for AW applications; therefore, in our analysis we concentrate on the technologies exploited by adaptive Web-systems to unobtrusively elicit individual user models.

1.5.2.1 Content-based and Collaborative Filtering

Generally speaking, there are two basic approaches to personalization employed by adaptive Web applications: content-based adaptation and collaborative filtering. Content-based adaptation relies on the similarity of Web-resources expressed in their content metadata; content-based modeling represents a user as a product of her/his individual history of ratings, purchases and page accesses. Collaborative filtering, instead of searching for similarities between resources’ content models, looks for like-minded users expressing similar behavior (purchasing the same products, rating items similarly, etc.). However, the approaches used to populate these two different kinds of user models are often similar, though they operate on orthogonal data sets. For example, two recommender systems can use the same clustering algorithm to identify sets of similar users based on their rating history (collaborative) or to group similar web-pages based on their textual content (content-based). Consequently, the first system would recommend to a user items highly rated by other users from the same cluster, while the second system’s recommendation would contain pages similar to those that have drawn the user’s interest in the past.

Both technologies have their shortcomings. The quality of content-based filtering depends very much on the quality of the content description of resources, which is not always perfect. Besides, content-based adaptation is based solely on the user’s history; it excludes unexpected and serendipitous recommendation of items different from those seen by the user before. Collaborative filtering does not suffer from those factors; however, it has its own problems. Collaborative approaches are less scalable; hence, they require more offline data processing. The quality of collaborative recommendation drops if the user-item matrix is sparse, which is common for Web datasets. Finally, collaborative systems provide an individual user with recommendations based on the evidence collected from many other users, which might not always lead to adequate results. Hybrid recommender systems try to compensate for these problems by utilizing both content-based and collaborative technologies (see [Burke, 2007] for an overview).
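For comparison with the collaborative case, a content-based recommendation can be sketched as follows. The sketch assumes that pages are described by plain term-frequency vectors and that the user profile is the sum of the vectors of pages that drew the user's interest; the page texts are invented, and real systems would typically apply term weighting and stop-word removal.

```python
from collections import Counter
from math import sqrt
from typing import Dict, List


def vectorize(text: str) -> Counter:
    """Represent a text as a bag of lower-cased words."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_profile(history: List[str]) -> Counter:
    """User profile: summed term vectors of the pages the user liked."""
    profile: Counter = Counter()
    for text in history:
        profile.update(vectorize(text))
    return profile


def rank(profile: Counter, candidates: Dict[str, str]) -> List[str]:
    """Order candidate pages by similarity to the user profile."""
    return sorted(candidates,
                  key=lambda name: cosine(profile, vectorize(candidates[name])),
                  reverse=True)


# Hypothetical browsing history and candidate pages.
HISTORY = ["bayesian networks for student modeling",
           "student modeling in tutoring systems"]
CANDIDATES = {"p1": "overlay student modeling with bayesian networks",
              "p2": "cheap flights and hotel deals"}

profile = build_profile(HISTORY)
ranking = rank(profile, CANDIDATES)
```

The sketch also makes the serendipity problem visible: a page sharing no terms with the user's history receives a similarity of zero and can never be recommended.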

1.5.2.2 Challenges of Unobtrusive User Modeling

Several important reviews analyzing different aspects of the implicit elicitation of user models have been published. [Webb et al., 2001] defined several challenges that are specific to user modeling and limit the application of machine learning technologies in adaptive system development. The amount of information collected by an adaptive application about its users is often not enough to build a straightforward computational model of acceptable accuracy (small datasets). Generally, user modeling is a dynamic task; hence, the parameters characterizing a user are likely to change over time; however, ML modeling algorithms are often not able to adjust to these changes quickly enough (concept drift). User modeling applications are supposed to operate in an interactive mode; however, many ML algorithms take a long time to converge (interactivity vs. computational complexity). Supervised ML algorithms require explicit labels. In the context of user modeling such labels can often be provided only by a user (for example, the user notifies the system that s/he does or does not like this web-page). Unfortunately, users are known to be unwilling to provide systems with information that is not directly connected with their needs (lack of labeled data).

1.5.2.3 Web Mining for Personalization

[Pierrakos et al., 2003] and [Mobasher, 2007] overviewed the applications employing data mining technologies for personalization on the Web. Both works provided an exhaustive description of the main stages of information flow in these applications as well as the techniques they use at every stage. There are three main stages common to all data mining systems: data collection, data pre-processing and pattern discovery; some applications perform the two latter stages in offline mode (see Figure 5).

Figure 5. Web Usage Mining for Personalization (from [Pierrakos et al., 2003])

1.5.2.4 Data Collection

At the first stage the data from different sources are collected, combined and structured to provide the basis for the subsequent processing. The main source of data is the Web server’s transactional logs, which store the basic usage data, i.e. the click stream of users accessing the system’s resources (date/time, requesting IP address, requested resource, HTTP method, user platform characteristics, etc.). In addition to the main logs, usage data can be obtained from the logs of proxy servers and packet sniffers, as well as from client-side cookies. Sometimes, data about demographic features of a user are available from dedicated online services (e.g. ComScore, NetRating, Acxiom).
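As an illustration of what such a transactional record contains, the sketch below parses one line in the widely used Common/Combined Log Format into the fields listed above. The log format is an assumption (servers can be configured differently), and the sample line is invented.

```python
import re
from datetime import datetime
from typing import Optional

# Pattern for the Common Log Format, with the optional referrer and
# user-agent fields of the Combined format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<resource>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_line(line: str) -> Optional[dict]:
    """Turn one log line into a record, or return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    return record


# Hypothetical log line.
SAMPLE = ('192.0.2.1 - - [10/Oct/2007:13:55:36 -0700] '
          '"GET /index.html HTTP/1.0" 200 2326 '
          '"http://example.org/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"')

record = parse_line(SAMPLE)
```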

1.5.2.5 Data Preprocessing

Data preprocessing involves several tasks. First, the raw click stream data should be cleaned: the references to non-informational resources, such as graphics, audio, style sheets, etc., are removed; the missing transactions (due to client-side and proxy-side caching) are generated based on the analysis of the site hyper-structure; transactions generated by spiders and other robots are filtered out. Other important tasks performed at this stage are user identification and session separation. When neither server-side user login information nor client-side cookies are available, a number of ad-hoc techniques are employed to identify transactions coming from the same user; usually they involve the integration of information about the IP address and the platform used, as well as the website topology. The separation of transactional data into sessions is very important, since every session is usually characterized by a specific information task and represents a logically complete pattern of the user’s navigational behavior. There are two groups of methods for session identification: time-oriented and structure (content)-oriented. Time-based session separation is the most common. It is usually based on the time spent by a user on a single page; if the time exceeds a certain threshold (commonly about 30 minutes), the beginning of a new session is registered. The structure- and content-oriented methods utilize the website topology and/or the content models of accessed resources to track a possible shift in the focus of browsing, which is used as a mark of a new session. Besides the direct manipulations with the raw log data, some applications at the preprocessing stage also enrich the logs with metadata about users or resources stored in databases.
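Two of these preprocessing tasks, cleaning and time-based session separation, can be sketched together. The sketch assumes that the requests of a single, already identified user are given as (timestamp, resource) pairs; the list of ignored file suffixes and the sample click stream are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Request = Tuple[datetime, str]  # (timestamp, requested resource)

# Hypothetical list of non-informational resource types.
IGNORED_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".mp3")


def clean(requests: List[Request]) -> List[Request]:
    """Drop references to graphics, audio, style sheets and similar resources."""
    return [(ts, res) for ts, res in requests
            if not res.lower().endswith(IGNORED_SUFFIXES)]


def split_sessions(requests: List[Request],
                   max_gap: timedelta = timedelta(minutes=30)) -> List[List[str]]:
    """Start a new session whenever the pause between requests exceeds max_gap."""
    sessions: List[List[str]] = []
    previous = None
    for ts, resource in sorted(requests):
        if previous is None or ts - previous > max_gap:
            sessions.append([])
        sessions[-1].append(resource)
        previous = ts
    return sessions


# Hypothetical click stream of one user.
T0 = datetime(2007, 11, 29, 10, 0)
CLICKS = [
    (T0, "/index.html"),
    (T0 + timedelta(seconds=1), "/style.css"),
    (T0 + timedelta(minutes=5), "/papers.html"),
    (T0 + timedelta(minutes=50), "/contact.html"),
]

sessions = split_sessions(clean(CLICKS))
```

Here the 45-minute pause before the last request exceeds the threshold, so the click stream is split into two sessions after the style sheet request has been removed.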

1.5.2.6 Pattern Discovery

The final stage of data flow in an adaptive data mining application, and the one most closely related to user modeling, is pattern discovery. At this stage applications mine, from the processed and enriched transactional data, statistical patterns representing observed usage regularities, which are later used for recommending an appropriate resource, navigating towards a relevant webpage or retrieving a document with desired features. In addition to the works by Pierrakos and Mobasher, [Pazzani & Billsus, 2007] also provide a good analysis of pattern discovery technologies in the context of content-based recommender systems and give application examples as well. There are several groups of such technologies coming from the field of Machine Learning: clustering, classification, association rules, sequential patterns, and latent variable models.

Clustering methods have been used mainly for two purposes. First, user sessions have been clustered to mine typical navigation patterns. As a result, the systems were able to recommend hyperlinks based on the other sessions in the cluster [Yan et al., 1996], to generate an index page navigating a user to all pages from the current session’s cluster [Perkowitz & Etzioni, 2000], etc. Second, to remedy the problem of scalability, collaborative systems often use offline clustering of users and/or items [Mobasher, 2007]. This allows the online component to reduce the dimensionality of the search task and provide recommendations only from the current cluster.
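The online part of such a scheme can be sketched as follows. The sketch assumes that session clusters have already been computed offline by some clustering algorithm and are summarized as sets of pages; the Jaccard measure, the cluster names and the pages are assumptions made for the illustration.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two page sets relative to their union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_from_cluster(session: Set[str],
                           clusters: Dict[str, Set[str]]) -> List[str]:
    """Assign the session to its closest cluster and suggest its unvisited pages."""
    best = max(clusters, key=lambda name: jaccard(session, clusters[name]))
    return sorted(clusters[best] - session)


# Hypothetical clusters of pages, assumed to be produced offline.
CLUSTERS = {
    "research": {"/papers.html", "/projects.html", "/bib.html"},
    "teaching": {"/courses.html", "/syllabus.html", "/homework.html"},
}

current_session = {"/papers.html", "/bib.html"}
links = recommend_from_cluster(current_session, CLUSTERS)
```

Because the online component only compares the current session with a handful of cluster summaries, its cost does not grow with the number of stored sessions.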

Classification has been less popular than clustering among adaptive data mining applications because of the need for pre-classified (labeled) data, which are very rarely available. The majority of classification approaches have been used to build descriptive models of Web-user interests. For example, when a user browsing the Web chooses to save, print or bookmark a certain page, it is a good indicator that the material of this webpage is interesting to the user. The pioneering Letizia system [Lieberman, 1995] recommends the links on the current Web-page that the user should be most interested in. The recommended links lead to web pages similar to the pages that the user saved or bookmarked previously. Letizia modeled both negative and positive interests based on whether a user follows a link or passes it over. This may result in a faulty model, since users might skip a link by mistake or postpone the visit. The NewsDude system [Billsus & Pazzani, 1999] recommended news to a user on the basis of her/his long-term and short-term interests. The long-term interest model was populated on the basis of explicit ratings a user provided for the news stories seen; however the short-term interests, which
