Some Limitations of Feature-Based Recognition in Case-Based Design

Thomas R. Hinrichs
The Institute for the Learning Sciences
Northwestern University
Evanston, IL 60201

Abstract

A crucial part of Case-Based Reasoning is retrieving cases that are similar or otherwise relevant to the problem at hand. Traditionally, this has been formulated as a problem of indexing and accessing cases based on sets of predictive features. More generally, however, we can think of retrieval as a problem of recognition. In this light, several limitations of the feature-based approach become apparent. What constitutes a feature? What makes a feature predictive? And how is retrieval possible when the structure of an input is predictive, but its components are not?

This paper presents an analysis of some of the limitations of feature-based recognition and describes a process that integrates structural recognition with retrieval. This structural recognition algorithm is designed to augment the retrieval capabilities of case-based reasoners by facilitating the recognition of functional design cliches, natural laws, and subproblems for which individual features may not be predictive.

Introduction

Recognition is an important component of problem solving. Engineers and programmers recognize design cliches, scientists recognize natural laws at work in different situations, and people generally recognize problems they've solved before. In AI, one of the central tenets of case-based reasoning (CBR) is that problems can be solved by retrieving and adapting solutions to similar problems (Kolodner, 1993). To do this, problem solvers typically access and evaluate previous cases by recognizing their common features. Such an approach is called feature-based recognition. It seems plausible that reducing a problem to constituent features would sometimes be insufficient or inappropriate. In fact, the limitations of feature-based recognition were one of the cornerstones of Minsky and Papert's book Perceptrons (Minsky and Papert, 1969). Why, then, are most problem-solving accounts of retrieval based solely on feature-based recognition? This paper presents a problem for which this approach fails, proposes an alternative model that more closely integrates problem solving with perception, and outlines an algorithm to achieve this.

One reason that feature-based recognition is so ubiquitous is the assumption that recognition should be cheap. After all, human beings easily recognize faces and situations in a matter of milliseconds. But the deceptive ease with which we recognize the familiar may lead us astray by suggesting computational models that emphasize reduction to simple primitives that are ``immediate'' or ``free''. Moreover, by reducing to primitive features, it is easy to see how to amortize the cost of recognition by indexing concepts and cases at storage time, rather than performing complex computations at retrieval/recognition time. This apparent computational efficiency can be an illusion, however, because it does not account for the cost of perceiving features, and in some ways it trades off competence for performance.

Some recent work in case-based reasoning has involved effectively extracting features from geometrical structure (Berger, 1993), and from traces of continuous parameters over time (Ram and Santamaria, 1993). In this paper, we concentrate instead on recognition and retrieval based on the topological structure of engineered designs.

An Example From Engineering

Recently, I was attempting to apply the case-based design system JULIA (Hinrichs, 1992) to some problems of engineering design. An issue that quickly became apparent was how to index and retrieve concepts as simple as a voltage divider. A voltage divider is not so much a component as a configuration of components: two resistors in series with a voltage source divide the source voltage so that the voltage across either resistor is proportional to its resistance divided by the sum of the two resistances (Fig. 1).

Figure 1: Voltage Divider
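
For concreteness, if the meter M of Fig. 1 is read across R2 (an assumption about the figure; the relation itself is elementary circuit theory), the constraint that recognizing the configuration should make available is

    V_M = V * R2 / (R1 + R2)

where V is the source voltage. Retrieving the concept `voltage divider' amounts to recovering this equation from the circuit topology alone.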

Clearly, calculating the output of such a circuit is not a problem for computational design in general; CAD systems perform such calculations all the time. In fact, the voltage divider example has been used previously to illustrate constraint propagation in ThingLab (Borning, 1981). In ThingLab, however, the voltage divider constraint must be added manually by the user in order to solve for the voltage across the meter M. Ideally, a computational model of design should not only perform such inference, but also be able to quickly recognize such configurations.

A case-based design system, for example, should be able to retrieve a similar circuit for which the voltage across M has been solved, and to transfer the reasoning or constraints that determined the value. However, because JULIA relies on feature-based recognition, it cannot do this. JULIA's memory is arranged as a multiple-discrimination network that discriminates based on deviations from and specializations of normative features. Consequently, it could retrieve circuits containing resistors and batteries, or even resistors with a specific value, but it could not retrieve only those circuits with two resistors in series. Unless, that is, the concept ``two resistors in series'' were explicitly represented in the input, or arbitrary procedures were permitted as discrimination tests. What makes this problem hard is that the primitive components of the circuit, its features, are for the most part the same; by themselves they are not predictive. What is predictive is the set of relationships implicit in the input representation.

This is not merely a problem with discrimination networks and knowledge-poor matching. The same limitation holds for other feature-based retrieval algorithms. For example, PROTOS (Bareiss, 1989) does not rely on discrimination to retrieve exemplars, but instead uses a two-phase retrieval algorithm in which the reminding strengths of the input features are combined to activate hypothesis categories, and exemplars of those categories are then matched against the input case using knowledge about the relationships between features. In addition, PROTOS permits categories to be features of other categories, thereby representing complex features. Given the right category and the right relational vocabulary, PROTOS could distinguish between a circuit with resistors in series and one with resistors in parallel. Nevertheless, it could never be reminded of a category based solely on relationships in its input, even if those relationships were explicitly represented. This is a fundamental problem of feature-based recognition.

In effect, feature-based recognition reduces the input to a set of features, thereby losing all information about their configuration and cardinality. For some problems, this suffices to enable an efficient first pass retrieval, followed by a matching phase that ranks retrieved items or verifies a hypothesis category. From a pattern-recognition perspective, this is essentially pattern classification followed by template matching.

For many problems, however, this raises the question of exactly what should constitute a feature. If symbols in the input do not represent predictive features, then at what granularity and specificity are features to be found? For example, is a 5Ω resistor a different feature from a 500Ω resistor? Is a transistor a feature, or should it be translated into a `small signal' model composed of passive elements such as resistors? One could appeal to basic-level effects and argue that component types are natural categories (hence their designation by different schematic symbols). But one could also argue that the voltage divider example shows that features can be at a larger granularity, and the existence of alternative representations of transistors suggests that, ultimately, there may not be a fixed set of primitives.

Feature Extraction

One standard solution to these problems is to employ pattern recognition processes as a front end in order to make explicit the relationships that are otherwise implicit in the input. These extracted relationships are then treated as features by the problem-solving component. This process, called pattern classification, is illustrated in Fig. 2. Pattern classification takes as input a pattern that may be a perspective view or may be partially occluded, and transforms the pattern through a process of normalization. It then extracts symbolic features that can be used by the decision procedure to classify the input.

Figure 2: The Feature Extraction process

Feature extraction has been used by many AI programs to recognize domain-specific configurations, such as chess positions (Simon and Gilmartin, 1973). While it makes sense to exploit domain expectations in perception, this approach is limited because it effectively hard-wires the domain-specific part of the program, which is precisely the sort of knowledge least likely to be innate in any cognitive model of intelligence.

Moreover, for recognizing relational structure, feature extraction can be like the tail wagging the dog. Not only is it a domain-specific black box, but there is typically no feedback from the decision procedure to the feature-extraction process. Feature extraction provides no general account of how to focus attention, and because it ignores cardinality and configuration information, it is workable only when the number of relevant configurational categories is small. Most importantly, because features are extracted via arbitrary code, they are not systematically composed of simpler features that can be derived or learned from the input. The vocabulary of features is either fixed or determined by the task and the local features of the input. While top-down processing is important because it provides expectations in the form of a pragmatic (task-driven) context, it must be applied in a global-to-local fashion in order to provide a structural context as well.

The Problem With Parsing

An alternative to pattern classification is to re-represent the input to extract its relational structure. Syntactic pattern recognition extracts structural information through syntactic analysis (Fu, 1980). It translates local relationships into features using a pattern description language consisting of pattern primitives, composition operators, and a grammar. The generality of this approach depends on the existence of a natural or universal set of primitives, composition operators, and a common grammar.

One candidate for a common grammar, increasingly used in engineering design representations, is the bond-graph grammar (Paynter, 1961). Bond graphs are a mathematical notation used to model a very wide range of dynamic systems in terms of a small set of abstract primitive elements. Although they apply almost universally to modeling the flow of energy in dynamic systems, bond graphs are generally not applicable to modeling software systems or physical configurations. Choosing or learning an appropriate domain-specific grammar could be an arbitrarily difficult task.

Furthermore, parsing by itself can only extract the structure of formal languages. Informal languages, such as natural language, require some kind of semantic analysis as well. The basic problem with parsing is that it is fundamentally a bottom-up process. Contextual information might determine an initial pattern-description language, but after that there is no feedback from higher-level processes.

Toward Design Perception

Overcoming the limitations of feature-based recognition in case-based design requires a greater integration of problem solving and perception. A theory of design perception should combine a model of memory retrieval with a model of attention. This changes the emphasis of case-based reasoning from retrieval to recognition, and changes the emphasis of perception from classifying an input to re-representing it. Design perception differs from visual perception in terms of its input or base representation (which could be a topological space rather than a measure space) and in terms of the ambiguities it must address (design perception need not contend with perspective transformations, occlusion, or noise). The principal characteristics of a model of design perception would be:

  1. Recognition is based primarily on relationships. Rather than recognizing concepts from sets of features, it must be possible to recognize configurations based solely on relationships that may be implicit in the base representation.

  2. Processing is primarily global-to-local. Rather than building up an interpretation by composing primitive features, the process should decompose the input representation and try to recognize concepts or features at progressively finer granularity.

  3. Higher-level processes feed back to decomposition strategies. Flexible design perception can be neither purely top-down nor purely bottom-up, but must entail feedback from higher-level recognition processes and from problem-solving processes in order to provide structural and pragmatic context to focus attention.

  4. Primitives are decomposition strategies. Instead of relying on features of the world to serve as primitive building blocks, invariants would be strategies for decomposing patterns. This stands in contrast, for example, to the idea of recognition-by-components (Biederman, 1987).

  5. Abstract relationships are recognized first. At any given granularity, recognize relationships in the form of the base representation first. Domain-specific features are range-restrictions on these relationships.

  6. Decomposition knowledge is encoded as constraints or preferences, rather than as plans. Decomposition knowledge must serve multiple uses. It should support both local-to-global and global-to-local recognition processes, and for problem solving, it should support both synthesis and problem reduction.

If recognition is to be based first on relational structure, this raises the problem of inductive bias. Since a virtually unlimited number of relationships is implicit in any input, which should be attended to? Part of the answer is that expectations from pragmatic and structural context must provide preferences between decompositions. However, to rely solely on context is to invite infinite regress. Another part of the answer may be to focus attention using gestalt laws of perceptual grouping, such as grouping similar elements, highly interconnected clusters, and contiguous elements (Rock and Palmer, 1990).

Although originally postulated to account for visual perception, these laws also provide guidance in recognizing abstract relationships in a topological representation. While topological representations have no analog for proximity relationships, there could be other decomposition strategies, such as identifying parallel and serial connections of two-terminal devices, disconnecting graphs at cut nodes that form points of cleavage, and separating circuits or loops. In effect, this would promote configurational information from the role of disambiguation to a more predictive role.
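
One of these strategies can be made concrete with a small sketch. The code below is written in the same Lisp register as the system's representations, but with a hypothetical encoding: a device is a plain list (name type t1 t2), and shared terminal symbols denote connections. It identifies serial connections of two-terminal devices, the strategy that matters for the voltage divider.

(defun terminals-of (device)
  "The terminals of a device encoded as (name type t1 t2)."
  (cddr device))

(defun terminal-degree (terminal devices)
  "Number of devices attached to TERMINAL."
  (count-if (lambda (d) (member terminal (terminals-of d)))
            devices))

(defun series-pair-p (d1 d2 devices)
  "True when D1 and D2 share exactly one terminal with nothing
else attached to it: the defining property of a serial
connection of two-terminal devices."
  (let ((shared (intersection (terminals-of d1) (terminals-of d2))))
    (and (= (length shared) 1)
         (= (terminal-degree (first shared) devices) 2))))

(defun find-series-pairs (devices)
  "Collect the names of all serial pairs in DEVICES."
  (loop for (d1 . rest) on devices
        append (loop for d2 in rest
                     when (series-pair-p d1 d2 devices)
                       collect (list (first d1) (first d2)))))

;; In the single loop of Fig. 1 every pair of elements is
;; topologically serial; grouping by similarity (resistors vs.
;; sources) then singles out the resistor pair:
;;   (find-series-pairs '((r1 resistor n1 n2)
;;                        (r2 resistor n2 n3)
;;                        (v  voltage-source n1 n3)))
;;   => ((R1 R2) (R1 V) (R2 V))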

To see how this could work, let's revisit the voltage divider example. Given a structural description of the circuit, the first step would be to group similar elements R1 and R2. The next step would be to identify their relationships as a serial connection of two-terminal devices. If the components were complex sub-circuits, then this abstract relationship could provide a structural context to further constrain their interpretation. In this case, however, the components are simple resistors and their serial connection is enough to retrieve the concept `voltage divider' from which the constraint equation can be transferred to solve the original problem. In this way, recognition can be seen as a process of searching through the space of decompositions for representations that further constrain the recognition process.

An Experimental Implementation

To make this process more concrete, we have implemented an experimental algorithm for structure-based recognition. This algorithm takes as input a topological structure of components rather than a set of features. It re-represents that input structure in a higher-level vocabulary that may be more useful for reasoning or for retrieving specific cases. The algorithm has three main components: 1) A feature-driven recognition component that uses features of the input to predict a domain and possible context, 2) A decomposition process that takes a domain and an input graph and breaks the input into subgraphs, and 3) A graph re-writing component that matches subgraphs against predicted conceptual features and replaces low-level features with higher-level features. The algorithm iteratively re-represents the input until no further refinement can be made.
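
As a rough sketch of this control regime (not the system's actual code), the loop below assumes three functional interfaces corresponding to the components just listed; PREDICT, RE-REPRESENT, and DECOMPOSE are hypothetical parameters supplied by the caller.

(defun recognize (graph predict re-represent decompose)
  "Iteratively rewrite GRAPH in a higher-level vocabulary until no
further refinement can be made.  PREDICT maps a graph to candidate
concepts, RE-REPRESENT rewrites the graph given those predictions,
and DECOMPOSE splits a graph into subgraphs, returning NIL when no
further split is possible."
  (let ((predictions (funcall predict graph)))
    (cond
      ;; Some feature predicts a concept: try to rewrite.
      (predictions
       (let ((new (funcall re-represent graph predictions)))
         (if (equal new graph)
             graph                       ; nothing rewritten: fixpoint
             (recognize new predict re-represent decompose))))
      ;; No feature is predictive: decompose, recurse on the
      ;; subgraphs, and merge the re-represented pieces.
      (t
       (let ((subgraphs (funcall decompose graph)))
         (if subgraphs
             (reduce #'append
                     (mapcar (lambda (sub)
                               (recognize sub predict
                                          re-represent decompose))
                             subgraphs))
             graph))))))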

This process is supported by a conceptual representation with a number of unusual characteristics. First, domains are represented explicitly as categories that determine equivalence classes. Because there is no uniform vocabulary of primitive features with which an input must be specified, the level of abstraction at which features are considered to be `the same' is determined by the domain's equivalence classes. For example, in conceptually breaking apart a circuit design, sometimes you want to group all impedances together, and sometimes you need to distinguish resistors, capacitors, and inductors. In matching and decomposing the input, we progressively abstract the domain. Next, topological structure is itself represented as a domain. While designs are often implicitly represented as graphs of components, structural recognition makes it necessary to manipulate a representation with respect to progressively more abstract domains, up to and including the purely structural. In addition, for any domain, there are two privileged equivalence classes: the `null device' and the `infinite device'. In an electrical circuit, for example, we typically consider wire to be a null device (except for transmission-line effects, etc.). In the high-frequency domain, capacitors may also belong to the null-device class, while inductors become infinite devices. The null device permits us to treat all of its terminals as a single node, while an infinite device can simply be removed from the representation (in the context of a particular domain).

(defdomain electric-Circuit
 :type graph
 :equivalence-classes
    ((null-device wire)
     (resistor)
     (voltage-Source)))

(defconcept two-terminal-device
 :domain graph
 :terminals (t1 t2)
 :remindings (series parallel))

(defconcept series
 :domain graph
 :type three-terminal-device
 :terminals (d1.1 d1.2 d2.2)
 :features
    ((d1 two-terminal-device)
     (d2 two-terminal-device))
 :relationships
    ((connected d1.2 d2.1)))

(defconcept resistor
 :domain electric-circuit
 :type two-terminal-device)

(defconcept voltage-Divider
 :domain electric-circuit
 :type series
 :terminals (r1.1 r1.2 r2.2)
 :features ((r1 resistor)
            (r2 resistor)))

(defconcept CommonEmitter
 :domain active-circuit
 :type transistor-amp-config
 :terminals (Q1.1 Q1.2 Q1.3)
 :features ((Q1 transistor)
            (Rc resistor)
            (BN1 Bias-Net))
 :relationships
    ((connected Q1.2 BN1.2)
     (connected VD1.1 Rc.1)
     (connected Rc.2 Q1.1)
     (connected Q1.3 BN1.3)))
Figure 3: The Circuit Domain

Figure 3 shows some example representations for the electrical circuit domain. Here, compound devices are represented as graphs of features (i.e., components) and the relationships between their terminals. Features associate a name with a type constraint to support matching. Relationships also support matching by denoting connections between terminals. These connections need not be direct; they may pass through an arbitrary number of null devices (e.g., wires). Concepts also indicate their domain, parent type, and any remindings for which the concept is predictive. A reminding is another concept in the same domain for which the first concept may be a component feature.

An input to the program is a graph consisting of a set of features (instances of devices) and terminals. All relationships are implicitly represented by terminals shared between features. This input representation is generated automatically by a simple drafting program.
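
One plausible encoding of this input (with illustrative names, not the system's actual data structures) represents device instances and terminals as Lisp structures, leaving connectivity implicit in shared terminal objects, just as the drafting program's output is described above.

(defstruct terminal
  name)                        ; e.g. N1, N2, ...

(defstruct device
  name                         ; e.g. R274
  type                         ; e.g. RESISTOR, TRANSISTOR
  terminals)                   ; list of shared TERMINAL objects

(defun connected-p (d1 d2)
  "Devices are connected exactly when they share a TERMINAL object."
  (intersection (device-terminals d1) (device-terminals d2)))

;; For example, one resistor of Fig. 4:
;;   (let ((n1 (make-terminal :name 'n1))
;;         (n2 (make-terminal :name 'n2)))
;;     (make-device :name 'r274 :type 'resistor
;;                  :terminals (list n1 n2)))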

The RECOGNIZE procedure works by first determining if any feature in the input graph predicts (via remindings) a configuration or concept that can be directly matched. If not, it decomposes the input to subgraphs and progressively abstracts the domain and features until a feature is predictive of some configuration or concept. It then re-represents the input in terms of the new concept and repeats the process until no further refinement is possible.

The DECOMPOSE procedure first determines the domain of the input based on its features. It then combines all terminals connected via components in the null-device equivalence class for the current domain and eliminates `infinite devices'. Next, it groups together features based on similarity (defined by equivalence classes), and lastly, it decomposes those groups based on contiguity (i.e., connectivity).

The RE-REPRESENT procedure takes a subgraph and re-writes it in terms of features that are relevant to the domain. It works bottom-up by predicting the context of the subgraph, matching to find a classification and bindings, discriminating to specialize the classification, and finally re-writing the subgraph of the input in terms of the new feature. Verifying a reminding and discriminating to a specialization employ a subgraph matching algorithm that uses constraint propagation and marker passing to minimize the amount of non-deterministic graph matching performed.

To illustrate the algorithm, consider again the problem of recognizing a voltage divider in a common-emitter transistor amplifier (Fig. 4):

Figure 4: A common emitter amplifier

With this input, the features do not immediately predict a configuration, so the procedure decomposes the circuit into the subgraphs ((Q276) (V273) (R274 R277 R275)). Q276 predicts a transistor-amplifier configuration, but by itself the transistor does not match anything. V273 is a two-terminal device, so it predicts the series and parallel configurations, but it too fails to match in isolation. The resistors also predict the series and parallel configurations, and this time the matcher succeeds in binding R275 to the feature d1 of series and propagates the mapping d2 = R274 to complete the match. It then discriminates from series to voltage-divider and re-writes the circuit with R274 and R275 replaced by the new voltage divider. With this new feature, the process iterates and identifies the parallel combination of the voltage source and the voltage divider as a bias network in the context of a transistor circuit. Given these higher-level features, it should be easier to retrieve cases from a long-term memory.
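
The null-device step of DECOMPOSE can also be sketched briefly. The version below (hypothetical names again, with devices encoded as plain lists (name type t1 t2)) merges terminals connected through null devices using a simple union-find, so that connections may pass through any number of wires; infinite devices for the current domain would simply be dropped in the same pass.

(defun find-root (terminal parents)
  "Follow PARENTS (a hash table) to the canonical terminal."
  (let ((p (gethash terminal parents terminal)))
    (if (eq p terminal)
        terminal
        (find-root p parents))))

(defun merge-terminals (t1 t2 parents)
  (setf (gethash (find-root t1 parents) parents)
        (find-root t2 parents)))

(defun collapse-null-devices (devices null-types)
  "Remove null devices, unifying their terminals into single nodes."
  (let ((parents (make-hash-table)))
    ;; Union the two terminals of every null device.
    (dolist (d devices)
      (when (member (second d) null-types)
        (merge-terminals (third d) (fourth d) parents)))
    ;; Rewrite the remaining devices over canonical terminals.
    (loop for (name type . terms) in devices
          unless (member type null-types)
            collect (list* name type
                           (mapcar (lambda (tm) (find-root tm parents))
                                   terms)))))

;; Example: a resistor wired to another resistor becomes a direct
;; serial connection once the wire is collapsed:
;;   (collapse-null-devices
;;     '((r1 resistor n1 n2) (w1 wire n2 n3) (r2 resistor n3 n4))
;;     '(wire))
;;   => ((R1 RESISTOR N1 N3) (R2 RESISTOR N3 N4))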

Conclusions

To retrieve cases based on structural properties requires treating retrieval as a recognition process. Classical pattern recognition methods fail us in two ways: feature extraction is a top-down process that relies on a domain-dependent set of features and procedures for extracting them, while syntactic pattern recognition is a bottom-up parsing process that requires a pre-specified grammar and vocabulary of primitive tokens. We claim that an input can be decomposed and re-constituted in a more general way by reifying domain-dependent expectations as explicit equivalence classes, by decomposing and identifying features with respect to an explicit domain, and by allowing the recognition process to feed back upon failure so that it can progressively abstract the domain rather than directly abstract the features of the input. For pathological cases where only the configuration is predictive, the domain can be abstracted to the level of simple graph decompositions. This process is more akin to gestalt grouping than to the extraction of a fixed set of features. By using this structural recognition process in conjunction with the featural recognition processes of other case-retrieval algorithms, we can extend the range of design cliches and cases that can be retrieved.

Acknowledgments

This work was supported in part by the Defense Advanced Research Projects Agency, monitored by the Office of Naval Research under contracts N00014-91-J-4092 and N00014-91-J-4117. The Institute for the Learning Sciences was established in 1989 with the support of Andersen Consulting, part of The Arthur Andersen Worldwide Organization. The Institute receives additional support from Ameritech and North West Water Group plc, Institute Partners.

References

  1. Bareiss, R.: Exemplar-based Knowledge Acquisition: A Unified Approach to Concept Representation, Classification, and Learning. San Diego: Academic Press (1989)

  2. Berger, J.: Knowledge Acquisition and Design Support in a Medical Domain. In Case-Based Reasoning: Papers from the 1993 Workshop. Technical Report WS-93-01. Menlo Park, CA: AAAI Press (1993) 141-146

  3. Biederman, I.: Recognition-by-Components: A Theory of Human Image Understanding. Psychological Review 94 (1987) 115-147

  4. Borning, A.: The Programming Language Aspects of ThingLab, A Constraint-Oriented Simulation Laboratory. ACM Transactions on Programming Languages and Systems 3 (1981) 353-387

  5. Fu, K.S.: Syntactic (Linguistic) Pattern Recognition. In K.S. Fu (Ed.), Digital Pattern Recognition, New York: Springer-Verlag (1980) 95-134

  6. Hinrichs, T.R.: Problem Solving in Open Worlds: A case study in design. Hillsdale, NJ: Lawrence Erlbaum Associates (1992)

  7. Kolodner, J.L.: Case-Based Reasoning. San Mateo, CA: Morgan Kaufmann (1993)

  8. Minsky, M. and Papert, S.: Perceptrons. Cambridge, MA: MIT Press (1969)

  9. Paynter, H.M.: Analysis and Design of Engineering Systems. Cambridge, MA: MIT Press (1961)

  10. Ram, A. and Santamaria, J.C.: Continuous Case-Based Reasoning. In Case-Based Reasoning: Papers from the 1993 Workshop. Technical Report WS-93-01. Menlo Park, CA: AAAI Press (1993) 141-146

  11. Rock, I. and Palmer, S.: The Legacy of Gestalt Psychology. Scientific American, December (1990) 84-90

  12. Simon, H.A. and Gilmartin, K.: A simulation of memory for chess positions. Cognitive Psychology 5 (1973) 29-46