Revisiting a 30-year-old critique of neural networks: why NNs cannot achieve explainable AI

Walid S. Saba, a senior research scientist at Northeastern University’s Institute for Experiential Artificial Intelligence, argues from the perspective of compositional semantics that deep learning cannot construct a reversible compositional semantics and therefore cannot achieve explainable AI.

Author | Walid S. Saba

Compilation | Antonio

Editor | Chen Caixian

1
Explainable AI (XAI)

As Deep Neural Networks (DNNs) are used to make decisions that matter to people, such as loan approvals, job applications, and court bail, or even life-or-death decisions such as a sudden stop on the highway, it is critical that these decisions be explained, not merely accompanied by a prediction score.

Research in explainable artificial intelligence (XAI) has recently focused on the concept of counterfactual examples. The idea is simple: first construct counterfactual examples that would receive the desired output and feed them into the original network; then read the hidden-layer units to explain why the network produces some other output. More formally:

“The score p is returned because the variable V has the values (v1, v2, …) associated with it. If V had the values (v’1, v’2, …), with all other variables held constant, the score p’ would be returned.”

Here is a more specific example:

“You were refused a loan because your annual income was £30,000. If your income was £45,000, you would get a loan.”
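To make this concrete, here is a minimal sketch of how such a counterfactual might be computed, assuming a toy linear scoring function and an invented approval threshold (neither comes from B&W nor from any real lender):

```python
# A toy counterfactual search: find the smallest income change that flips
# the decision of a (hypothetical) scoring model.

def loan_score(income_gbp: float) -> float:
    # Invented scorer: the approval score grows linearly with annual income.
    return income_gbp / 50_000.0

def counterfactual_income(current_income: float,
                          threshold: float = 0.9,
                          step: float = 1_000.0) -> float:
    """Search upward in steps for the smallest income whose score clears the threshold."""
    income = current_income
    while loan_score(income) < threshold:
        income += step
    return income

if __name__ == "__main__":
    income = 30_000.0
    needed = counterfactual_income(income)
    print(f"Refused at £{income:,.0f}; an income of £{needed:,.0f} would be approved.")
```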

However, a recent paper by Browne and Swift [1] (hereafter B&W) showed that counterfactual examples are only slightly more meaningful adversarial examples, which are generated by applying small, imperceptible perturbations to the input that cause the network to misclassify it with high confidence.

Furthermore, counterfactual examples “explain” what some features would have to be in order to obtain a particular prediction, but they “do not open the black box”; that is, they do not explain how the algorithm works. The paper goes on to argue that counterfactual examples do not provide a solution to interpretability and that “there is no explanation without semantics”.

In fact, the paper makes an even stronger recommendation:

1) We either find a way to extract the semantics assumed to exist in the hidden layers of the network, or

2) Admit that we have failed.

Walid S. Saba himself is pessimistic about (1); in other words, he regretfully concedes failure. The following are his reasons.

2
“Ghosts” by Fodor and Pylyshyn

While Saba fully agrees with B&W’s view that “there is no explanation without semantics”, he argues that the hope of extracting the semantics represented by the hidden layers of a deep neural network, so as to produce satisfactory explanations for deep learning systems, cannot be fulfilled, for exactly the reasons outlined by Fodor and Pylyshyn [2] more than thirty years ago.

Legend: Jerry A. Fodor (left) and Zenon Pylyshyn

Saba goes on to argue that, before explaining where the problem lies, we need to note that purely extensional models (such as neural networks) cannot model systematicity and compositionality, because they do not recognize symbolic structures that have a derivable syntax and a corresponding semantics.

Representations in neural networks are therefore not really “symbols” that correspond to anything interpretable; they are distributed, correlated, continuous numerical values that by themselves do not imply anything conceptually interpretable.

In simpler terms, subsymbolic representations in neural networks do not by themselves refer to anything humans can conceptually understand (a hidden unit by itself cannot represent any object of metaphysical significance). Rather, it is a collection of hidden units that typically represents, collectively, some salient feature (e.g., a cat’s whiskers).

But this is exactly why neural networks cannot achieve interpretability: the combination of several hidden features is not determinable; once the combination is performed (by some linear combination function), the individual units are lost (as we show below).

3
Explainability Is “Reverse Reasoning”, and DNNs Cannot Reverse-Reason

Saba then discusses why Fodor and Pylyshyn concluded that NNs cannot model systematic (and therefore interpretable) inference [2].

In symbolic systems, there are well-defined compositional semantic functions that compute the meaning of a compound expression from the meanings of its constituents. Crucially, this composition is reversible:

That is, one can always recover the (input) constituents that produced a given output, precisely because a symbolic system provides access to a syntactic structure that records how the constituents were assembled. This is not the case in NNs: once vectors (tensors) are combined in a NN, their decomposition cannot be determined (there are infinitely many ways a vector, or even a scalar, can be decomposed!).
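A minimal sketch of this point, using arbitrary example vectors (nothing here is taken from any trained model):

```python
# Why composition by vector addition is not reversible: two entirely
# different pairs of constituents compose to the same vector, so the
# combined representation alone cannot tell us what was combined.
import numpy as np

a1, b1 = np.array([1.0, 2.0]), np.array([3.0, 4.0])
a2, b2 = np.array([0.5, -1.0]), np.array([3.5, 7.0])

assert np.allclose(a1 + b1, a2 + b2)
print(a1 + b1)  # [4. 6.] -- the decomposition is not recoverable from the sum
```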

To illustrate why this is at the heart of the problem, let us consider B&W’s proposal to extract semantics from DNNs to achieve interpretability. B&W’s recommendation is to produce explanations of the following form:

The input image is labeled “building” because hidden neuron 41435, which normally activates for hubcaps, has an activation value of 0.32. If hidden neuron 41435 had an activation value of 0.87, the input image would be labeled “car”.

To see why this does not lead to interpretability, just note that requiring an activation of 0.87 for neuron 41435 is not sufficient. For simplicity, assume neuron 41435 has only two inputs, x1 and x2. What we have now is shown in Figure 1 below:

Legend: The output of a single neuron with two inputs is 0.87

Now suppose our activation function f is the popular ReLU, so the output is z = ReLU(w1·x1 + w2·x2) = 0.87. This means that for the values of x1, x2, w1, and w2 shown in the table below, an output of 0.87 is obtained.

Table note: many different combinations of inputs and weights produce the value 0.87

Looking at the table above, it’s easy to see that there are an infinite number of linear combinations of x1, x2, w1, and w2 that would produce an output of 0.87. The point here is that compositionality in NNs is irreversible, so meaningful semantics cannot be captured from any neuron or any collection of neurons.
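The following sketch makes the same point numerically: it samples arbitrary values and solves for one weight so that the neuron’s ReLU output is exactly 0.87 (only the target value 0.87 comes from the example above; all sampled values are invented for illustration):

```python
# Infinitely many (x1, x2, w1, w2) combinations yield the same ReLU output.
import random

def relu(z: float) -> float:
    return max(0.0, z)

def neuron(x1: float, x2: float, w1: float, w2: float) -> float:
    return relu(w1 * x1 + w2 * x2)

target = 0.87
for _ in range(5):
    x1 = random.uniform(-2, 2)
    x2 = random.uniform(0.1, 2)   # kept away from zero so we can solve for w2
    w1 = random.uniform(-2, 2)
    w2 = (target - w1 * x1) / x2  # choose w2 so that w1*x1 + w2*x2 == target
    print(f"x1={x1:+.2f} x2={x2:+.2f} w1={w1:+.2f} w2={w2:+.2f} "
          f"-> output={neuron(x1, x2, w1, w2):.2f}")
```

Every line prints an output of 0.87, yet the constituents differ each time.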

In keeping with B&W’s slogan “no explanation without semantics”, we conclude that no explanation can ever be obtained from a NN. In short: there is no explanation without semantics, no semantics without reversible compositionality, and compositionality in DNNs is irreversible. This can be formalized as follows:

1. There is no explanation without semantics [1]

2. No semantics without reversible compositionality [2]

3. Compositionality in DNN is irreversible [2]

=> Explanations cannot be obtained from DNNs (no XAI)

QED.
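For readers who want the inferential chain spelled out mechanically, here is a minimal propositional rendering (the proposition names are ours, not the article’s):

```lean
-- Premises: explanation requires semantics, semantics requires reversible
-- compositionality, and compositionality in DNNs is not reversible.
variable (Explanation Semantics ReversibleCompositionality : Prop)

example
    (h1 : Explanation → Semantics)                 -- premise 1
    (h2 : Semantics → ReversibleCompositionality)  -- premise 2
    (h3 : ¬ ReversibleCompositionality)            -- premise 3
    : ¬ Explanation :=
  fun e => h3 (h2 (h1 e))
```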

Incidentally, the fact that compositionality in DNNs is irreversible has consequences beyond the failure to produce interpretable predictions, especially in areas that require higher-level reasoning, such as natural language understanding (NLU).

In particular, such a system fails to explain how a child can learn to interpret an infinite number of sentences from a single template such as (<human> <likes> <entity>), since “John”, “the neighbor’s girl”, “the boy who always comes here in a t-shirt”, and so on are all possible instantiations of <human>, while “classic rock”, “fame”, “Mary’s grandmother”, “running on the beach”, and so on are all possible instantiations of <entity>.
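For contrast, here is a minimal sketch of the symbolic picture, in which composition preserves structure and therefore remains reversible (the tiny data structure is invented for illustration; the fillers are the article’s examples):

```python
# A template like (<human> <likes> <entity>) composes instantiations while
# keeping the syntactic structure, so the constituents can always be read back.
from typing import NamedTuple

class Likes(NamedTuple):
    human: str
    entity: str

def compose(human: str, entity: str) -> Likes:
    return Likes(human, entity)

sentence = compose("the neighbor's girl", "running on the beach")

# The composition is reversible: the parts are recoverable from the whole.
print(sentence.human)   # the neighbor's girl
print(sentence.entity)  # running on the beach
```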

Because such systems have no “memory” and their composition cannot be reversed, they would in theory require an unbounded number of examples to learn even this simple structure. [Editor’s note: This is precisely Chomsky’s critique of structuralist linguistics, which launched the transformational-generative grammar that has influenced linguistics for more than half a century.]

Finally, Saba highlights that more than three decades ago Fodor and Pylyshyn [2] presented a critique of NNs as a cognitive architecture: they showed why NNs cannot model systematicity, productivity, and compositionality, all of which are necessary for talking about any “semantics”, and this convincing critique has never been satisfactorily answered.

As the need to address AI interpretability becomes critical, we must revisit that classic paper as it shows the limitations of equating statistical pattern recognition with advances in AI.

References:

Blog address: https://ift.tt/dtLy2ZM
[1] Browne, Kieran, and Ben Swift. “Semantics and explanation: why counterfactual explanations produce adversarial examples in deep neural networks.” arXiv preprint arXiv:2012.10076 (2020). https://ift.tt/S1xpi63
[2] Fodor, Jerry A., and Zenon W. Pylyshyn. “Connectionism and cognitive architecture: A critical analysis.” Cognition 28.1-2 (1988): 3-71. https://ift.tt/xAPeGpK
