Presentations B Abstracts

Session A


Sven Eichholtz         

How AI Will Not Learn Reference: A Critique Of Cross-Modal Vector Space Alignment     

Recently developed artificial deep Neural Networks (dNNs) known as Large Language Models (LLMs) have shown unprecedented proficiency in producing seemingly meaningful language. A prevalent critique of these systems is that they are mere next-word predictors whose outputs and internal representations are not in fact meaningful (Bender and Koller 2020; Bender et al. 2021; Landgrebe and Smith 2021). This has led to renewed interest in what Searle (1980) called "the problem of intrinsic intentionality" and Harnad (1990) "the symbol grounding problem": how can the outputs and internal representations of AI systems that were given only meaningless text as training data become meaningfully related to the world independently from human interpretation? Central to this multifaceted issue as it applies to dNNs—dubbed "the vector grounding problem"—is referential grounding (Mollo and Millière 2023). For a system to understand and produce referring expressions, it must be able to meaningfully associate them with their referents. The rapid and global adoption of dNNs for aiding in countless tasks where linguistic meaning is non-trivial makes solving this problem a pressing challenge.

On a new proposal by Anders Søgaard (2022; 2023), progress in solving this problem can be made by developing an AI system that can match linguistic vector representations encoded by an LLM to visual vector representations of objects encoded by a computer Vision Model (VM). His idea is that if the vector space of linguistic representations generated by the LLM is geometrically isomorphic to the vector space of visual representations generated by the VM, then the two spaces can be aligned such that the AI system acquires a linear mapping between them, thereby allowing it to ground expressions in non-linguistic representations of their referents. Experiments by Li et al. (2023) and Merullo et al. (2023) show that there can indeed be an isomorphism between such vector spaces and that the accuracy of linear mappings picking out the right linguistic vector based on a given image increases with model size. This is thought to imply that, with enough data, machines could acquire referential semantics in virtue of being able to map expressions to referents and vice versa.
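As a rough illustration of the kind of alignment described above, the following minimal sketch fits a linear map from a toy "image" space to a toy "word" space and then retrieves the nearest word vector for a held-out image vector. Everything in it is hypothetical: the two-dimensional vectors, the vocabulary, and the function names are invented for illustration, and ordinary least squares stands in for the alignment procedures of the cited experiments, which are not reproduced here.

```python
import math

# Hypothetical 2-D "visual" embeddings, invented for illustration only.
IMAGE_VECS = {"cat": (1.0, 0.0), "dog": (0.9, 0.3), "car": (0.0, 1.0), "tree": (-1.0, 0.5)}
# Hypothetical "linguistic" embeddings: the same geometry, rotated by 90 degrees
# and scaled by 2, so the two toy spaces are isomorphic by construction.
WORD_VECS = {"cat": (0.0, 2.0), "dog": (-0.6, 1.8), "car": (-2.0, 0.0), "tree": (-1.0, -2.0)}


def fit_linear_map(src, tgt):
    """Return the 2x2 matrix W minimising sum ||W s - t||^2 over paired vectors."""
    # Entries of the symmetric matrix sum(s s^T).
    sxx = sum(s[0] * s[0] for s in src)
    sxy = sum(s[0] * s[1] for s in src)
    syy = sum(s[1] * s[1] for s in src)
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("source vectors do not span the plane")
    inv = [[syy / det, -sxy / det], [-sxy / det, sxx / det]]
    # cross[i][j] = sum over pairs of t[i] * s[j], i.e. sum(t s^T).
    cross = [[sum(t[i] * s[j] for s, t in zip(src, tgt)) for j in range(2)]
             for i in range(2)]
    # Normal equations: W = sum(t s^T) * inverse(sum(s s^T)).
    return [[sum(cross[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply_map(w, v):
    """Multiply the 2x2 matrix w by the 2-vector v."""
    return (w[0][0] * v[0] + w[0][1] * v[1], w[1][0] * v[0] + w[1][1] * v[1])


def cosine(a, b):
    """Cosine similarity between two non-zero 2-vectors."""
    dot = a[0] * b[0] + a[1] * b[1]
    return dot / (math.hypot(*a) * math.hypot(*b))


def retrieve_word(w, image_vec, word_vecs):
    """Map an image vector into word space and return the most similar word."""
    mapped = apply_map(w, image_vec)
    return max(word_vecs, key=lambda word: cosine(mapped, word_vecs[word]))


# Fit on three paired items, then test on the held-out item "tree".
train = ["cat", "dog", "car"]
W = fit_linear_map([IMAGE_VECS[k] for k in train], [WORD_VECS[k] for k in train])
print(retrieve_word(W, IMAGE_VECS["tree"], WORD_VECS))
```

Because the toy spaces are isomorphic by construction, the map fitted on three pairs also places the held-out image vector next to its word vector. With real models the geometries match only approximately, and retrieval accuracy is an empirical matter.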

This paper offers a critique of the proposal, investigating whether the vector representations of such a system would indeed be referentially grounded such that there would be a sense in which the system can be said to grasp referential semantics. As emphasised by Shanahan (2023), when it comes to judging whether we should ascribe such human capabilities to an AI system, it is important to look at precisely how the system functions. In that spirit, I start by considering the details of Søgaard's proposal in tandem with the experimental set-up of Li et al. Then, I offer three objections to the claim that such an AI system could capture referential semantics. First, I argue that the system does not learn reference on the suggested methods for aligning the vector spaces because these introduce what I call "semantic contamination" into the system. Second, I argue that a linear mapping between vector spaces does not show the system's vector representations are referentially grounded because geometric isomorphism between vector spaces is a structural resemblance relation, which is neither necessary nor sufficient for fixing representation or reference. Third, I argue that while the system's vector representations might have the right causal-informational relations to their putative referents, they cannot have the right historical-normative relations which are commonly held to be necessary for fixing representation and reference. I conclude that developing an AI system that integrates cross-modal vector space alignment is a dead-end for making progress on the vector grounding problem and, by extension, for finding a way for machines to grasp referential semantics.


Bender, Emily M., Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. 2021. “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–623. Virtual Event Canada: ACM.

Bender, Emily M., and Alexander Koller. 2020. “Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data.” In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 5185–5198. Online: ACL.

Harnad, Stevan. 1990. “The Symbol Grounding Problem.” Physica D: Nonlinear Phenomena 42 (1–3): 335–346.

Landgrebe, Jobst, and Barry Smith. 2021. “Making AI Meaningful Again.” Synthese 198 (March): 2061–2081.

Li, Jiaang, Yova Kementchedjhieva, and Anders Søgaard. 2023. “Implications of the Convergence of Language and Vision Model Geometries.” arXiv: 2302.06555.

Merullo, Jack, Louis Castricato, Carsten Eickhoff, and Ellie Pavlick. 2023. “Linearly Mapping from Image to Text Space.” arXiv: 2209.15162.

Mollo, Dimitri Coelho, and Raphaël Millière. 2023. “The Vector Grounding Problem.” arXiv: 2304.01481.

Searle, John R. 1980. “Minds, Brains, and Programs.” Behavioral and Brain Sciences 3 (3): 417–424.

Shanahan, Murray. 2023. “Talking About Large Language Models.” arXiv: 2212.03551.

Søgaard, Anders. 2022. “Understanding Models Understanding Language.” Synthese 200 (6): 443.

Søgaard, Anders. 2023. “Grounding the Vector Space of an Octopus: Word Meaning from Raw Text.” Minds and Machines 33 (1): 33–54.


Céline Budding        

What do large language models know? Tacit knowledge as a potential causal-explanatory structure

Large language models (LLMs), such as ChatGPT (OpenAI, 2022) and Claude (Anthropic AI, 2023), have recently exhibited impressive performance in conversational AI, generating seemingly human-like and coherent texts. In contrast to earlier language modeling methods, transformer-based LLMs do not contain any prior linguistic knowledge or rules, but are trained on next-word prediction: predicting the most likely next word in an input sequence based on large quantities of training data. Despite this seemingly simple training objective, transformer-based LLMs far exceed the performance of earlier methods.

Given this impressive performance, there is an increased interest in not only studying the behavior of these systems, but also investigating the underlying processing and how these systems make their predictions. That is, the focus is not only on the performance, but increasingly also on something like what Chomsky (1965) called the underlying competence. For instance, it has been proposed that LLMs might not just perform next-word prediction based on surface statistics, but that they in fact learn something akin to symbolic rules (Lepori et al., 2023; Pavlick, 2023) or even knowledge (Dai et al., 2022; McGrath et al., 2022; Meng et al., 2022; Yildirim & Paul, 2023). While it might seem appealing to attribute knowledge to LLMs to explain their impressive performance, what seems to be missing in the recent literature is a clear way to determine if LLMs are capable of acquiring a form of knowledge and, if so, when knowledge can be attributed.

Taking inspiration from earlier debates between symbolic and connectionist AI in the 1980s and 90s (e.g. Clark, 1991; Fodor & Pylyshyn, 1988), I propose that tacit knowledge, as defined by Davies (1990, 2015), provides a suitable way to conceptualize potential knowledge in LLMs. That is, if the constraints as set out by Davies (1990) are met by a particular LLM, that system can be said to have a form of knowledge, namely tacit knowledge. Tacit knowledge, in this context, refers to implicitly represented rules or structures that causally affect the system’s behavior. As connectionist systems are known not to have explicit knowledge, in contrast to earlier symbolic systems, tacit knowledge provides a promising way to nevertheless conceptualize and identify meaningful representations in the model internals. The aim of this contribution is to further explain Davies’ account of tacit knowledge, in particular the main constraint, and to show that at least some current transformer-based LLMs meet the constraints and could thus be said to acquire a form of knowledge.

While Davies’ account of tacit knowledge (1990) bears similarities to earlier accounts of tacit knowledge (e.g. Chomsky, 1965) insofar as it appeals to implicit rules, it is targeted specifically to connectionist networks. That is, Davies challenged the claim that connectionist networks cannot have knowledge and proposed that their behavior might be guided by implicit, rather than explicit, rules. To further substantiate his claim, Davies proposed a number of constraints, the most important being causal systematicity. Causal systematicity specifies what kind of implicit rules might be learned by a connectionist system by constraining the internal processing. In particular, causal systematicity requires that the internal processing of a system reflects a given pattern in the data, such as semantic structure. To illustrate, imagine a network that only processes two kinds of inputs: ones related to ‘Berlin’ and ones related to ‘Paris’. In a causally systematic network, there should be one causal structure, called a causal common factor, that processes all inputs related to Berlin, and another processing all inputs related to Paris.

For LLMs to be attributed tacit knowledge, they should thus meet the constraint of causal systematicity: their internal processing should reflect the semantic structure of the data. Nevertheless, Davies himself argued that more complex connectionist networks with distributed representation, of which LLMs are an example, cannot meet this constraint. If this objection were to hold, tacit knowledge would be unsuitable for characterizing potential knowledge in LLMs. In this contribution, I challenge Davies’ objection and argue that it does not hold for contemporary transformer-based LLMs. More precisely, architectural innovations in the transformer architecture, as compared to the connectionist networks Davies was concerned with, ensure that LLMs could in principle be causally systematic. As such, Davies’ objection to applying tacit knowledge to connectionist systems does not hold for transformer-based LLMs.

So far, I have explained the main constraint that should be met for attributing tacit knowledge and addressed a potential objection. I have not yet shown, however, whether any LLMs in fact meet this constraint. In the final part of my contribution, I analyze recent technical work by Meng and colleagues (2022), who identify representations of what they call factual knowledge in the model internals of a recent LLM. I show that these representations seem to fulfill the requirement of causal systematicity. While further verification of these results is needed, this suggests that some LLMs, like the one investigated by Meng and colleagues (2022), could meet the constraints for tacit knowledge and, as such, can be said to acquire a form of knowledge.

Taken together, the aims of this contribution are as follows: 1) to illustrate why LLMs might be thought to learn more than mere next-word prediction, 2) to propose Davies’ account of tacit knowledge as a way to characterize and provide conditions for attributing a form of knowledge to LLMs, 3) to address Davies’ objection to applying tacit knowledge to connectionist networks, and 4) to illustrate how this framework can be applied to contemporary LLMs. Although this work is exploratory, these results suggest that LLMs are capable of acquiring a form of knowledge, given that they meet the conditions as set out by Davies.


Anthropic AI. (2023). Introducing Claude. Anthropic.

Chomsky, N. (1965). Aspects of the Theory of Syntax. MIT Press.

Clark, A. (1991). Systematicity, Structured Representations and Cognitive Architecture: A Reply to Fodor and Pylyshyn. In T. Horgan & J. Tienson (Eds.), Connectionism and the Philosophy of Mind (pp. 198–218). Springer Netherlands.

Dai, D., Dong, L., Hao, Y., Sui, Z., Chang, B., & Wei, F. (2022). Knowledge Neurons in Pretrained Transformers (arXiv:2104.08696). arXiv.

Davies, M. (1990). Knowledge of Rules in Connectionist Networks. Intellectica. Revue de l’Association Pour La Recherche Cognitive, 9(1), 81–126.

Davies, M. (2015). Knowledge (Explicit, Implicit and Tacit): Philosophical Aspects. In International Encyclopedia of the Social & Behavioral Sciences (pp. 74–90). Elsevier.

Fodor, J. A., & Pylyshyn, Z. W. (1988). Connectionism and cognitive architecture: A critical analysis. Cognition, 28(1), 3–71.

Lepori, M. A., Serre, T., & Pavlick, E. (2023). Break It Down: Evidence for Structural Compositionality in Neural Networks (arXiv:2301.10884). arXiv.

McGrath, T., Kapishnikov, A., Tomašev, N., Pearce, A., Hassabis, D., Kim, B., Paquet, U., & Kramnik, V. (2022). Acquisition of Chess Knowledge in AlphaZero. Proceedings of the National Academy of Sciences, 119(47), e2206625119.

Meng, K., Bau, D., Andonian, A., & Belinkov, Y. (2022). Locating and Editing Factual Associations in GPT (arXiv:2202.05262). arXiv.

OpenAI. (2022, November 30). ChatGPT: Optimizing Language Models for Dialogue. OpenAI.

Pavlick, E. (2023). Symbols and grounding in large language models. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 381(2251), 20220041.

Yildirim, I., & Paul, L. A. (2023). From task structures to world models: What do LLMs know? (arXiv:2310.04276). arXiv.


Mitchell Green and Jan Michel     

What Robots Could Do with Words

This essay aims to extend our understanding of the limits and potential of AI. Specifically, we are interested in the question whether AI systems, particularly robots, can perform speech acts in the technical sense of that term used in the philosophy of language (Austin 1962). Building on previous work (X&Y 2022), we first investigate the psychological preconditions needed to perform speech acts in this sense. Second, we take a more nuanced look at the speech acts that robots could perform in different settings and at different stages of their development.

Following WWW (2018), we argue first that human mental states may be divided into four broad categories: cognitive (beliefs, memories, predictions), conative (desires, plans, objectives), affective (emotions and moods), and experiential states such as the taste of lemon or the smell of sulfur. We also contend that it is not necessary that all four of these types of state come in a package: some insects appear to have cognitive, conative, and experiential but not affective states (Baracchi et al. 2017), and congenital abnormalities in humans may prevent visual, auditory, etc., experience while leaving cognitive, conative and affective capacities intact (Weiskrantz 2004). We next argue with the help of a thought experiment that technologically feasible robots (which we define as robots that could plausibly be built in the next 50 years) would exhibit behavior best explained by ascription of cognitive and conative states, but that only these two types of state are needed for the performance of important speech acts such as assertions and commands. In arguing for the ability of robots to perform speech acts, we thereby bypass the question whether they can be sentient – a question on which we remain agnostic. We also show that such robots can understand the language they use so long as we avoid the implausible assumption that all understanding must involve an “Aha!” episode or an experience of insight (de Regt 2017).

In response to an earlier version of this argument (X&Y 2022), Butlin and Viebahn (ms) argue that any system that makes assertions must in fact be sentient. For them, two necessary conditions need to be met for a system to make assertions: (1) It must produce utterances that have a descriptive function. (2) It must be capable of being sanctioned for failing to meet the standards of assertion. Butlin and Viebahn spell out this liability-to-sanction idea as follows: sanctions for asserting require that (a) interlocutors keep track of asserters’ credibility, (b) asserters are sensitive to this, and (c) losing credibility is bad for asserters. Butlin and Viebahn argue that while machines can meet condition 1, as well as conditions 2a and 2b, it is doubtful that they can meet condition 2c. Their reason is that if we assume that a machine cannot be sentient, nothing can be either good or bad for it. Hence, one must either relinquish the claim that a machine can assert, or hold, controversially, that a machine can be sentient.

In reply, we argue that an outcome can be good or bad for a machine without its being sentient. Note, first, that lack of water or sunlight may be bad for a plant in spite of its lacking sentience. One might protest that an outcome is good or bad for an entity only if it is alive. That view, however, faces counterexamples: it is bad for an automobile to be driven with its emergency brake engaged, and bad for a democracy when free public discourse is suppressed. (Democracies comprise living organisms but are not themselves alive.)

To strengthen our case that an outcome can be good or bad for a machine without its being sentient, we develop a scenario in which robots have objectives that do not reduce to those of their designers; certain situations may then be good or bad for these machines to the extent that they promote or undermine the achievement of those objectives. To this end we elaborate the thought experiment given elsewhere (X&Y 2022): RobotCops are mobile police robots deployed by a municipality to prevent accidents and sanction traffic offenders. In short, RobotCops can move freely on the hard shoulder and communicate with each other; they are equipped with cameras, microphones and loudspeakers, among other things; they have connections to various databases where they can retrieve information on, for example, incoming weather or a motorist’s driving record. RobotCops are built on a sophisticated deep-learning architecture, can communicate with human drivers and, if necessary, sanction them for their traffic offenses within the scope of their discretion (for details cf. X&Y 2022).

When a robot behaves exactly as it was intended to by its designers, parsimony may lead us to infer that the only entities in the scenario possessing objectives are those designers. However, contemporary robots already may be, and technologically feasible robots likely will be, so complex as to behave in ways their designers could not have predicted. In such cases, any apparent objectives such robots have cannot be reduced to those of their designers. For instance, a RobotCop’s learning algorithm may lead it over time to tend to detain motorists driving cars of a certain make, to the surprise of its designers. (This may be because such drivers are objectively more likely to speed than other drivers; see Lönnqvist et al. 2020; Gardner 2022.) That may in turn create in the RobotCop the objective of detaining drivers of certain makes of cars in particular. In that situation, the presence of road conditions preventing it from pursuing such vehicles will make achieving the robot’s objective more difficult and thus be bad for it.

Because all this can occur without the robots being sentient, Butlin and Viebahn’s challenge has been met. This finding in turn sheds light on the kinds of communicative acts of which AIs may be capable, as well as on the nature of speech acts themselves. We close by raising doubts that the argument can be generalized across all speech acts.


Austin, J.L. (1962), How to Do Things with Words, 2nd Edition, J. Urmson and M. Sbisa, eds. (Harvard).

Baracchi, D., M. Lihoreau, and M. Giurfa (2017), ‘Do Insects Have Emotions? Some Insights from Bumble Bees,’ Frontiers in Behavioral Neuroscience, vol. 11,

Butlin, P., and E. Viebahn (ms), ‘AI Assertion,’ preprint available at

Damassino, N., and N. Novelli (2020), ‘Rethinking, Reworking, and Revolutionizing the Turing Test,’ Minds and Machines 30: 463–468.

de Regt, H. (2017), Understanding Scientific Understanding (Oxford).

Gardner, C. (2022), ‘These 5 Car Brands Have the Most Speeding Tickets in 2022,’

Lönnqvist, J.E., Ilmarinen, V.-J. and Leikas, S. (2020), ‘Not only Assholes Drive Mercedes. Besides Disagreeable Men, Also Conscientious People Drive High-Status Cars,’ International Journal of Psychology 55 (4): 572–576.

Weiskrantz, L. (2004), ‘Blindsight,’ in The Oxford Companion to the Mind, R. L. Gregory, ed. (Oxford).


Session B


Silvia Dadà

AI codes of ethics and guidelines. Are there really common principles?



Jan-Hendrik Heinrichs

AMAs, function creep and moral overburdening

Designing artificially intelligent systems to function as moral agents has been suggested as either a necessary or at least the most effective way to counter the risks inherent in intelligent automation [2]. This suggestion to generate so-called Artificial Moral Agents (AMAs) has received ample support [see the contributions in 7] as well as critique [9]. What has yet to be sufficiently taken into account in the discussion about AMAs is their effect on the moral discourse itself. Some important considerations in this regard have been presented by Shannon Vallor [8] and by Sherry Turkle [3] under the heading of moral deskilling. The core claim of the moral deskilling literature is that the employment of artificially intelligent systems can change spheres of activity which used to require moral skills in such a way that these skills lose their relevance and thus cease to be practiced.

This contribution will argue that deskilling is just one among several changes in the moral landscape that might accompany the development and dissemination of AMAs. It will be argued that there are two complementary trends which AMAs might initiate, both of which might fall under the heading of function creep as defined by Koops: “Based on this, function creep can be defined as an imperceptibly transformative and therewith contestable change in a data-processing system’s proper activity.” [4, 10]. These developments are a) the moralization of spheres of action previously under little or no moral constraints and b) changing – maybe rising – and increasingly complex standards of moral conduct across spheres of action.

a) The former trend – moralization of additional spheres of action – occurs when, in the process of (partially) automating a certain task, moral precautions are implemented that were not part of previous considerations. A common example is the explicit, prior weighing of lives which automated driving systems are supposed to implement, but which typically does not play a role in a real driver’s education, much less in their reaction during an accident. Automatization – be it of cars, government services or any other sphere of activity – typically requires actively revising or maintaining the structures of a given process and therefore generates the option to include moral considerations where there previously were none or few. The availability of established AMA-techniques is likely to influence stakeholders to include such moral considerations, whether for product safety reasons or for mere commercial advantage.

b) The latter trend – the change of standards of moral conduct – is an effect of the requirements of intelligent automatization on the one hand and of professionalism on the other. Compared to humans, AMAs and their algorithmic processes employ different processes of behavioural control in general and of observing moral constraints in particular [6]. Thus, even if automation of tasks tries to mimic pre-established moral behaviour and rules, there will be differences in the representation, interdependence, and acceptable ambiguity of moral categories in the human original and the AMA implementation. Furthermore, the implementation of moral constraints in automated systems already is a professional field which tries to live up to extremely high standards – as exemplified by the increasingly complex approaches to algorithmic fairness [11]. Accordingly, it is to be expected that the field will aim to implement high moral standards in its products, sometimes beyond what would be expected of human agents in the same sphere of activity.

While both trends seem to be positive developments at first glance, there is a relevant risk that changing and complex moral standards in more and more spheres of action overburden an increasing number of people, making them increasingly dependent on external guidance or setting them up for moral failure and its consequences. This development combines two effects which have previously been identified in the literature. On the one hand, it seems to be a case of what Frischmann and Selinger called ‘techno-social engineering creep’ [4], that is, the detrimental change of our collective social infrastructure and standards through individual, seemingly rational choices. By individually implementing AMA-algorithms in appliances in several spheres of activity for reasons of safety (or market share), we change the moral landscape throughout. On the other hand, it is similar to what Daniel Wikler has identified as one of the threats of human enhancement, namely that standards suited for highly functioning individuals will overburden the rest of the community [5]. The present trends differ from Wikler’s analysis insofar as they might affect all human agents and not just those who forgo some form of cognitive improvement.

Clearly, the proliferation and change of moral standards does not merely carry risks. It has the potential to generate relevant social benefits, which need to be considered in a balanced account. However, a purely consequentialist analysis of these trends would lack regard for the important dimension of procedural justification. While changes in the moral landscape often occur without explicit collective attention or explicit deliberation, there seem to be minimal conditions of legitimacy, such as not providing any affected party with good reasons to withhold their consent [1]. The current contribution will sketch the consequentialist and procedural constraints on the trends identified above and try to spot a path between relinquishing the design of AMAs and alienating human agents from moral discourse.


1. Scanlon, T., What we owe to each other. 1998, Cambridge, Mass.: Belknap Press of Harvard University Press. ix, 420 p.

2. Wallach, W. and C. Allen, Moral machines. Teaching robots right from wrong. 2009, Oxford ; New York: Oxford University Press. xi, 275 p.

3. Turkle, S., Alone Together: Why We Expect More from Technology and Less from Each Other. 2011, New York: Basic Books.

4. Frischmann, B. and E. Selinger, Re-Engineering Humanity. 2018, Cambridge: Cambridge University Press.

5. Wikler, D., Paternalism in the Age of Cognitive Enhancement: Do Civil Liberties Presuppose Roughly Equal Mental Ability?, in Human Enhancement, J. Savulescu and N. Bostrom, Editors. 2010, Oxford University Press: Oxford / New York.

6. Liu, Y., et al., Artificial Moral Advisors: A New Perspective from Moral Psychology, in Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society. 2022, Association for Computing Machinery: Oxford, United Kingdom. p. 436–445.

7. Anderson, M. and S.L. Anderson, eds. Machine Ethics. 2011, Cambridge University Press: Cambridge.

8. Vallor, S., Moral Deskilling and Upskilling in a New Machine Age: Reflections on the Ambiguous Future of Character. Philosophy & Technology, 2015. 28(1): p. 107-124.

9. van Wynsberghe, A. and S. Robbins, Critiquing the Reasons for Making Artificial Moral Agents. Science and Engineering Ethics, 2019. 25(3): p. 719-735.

10. Koops, B.-J., The Concept of Function Creep. Management of Innovation eJournal, 2020.

11. Fazelpour, S. and D. Danks, Algorithmic bias: Senses, sources, solutions. Philosophy Compass, 2021. 16(8): p. e12760.


Phillip Honenberger

Fairness in AI/ML Systems: An Integrative Ethics Approach



Arzu Formánek

Is it wrong to kick Kickable 3.0? An affordance based approach to ethics of human-robot interaction