Presentations and Abstracts
Session G
Katsunori Miyahara and Hayate Shimizu
Discerning genuine and artificial sociality: a technomoral virtue to live with chatbots
N/A
Luise Mueller
Generative AI and Art as a Social Practice
Text-to-image generators like OpenAI’s DALL-E, Midjourney or Stable Diffusion create images from natural language prompts given by humans. The data on which text-to-image generators are trained is scraped from the web and consequently contains an enormous quantity of images created by human artists.
Some artists object that companies use their artistic output for their text-to-image generators without proper authorization or compensation. In her written testimony submitted to the US Senate Subcommittee on Intellectual Property, artist and illustrator Karla Ortiz argues that “AI needs to be fair to the customers who use these materials, and also for creative people like me who make the raw material that these AI materials depend upon. These systems depend entirely on the work of humans, especially creatives such as visual artists, writers and musicians.”
But while text-to-image generators have indeed been trained on human-created art, they do not simply reproduce those images; rather, they create novel depictions of whatever they are prompted to produce. The artists’ complaint is thus not a complaint about copying or plagiarizing the artists’ work. The complaint is rather that their artistic output has been used and remixed – without authorization or compensation – to create art that can be described neither as wholly original nor as fully derivative.
In this paper, I ask whether the complaint has any (moral) merit, and if so, what its merit is. To make sense of the complaint, I discuss three models that conceptualize creating art as a social practice that is governed by rules and principles that distribute the rights, duties, burdens and benefits that arise within the practice: the moral-rights-oriented model, the goal-oriented model, and the reciprocity-oriented model.
The moral-rights-oriented model best fits the artists’ complaint. According to the moral-rights-oriented model, the rules that govern social practices functionally ensure that the moral rights of those participating in the social practice are protected. The complaint would then be that generative AI violates the intellectual property rights of human artists because human artists have an exclusive moral right to the use of their artistic output, even if that use does not consist of copying that output, but of using it as ‘input’ for novel images.
But this model is implausible, because the argument supporting the complaint would apply equally to human artists who draw inspiration from works of art that are freely available. Humans use visual input consisting of other artists’ creative work for their own creative process all the time. It is widely accepted that an artist’s style evolves from other artists’ styles, or that artists reference – and indeed remix – other creative work in their own creative output. Why would that constitute a moral rights violation in one case but not in the other?
According to the goal-oriented model, social practices have certain goals, and the rules that govern those practices ensure that participants in the practice can successfully contribute to those goals. Examples are the rules of a sports game, driving regulations, or the rules governing vaccine research. The rights and duties of those participating in the practice are tailored to the overall aim of contributing to the goal of the practice, which itself is often defined as what is beneficial for, or valuable to, all. If this model were correct for the practice of creating art, then compensation claims or claims about wronging would be harder to justify, because those claims must always be balanced against the larger goal of the practice. Take the example of the social practice of scientific discovery: here, it serves the goal of the practice that participants are allowed, and indeed encouraged, to use and build on the work of others.
The problem with the goal-oriented model is that while it may be useful for describing the practice of scientific discovery, it does not neatly fit the practice of creating art. The idea that the goal of creative art is to benefit society or to serve a specific purpose is too narrow, because not all creative art has to serve a goal or be beneficial to society – you might even think that it is precisely the point of art that it serves no goal and need not be beneficial.
Both models thus have serious difficulties in explaining what goes on between artists and text-to-image generators: the moral-rights-based model is inadequate when applied to the relations between human artists, and the goal-oriented model is inadequate because it does not fit the point of the social practice of creating art. A model that does better on both counts is the reciprocity-based approach. On the reciprocity-based approach, the social practice of creating art is based on the idea that artists reciprocally influence, inspire, reference and remix one another without copying each other’s work. According to the reciprocity-based model, then, text-to-image generators are problematic when they take on the role of a free rider that enjoys the advantages of a given practice while failing to contribute to producing those advantages. A more appropriate way to integrate text-to-image generators into the social practice of creating art is to have them make fair contributions to the benefits they take advantage of, and to distribute the burdens of generating those benefits more equally.
Pierre Saint-Germier
“I sing the body algorithmic”: Machine Learning and Embodiment in Human-Machine Collective Music-Making
In spite of recent progress in Computer Vision, Image Generation, Natural Language Processing, and (arguably) General Artificial Intelligence, AI research based on the application of deep learning techniques to digital or digitized data seems confined to the digital realm. Several decades of work in the field of Embodied Cognition (Shapiro 2014) suggest, however, that there is a limit to the kind of capacities that can be conferred on artificial agents by algorithmic means. In cases where machine learning exploits data containing information about embodied states or processes (e.g., motion capture data, vocal signature data, recorded musical performances), it seems plausible that some sort of embodiment may be preserved through machine learning. What remains to be clarified is the sense in which data may be said to be embodied, and whether machine learning from such embodied data is sufficient to confer at least some of the advantages of embodiment on algorithmic agents. The present paper contributes to this clarification through a combination of conceptual analysis and experimental study, focusing on the case of human-machine co-improvisation in musical AI.
Research in the field of Embodied Music Cognition (Lesaffre et al. 2017) has shown that embodiment is essential to expressive and interactional properties of collective music performance. Furthermore, collective musical improvisation instantiates the sort of Continuous Reciprocal Causation (Clark 2008) for which Embodied Cognition approaches are particularly suited. Finally, the use of machine learning techniques has recently led to important progress in the design of algorithmic agents for collective improvisation. For instance, the SoMax2 application, designed at IRCAM, outputs stylistically coherent improvisations, based on a generative model constructed by machine learning, while interacting with a human improviser (Borg 2021). The musical agency of SoMax2 is essentially algorithmic in the sense that the physical properties of the sensors (microphones) and effectors (loudspeakers) play no role in the generation of the musical output. This makes human-machine co-improvisation a particularly relevant case study.
On the conceptual side, we argue for a distinction between two orthogonal dimensions of embodiment. On the one hand, the musicians’ embodiment qua multimodal resource provides visual as well as auditory cues that facilitate musician coordination (Moran 2015). On the other hand, the contingencies of the musician’s body (e.g., the fact that a pianist has two hands with five fingers each) limit and shape the sort of musical signals that may be produced. This is in particular the source of instrumental idiomaticity and gestural expressivity in music (Souza 2017). In virtue of this limiting and shaping effect on the space of all possible musical signals, embodiment qua generative constraint allows listeners and co-improvisers to exploit low-level perceptual expectations, which are the basis for the perception and appreciation of musical expressivity (Meyer 1956; Gingras et al. 2016), as well as for coordination within collective improvisation (Vesper et al. 2010). While embodiment qua multimodal resource is not easily reflected in audio data and Musical Instrument Digital Interface (MIDI) data, these data plausibly bear the marks of the shaping effect of embodiment qua generative constraint. We submit that embodiment qua generative constraint may be transferred from the corpus data to the musical behavior of an algorithmic agent. Isolating the concept of embodiment qua generative constraint explains the sense in which algorithmic musical agents may nevertheless be embodied via machine learning.
The argument presented so far relies on the empirical assumption that the marks of embodiment in the corpus data are able to provide the agent with some of the benefits of embodiment. In order to fill this gap, we conducted a series of experimental studies. We collected a corpus of MIDI data by having an expert pianist record 7 miniature solo improvisations (min=2’03; max=3’13) and one long solo improvisation (10’30). We applied three manipulations to the resulting MIDI data, designed to selectively erase the marks of the bodily constraint: (a) randomization of all the notes’ velocities (i.e., the MIDI encoding of the loudness of notes), erasing all dynamical shapes while keeping all harmonic and melodic information; (b) application of random octave jumps to all the notes, erasing melodic shapes while keeping all dynamic and harmonic information; (c) the combination of (a) and (b). We then isolated randomly chosen 15-second excerpts from each track of the corpus, applied the aforementioned manipulations to all of them, presented the resulting 32 excerpts to a group of (self-declared) musician participants (n=29), and asked them to judge, on a scale from 0 to 10, the pianistic plausibility of each excerpt. A one-way ANOVA revealed a significant effect of Manipulation (F=12.601, p<0.001). A post-hoc Tukey test showed that participants’ ratings were significantly higher for “Original” (M=6.775, SD=2.524) than for “Random Octaves” (M=5.976, SD=2.862) (p=0.008) and for “Random Velocities and Octaves” (M=5.795, SD=2.691) (p<0.001). This suggests that the randomization of octaves, alone and in combination with the randomization of velocity, was indeed sufficient to remove the marks of embodiment in the corpus data.
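The destructive manipulations in (a)–(c) are simple transformations of the MIDI stream. The following is a minimal sketch of how they could be implemented, assuming the Python mido library; the file name, octave range, and note-pairing logic are illustrative assumptions, not the authors’ actual procedure.

```python
# Illustrative sketch (not the authors' code) of the corpus manipulations:
# velocity randomization erases dynamic shape, random octave jumps erase
# melodic shape; applying both corresponds to manipulation (c).
import random
import mido

def randomize_velocities(track):
    # Manipulation (a): give every sounding note a random velocity (1-127),
    # destroying dynamics while keeping pitch and timing intact.
    return mido.MidiTrack(
        msg.copy(velocity=random.randint(1, 127))
        if msg.type == 'note_on' and msg.velocity > 0 else msg
        for msg in track
    )

def randomize_octaves(track, max_jump=2):
    # Manipulation (b): shift each note by a random whole number of octaves,
    # destroying melodic shape while preserving dynamics and pitch class.
    # Note-offs reuse the shift chosen for the matching note-on.
    active, out = {}, mido.MidiTrack()
    for msg in track:
        if msg.type == 'note_on' and msg.velocity > 0:
            shifted = min(127, max(0, msg.note + 12 * random.randint(-max_jump, max_jump)))
            active[(msg.channel, msg.note)] = shifted
            out.append(msg.copy(note=shifted))
        elif msg.type == 'note_off' or (msg.type == 'note_on' and msg.velocity == 0):
            out.append(msg.copy(note=active.pop((msg.channel, msg.note), msg.note)))
        else:
            out.append(msg)
    return out

mid = mido.MidiFile('improvisation.mid')  # hypothetical corpus track
mid.tracks = [randomize_octaves(randomize_velocities(t)) for t in mid.tracks]
mid.save('improvisation_rand_vel_oct.mid')  # condition (c): both marks erased
```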
Second, we recorded a solo output of the SoMax2 application fed with each version of each track of the corpus, and randomly extracted a 15-second excerpt from the resulting improvisation. We asked the same group of musicians to evaluate the quality of the resulting improvisation. A one-way ANOVA revealed a significant effect of Manipulation (F=9.206, p<0.001). A post-hoc Tukey test showed that participants’ ratings were significantly higher for “Original” (M=5.699, SD=2.496) than for “Random Velocities and Octaves” (M=4.758, SD=2.766). This suggests that the erasure of the stronger marks of embodiment significantly diminished the perceived quality of artificial improvisations.
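The reported statistics follow a standard pattern that can be reproduced with off-the-shelf tools. Below is a minimal sketch, assuming ratings collected in a CSV file with hypothetical column names; scipy supplies the one-way ANOVA and statsmodels the post-hoc Tukey HSD test. This is not the authors’ analysis script.

```python
# Illustrative sketch of the analysis pattern: one-way ANOVA over the
# manipulation conditions, followed by post-hoc Tukey HSD comparisons.
# 'ratings.csv' and its column names are hypothetical.
import pandas as pd
from scipy.stats import f_oneway
from statsmodels.stats.multicomp import pairwise_tukeyhsd

df = pd.read_csv('ratings.csv')  # columns: 'condition', 'rating' (0-10 scale)

# Main effect of Manipulation.
groups = [g['rating'].values for _, g in df.groupby('condition')]
f_stat, p_value = f_oneway(*groups)
print(f'One-way ANOVA: F = {f_stat:.3f}, p = {p_value:.4f}')

# Pairwise comparisons, e.g. Original vs. Random Velocities vs.
# Random Octaves vs. Random Velocities and Octaves.
tukey = pairwise_tukeyhsd(endog=df['rating'], groups=df['condition'], alpha=0.05)
print(tukey.summary())
```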
These results are in line with the view that the embodiment reflected in the data makes a positive difference to the behavior learned from the data, consistent with the expected benefits of embodiment known from the Embodied Music Cognition literature. We have thus articulated a sense in which algorithmic agents may enjoy some of the advantages of embodiment when their behavior is learned from data bearing the marks of the generative constraints of the body.
References
Borg, J. 2021. “The Somax 2 Theoretical Model.” Technical report, IRCAM.
Clark, A. 2008. Supersizing the Mind: Embodiment, Action, and Cognitive Extension. Oxford: Oxford University Press.
Gingras et al. 2016. “Linking melodic expectation to expressive performance timing and perceived musical tension.” Journal of Experimental Psychology: Human Perception and Performance, vol. 42, no. 4, pp. 594–609.
Lesaffre, M., Maes, P.-J., and Leman, M. 2017. Routledge Handbook of Embodied Music Interaction. Abingdon: Routledge.
Meyer, L. 1956. Emotion and Meaning in Music. Chicago: University of Chicago Press.
Moran, N., et al. 2015. “Perception of ‘Back-Channeling’ Nonverbal Feedback in Musical Duo Improvisation.” PLOS ONE 10(6), pp. 1–13.
Shapiro, L. 2014. Routledge Handbook of Embodied Cognition. Abingdon: Routledge.
Souza, J. De. 2017. Music at Hand. New York: Oxford University Press.
Vesper, C., et al. 2010. “A minimal architecture for joint action.” Neural Networks, vol. 23, issues 8–9, pp. 998–1003.
Alice Helliwell
Creativity, Agency, and AI
N/A
Session H
Nikhil Mahant
Is AI deception deception?
AI systems are known to induce mental states in their users (e.g., false beliefs, irresponsible desires) that are non-ideal or fall short of the desired norms for such states. Such behavior is often taken to be an instance of deception on the part of AI systems and is accordingly labelled ‘AI deception’. One approach to making sense of AI deception is to consider it an instance of ‘loose talk’: a useful pretense which, at its best, can serve some communicative or regulatory purposes and, at its worst, is a theoretically uninteresting case of anthropomorphizing. A second approach, the one pursued by this article, is to take AI deception seriously and consider the possibility that AI systems can really deceive us.
By focusing attention on a subset of deceptive phenomena (i.e., cases of manipulation, lying, misleading, etc.), this article addresses an issue concerning the notion of ‘deception’ which has attracted some attention in recent discussions. One way to lay out the issue is to point out that taking AI deception seriously requires rejecting either (A) or (B) below:
A: Deception requires the deceiver to have mental/intentional states.
B: AI systems lack mental/intentional states like beliefs, wants, intentions, etc.
One set of philosophers has argued against (A) and motivated revisionary (or ameliorative) accounts of ‘deception’ that define it exclusively in terms of non-intentional, non-mental features of the deceiver.
This article has two aims: first, to provide arguments against rejecting (A) as an approach to accounting for AI deception; and second, to make the case that rejecting (B) is partly a matter of decision grounded in our practical or theoretical aims (one that will become easier with technological progress).
The first argument against rejecting (A) is that abstracting away from mental/intentional states (which the revisionary accounts aim to achieve) would make it difficult to distinguish, on the one hand, between deceptive and non-deceptive acts and, on the other hand, amongst deceptive acts. Deceptive acts (e.g., lying) are distinct from non-deceptive acts (e.g., errors, mistakes, etc.), and the distinction between them is based on the presence (or absence) of the relevant intentions or mental states. Consider, for example, a news journalist who ends up uttering ‘Hillary won the 2016 election’ because of, say, a slip of the tongue or a subconscious bias. In the absence of an intention to deceive on the part of the journalist, we judge it to be an instance of error rather than a lie. Further, the distinctness of deceptive act-kinds is grounded in the distinctness of the mental/intentional states they involve. For example, the distinctness of lies from bald-faced lies is grounded in the distinctness of the underlying deceptive intentions: unlike liars, bald-faced liars do not intend to have their audience believe what they assert (they may even know that their audience knows that they are lying).
Another argument against rejecting (A) concerns the asymmetry in the notion of ‘deception’ that revisionary accounts of deception result in. It proceeds by noting the pervasiveness of mental states in the traditional characterization of deception: apart from an intention to deceive, such accounts invoke standing mental states in both the deceiver and the deceived. For instance, for X to lie that p to Z it is required (inter alia) that: (a) X believes that not-p, (b) X has an intention to lie (i.e., to have Z believe that p), and, in the case of successful lying, (c) Z forms the belief that p. Similarly, for X to manipulate Z to do Q it is required (inter alia) that: (a) X wants Z to do Q, (b) X has a conscious or subconscious intention to manipulate (i.e., to have Z do Q), and (c) Z is an agent with their own interests, desires, and goals (and in successful cases of manipulation, X gets Z to do Q without regard to Z’s agency).
However, the paradigmatic case of AI deception has been an AI system deceiving a human, and revisionary accounts have focused exclusively on avoiding an appeal to the mental states of the deceiver (while allowing that the influence on the deceived may be spelled out in terms of mental states). This suggests that the notion of ‘deception’ operative in revisionary accounts allows for the possibility that entities capable of deceiving may not themselves be capable of being deceived. This stands in stark contrast with the traditional notion of deception, which approaches the phenomenon as one characterized by a ‘mind game’ between the deceiver and the deceived. A mere departure from the traditional notion of deception is, of course, not in itself an objection. However, adopting an asymmetric notion of ‘deception’ for AI systems (one different from the notion we use more generally for humans) would be an impediment to the integration of AI systems into the larger social world.
The second part of the paper further develops some existing strategies for rejecting (B). One strategy builds upon existing approaches that ground mental states not in lower-level features of a system (e.g., composition, complexity) but in a decision to consider the system as having mental states. The decision may itself be grounded in further practical considerations, or in a theorist’s purposes behind the use of terms like ‘mental states’. I argue that, to the extent that the integration of AI systems into the human social world is a theoretically important objective, adopting a revisionary account of ‘mental’ terms like ‘intention’, ‘belief’, etc. is a superior strategy for accounting for AI deception. This part of the article also argues that how fruitful or easy it is to make this decision depends (at least in part) on technological advancements.
Nathaniel Gan
Can AI systems imagine? A conceptual engineering perspective
Some AI systems perform their target tasks using simulated representations of real-world scenarios, and these simulations are sometimes called ‘imagination’ in the engineering literature (e.g. Wu, Misra, and Chirikjian, 2020; Zhu, Zhao, and Zhu, 2015). If AI systems have imagination, this may have significant implications for other AI capacities and for human-AI relations. This presentation will consider whether the term ‘imagination’ is appropriate in this context, taking a conceptual engineering perspective.
Conceptual engineering is an approach to philosophical issues focused on the description, evaluation, and improvement of our representational devices (Chalmers, 2020). From this perspective, one possible approach to the question of whether AI systems can imagine is to see it as a descriptive task, in which the goal is to identify the conditions that delineate our present concept of imagination, such that it becomes clear whether AI systems fall under the extension of our present concept. It will be argued that this approach has limited prospects, because AI simulations share some, but not all, of the key properties associated with paradigmatic instances of imagination.
Alternatively, the issue may be framed as a normative conceptual engineering project, in which the central question is whether our goals are better served by a concept of imagination that admits or excludes AI systems. It will be argued that normative considerations give us reasons to countenance AI imagination. An AI-friendly concept of imagination would serve to highlight the fact that simulation-based AI systems resemble humans in ways not shared by other more popular forms of AI, and would also explain some similarities between simulation-based AI and human intelligence.
Some suggestions for adopting an AI-friendly concept of imagination will be offered. Besides legitimising the descriptions of some purported cases of AI imagination, we might become open to applying the term ‘imagination’ more broadly to other simulation-based AI systems. Countenancing AI imagination might also involve broadening the way we view human-AI relations, considering not only the behavioural output of AI systems but also their inner workings. Moreover, we might become open to the possibility of AI having capacities not typically associated with artificial systems, such as perspective-taking or creative reasoning.
References
Chalmers, D. (2020). What is conceptual engineering and what should it be? Inquiry: An Interdisciplinary Journal of Philosophy. doi:10.1080/0020174X.2020.1817141
Wu, H., Misra, D., & Chirikjian, G. S. (2020). Is that a chair? Imagining affordances using simulations of an articulated human body. IEEE International Conference on Robotics and Automation.
Zhu, Y., Zhao, Y., & Zhu, S. (2015). Understanding Tools: Task-Oriented Object Modeling, Learning and Recognition. IEEE Conference on Computer Vision and Pattern Recognition, 2855–2864. doi:10.1109/CVPR.2015.7298903
Guido Löhr
Conceptual engineering as a method in the philosophy of AI
N/A
Bradley Allen
Conceptual Engineering Using Large Language Models
We describe a method, based on Jennifer Nado's definition of classification procedures as targets of conceptual engineering, that implements such procedures using a large language model. We then apply this method using data from the Wikidata knowledge graph to evaluate concept definitions from two paradigmatic conceptual engineering projects: the International Astronomical Union's redefinition of PLANET and Haslanger's ameliorative analysis of WOMAN. We discuss implications of this work for the theory and practice of conceptual engineering.
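As a concrete illustration of the general idea, the following is a minimal sketch of how a concept definition might be operationalized as an LLM-implemented classification procedure; the llm callable, the prompt wording, and the Pluto example stand in for whatever chat-completion API and Wikidata-derived entity summaries are actually used, and the sketch is not Allen’s implementation.

```python
# Minimal sketch: a concept definition used as a classification procedure,
# realized by a large language model. 'llm' is any prompt-to-text function
# (e.g., a wrapper around a chat-completion API); the definition and the
# entity summary are illustrative.
from typing import Callable

IAU_PLANET_DEFINITION = (
    "A planet is a celestial body that (a) is in orbit around the Sun, "
    "(b) has sufficient mass for its self-gravity to make it nearly round, "
    "and (c) has cleared the neighbourhood around its orbit."
)

def classify(llm: Callable[[str], str], definition: str, entity_summary: str) -> bool:
    # Ask the model to apply the definition to a description of the entity
    # (e.g., assembled from Wikidata statements) and answer yes or no.
    prompt = (
        f"Definition: {definition}\n"
        f"Entity: {entity_summary}\n"
        "Does the entity fall under the definition? Answer 'yes' or 'no'."
    )
    return llm(prompt).strip().lower().startswith("yes")

# Hypothetical usage with a Wikidata-derived summary for Pluto (Q339):
# classify(my_llm, IAU_PLANET_DEFINITION,
#          "Pluto: a body in the Kuiper belt orbiting the Sun; nearly round; "
#          "has not cleared the neighbourhood around its orbit.")
```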