The Malick Illusion: Perceptual segmentation in The Thin Red Line


By Luis Rocha Antunes.

“The image, in terms of sound, always has the basic nature of a question. Fundamental to the cinema experience, therefore, is a process – which we might call sound hermeneutic – whereby the sound asks where? and the image responds here!” (Altman 1980: 74)

“With the new place that noises occupy, speech is no longer central to films. Speech tends to be reinscribed in a global sensory continuum that envelops it, and that occupies both kinds of space, auditory and visual.” (Chion 1994: 155)

Terrence Malick’s The Thin Red Line (1998) heralded the director’s return 20 years after the release of Days of Heaven (1978). The Thin Red Line is Malick’s adaptation of James Jones’ 1962 novel about the Battle of Guadalcanal, in which American and Japanese troops fought for control of the island of Guadalcanal in the Pacific Ocean. The film has received as much passionate praise as criticism (see Michaels 2009). Several critics have castigated the film’s use of long, complex interior monologues that concatenate the voices of multiple characters.

Roger Ebert, for example, conjectures that the interior monologues create “an almost hallucinatory sense of displacement, as the actors struggle for realism, and the movie’s point of view hovers above them like a high school kid all filled with big questions” (Ebert 1999). According to Ebert,

“The soundtrack allows us to hear the thoughts of the characters, but there is no conviction that these characters would have these thoughts. They all seem to be musing in the same voice, the voice of a man who is older, more educated, more poetic and less worldly than any of these characters seem likely to be: the voice of the director.” (Ebert 1999)

Despite Ebert’s criticism, the most experimental and innovative aspect of the film seems to reside precisely in its “contemplation of existential questions and their interrogation of nature’s beauty and indifference to man’s fate” (Michaels 2009: 61) through the multiplicity of interior monologues that blend to create a collective stream of consciousness. Some criticism of the film suggests that the multiple internal monologues and “their abstract reflections and philosophical concerns seem at odds with the gritty circumstances of the soldiers at war” (Ibid.). However, “blurring individual identity in a polyphonic chorus” (Ibid.: 62) has resulted in a perceptual and aesthetic effect that provides insight into the process of segmentation in film.

This perceptual effect occurs at the level of speech. It creates ambiguity regarding the identity of the voices that are heard and what is seen on the screen in synchrony with those voices. There is an almost uninterrupted flow of speech that combines dialogue (sound whose source is the speech of the characters) with internal monologues. Malick often overlays the sound of the interior monologues (extradiscursive and offscreen sound) with images in which the characters’ lips move, creating the illusion that the internal monologues are diegetic and onscreen.

Although the internal monologues are, by nature, commentaries (Percheron and Butzel 1980; Chion 1994), Malick synchronizes them with the images, making them appear to be the dialogues shown onscreen. The way that Malick interconnects commentary and dialogue removes our capacity to determine whether the sources of the sounds we hear are the lips of the characters or the offscreen voiceover. The lack of clear attribution of agency and identity to these voices explains why Michel Chion, in his study of The Thin Red Line (Chion 2007), dissociates the monologues from the actual characters engaged in dialogue.

This problematizes Chion’s definition of diegetic and non-diegetic sound by “pulling” non-diegetic sound to the inside of the diegetic soundscape. It is paradoxical because this perceptual illusion in which the lip movements of the characters match the voiceover causes the viewer to visualize non-diegetic sound, which by definition is not supposed to be visualized. This is different from Chion’s description of acousmatic sound. In Malick’s perceptual illusion, viewers attribute the wrong source to the sound heard. Our perception deceives us for milliseconds, sometimes for a couple of seconds, into believing that we know the source of a sound when actually we do not.

Ventriloquism is a perceptual illusion based on the same principle of attributing a sound source (speech) to a visual target (the lip movements of the characters) (see Altman 1980). The difference is that the Malick illusion is noticeable, whereas ventriloquism can be imperceptible. Malick’s illusion challenges not only the assumption that speech in film has a merely communicative role but also Chion’s position that “speech is no longer central to films” (Chion 1994: 155). Moreover, the Malick illusion offers tremendous insight into the role of speech sounds in continuity and segmentation in film editing.

The Malick illusion creates ambiguity regarding the proper attribution of agency to the speech sounds heard by viewers. At many points in the film, a second or two elapses before it becomes clear whether a voice belongs to an internal monologue or to a character on screen. This disorientation is often enhanced by synching the voiceover with the character’s lip movements. The interior monologue may be in the voice of the character onscreen or the voice of another character. This frustrates the viewers’ ability to segment characters and scenes. As we become aware of the illusion, we stop feeling confident in our capacity to distinguish (segment) between characters (agents) and scenes (events).

Unsurprisingly, film has a clearly illusory perceptual nature. Chion writes,

“Is the notion of cinema as the art of the image just an illusion? Of course: how, ultimately, can it be anything else? [a] phenomenon of audiovisual illusion, an illusion located first and foremost in the heart of the most important relationships between sound and image.” (Chion 1994: 5).

What is distinctive about the Malick illusion, however, is what happens when spectators become aware of it: rather than naturalizing its perceptual and aesthetic effect, it makes us feel that we lack perceptual control over the film.

Malick manipulates Chion’s concept of added value, which Chion defines as,

“the expressive and informative value with which a sound enriches a given image so as to create the definite impression, in the immediate or remembered experience one has of it, that this information or expression ‘naturally’ comes from what is seen, and is already contained in the image itself” (Chion 1994: 5).

However, this concept differs from the Malick illusion, in which speech sounds denaturalize the image.

Perceptual segmentation

I define perceptual segmentation in film as spectators’ capacity to group features that allow them to individuate objects, agents and events. Segmentation is vital to the experience of film: “By distinguishing sensory inputs that arise from distinct events, or from distinct components of complex actions, successful segmentation promotes percepts that are meaningful, temporally stable, and behaviorally-useful” (Zhou et al. 2007: 641). At a perceptual level, segmentation may be highly dependent on processes of multisensory integration; this is the case for film, in which sound and image are combined to form percepts.

Experimental brain research on the processes by which sight and hearing modulate segmentation shows that visual and auditory information are not mechanically combined to form percepts but are “weighted” according to principles of optimization and multisensory integration in a matter of milliseconds. The time interval during which stimuli are integrated into a single object or event is called the time window of sensory integration.

In film, ventriloquism illustrates this concept. Depending on how synchronous sound and image are, they are integrated within this time window and attributed to the same agent even when they belong to different agents. Ventriloquism is a perceptual illusion in which the movements of a character’s lips and voice are perceived as belonging to the same source; in reality, the sound comes from the speakers and the visual cue from the screen. In this case, spectators segment the two cues into a single, unified agent.

As with many other aspects of film perception, segmentation is performed unconsciously because it would be “costly” to devote conscious attention to the constant work of individuating agents and events. However, a viewer may become aware of it when segmenting agents and events becomes a difficult task, for example, when the sound and image of a film are not synchronous. Viewers may assess such instances as flawed editing or technical issues. However, The Thin Red Line presents a case in which the viewer becomes aware of the ventriloquism illusion without ascribing it to an editing problem or a technical issue. Instead, it is seen as a particular aesthetic effect used by the director to engage our senses in the perceptual experience of the film.

The Malick Illusion

The Thin Red Line explores segmentation at a low perceptual level by minimally manipulating audiovisual cues in time intervals of milliseconds. By blending the internal monologues of multiple voiceovers and superimposing them over characters talking on the screen, Malick plays with the spectator’s capacity to segment, or to identify the exact source of sounds and images. In other words, the film manipulates cues so that they appear to belong to the same character (agent), when in fact they do not. In doing so, it mixes the multiple internal monologues with dialogues.

This aesthetic and perceptual device has undergone a constant evolution in Malick’s œuvre. It was first used with a single character, Holly, in his feature film debut Badlands (1973) and it gradually evolved to create a stream of consciousness that simultaneously belongs to a single identity and to a choir of other voices in The Thin Red Line. Although all of Malick’s films use voiceover to frame the viewer’s experience through subjectivity, beginning with The Thin Red Line, this strategy is taken to the extreme of a speech illusion in which the viewer is no longer always capable of distinguishing between internal monologues and dialogues.

Fig. 1

In the set-up for Fig. 1, the film gradually moves through the exposition phase as the viewer is introduced to one of the main characters, Private Witt, and the setting of Guadalcanal. Before the character is introduced, there are steady camera shots of a forest in which everything seems to float, accompanied by a voiceover. It is apparent that the voiceover is studio-recorded and has been added to the score and the sounds of the character’s environment. Then, the score slowly fades out, leading to an establishing shot of Private Witt (Fig. 1) and then to a reverse shot showing what seems to be the object of his gaze. In the reverse shot, a voice is heard that seems to have been recorded on location (not in a studio), speaking over the images from Private Witt’s point of view.

Fig. 2

The differences in sound quality between the first voiceover and the second, in addition to the Kuleshov Effect created as the viewer watches Witt and sees from his perspective, make the viewer attribute the voice and the image of Witt to a single agent and event. However, the viewer is then shown Private Witt (Fig. 2) and realizes that either the voice is not his or it belongs to him at a later point in the film. When Private Witt appears in the medium shot (Fig. 2), it takes a second or so to conclude that the voice is a voiceover and not this character’s voice. For a fraction of a second, we are tricked into integrating the voice and image into one event and agent. Beginning with this scene, the viewer is constantly shown voices and faces that do not directly and immediately match, and the director uses the milliseconds in the time window of sensory integration to create ambiguity regarding the sound sources and their alignment with the images.

Although we become aware of these perceptual ambiguities, we continue to expect that the voice is associated with the onscreen faces and moving lips. This is because the phenomenon happens at a low level of perception over which humans have limited control. In Fig. 2, the voiceover goes silent immediately before the image is shown. When the voice returns, the viewer may believe that this voice is now the onscreen character’s voice, but in a few seconds it becomes clear that the viewer is not hearing a direct recording of the onscreen character’s speech. This effect occurs only because it takes place within the minimal interval of the time window, during which the viewer’s capacity to segment is tested.

Fig. 3

After these scenes, there is a long sequence of shots that combines direct sound from what is occurring onscreen and from the shooting locations of other shots in the sequence (Fig. 3). During this sequence of shots, direct sound from a character speaking onscreen is incorporated. At other times, the lips of a character are also moving, and a second or so is needed to realize that the sound playing is not direct sound from the shot – there is no corresponding sound from the character’s speech. This game of congruence and incongruence between the image and the sound and our capacity to segment agents and events from the integration of sight and hearing occurs within milliseconds and creates an effect that makes sound and image resemble a subjective stream of consciousness. Malick also creates an effect in which the direct sound of a character speaking becomes a voiceover when it overlaps with the following shot. Thus, even if it begins as direct speech from the character, it becomes a voiceover when it stops being directly associated with the image.

Fig. 4

The same phenomenon that occurs with voiceovers happens with other types of sound. The sounds of the environment blend and overlap with the next shot, making it impossible for a second or so to segment the two events. Adding to the impossibility of integrating individual agents and events, there are also elements of visual continuity that make segmentation more difficult still. Private Witt is gazing toward the left side of the frame (Fig. 4), which creates an illusory idea of continuity suggesting that he is looking at the child whose voice is heard. When the image track cuts to Witt, the child’s voice is still heard, and it seems as if Witt is listening to it. This brief illusion fades away when it becomes clear that Witt and the voice are in physically separate locations, allowing the viewer to segment the two events as two separate events.

Fig. 5

In many moments of the film, characters look offscreen, creating a sense of continuity with scenes in which they are not physically present. This is subtly orchestrated and enhanced by allowing the sound of one shot to overlap with the shot that follows, making use of the spectator’s expectation of hearing a reply to what has just been said because of the illusion that the character is listening to what was said, when in fact what is heard and seen are two different scenes and two different events. The blending of agents and events forms a unity by removing segmentation for several milliseconds.

Fig. 6

Dialogue is particularly prone to such experimentation and illusion because it creates expectations of replies. Therefore, we are likely to believe that we are hearing a reply to what another character has just said because we did not segment the events into two separate events. In one scene (Figs. 5-6), Brigadier General Quintard says, “The Marines have done their job, now it’s our turn,” and looks toward the right side of the frame, creating directional continuity with Lt. Col. Gordon Tall, who answers in a voiceover. For a second or so, it is impossible to determine whether the voice is coming from Lt. Col. Gordon Tall’s mouth. He could be mumbling, but he is not. Because the exchange is a dialogue, it is natural to expect a reply from Quintard, and the shot/reverse-shot construction of the scene enhances such expectations.

Fig. 7

The viewer expects the sound of the voice to come from the lips of the man on the screen. This is how information is usually integrated in the time window. In this case, however, the lack of lip movements indicates that the two stimuli do not match. Of course, at a cognitive level we infer that the voice is expressing the character’s thought, but until this is clear, the speech seems to be coming from the character’s mouth. In the shots that follow, the two characters walk together on the boat, and their dialogue is mixed with voiceover, requiring the viewer to continually assess whether their lips are moving and whether or not the sounds heard are direct sounds. Although viewers might stop assessing, or might not be fully aware of the effect, it seems unlikely that this illusion is merely accidental. Rather, it seems to be an effect that Malick is searching for: a blending of dialogue and thoughts, a merging of what is inside the character with what is outside. Those milliseconds of doubt in which our capacity to segment agents and events is diminished facilitate the Malick illusion.

Fig. 8

This effect also occurs when a character moves his lips in synchrony with the voiceover. In this case, it seems natural to attribute the lip movements to the voice because they seem to blend. In the shot from Fig. 7, it might be quicker to identify the voice as not belonging to the man depicted onscreen – to segment the voice and the man in the image as two separate agents. In this case, it is easier perhaps because the viewer has heard the voice before and knows that it belongs to a different character. Over the course of the film, the viewer becomes acquainted with the voices and associates them with certain characters. However, in Fig. 8, the voice belongs to the character shown, and he is moving his lips even though it is a voiceover. In this case, high-order inferences do not help viewers segment but instead extend the perceptual effect of the integration of the stimuli within the time window. With each movement of the character’s lips, the viewer must assess whether the stimuli belong to the same agent and event.

Fig. 9

In yet another scene (Figs. 9-10), Malick alternates a character’s speech and the character’s own voiceover with direct sound as the character comments on his killing of a man. During the voiceover, his lips move and he exhibits a series of facial expressions.

Fig. 10

For milliseconds, it is impossible to distinguish the direct sound of his speech from his voiceover. This may seem an insignificant detail, but it determines how the viewer experiences subsequent events in the film. In a way, the film teaches viewers not to rely on their immediate perceptions by manipulating their expectations and making them doubt their perceptual capacities. In this same scene, one of the characters is shown from behind while a voice is heard, but it is necessary to see his face and watch his lips moving before we can be certain of whether we are listening to a voiceover or to the character’s direct speech.

Perceptual segmentation in film

One of my purposes in analyzing perceptual segmentation in film is to show that even minute details of a film can shape our cinematic experience. Segmentation may seem to occur with a single type of stimulus, such as visual or auditory. However, experimental cognitive research indicates that segmentation depends on multisensory processes. The idea that, when it comes to multi-sensory integration of spatio-temporal segmentation cues, “one plus one does not always equal two” (Zhou et al. 2007), makes sense considering that the real world consists of a diversity of stimuli and that human perception is better adapted to perceive that diversity than to perceive a single, isolated stimulus. Because one plus one does not always equal two, it seems important to understand segmentation as further evidence that while film is an audiovisual medium, the spectator’s perception of that medium is multisensory.

I have written about this idea elsewhere (Antunes 2012), but it is worth noting that the idea of film as a multisensory experience has already garnered attention in film studies, especially from Laura Marks (2000, 2002), Vivian Sobchack (2004), Jennifer Barker (2009), Thomas Elsaesser and Malte Hagener (2009) and Charles Forceville and Eduardo Urios-Aparisi (2009). Marks, Sobchack and Barker have discussed the phenomenal and haptic qualities of the film experience; Elsaesser and Hagener have edited a volume exploring the influence of the senses in theorizations of the moving image; and Forceville and Urios-Aparisi have advanced the multimodal metaphor. The main argument of these studies is that film can depict and is capable of eliciting multiple senses. This emergent interest indicates a new direction in the field’s understanding of how film perception can influence film aesthetics. However, with few exceptions, there has been little discussion of segmentation.

Segmentation has been studied in film and literature by Jeffrey Zacks and Joseph Magliano, who have shown the importance of individuating events and characters. Their concept of segmentation is highly relevant because it counterbalances another dimension of film that has been given much more attention: continuity. A viewer’s perception of continuity in edited material is fundamental to the specific way in which film is apprehended, and we do need to understand how spectators perceive continuity across jumps in space and time. Segmentation, however, is just as important.

Segmentation not only has perceptual importance but is also at the core of meaning generation. Without the ability to segment, meaning could not be inferred because the capacity to separate causes and effects would be lost. I find it useful to use the concept of the time window of sensory integration, or simply “time window” (see Calvert and Thesen 2004), to better understand segmentation in film and build on the work of Zacks and Magliano.

The time window implies that forming percepts from multisensory sources involves more than the mechanical integration of stimuli. Instead, it is based on the so-called variable weight summation, which depends on optimization, selective attention and other principles underlying multisensory processes. Crossmodal cases, such as the ventriloquism effect (e.g., Bertelson 1999), the McGurk effect (McGurk and MacDonald 1976) and “hearing flashes” (Shams et al. 2000), are examples of how multisensory cues influence perception within certain time windows.
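To make the contrast between mechanical and weighted combination concrete, the following is a minimal sketch of one common textbook formalization, reliability-weighted (inverse-variance) averaging. It is offered only as an illustration and is not a model taken from the studies cited above; the function name `combine_cues` and every number in the example are hypothetical.

```python
def combine_cues(visual_estimate, visual_variance,
                 auditory_estimate, auditory_variance):
    """Combine two location estimates by inverse-variance weighting.

    Assumption (illustrative only): each cue is weighted by its
    reliability, defined here as 1 / variance.
    """
    visual_reliability = 1.0 / visual_variance
    auditory_reliability = 1.0 / auditory_variance
    total = visual_reliability + auditory_reliability
    visual_weight = visual_reliability / total
    auditory_weight = auditory_reliability / total
    combined = (visual_weight * visual_estimate
                + auditory_weight * auditory_estimate)
    return combined, visual_weight


# Hypothetical values: lips seen at 0 degrees, loudspeaker at 20 degrees.
# Vision is assumed to localize precisely (variance 1),
# hearing more coarsely (variance 16).
location, weight = combine_cues(0.0, 1.0, 20.0, 16.0)
```

With these made-up values the visual cue receives a weight of about 0.94, so the combined location lies a little over one degree from the lips. This is one way to picture why a voice seems to come from a moving mouth rather than from the speakers.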

The time window refers to a real interval of time during which stimuli from different modalities are integrated. It can vary from optimal values of 50 to 250 milliseconds, after which stimuli are no longer considered as belonging to the same agent or event. Talsma and colleagues, for example, state that “there is a relatively broad integration time window of as large as 250 ms, in which stimuli from different modalities typically tend to be integrated into a single multisensory percept” (Talsma et al. 2009: 314). Therefore,

“in a badly mastered audio track of a movie, there can be a noticeable desynchronization between visual and auditory information streams. Such a desynchronization can also be observed in real life, such as in the case of a distant thunderstorm, a music concert in a large arena, or a fast jet aircraft that appears to fly ahead of its sound.” (Talsma et al. 2009: 313)

The optimal point of integration, also called the point of subjective simultaneity, has been reported to lie at approximately 50 to 100 milliseconds, with the visual stimulus leading:

“[B]ehavioural studies have shown that there appears to be an optimum relative stimulus timing wherein visual stimuli precede auditory stimuli by about 50-100 ms, such that these stimuli are subjectively most likely as being perceived as simultaneously.” (Talsma et al. 2009: 314)
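The figures quoted above can be turned into a small worked example. The sketch below converts film frames to milliseconds and compares an audio lag with the 250 ms bound and the 50-100 ms optimum quoted from Talsma et al. The rate of 24 frames per second, the function names, the three labels, and the symmetric treatment of sound that leads the image are assumptions made for illustration only; the quoted passages do not specify them.

```python
WINDOW_MS = 250.0           # upper bound of the integration window (quoted above)
OPTIMAL_LAG_MS = (50.0, 100.0)  # visual lead reported as subjectively simultaneous


def frames_to_ms(frames, fps=24.0):
    """Duration of a number of film frames in milliseconds (fps is an assumption)."""
    return frames * 1000.0 / fps


def classify_offset(audio_lag_ms):
    """Label an audio-visual offset; positive values mean sound follows image.

    Assumption: audio that leads the image (negative lag) is judged
    against the same 250 ms bound, which the source does not state.
    """
    if abs(audio_lag_ms) > WINDOW_MS:
        return "segmented"      # likely perceived as separate events
    low, high = OPTIMAL_LAG_MS
    if low <= audio_lag_ms <= high:
        return "optimal"        # near the point of subjective simultaneity
    return "integrated"         # inside the window, likely fused


two_frames = classify_offset(frames_to_ms(2))
seven_frames = classify_offset(frames_to_ms(7))
```

At 24 frames per second, six frames correspond to 250 ms. Under these assumptions a lag of two frames (about 83 ms) falls in the optimal range, while a lag of seven frames (about 292 ms) falls outside the window.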

Decisions made within this time window are called temporal order judgments. These are central both to defining what we understand as simultaneity and to our capacity to segment agents and events and to separate and individuate them. This capacity to distinguish events and agents (whether characters or objects) makes film viewing, which relies on the millisecond precision of the time window of multisensory integration, possible.

What can this time interval of integration reveal about our perception of a film? I believe it can explain two of the main mechanisms that guide our attention to a film, namely, continuity (Smith 2005, 2011; Cutting and Candan 2012; Bordwell 2002, 2006; Branigan 1992), and segmentation (Zacks 2009, 2010; Magliano, Miller and Zwaan 2001), though this article focuses on segmentation alone. Continuity can refer to two levels of film. First, it refers to the cause and effect relationships in the narrative, which Bordwell (2011) describes as “what enables us to understand films,” and is mostly based on high-level inferences about the story or the meaning we create from what we see and hear. In other words, the first level involves the meaning that spectators create from the continuity between two or more agents and events.

The other level of continuity operates at a perceptual level, as expressed by Tim Smith’s attentional theory of cinematic continuity (Smith 2005, 2011), which explains the illusion of continuity that allows viewers to perceive edited film (with spatial and temporal jumps). Although continuity at both levels has received considerable attention in studies of film perception, the role of segmentation has been relatively neglected.

Segmentation has been studied at the level of the narrative by Zacks and Magliano, but these authors do not analyse in depth the perceptual mechanisms that explain how and why spectators are able to individuate agents and events and segment them. Although as viewers we need to perceive continuity in film (even illusory continuity), we also need to individuate agents and events; otherwise, a film would simply be an unorganized collection of indiscernible stimuli. The time window shows that a time gap of expectancy exists during our perception of film. This concept therefore helps to explain continuity and segmentation.

The examples I have offered from The Thin Red Line mainly address the integration of speech through sound and image. In these examples, sound, image and language are the relevant modalities at play. The multisensory nature of film is not, however, confined to these three modalities; humans can perceive modalities beyond the classic five senses through an audiovisual medium. There is not necessarily a direct correspondence between the sources of the stimuli and a single perceptual modality. The eye, for example, contains photo-receptors but also noci-, thermo-, and mechano-receptors. In addition, certain neural populations (in fact, the majority) can process more than one quality of stimuli.

Fig. 11

An example in which segmentation occurs within a non-verbal modality can be found in Quentin Tarantino’s Pulp Fiction (1994), in which segmentation occurs at the level of the character’s body, or through a haptic modality. A notable continuity error occurs in one scene that indicates a different level of segmentation. The following example from Pulp Fiction demonstrates that segmentation also depends on attention and causality. The role of attention in segmentation is in line with Zacks and Magliano (2011).

Fig. 12

The scene is constructed as follows: in the first shot (Fig. 11), Jules (Samuel L. Jackson) and Vincent (John Travolta) are inside a flat in which they have just shot a man who owed money to their boss. Behind a door, a fourth character, his presence unnoticed, hides with a loaded gun. This fact is shown to the spectator so as to create suspense: Jules and Vincent are unaware of the fourth character and are in danger. Although they have just killed someone, it becomes obvious that now they are under threat. The fourth man emerges from behind the door and empties the gun at them from two feet away (Figs. 13-14).

Fig. 13

This event is shown in a medium frontal shot of the man. The viewer briefly believes that he has shot Jules and Vincent dead and segments their deaths as an event. However, the shot then shows the facial expression of the man who emptied the gun change, indicating that either something did not go as planned or the man is terrified of what he has just done. A reverse-shot of Jules and Vincent is then shown (Fig. 12), revealing that the man missed and that the bullets went instead into the wall behind them. This new information changes the viewer’s initial segmentation.

Fig. 14

The change in the segmentation of the event is motivated by new information about cause and effect. The causal relationships established in this scene allow the creation of suspense and the ‘turns’ in the events. The scene first shows Vincent and Jules in perfect control of the situation, then suggests that they are not as in control as it initially seemed. Then, in a matter of seconds, the viewer shifts from thinking that they are dead to realizing that they are in control again.

This is a dynamic scene that manipulates the segmentation of events at a cognitive level of cause and effect. To show that attention is a fundamental element in the segmentation of this scene’s events, it is necessary to discuss a continuity error in the first shot. The bullet holes are already in the wall before the man shoots his gun. However, I would not be surprised if those holes went unnoticed by most first-time viewers of this film. There is no causal relevance to the holes, and viewers are therefore perceptually blind to them, especially because their attention is so occupied by the suspense and intensity of the scene.

Even in the unlikely case that spectators do notice the holes, they may attribute them to the age of the building; there is not yet a causal connection between the holes in the wall and the bullets because the gun has not yet been fired. Therefore, the viewer attributes the holes in the wall to a different agent and includes them in another event segment. However, after viewing the entire scene, the events can be segmented differently, and the holes become evidence of the event just experienced because they can now be attributed to a known agent and event.

This example shows that segmentation can depend both on high-order factors such as attention and on narrative aspects of causality that manifest at a low perceptual level but are not pure perceptual illusions, such as those in The Thin Red Line. The two levels of continuity previously noted indicate that segmentation assists with narrative comprehension (cognitive level) and perception (sensory level). Although it seems that the Pulp Fiction scene operates in modalities that have a bodily nature, such as pain, touch and even startle responses caused by seeing a man shooting straight at the camera, these modalities are by no means exclusive. The scene demonstrates how segmentation can occur at a non-verbal level.


My goal in exploring segmentation in film is to open avenues of investigation for film scholars studying film perception and to analyse a paradoxical case in which non-diegetic sound is captured by the image in what I call the Malick illusion. Segmentation complements the concept of continuity and provides more tools for analyzing and understanding how film perception influences spectators’ aesthetic experience of film and how it determines their narrative understanding of a film. It can also be used by filmmakers as a tool to create suspense and plot “turns.”

I have examined segmentation using the concept of the time window of multisensory integration, which is widely used in multisensory studies. This concept contributes to the field’s understanding of continuity and segmentation, two activities that are at the core of film perception. The fact that stimuli are integrated within an interval of time allows researchers to examine the minuscule details of a film to determine how the director or editor (consciously or not) has created perceptual ambiguity regarding the segmentation of agents (characters and objects) and events. In The Thin Red Line, this ambiguity is not a mere perceptual effect, although it is that, too; rather, it is primarily a perceptual device used to create a blending of voices that resembles what can be abstractly considered a stream of thought or consciousness.

This device also has significant consequences for character development and character engagement. It generates ambiguity regarding characters by removing the viewers’ belief in their own capacity to segment and individuate them, creating a hybrid located between multi- and single-character narratives.

Time window and segmentation are aspects of film that can be applied to piecemeal analysis of film and can reveal much about the multisensory nature of film. In the case of The Thin Red Line, these aspects identify speech as an additional modality to sight and hearing. However, in the case of Pulp Fiction, these same aspects cut across other modalities, such as pain or touch. By describing this theoretical framework and conducting this film analysis, I hope to have encouraged more thorough studies of segmentation in film and the multisensory nature of the film-viewing experience.

Luis Rocha Antunes was a Harvard University Fellow and a University of Copenhagen Fellow. He is a Ph.D. candidate in film studies at the University of Kent and the Norwegian University of Science and Technology. He has published work on the topics of film perception, experiential film aesthetics, Norwegian cinema, the multisensory film experience and Arctic film, appearing in Essays in Philosophy and the Journal of Scandinavian Cinema as well as in anthologies published by Queens University Press and Routledge.

Further reading: “The Site of Nature: Exteriority and Overexposure in The Thin Red Line,” by Trevor Mowchun.


Altman, Rick (1980), “Cinema as Ventriloquism,” Yale French Studies 60, “Cinema/Sound,” pp. 67-79.

Antunes, Luis Rocha (2012), “The Vestibular in Film: Orientation and Balance in Gus Van Sant’s Cinema of Walking,” Essays in Philosophy vol. 13, no. 2, pp. 522-549.

Barker, Jennifer M. (2009), The Tactile Eye: Touch and the Cinematic Experience, Berkeley: University of California Press.

Bertelson, Paul (1999), “Ventriloquism: A Case of Cross-modal Perceptual Grouping,” in G. Aschersleben, T. Bachmann and J. Müsseler (eds.), Cognitive Contributions to the Perception of Spatial and Temporal Events, Amsterdam: Elsevier.

Bordwell, David (2002), “Intensified Continuity: Visual Style in Contemporary American Film,” Film Quarterly vol. 55, no. 3, pp. 16-28.

___ (2006), The Way Hollywood Tells It: Story and Style in Modern Movies, Berkeley: University of California Press.

___ (2007), Poetics of Cinema, New York: Routledge.

Bordwell, David (2011), “Common Sense + Film Theory = Common-Sense Film Theory?”, Observations on Film Art, May. Accessed 29 October 2014.

Branigan, Edward (1992), Narrative Comprehension and Film, New York: Routledge.

Calvert, Gemma A. and Thomas Thesen (2004), “Multisensory integration: methodological approaches and emerging principles in the human brain,” Journal of Physiology 98, pp. 191–205.

Calvert, Gemma, Charles Spence and Barry E. Stein (eds.) (2004), The Handbook of Multisensory Processes, Cambridge, MA: MIT Press.

Chion, Michel (1994), Audio-vision: Sound on Screen, New York: Columbia University Press.

___ (2007), The Thin Red Line, London: BFI Classics.

Cutting, James (2005), “Perceiving scenes in film and in the world,” in Joseph Anderson and Barbara Anderson (eds.), Moving Image Theory: Ecological Considerations, Carbondale: Southern Illinois University Press.

Cutting, James and Ayse Candan (2012), “Movies, Evolution, and Mind: From Fragmentation to Continuity”, Cornell University, College of Arts and Sciences. Accessed 29 October 2014.

Elsaesser, Thomas and Malte Hagener (2009), Film Theory: An Introduction Through the Senses, New York: Routledge.

Forceville, C.J. and E. Urios-Aparisi (eds.) (2009), Multimodal Metaphor, Berlin: Mouton de Gruyter.

Magliano, Joseph and Jeffrey Zacks (2011), “The Impact of Continuity Editing in Narrative Film on Event Segmentation,” Cognitive Science, vol. 35, no. 8, pp. 1489-517.

Marks, Laura (2000), The Skin of the Film: Intercultural Cinema, Embodiment, and the Senses, Durham: Duke University Press.

___ (2002), Touch: Sensuous Theory and Multisensory Media, Minneapolis: University of Minnesota Press.

McGurk, Harry and John MacDonald (1976), “Hearing Lips and Seeing Voices,” Nature 264, pp. 746–8.

Michaels, Lloyd (2009), Terrence Malick (Contemporary Film Directors), Champaign: University of Illinois Press.

Murch, Walter (2001), In the Blink of an Eye, Los Angeles: Silman-James Press.

Percheron, Daniel and Marcia Butzel (1980), “Sound in Cinema and its Relationship to Image and Diegesis,” Yale French Studies 60, “Cinema/Sound,” pp. 16-23.

Shams, Ladan, Yukiyasu Kamitani and Shinsuke Shimojo (2000), “What you see is what you hear,” Nature 408, p. 788.

Spence, Charles, Jordi Navarra, Argiro Vatakis, Massimiliano Zampini, Salvador Soto-Faraco and William Humphreys (2005), “Exposure to asynchronous audiovisual speech extends the temporal window for audiovisual integration,” Cognitive Brain Research 25, pp. 499-507.

Smith, Tim (2005), An Attentional Theory of Continuity Editing, doctoral thesis, Edinburgh: University of Edinburgh.

Smith, Tim (2011), “The Attentional Theory of Cinematic Continuity”, Birkbeck University of London. Accessed 29 October 2014.

Sobchack, Vivian (2004), Carnal Thoughts: Embodiment and Moving Image Culture, Berkeley: University of California Press.

Stein, Barry and Alex Meredith (1993), The Merging of the Senses, Cambridge, MA: MIT Press.

Talsma, Durk, Daniel Senkowski and Marty Woldorff (2009), “Intermodal attention affects the processing of the temporal alignment of audiovisual stimuli,” Experimental Brain Research 198, pp. 313-328.

Zacks, J. and C. Kurby (2008), “Segmentation in the perception and memory of events,” Trends in Cognitive Sciences, vol. 12, no. 2, pp. 72–79.

Zacks, J., N. Speer and J. Reynolds (2009), “Segmentation in Reading and Film Comprehension,” Journal of Experimental Psychology: General, vol. 138, no. 2, pp. 307–327.

Zacks, Jeffrey (2010), “How We Organize Our Experience into Events”, The American Psychological Association. Accessed 29 October 2014.

Zhou, Feng, Victoria Wong and Robert Sekuler (2007), “Multi-sensory integration of spatio-temporal segmentation cues: One plus one does not always equal two,” Experimental Brain Research, July, vol. 180, no. 4, pp. 641-654.

Zwaan, R., J. Magliano and J. Miller (2001), “Indexing Space and Time in Film Understanding,” Applied Cognitive Psychology 15, pp. 533-545.
