The Outcomes of Fifth-Grade Emergent Bi/Multilinguals’ Introduction to a Visual Metalanguage When Constructing Scientific Explanations in Hong Kong

The visual mode provides emergent bi/multilinguals with an essential resource for constructing scientific explanations. Yet, while a metalanguage (e.g., claim, evidence, reason) is used to describe the written mode of scientific language, little research has sought to make students aware of the metalanguage of the visual mode. We propose that an introduction to the visual metalanguage will ensure that emergent bi/multilinguals have better access to the visual mode. This study employs an instrumental case study to examine the introduction of visual metalanguage to a fifth-grade science class. Two cameras record ten emergent bi/multilinguals as they construct scientific explanations in nine lessons. We use a framework informed by social semiotics to analyse the meanings made. The data revealed that an awareness of the visual metalanguage led to an enhanced commitment to illustrate the explanation of the phenomenon, illuminated key concepts and provided more context to the audience. In addition, teacher questioning became more focused.
Williams and Tang, ASIA-PACIFIC SCIENCE EDUCATION 7 (2021) 309–342


Introduction
Creating and interpreting visual representations is considered an essential requirement when constructing explanations in science (National Research Council, 2012, 2013). Using multiple representations, including visual representations, when constructing explanations has been found to develop students' understanding in science (Tippett, 2016; Tytler & Prain, 2010). This includes students lacking the advanced knowledge of the language of instruction necessary to construct verbal explanations or to participate in verbal arguments (Williams, Tang & Won, 2019; Lee, 2005; Ryoo & Bedell, 2017). These students, described here as emergent bi/multilinguals, require support due to the complex language of science. A sophisticated level of knowledge of the language of instruction is necessary to comprehend the content-specific vocabulary and additional grammatical constructions such as nominalization (Fang, 2005; Halliday, 2004; Lemke, 1990). In contrast, meanings made through visual representation require a lower level of knowledge of the language of instruction (Kress & van Leeuwen, 2006). In this study, we argue that visual representations provide emergent bi/multilinguals with an additional resource to construct explanations in science. Contemporary language theories confirm the capacity for emergent bi/multilinguals to communicate meanings visually when constructing explanations. As suggested by social semiotics, language comprises multiple meaning-making systems known as modes, which also include visual systems (Halliday, 1978; Kress & van Leeuwen, 2006). A mode is defined in this article as a meaning-making system that evolves with the communicative needs of a discourse community. Recent studies have shown that the use of the visual mode in the construction of explanations plays an important role in supporting emergent bi/multilinguals' learning and meaning-making in science (Williams et al., 2019; Ryoo & Bedell, 2017).
This finding is consistent with research in multimodality, which has examined how students make scientific meanings by integrating the visual mode with other modes, notably speech and gesture (e.g., Tang et al., 2014; Kress et al., 2014; Yeo & Gilbert, 2017).
With multiple modes in action during meaning-making, different areas of research have come into focus over time, for example, the use of multiple representations in science, which may include the visual mode (Ainsworth, 2008; Gilbert & Treagust, 2009; Kozma, 2003). Other areas of research have investigated how a learner's knowledge develops using more than one representation, for example, representational competence, which includes the knowledge of the form and function of representations in different modes (diSessa, 2004; Kozma & Russell, 2005). This knowledge is also considered in the construction of representations, such as drawing scientific explanations (Prain & Tytler, 2012). As Waldrip et al. (2010) specified, "learning about new concepts cannot be separated from learning both how to represent these concepts as well as what these representations signify in the world" (p. 68). Other research areas that include the visual mode are Gilbert's (2007, 2008) investigations of external representations and Lemke's (1998) explorations of visual communication in scientific language.
Despite our increasing understanding of the visual mode, few educators have explicitly taught emergent bi/multilinguals how to utilize the visual mode as an additional resource to construct scientific explanations. Yet studies on visual representations in science have shown promising results for students constructing scientific explanations. Sandoval et al. (2000) noted that students were able to explain complex ideas more intelligently if they had the support of visual representations than if they did not. Furthermore, the visual representations created by students, even inaccurate ones, helped illuminate students' understanding. In other research, the process students used to depict visual representations was found to help deepen their scientific understanding (Ainsworth & VanLabeke, 2004; Gilbert & Treagust, 2009; Tippett, 2016). From these encouraging results in the use of visual representations in science, it seems reasonable to suggest that the visual mode has potential for emergent bi/multilinguals. Thus, a visual metalanguage presents a currently unexplored potential resource for emergent bi/multilinguals.
To investigate the visual mode as a potential resource for emergent bi/multilinguals, this study aims to raise students' awareness of the form, function, and constraints of the visual mode, a need that has been reported in previous research (e.g., Ainsworth, 2006; Prain & Tytler, 2012). An awareness of representations in the visual mode requires an awareness of a visual metalanguage, which is defined in this study as a "language for talking about language, images, texts and meaning-making interactions" (New London Group, 2000, p. 24). A metalanguage makes explicit the terminology used to describe any symbolic system (Ellis, 2004). The potential of teaching and using a metalanguage can be seen in second-language acquisition and literacy education (e.g., Basturkmen et al., 2002; Hu, 2010) and in science education focusing on literacy (e.g., Tang & Rappa, 2010). While few studies to date have investigated the impact of metalanguage on emergent bi/multilinguals in science, beneficial outcomes have been found, such as making emergent bi/multilinguals' needs explicit, ensuring meaningful interaction, and enabling a sense of accomplishment (Borg, 1998). Consequently, exposing emergent bi/multilinguals to a metalanguage of visual elements appears warranted. Therefore, we ask the question: "How does an explicit understanding of the visual mode through a metalanguage help emergent bi/multilinguals in science?" This research endeavor may be of significance as emergent bi/multilinguals are one of the fastest-growing school populations today (Tereshchenko & Archer, 2014). In the United States it has been predicted that by 2030, 40% of school students will be learning English as an additional language (Thomas & Collier, 1997). In Hong Kong, most local schools use English instruction in science due to social and economic influences (Perez-Milans, 2017). These influences also attract local citizens to seek independent schools with English instruction.
This study takes place in an independent bilingual school in Hong Kong and is founded on the notion that the visual mode presents communication opportunities to emergent bi/multilinguals in science. As a result, we surmise that supplementary knowledge of a visual mode will broaden the students' range and capacity for representing scientific meanings when constructing visual representations. To achieve our aim, this study adopts a visual metalanguage created by Tang, Won and Treagust (2019) that embraces the visual mode conceptualized by Kress and van Leeuwen (2006). The metalanguage was introduced to the teacher and then to her fifth-grade science class via a student checklist. Over a 9-month period, we investigated the teacher's and 10 emergent bi/multilingual students' use of the visual metalanguage as they constructed visual representations that sought to explain an unknown phenomenon.

Overview of Learning When Constructing Visual Representations
According to Vygotsky (1978), constructing visual representations is an active experiential process involving language and mediation. We explore the essential connection between language and learning through social semiotics, as exemplified in sociocultural learning theory. From this perspective, language is seen as a semiotic system (Halliday, 1978; Kress & van Leeuwen, 2006). Apart from oral language, multiple semiotic systems include "algebraic symbol systems, works of art, writing, schemes, diagrams, maps, and mechanical drawings, all sorts of conventional signs and so on" (Vygotsky, 1981, p. 137). However, regardless of the type of semiotic system, meanings are made with the purpose of communicating that meaning: A producer (seen as a social agent) creates signs that a reproducer interprets (Kress, 2010; Kress & van Leeuwen, 2006). The participants involved in the meaning-making process internalize the signs; signs provide the medium for thoughts to occur, revealing the integral link between language and learning (Vygotsky, 1978). As signs are created for a social purpose, it seems reasonable that learning necessitates a social experience to provide a context and means for language (Vygotsky, 1978).
As diagrams are a form of language, constructing visual representations collaboratively in science is one of the essential ways that emergent bi/multilinguals learn science effectively. However, to comprehend how emergent bi/multilinguals communicate in a visual mode, we must first understand how they use language to communicate. The theory of translanguaging offers an explanation: it posits that signs evolve in social and cultural contexts and accumulate to form part of an emergent bi/multilingual's communicative repertoire. In addition to other signs from non-linguistic modes such as audio, gesture, and tactile, visual signs are said to form part of emergent bi/multilinguals' semiotic repertoire (García & Li, 2014). Emergent bi/multilinguals draw upon and construct signs as they communicate. In other words, to make meaning, emergent bi/multilinguals create and combine signs from verbal systems (from different national languages or dialects) and non-verbal systems. This corresponds to a social semiotic notion that all meanings are multimodal (Kress, 2010).
During the process of construction, the visual representation becomes a tool of semiotic mediation in two distinct ways. First, constructing a scientific explanation through a visual representation can expose unseen mechanisms, components, and concepts that allow students to unearth and organize complex ideas (Coll et al., 2005); uncover relationships between components (Schwarz et al., 2009); and allow ideas to become visible (Forbes et al., 2015). Furthermore, materiality means visual representations are autonomous and transferable, which makes it possible for them to mediate between reality and theory (Knuuttila, 2005; Morgan & Morrison, 1999). When a visual representation takes the form of an explanatory tool, it is likely to include a combination of practical, representative, and theoretical meanings (Ainsworth, 2006; Maschietto & Bartolini Bussi, 2009). This can be further understood by considering the meaning-making functions of all representations. In social semiotics, there are three major meaning-making functions or metafunctions (Halliday, 1978): presentational (how ideas are deployed in relation to the world), orientational (what motivates a viewpoint or interaction), and organizational (the connections between the design of elements in the larger representation) (Lemke, 1998).
Second, constructing representations collaboratively mediates meaning as it enables more knowledgeable others to confront previously held assumptions of their peers. For example, Bracey (2017) found that during the constructing of visual representations, English learners drew support from their communicative repertoires as well as each other as they argued. This action facilitated their sense-making of galaxy collisions, which is a complex abstract concept. Students scaffolded ideas by building from one another until a more comprehensive understanding was produced (Bracey, 2017). Constructing representations therefore has the potential to facilitate language learners' higher order mental functioning as they challenge each other's ideas with their own language (Brooks, 2009). This enhances a student's explanation by giving rise to new knowledge while generating further questions (e.g., Williams et al., 2019). Thus, the purposeful act of explaining a scientific phenomenon by constructing representations means semiotic processes are at work that cause students to build and revise scientific knowledge.
In this study we adopt the position that constructing visual representations is a fundamental part of the semiotic processes in science and that, in addition to being transferable entities, they present communication avenues for emergent bi/multilinguals.

A Metalanguage for Describing Visual Elements in a Diagram
Constructing visual representations is not only an integral part of science, it is also a discipline-specific skill that students need to successfully participate in science (Tang & Moje, 2010). As such, drawing scientific representations is not an innate skill that children already possess; it is a literacy skill that must be acquired or developed through their formal education, much like reading and writing. For instance, there are many conventional and symbolic ways of meaning-making in scientific diagrams that are not obvious to most people. Some students could discern the "visual language" on their own, but most children do not unless they are guided by a teacher or an expert (Lemke, 2000).
To teach and learn the conventions of interpreting and creating visual representations, teachers and students need a shared metalanguage to explicitly talk about the form and function of the visual mode. A metalanguage is defined as a set of terminologies for describing language use and functions (Basturkmen et al., 2002). For example, to learn the English language, a metalanguage that includes terms such as verb, noun, and clause is often used by language teachers and specialists to teach the grammatical conventions of English. Metalanguage is not confined to describing verbal language. The New London Group (1996) expanded the concept of metalanguage to include language for talking about images and all meaning-making interactions.
In science education, Tang, Won and Treagust (2019) developed a metalanguage for describing scientific images based on Kress and van Leeuwen's (2006) initial work on visual grammar. This metalanguage was originally developed for researchers and analysts to describe and categorize a range of ideas represented in students' scientific drawings. The theoretical basis of this metalanguage is derived from social semiotics, which suggests that meanings are made by creating and putting together signs in each mode in meaningful relationships.
In social semiotics, researchers often examine how various "symbolic units" from a mode are integrated to form meanings. In a verbal-linguistic mode, the symbolic units are comprised of words that are combined in various ways to form a unit of meaning according to what Lemke (1990) called a semantic relationship. For instance, saying "friction is a type of force" and "an example of a force is friction" are similar ways of putting different words together to make the same semantic relationship of hyponym (classification). In a visual mode, the symbolic units are visual signs comprising dots, lines, curves, and basic geometrical shapes. In the same way that we can analyze verbal language through the choice and combination of words into sentences, we can analyze diagrams by examining how these visual signs are selected and joined into the larger diagram. For instance, a smaller circle inside a larger circle can denote a "type of" relationship similar to a hyponym. We call this arrangement of the signs the "visual elements" of the diagram.
The metalanguage used in Tang, Won and Treagust's (2019) study comprised a description of several visual elements that can be made with drawings, grouped into seven categories: association, spatial, movement, perspective, modality, connective, and textual contextualization. For instance, association is the term to describe the visual elements of joining objects to one another through the drawing of lines or proximity. Objects that are visually joined often signify some kind of physical connection, whereas an object drawn within a larger object can signify some kind of inclusion relationship, either as a composite relationship (e.g., an atom inside an enclosing space) or a set-subset relationship, as with a Venn diagram. Examples of visual elements from the other categories can be seen in the results section.
In this study, we modified the metalanguage by Tang, Won and Treagust (2019) and transformed it into a pedagogical tool for teachers and emergent bi/multilinguals to make explicit the visual elements in an elementary science classroom. First, some of the terminologies were renamed or simplified to make it easier for elementary school students to use. For example, association became relationship, and modality became believability as this was a measure of how closely it resembled reality for the viewer. Second, the metalanguage was introduced to the students in the form of a checklist, as shown in Figure 1. Every item in the checklist focuses on one aspect of the visual elements (i.e., relationship, spatial, movement, perspective, believability, connection, and text, as presented in the first column) and poses guiding questions (the second column), with a corresponding selection of answers (third column) illuminating the representational options. This enables the students to tick the elements used and circle the details applied in their representation (as shown in Figure 1). The guiding questions, representational options, and checklist are the instructional methods of how we make explicit the visual elements to the emergent bi/multilinguals during their construction of visual explanations.

Research Design and Research Question
This paper uses an instrumental case study (Stake, 2000) to examine how a teacher and 10 emergent bi/multilinguals responded to knowledge of the visual elements by using the introduced metalanguage in three science topics. The purpose of this case study was not to make decontextualized claims about the use of visual representations for constructing explanations; instead, it aimed to reveal insights into the affordance of a visual system and its metalanguage in supporting science learning for a group of emergent bi/multilinguals. With this purpose in mind, the research question that guided the investigation was:
1. What are the outcomes of introducing the visual metalanguage for (a) students to construct explanations and (b) teachers to influence students' construction of visual representations?
Figure 1. The checklist of visual elements used by the teacher when planning the lesson on matter

Research Context
The study was conducted at an independent school in an affluent area of Hong Kong. The bilingual model at the site was unlike government schools and was considered partial immersion (Lin & Man, 2009). This meant that the percentage of English delivery compared to Putonghua (standard Mandarin) increased each year through different subject areas, from 30% in the early years to 50% in the fourth and fifth grades. As a result, science was not taught in English until third grade. The eclectic school science program encompassed curricula from Taiwan and Hong Kong and had recently integrated parts of the Next Generation Science Standards (National Research Council, 2013) from the U.S.
Convenience sampling was used to select appropriate participants for the research, which meant the participants came from the first author's fifth-grade science class, which was delivered in English. As a result, the teacher-researcher, as a participant observer, could clarify meanings as they occurred in context, consistent with an instrumental case study (Stake, 2000). This enhanced the credibility and trustworthiness of the study, as the teacher-researcher managed elementary science in the English department at the school. She had revised the science curriculum, conducted previous research in her department, and developed science learning experiences. She also knew and had taught the participants prior to the research. She was 36, monolingual, and an expatriate of Anglo-Saxon descent. In contrast, the 10 student participants came from families with Chinese heritage and spoke varying degrees of English. Nine of the student participants had listed Chinese as their first language but did not specify whether this was Putonghua or Cantonese, the local Hong Kong dialect. Some participants knew both. Due to the unspecified language backgrounds of the student participants, they were separated randomly into two groups for the entirety of the study.
Altogether, nine science lessons (listed in Table 1) were adapted from the Thinking Frames Approach (Newberry & Gilbert, 2016), which included explanatory drawing and provided an opportunity for visual communication. This approach was considered multimodal as each of the three phases of the lesson demanded the use of a different dominant mode (Figure 2).
The nine lessons were divided into three units: human body, forces in motion, and matter. Each unit was covered in one academic term. In the lessons, the student groups created representations to explain several puzzling phenomena demonstrated in the classroom. For each lesson, the student groups were required to construct a verbal, visual, and written explanation to answer a how or why question (see Table 1). Thus, prior to the visual explanation, the students collaboratively deciphered the science concepts involved in the phenomenon and verbally constructed an explanation to answer the question. In the visual explanation, the students represented their explanation in the visual mode.
Table 1. The science focus and the question requiring an explanation for each lesson (columns: Unit, Lesson, Science lesson focus, Question requiring an explanation)
In this paper we limit our focus to the construction of the artifact produced for the visual explanation in each lesson. The students were provided with an A4 template for individual explanations and an A3 template for collaborative explanations. During the inquiry lessons, the objective was to draw an explanation of the phenomenon to address the inquiry question (see Table 1). If the students expressed their explanation in other modes, they were redirected by the teacher to represent their meanings visually. As the students drew their response, the teacher questioned the students to help them consider their ideas.
In the first two units, the teacher had access to the visual metalanguage and used it to plan her lessons. In the final unit, the students had access to the metalanguage via the checklist of visual elements constructed for this study. The checklist was introduced to the students in an additional science lesson (prior to the final unit) in three parts. In Part 1: Learning about drawing affordances, the teacher drew two perspectives of a familiar object on the board, one was regularly used by the students, whereas the other was rarely used at all. She asked the students why they used a top-down view to explain how the balloon car works. Next, she added a projected view to show how to magnify areas of the car. In Part 2: Identifying drawing affordances, the students became familiar with the visual elements by using the checklist to review their own representations from previous lessons. Their self-assessment led to discourse about the best way to explain the concepts and the phenomenon in each representation. In Part 3: Drawing with affordances in mind, the teacher led a new experiment that provided the students with an opportunity to use the checklist to construct a visual explanation. Following this lesson, the students had access to the checklist in all their science lessons.

Data Collection Methods
The data in this case study are from a 9-month study aimed at supporting emergent bi/multilinguals in science. In this paper we draw on several sources, including artifacts such as annotated checklists, the students' visual and written representations of their explanations, lesson plans, interviews, and video recordings. Multiple data sources are necessary for three reasons. First, as we assume that all meanings are of value and add to the visual representations (Kress, 2010), to capture the variety of signs made during the process of constructing the visual representations, data have to be collected from multiple viewpoints. For instance, video recordings capture tactile and gestural signs, including movement. As a result, two cameras were used, one focused on each student group. Second, multiple data sources provide a thorough convergence of evidence with which to explore the aims of the study and arrive at an in-depth conception of the phenomenon. Finally, multiple data sources provide an opportunity for multiple outcomes to become clear.
In response to the research question, we first constructed a multimodal transcript of the recordings of each lesson and obtained the times when each student group constructed the visual representations. If the teacher revealed the visual elements or information relating to the visual elements to the students in the semiotic units, we reviewed this to see if the teacher's meaning had been translated into the student's verbal or visual explanation. In the first two science units, the analysis focused on how the teacher influenced each group's construction of the visual representations. In the final science unit, as the students were now exposed to the visual metalanguage, our analysis broadened to include the multiple meanings (verbal and visual) leading to each group's explanation of the phenomenon in each lesson.
To determine if the elements in the visual framework influenced the students' representations, we followed the process shown in Figure 3. First, we elicited an accurate written explanation of each phenomenon and used Lemke's (1990) thematic analysis to examine the semantic relationships of the explanation. The verbal meanings established from the semantic relationships were translated into visual meanings to establish which visual elements would be considered essential in providing an explanation of each phenomenon. For example, an explanation of "Why we can eat upside down" in Lesson 3 must include the process of peristalsis, that is, the contraction and relaxation of muscles in the esophagus to send food to the stomach. The semantic relationships would consist of several transitivity processes (e.g., muscles contract, food is swallowed) joined in a temporal sequence through conjunctions (e.g., first, then). To translate these relationships into visual meanings in a drawing, it is necessary to choose equivalent visual elements that establish the same pattern. Therefore, this visual explanation must show the movement of muscles and food through the visual element of movement represented by arrows, lines, or labels as well as the transition of time through the visual element of connection via visual sequences. Together, the verbal explanation and translated visual elements became the aim of the students' explanations in each lesson.
Figure 3. The framework for analysis
Next, we analyzed the video recordings and students' representations to inspect the meanings made by the students during the construction of visual representations. Here, we created multimodal transcriptions (Bezemer & Mavers, 2011) and used a fine-grained analysis to examine the multimodal combination of signs, known as semiotic units (Tang et al., 2014; Williams et al., 2019), created during the construction process. Examples of the multimodal transcription can be seen in the Results section. Similar to the process described earlier, we analyzed the verbal discourse using Lemke's (1990) thematic analysis and the visual elements using Tang, Won and Treagust's (2019) visual framework.
Following the analysis of each group's visual representation, the visual elements were identified and counted. Two analysts were involved in the counting process, and interrater reliability, measured with Cohen's kappa, was 0.758. These data allowed comparisons to be made: first, by considering the visual elements chosen by the different groups in their explanations of the same phenomenon; second, by comparing each group's application of visual elements with the visual elements perceived necessary for each explanation, allowing a measure of accuracy for each group's visual explanation; and, finally, by reflecting on the number of times a particular visual element was perceived necessary and the number of times it was chosen by the different groups in their explanations. Making comparisons enabled variations to be found, which directed the investigation and inspection of the subsequent results. Discrepancies and equivalences warranted a more detailed examination and ensured more descriptive information was unveiled.
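The interrater check reported above can be illustrated with a small calculation. The sketch below computes Cohen's kappa, defined as (p_o - p_e) / (1 - p_e), for two analysts who each assign one category to the same items. The function name `cohens_kappa` and the six example codes are hypothetical and are not the study's data; how the two analysts' counts were actually paired is not described here, so this is only one plausible setup and is not expected to reproduce the reported value.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions,
    # summed over every label (a Counter returns 0 for unseen labels).
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if p_e == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes (not study data): the visual element each analyst
# assigned to six features of one drawing.
analyst_1 = ["movement", "movement", "connection", "text", "spatial", "movement"]
analyst_2 = ["movement", "connection", "connection", "text", "spatial", "movement"]

print(round(cohens_kappa(analyst_1, analyst_2), 3))
```

In this toy case the analysts agree on five of six items, but kappa is lower than the raw agreement rate because some of that agreement would be expected by chance given how often each analyst used each category.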
Throughout the entire process, the trustworthiness of the analysis was maintained through multiple methods (Guba & Lincoln, 1989). First, the analysis was conducted in a timely manner. Second, the researchers critically reviewed the interpretations of others. For example, if an interpretation of the transcript was in question, the raw video material was viewed; where disagreement about the interpretation remained (which was rare), the data in question were excluded. Third, the implementation of three lessons in each of the three science units ensured triangulation of the data, certifying the results were credible. Finally, the teacher-researcher had prior knowledge of the participants, which offered an insider's perspective and also meant she was able to clarify the intended meanings in context during the lesson. The following presents a discussion of the findings that address the research question.

The Students' Construction of Visual Representations
This section presents the quantitative data, which show the number of visual elements highlighted by each group in their visual representations for each lesson, followed by descriptions of the semiotic units and visual representations produced in two lessons. In the discussion, the results are summarized to address each part of the research question directly. The data chosen are representative of the visual representations and semiotic units found in each group for each science unit. Episode 1 is taken from the forces in motion unit, which preceded the teaching of the visual elements, whereas Episode 2 presents the findings from the matter unit, which followed the implementation of the checklist. Episodes 3 and 4 depict the teacher's influence and, as such, revisit each of the lessons shown in Episodes 1 and 2, respectively. In the episodes, the letter "T" represents the teacher and an ellipsis (…) represents truncated transcription.
To ascertain the outcomes of the visual metalanguage for the students, it was necessary to inspect the students' use of the visual mode before their exposure. Table 2 shows the number of visual elements found in the students' visual representations and compares this with the number of visual elements believed necessary for an accurate and complete (visual) explanation of the phenomenon in each lesson. If the student groups were unable to include all the necessary visual elements, their visual explanation of the phenomenon was considered incomplete or inaccurate. An inaccurate explanation was associated with either inexperience in drawing or a misunderstanding of the science. Thus, at this point in the lesson, the ability of the students to show the visual elements became a measure of the effectiveness of their visual explanation. The drawing required the application of the students' ideas through visual elements. However, as the visual elements were still new to the students, they were free to explore the use of them. Throughout the process, the students remained unaware and uninformed of the set of elements necessary for the explanations of each phenomenon. At this point in the lesson, the teacher was mainly focused on the students visually demonstrating their understanding.
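The completeness check described above amounts to a set comparison between the elements judged necessary and the elements found in a drawing. The sketch below illustrates this under stated assumptions: the function `assess_representation` and the element labels are hypothetical stand-ins and do not reproduce the coding reported in the tables.

```python
def assess_representation(required, present):
    """Compare the elements found in a drawing with those judged necessary."""
    required_set = set(required)
    present_set = set(present)
    # Necessary elements that the drawing does not show.
    missing = sorted(required_set - present_set)
    return {
        "required": len(required_set),
        "found": len(required_set & present_set),
        "missing": missing,
        # A representation counts as complete only if nothing necessary is missing.
        "complete": not missing,
    }


# Hypothetical labels for one group's drawing (illustrative only).
required = ["association", "movement", "proximity", "side-view perspective"]
present = ["association", "movement", "proximity", "top-view perspective"]
result = assess_representation(required, present)
print(result["found"], "of", result["required"], "necessary elements found")
print("Missing:", result["missing"])
```

Elements that appear in the drawing but were not judged necessary (here, the top-view perspective) do not count toward completeness, which mirrors the distinction between elements used and elements required.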
The results showed that the emergent bi/multilingual students communicated science meanings through visual representations before and after the visual metalanguage was introduced. Table 2 illustrates that prior to introducing the students to the checklist, they used a range of visual elements to explain the unknown phenomenon. For example, the data showed that three visual representations produced an accurate explanation of the phenomenon in human body Lesson 2, where Group 1 had all the necessary elements; forces in motion Lesson 4, where both groups were accurate; and forces in motion Lesson 6, where Group 2 included all the necessary visual meanings. It is noteworthy that these explanations were created prior to the introduction of the metalanguage to raise students' awareness of the visual elements. Thus, the data show that unbeknownst to the students, they were already applying the visual elements in their diagrams to some extent. Nonetheless, the data also show inaccuracies did occur in the majority of the lessons. On inspection we see the largest number of inaccuracies (i.e., the highest number of missing elements) occurred in explanations for the human body unit. In the other two units, most of the explanations were missing only one or two necessary visual meanings. We reveal the types of common inaccuracies and specific missing visual meanings in Table 3.
The results also show that each group included the entire repertoire of visual elements in their representations throughout the entire study. This indicates that the students were able to make meaning using all the elements from the visual mode. This is shown in Table 3, which compares the total number of visual elements required in the explanations of the phenomenon for all lessons with the total number of visual elements used in each group's representations for the entire time. (A more detailed analysis of each group's accuracy for a specific visual representation can be seen in Table 4.) Presentational meaning, specifically association, was the most frequently required visual element in the explanations. This is likely because associations portray the relationship and connections between lines and images and, as such, depict subject matter. For example, the lungs must be inside the chest cavity. However, this visual element appeared to be missing in several of the visual representations from each group. Further details of how the visual elements were used by students in their visual representations will be explored in the subsequent sections.
Table 2 The number of essential visual elements required in each representation to accurately explain the phenomenon and the number found to be present in each group's representations

3.1.1 Episode 1
The first episode is taken from the forces in motion unit and comes before the metalanguage was introduced. It illustrates how the visual mode gave the students a way to represent the details of their explanations that they were unable to represent in the written mode. We infer the reasons why certain visual meanings could not be translated, or were inaccurately translated, into the written mode. For this lesson, the students were asked to explain how a balloon car works. The visual elements necessary for a good explanation of how the balloon car works and whether each group's drawing showed those visual elements are listed in Table 4. In this episode, we surmise that the spatial element (unique to the visual mode) afforded the students a way to explain science meanings. As the data show, the students presented details of an explanation visually that many did not describe in their writing. For instance, both groups applied multiple presentational and orientational meanings that made clear their assumptions of the position and movement of air particles inside the balloon and the straw. Both groups also employed spatial elements in their visual representations. On closer inspection, we noticed the use of proximity and distribution (sub-categories of the spatial element) to visually describe the movement of air particles. For instance, the small green circles (particles) inside Group 1's balloon and the different sized circles in Group 2's balloon are spaced apart in a random distribution, except at the balloon opening. The particles at the opening form a pattern with two solid rows of touching circles. The contrast in particle distribution within the balloon demonstrates a misconception held by the students, as all particles should be positioned closely together.
However, the notable variation in distribution and proximity of the air particles implies that the students knew a change in density is necessary for the explanation of how a balloon car moves. Their assumption is accurate, as the difference in the density of the particles inside and outside the balloon was necessary to show a change in pressure. The students' ability to share their meanings visually exposed both their misconception and a potential relationship in their visual representations (see Figure 4) that most were unable to depict in their written explanations (as shown in Figure 5).
In another example, we noticed the visual depiction of movement, which is an important presentational meaning necessary for this explanation. Both groups depicted the directional movements using arrows to accurately identify the contrast in the directions of the air from the balloon and of the car. However, this visual depiction contrasts with the students' written explanations. For instance, seven students did not mention the contrasting direction of the car movement. Of the three students who verbalized the movement, two wrote that the car moved in the "opposite way," and only one described it as the "opposite direction." However, the most common description provided was "move forwards," which was not entirely correct. Nor is the reference to movement comprehensive enough to determine whether the students understood that the action, the push of the denser air molecules out of the balloon in one direction, is responsible for the reaction, the push of the car in the opposite direction. In addition, several students did not mention the car's movement at all, as demonstrated in Figure 5. Instead, this explanation described merely the force created by the release of air in the balloon. The visual mode provided more information than the written mode regarding the students' explanation. For this reason, we infer that the visual mode appeared to be a more appropriate way for students to explain this phenomenon. The data also show that not all necessary aspects of each explanation were depicted in the students' representations (see Table 4). For example, an orientational meaning necessary for an accurate visual explanation of how a balloon car moves is a side-view perspective. Yet both groups chose to represent the phenomenon with a top-view perspective. This choice of perspective limited the students' representations of the movement of the wheels and the surface friction.
Thus, while Group 2 attempted to add vertical lines on each side of the car to depict a wheel turning (with arrows), this mix of top- and side-view depictions is quite confusing and inaccurate. Furthermore, a top-view perspective limited the accurate depiction of the force of gravity, as the arrow appears to point in an inaccurate direction due to the position of the car (Figure 4).
Another example is the visual elements in the connective category, which is a type of organizational meaning. A good explanation of how a balloon car works should describe or show a temporal sequence of events: from the buildup of pressure due to the air particles inside the balloon to the release of the particles resulting in the car moving forward. To show such a temporal sequence visually, a balloon car at some time in the future can be drawn side by side (typically on the right) with the car in the present time to depict the passage of time (Tang, Won, & Treagust, 2019). This application of a connective visual element was not observed in Episode 1.

3.1.2 Episode 2
Episode 2 is from the matter unit, which followed the students' exposure to the visual metalanguage and includes data from Group 1. The key idea of this lesson was why ice melts faster on a ceramic tile than on foam. In this lesson, the students first created individual representations before combining their ideas to produce a group explanation.
It appeared that an awareness of the visual metalanguage provided the students with a way to demonstrate a key concept in their visual explanations.
An analysis of the students' individual representations showed a commonality in how microscopic aspects were represented. For example, all students in Group 1 (Figures 6-8) depicted microscopic images of each material using a projected view. As the students had not regularly used this element in their previous explanations, we inferred that this technique was borrowed from the teacher's illustration when she introduced the visual metalanguage. As such, this demonstrated that the students' awareness of the element was accurately applied. Moreover, the students' application of a projected view provided the necessary contrast between the two materials. As a result, a crucial aspect of the explanation was highlighted: the difference between the spacing (or density) of particles in each material. Jane even made use of this comparison to show the passing of time and the melting of the ice by drawing before and after diagrams (see Figure 6, representation on the left), another visual element on temporal connectivity not seen in previous lessons. Thus, the visual metalanguage had an impact on how the students represented the meanings visually. To further understand how the visual metalanguage shaped the students' visual representations, we examined their group discourse as they discussed how to draw. At the start of their group drawing, Group 1 were sitting in a circle and were almost at the completion of their collaborative representation. They were discussing the vibration of particles. The excerpt presented in Table 5 details a conversation between the emergent bi/multilingual students in Group 1, with the "action" presented in the right image in Figure 9. This excerpt demonstrates that the emergent bi/multilinguals' cognizance of the visual metalanguage enhanced their determination to achieve their objective: to visually explain the science phenomenon. 
Although the students did not discuss the visual metalanguage directly in their discourse, they were aware and determined to show the necessary concepts in their visual explanations. For instance, several students identified what the image should show, although they said "tell" (Lines 7 and 14) and "say" (Line 1). Josie commented, "We have to tell people that the vibration is creating thermal energy" (Line 7). They were also discussing how to depict them. Josie (as with the others) grappled with how to depict the science concepts through the visual elements, using statements such as "Yeah, but how do we show that?" (Line 21). They attempted to represent the concepts in new ways by introducing symbols not stated on the checklist, such as squiggly lines (Line 6; Figure 9) to show the vibrating movement of the particles (Lines 1 and 2). These questions, statements, and actions demonstrate a heightened awareness among the students regarding the importance of communicating the visual mode to a particular audience instead of assuming their drawing was self-explanatory. This type of awareness and discourse was not observed in the previous two units.
To deliberate on how to best represent the science concepts, the students used the visual mode. As they represented ideas, they applied the visual elements. For instance, Jane used the connective element to show a comparison as she verbally described: "Start over there, it's like draw them together … then draw a line going through and the other draw it far away" (Line 23). She drew a line to show the connection between the particles and the direction the vibration (energy) was moving (Line 31; Figure 9). Here Jane used association elements to position the particles left to right and applied spatial elements to show the contrasting proximity of particles to each other, in both materials. Similarly, Josie used visual meanings to present her idea (Figure 9). First, she focused on the macro-level and identified the subject matter, "OK, this is the ceramic thing OK, and this is the foam" (Line 24). Next, she added details of the microscopic level, "So, then this has more stuff right?" (Line 25). Thus, Josie also applied a contrast between the two materials. As such, despite their differing depictions, both Jane and Josie accurately identified the necessary visual meanings for the explanation of this phenomenon. As the teacher was able to make meaning from the image, we inferred Nick's lack of understanding in this situation (e.g., Lines 26 and 28) arose from his lack of conceptual knowledge. It is clear that the emergent bi/multilinguals' ability to demonstrate understanding and ideas through both the visual and verbal modes ensured that they were able to participate in the discourse and gain access to the meanings presented.
Table 5 Conversation excerpts between emergent bi/multilingual students in Group 1 during the Matter unit
Figure 9 Group 1's representation explaining why ice melts quicker on a tile than on foam
Moreover, the emergent bi/multilinguals' heightened awareness of the visual mode, via an introduction to the visual metalanguage, appeared to alter their perception of visual explanations in science. This was seen from the comments of the Group 1 students. For example, in her self-recorded video reflection following the matter lessons, Yasmine said, "My favorite part of the lesson is using drawing to represent my information. This is the first time I know I can use drawing to represent my information without using any words, which is really surprising and interesting." Similarly, the students present in the group interview agreed that the checklist should be given to the next Grade 5 students. In response to the teacher's query about whether the checklist was helpful, one student replied, "This helped me drawing [sic] a diagram a lot. It taught me the procedures of drawing a diagram" (Jeffery). Another student added, "and also sometimes I forgot to use labels, and those things [visual elements] and close-up pictures [projected view] [help me to] remember" (Kirsty). Thus, the visual metalanguage was perceived by several students as enhancing their ability to present meanings in science.

The Teacher's Influence on the Students' Construction of Visual Representations
This section illustrates how the teacher's attempts to aid the students' creation of the visual explanations subsequently enhanced their communication of science meanings. It includes two episodes. Episode 3 occurred after Episode 1 in the forces in motion lesson and presents an example of a semiotic unit made during the explanation of how a balloon car works. Episode 4 occurred after Episode 2 in the matter unit and follows the students' introduction to the visual metalanguage and asks the students to explain why ice melts faster on a tile than on foam.

3.2.1 Episode 3
This episode demonstrates how the teacher's knowledge of the visual elements allowed her to direct her prompts and guidance during group discussions.
We deduce that the teacher's prior knowledge of the visual system enabled her to consider the meanings necessary in the visual explanation, which equipped her to respond to the student groups as they constructed their representations. The excerpt also demonstrates how her prompting and questioning subsequently led the students to improve their visual communication, as evidenced by their completed representation. For example, prior to the following excerpt (Table 6), Group 1's visual representation (Figure 4) depicted only the subject matter that could be seen, such as the car and balloon. As illustrated in the excerpt, the students' visual representation did not yet provide any microscopic details. However, for an accurate visual explanation, a depiction of the movement of air particles is called for to show the force that moved the car, otherwise it is incomplete. Despite an earlier verbal discussion of the movement of particles (molecules), the students neglected to add microscopic meanings in their visual representation. Following the teacher's prompting in Lines 4 and 9, the students added particles using a green pen (see Figure 4). Of significance in this episode is that the students appeared to already understand how to represent these meanings and accurately portray their movement out of the balloon. However, without the teacher's expectation of visual meanings and prompting during the discussion, these meanings might have been overlooked.
Table 6 Conversation excerpts between emergent bi/multilingual students in Group 1 during the forces in motion unit

3.2.2 Episode 4
This episode follows the implementation of the visual metalanguage in the checklist and occurs in the same matter lesson as Episode 2. In this lesson, the students observed a demonstration of ice melting quicker on a tile than on foam. Following this, each student constructed an individual visual representation. The following excerpt (Table 7) began as the students were about to draw their individual representation.
This excerpt illustrates how the introduction of the visual checklist mediated meaning by allowing the teacher (and the students) to verbalize their thinking when explaining how to represent their ideas and concepts visually. A good illustration can be seen in Lines 1 to 5, where the teacher showed the drawing checklist as she provided the students with the metalanguage necessary to verbalize their visual meanings. The teacher probed their ideas by asking them how they would use the visual elements in their drawing. In particular, the questioning was mediated through a shared language using the term "density" (a spatial category) to focus on the "different ways" of "comparing one thing to another." This form of questioning stood in contrast to her requests prior to the introduction of the checklist in Episode 3 for students to simply show the phenomenon in vague terms, for example "Did you show me what those look like?" (Line 9) and "What happens to the molecules when they're in the balloon?" (Line 11). Furthermore, in episodes prior to the introduction of the metalanguage of visual elements, the changing spatial qualities (proximity and distribution) of the particles were not discussed. However, in this episode, we see a difference in the discussion. For instance, in Line 3, the teacher mentioned making comparisons (referring to connection in the checklist) and in Line 9 she alluded to the need for spatial awareness of the particles within the models, with suggestions of spacing in Line 14: "Are they together or far apart?" Thus, when discussing the visual explanation, the teacher attempted to make associations between the visual elements and the science concepts before the students began to draw their representations.
Table 7 Conversation excerpts between emergent bi/multilingual students in Group 1 during the matter unit
The teacher's highlighting of the relationships between the science concepts and the visual elements prompted the students to consider their explanations.
For instance, Josie mentioned comparing the two images (Line 7). Later, the students in Group 1 agreed that adding more molecules into one of the objects would show it was denser (Lines 12-13). The teacher's support in navigating the visual metalanguage thus enabled the students to apply the visual elements.

3.3 Limitations
One of the limitations of this study is the sample size, which remained deliberately small to ascertain an in-depth understanding of the phenomenon and conduct a detailed analysis of semiotic units. Consequently, the generalizability of the results is limited. Another limitation was that by using fixed cameras, the signs outside the recorded area were missed. However, placing tape on the table to provide a boundary for the students to stay within reduced the chance of missed signs.

Discussion
To investigate whether supplementary knowledge of the visual mode supported students in communicating science explanations, we made the visual elements explicit to a fifth-grade science class through a shared metalanguage. We recorded 10 emergent bi/multilinguals' construction of visual explanations of an unknown phenomenon, both before and after the introduction of the metalanguage. We found that exposure to the visual metalanguage heightened their awareness of visual meanings and motivated their desire to illustrate science concepts in explanations. Furthermore, making the visual elements explicit in the form of a checklist helped them to demonstrate key concepts in their explanations and indirectly increased some students' confidence levels when learning science. The results of this case study are consistent with other research demonstrating that the visual mode provides an outlet for emergent bi/multilinguals' meaning-making (e.g., Ryoo & Bedell, 2017). Yet this study extends our understanding. An argument has been made for the importance of representational competence in science (diSessa, 2004; Kozma & Russell, 2005); in response to this, we discussed how the visual elements affected the construction of visual representations. To gain representational competence, the students must know the form and function meanings can take in a mode of representation. The visual elements offered students an insight into the form of 2D visual meanings. However, without an understanding of how they function, a student's meaning can be impeded. For example, one element that provided crucial information in these explanations was not found in the written mode. The students appeared to have prior knowledge of the spatial element as they were able to organize particles appropriately in most representations.
However, an element that we found constrained meanings in the visual system was perspective, an orientational meaning, and students appeared to have a lower level of knowledge of this visual element. In one explanation in this study, we found the choice of perspective encumbered the depiction of the intended direction of forces and of the wheels creating friction with the ground. We argue that knowing how visual elements function in science classrooms could improve students' visual explanations. This has further implications for emergent bi/multilinguals, who may prefer this form of communication.

The Implications of the Visual Mode for Emergent Bi/multilinguals in Science
The results have shown that visual elements provide a medium for meaning-making, even when they are implicitly used by young emergent bi/multilinguals. However, when visual elements are explicitly discussed using a metalanguage, they can raise an awareness of the form and function of visual elements and thereby provide the students with an alternative way of illuminating key concepts in their visual explanations. In doing so, the visual system offered a window into the emergent bi/multilinguals' understanding of science, which was not always uncovered in their writing. This suggests that the visual mode can provide assessment possibilities once the students know more about it through the visual metalanguage. Through the explicit metalanguage, the teacher gained an expectation of which visual elements should be seen in the students' visual explanations. Consequently, the students' misunderstandings became noticeable. In this study, any misunderstanding may be attributed to an insufficient knowledge of the visual elements due to the students' limited exposure to the visual metalanguage. Nevertheless, the use of the visual framework to analyze each representation may help teachers identify emergent bi/multilinguals' understanding and any potential misunderstanding.
In the same way, the students used their understanding to their advantage as they accessed semiotic resources when translanguaging to produce visual meanings in unique ways. This meant different visual representations could still depict the required visual elements. As a result, the visual metalanguage afforded the students a way to value their cultural and language backgrounds. Furthermore, the checklist of visual elements provided learners with a selection of possibilities for making visual meaning. By asking learners to choose from among these possibilities, the checklist ensured they had representational agency, and enticed them to participate. In essence, the checklist could potentially mediate meaning as it questioned what forms were best to represent meanings and elicited responses from the students. Comparing the students' actual representation with their completed checklist of the elements perceived to be included may provide an area for further study. In principle, the checklist became an artifact associated with the representation; yet in this study its potential remained unexplored. Broadening the research to include this avenue in future studies is a recommended goal.