This study investigates how microlearning videos become entangled with classroom space, discourse, and collaborative meaning-making in English language learning, offering a sociomaterial perspective that goes beyond cognitive or instructional effectiveness. While microlearning is often praised for its brevity and clarity, little attention has been given to how it reconfigures social practices, spatial alignments, and multimodal interactions in face-to-face settings. The study treats microlearning videos not as neutral tools but as active elements within an assemblage of bodies, materials, and discursive practices. It focuses on how students interact with video content, reshape learning spaces, and co-construct grammatical understanding through peer dialogue and embodied interaction. Three research questions frame the investigation: (a) How did microlearning videos entangle with the learning space in English language classrooms? (b) How did microlearning videos influence classroom discourse? and (c) How did microlearning video elements function as assemblages in the collaborative meaning-making process? This qualitative study adopts a sociomaterial approach to the integration of microlearning videos in English language instruction, focusing on how digital tools entangle with classroom space, discourse, and student engagement. Data were collected through participant observation, stimulated recall interviews, and video recordings. An observation checklist was developed to guide the analysis of sociomaterial aspects, sociocognitive interactions, and classroom discourse, while the stimulated recall interviews elaborated on observed phenomena by prompting participants to reflect on their experiences during specific moments in the lesson.
Data analysis drew upon a sociomaterial framework adapted from Moura and Bispo (2018), enriched by the Cognitive Theory of Multimedia Learning (Mayer, 2005, 2017), to interpret how microlearning videos mediate both meaning-making and spatial practices. This hybrid analytical strategy enabled a nuanced understanding of how digital resources, human actors, and physical environments co-construct learning experiences in a language classroom setting. The study reveals how microlearning videos become entangled with the physical and social dimensions of an English language classroom, demonstrating that learning environments are not passive spaces but dynamic sociomaterial assemblages co-constructed by learners, lecturers, and material arrangements. Students' spatial choices highlighted their active negotiation of space to support visibility and interaction, while informal postures such as sitting on the floor reflected informal social practices and identities that enhanced peer collaboration. The integration of microlearning videos reshaped classroom discourse, introducing a new pattern, PIER (Presentation–Interjection–Engagement–Reintegration), that reconfigures the traditional IRF model. Lecturers used a Play–Pause–Rephrase technique to scaffold understanding, though discussions remained largely guided rather than open-ended. Video content also functioned as a cognitive extension, mediating knowledge construction through gestures, gaze, and shared attention. Students demonstrated high levels of social engagement, negotiating meaning and co-constructing understanding collaboratively, which shows that language learning is inherently dialogic and socially mediated. Behavioural engagement was evident in students' bodily orientation toward the screen, indicating focused attention and task involvement. Material engagement emerged as students paused, replayed, and re-analysed segments, treating the video as a manipulable resource rather than a static medium.
Cognitive engagement was supported by lecturer-led questioning that encouraged critical thinking about grammatical structures such as "wish" and "if only". Finally, analysis of video design showed that principles from the Cognitive Theory of Multimedia Learning (Mayer, 2017), including narration, text, colour coding, and visual cues, were used effectively to reduce cognitive load and enhance comprehension. The findings carry several pedagogical implications. Teachers must develop awareness of spatial, material, and technological dynamics to create flexible yet structured learning environments. Beyond managing behaviour, they need to mediate between digital tools and physical arrangements, using their presence to guide attention and support cognitive engagement. Lessons should be chunked to allow processing time and paired with intentional scaffolding that bridges passive viewing and active participation. Students also share responsibility for learning by interacting with materials and peers. Instruction should leverage multimodal elements (visual, auditory, and textual) to enhance understanding of complex grammar. Teachers are encouraged to design video-based tasks intentionally so that they integrate microlearning's affordances and ensure that multimedia components work cohesively to support diverse learners. This approach fosters deeper comprehension and socially mediated meaning-making, aligning with how students naturally engage with digital content in collaborative settings.