Since the turn of the millennium, researchers have had access to an ever-increasing pool of novel types of video recordings. People use camcorders, mobile phone cameras, and even drones to film and photograph social life, and many public spaces are under video surveillance. More and more sociologists, psychologists, education researchers, and criminologists rely on such visuals to observe and analyze social life as it happens. Using qualitative or quantitative techniques, scholars trace situations or events step by step to explain a social process or outcome. Recently, a methodological framework has been formulated under the label Video Data Analysis (VDA) to provide a reference point for scholars across disciplines. Our paper aims to contribute further to this effort by detailing important issues and potential challenges along the VDA research process. The paper briefly introduces VDA and the value of 21st-century visuals for understanding social phenomena. It then reflects on important issues and potential challenges in five steps of conducting VDA and formulates guidelines for each: setting up the research, choosing data sources, assessing their validity, analyzing the data, and presenting the findings. These reflections aim to strengthen the methodological foundations for studying situational dynamics with 21st-century video data.