Investigating the virtual directing strategies of a virtual cinematographer in an automatic lecture video post-processing system

dc.contributor.advisorMarais, Patrick
dc.contributor.advisorMarquard, Stephen
dc.contributor.authorKhatieb, Mohamed Tanweer
dc.date.accessioned2024-11-08T08:54:58Z
dc.date.available2024-11-08T08:54:58Z
dc.date.issued2023
dc.date.updated2024-09-03T09:33:36Z
dc.description.abstractAs recording technology improves and becomes more affordable, many learning institutions are using lecture recording to make lessons more persistent and accessible. Statically mounted 4K cameras are now cheaper than PTZ cameras, which makes them a desirable alternative for lecture recordings. Unfortunately, 4K resolution videos are very large, posing a problem for storage and streaming: the file size for a 45-60 minute lecture video in 4K can exceed 2 GB. Many students cannot afford the bandwidth required to stream such large files. Furthermore, since static 4K cameras do not move, they require a wide-angle view of the venue in order to capture as much of the front of the venue as possible. This view is much too zoomed out for viewers to see the details, such as writing on the boards and the presenter's facial expressions, captured by the 4K resolution. This dissertation investigates an approach to post-processing these 4K lecture videos to reduce the file size and emphasise lecture details such as lecture motion and board/screen usage. This is done using scene tracking data (generated via a third-party front-end) which a Virtual Cinematographer (VC) uses to make decisions about which areas to crop from each 4K frame in the original video. The VC then positions and sizes the cropping windows in such a way that the resultant, cropped video resembles one recorded by a human camera operator. This is accomplished using cinematographic heuristics to inform its decision-making. The VC uses scene analysis algorithms to determine how the environment changes as time progresses in the video. By dividing the video into “chunks” (equivalent to “scenes” in traditional cinematography) based on context, the VC is able to maintain stable shots with consistent framing to avoid jittery and disorienting footage.
These contextual chunks are determined by comparing the trajectory of the presenter with the manner in which the features on the board regions change over time. After the chunks are established, the VC creates transitions between them while avoiding any changes to the framing inside each chunk. The final output is a JSON file containing the cropping coordinates for each frame in the video for a third-party video cropping application to use when producing the final video. We performed a user evaluation of the VC to measure user satisfaction with the resulting output videos and how successful it was at following its heuristics. The VC succeeded in following the major heuristics such that viewers were satisfied with the output based on the framing of the presenter and the content on the boards, transition stability and smoothness of motion, and transition frequency, with the VC changing shots only when necessary.
dc.identifier.apacitationKhatieb, M. T. (2023). Investigating the virtual directing strategies of a virtual cinematographer in an automatic lecture video post-processing system. Faculty of Science, Department of Computer Science. Retrieved from http://hdl.handle.net/11427/40692 en_ZA
dc.identifier.chicagocitationKhatieb, Mohamed Tanweer. "Investigating the virtual directing strategies of a virtual cinematographer in an automatic lecture video post-processing system." Faculty of Science, Department of Computer Science, 2023. http://hdl.handle.net/11427/40692 en_ZA
dc.identifier.citationKhatieb, M.T. 2023. Investigating the virtual directing strategies of a virtual cinematographer in an automatic lecture video post-processing system. Faculty of Science, Department of Computer Science. http://hdl.handle.net/11427/40692 en_ZA
dc.identifier.ris TY - Thesis / Dissertation AU - Khatieb, Mohamed Tanweer AB - As recording technology improves and becomes more affordable, many learning institutions are using lecture recording to make lessons more persistent and accessible. Statically mounted 4K cameras are now cheaper than PTZ cameras, which makes them a desirable alternative for lecture recordings. Unfortunately, 4K resolution videos are very large, posing a problem for storage and streaming: the file size for a 45-60 minute lecture video in 4K can exceed 2 GB. Many students cannot afford the bandwidth required to stream such large files. Furthermore, since static 4K cameras do not move, they require a wide-angle view of the venue in order to capture as much of the front of the venue as possible. This view is much too zoomed out for viewers to see the details, such as writing on the boards and the presenter's facial expressions, captured by the 4K resolution. This dissertation investigates an approach to post-processing these 4K lecture videos to reduce the file size and emphasise lecture details such as lecture motion and board/screen usage. This is done using scene tracking data (generated via a third-party front-end) which a Virtual Cinematographer (VC) uses to make decisions about which areas to crop from each 4K frame in the original video. The VC then positions and sizes the cropping windows in such a way that the resultant, cropped video resembles one recorded by a human camera operator. This is accomplished using cinematographic heuristics to inform its decision-making. The VC uses scene analysis algorithms to determine how the environment changes as time progresses in the video. By dividing the video into “chunks” (equivalent to “scenes” in traditional cinematography) based on context, the VC is able to maintain stable shots with consistent framing to avoid jittery and disorienting footage.
These contextual chunks are determined by comparing the trajectory of the presenter with the manner in which the features on the board regions change over time. After the chunks are established, the VC creates transitions between them while avoiding any changes to the framing inside each chunk. The final output is a JSON file containing the cropping coordinates for each frame in the video for a third-party video cropping application to use when producing the final video. We performed a user evaluation of the VC to measure user satisfaction with the resulting output videos and how successful it was at following its heuristics. The VC succeeded in following the major heuristics such that viewers were satisfied with the output based on the framing of the presenter and the content on the boards, transition stability and smoothness of motion, and transition frequency with the VC only changing shots when necessary DA - 2023 DB - OpenUCT DP - University of Cape Town KW - Computer Science LK - https://open.uct.ac.za PY - 2023 T1 - Investigating the virtual directing strategies of a virtual cinematographer in an automatic lecture video post-processing system TI - Investigating the virtual directing strategies of a virtual cinematographer in an automatic lecture video post-processing system UR - http://hdl.handle.net/11427/40692 ER - en_ZA
dc.identifier.urihttp://hdl.handle.net/11427/40692
dc.identifier.vancouvercitationKhatieb MT. Investigating the virtual directing strategies of a virtual cinematographer in an automatic lecture video post-processing system. Faculty of Science, Department of Computer Science, 2023 [cited yyyy month dd]. Available from: http://hdl.handle.net/11427/40692 en_ZA
dc.language.rfc3066eng
dc.publisher.departmentDepartment of Computer Science
dc.publisher.facultyFaculty of Science
dc.subjectComputer Science
dc.titleInvestigating the virtual directing strategies of a virtual cinematographer in an automatic lecture video post-processing system
dc.typeThesis / Dissertation
dc.type.qualificationlevelMasters
Files
Original bundle
Name: thesis_sci_2023_khatieb mohamed tanweer.pdf
Size: 15.42 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.72 KB
Format: Item-specific license agreed upon to submission