Investigating the virtual directing strategies of a virtual cinematographer in an automatic lecture video post-processing system

Khatieb, Mohamed Tanweer

dc.contributor.advisor	Marais, Patrick
dc.contributor.advisor	Marquard, Stephen
dc.contributor.author	Khatieb, Mohamed Tanweer
dc.date.accessioned	2024-11-08T08:54:58Z
dc.date.available	2024-11-08T08:54:58Z
dc.date.issued	2023
dc.date.updated	2024-09-03T09:33:36Z
dc.description.abstract	As recording technology improves and becomes more affordable, many learning institutions are using lecture recording to make lessons more persistent and accessible. Statically mounted 4K cameras are now cheaper than PTZ cameras, which makes them a desirable alternative for lecture recordings. Unfortunately, 4K resolution videos are very large, posing a problem for storage and streaming: the file size for a 45-60 minute lecture video in 4K can exceed 2 GB. Many students cannot afford the bandwidth required to stream such large files. Furthermore, since static 4K cameras do not move, they require a wide-angle view in order to capture as much of the front of the venue as possible. This view is much too zoomed out for viewers to see the details captured by the 4K resolution, such as writing on the boards and the presenter's facial expressions. This dissertation investigates an approach to post-processing these 4K lecture videos to reduce the file size and emphasise lecture details such as lecture motion and board/screen usage. This is done using scene tracking data (generated via a third-party front-end), which a Virtual Cinematographer (VC) uses to decide which areas to crop from each 4K frame in the original video. The VC then positions and sizes the cropping windows in such a way that the resultant cropped video resembles one recorded by a human camera operator, using cinematographic heuristics to inform its decision-making. The VC uses scene analysis algorithms to determine how the environment changes as time progresses in the video. By dividing the video into “chunks” (equivalent to “scenes” in traditional cinematography) based on context, the VC is able to maintain stable shots with consistent framing and avoid jittery, disorienting footage.
These contextual chunks are determined by comparing the trajectory of the presenter with the manner in which the features on the board regions change over time. After the chunks are established, the VC creates transitions between them while avoiding any changes to the framing inside each chunk. The final output is a JSON file containing the cropping coordinates for each frame in the video, for a third-party video cropping application to use when producing the final video. We performed a user evaluation of the VC to measure user satisfaction with the resulting output videos and how successful the VC was at following its heuristics. The VC succeeded in following the major heuristics: viewers were satisfied with the output based on the framing of the presenter and the content on the boards, transition stability and smoothness of motion, and transition frequency, with the VC only changing shots when necessary.
dc.identifier.apacitation	Khatieb, M. T. (2023). Investigating the virtual directing strategies of a virtual cinematographer in an automatic lecture video post-processing system. Faculty of Science, Department of Computer Science. Retrieved from http://hdl.handle.net/11427/40692	en_ZA
dc.identifier.chicagocitation	Khatieb, Mohamed Tanweer. "Investigating the virtual directing strategies of a virtual cinematographer in an automatic lecture video post-processing system." Faculty of Science, Department of Computer Science, 2023. http://hdl.handle.net/11427/40692	en_ZA
dc.identifier.citation	Khatieb, M.T. 2023. Investigating the virtual directing strategies of a virtual cinematographer in an automatic lecture video post-processing system. Faculty of Science, Department of Computer Science. http://hdl.handle.net/11427/40692	en_ZA
dc.identifier.ris	TY - Thesis / Dissertation AU - Khatieb, Mohamed Tanweer AB - As recording technology improves and becomes more affordable, many learning institutions are using lecture recording to make lessons more persistent and accessible. Statically mounted 4K cameras are now cheaper than PTZ cameras, which makes them a desirable alternative for lecture recordings. Unfortunately, 4K resolution videos are very large, posing a problem for storage and streaming: the file size for a 45-60 minute lecture video in 4K can exceed 2 GB. Many students cannot afford the bandwidth required to stream such large files. Furthermore, since static 4K cameras do not move, they require a wide-angle view in order to capture as much of the front of the venue as possible. This view is much too zoomed out for viewers to see the details captured by the 4K resolution, such as writing on the boards and the presenter's facial expressions. This dissertation investigates an approach to post-processing these 4K lecture videos to reduce the file size and emphasise lecture details such as lecture motion and board/screen usage. This is done using scene tracking data (generated via a third-party front-end), which a Virtual Cinematographer (VC) uses to decide which areas to crop from each 4K frame in the original video. The VC then positions and sizes the cropping windows in such a way that the resultant cropped video resembles one recorded by a human camera operator, using cinematographic heuristics to inform its decision-making. The VC uses scene analysis algorithms to determine how the environment changes as time progresses in the video. By dividing the video into “chunks” (equivalent to “scenes” in traditional cinematography) based on context, the VC is able to maintain stable shots with consistent framing and avoid jittery, disorienting footage.
These contextual chunks are determined by comparing the trajectory of the presenter with the manner in which the features on the board regions change over time. After the chunks are established, the VC creates transitions between them while avoiding any changes to the framing inside each chunk. The final output is a JSON file containing the cropping coordinates for each frame in the video, for a third-party video cropping application to use when producing the final video. We performed a user evaluation of the VC to measure user satisfaction with the resulting output videos and how successful the VC was at following its heuristics. The VC succeeded in following the major heuristics: viewers were satisfied with the output based on the framing of the presenter and the content on the boards, transition stability and smoothness of motion, and transition frequency, with the VC only changing shots when necessary. DA - 2023 DB - OpenUCT DP - University of Cape Town KW - Computer Science LK - https://open.uct.ac.za PY - 2023 T1 - Investigating the virtual directing strategies of a virtual cinematographer in an automatic lecture video post-processing system TI - Investigating the virtual directing strategies of a virtual cinematographer in an automatic lecture video post-processing system UR - http://hdl.handle.net/11427/40692 ER -	en_ZA
dc.identifier.uri	http://hdl.handle.net/11427/40692
dc.identifier.vancouvercitation	Khatieb MT. Investigating the virtual directing strategies of a virtual cinematographer in an automatic lecture video post-processing system. Faculty of Science, Department of Computer Science, 2023 [cited yyyy month dd]. Available from: http://hdl.handle.net/11427/40692	en_ZA
dc.language.rfc3066	eng
dc.publisher.department	Department of Computer Science
dc.publisher.faculty	Faculty of Science
dc.subject	Computer Science
dc.title	Investigating the virtual directing strategies of a virtual cinematographer in an automatic lecture video post-processing system
dc.type	Thesis / Dissertation
dc.type.qualificationlevel	Masters

Files

Original bundle

Name: thesis_sci_2023_khatieb mohamed tanweer.pdf
Size: 15.42 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.72 KB
Format: Item-specific license agreed to upon submission

Collections

Masters