Detection of HTTPS malware traffic without decryption

Nyathi, Miranda

dc.contributor.advisor	Hutchison, Andrew
dc.contributor.author	Nyathi, Miranda
dc.date.accessioned	2022-06-23T15:32:51Z
dc.date.available	2022-06-23T15:32:51Z
dc.date.issued	2022
dc.date.updated	2022-06-23T14:34:06Z
dc.description.abstract	Each year the world's dependence on the internet grows, especially for functions relating to critical infrastructure and social connections. More than 80% of internet traffic is encrypted using the Transport Layer Security (TLS) protocol, and this share is predicted to increase [8]. However, threat actors are also increasingly using the TLS protocol to hide malicious activities such as command and control, loading malware into a network, and exfiltrating sensitive data. The use of TLS by threat actors poses a challenge to security professionals, as traditional techniques for detecting HTTP malware cannot be applied to malware traffic encrypted with Hypertext Transfer Protocol Secure (HTTPS). To manage this, companies are using a traditional method called Transport Layer Security Inspection (TLSI), which involves decrypting packets to perform full packet inspection. TLSI is expensive in computational performance and complexity and, above all, it violates users' privacy. Researchers from Cisco have proposed that malicious encrypted traffic can be identified by techniques other than TLSI, and that the unencrypted TLS handshake messages, certificates, and flow metadata of malicious traffic are distinct from those of benign traffic. These differences can be used effectively in machine learning to classify malicious and benign encrypted traffic [35]. This dissertation aims to assess the feasibility and effectiveness of the proposed alternative to TLSI. We sourced thousands of malware and benign flows and then used the Cisco tool Joy to extract features from the unencrypted TLS handshake messages, certificates, and flow metadata. To understand how the characteristic behaviour of malicious and benign flows differs, we explored the data, summarized the unique values of the features in our datasets, and compared them with the feature values from the Cisco datasets used in the research paper [35].
We then selected the features that had the most differentiating power in our dataset. The selected features were the inputs to two supervised classifiers: logistic regression and random forest. The classifiers were trained and tested on offline datasets of benign and malware features, and we observed that the random forest performed better, with an average accuracy of 98.92%. We concluded that it is viable and effective to use alternative techniques to detect HTTPS malware without TLSI.
dc.identifier.apacitation	Nyathi, M. (2022). Detection of HTTPS malware traffic without decryption. Faculty of Science, Department of Computer Science. Retrieved from http://hdl.handle.net/11427/36527	en_ZA
dc.identifier.chicagocitation	Nyathi, Miranda. "Detection of HTTPS malware traffic without decryption." Faculty of Science, Department of Computer Science, 2022. http://hdl.handle.net/11427/36527	en_ZA
dc.identifier.citation	Nyathi, M. 2022. Detection of HTTPS malware traffic without decryption. Faculty of Science, Department of Computer Science. http://hdl.handle.net/11427/36527	en_ZA
dc.identifier.ris	TY - Master Thesis AU - Nyathi, Miranda DA - 2022 DB - OpenUCT DP - University of Cape Town KW - computer science LK - https://open.uct.ac.za PY - 2022 T1 - Detection of HTTPS malware traffic without decryption TI - Detection of HTTPS malware traffic without decryption UR - http://hdl.handle.net/11427/36527 ER -	en_ZA
dc.identifier.uri	http://hdl.handle.net/11427/36527
dc.identifier.vancouvercitation	Nyathi M. Detection of HTTPS malware traffic without decryption. Faculty of Science, Department of Computer Science, 2022 [cited yyyy month dd]. Available from: http://hdl.handle.net/11427/36527	en_ZA
dc.language.rfc3066	eng
dc.publisher.department	Department of Computer Science
dc.publisher.faculty	Faculty of Science
dc.subject	computer science
dc.title	Detection of HTTPS malware traffic without decryption
dc.type	Master Thesis
dc.type.qualificationlevel	Masters
dc.type.qualificationlevel	MSc
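
The abstract describes training logistic regression and random forest classifiers on selected flow features and comparing their accuracy. The sketch below illustrates that kind of comparison only; it is not the dissertation's code. It assumes scikit-learn and NumPy are available, uses synthetic data in place of Joy output, and the four feature columns (offered cipher suites, extension count, certificate validity in days, outbound bytes) and the helper names make_synthetic_flows and compare_classifiers are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_synthetic_flows(n_per_class=500, seed=0):
    """Build a toy dataset standing in for extracted flow features.

    Hypothetical columns (not taken from the dissertation):
    offered cipher suites, extension count, certificate validity
    in days, outbound bytes. The values are random, not real traffic.
    """
    rng = np.random.default_rng(seed)
    benign = rng.normal(loc=[15, 9, 400, 3000],
                        scale=[4, 2, 150, 1500],
                        size=(n_per_class, 4))
    malware = rng.normal(loc=[8, 4, 3000, 900],
                         scale=[4, 2, 1200, 600],
                         size=(n_per_class, 4))
    X = np.vstack([benign, malware])
    # Label 0 = benign, 1 = malware.
    y = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)]).astype(int)
    return X, y


def compare_classifiers(X, y, seed=0):
    """Train both classifiers on one split and return test accuracy per model."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)
    models = {
        # Logistic regression benefits from scaled inputs.
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)),
        "random_forest": RandomForestClassifier(
            n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    X, y = make_synthetic_flows()
    for name, acc in compare_classifiers(X, y).items():
        print(f"{name}: {acc:.3f}")
```

Accuracy on this synthetic data says nothing about real traffic; the 98.92% figure in the abstract comes from the author's own datasets and evaluation.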

Files

Original bundle

Name: thesis_sci_2022_nyathi miranda.pdf
Size: 5.67 MB
Format: Adobe Portable Document Format

Collections

Masters