Abstract
Certificate misissuance is a growing problem in the context of
phishing attacks, because inexperienced users are more inclined to trust
fraudulent websites that present a technically valid certificate. Certificate
Transparency (CT) aims to increase the visibility of such malicious
actions by requiring certificate authorities (CAs) to log every certificate
they issue in public, tamper-proof, append-only logs. This work introduces
Phish-Hook, a novel machine-learning approach to detecting phishing
websites. Phish-Hook analyses certificates submitted to the
CT system using a conceptually simple, well-understood classification
mechanism to assess the phishing likelihood of newly issued
certificates. Phish-Hook relies solely on CT log data and forgoes intricate
analyses of websites’ source code and traffic. As a consequence, we are able
to provide classification results in near real-time and in a resource-efficient
way. Our approach advances the state of the art by classifying websites
according to five incremental certificate risk labels instead of
assigning a binary label. Evaluation results demonstrate the effectiveness
of our approach, which achieves a success rate of over 90% while requiring
less, and less complex, input data and delivering results in near real-time.
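
To illustrate what grading on five incremental labels rather than a binary verdict can look like, the following is a minimal sketch, not the Phish-Hook implementation. It assumes a classifier has already produced a phishing likelihood in [0, 1] for a certificate. The label names (`level-1` to `level-5`), the evenly spaced thresholds, and the function `risk_label` are hypothetical; the abstract states only that five incremental labels are used.

```python
from bisect import bisect_right

# Hypothetical label names, ordered from lowest to highest risk.
# The abstract does not name the five labels.
RISK_LABELS = ["level-1", "level-2", "level-3", "level-4", "level-5"]

# Hypothetical, evenly spaced cut-offs between adjacent labels.
THRESHOLDS = [0.2, 0.4, 0.6, 0.8]


def risk_label(likelihood: float) -> str:
    """Map a phishing likelihood in [0, 1] to one of five ordered labels."""
    if not 0.0 <= likelihood <= 1.0:
        raise ValueError("likelihood must lie in [0, 1]")
    # bisect_right counts how many thresholds are <= likelihood,
    # which is exactly the index of the matching label.
    return RISK_LABELS[bisect_right(THRESHOLDS, likelihood)]
```

A binary classifier would correspond to a single threshold; keeping four cut-offs preserves the ordering information, so downstream consumers can decide for themselves which levels warrant action.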
Original language | English |
---|---|
Title of host publication | 15th EAI International Conference on Security and Privacy in Communication Networks |
Publisher | Springer |
Publication status | Published - 23 Oct 2019 |
Event | 15th EAI International Conference on Security and Privacy in Communication Networks, Orlando, United States (23 Oct 2019 → 25 Oct 2019) |
Conference
Conference | 15th EAI International Conference on Security and Privacy in Communication Networks |
---|---|
Abbreviated title | SecureComm 2019 |
Country/Territory | United States |
City | Orlando |
Period | 23/10/19 → 25/10/19 |