Thursday, January 4, 2018

Using Deep Learning to Examine Browser-Based Malware - 1/4/18



We're almost done with our project! As we wrap everything up, we are deep into our final stage: using deep-learning-based software to detect browser-based malware. After examining numerous examples of browser-based phishing, we found that word frequency often correlates with the type of phishing. For example, if a site contains the word "account" many times, the phisher is likely trying to compromise your banking information. Consider this phishing site (Source: Google Images):
[Image: bank phishing site]

Here, the phisher attempts to recreate the Bank of America identity verification page, which would likely contain the word "account" many times. The same pattern also appears in phishing emails (Source: Google Images):
[Image: bank phishing email]
Here, "account" is once again the most frequent word. In this example, the phisher sends the victim an email impersonating Bank of America. In fact, the phisher even alters the email headers so that the victim will presume the email is legitimate.
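
To make the word-frequency idea concrete, here is a minimal sketch of how the words in a page or email could be counted. The sample text and the `word_frequencies` helper are made up for illustration; they are not taken from our project or from the images above.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Return a Counter mapping each lowercase word to its count."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

# Hypothetical phishing-style text, invented for this example.
sample = (
    "Your account has been locked. Verify your account now "
    "to restore access to your Bank of America account."
)

freqs = word_frequencies(sample)
print(freqs["account"])      # 3
print(freqs.most_common(3))  # the three most frequent words
```

A real page would first need its HTML stripped, and common words like "your" or "to" would probably be filtered out so that words like "account" stand out.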

For our machine learning approach, we use a logistic regression classifier to determine whether or not a site has been modified for phishing. Logistic regression models a dichotomous outcome (one with only two possible values) by fitting the model that best describes the relationship between that outcome and a set of independent variables. In this case, the dichotomous outcome is whether or not a site has been modified for phishing, and the independent variables are the frequencies of each word.
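
As a rough illustration of this kind of classifier (not our actual code), the sketch below trains a tiny logistic regression on word counts using plain Python. The vocabulary, the training texts, the learning rate, and all function names are assumptions made up for this example; a real classifier would use far more data and most likely a library implementation.

```python
import math
import re
from collections import Counter

# Hypothetical vocabulary: the words whose frequencies act as features.
VOCAB = ["account", "verify", "password", "meeting"]

def featurize(text):
    """Turn text into a list of counts, one per vocabulary word."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    return [counts[w] for w in VOCAB]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, x):
    """Estimated probability that feature vector x is phishing."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

def train(samples, labels, lr=0.1, epochs=500):
    """Fit weights with stochastic gradient descent on the log loss."""
    weights = [0.0] * len(VOCAB)
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = predict_proba(weights, bias, x) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias

# Toy, invented training data: 1 = phishing, 0 = legitimate.
data = [
    ("Verify your account now or your account will be closed", 1),
    ("Confirm your password to unlock your account", 1),
    ("Reminder: team meeting moved to Friday", 0),
    ("Notes from the meeting are attached", 0),
]
samples = [featurize(text) for text, _ in data]
labels = [label for _, label in data]
weights, bias = train(samples, labels)

p_phish = predict_proba(weights, bias, featurize("Please verify your account"))
p_legit = predict_proba(weights, bias, featurize("Agenda for the meeting"))
print(p_phish > 0.5, p_legit < 0.5)
```

The output of `predict_proba` is a probability between 0 and 1, so the two-outcome decision comes from picking a threshold (0.5 here).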

While many may object that a phisher would make the impersonated site identical to the actual site, this is still a valid approach. Most phishers include reassurance somewhere on their phishing site that the victim is doing everything "correctly". For example, the sentence in the email above, "After a few clicks, just verify the information you entered is correct", is actually quite common. For people who are not too familiar with phishing, such a statement may mislead them into thinking that they must retype their credentials multiple times because of heightened security measures after their account was supposedly compromised.

We're excited to see how our classifier will turn out! We'll update you shortly!

-James
