Thursday, January 18, 2018

Blog Questionnaire - 1/17/18

We recently finished our responses for the INSPIRE Shakespeare Blog Award. They helped us think about the impact maintaining this blog, with all of our dedicated readers, has had on our research. We thought it would be interesting to share our answers with you all as well, especially since these are joint answers with input from both of us. Thank you all for accompanying us on this amazing journey, and we hope you enjoyed reading about our research!

What have you learned by blogging about your research?
Blogging our research helped us thoroughly explore our topic while maintaining a strict schedule. We wanted to put out a post about once every week, so we mapped out our expected update schedule, which also let us schedule which sections of our paper we hoped to complete. These deadlines allowed us to comfortably finish our work, without fear of falling behind. Additionally, we learned to collaborate as we divided blog posts and ideas between us. By dividing up the blog posts, we were able to ensure both of us had a mutual understanding of the ideas discussed and to catch any misunderstandings or ideas we may have missed. Maintaining this blog also helped us think about how to present our research, especially in an informal, casual manner. This was especially useful when we approached potential mentors with our ideas. Finally, blogging helped us explore ideas, as we could evaluate what was important to put in our paper by considering feedback we received from readers. The topics we were able to explore thoroughly on the blog were ones we knew we could also delve deeper into in our paper, allowing us to publish a more in-depth analysis.

Why do you think you deserve to win the INSPIRE Blog Award?
Blogging was a new experience for both of us, especially since neither of us has ever maintained a blog before or worked in journalism. We used the blog uniquely as a tool to grow as writers and learn how to present our work. We also used the blog as a networking tool. Not only did we reach out to other potential competitors (who hopefully were INSPIREd by our topic), we also presented our blog to peers, our mentor, and others who gave insightful feedback. When creating the schedule of blog posts, we made sure that our blog holistically reflected our project, from milestone updates (https://hackingthemalware.blogspot.com/2018/01/placehold-11018.html), to simple techniques to avoid social engineering that we found while researching (https://hackingthemalware.blogspot.com/2017/12/happy-holidays-online-shopping-safety.html). Furthermore, our blog served as something of an official public record of our project, an important artifact to maintain in the midst of rapid technological advancement, especially with AI. We would love to receive this award as a bittersweet finale to our project, but most importantly, for documenting our findings to impact science, technology, and society.

How has your research experience shaped your career or academic aspirations?
A large part of the reason we decided to try humanities research was to compare the experience to STEM research, which we are familiar with (https://hackingthemalware.blogspot.com/2017/11/introduction-113017.html). One of our surprises was realizing that our philosophy research was as time-consuming and intensive as STEM research, if not more so. Through this research, we also found many bridges between humanities and STEM, and focused on how these two fields are interconnected. For example, in our research, we learned that the network security industry primarily focuses on creating firewalls. However, 95% of malware attacks are based on social engineering, which attacks the users themselves, not their systems. Hence, we found it much more important to protect consumers from a social angle rather than a technical one, even designing a browser-based phishing identifier using deep learning to accomplish this.

Humanities research with such strong connections to cryptography and machine learning has certainly opened a new door for us in thinking about computer science. We both intend to pursue CS, and our interest is stronger than ever, especially with the implications we have learned with this research. However, we aspire to continue exploring the humanities side of CS and technology as well, as our experience has shown us that looking at problems like network security from a different angle (predominance of viruses vs. social engineering) can show what would have a greater impact on society and what is more important to create.



Once more, thank you all for supporting us! We hope you enjoyed this insight, as well as our blog.

-James and Sohini

Wednesday, January 17, 2018

"Examining Ethical Issues with Malware" Wrapup and Reflection - 1/17/18

As we're currently polishing our final paper, Sohini and I are both close to achieving a new personal milestone in our academic journey: writing our first humanities research report. Throughout the two months of our investigation into ethical issues with malware, we have developed not only a deeper understanding and appreciation of the value humanities research brings, but also a new worldview for examining problems. Whereas both of us are used to crunching numbers and making graphs for research, our examination of ethics required extensive literature review, discussing ideas with each other, and asking those in our local communities for their opinions on our research topic to form a holistic evaluation.

Both of us found this research enjoyable, and a relaxing break from trying to model systems with challenging mathematical equations. This process also allowed us to gain a broader perspective on computer science and artificial intelligence; living in Silicon Valley, we are often caught up in the mentality that all types of technological innovation are beneficial, yet people in other places, even just outside the Bay Area, have starkly different views. Where people in Silicon Valley live off of innovation, those in rural areas may see it as a threat to their jobs and personal stability. From this research, we both learned an important lesson for our own future pursuits in computer science: the importance of the ethics that govern ones and zeroes.

The most challenging part of our research was trying to reconcile our different views through debate. By examining this topic through the Greek philosopher Pyrrho's lens, the boundaries between good and evil suddenly became indistinguishable. Pyrrho states that good and evil can only be assessed on a relative scale from an observer's perspective. In this case, the side making and distributing the malware can be seen as good or evil. If the observer were on this side, they could easily make the argument that distributing malware is analogous to distributing capital and making profit as income, whereas an electronics consumer sees these hackers as a threat to cyberspace.

All in all, Sohini and I learned a vast amount about the implications malware has in our digital age, and how quickly it might proliferate. With malware distribution techniques similar to those of a capital market, the future of malware in cyberspace can only be described as spontaneous and uncertain. However, through improving our current anti-virus and malware prevention software and systems, a safe interconnected world glimmers in the near future.

-James

Monday, January 8, 2018

The Research Process "Examining Ethical Issues with Malware" - 1/8/18

Examining Ethical Issues with Malware and Designing a Browser-Based Phishing Identifier using Deep Learning

This is the finalized title of our project. It's a comprehensive amalgamation of my personal interests - tech - with new areas I'm unfamiliar with - humanities. This is part one of two posts, where I will address our research process.

The first step we took after finalizing our topic was to create a very rough outline of the paper. Here's our first idea:

[REDACTED]
Of course, this has changed a considerable amount. Right after creating it, our first steps were to start completing some research pertaining to each section. I remember taking a red-eye flight to Boston (actually to visit MIT!) and instead of sleeping, just compiling links upon links of possible sites to pull information from. In fact, the end result was four entire pages of just links, single spaced. On the way back, I actually crawled through them, pulling information from the links. James and I compiled the information. We further solidified our sections by creating a set of tags for our research. Here are those tags:


The research we compiled and later edited and shortened now takes up nearly eight pages. It was interesting to learn about the beginnings of the Internet, and reading about the Morris worm, it seems incredible that one virus (launched out of MIT ;)) could infect 10% of all computers, a number that seems huge now. We found many more laws pertaining to malware, computer fraud, and phishing than we knew or expected there to be. I, personally, have been startled by the huge number of types of malware, enough to start taking a course on Internet safety (SecurityIQ at InfoSec Institute).

We condensed our research into three different viewpoints, and we explored case studies involving social engineering (see the next post). Finally, James and I discussed how we felt about malware and the ethics behind it, having studied and researched it thoroughly. We created a thesis, and then moved on to creating an identifier.

-Sohini

The Research Process "Designing a Browser-Based Phishing Identifier using Deep Learning" - 1/8/18

Examining Ethical Issues with Malware and Designing a Browser-Based Phishing Identifier using Deep Learning

This is the finalized title of our project. It's a comprehensive amalgamation of my personal interests - tech - with new areas I'm unfamiliar with - humanities. This is part two of two posts, where I will address our research process.

I have worked previously with network security and cryptography, taking a summer course and a later, more math-based course, on this subject, so I'm familiar with malware and the tech aspects of it. From Diffie-Hellman to El Gamal, the number theory behind malware has long intrigued me, but I was stunned when I learned that the vast majority of successful malware attacks come from social engineering. (This can be seen in both a positive and negative light - yes, firewalls are working, and our computer keeps out intruders. But that means the attacks on us are the successful ones.) Social engineering, as the name indicates, is a type of attack that relies heavily on human interaction, trying to trick people into allowing malware in. Examples of this are phishing emails (think spam filters), and people can learn to avoid infecting their computers through courses and by learning how to identify potential attacks. I became interested in learning about this other side of network security - this human side.

As we've mentioned before in this blog, James and I met at a summer research program, where we worked together in a machine learning lab (specifically, computer vision). Here, I first was intrigued by the beauty of artificial intelligence. The term used to conjure complex, even intimidating, images of thousands of lines of code and huge, clunky GPUs. While the latter is certainly true - I used my GPU over the summer as a footrest - the charm of AI and machine learning comes, in my opinion, from its simplicity in its similarities to humans and the way we learn, which is most often trial and error. Just as we learn through our mistakes, machine learning teaches computers to become accurate by adjusting their parameters as they measure their amounts of error.

So if people could be taught how to avoid social-engineering-based malware, could computers be taught this as well? After all, both are rooted in trial and error. To research this connection further, we decided to look at trends in social engineering, and specifically two: word frequencies, and image-to-word-count ratios. After looking through many papers, we found several addressing the most common words found in phishing emails, and several others discussing how social engineers coerced people into giving up their most valuable information. We also found PhishSim, offered through SecurityIQ at InfoSec Institute. This held a gauntlet of phishing email templates, which we then stripped to just uniform text and ran through a word frequency program.
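To give a feel for those two measurements, here is a minimal sketch in Python. It is not the program we actually ran; the function names and the sample template text are made up for illustration, and the image count is assumed to be supplied by hand.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each word appears in a template stripped to plain text."""
    # Lowercase everything so "Account" and "account" count as the same word.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def image_to_word_ratio(image_count, text):
    """Images per word of text; max(..., 1) avoids dividing by zero."""
    word_total = sum(word_frequencies(text).values())
    return image_count / max(word_total, 1)


# Hypothetical template text, not taken from PhishSim.
template = "Your account is locked. Verify your account now."
print(word_frequencies(template).most_common(3))
print(image_to_word_ratio(2, template))
```

The word counts and the ratio are the kind of numbers a classifier can take as input later on.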

While creating our templates, we began to look up which machine learning model to use. We whittled the possibilities down to just two. The first was a Naive Bayes classifier, which assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature. However, we realized that some words tend to appear together, like "free" and "money", but not "free" and "shipment". Our templates were also separated by type, an important variable that Naive Bayes would not consider. On the other hand, in a logistic regression, the outcome is measured with a dichotomous variable (one with only two possible outcomes). The goal is to find the best fit to describe the relationship between the outcome and the input variables. Our two outcomes would be phishing or not phishing, and we could input word frequencies and numbers (the image-to-word-count ratio) as characteristics of interest, which, combined, would teach a computer when to output what. We decided that logistic regression would be the best fit for our identifier.
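For the curious, here is a rough sketch of what a logistic regression on these kinds of inputs could look like, written in plain Python and trained by simple gradient descent. This is only an illustration under our own assumptions: the feature rows, labels, learning rate, and function names are all invented for the example and are not our real data set or final program.

```python
import math


def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Estimated probability that one example is phishing."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(samples, labels, lr=0.5, epochs=2000):
    """Fit weights by nudging them against the prediction error, one example at a time."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = predict_proba(weights, bias, x) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Made-up rows: [rate of "account", rate of "verify", image-to-word ratio]
samples = [
    [0.08, 0.05, 0.00],  # phishing-style text, no product images
    [0.06, 0.04, 0.01],
    [0.00, 0.00, 0.20],  # ordinary promotional email with many images
    [0.01, 0.00, 0.15],
]
labels = [1, 1, 0, 0]  # 1 = phishing, 0 = not phishing

weights, bias = train(samples, labels)
print(predict_proba(weights, bias, [0.07, 0.05, 0.00]))
```

An output above 0.5 would be read as "phishing" and anything below as "not phishing", which is the two-outcome setup described above.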

Future steps are outlined in our paper - creating a data set based on the templates and then programming the actual logistic regression. The finished product would be our Browser-Based Phishing Identifier using Deep Learning.

Creating this identifier has been incredible, to me, because of the intersection of my interests. Bridging computer science, cryptography, and artificial intelligence, there is also an element of humanities. Learning about new types of deep learning models (we used convolutional neural networks for computer vision) was a nostalgic callback to what I did over the summer, but also a strong step in continuing to learn about machine learning. I also learned about considering the human aspects, especially when creating this type of identifier. I had to learn to think like the user - what would be alarms in a phishing email? For example, the image to word count ratio was an idea of mine I didn't see in other papers, but for me, seeing a marketing email advertising a product but no product images would be a huge red flag. Especially looking through promotional emails, most don't even include more than 10 words of text. Considering this human part was something I enjoyed as well, and I look forward to more projects in the humanities.

See the previous post.

-Sohini

Thursday, January 4, 2018

Using Deep Learning to Examine Browser-Based Malware - 1/4/18



We're almost done with our project! As we begin to wrap everything up, we are now deep into our final stage: using deep-learning-based software to determine the presence of browser-based malware. After examining numerous examples of browser-based phishing, we found that there was often a correlation between word frequency and the type of phishing. For example, if a site contained the word "account" numerous times, the phisher's intent would likely be to compromise your banking information. Consider this phishing site (Source: Google Images):
[Image: bank phishing site]

Here obviously, the phisher attempts to recreate the Bank of America identity verification page, which would likely contain the word "account" numerous times. However, this is also present in emails (Source: Google Images):
[Image: bank phishing email]
Here, the word "account" appears the most once again. In this example, the phisher attempts to send an email to the victim impersonating Bank of America. In fact, the phisher even changes the email headers so that the victim would presume its legitimacy.

For our machine learning approach, we use a logistic classifier to determine whether or not a site has been modified for phishing. A logistic classifier's outcome is measured with a dichotomous variable (one with only two possible outcomes), using a best-fitting model to describe the relationship between the dichotomous characteristic of interest and the set of independent variables. In this case, the dichotomous characteristic is whether or not a site has been modified for phishing, while the independent variables are the frequencies of each word.
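Written out, the model described above takes the standard logistic regression form. The symbols here are our own labels for illustration: y = 1 means the site is phishing, x_1 through x_n are the word frequencies, and the beta values are the coefficients the model fits.

```latex
P(y = 1 \mid x_1, \dots, x_n) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_n x_n)}}
```

One common choice, which we assume here, is to flag a site as phishing when this probability is above 0.5.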

While many may object that a phisher would make the impersonated site identical to the actual site, this is still a valid approach. Most phishers provide reassurance somewhere in their phishing site that the victim is doing everything "correctly". For example, the line in the email above, "After a few clicks, just verify the information you entered is correct", is actually quite common. For people who are not too familiar with phishing, this statement may mislead them into thinking that retyping their credentials multiple times is due to heightened security measures after their account was supposedly compromised.

We're excited to see how our classifier will turn out! We'll update you shortly!

-James

Tuesday, January 2, 2018

Connotation of "Hacker" - 1/2/18

What image does the word "hacker" or "hacking" create in your mind? Perhaps you see someone desperately typing away at their computer, or think of the words "Access Granted."

Google Image Results for "hacker"


Hack first came to be associated with computers and machines at MIT itself. In a transcript from a meeting of the Tech Model Railroad Club in April of 1955, there is a quote that states: “Mr. Eccles requests that anyone working or hacking on the electrical system turn the power off to avoid fuse blowing.”

The word's connotation relative to machines started off positive. The term meant just working on a problem in a creative way, relative to MIT.

In the 1960s, the definition expanded out of MIT to computer scientists and engineers in general. In fact, it held positive connotations, as evidenced by the definitions for "hacker" in the Jargon File (launched in 1975). Here are the eight definitions:



The majority of the definitions given here are approving, like 4. "A person who is good at programming quickly." But the negative connotation in 8 seems to have won out in the long run. Especially in the media and outside of the tech world, "hacker" is used to imply malice. The first time the word "hacker" appears in Times, it reads: "Computer hackers often sell the stolen codes to other students for a few dollars."

The Computer Fraud and Abuse Act popularized this negative connotation as well, in a political sense (again outside of the tech world). It has been used in the prosecution of people like Julian Assange and Aaron Swartz.

However, its positive connotation lingers in tech culture, especially to identify others. The juxtaposition of the word's meaning proves to draw a sharp line between techies and those outside the tech world, and it will prove interesting to see how the word continues to evolve.

-Sohini

Thursday, December 28, 2017

Intersection of Philosophy, Ethics, Science, Technology, and Society - 12/28/17

We have begun wrapping up our initial background research component for this research and have begun examining the different viewpoints we have synthesized from other sources combined with our own opinions. Through extensive discussion and debate between me and Sohini, we decided to settle upon these three viewpoints/arguments:

Viewpoint I

Code, just like writing, is a form of free speech, an inalienable right everyone is entitled to. While some may argue that social engineers (or "hackers") who create malware are engaging in a form of coercion, or a duress crime that forces their victims to give up their information, this is, most of the time, not the case. Especially with phishing, users themselves have to give up their own information by typing it into some sort of web-based interface. This is no different than a stranger walking up to someone, requesting their bank account information, and having the victim comply, but of course, in a more concealed manner. Despite the fact that most social engineers try to make their platforms as similar as possible to the interface they are trying to recreate, the responsibility still lies with the user, as falling victim to phishing (or any sort of malware requiring the victim's intervention) is the result of his or her negligence. This same principle applies to malware that does not require human intervention. Just like in the real world, users in the virtual world must always be on the alert for threats, especially since the user is made aware of these threats through pre-installed anti-virus systems and constant reminders about security on the Internet, such as prompts to change your password every so often.

Viewpoint II

Users of the technology argue that the malicious intent to harm is what is bad. The initial purpose of the Internet was solely to transmit research data, and no one expected a closed internet to be infected with malware. Today, the Internet plays a prominent role in social globalization, and people rely on it for trade, education, socialization, and entertainment, among many other important aspects of human life. Unless one is browsing the Internet for the purpose of becoming infected with malware, people tend to assume they are immune to attacks. After all, most computers today come pre-installed with anti-virus systems. Should a user be affected by malware, it is the extensive work of a social engineer to break through existing security systems. As a result, users affected by malware would place all fault on the creator, as they were the ones who knowingly committed a crime. Many parallels can be drawn from this perspective. For example, if someone were hit by a stray bullet on the road, the person who shot it is at fault because they are aware that carelessly shooting may result in dire consequences, whereas the person shot would have assumed the road is safe and constantly monitored by law enforcement.


Viewpoint III

The manufacturers themselves are at fault. For example, iOS is only manufactured by the company Apple, which has a tight focus on security. If there’s “a malware threat to iPhones and iPads, Apple can blast out an update and, in theory, that’s the end of it” (Beres). In contrast, “if something goes wrong on Android, Google has to identify the problem and deliver a fix to manufacturers, and then those manufacturers have to beam that update to their customers” (Beres). The manufacturers may also be held accountable for educating users in malware prevention. Often, they do put in place firewalls and employ other cybersecurity and cryptography techniques to prevent specific attacks directly on the system.

From these three viewpoints, we see Viewpoint I as the most valid (despite it being ethically controversial :/). In the digital age that we live in today, there is no doubt that people are aware of the destruction that malware can bring to a computer, or even a whole network, as we are constantly kept current through the media and new security updates/patches on mobile devices and web applications. Of course, our perspective may also be the result of a biased lens from living in Silicon Valley. Our next step is to examine this issue and each of our perspectives through a philosophical lens, such as that of Nietzsche or Pyrrho, specifically on the ideas of good versus evil. Through a closer analysis from the perspective of these philosophers, we hope to shed new light on this topic, which has been constantly debated across the consumer electronics market.

-James

Sunday, December 17, 2017

Happy Holidays - Online Shopping Safety - 12/17/17

Holiday season is right around the corner! With an unprecedented number of expected online shoppers this year, it's important to remember that the Internet isn’t a safe place. While we can trust sites such as Amazon, Best Buy, Walmart, etc., scammers and hackers are always creating sites similar to these to steal our information. In fact, some even take it a step further. I was recently awaiting a $50 Amazon Gift Card in my email after filling out an online redemption form from a third party. Unfortunately, the third party’s system was compromised, and I received this in my inbox two days later:




Thanks to the filtering system implemented by Google, I was able to cautiously handle the email. Opening the link that supposedly led me to my “redemption code”, the following site showed up:




This person was deceptively smart. The phisher (someone who creates fake websites to steal information) was able to accurately duplicate the Amazon sign-in page. Even the tab title and logo were correct. However, what instantly raised a red flag was the URL, and that’s how I confirmed this site was not genuine.


This is a prime example of what is known as phishing, which is defined as the practice of sending a fraudulent offer through electronic transmission, supposedly holding reputable content, in order to induce individuals to reveal personal information. Phishing nowadays usually starts with an email to the victim, claiming the victim has a reward to redeem (like the one shown), or that their account (e.g. social, bank, etc.) may be at risk. Then, the victim is led to a site that is identical to what they would normally see, except the URL is wrong. If the victim enters their credentials, that information is sent to the phisher, and the victim is usually redirected to the official site login (and the victim will likely assume this was a glitch). The phisher then uses that information to compromise the victim’s identity, and possibly sends the phishing site to the victim’s connections through his/her account.
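Since the URL is the giveaway, here is a small Python sketch of the kind of check I did by eye: compare the host name in the link against the domain you expect. The function name and the example links are hypothetical, and this is only one simple check, not a complete phishing detector.

```python
from urllib.parse import urlsplit


def host_matches(url, trusted_domain):
    """True only if the link's host is the trusted domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    trusted = trusted_domain.lower()
    return host == trusted or host.endswith("." + trusted)


# Hypothetical links for illustration.
print(host_matches("https://www.amazon.com/ap/signin", "amazon.com"))
print(host_matches("https://amazon.com.signin-rewards.example/ap/signin", "amazon.com"))
```

The second link contains "amazon.com" but actually belongs to a different domain, which is exactly the trick to watch for.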

While we typically take the utmost precautions when logging into our financial accounts, these precautions have to be taken everywhere. Many people have one password for everything, and with one account compromised, it doesn’t take too much effort for the phisher to steal the victim’s identity. Here are some tips to stay safe during this holiday season:


  • Never leave your password in plain view, especially in a public location
  • Always log out when you are done, and make sure you are logged out
  • Whenever possible, put in as little personal information as possible. That way, if someone were to compromise your account, it is less likely they would be able to compromise other information about you.
  • Use different passwords for every account
  • Don’t transmit personal information and passwords over the internet



We wish you all a happy and safe holiday season! We’ll be updating our research progress sometime around the New Year!

-James


Tuesday, December 12, 2017

Finding a Mentor - 12/12/17

Finding a mentor for this project was an interesting journey. We first created a spreadsheet online, posting links leading to staff pages of the humanities departments at several universities. Then we painstakingly combed through each to find potential mentors, reading their papers and bios to determine fit. Finally, we had a comprehensive list of possible mentors, whom we then reached out to via email. We gave them a brief description of our idea and what we expected to do, then explained why we would like them specifically for our project. Each professor we reached out to was selected for a unique, individual reason why they would be a huge asset for our project.

Most professors, around 80%, replied, but most of the replies were negative as they were busy. However, we did receive several positive responses, with those professors either eager to be our mentor or interested in learning more.

We finally decided on Dr. Joshua Cohen. As a political theorist, he is an amazing fit for our project, which partially deals with political philosophy. As a member of the faculty at Apple University and a Senior Director at Apple itself, he would also be well versed in the computer science aspects of our project. We are very excited to work with Dr. Cohen.

For more information on Dr. Cohen, please see: https://en.wikipedia.org/wiki/Joshua_Cohen_(philosopher).

Official title:
Joshua Cohen
Senior Director, Apple, Inc.
Distinguished Senior Fellow, University of California, Berkeley
Editor, Boston Review
Emeritus Professor, MIT
Honorary Emeritus Professor, Stanford University



-Sohini

Saturday, December 2, 2017

Phishing Opinion Survey - 12/2/17

I decided to conduct a survey among students in my school to get an idea of their opinions about phishing and who is at fault if a user falls into a social engineering trap, and the results were quite interesting. Feel free to tell us your thoughts as well!

I surveyed 10 high school students, half experienced with technology and half inexperienced, and asked them three questions:
- What is your experience and comfort level with computers?
- If a computer virus infected your computer, who is at fault?
- If someone slipped at a restaurant because the floor is slippery, who is at fault?

Those who stated they had experience with computers also stated that the user is at fault for negligence. Those who did not have extensive experience with computers stated that the person who created the malware is at fault for unethical practices. This is an interesting result, as it shows that people with considerable experience with computers may be aware of measures a user can take to prevent getting hacked and becoming the victim of malware. Additionally, it may show that helping employees become more comfortable with computers may lower their risk of falling for malware, as they can detect deviations and social engineering more easily.

However, both groups stated that in a restaurant, the restaurant would be at fault for a risk not assumed by the customer. This could translate to consumers blaming companies who create technology if they fall for malware. They could claim that it is the fault of the creator (the restaurant), for leaving a potential risk (slippery floor) that does not have to be assumed by the user (the customer).

These are definitely some viewpoints we are excited to consider and develop in our paper.

-James

Thursday, November 30, 2017

Introduction - 11/30/17

We've been working on this project for a month now, so perhaps this post is rather belated. However, I thought an interesting first post would be an introduction to us and our project.

James and I met at the Research Mentorship Program at the University of California, Santa Barbara, so we have some research experience. We were in the same project, conducting research on computer vision at the MIRAGE Lab. I specifically worked on image classification, while he explored loss functions.

Artificial intelligence was an interesting field for us to work in, as a growing field in computer science. Both of us have extensive backgrounds in CS, so we were drawn to this lab when we were choosing projects at RMP. Outside of this program, we've both explored several of computer science's multitude of subfields. For example, I once took a summer course through Stanford on cryptography and cybersecurity. Here, I was exposed to the mathematical side of encryption and malware.

We both thought it would be interesting to explore the other side of this field, and see computer science from a humanities and ethical standpoint. For one, we hope this will allow us to gain a more thorough and comprehensive understanding of CS. Additionally, both of us have participated in other science fairs focused on STEM. I've done the Synopsys Championship and the California State Science Fair, while James is affiliated with the Alameda County Science and Engineering Fair. We both competed in and were successful in the Siemens Competition in Math, Science & Technology. For us, a humanities, arts, and social sciences research competition is an interesting and attractive avenue to pursue to broaden our horizons. We both hope to gain a new appreciation for research in these fields, by trying it ourselves.

Our project explores the ethical dilemma behind malware, specifically malware based on social engineering. These attacks cannot be prevented mathematically using cryptography or firewalls, and they are behind 95% of successful malware attacks, according to our research. As computer science evolves, so will these types of attacks, and we find it paramount to investigate this issue from an ethical and philosophical standpoint, to better advise both governments and users of susceptible technologies on how to deal with social engineering. We also hope to trace patterns in this type of malware, and use the trends that emerge with artificial intelligence and machine learning to detect and advise against attacks.

We're excited to see where this research takes us, and we look forward to exploring research in ethics and philosophy.

-Sohini