Last week I gave a presentation on "Web Application Vulnerabilities" as part of our weekly Dev meetup at ThoughtWorks. The presentation was aimed at covering some vulnerabilities and risks that plague Web based applications, and to make folks aware of risks and possible mitigation options. In specific, topics covered were Phishing, Social Engineering, Cookies and Cross Site Request Forgery (CSRF or XSRF). As a continuation to the presentation, I felt that writing a blog on it would crystallize the information I had collected and make it available to a larger audience.
Note: As per conventions, Malice = Malicious user. Alice = Unsuspecting user.
Phishing: Phishing is a form of Social Engineering attack, where-in Malice tries to trick people into handing over their sensitive data and passwords. The premise is that its much easier to trick a person into giving their information, than attempting to hack a website. So, for instance -- if on an IM chat, someone shares a link to a Photo sharing web site, which directs a user to a login screen that looks like Google Accounts login page -- with a message "Please enter your Google credentials" to login --- and Alice enters her Google password, she would have exposed her sensitive data to someone else! This is called a Phishing attack. The internet is strewn with sites explaining how to create such fake websites. As an example check out this Link. The author of this page, has listed steps on how to create fake pages for a website, and then at the end has attached a zip containing pre-created fake pages of many popular websites for download!
So, as Web Application developers, it is imperative that we are aware of such risks, and can help our customers prevent Phishing vulnerabilities on their sites -- because information on hacking, phishing and other attacks is now easily available to any malicious attacker.
To address the problems of Phishing, one can develop solutions along two lines:
- Prevention: Solutions to help users of a website identify a Phishing attack, or to make it difficult for an attacker to create a Phishing website.
- Protection: Solutions and strategies around Protecting user data, money, or other sensitive information, even if user information like login credentials have been compromised via a Phishing attack.
- SSL/Digital Certificates: Whenever a user visits an SSL secured website (i.e. over an HTTPs connection) on their browser, some browsers will display the name of the organization that owns the website by inspecting the digital certificate presented by the site. Also, if the certificate has issues, is expired, etc, then the browser pops up a warning. If a user checks the PAD lock icon to ensure the authenticity of the website, then the user can be sure he is not submitting his data to some malicious user. All websites of any standing will use SSL/Digital Certificates nowadays to help users identify their websites. This has an additional advantage that all communication between the browser and the user is encrypted -- to avoid a man-in-the-middle attack. So, this strategy is a "Prevent" strategy -- though, unfortunately not sufficient enough. Most successful Phishing attacks have targeted sites that are SSL enabled, and most naive (and sometimes even smart) users do not check to see if the page is SSL enabled or not - leave alone verifying certificate owner, etc. Furthermore, getting a Signed Certificate for a website isn't complicated or expensive, and hence even though the page is served over an encrypted connection, its "destination" could very easily be malicious. Hence, SSL based security is good, but no longer sufficient to prevent Phishing attacks.
- Personal Image or Message: The idea here is to display a user-selected image or message on sensitive transaction pages. For instance, a Bank may display a user-chosen image (or from a set of pre-selected images), on the screen where user enters his transaction password or his profile data. This way, for a Phishing site, it would be difficult to have a large attack vector -- since it needs access to each user's personal image or message. Quite a few banks, including "Verified by Visa" utilize this technique -- and is mandatory for all online credit card transactions in US.
- RSA or Two-Step Authentication: Most banks and websites which handle sensitive data usually resort to a Two Step Authentication mechanism, where one of the passwords is randomly generated and unique for every login. It is also known as "OTP" or "One Time Password". The idea here is to ensure that even if a user unknowingly enters his credentials on a Phishing website -- since the secondary password has been generated using RSA or some random password generator, it is not valid for the next login request. Hence, the "password" is only valid for one-time use, and is useless in the hands of an attacker. This is a very popular and effective strategy to defend loss of critical user data. Its downside is user inconvenience, since the users need to carry or have access to a Random password generating device or software. With the advent of smart phones, this has become less of an issue. To avoid user inconvenience, some websites -- like Facebook and ICICI, will request you to enter a OTP only if they detect a change in user's co-ordinates -- like change of MAC address (due to the user logging in from a different machine), geo location (different place), etc.
- Log Referral Websites: Every HTTP request comes with an attribute called the "Referer" which contains the name of the website that originated the request. For instance, if on attacker.com, there is a link to example.com, then the request received by example.com whenever Alice clicks on the link on attacker.com will contain its referer attribute set to "attacker.com". Similarly, if a website example.com contains images, then request for images to be displayed on the browser will contain referer as "example.com". It is quite likely, that a fake bank login page (on a Phishing Website) will redirect its user's to the original website after a user has entered his credentials on the fake page, so that the user does not realize that he had visited a fake website. If the original bank website, logs the "referer" attribute on all its incoming requests, then it can easily determine a phishing attack, when a large number of redirects are coming from a single website -- which isn't one of its own domains. The bank can also utilize this information to display a Warning message to its customers that they may have just been victims of a Phishing attack, and they should call help desk immediately to have their accounts secured. Unfortunately, again - this isn't a fool proof mechanism since many applications, proxies and firewalls can empty out the "Referer" attribute -- for user privacy, etc. Still, its a good idea to have referer logging in place in a website.
- Google Safe Browsing API: Google provides a Safe Browsing API that developers can leverage to check whether a given URL is considered unsafe, or a Phishing website. Google Chrome and Mozilla Firefox use the same data for warning their users about malicious websites while browsing. For example, this API may be used by a web site which allows people to post URLs on its web pages (like maybe a forum, etc). Since user's can post malicious URLs on the website, one can use Google's safe browsing API to check if a URL is marked as malicious or not in Google's database. There are two types of APIs available: one allows you to submit the URLs to a Google Service, and get a response which indicates each URL in the request is safe or not, or alternatively you can download the Malicious URL database, and perform a local comparison. For more details check out: https://developers.google.com/safe-browsing/
- Phishing Detection Browser Plugin: Some websites will allow a user to install a browser plugin that can detect whether the user is visiting a real banking website or not. This is not such a popular option, but still something that can be considered. There is a lot of research going on in the area of Fake website detection, so that one is able to detect whether a website is "similar" to another website. Originally, to detect "similarity", algorithms would compare HTML snippets of websites to see if they match, but newer malicious sites have been able to circumvent such detection -- by using Flash and Images to generate pages which look similar to the real website but contain totally different HTML snippets than the original website. So newer algorithms are being developed which perform image comparisons and use a weights based score to detect polymorphic Phishing. For reference check out: http://www.cs.rice.edu/~wx6/publications/phishing.pdf
Cookies: Developers should have good understanding of Cookies and their underlying behavior, since a lot of XSS and CSRF attacks on websites are built around exploiting the way Cookies are used by websites, and are managed by browsers.
Cookies were originally designed in 1995 by Netscape as a way of storing "Shopping Cart" information of a user on the browser side. Since HTTP is a stateless protocol, developers wanted a way of being able to recognizing a user without having to ask "Who are you?" on every page request, and also have the ability to store some user specific data on the browser. A cookie can be used by the server to ask the browser to store some information -- given a "Name" and a "Value", like "UserSessionID = 1234", and the browser will create a Cookie for the current website, put this information inside it and store it locally on the user's machine. When the next page request is sent by the user for the same domain, the browser will automatically submit all cookies to the server along with the request, so that the server can "identify" the user or use the data in the cookie to give a personalized experience to the user. In response, the server can ask the browser to update some cookie values, or create more cookies.
Some points to note about Cookies:
- A "Session Cookie" is a cookie whose lifetime is the "session" of the browser, and will be auto-deleted once the browser session is destroyed. This is the default behavior of a cookie, if the server has not specified Expiry or Max Age for the cookie. The point to note is that a browser session is destroyed only when the browser window is closed, and not when the browser Tab is closed. Hence, the cookie is still present, even if a user closes the browser's Tab. Websites which track user's session via cookies, will make use of Session Cookies to store user identifiers.
- A "Persistent Cookie" is a cookie whose lifetime is defined by the "Expires" or "Max Age" attribute of the cookie. Such cookies will be kept by the browsers until they expire. Even when the browser session is destroyed, these cookies will be stored on the user's machine, and the next time a user visits the same website, the browser will automatically submit the cookies to that website. A browsers can choose to delete a persistent cookie in case it needs space, or a user decides to remove all cookies from the browser, etc. Persistent cookies are used by websites to remember personalization data -- and also by the "Remember Me" functionality on many websites.
- A "Third Party Cookie" is a cookie whose "domain" is different from the "domain" being currently visited by the user. For instance, if a user visits www.example.com, and this website shows advertisement on the right side of the page, for which it creates an iFrame and makes a call to www.some-advertisement-server.com, then the Ad server may choose to send cookies for its domain, which will also be stored by the browser. Therefore on visiting one page, a user's browser now stores cookies for two different domains. This cookie, which has been stored by the Ad server is called a Third Party Cookie. The reason the Ad server chose to store the cookie is that, if the next time a user visits another website, which also has a tie-up with www.some-advertisement-server.com, then the cookies already stored for www.some-advertisement-server.com will be automatically submitted to the Ad server, and then Ad server will be able to give personalized ads to the user based on the knowledge that the user visited site A, and now site B. A Third Party cookie is also a type of Persistent Cookie, except that it was created by a different site than the one which the user visited. Third Party Cookies have raised a lot of Privacy concerns, and subsequently some countries have now come up with legislation which requires a website to get a user confirmation before storing any Third Party Cookie. Check out "Cookie Law" http://www.cookielaw.org/
- Multiple Tabs in a browser, will share the same cookies -- and therefore if a user visits the same website on two different Tabs, the same cookies will be shared by both Tabs. For instance, try logging into GMail from one Tab, and then access GMail from the other tab, and you will see that you are automatically logged into the same user session that you opened in a different Tab. This is because as far as GMail is concerned, its the "same" user, since the browser sends the "same" cookies for both Tabs. Interestingly, for most browsers, even if you open a new window, and enter GMail, you will again get logged into the same account, since for most browser's a New Window does not really mean a New "Session". But, if you open Google Chrome in Incognito Mode, then you get a different session from the non-Incognito Mode, which is why you can now login as a different user on GMail -- since Session cookies are not being shared across these two Sessions of Chrome.
- A cookie is only associated with a domain, not with a protocol or port. Therefore, a cookie which was created as part of an HTTP connection with the domain, will also be submitted for HTTPs, FTP, and other such requests to the same domain by the browser. If a cookie was created as part of a port 80 connection, or port 1000 or some other port, it will still be submitted to all requests to the domain -- irrespective of the port. Therefore, all websites on a domain must be trusted, since cookies from one will be accessible to the other as long as the domain name matches. If a router is compromised (DNS Spoofed), then browser will unknowingly submit cookies for a domain to a malicious server, since it would think it is submitting data to the right domain.
- Cookies can also be scoped to a sub-domain. For instance, bar.example.com can set a cookie for bar.example.com, or for example.com. If a cookie is set for a sub-domain, then it will only be sent for requests directed to that sub-domain. But, bar.example.com cannot set cookies for foo.example.com.
Detailed specification on cookie behavior can be read in RFC-6265
Interesting Side Note:
Besides cookies, there are other ways to identify a unique user (or shall we say browser). Check out: http://panopticlick.eff.org/ This website will tell you how unique is your browser's fingerprint, based on browser version, OS, patch level, fonts installed, screen resolution, plugin, and settings. It is quite surprising how easy it is to identify unique browsers based on these parameters.
Cross Site Scripting (XSS) and Cross Site Request Forgery (CSRF)
CSRF is a class of attacks where a Hacker utilizes the trust that the website has on a user's credentials (cookie), to make the browser request actions on the behalf of the trusted user -- which the trusted user would otherwise not authorize. For instance, let's say Alice is logged into her company website (alice-company-website.com) in the browser, and on a new browser tab, Alice is checking out a social networking site. While browsing the site, Alice visits Malice's user profile page, on which Malice has posted an image link such as
Then the browser will submit a request on Alice's behalf while rendering Malice's profile page -- which will send a request to the server to email the company confidential report to Malice's email address, without the knowledge of Alice. This happens because when the browser requests the page on alice-company-website.com, the browser will submit all cookies of Alice's company website, and the legitimate server will have no idea that the request was not really approved by Alice (but was instead forged by Malice on behalf of Alice). This form of an attack, where on visiting one website X, a request is forged and sent on another website A, while the user is logged into website A (or while the cookies for site A of the user are stored in the browser session), so as to execute an action not really authorized by the user is known as CSRF. A CSRF attack can also be done via an IM link, or an email link, as long as the link is clicked while the user is logged on to the legitimate site in the browser.
<img src="http://alice-company-website.com/mailReport.html?report=companyReport&mailIDfirstname.lastname@example.org" />
This form of vulnerability does have a small attack vector, because Alice needs to be logged into for the attack to be played out. It also requires that the website has non-confirmation based actions, like GET requests with side-effects which can be exploited by the attacker, since Alice would not be "authorizing" the request. Websites which have the "Remember Me" feature aggravate this form of attack, since the user's session containing cookie is stored for longer periods of time.
Protection options for Websites to avoid XSS and CSRF attacks:
- If a website has a 2-step authentication, or a transaction password for authorizing all sensitive actions, then it becomes quite difficult for an attacker to perform CSRF without involving the unsuspecting user.
- Ensure that there are no GET actions on the website, which have side-effects or modify the state of the user's session.
- Use CAPTCHA techniques for verifying sensitive actions.
- Encourage users to open a separate Browser "Session" (like Incognitio Mode of Google Chrome), for performing sensitive web interactions, and only work with one website at a time in one browser session. Make use of CSRF token technique to generate a unique URL (for GET Requests) or hidden parameter (for POSTs). In this technique, the server generates a unique Hash (http://en.wikipedia.org/wiki/HMAC) based on User's Session ID, Server Key, and URL Action and attaches that as part of the POST's hidden input field. This ensures that no one can sent a forged request, since in such a case, one will need to have access to the user specific hash generated by the server. For more details read: https://www.owasp.org/index.php/Cross-Site_Request_Forgery_(CSRF)_Prevention_Cheat_Sheet
Since a lot of the attacks exploit the fundamental design of how the Web works, it is important for all Web Developers to understand various risks, vulnerabilities and attacks which a website can face -- to be able to design systems which are safe and cannot be compromised easily.