Detecting Web Application Security Vulnerabilities
by Shreeraj Shah
Web Application Vulnerability Detection with Code Review
Web application source code, independent of languages and platforms, is a major source of vulnerabilities. One of the CSI surveys on vulnerability distribution suggests that 64% of the time, a vulnerability crops up due to programming errors and 36% of the time, due to configuration issues. According to IBM labs, every 1,500 lines of code may contain at least one security issue. One of the challenges a security professional faces when assessing and auditing web applications is to identify vulnerabilities while simultaneously performing a source code review.
Several languages are popular for web applications, including Active Server Pages (ASP), PHP, and Java Server Pages (JSP). Every programmer has his own way of implementing and writing objects. Each of these languages exposes several APIs and directives to make a programmer's life easier. Unfortunately, a programming language cannot offer any guarantee of security. It is the programmer's responsibility to ensure that his own code is secure against the various attack vectors an adversary may use.
On the other side, it is imperative to get the developed code assessed from a security standpoint, externally or in-house, prior to deploying the code on production systems. It's impossible to use only one tool to determine vulnerabilities residing in the source code, given the customized nature of applications and the many ways in which programmers can code. Source code review requires a combination of tools and intellectual analysis to determine exposure. The source code may be voluminous, running into thousands or millions of lines in some cases. It is not possible to go through each line of code manually in a short time span. This is where tools come into play. A tool can only help in determining information; it is the intellect--with a security mindset--that must link this information together. This dual approach is the one normally advocated for a source code review.
To demonstrate automated review, I present a sample web application written in ASP.NET. I've produced a sample Python script as a tool for source code analysis. This approach can work to analyze any web application written in any language. It is also possible to write your own tool using any programming language.
Method and Approach
I've divided my method for approaching a code review exercise into several logical steps with specific objectives:
- Dependency determination
- Entry point identification
- Threat mapping and vulnerability detection
- Mitigation and countermeasures
Dependency determination
Prior to commencing a code review exercise, you must understand the entire architecture and dependencies of the code. This understanding provides a better overview and a sharper focus. One of the key objectives of this phase is to determine clear dependencies and to link them to the next phase. Figure 1 shows the overall architecture of a web shop in the case study under review.
Figure 1. Architecture for web application [webshop.example.com]
The application has several dependencies:
- A database. The web application has MS-SQL Server running as the backend database. This interface must be examined when performing a code review.
- The platform and web server. The application runs on the IIS web server with the .NET platform. This is helpful from two perspectives: 1) in securing deployment, and 2) in determining the source code type and language.
- Web resources and languages. In this example, ASPX and ASMX are web resources. They are typical web application and web service pages, written in the C# language. These resources help to determine patterns during a code review.
- Authentication. The application authenticates users through an LDAP server. The authentication code is a critical component and needs analysis.
- Firewall. The application layer firewall is in place and content filtering must be enabled.
- Third-party components. Any third-party components being consumed by the application along with the integration code need analysis.
- Information access from the internet. Other aspects that require consideration are RSS feeds and email, that is, information the application may consume from the internet.
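As a first pass at the "web resources and languages" dependency, a short script can inventory the source tree by file type. The following is only a minimal sketch in Python: the function name `inventory_resources` and the set of extensions are assumptions chosen for this case study, not part of the original application or tool.

```python
import os
from collections import Counter

# Assumed set of extensions for an ASP.NET/C# code base; adjust as needed.
WEB_EXTENSIONS = {".aspx", ".asmx", ".cs", ".config"}


def inventory_resources(root, extensions=WEB_EXTENSIONS):
    """Return (file_counts, line_counts), each keyed by lowercase extension."""
    file_counts = Counter()
    line_counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            if ext not in extensions:
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going on unexpected encodings.
            with open(path, encoding="utf-8", errors="replace") as handle:
                line_counts[ext] += sum(1 for _ in handle)
            file_counts[ext] += 1
    return file_counts, line_counts
```

Calling `inventory_resources` on the root of the source tree gives a rough idea of how many pages, services, and lines of code the review has to cover, which helps in deciding where tools are needed most.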
With this information in place, you are in a better position to understand the code. To reiterate, the entire application is coded in C# and is hosted on a web server running IIS. This is the target. The next step is to identify entry points to the application.
Entry point identification
The objective of this phase is to identify entry points to the web application. A web application can be accessed from various sources (Figure 2). It is important to evaluate every source; each has an associated risk.
Figure 2. Web application entry points
These entry points provide information to an application. These values hit the database, LDAP servers, processing engines, and other components in the application. If these values are not guarded, they can open up potential vulnerabilities in the application. The relevant entry points are:
- HTTP variables. The browser or end-client sends information to the application. This set of requests contains several entry points such as form and query string data, cookies, and server variables (HTTP_REFERER, etc.). The ASPX application consumes this data through the Request object. During a code review exercise, look for this object's usage.
- SOAP messages. The application is accessible by web services over SOAP messages. SOAP messages are potential entry points to the web application.
- RSS and Atom feeds. Many new applications consume third-party XML-based feeds and present the output in different formats to an end-user. RSS and Atom feeds have the potential to open up new vulnerabilities such as XSS or client-side script execution.
- XML files from servers. The application may consume XML files from partners over the internet.
- Mail system. The application may consume email from mail systems.
These are the important entry points to the application in the case study. Using regular expressions, it is possible to search multiple source files for key patterns in how this submitted data is consumed, and then to trace and analyze those patterns.
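The regular expression approach can be sketched in Python as follows. This is an illustrative example, not the author's script: the names `ENTRY_POINT`, `find_entry_points`, and `scan_tree` are hypothetical, and the pattern covers only a few common ways C# code reads HTTP input through the Request object, so it would likely need extending for a real review.

```python
import os
import re

# Assumed pattern: common reads of HTTP input via the ASP.NET Request object,
# such as Request.QueryString["id"] or the Request["id"] indexer.
ENTRY_POINT = re.compile(
    r"\bRequest\.(?:QueryString|Form|Cookies|ServerVariables|Params)\b"
    r"|\bRequest\["
)


def find_entry_points(lines):
    """Yield (line_number, text) for each line that reads from Request."""
    for number, text in enumerate(lines, start=1):
        if ENTRY_POINT.search(text):
            yield number, text.strip()


def scan_tree(root, extensions=(".aspx", ".cs")):
    """Return {path: [(line_number, text), ...]} for files with matches."""
    results = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                hits = list(find_entry_points(handle))
            if hits:
                results[path] = hits
    return results
```

Each reported line is only a starting point. The reviewer still has to follow the matched variable through the code to see whether it is validated before it reaches the database, the LDAP server, or the page output.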