Tuesday, June 22, 2010

Top Threats to Cloud Computing

The CSA has recently issued a report called Top Threats to Cloud Computing in which they identify and discuss seven general threat areas:
  • Abuse and Nefarious Use of Cloud Computing
  • Insecure Application Programming Interfaces
  • Malicious Insiders
  • Shared Technology Vulnerabilities
  • Data Loss/Leakage
  • Account, Service and Traffic Hijacking
  • Unknown Risk Profile
No priority is implied in the ordering of the top threats; the advisory committee felt that further research and greater industry participation would be required to rank the threats. My view is that ranking is less important than applying a risk management discipline to the specific requirements of an organization considering cloud services.


As we consider the seven threats individually, we should keep in mind that the CSA considers this document as a first deliverable that will be updated regularly to reflect expert consensus on probable threats to cloud services:


Abuse and Nefarious Use of Cloud Computing
Because the business model of Cloud Service Providers (CSPs) is based on rapid scalability, they have emphasized ease of adoption. Therefore, in most cases anyone with a valid credit card can register for and begin using cloud services in a matter of minutes. In other words, an attacker can materialize inside your CSP's infrastructure at any time, including on the same physical hardware your cloud-based application is running on, and you need to be prepared. The best policy is one of calculated paranoia: assume your virtual environment includes all of your competitors as well as hackers, botnets, malicious users, clueless resource hogs, and other "nefarious users." Although as a user of cloud services you need to employ a layered defense strategy to protect critical resources, you also need to rely on your CSP's onboarding and technical surveillance processes: How effective is your CSP's registration and validation process for screening new users, and how well does its monitoring of internal traffic work?

Insecure Application Programming Interfaces

The same investor and market pressures that motivate CSPs to streamline the onboarding process also apply to how they support the configuration and use of their services by large numbers of users. The more these services can be enabled in a frictionless manner, the more profitable the CSP will be. Therefore, it's worth focusing on the APIs provided by CSPs for managing, deploying and maintaining cloud services. As the report points out, the "security and availability of general cloud services is dependent on the security of these basic APIs." Furthermore,
"From authentication and access control to encryption and activity monitoring, these interfaces must be designed to protect against both accidental and malicious attempts to circumvent policy."
One key question to ask: Does the CSP require use of X.509 certificates to access APIs? Besides being used to support the TLS protocol and WS-Security extensions to SOAP, X.509 certificates are used for code signing -- critical for secure use of APIs.
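To make that concrete, here's a rough sketch of what certificate-based API access looks like from the client side, using Python's requests library; the endpoint URL and file names are hypothetical placeholders, not any particular CSP's API:

```python
# Minimal sketch: calling a cloud management API over TLS with an X.509
# client certificate. The URL and file paths are hypothetical placeholders.
import requests

response = requests.get(
    "https://api.example-csp.com/v1/instances",   # hypothetical endpoint
    cert=("client.crt", "client.key"),            # X.509 client certificate and private key
    verify="csp-ca-bundle.pem",                   # validate the CSP's server certificate
    timeout=10,
)
response.raise_for_status()
print(response.json())
```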


It's essential that users understand the security model of the CSP's APIs, especially to ensure that strong authentication and access controls are implemented.

Malicious Insiders

The threat from malicious CSP insiders is one that organizations have always faced; the difference is that the insider was (and still is!) someone they know rather than someone they don't. An organization should compare its own policy regarding insiders with that of the CSP, ensuring that controls such as the following exist:

  • State of the art intrusion detection systems
  • Background check on new hires (where permitted by law)
  • Authorized staff must pass two-factor authentication 
  • Immediate deprovisioning of administrative access when the administrator no longer has a business need
  • Extensive background check of staff with potential access to customer data
  • All admin access logged and audited, with suspicious actions raising a real-time alarm
Organizations should require transparency of CSP security and HR practices as well as all compliance reporting, and should refer to controls such as those listed above as part of any legal agreement with the CSP.
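As a hedged illustration of the last control (the log format and the list of "suspicious" actions below are assumptions invented for the example, not any particular CSP's implementation), here's a minimal sketch of auditing an admin access log and raising an alarm:

```python
# Sketch: scan an admin access log and flag suspicious actions.
# The log format (timestamp, admin, action, target) is an illustrative assumption.
import csv
from datetime import datetime

SUSPICIOUS_ACTIONS = {"export_customer_data", "disable_audit_log", "create_admin"}

def alert(row):
    # Stand-in for a real-time alarm (e.g., page the on-call security team).
    print(f"ALARM: {row['admin']} performed {row['action']} on {row['target']} at {row['timestamp']}")

def audit(log_path="admin_access.log"):
    with open(log_path, newline="") as f:
        for row in csv.DictReader(f, fieldnames=["timestamp", "admin", "action", "target"]):
            ts = datetime.fromisoformat(row["timestamp"])
            after_hours = ts.hour < 6 or ts.hour > 20
            if row["action"] in SUSPICIOUS_ACTIONS or after_hours:
                alert(row)

if __name__ == "__main__":
    audit()
```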


Shared Technology Vulnerabilities

The foundation of the cloud service provider's business model is sharing of computing resources: CPU, memory, persistent storage, caches, and so forth. This sharing results in a multi-tenant environment, where great trust is placed in all virtualization technologies -- especially the hypervisors that enable sharing of server hardware. Hypervisors must effectively isolate multiple guest operating systems while ensuring security and fairness. The CSA paper lists five remediation tactics for shared technology vulnerabilities, but the fact that they're generic recommendations (implement security best practices, etc.) serves to reinforce the point that at the end of the day, we need to be able to rely on the assumption that the CSP employs a secure hypervisor.


One potentially useful resource is a recently-released vSphere Security Hardening Guide from VMware. Overall, the guide contains more than 100 guidelines in a standardized format, with formally defined sections, templates, and reference codes that are in alignment with formats used by NIST, CIS, and others. The guide itself is split into the following major sections:
  • Introduction
  • Virtual Machines
  • Host
  • vNetwork
  • vCenter
  • Console OS (for ESX)
While the document is mostly applicable to CSPs using VMware, many of the guidelines are generic and might apply to other hypervisors. In evaluating CSPs for shared technology vulnerabilities, it would be worthwhile to ask the CSP how they've incorporated applicable recommendations from the hardening guide into their environment.

Data Loss/Leakage

The concept of defense in depth, or a layered security strategy, comes into play when we consider the threat of data loss or leakage. All of the above threat vectors can result in data loss or leakage. Data encryption, then, becomes the last line of defense against the data loss threat.


While encryption is easy enough conceptually, in practice it's a challenge -- especially in a multi-tenant environment. The authors of Cloud Security and Privacy dedicated an entire chapter to Data Security and Storage (as I previously discussed here). In particular, the authors warn of CSPs that use a single key to encrypt all customer data, rather than a separate key for each account (see pg. 69). Best practices for key management are provided in NIST Special Publication 800-57, "Recommendation for Key Management"; your CSP should comply with it or follow an equivalent guideline.
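Here's a minimal sketch of the separate-key-per-account pattern using the Python cryptography package; the account IDs and in-memory "key store" are purely illustrative, since a real deployment would keep keys in an HSM or key management service per SP 800-57:

```python
# Sketch: one data-encryption key per customer account rather than a single shared key.
# The in-memory key store is a stand-in for an HSM or key-management service.
from cryptography.fernet import Fernet

key_store = {}  # account_id -> key (illustrative only; never hold keys like this in production)

def key_for(account_id: str) -> bytes:
    if account_id not in key_store:
        key_store[account_id] = Fernet.generate_key()
    return key_store[account_id]

def encrypt(account_id: str, plaintext: bytes) -> bytes:
    return Fernet(key_for(account_id)).encrypt(plaintext)

def decrypt(account_id: str, token: bytes) -> bytes:
    return Fernet(key_for(account_id)).decrypt(token)

# A ciphertext encrypted under one tenant's key cannot be decrypted with another tenant's key.
blob = encrypt("acme-corp", b"customer record")
print(decrypt("acme-corp", blob))
```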


Of course you should know whether your CSP uses standard encryption algorithms, what the key length is, and whether the protocols employed ensure data integrity as well as data confidentiality. And since encrypted data at rest can't be operated on without first being decrypted, you'll want to know whether memory, caches and temporary storage that have held unencrypted data are wiped afterward. The same set of questions (and answers) applies to data migration and to the processes by which failed or obsolete storage devices are decommissioned.
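As an illustration of the integrity point, an authenticated encryption mode such as AES-256-GCM detects tampering at decryption time. The sketch below uses the Python cryptography package with made-up data; it is not a description of any CSP's implementation:

```python
# Sketch: AES-256-GCM provides confidentiality plus integrity; any tampering with
# the ciphertext causes decryption to fail rather than return corrupted plaintext.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.exceptions import InvalidTag

key = AESGCM.generate_key(bit_length=256)   # standard algorithm, 256-bit key
nonce = os.urandom(12)                      # unique per message
aesgcm = AESGCM(key)

# encrypt(nonce, data, associated_data); the associated data is authenticated but not encrypted
ciphertext = aesgcm.encrypt(nonce, b"data at rest", b"account=acme-corp")

tampered = bytearray(ciphertext)
tampered[0] ^= 0x01   # flip one bit
try:
    aesgcm.decrypt(nonce, bytes(tampered), b"account=acme-corp")
except InvalidTag:
    print("integrity check failed: ciphertext was modified")
```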


Many regulatory frameworks focus on protecting against data loss and leakage. If you need to comply with PCI DSS or any other set of financial controls, you will need to ensure adequate threat protection, including encryption of data at rest.

Account, Service and Traffic Hijacking

In the online payment space there's a segment called "card not present" (CNP). That's analogous to cloud computing, where service is provided to a "user not present". All of the threats in an enterprise environment -- including phishing, fraud, shared or stolen credentials and weak authentication methods -- become magnified in the cloud. The remediation suggestions are fairly obvious: prohibit sharing of credentials; leverage strong two-factor authentication where possible; employ proactive monitoring to detect unauthorized activity; and understand CSP security policies and SLAs. I would add to CSA's recommendations that organizations should routinely check for excessive access rights to ensure there are no unused (and unmonitored) accounts that would be vulnerable to hijacking.
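As a hedged sketch of what proactive monitoring might look like (the log format and alerting rule are assumptions invented for the example), a simple job can flag logins from source addresses never before seen for a given account:

```python
# Sketch: proactive monitoring for account hijacking; alert when an account
# authenticates from an IP address never seen for that user before.
# The log format (timestamp, user, source_ip, result) is an illustrative assumption.
import csv
from collections import defaultdict

def detect_new_source_ips(log_path="auth.log"):
    seen = defaultdict(set)  # user -> set of source IPs observed so far
    with open(log_path, newline="") as f:
        for row in csv.DictReader(f, fieldnames=["timestamp", "user", "source_ip", "result"]):
            if row["result"] != "success":
                continue
            if seen[row["user"]] and row["source_ip"] not in seen[row["user"]]:
                print(f"ALERT: {row['user']} logged in from new address "
                      f"{row['source_ip']} at {row['timestamp']}")
            seen[row["user"]].add(row["source_ip"])

if __name__ == "__main__":
    detect_new_source_ips()
```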

Unknown Risk Profile

CSPs, hypervisor vendors, other cloud technology providers, application developers, security experts, and customers are all pushing the envelope when it comes to cloud services. The compelling economics of cloud services are driving adoption rates higher than is typical for new technologies. Altogether, this adds an element of technical uncertainty to the question of what the top threats to cloud computing are.


In general, a strategy of pragmatic paranoia is recommended. Be on the alert for the unexpected. Review logs, set up monitoring and alerting systems where practical, and re-evaluate the security implications of your cloud service periodically. Most importantly, select a CSP you can trust and back it up with a strong agreement specifying all areas of concern and including SLAs -- with penalties for non-compliance. 

Monday, June 14, 2010

The Growth of Web Services

eBay published their first web API in 2000. It took another 8 years to reach 1,000 APIs on the web; it took only 18 months to reach the next 1,000.

ProgrammableWeb was founded in 2005, when they tallied 105 APIs. The current count is 2,016 and the rate of new APIs is doubling year over year.


What segments account for these APIs? Social networking sites are high on the list, followed by mapping, financial, reference and shopping. The single most popular API is Google Maps, used in 1,978 mashups.


Even more dramatic are the stats for how often APIs are called. Here's the Internet's new billionaire club:








74% of the APIs are REST and 15% are SOAP; the remainder includes JavaScript, XML-RPC and AtomPub. Over the past two years the use of REST APIs has increased as an overall percentage of net APIs, mostly at the expense of SOAP. Another trend is the increasing use of JSON; 45% of all new APIs support JSON. And on the authentication front, OAuth continues to pick up steam as over 80 APIs now have OAuth support.


The web is evolving from providing access to information, to providing access to services, to providing access to complex services -- also known as mashups. The popular and somewhat trivial example is the number of sites that call the Google Maps API to show a map of their location. Links to Flickr, YouTube and Twitter are also popular. But what is the real business potential of these complex services?


APIs enable further leverage in systems development; that's why we can think of the web as a platform. Object-oriented software development is giving way to service-oriented architecture (SOA), which allows interfaces to be specified and the corresponding web services to be made available to any system with web access. This allows development organizations to focus on their core competencies, and leverage web services for the rest.
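As a toy illustration of that leverage, the snippet below consumes a hypothetical external web service in a few lines rather than building the capability in-house; the URL and response fields are invented for the example:

```python
# Sketch: consuming an external REST web service instead of building the
# capability yourself. The endpoint and response fields are hypothetical.
import requests

def geocode(address):
    resp = requests.get(
        "https://api.example-maps.com/v1/geocode",  # hypothetical service
        params={"q": address},
        timeout=5,
    )
    resp.raise_for_status()
    data = resp.json()
    return data["lat"], data["lng"]

print(geocode("1600 Amphitheatre Parkway, Mountain View, CA"))
```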


An example of how this is playing out is in monetizing the web. A new generation of web-based services has emerged, and many of these services are based on subscription revenue models rather than single transactions (aka shopping carts). Subscription billing is hard: while it's tempting for the many new developments in digital publishing, gaming, telecommunications, health care, consumer electronics and renewable energy to include a do-it-yourself billing system, there's no need to. Companies such as Zuora, Vindicia and Aria Systems provide sophisticated billing systems through APIs, offering advanced functionality such as currency conversion, tax calculation, invoicing, fraud control, collections, reporting and analytics for a fraction of the time and expense it would take to self-develop such capabilities. As we evolve towards a subscription economy with a variety of payment models, APIs providing web billing services will be leveraged to ensure secure, reliable billing.
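For illustration only (this is a generic, hypothetical REST call, not the actual API of Zuora, Vindicia, Aria Systems, or anyone else), creating a subscription through a billing provider might look roughly like this:

```python
# Sketch: creating a subscription via a hypothetical billing provider's REST API.
# The endpoint, fields, and auth scheme are invented for illustration.
import requests

payload = {
    "customer_id": "cust_1234",
    "plan": "pro-monthly",
    "currency": "USD",
    "payment_method": "card_on_file",
}
resp = requests.post(
    "https://api.example-billing.com/v1/subscriptions",
    json=payload,
    headers={"Authorization": "Bearer <api-key>"},
    timeout=10,
)
resp.raise_for_status()
print(resp.json()["subscription_id"])
```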

Tuesday, June 8, 2010

User Activity Monitoring

Gartner recommends that organizations implement user activity monitoring as part of a strategy to manage external and internal threats, and for regulatory compliance. Gartner suggests integrating Identity and Access Management (IAM) capabilities with a SIEM system to achieve user activity monitoring, but other approaches work as well, if not better, as I explain below.


Why is user activity monitoring needed? Since all major regulatory frameworks -- including SOX, PCI DSS, GLBA, and HIPAA -- require least-privilege access controls, thousands of companies are obligated to prevent excessive access rights and yet, according to Deloitte, have failed to adequately do so. The reason this is a hard problem has to do with the dynamic nature of the enterprise -- especially in an economic downturn -- with layoffs, restructurings, aggressive use of contractors and other service providers, along with the need for federated identity and access management as enterprises collaborate.

Conventional wisdom holds that the best practice for resolving this issue is to adopt an IAM system with role-based access control (RBAC) capabilities. Unfortunately, such systems provide no user activity monitoring or other assessment mechanisms and as a result are notoriously ineffective. While these systems ensure that only authorized users may log in to critical resources, they fail to consistently determine which users should be authorized to access those resources. As a result, as reported by a Dartmouth field study and by IDC, over-entitlement is the norm. In many organizations over 50% of access rights are dormant, representing a huge security vulnerability as well as a significant compliance exposure.

This is where user activity monitoring comes in. Organizations can assess user privileges, or entitlements, through user activity monitoring in order to identify excess entitlements. That few organizations do so is indicated by the high rate of audit findings for such access controls. Two additional methods of implementing user activity monitoring, besides the SIEM+IAM integration suggested by Gartner, are network-based activity monitoring and log-based activity monitoring.

Many organizations collect NetFlow data for IP traffic analysis, and some analyze this data for user activity monitoring. While NetFlow shows source and destination IP addresses and port numbers, it doesn't show authenticated user names or application names (applications can often be deduced from the destination IP address and port number, but it's practically impossible to link a source IP address to a user name). NetFlow is therefore inadequate in most cases for tracking user access to audited applications.

Some organizations have adopted a network-based user activity monitoring system that goes beyond NetFlow to record not just source and destination IP addresses, but also authenticated user names and the application accessed. While far superior to a NetFlow-only approach, network-based activity monitoring has several challenges:

  • Span port scarcity - span ports are used for a variety of applications, and without a network monitoring system such as one from Gigamon span port availability could be a constraint;
  • Span port data loss - most switches are vulnerable to packet loss on their span ports during peak traffic bursts. Even a data loss rate of under 1% can render such a solution inadequate for forensic purposes;
  • Application-side scalability - network activity monitoring requires a probe on every ingress span into the application infrastructure;
  • User-side scalability - a probe must be placed in every subnet with its own AD or other authorization system, which can make for a very expensive deployment in a distributed environment or one with many remote offices;
  • Encryption - as the percentage of encrypted sessions inside the data center increases, it leaves a larger blind spot for network-based approaches;
  • Technical challenges with today's DPI silicon in monitoring 10G links - the latest-generation network processors with DPI (deep packet inspection) capabilities can monitor 4-5 Gbps, far short of the 20 Gbps required for full-duplex traffic monitoring of a 10G link; and
  • No visibility to access from behind the monitored span port - network activity monitoring is blind to local access, e.g. from the application server's console port. It also can't see application-to-application access.
Despite these challenges, enterprises are deploying network-based access activity monitoring systems because they otherwise have no effective solution for preventing excessive access rights.

An alternative to network-based access activity monitoring is log-based user activity monitoring, also known as Identity and Access Assessment (IdAA), which does not suffer from the limitations and constraints listed above. Cloud Compliance, my prior company, read log files for audited applications in order to prevent excessive access rights and other access audit violations. The log-based approach precludes the need for hardware to be deployed, is scalable, detects 100% of access activity (regardless of encryption, 10G links, or the source of access) and, when deployed as a SaaS solution, eliminates the need for installation, software maintenance, and a large upfront capital outlay.
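Here's a minimal sketch of the log-based idea; the log layout, entitlement list, and 90-day threshold are illustrative assumptions rather than a description of Cloud Compliance's actual product. The point is simply to compare granted entitlements against the access activity observed in application logs and flag the dormant ones:

```python
# Sketch: log-based identity and access assessment. Compare granted entitlements
# against observed access activity and flag entitlements unused for 90+ days.
# The file layouts and threshold are illustrative assumptions.
import csv
from datetime import datetime, timedelta

DORMANCY_WINDOW = timedelta(days=90)

def dormant_entitlements(granted_csv="entitlements.csv", access_log="app_access.log"):
    last_seen = {}  # (user, application) -> most recent access timestamp
    with open(access_log, newline="") as f:
        for row in csv.DictReader(f, fieldnames=["timestamp", "user", "application"]):
            key = (row["user"], row["application"])
            ts = datetime.fromisoformat(row["timestamp"])
            last_seen[key] = max(ts, last_seen.get(key, ts))

    now = datetime.now()
    with open(granted_csv, newline="") as f:
        for row in csv.DictReader(f, fieldnames=["user", "application"]):
            key = (row["user"], row["application"])
            last = last_seen.get(key)
            if last is None or now - last > DORMANCY_WINDOW:
                yield key  # candidate for de-provisioning review

for user, app in dormant_entitlements():
    print(f"dormant entitlement: {user} -> {app}")
```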

Tuesday, June 1, 2010

Visualizing Security Metrics






This is the third and final post discussing Security Metrics: Replacing Fear, Uncertainty and Doubt by Andrew Jaquith. As I noted, Jaquith makes some intriguing and vital points about the need for "good" metrics and "serious analytic scrutiny" to inform executive decision-making on issues of security, compliance, and risk governance. This is an especially important topic today, with organizations everywhere trying to figure out how to stay secure and improve compliance while cutting their expense budget.

Most organizations, when considering appropriate investment levels to deal with risk, are not lacking for data. But lots of data does not equate to relevant information required for sound decision-making. Jaquith's point is that information in the form of metrics -- good metrics, which he defines -- is lacking in many enterprises.

But once good metrics have been defined, how are they communicated to stakeholders? Jaquith dedicates an entire chapter to visualization. He starts by listing his six design principles for visualization of metrics:

  1. It is about the data, not the design (resist urges to "dress up" the data)
  2. Just say no to three-dimensional graphics and cutesy chart junk (it obscures your data)
  3. Don't go off to meet the wizard (or talking paperclips)
  4. Erase, erase, erase (removing tick marks and grid lines results in a crisp chart with few distracting lines)
  5. Reconsider Technicolor (default colors are far too saturated, and should be muted. Consider a monochromatic palette)
  6. Label honestly and without contortions (pick a meaningful title, label units of measure, don't abbreviate to the point where the meaning is not clear)
Like me, Jaquith is an admirer of Edward Tufte, author of several books about information visualization including the classic The Visual Display of Quantitative Information (1983, Cheshire, CT: Graphics Press). According to Tufte, a key to effective visual displays is understanding the goal of your presentation. In Tufte's own words:
At the heart of quantitative reasoning is a single question: Compared to what? Small multiple designs, multivariate and data bountiful, answer directly by visually enforcing comparisons of changes, of the differences among objects, of the scope of alternatives. For a wide range of problems in data presentation, small multiples are the best design solution.
Hence, we have small multiples as a visualization strategy. Here's an example:



From this display, one can look at different categories (in this case, departments) to view comparative performance over time. One can readily imagine security/compliance applications for this approach, such as dormant accounts by resource, or excessive access rights by department.
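Here's a minimal matplotlib sketch of the small-multiples idea applied to that last example; the departments and dormant-account counts are made up:

```python
# Sketch: small multiples -- one mini-chart per department on a shared scale,
# so dormant-account trends can be compared at a glance. Data is made up.
import matplotlib.pyplot as plt

months = range(1, 13)
dormant = {
    "Finance":     [12, 11, 11, 10, 9, 9, 8, 8, 7, 7, 6, 6],
    "Engineering": [30, 29, 31, 28, 27, 25, 24, 22, 21, 20, 18, 17],
    "Sales":       [8, 9, 9, 10, 12, 13, 13, 14, 15, 15, 16, 17],
    "Operations":  [5, 5, 4, 4, 4, 3, 3, 3, 2, 2, 2, 2],
}

fig, axes = plt.subplots(1, len(dormant), figsize=(10, 2), sharey=True)
for ax, (dept, counts) in zip(axes, dormant.items()):
    ax.plot(months, counts, color="gray")
    ax.set_title(dept, fontsize=9)
    ax.set_xticks([])                       # "erase, erase, erase": drop ticks and grid lines
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
axes[0].set_ylabel("dormant accounts")
fig.suptitle("Dormant accounts by department, trailing 12 months", fontsize=10)
plt.tight_layout()
plt.show()
```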

In his book Beautiful Evidence (2006, Cheshire, CT: Graphics Press) Tufte introduces a refinement to this concept called the sparkline, which he defines as "small, intense, simple datawords". The example Tufte uses to explain the sparkline concept is a patient's medical data, taken from Beautiful Evidence:





Besides Tufte's small multiples and sparklines, Jaquith's visualization suggestions include indexed and quartile time-series charts, bivariate charts, period-share charts, treemaps, and Pareto charts. The key point is that no single graphic approach works in all cases; one needs to determine the essence of what is being conveyed. The audience almost always consists of busy people, often executives, who need to have information presented clearly and in context. It doesn't do anyone any good to point out after a security event that the "smoking gun" data had been seen, but was either lost in the noise of too much data or not clearly understood.
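To make the sparkline idea concrete, here's a minimal matplotlib sketch that renders a word-sized, axis-free trend line; the failed-login counts are made up:

```python
# Sketch: a sparkline -- a tiny, axis-free line chart meant to sit inline with text.
# The data (failed logins per day) is illustrative.
import matplotlib.pyplot as plt

failed_logins = [3, 2, 4, 3, 5, 8, 21, 6, 4, 3, 2, 3, 4, 3]

fig, ax = plt.subplots(figsize=(2.0, 0.4))                 # word-sized
ax.plot(failed_logins, color="black", linewidth=1)
ax.plot(len(failed_logins) - 1, failed_logins[-1], "r.")   # emphasize the latest value
ax.axis("off")                                             # no axes, ticks, or grid
plt.tight_layout()
plt.savefig("sparkline_failed_logins.png", dpi=150)
```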

P.S. It's not necessarily relevant to this post, but my favorite graphical display of quantitative information is an advertisement for one of Tufte's books that regularly appears in Scientific American and The Economist: