This is OLD (out of date) DRAFT text being worked on by the SNCTFI team in AARC NA3
For VERSION 0.2 see
https://docs.google.com/document/d/1WGcH3RqPpD_3usqL4sEq1S73iiF2-FUTiE6X0WlVYTo/edit?usp=sharing
Start with words from SCI document version 1
A Trust Framework for Security Collaboration among Infrastructures
http://pos.sissa.it/archive/conferences/179/011/ISGC%202013_011.pdf
(Copyright owned by the author(s) under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike Licence)
Workplan
Start with the part related to Incident Response.
Then move to Data Protection.
Afterwards, add the behaviour of the Proxy itself, the Community Attribute Authority and credential stores.
The Security for Collaborating Infrastructures (SCI) group is a collaborative activity of information security officers from several large-scale distributed IT infrastructures (DIIs), including EGI, OSG, PRACE, EUDAT, CHAIN, WLCG, and XSEDE. SCI is developing a trust framework to enable interoperation of collaborating DIIs with the aim of managing cross-infrastructure operational security risks. We also aim to build trust between the DIIs by developing policy standards for collaboration, especially in cases where we cannot simply share identical security policy documents.
SCI was created to help address several issues. Firstly, the WLCG infrastructure, which uses resources from several DIIs, including EGI, OSG and NDGF, found that agreeing on security policy documents was becoming more difficult. Different DIIs had similar policies addressing similar issues, but agreement on the exact wording was in many cases not possible. What was needed was a higher-level agreement on the types of security policies required and the issues they should address, rather than on the detailed policy wording. Secondly, the various operational security teams were finding that they needed to work together more and more often on shared security incidents. This collaboration was often successful, but all agreed that a common security policy framework would help enable a more formal trust environment. Finally, it was recognised that the adoption of a common security policy framework would also help facilitate interoperation between the DIIs in the sense of shared communities of users.
There are several series of national or international standards defining best practice in the management of Information Security, including ISO 27000 [1] and NIST 800-53 [2]. These standards provide extremely useful guidance for handling security within a single management domain, but they are of limited use for managing security across multiple DIIs, where each DII already consists of resources from many different sites, each with its own management. We are not aware of any other activity to define standards and a trust framework as presented here.
The detailed requirements for security differ between our DIIs, but nevertheless, in this SCI activity we are concentrating on those issues which are common to all infrastructures. The characteristics may differ and some issues may be more or less important, but all of them should be considered by every DII.
Many of the requirements expressed in this document are left deliberately vague: for example, we neither set minimum requirements for the content of policy documents nor define detailed procedures such as the time limit for security patching. Experience has shown that these are often the areas where DIIs differ, and that it works better to allow infrastructures the freedom to define the detailed procedures according to their own environment and needs. Future versions of the SCI document, which may always be found on our website [3], may well define some of these issues more tightly.
The SCI deliberations have been conducted in face-to-face meetings, by telephone or video conference, and by email. Most of our face-to-face meetings have been co-located with meetings of the International Grid Trust Federation (IGTF) [4], as it involves many of the same individuals. IGTF has also agreed to host our website. This is appropriate because it is a neutral, DII-independent location, and IGTF is, after all, in the business of building global trust.
The document presented here is written by our group but is not yet approved by the management of our infrastructures. SCI does not have the authority to force new policy standards on our parent infrastructures, nor can we commit effort to performing self-assessments. The next stage of the work will involve wider consultation with our DIIs to gather feedback on the SCI document and, wherever possible, to seek its endorsement. We aim to perform self-assessments of the extent to which our DIIs meet the documented requirements according to the maturity scale presented here. Based on the feedback received and the results of our self-assessments, it is likely that we will need to further refine the SCI document.
SCI is an open group. Other DIIs interested in contributing to the group's activities or using our documents and assessments are very welcome to join our deliberations.
The SCI document itself follows after the Glossary starting with the section: "Introduction".
The following terms are defined for use in the SCI document:
Infrastructure | All of the IT hardware, software, networks, data, facilities, processes etc. that are required to develop, test, deliver, monitor, control or support services. |
Distributed IT Infrastructure (DII) | An Infrastructure together with its management, Resource Providers and Service Operators. It provides, manages and operates (directly or indirectly) all the services required by the Resource Providers and their collections of users. |
Resource | The equipment (CPU, disk, tape, network), software, middleware and data required to run a service. |
Service | Any computing, storage, preservation, or software system which provides access to, information about or controls resources. |
Resource Provider | The smallest resource administration domain in a DII. It can be either localised or geographically distributed. |
Service Operator | An entity responsible for the management, deployment and operation of a service. |
Participant | Any entity providing, using, managing, operating, supporting or coordinating one or more service(s). |
User | An individual or an organisation who has been given authority to access and use resources. |
In recent years we have seen the implementation of a variety of infrastructures supporting distributed computing environments and sharing of resources. Each such infrastructure consists of distributed computing and data resources, users (who may be organised into separate user communities), and a set of policies and procedures. Examples of such infrastructures include computing grids and/or clouds, as well as cooperating computing facilities managed by different organisations.
Even when such an infrastructure considers itself to be decoupled from other infrastructures, it is in fact subject to many of the same threats and vulnerabilities as other infrastructures because of the use of common software and technologies. Moreover, there may be users who take part in more than one infrastructure and are thus potential vectors that can spread infection from one infrastructure to another. Finally, one infrastructure may want to extend rights to use its resources to users who are enrolled in a different infrastructure. In each of these situations, the infrastructures can benefit from working together and sharing information on security issues.
Security in a distributed collaborative environment is governed by the same principles that apply to a local centrally managed system, but complicated by the diversity of sites (both in terms of hardware and software systems and in terms of local policies and practices that apply), and by the lack of a centralized management hierarchy that can "order" certain operations to be performed in specific ways.
Governing principles include:
In this document we lay out a series of numbered requirements in six areas (operational security, incident response, traceability, participant responsibilities, legalities, and data protection) that each infrastructure should address as part of promoting trust between infrastructures.
To evaluate the extent to which the requirements described in this document are met, we recommend that each infrastructure assess the maturity of its implementation according to the following levels:
Level 0: Function or feature not implemented
Level 1: Function or feature exists, is operationally implemented but not documented
Level 2: Function or feature is comprehensively documented and operationally implemented
Level 3: Function or feature implemented, documented, and reviewed by an independent external body
We encourage openness and transparency in the documentation and for Levels 2 and 3 we recommend that wherever possible such documents should be made available to collaborating infrastructures as a way of promoting trust.
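As an illustration only, the maturity scale above could be recorded in a self-assessment along the following lines. This is a minimal sketch, not part of the SCI document: the names `Maturity`, `assess` and `should_share_documentation`, and the three yes/no inputs, are assumptions introduced here.

```python
from enum import IntEnum


class Maturity(IntEnum):
    """The four maturity levels listed above (member names are illustrative)."""
    NOT_IMPLEMENTED = 0      # Level 0: function or feature not implemented
    IMPLEMENTED = 1          # Level 1: operationally implemented, not documented
    DOCUMENTED = 2           # Level 2: comprehensively documented and implemented
    EXTERNALLY_REVIEWED = 3  # Level 3: also reviewed by an independent external body


def assess(implemented: bool, documented: bool, externally_reviewed: bool) -> Maturity:
    """Map three yes/no answers for one requirement onto a maturity level.

    Each level builds on the one below it, so documentation or review
    without an operational implementation still counts as Level 0.
    """
    if not implemented:
        return Maturity.NOT_IMPLEMENTED
    if not documented:
        return Maturity.IMPLEMENTED
    if not externally_reviewed:
        return Maturity.DOCUMENTED
    return Maturity.EXTERNALLY_REVIEWED


def should_share_documentation(level: Maturity) -> bool:
    """Levels 2 and 3 have documents that can be offered to collaborating infrastructures."""
    return level >= Maturity.DOCUMENTED
```

An infrastructure would call `assess` once per numbered requirement and keep the results per area; how the answers are gathered is left to each infrastructure.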
Retaining operational availability and integrity is the most urgent and visible aspect of security. Each of the collaborating infrastructures must therefore have the following:
The management of risk is fundamental to the operation of any Infrastructure. Identifying the cause of incidents is essential to prevent them from re-occurring. In addition, it is a goal to contain the impact of an incident while keeping services operational. To be acceptable, the response to an incident needs to be commensurate with the scale of the problem.
It is imperative that every infrastructure has an organized approach to addressing and managing events that threaten the security of resources, data and overall project integrity.
Each infrastructure must have the following:
The minimum level of traceability for the Infrastructure is to be able to identify the source of all actions (executables, file transfer, etc.) together with the individual[1] initiating the actions. In addition, sufficiently fine-grained controls, such as blocking the originating user, system or service and monitoring to detect abnormal behaviour, are necessary for keeping services operational. It is essential to be able to understand the cause and to fix any problems before re-enabling access for the user.
The aim is to be able to answer the basic questions "who, what, where, when and how" concerning any incident. This requires retaining, for each service instance and for every security event, all relevant information, including accurate timestamps and the digital identity of the initiator. The security events covered include at least the following: connect, authenticate, authorise (including identity changes) and disconnect.
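One possible shape for such a retained record is sketched below. It is an assumption made for illustration, not a format defined by SCI: the field names, the `TraceRecord` and `SecurityEvent` types, and the helpers `make_record` and `events_for` are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class SecurityEvent(Enum):
    """The minimum set of security events named in the text."""
    CONNECT = "connect"
    AUTHENTICATE = "authenticate"
    AUTHORISE = "authorise"  # includes identity changes
    DISCONNECT = "disconnect"


@dataclass(frozen=True)
class TraceRecord:
    when: datetime       # accurate timestamp, normalised to UTC
    who: str             # digital identity of the initiator (for a software
                         # agent, this must lead to a responsible individual)
    what: SecurityEvent  # which security event occurred
    where: str           # the service instance that observed the event
    how: str             # mechanism used, e.g. protocol or credential type


def make_record(who, what, where, how, when=None):
    """Build a record, rejecting values that would defeat traceability."""
    if when is None:
        when = datetime.now(timezone.utc)
    if when.tzinfo is None or when.utcoffset() is None:
        raise ValueError("timestamp must be timezone-aware")
    if not who or not where:
        raise ValueError("initiator identity and service instance are required")
    return TraceRecord(when.astimezone(timezone.utc), who, what, where, how)


def events_for(records, who):
    """Return one initiator's events in chronological order."""
    return sorted((r for r in records if r.who == who), key=lambda r: r.when)
```

Normalising every timestamp to UTC is a design choice of this sketch; it makes records from different sites directly comparable when an incident spans several infrastructures.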
Each infrastructure must provide the following:
All participants in a group of collaborating infrastructures need to rely on appropriate behavior by various actors in both their own and other infrastructures. We separate these responsibilities into behavior expected of:
Each infrastructure must ensure that the various participants are aware that they have these responsibilities.
Each infrastructure must provide:
A Collection of users is a group of individuals, organised around a common purpose, who are jointly granted access to the Infrastructure. It may serve as an entity which acts as the interface between the individual users and each Infrastructure. In general the members of the Collection will not need to separately negotiate with Resource Providers or Infrastructures.
Examples of Collections of users include: User groups, Virtual Organisations, Research Communities, Virtual Research Communities, Projects, Science gateways, and geographically organised communities.
Each infrastructure must have:
Collections of users must:
The Infrastructure must have policies and procedures in place to ensure that Resource Providers and Service Operators understand and agree to abide by expected standards of behaviour, including:
Infrastructures, resource providers, service providers and collections of users must have policies and procedures, appropriately communicated to all participants, that address legal issues including but not limited to the following:
Infrastructures, resource providers, service providers and collections of users must have policies and procedures addressing the protection of individuals with regard to the processing of their personal data (PII) collected as a result of their participation in the infrastructure, including but not limited to:
The authors acknowledge the support and collaboration of many colleagues in their respective infrastructures and the funding received by these infrastructures from many different sources.
These include but are not limited to the following:
EGI acknowledges the funding and support received from the European Commission and the many National Grid Initiatives and other members. The EGI-InSPIRE project is co-funded by the European Commission (contract number: RI-261323).
The Worldwide LHC Computing Grid (WLCG) project is a global collaboration of more than 170 computing centres in 36 countries, linking up national and international grid infrastructures. Funding is acknowledged from many national funding bodies and we acknowledge the support of several operational infrastructures including EGI, OSG and NDGF.
PRACE: The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreements n° 261557 and n° 312763.
We acknowledge the contribution of the CHAIN project (Grant Agreement n. 260011) co-funded by the European Commission under the 7th Framework Programme.
The Extreme Science and Engineering Discovery Environment (XSEDE) is supported by the National Science Foundation.
Fermilab: Work supported by the U.S. Department of Energy under contract No. DE-AC02-07CH11359.
[1] ISO 27000 series of information security standards. http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber=56891
[2] NIST 800-53 series of standards. http://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-53r4.pdf
[3] http://www.eugridpma.org/sci/
[1] For software agents initiating actions there must be a human individual responsible for all actions of the agent
[2] Examples include but are not limited to: Registration or renewal in a membership system, dynamic authorisation such as acquisition of VOMS attributes, authentication to a Science Gateway or portal, job submission or file transfer initiated by the Collection on behalf of an individual user