Description

First draft of TF-NOC survey

General survey input

The survey should be simple to complete, to increase the number of participants giving feedback.
It should be decided beforehand approximately how long the survey should take to complete. It should probably take at most one hour, if not less.
The survey should only include "heavy questions" (questions that need a lot of explanation to answer) if they are considered important.
The survey questions should preferably be multiple-choice questions (check boxes or drop-down lists).
The most important questions should be at the beginning of the survey (so that people who stop partway through still provide useful answers).

Functions that would be very useful:
It is important that people can pause the survey.
At the end of the survey participants should see info from other surveys (as a reward for completing the survey).
Include a progress meter.
The survey should not include too many questions on one page.
Maybe include an optional section with more questions.
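
The layout rules above (most important questions first, few questions per page, an optional section at the end, a progress meter) can be sketched in a few lines of Python. This is only an illustrative sketch, not part of any survey tool: the names Question, paginate and progress, the priority scale and the default page size of 5 are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Question:
    text: str
    priority: int           # hypothetical scale: lower number = more important
    optional: bool = False  # True for questions in the optional extra section


def paginate(questions: List[Question], per_page: int = 5) -> List[List[Question]]:
    """Split questions into pages of at most per_page questions.

    Required questions come first, ordered by priority, so that partial
    responses still cover the most important topics. Optional questions
    follow on their own pages at the end.
    """
    required = sorted((q for q in questions if not q.optional), key=lambda q: q.priority)
    optional = sorted((q for q in questions if q.optional), key=lambda q: q.priority)
    pages: List[List[Question]] = []
    for group in (required, optional):
        for i in range(0, len(group), per_page):
            pages.append(group[i:i + per_page])
    return pages


def progress(pages_done: int, total_pages: int) -> int:
    """Return the value for the progress meter as a whole percentage."""
    if total_pages == 0:
        return 100
    return round(100 * pages_done / total_pages)


# Example usage with made-up questions.
questions = [
    Question("Optional: which tools would you like to try?", 1, optional=True),
    Question("Hours of operation", 2),
    Question("Which services are covered by the NOC?", 1),
]
pages = paginate(questions, per_page=2)
print(len(pages), progress(1, len(pages)))
```

Keeping the optional questions on separate pages means a participant who skips that section never sees a half-optional page.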

NOC organisations, internal processes, workflows

Information organisations are interested in getting to know how other organisations are handling:

  • Network
    • Acquisition, management and operation of dark fibre.
    • Core equipment vendor/make, approximate numbers
  • NOC functions
    • Which services and incidents are covered by the NOC? How do you deal with requests/incidents that are outside your NOC's scope?
    • Services managed by the NOC.
    • What other services than the network does your NOC operate and how do they differentiate from the network service?
    • Connected networks: types of organization and approximate numbers of subscribers, peers and upstream IP providers
    • Roles and responsibilities (structure, who does what)
    • Responsibility of your NOC: how far does it go?
    • Operation of CERT team and security policies
    • What scope of services does the NOC support? Networking? Servers? Value added services like DNS, web hosting, etc.? HPC?
  • NOC structure
    • Staff organization, i.e. services support strategy (per person/team/NOC) with their respective pros/cons
    • NOC organization and procedures: Tier1-N functions, # of persons in shifts and in total, work quality control, peak hours
    • Organisation and roles of NOC members. Number of levels. Outsourced vs. staff-in NOC. Pros and cons. Functions. Experience with externalized NOC.
    • NOC activities/coverage and staff (network, services, community details)
    • NOC staff job description
    • How are staff hired for the NOC? Skillsets? Interview processes?
    • How operation is done outside working hours including weekends
    • How is the 24x7x365 NOC service guaranteed?
    • Distributed NOCs (as opposed to a NOC in a central location): pros and cons
    • How common experience is handled: specialized vs. universal team members
    • With R&E networking moving towards more regional connectedness between smaller networks, RONs, universities, etc., are you planning for a new phase in networking where specific operational differences between NOCs may negatively affect the reliability of the services provided? (For example: different response levels from hardware and fiber vendors, vastly different maintenance windows, NOCs not being 24x7, etc.)
    • Procedures. Optical NOC differentiated from IP NOC?
  • Outsourced services
    • Partially outsourced NOCs (24x7, weekends,...): what kind of information do you give them, how do they interact with you and your users?
    • Branding of an outsourced NOC
    • Good & bad experiences with outsourcing
  • Setting up and maintaining a NOC
    • Knowledge management and NOC staff training
    • Number (approx) of ticket mail lists the NOC is on, and approx number of tickets received on these lists
  • NOC assessment
    • NOC assessment and KPIs: measuring NOC performance
    • NOC key performance indicators: how to measure and improve them?
    • How do you assign value to what the NOC does? Do you cross charge for projects that will be handed over to the NOC? How do you prove to upper management that the NOC is good value for money?
  • NOC workflows
    • How do you deal with performance issues/incidents and e2e issues regarding traffic that flows through your network?
    • Do you have/use a fixed maintenance window?
    • What kind of routines are used in regards to change management (e.g. in planned maintenance)?
    • Procedures: presentation of the NOC's current procedures & who is responsible for maintaining them
    • Incident handling
    • (D)DoS: how to detect network attacks
    • Target repair times
    • How to work proactively in the NOC (active/passive monitoring)
    • How are new services and users provisioned?
    • How do NOCs source support and SLAs? What about penalty clauses? What's reasonable, what works and how are they enforced?

Information organisations are willing to share about their organisation:

  • Network
    • Infrastructure building and operations (dark fiber, net devices, server-rooms, POPs, servers)
    • Acquiring and managing dark fiber
    • Deployed network (Optical/Ethernet/IP)
    • Virtualization of the network.
    • Type of network. Equipment and type of lines.
    • Management of fibers - with segments from different providers
    • Network type (IP only, hybrid, switched/transmission only, LAN only)
    • CARNet network, users & Core/Access nodes - how to treat different users and different parts of network
  • Out-of-band access to our network.
    • Out-of-band systems (MRV using GSM/GPRS)
    • Reaching your network from anywhere (out-of-band, in-band and VPN)
  • NOC functions
    • Survey for our services, what to ask and how?
    • Services operated (from lambdas to large scale e-mail accounts)
    • Providing VPS service
    • Network services: IPv4/IPv6/multicast IPv4/multicast IPv6/VPNs/QoS/etc.
    • Responsibilities (e.g. monitoring, configuring/provisioning, direct or indirect repairs, DNS updates, RIPE updates, POP access and deliveries, reporting)
  • NOC structure (staffing, internal escalations, handover and schedule)
    • 24x7x365 service support
    • Hours of operation (office and on-call)
    • Pros and cons of dedicated NOC personnel
    • Tiers, and which tiers are in-house or outsourced
    • A network knowledgeable tier-one Service Desk that can assess network issues intelligently and can provide engineering with helpful information as they begin troubleshooting.
    • Structure and roles in a multi-layer NOC operating multi-type networks (campus network, international research network, internet exchange point)
    • Networks PIONIER (NREN) and POZMAN: technical structure (DWDM, switches, routers) and its influence on NOC structure; organizational structure: Layer 1, 2 (NOC) and Layer 3 (IP NOC); employees and their functions
    • Global Research NOC Service Desk provides 24x7x365 technical call center support, trouble ticket management, and workflow support. The service desk is housed in a state-of-the-art call center in Indiana University's Informatics and Communications Technology Complex (ICTC) on Indiana University's Indianapolis campus. This call center features a fully customizable 30-foot screen used to monitor and troubleshoot the current health of the various networks supported. We provide NOC support for the most advanced research and education networks in the country.
    • 24x7 NOC cover. Do you provide it? What SLA? How do you hire staff and plan rotas, etc.? Is there a real client demand for it?
  • Outsourced services
    • Managing an outsourced NOC
    • Transforming 2-layer network support into 3-layer support with outsourcing
    • Lessons learned from outsourced incident & change management
  • Setting up and maintaining a NOC
    • Knowledge sharing
  • NOC workflows
    • Procedures used in our NOC - the good and bad things
    • Workflows

NOC tools

What tools does your organization currently use?

  1. Monitoring tools:
    1. Network monitoring platforms (HP OpenView, Pandora,...)
    2. General monitoring (MRTG, Cacti, Nagios, ... with or without plugins)
      1. Weather maps
      2. Thresholds and alarms
    3. Infrastructure monitoring (Nagios, Ganglia, Zabbix,...)
    4. Diagnostic tools (ping, traceroute, mtr,..)
    5. Flow monitoring (NetFlow, cflow, sFlow,...)
    6. Syslog (logfile scanner, JFFNMS,...)
    7. Routing: route servers (zebra,...), BGPmon, looking-glass, ...
    8. Multicast monitoring tools: (dBeacon,...)
    9. Out-of-band access tools
    10. Network security
    11. Sniffing & analyzing (tcpdump, Wireshark,...)
    12. Topology documentation (Visio,...)
    13. Changes in network configurations (RANCID,...)
    14. Control of configuration files (Subversion,...)
  2. Multidomain tools
    1. Do you use any special tool for the monitoring of multidomain networks? Which one?
    2. Do you use any tool for the automatic configuration of end-to-end circuits across different networks? Which one?
  3. Reporting and statistics tools (Nagios, Zabbix,..)
    1. Who uses them?
  4. Ticketing tools (RT, Trac, Bugzilla,...)
    1. Who uses them?
    2. Do the users have access to the tickets?
    3. Are the tickets publicly available?
  5. Performance management
    1. Performance testing (iperf, NDT,...)
    2. Traffic generators (Bulk, MGEN,...)
  6. Chat/communication/coordination tools: IM, mailing lists, Skype,...
  7. Databases: (MySQL, Exchange,...)
    1. What information do you store on them?
    2. How do you connect them to your management tools?
  8. Knowledge management/documentation: (Plone, Wiki, ...)
    1. What kind of information do you store?
    2. How do you structure it?

The aim is to collect information about the advantages and disadvantages of tools that have already been tested by the community.

NOC front-end

Information organisations are interested in getting to know how other organisations are handling:

  • Escalation of incidents. How many levels?
  • User (customer) information exchange: channels and content
  • How to inform/communicate with customers
  • NOC to user communication (communication with more or less experienced users, how to keep user contact info up-to-date, personalized web interface for NOC to user communication)
  • How to minimize user calls (publicly available FAQ? tools? documentation?)
  • Serving user calls: how to automate it, and how to distribute calls among different NOC administrators
  • Authentication: how do you verify the caller / e-mail sender

Information organisations are willing to share about their organisation:

  • How many customers do you have? (depends on if you are counting persons, networks, hosts,..)
  • User management. How to approach our users, depending on the action (change, incident, request,..).
  • Coverage of user support (helpdesk)
  • Integration of network and telecom support
  • Service Desk & Automated reporting (per member)
  • SLAs with vendors and providers and type of contact.
  • Willingness to assist network customers with other services and requests outside of traditional front-line network operations, such as network redesign planning and utilizing our tool set.
  • Managing user interfaces (using people outside the NOC to aggregate user requests)
  • What kind of agreements do you have with service providers and customers (SLA,..)? Without financial info, just rough descriptions of responsibilities.
  • Experiences in joint management application design & deployment with a commercial company
  • Introduction of our customers to the outsourced NOCs: how to change their attitude

Efficient communication/collaboration tools

Information organisations are interested in getting to know how other organisations are handling:

  • NOC awareness and identification: internally and among user community
  • Relation and communication between the network operations team and other internal entities (design, deployment)
  • How do you integrate the different groups within your NOC (Service Desk-Tier One, Engineering-Tier Two & Three, Systems Engineering, etc) so each has clear expectations of what each group does, with all working as a unified team?
  • What type of training methodologies do you use to bring new employees up-to-speed quickly so they can start making a valuable contribution to the operation?
  • How is knowledge dissemination handled between groups and new NOC employees?

Efficient communication

  • Approach & tools: how the different departments & the NOCs work together

Best practice documents & documentation

Information organisations are interested in getting to know how other organisations are handling:

  • Documentation: what can and cannot be documented.
  • How procedures are created and stored
  • How do you organize and update your internal documentation environment for easy access and search, as well as providing clear and accurate information?

Information organisations are willing to share about their organisation:

  • Past horror stories: What you can try to avoid & Murphy's Law.
  • Efficient administration of many Linux-based servers
  • Procedures and documentation: what should be strictly put "on paper" (level of knowledge and procedures) from our experience.
  • Scripting & naming convention: making the network more consistent for the NOC
  • Keeping the quality of the NOCs in check over time: means & experiences

Liaising

  • International networking efforts