Careers

Transcend the day-to-day work experience. Work inspired.

Cloud Incident Manager

Location:

Lowell - Massachusetts - USA

Function:

Engineering

Ref #:

201701119

Corporate overview

You’re empowered when you’re a Kronite. 

Want to be part of an elite group of highly skilled professionals? We think our employees are a special group of talented, energetic, and innovative people. And for that reason, we refer to ourselves as Kronites. Kronites care about more than just work. We recognize the need to maintain a healthy work-life balance – to live inspired. In fact, it’s expected! You’ll soon learn that we take work and fun seriously. No matter what position you hold at Kronos, you’re a Kronite. And we want you to feel like you have the power to make a difference in your life and the lives of others, at work and beyond. 

Kronos is a global provider of workforce management and human capital management cloud solutions. Kronos’ industry-specific workforce applications are purpose-built for businesses, healthcare providers, educational institutions, and government agencies of all sizes. Tens of thousands of organizations, including half of the Fortune 1000®, and more than 40 million people in over 100 countries use Kronos every day.

Description

As a Cloud Incident Manager in the Cloud Service Operations group, you will be responsible for operational coverage of Kronos Cloud Support Operations and for managing incidents across all services that Kronos offers on multiple public and private clouds. Working as part of a multi-location, follow-the-sun team, the successful candidate will proactively work to ensure the integrity of Kronos’s SaaS services and will ensure that service-impacting events are managed using best-practice Incident Management techniques to restore services within SLA.
The Incident Manager acts as the Cloud Support senior leader, assigned to shifts and/or on call, to support Kronos Cloud. This leader is empowered to make decisions and assign resources across the PT&C and Support organizations to reduce customer impact and restore services as quickly as possible.
Principal Duties & Responsibilities:
• Quickly assess the severity of an outage in terms of business impact and technical complexity
• Ensure all appropriate groups are working to restore service in a timely manner
• Facilitate resolution through effective communication across multiple teams (using conference bridges and group chats)
• Notify, escalate and communicate to senior management the existence and status of outages, as necessary
• Accurately record all diagnostics, communications, and resolutions within Service Management toolsets
• Retain ownership of all major incidents through to completion irrespective of the assignment group
• Ensure Knowledge Base (KB) articles are kept current & contribute new KB articles based on known problems and their resolutions
• Liaise with the Business and Application owner during major incidents impacting particular business units
• Be the focal point while leading a major incident, up to and including interacting with Sr. IT Leadership as needed to help expedite incident resolution
• Provide technical direction and coordination to the resolver groups involved
• Provide support and participate in the change control process
• Provide appropriate inputs to the Problem Management process and to RCA preparation
• Develop an understanding of the organizational structure and infrastructure environment
• Handle conflict situations and make quick decisions while driving incidents
• Manage and drive third parties to the quick resolution of incidents

Qualifications

• 6-9 years of work experience, with 4+ years in a similar role within a large organization
• Have worked previously in a Command Center / NOC environment
• Must have experience with SaaS services
• Proven skills in collaborating with other teams, both technical and non-technical
• Ability to independently make decisions on complex issues by identifying & rationalizing risks
• Ability to quickly learn new technologies or procedures
• Ability to handle and problem-solve multiple issues
• Good understanding of SaaS support, covering infrastructure and application technologies
• Prior experience of sending out executive alerts/communication mailers
• Act with delegated authority of the IS Senior Leadership team in matters occurring outside of normal office hours.
• Ensure primary focus is given to the resolution of Cloud incidents and that each incident is prioritized according to its severity
• Act as the primary contact and authorized decision maker around invocation of Cloud Support Continuity arrangements where required.
• Ensure that the security implications of any incident, or of any decision taken within incident management, are properly considered and aligned with the Security Operations Center.
• Provide on-call and shift leadership support for all Cloud Support team and partner staff when required, advising and directing where appropriate through delegated authority from the Senior Leadership Team.
Education:
• Bachelor’s Degree or equivalent experience
• 5+ years of experience managing incident response in a mission-critical environment that uses multiple third-party suppliers, ideally one with a high degree of political and public visibility.
• Fluent writing and communication skills in English.
• Analytical ability to identify underlying issues from numerous sources.
• ITIL Foundations Certification

EEO Statement

Kronos is proud to be an equal opportunity employer and is committed to maintaining a diverse and inclusive work environment. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, physical or mental disability, age, veteran status, or any other basis protected by federal, state, or local law.