Application Reliability Engineer

Donnelley Financial Solutions Chicago, IL
Title: Application Reliability Engineer

Job ID: 129

Location: Chicago, IL, US

Category: Information Technology

Description:

Job Description

Donnelley Financial Solutions is seeking a strong general technologist to support our document engineering and communication platform. The ideal candidate is a self-starter who is comfortable with a variety of scripting and automation tools and who continually finds measurable opportunities to improve the platform. Work is typically performed without supervision, with guidance only about overall goals and objectives and minimal to no guidance about how to complete them. Must be able to define work based on an evaluation of the department's short-term and long-term goals. Must be able to independently evaluate processes, identify areas for improvement, and incorporate them into overall work objectives.

Responsible for understanding business and technical requirements and for monitoring and supporting application and infrastructure services. Evaluate, recommend, plan, and design solutions to improve application reliability. The scope of these tasks includes infrastructure, hardware, software systems, and middleware, with the objective of exceeding uptime, availability, and application performance SLAs. Interface with internal development teams and vendors to resolve hardware and software problems. Interface with internal development and tech-ops teams to ensure system runbooks are current and useful. Assist technical development in defining and developing solutions that can be monitored appropriately, are highly available, and are easy to troubleshoot, by providing guidance on the functionality and limitations of the various environments, platforms, and infrastructure systems used by the company.

* Responsible for working with the application development teams to develop engineering designs that support business and operational requirements in one or more of the following areas: network infrastructure, operating system environments, or enterprise storage and backup.

* Responsible for supporting the engineering design to meet the business and operational objectives for business continuity solutions.

* Responsible for implementing and overseeing monitoring solutions that ensure we are exceeding our SLA obligations.

* Responsible for incident management and facilitating escalation and troubleshooting.

* Analyzes system and/or network performance, suggesting modifications to improve throughput and use resources effectively. Monitors resource usage, making adjustments as required.

* With administrative direction and assistance from management, leads the evaluation and selection of tools. Sets up product demos and objectively evaluates tools. Coordinates the collection of feedback from other evaluators, summarizes the input, and makes cost-conscious tool recommendations. May be responsible for the effective deployment of tools in the group. Sets up training classes and may function as a tool administrator, working with the vendor and internally coordinating product upgrades and fixes. Oversees or performs the installation and testing of upgrades, working with the systems and applications teams.

* Participates in on-call Tier 3 production support activities. Is expected to have the technical knowledge and capability to handle most problems that may arise. Proactively puts procedures in place to prevent and reduce the severity of outages.

* Responsible for developing the organization by mentoring junior engineers and developing "how to" guides under the direction of the manager.

* Coordinates activities with vendors as required.

* Performs other related duties and participates in special projects as assigned.

* This position requires participation in an on-call rotation and after-hours support.

Required Skills

* The duties and responsibilities described above are the essential functions of the job. The qualifications below are representative of the knowledge, skills, and/or abilities required. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.

* Bachelor's degree in computer science, or a minimum of three years' experience as an application programming developer in design and construction using current web-based technologies.

* Knowledge of a range of platforms, including Linux

* Experience supporting production environments

* Ability to apply advanced troubleshooting techniques to provide unique solutions to our customers' needs

* Experience maintaining and supporting 24x7 applications

* Able to apply broad work experience and knowledge when analyzing complex problems. Must be able to consistently identify critical elements, variables, and alternatives to develop solutions. Must be able to organize and prioritize existing resources and incorporate new information, as needed, to implement the most effective solutions. Able to communicate clearly and courteously with those who need to know of decisions, actions, or problems. Able to apply collaborative skills when resolving problems.

* Requires excellent communication skills, with the ability to state messages clearly using language that is easy for others to understand. Able to explain programs, policies, and procedures in language that others understand.

* Must be able to adapt communication style, both formal and informal, to the level of the target audience. Requires a strong understanding of the impact of a message on the organization or customer. Able to write with the clarity and precision necessary for the work being performed.

* Experience with Windows Server management and configuration

* 3+ years' experience with scripting languages and tools, e.g., Python, Ruby, PowerShell, VBS, or C#

* 3+ years' experience with SQL queries and scripting

* Print & mail or manufacturing environment experience is a plus.