Senior Systems Engineer

Washington, DC
Full Time
Experienced

About The Britton Group

The Britton Group is a premier provider of intelligence and national security solutions, specializing in mission-critical IT services, enterprise digital transformation, artificial intelligence, full stack development, multimedia design, and advanced intelligence support.

With over 25 years of experience delivering innovative, secure, and agile solutions to the federal government, we are a trusted partner to the Intelligence Community.


The Opportunity

This position supports a mission-critical analytics and research platform built on a Linux-based high-performance computing (HPC) environment. The platform enables advanced statistical modeling and economic research across multiple business lines and federal stakeholders.

We are seeking a Senior Systems Engineer with deep expertise in Linux system administration, automation, and HPC environments. This role is ideal for engineers who excel in performance optimization, platform reliability, and supporting highly technical user communities such as data scientists and researchers.

You will be responsible for maintaining and enhancing a high-availability analytics platform, providing Tier 3 support, and collaborating with cross-functional teams to deliver scalable and secure solutions that meet evolving analytical demands.


Core Experience

Candidates should bring hands-on experience in:

Administering and maintaining Linux-based server environments in high-performance or enterprise settings

Managing high-performance computing (HPC) platforms, including workload scheduling and resource optimization

Using automation tools such as Ansible and Ansible Automation Platform for configuration management and system orchestration

Monitoring system performance, tuning resources, and ensuring high availability and reliability

Providing Tier 3 support for complex system and platform issues in production environments

Supporting analytical and statistical workloads using tools such as Python, R, MATLAB, Stata, or SAS

Collaborating with data scientists, analysts, and business stakeholders to translate requirements into technical solutions

Documenting system configurations, operational procedures, and troubleshooting methodologies


Additional Experience That Adds Value

Experience with HPC workload managers such as SLURM

Familiarity with platforms such as Open OnDemand for HPC user access

Experience supporting research, analytics, or data science environments

Strong background in system security, vulnerability management, and compliance

Experience with performance benchmarking and system optimization

Exposure to cloud-based HPC or hybrid computing environments


Demonstrated Expertise

Candidates should be able to clearly demonstrate:

Ability to manage and optimize Linux-based high-performance computing environments

Strong troubleshooting skills for complex, distributed systems

Ability to support data scientists and researchers running compute-intensive workloads

Strong communication skills and the ability to work across technical and non-technical teams

Proactive approach to system reliability, performance, and continuous improvement


Technical Stack

Strong experience with the following technologies is expected:

Linux (Red Hat, CentOS, or similar distributions)

Ansible / Ansible Automation Platform

HPC technologies (SLURM, Open OnDemand)

Shell scripting (Bash or similar)

Statistical and analytical tools (Python, R, MATLAB, Stata, SAS)

System monitoring and performance tuning tools


Responsibilities

Design, maintain, and optimize Linux-based HPC infrastructure supporting analytical workloads

Perform system updates, patching, and security hardening to ensure compliance and stability

Provide Tier 3 support for platform-related issues, ensuring minimal downtime and rapid resolution

Collaborate with stakeholders to align system capabilities with analytical and research needs

Implement and maintain security controls to protect sensitive data and meet regulatory requirements

Conduct system audits, vulnerability assessments, and performance evaluations

Participate in platform enhancement initiatives, including upgrades and new feature implementation

Contribute to system architecture design and long-term platform strategy

Participate in on-call rotation to support critical system availability


Education

Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related technical discipline is preferred. Equivalent professional experience will be considered.


Qualifications

Strong experience with Linux system administration and shell scripting

Hands-on experience with Ansible for automation and configuration management

Experience supporting high-performance computing environments

Strong problem-solving skills and a customer-focused mindset

Ability to work in an on-call rotation supporting mission-critical systems
