Senior Systems Engineer
About The Britton Group
The Britton Group is a premier provider of intelligence and national security solutions, specializing in mission-critical IT services, enterprise digital transformation, artificial intelligence, full stack development, multimedia design, and advanced intelligence support.
With over 25 years of experience delivering innovative, secure, and agile solutions to the federal government, we are a trusted partner to the Intelligence Community.
The Opportunity
This position supports a mission-critical analytics and research platform built on a Linux-based high-performance computing (HPC) environment. The platform enables advanced statistical modeling and economic research across multiple business lines and federal stakeholders.
We are seeking a Senior Systems Engineer with deep expertise in Linux system administration, automation, and HPC environments. This role is ideal for engineers who excel in performance optimization, platform reliability, and supporting highly technical user communities such as data scientists and researchers.
You will be responsible for maintaining and enhancing a high-availability analytics platform, providing Tier 3 support, and collaborating with cross-functional teams to deliver scalable and secure solutions that meet evolving analytical demands.
Core Experience
Candidates should bring hands-on experience in:
Administering and maintaining Linux-based server environments in high-performance or enterprise settings
Managing high-performance computing (HPC) platforms, including workload scheduling and resource optimization
Using automation tools such as Ansible and Ansible Automation Platform for configuration management and system orchestration
Monitoring system performance, tuning resources, and ensuring high availability and reliability
Providing Tier 3 support for complex system and platform issues in production environments
Supporting analytical and statistical workloads using tools such as Python, R, MATLAB, Stata, or SAS
Collaborating with data scientists, analysts, and business stakeholders to translate requirements into technical solutions
Documenting system configurations, operational procedures, and troubleshooting methodologies
Additional Experience That Adds Value
Experience with HPC workload managers such as SLURM
Familiarity with platforms such as Open OnDemand for HPC user access
Experience supporting research, analytics, or data science environments
Strong background in system security, vulnerability management, and compliance
Experience with performance benchmarking and system optimization
Exposure to cloud-based HPC or hybrid computing environments
Demonstrated Expertise
Candidates should be able to clearly demonstrate:
Ability to manage and optimize Linux-based high-performance computing environments
Strong troubleshooting skills for complex, distributed systems
Ability to support and enable data scientists and researchers with compute-intensive workloads
Strong communication skills and the ability to work across technical and non-technical teams
Proactive approach to system reliability, performance, and continuous improvement
Technical Stack
Strong experience with the following technologies is expected:
Linux (Red Hat, CentOS, or similar distributions)
Ansible / Ansible Automation Platform
HPC technologies (SLURM, Open OnDemand)
Shell scripting (Bash or similar)
Statistical and analytical tools (Python, R, MATLAB, Stata, SAS)
System monitoring and performance tuning tools
Responsibilities
Design, maintain, and optimize Linux-based HPC infrastructure supporting analytical workloads
Perform system updates, patching, and security hardening to ensure compliance and stability
Provide Tier 3 support for platform-related issues, ensuring minimal downtime and rapid resolution
Collaborate with stakeholders to align system capabilities with analytical and research needs
Implement and maintain security controls to protect sensitive data and meet regulatory requirements
Conduct system audits, vulnerability assessments, and performance evaluations
Participate in platform enhancement initiatives, including upgrades and new feature implementation
Contribute to system architecture design and long-term platform strategy
Participate in on-call rotation to support critical system availability
Education
Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related technical discipline is preferred. Equivalent professional experience will be considered.
Qualifications
Strong experience with Linux system administration and shell scripting
Hands-on experience with Ansible for automation and configuration management
Experience supporting high-performance computing environments
Strong problem-solving skills and a customer-focused mindset
Ability to participate in an on-call rotation supporting mission-critical systems