- Meta (Menlo Park, CA)
- …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI / HPC Systems Performance Engineer Responsibilities: 1. ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially...workloads that expect a loss-less fabric interconnect. To improve performance of these systems we constantly look…
- Meta (Menlo Park, CA)
- …hardware and software components, co-design 15. Experience in developing or debugging AI / HPC systems, performance optimizations, including familiarity ... or supporting production hardware at scale 9. Experience in deploying and productionizing AI / HPC systems and/or related components at scale 10. Experience in…
- Ford Motor Company (Dearborn, MI)
- …and maintaining our HPC infrastructure and user-facing tooling, ensuring optimal performance and reliability for our critical AI/ML applications. This role ... Troubleshoot and resolve complex technical issues related to Linux systems, networking, storage, and AI/ML HPC applications. Develop and maintain documentation…
- NVIDIA (Santa Clara, CA)
- …designing and operating large scale storage infrastructure. + Experience analyzing and tuning performance for a variety of AI / HPC workloads. + Experience ... join us today! As a member of the GPU AI / HPC Infrastructure team, you will provide leadership...solutions to enable runs of demanding deep learning, high performance computing, and computationally intensive workloads. We seek an…
- NVIDIA (Santa Clara, CA)
- …group at NVIDIA has openings for software architects in the field of AI and high-performance networking and system software. We research, develop, and ... be doing + Creating proofs-of-concept to evaluate and motivate extensions in AI Frameworks (PyTorch/NEMO), HPC programming models (MPI, OpenSHMEM, PGAS), new…
- The MITRE Corporation (Bedford, MA)
- …+ Provide Linux systems administration support for MITRE's HPC systems to ensure the availability, performance, and security of systems. ... Computing to MITRE research organizations. Job Description: We are seeking an experienced Linux HPC Systems engineer to join our team! This is an exciting…
- NVIDIA (Santa Clara, CA)
- …doing: + Work with NVIDIA Product Teams to understand new product requirements including HPC and AI/ML Products. + Finding Optimum Solutions to deploy these ... hosts a heterogeneous mix of machines and devices with various operating systems (Windows/Linux/Android), a multitude of hardware platforms both NVIDIA GPUs and…
- The MITRE Corporation (Colorado Springs, CO)
- …Manager. + Familiarity with NVIDIA DGX systems, including DGX H100, and integration into HPC and AI workflows. + Desired experience as a team lead or similar ... Computing to MITRE research organizations. Job Description: We are seeking an experienced Linux HPC Systems Administrator to join our team as a Group Lead for…
- General Dynamics Information Technology (Fairfax, VA)
- …Regular **Clearance Level Must Be Able to Obtain:** None **Job Family:** Systems Engineering **Skills:** High-Performance Computing (HPC) Systems ... are our differentiator. Our work depends on a Senior HPC Systems Engineer joining our team to...some travel required. WCOSS provides NOAA the operational High Performance Computing (HPC) resources essential to process…
- Meta (Menlo Park, CA)
- …Meta and externally. **Required Skills:** Research Scientist, Systems ML and HPC - SW/HW Co-Design Responsibilities: 1. Apply High-Performance Computing ( ... Performance team is dedicated to maximizing training performance of Generative AI and recommendation models...HPC) algorithms and techniques to optimize large-scale AI workloads 2. Analyze, benchmark, and optimize large-scale workloads…