Algorithms & Optimization Engineer
MulticoreWare
Experience: 4 years required
Pay: Salary information not included
Type: Full Time
Location: All India
Skills: Machine Learning, Computer Vision, Algorithm Development, Optimization, Testing, CUDA, OpenCL, Parallel Computing, Numeric Libraries, Porting, C/C++, SIMD Instructions, Memory Hierarchies, Cache Optimization, Deep Learning Frameworks
Job Description
We are looking for a talented engineer to implement and optimize machine learning, computer vision, and numeric libraries for various hardware architectures, including CPUs, GPUs, DSPs, and other accelerators. Your expertise will play a crucial role in ensuring efficient and high-performance execution of algorithms on these hardware platforms.

Key Responsibilities:
- Implement and optimize machine learning, computer vision, and numeric libraries for target hardware architectures, including CPUs, GPUs, DSPs, and other accelerators.
- Collaborate closely with software and hardware engineers to achieve optimal performance on target platforms.
- Execute low-level optimizations, such as algorithmic modifications, parallelization, vectorization, and memory access optimizations, to fully exploit the capabilities of the target hardware architectures.
- Engage with customers to understand their requirements and develop libraries to fulfill their needs.
- Create performance benchmarks and conduct performance analysis to verify that the optimized libraries meet the necessary performance targets.
- Keep abreast of the latest advancements in machine learning, computer vision, and high-performance computing.

Qualifications:
- BTech/BE/MTech/ME/MS/PhD degree in CSE/IT/ECE.
- Over 4 years of experience in Algorithm Development, Porting, Optimization, and Testing.
- Proficiency in programming languages like C/C++, CUDA, OpenCL, or other relevant languages for hardware optimization.
- Hands-on experience with hardware architectures such as CPUs, GPUs, DSPs, and accelerators, along with familiarity with their programming models and optimization techniques.
- Knowledge of parallel computing, SIMD instructions, memory hierarchies, and cache optimization techniques.
- Experience with performance analysis tools and methodologies for profiling and optimization.
- Familiarity with deep learning frameworks and techniques is advantageous.
- Strong problem-solving skills and the ability to work independently or as part of a team.