We are looking for an experienced and highly motivated Edge AI expert to join our dynamic engineering team. The ideal candidate will have a strong background in porting and optimizing AI/ML models for heterogeneous, multi-core ARM or RISC-V based System-on-Chips (SoCs) in embedded environments.

In this role, you will work at the intersection of machine learning, systems engineering, and hardware acceleration, contributing to the development of high-performance, reliable, and efficient Edge AI products. You’ll be involved in deploying state-of-the-art AI models on a range of compute engines — including GPUs, DSPs, and NPUs — and driving innovations in AI inference performance on edge devices.

Experience: 5 to 10 Years

Job Location: Hyderabad

Role & Responsibilities / What you’ll do:

Design, develop, and optimize AI/ML models (e.g., CNNs, transformers) for edge devices using hardware accelerators such as GPUs, DSPs, and NPUs.

Analyze performance, identify bottlenecks, and tune inference pipelines for memory, power, and compute efficiency.

Adapt optimized models for, and integrate them into, inference runtimes such as TensorRT, ONNX Runtime, TFLite, SNPE, OpenVINO, or TVM.

Implement quantization, pruning, and other model compression techniques.

Support the integration of AI workloads into embedded software stacks running on Linux, RTOS, or bare-metal systems.

Develop tools and workflows for automating model deployment across different SoCs and target architectures.

Lead and mentor a team of 3 to 6 engineers; plan, delegate, and monitor day-to-day technical tasks.

Work with the project manager on project estimation and planning, and take part in technical discussions with customers.

Participate in the team’s software processes to ensure code quality and maintenance, including requirements and design documentation, test-plan generation and execution, and peer design and code reviews.

Stay current with advancements in edge computing, AI inference frameworks, and compiler toolchains.

Required skills / Whom we are looking for:

Bachelor’s or Master’s degree in a related engineering field with 5 to 10 years of hands-on experience in embedded AI/ML development, with a focus on model optimization and deployment.

Proficiency in C/C++ and Python programming; intrinsics- or assembly-based optimization that accounts for instruction pipelining and latency; and modular and object-oriented programming skills.

Experience working with heterogeneous computing platforms (e.g., CPU + GPU/DSP/NPU). Must have exposure to, and development experience on, one or more DSPs, NPUs, or SIMD engines, for example ARM NEON, TI C6x/C7x DSPs, Tensilica Vision DSPs, CEVA DSPs, or Qualcomm Hexagon HVX.

In-depth knowledge of processor/SoC architecture: VLIW and SIMD, DMA, caches, memory architecture, etc.

Working experience with machine learning technologies such as CNNs, transformers, and quantization algorithms, and with their application on embedded systems.

Hands-on experience with AI frameworks such as TensorFlow, PyTorch, or ONNX, and familiarity with inference toolkits such as TensorRT, SNPE, TFLite, TVM, or OpenVINO.

Familiarity with build systems and development tools (e.g., Make, CMake, GCC, Eclipse, Visual Studio, ARM development tools).

Familiarity with debugging tools such as GDB, JTAG, and performance profiling tools.

Well versed in the software development life cycle and in the efficient use of associated tools such as Git, SVN, and JIRA.

Experience leading small teams to achieve the technical goals of assigned projects.

Excellent problem-solving skills with a focus on optimizing software for embedded hardware.

Strong communication skills and the ability to work effectively in a collaborative, cross-functional team environment.

Detail-oriented with a focus on delivering high-quality, reliable software.

Self-motivated with a strong passion for embedded AI systems and technology.

Nice-to-haves

Exposure to OpenCL-based GPU development or CUDA-based programming is a plus.

Basic knowledge of Linux and of an RTOS such as QNX, FreeRTOS, VxWorks, or similar, with exposure to debugging embedded systems. Familiarity with heterogeneous core architectures is an added advantage.

Familiarity with continuous integration and automated testing practices.

Why join us:

Opportunity to work on innovative projects with the latest Embedded & AI technologies

Opportunities for accelerated career growth and professional development. Engineer your future: we empower our employees to truly own their career and development.

A collaborative and inclusive team culture

Competitive compensation and benefits package

Physical AI

Enterprise AI

Lead Engineer/Technical Lead (Edge AI Acceleration)

© 2026 Vedya Labs. All Rights Reserved.