Engineering

Mobile ML Engineer

Full-time | Madurai (Hybrid) | Exp. 3–5 Years

Skills Required

Python, PyTorch, TensorFlow, TensorFlow Lite, CoreML, ONNX Runtime, NumPy, Pandas, Matplotlib, Git, GitHub, LLM, Camera Integration, Sensor Integration, API Integration, Swift, Objective-C, Kotlin, Java, Flutter

Role Summary

We are seeking a Junior-Mid Level Mobile ML Engineer to design, optimize, and deploy machine learning models on iOS, Android, and wearable devices. This role focuses on the critical challenge of bringing sophisticated ML models from research/training into real-time, power-efficient, and user-friendly mobile experiences.

This role is ideal for someone with 3–5 years of native mobile development experience and 1–2 years of ML model deployment who is ready to specialize in the intersection of mobile and ML.


Key Responsibilities (Summarized)

1. Mobile ML Model Integration & Format Conversion

  • Integrate pre-trained ML models into iOS and Android applications using:

      • TensorFlow Lite (primary): model conversion, interpretation, and inference APIs.

      • CoreML (iOS): model conversion and on-device inference optimization.

      • ONNX Runtime or similar frameworks where applicable.

  • Build robust model loading and lifecycle management:

      • Efficient model file storage and caching.

      • Model versioning and update strategies.

      • Fallback mechanisms for model loading failures.

  • Convert models from training formats (PyTorch, TensorFlow) to deployment formats, with validation.

  • Implement multi-model pipelines (chaining multiple inference stages with minimal overhead).
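The versioning and fallback responsibilities above can be illustrated with a minimal sketch (all names are hypothetical, not Nutpaa code): a loader that prefers the newest cached model file and falls back to a bundled default when no valid cached model exists.

```python
from pathlib import Path

# Hypothetical sketch: prefer the newest versioned model in the cache,
# fall back to the bundled default if the cache is empty or unreadable.
def pick_model_file(cache_dir: Path, bundled_default: Path) -> Path:
    candidates = sorted(
        cache_dir.glob("model_v*.tflite"),
        key=lambda p: int(p.stem.split("_v")[-1]),  # numeric version suffix
        reverse=True,
    )
    for path in candidates:
        if path.is_file() and path.stat().st_size > 0:  # basic integrity check
            return path
    return bundled_default  # fallback when no valid cached model exists
```

A production loader would add checksum validation and an update strategy, but the fallback shape is the same.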

2. Real-Time Inference Optimization

  • Profile model inference on target devices (phones, tablets, wearables):

      • Measure latency (ms per inference), memory usage (MB), and CPU/GPU utilization.

      • Identify bottlenecks (data loading, preprocessing, inference, postprocessing).

  • Optimize end-to-end latency to meet strict targets (<100 ms for real-time applications):

      • Batching strategies for efficient tensor operations.

      • GPU/Neural Engine utilization where available (GPU delegates, NNAPI, Metal).

      • Asynchronous inference and frame buffering.

      • Quantization validation and accuracy preservation.

  • Implement performance monitoring and logging to track inference metrics in production.
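The latency measurement above can be sketched as a tiny benchmark harness (the stub inference callable stands in for a real TFLite/CoreML call; names are illustrative):

```python
import statistics
import time

# Illustrative sketch: time repeated inference calls and report
# median / p95 / max latency in milliseconds, as described above.
def benchmark(infer, warmup: int = 3, runs: int = 30) -> dict:
    for _ in range(warmup):          # warm caches before measuring
        infer()
    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    samples_ms.sort()
    return {
        "median_ms": statistics.median(samples_ms),
        "p95_ms": samples_ms[int(0.95 * (len(samples_ms) - 1))],
        "max_ms": samples_ms[-1],
    }
```

Reporting a percentile rather than the mean matters on mobile, where thermal and scheduler noise produce long tails.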

3. Quantization & Model Compression

  • Collaborate with the AI/ML Engineer on quantization strategies:

      • Validate INT8 post-training quantization models.

      • Test FP16 or dynamic-range quantization where applicable.

      • Measure accuracy impact on mobile hardware and fix any regressions.

  • Compare full-precision vs. quantized models for:

      • Model size reduction (storage and download).

      • Inference speed improvements.

      • Accuracy preservation on key metrics.

  • Implement per-layer quantization tuning and ablation studies if needed.
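The accuracy-vs-size comparison above can be illustrated with a toy symmetric INT8 quantization (a pure-Python sketch of the idea, not a framework API):

```python
# Toy sketch of symmetric INT8 quantization: map floats to [-127, 127],
# dequantize, and measure the worst-case round-trip error -- the kind of
# check used when comparing full-precision vs. quantized outputs.
def quantize_int8(values):
    scale = max(abs(v) for v in values) / 127.0 or 1.0  # avoid zero scale
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

def max_roundtrip_error(values):
    q, scale = quantize_int8(values)
    return max(abs(a - b) for a, b in zip(values, dequantize(q, scale)))
```

The round-trip error is bounded by half the quantization step (scale / 2), which is what "accuracy preservation" validation checks against task metrics in practice.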

4. Camera Integration & Real-Time Preprocessing

  • Implement efficient camera capture pipelines:

      • Frame capture from rear/front cameras at 30 FPS.

      • Frame preprocessing (resizing, rotation, color-space conversion) optimized for inference.

      • Buffering and synchronization for multi-camera scenarios.

  • Handle image orientation and device rotation (portrait, landscape, device orientation locks).

  • Build robust preprocessing that adapts to varying lighting and camera characteristics.

  • Implement efficient memory management for video frame buffers (avoid unnecessary copies).
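The resize step above usually starts with pure geometry. A sketch of letterbox scaling (fit a camera frame into a square model input while preserving aspect ratio; function and field names are illustrative):

```python
# Illustrative letterbox math: scale a camera frame to fit a square model
# input (e.g. 224x224) preserving aspect ratio, and compute the padding
# on each axis -- a common preprocessing step before inference.
def letterbox(frame_w: int, frame_h: int, input_size: int):
    scale = input_size / max(frame_w, frame_h)
    new_w, new_h = round(frame_w * scale), round(frame_h * scale)
    pad_x = (input_size - new_w) // 2
    pad_y = (input_size - new_h) // 2
    return {"w": new_w, "h": new_h, "pad_x": pad_x, "pad_y": pad_y}
```

On-device, the same math parameterizes a GPU resize so no extra frame copy is needed.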

5. Sensor Integration & Data Fusion

  • Integrate IMU (accelerometer, gyroscope), BLE (Bluetooth Low Energy), and other sensors where applicable:

      • BLE communication with wearables (smart glasses, fitness trackers).

      • Sensor data fusion with ML model outputs.

      • Sensor data preprocessing and temporal alignment.

  • Build robust connectivity handling for wireless devices (BLE connection management, reconnection logic).

  • Implement low-power sensor data collection strategies.
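The reconnection logic above is commonly a capped exponential backoff; a minimal sketch (base delay and cap are illustrative, not product values):

```python
# Illustrative capped exponential backoff for BLE reconnection attempts:
# the delay doubles per failed attempt up to a ceiling, so a briefly
# out-of-range wearable reconnects fast without draining the battery.
def backoff_delays(base: float = 0.5, cap: float = 30.0, attempts: int = 8):
    return [min(cap, base * (2 ** n)) for n in range(attempts)]
```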

6. Battery, Memory & Thermal Management

  • Optimize for power efficiency:

      • Minimize inference frequency and GPU usage.

      • Implement background/foreground task management.

      • Measure battery drain per inference and optimize to target (<2% battery per 60 minutes of typical usage).

  • Monitor memory usage and prevent leaks:

      • Efficient tensor allocation and deallocation.

      • Reduce peak memory footprint during inference.

  • Handle thermal constraints:

      • Throttle inference if device temperature exceeds thresholds.

      • Implement adaptive quality/accuracy trade-offs under thermal stress.
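The thermal throttling described above can be sketched as a small policy that widens the inference interval as the reported temperature rises (all thresholds here are hypothetical, not Nutpaa targets):

```python
# Hypothetical thermal policy: map a device thermal reading to an
# inference interval -- full rate when cool, throttled when warm,
# and inference suspended entirely under severe thermal stress.
def inference_interval_ms(temp_c):
    if temp_c < 38.0:
        return 33.0    # ~30 FPS, no throttling
    if temp_c < 43.0:
        return 100.0   # reduced rate under moderate heat
    if temp_c < 48.0:
        return 500.0   # heavy throttling
    return None        # suspend inference until the device cools
```

On real devices the input would come from platform thermal APIs (e.g. thermal state notifications) rather than a raw temperature.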

7. API Design & Backend Integration

  • Design clean APIs for ML model inference:

      • Synchronous and asynchronous inference interfaces.

      • Error handling and fallback strategies.

      • Clear input/output contracts for model consumers.

  • Integrate with backend services for:

      • Sending inference results or telemetry.

      • Model updates and A/B testing.

      • User feedback loops.

  • Implement robust error handling and graceful degradation when inference fails or models are unavailable.
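An asynchronous interface with graceful degradation, as described above, might look like this sketch (asyncio stands in for platform concurrency; all names are hypothetical):

```python
import asyncio

# Hypothetical async inference wrapper: run a (possibly blocking) model
# call off the event loop, enforce a timeout, and return a fallback
# result instead of raising -- graceful degradation when inference fails.
async def infer_with_fallback(model_fn, inputs, fallback, timeout_s=0.5):
    try:
        return await asyncio.wait_for(
            asyncio.to_thread(model_fn, inputs), timeout_s
        )
    except Exception:  # timeout, model error, model unavailable, ...
        return fallback
```

The equivalent on-device would use Kotlin coroutines or Swift async/await, but the contract (clear inputs/outputs, bounded latency, defined failure behavior) is the same.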

8. Testing, Validation & Performance Benchmarking

  • Build comprehensive test suites for ML deployment:

      • Unit tests for model loading, preprocessing, and postprocessing.

      • Integration tests for end-to-end inference pipelines.

      • Performance regression tests to detect latency/accuracy degradation.

  • Validate model accuracy on representative device hardware:

      • Different phone models (low-end, mid-range, flagship).

      • Different OS versions (iOS 13+, Android 8+).

  • Create performance benchmarking dashboards:

      • Latency, memory, CPU/GPU usage, and battery impact metrics.

      • Regression detection and alerting.

  • Document known limitations (minimum Android/iOS versions, device requirements).

9. Debugging, Logging & Observability

  • Implement structured logging for inference:

      • Model loading status, inference latency, error conditions.

      • Device hardware info for correlation with performance issues.

      • User behavior tracking (inference frequency, error rates).

  • Build debugging tools:

      • Model output visualization (intermediate tensor inspection).

      • Comparison between expected and actual inference results.

      • Profiling capabilities (CPU/GPU flame graphs, memory allocation tracking).

  • Collaborate with the AI/ML Engineer to debug model accuracy issues on-device.

10. Cross-Platform Development & Collaboration

  • Develop for both iOS (Swift, Objective-C) and Android (Kotlin, Java) with code reuse where applicable.

  • Collaborate with Full-Stack Engineer on API contracts and data formats.

  • Work with AI/ML Engineer on model optimization feedback (which layers are bottlenecks, where quantization hurts accuracy).

  • Participate in technical design reviews and share platform-specific insights.


Required Skills & Experience (Junior-Mid Level)


Educational Background

  • Bachelor's degree in Computer Science, Engineering, or related field.

  • Strong foundation in algorithms, data structures, and system design.


Core Programming & Mobile Development

  • 3–5 years of native mobile development experience (iOS and/or Android in production):

      • iOS: Swift or Objective-C, with experience in UIKit or SwiftUI.

      • Android: Kotlin or Java, with experience in the Android framework and architecture.

  • Expert-level in at least one platform (iOS or Android):

      • App lifecycle management.

      • Efficient memory and resource management.

      • Threading and concurrency patterns.

      • Camera APIs and sensor integration.

  • Familiarity with the other platform (basic understanding, ability to learn quickly).

  • Version control: Git workflows and collaborative development.


ML Model Deployment (Mobile-Specific)

  • 1–2 years of hands-on experience deploying ML models on mobile:

      • TensorFlow Lite (primary): model conversion, inference APIs, delegates.

      • CoreML (iOS): model format, inference, Metal Performance Shaders basics.

      • ONNX Runtime or similar frameworks.

  • Experience optimizing model inference for latency, memory, and power on mobile hardware.

  • Comfort with model quantization (INT8, FP16) and understanding of accuracy-performance trade-offs.


Performance Optimization & Profiling

  • Experience profiling mobile applications:

      • CPU profiling (Xcode Instruments, Android Studio Profiler).

      • Memory profiling and leak detection.

      • GPU utilization monitoring.

  • Optimization techniques:

      • Reducing main-thread work and optimizing frame rate.

      • Efficient data structures and algorithms.

      • Multi-threading and asynchronous patterns.

  • Battery and thermal profiling basics.


Sensor & Hardware Integration

  • Experience with camera APIs:

      • Real-time video capture and preprocessing.

      • Camera frame orientation and rotation handling.

      • Multi-camera systems (if available).

  • Familiarity with sensor APIs (accelerometer, gyroscope, magnetometer).

  • Basic understanding of Bluetooth/BLE for wearable integration (nice to have, but learnable).


System Design & APIs

  • Ability to design clean, maintainable APIs for ML inference.

  • Understanding of asynchronous programming patterns (callbacks, Futures/Promises, coroutines).

  • Error handling and graceful degradation strategies.

  • Basic knowledge of networking and REST APIs.


Mathematical Knowledge (Intermediate)

  • Understanding of basic linear algebra (vectors, matrices, tensor operations).

  • Familiarity with neural networks and common architectures (CNNs, RNNs).

  • Concept of quantization and its impact on model accuracy.


Communication & Code Quality

  • Ability to write clean, well-documented, maintainable code.

  • Clear technical communication with ML engineers and other team members.

  • Receptiveness to feedback and iterative improvement.

  • Strong problem-solving and debugging mindset.


Edge Agent Frameworks & MCP Client

  • Edge Agent Frameworks – Deploying lightweight agents on-device, on-device vector search (e.g., SQLite-based VSS), using quantized embedding models.

  • MCP Client (Mobile) – Implementing mobile-side clients for model/tool coordination and context sharing with backend agents.


Preferred Skills & Experience

  • Experience with both iOS and Android at production level.

  • Prior work on performance-critical applications (games, real-time video, AR).

  • Familiarity with AR frameworks (ARKit, ARCore, AR Glass SDKs).

  • Experience with edge AI frameworks beyond TFLite (NCNN, MNN, Qualcomm Snapdragon NPU).

  • Contributions to mobile ML open-source projects or strong GitHub portfolio.

  • Basic understanding of 3D graphics (OpenGL, Metal, Vulkan concepts).

  • Experience with wearable development (Apple Watch, Android Wear, smart glasses).

  • Knowledge of secure ML (model encryption, on-device privacy).

  • Familiarity with continuous deployment and A/B testing frameworks for mobile.


What You'll Gain

  • Deep expertise at the intersection of mobile development and machine learning.

  • Hands-on experience optimizing complex algorithms for real-time, resource-constrained environments.

  • Production impact: Your work powers AI experiences on millions of user devices.

  • Technical leadership trajectory: Clear path to senior Mobile ML Engineer or ML Infrastructure roles.

  • Cross-platform mastery: Deep knowledge of iOS and Android ecosystems and their unique ML deployment challenges.

  • Collaboration with ML researchers: Direct feedback loop with model trainers to drive practical optimization.

  • Hybrid working opportunities: post-MVP phase (Month 6 onwards), flexibility for remote collaboration per team needs.

  • Patent involvement: Potential contribution to Nutpaa's ML deployment patents.


Organizational & Cultural Expectations

  • Maintain technical rigor in optimization, testing, and validation.

  • Share learnings through code reviews, documentation, and team discussions.

  • Provide and receive feedback with clarity and respect.

  • Uphold Nutpaa's values: Engineering Excellence, Long-Termism, Open Evolution, and Peer-Driven Collaboration.

  • Take initiative in solving problems and learning new mobile/ML technologies.

  • Challenge assumptions constructively and help improve deployment practices.

  • Maintain confidentiality of proprietary models, data, and optimization strategies.

  • Contribute to technical discussions and best practices for mobile ML.


Application Process

Please email careers@nutpaa.ai with:

1. Resume (highlighting mobile development and ML deployment experience; include any ML framework certifications or courses)

2. Portfolio:

  • GitHub repos showing iOS and/or Android projects with native code.

  • Links to shipped production apps with your technical contributions.

  • Examples of ML model deployment work (if available).

  • Performance optimization case studies or technical write-ups.

3. Brief statement (~200 words):

  • Why you're interested in mobile ML engineering and Nutpaa's mission.

  • One challenging mobile performance or optimization problem you solved (include technical details and metrics).

  • Your experience bridging ML and mobile development.

4. Optional but valuable:

  • Technical writing sample (blog post, documentation, architecture write-up).

  • Links to open-source contributions in mobile or ML frameworks.

  • Examples of cross-platform work (iOS and Android on same project).

Email Subject: Junior-Mid Level Mobile ML Engineer – [Your Name]


Equal Opportunity Statement

Nutpaa is an equal opportunity employer. We are committed to building a diverse and inclusive team, and we do not discriminate based on race, religion, color, national origin, gender, gender identity or expression, sexual orientation, age, marital status, veteran status, or disability status.
