Md Ashikur Rahman | Lead AI Engineer

Md Ashikur Rahman is a Lead AI Engineer at The KOW Company, specializing in multimodal AI, vision-language systems, generative AI, and production computer vision.

Research interests: trustworthy multimodal machine learning; vision-language models; AI safety and robustness.

He leads 15+ engineers and researchers across applied research, product development, and enterprise delivery, and has built and scaled production AI systems supporting 200+ brands, including computer-vision platforms that have processed 4.5M+ images globally.

Recent Highlights

ICDAR 2026 oral paper on diagram understanding with LogicBench-1K (last and corresponding author).
Three first-author manuscripts under review on conformal risk control for LLM tool calls, detector-based grounding metrics in video QA, and math-encoded jailbreaks.
The Fitting Room: cross-brand virtual try-on platform covering 170+ brands.
CogniX: early-stage R&D for proprietary multimodal image generation and editing.

Education

B.Sc. in Computer Science and Engineering, American International University-Bangladesh, 2011-2015. CGPA 3.87/4.00 (WES equivalent 3.94/4.00); magna cum laude; top 3%; Merit Scholarship and Tuition Fee Waiver.

Publications

Stroke-Level Connectivity Verification: Grounding Vision-Language Models Against Topology Hallucination in Diagram Understanding · ICDAR 2026 · Oral · Last and corresponding author · LogicBench-1K

Beyond Aggregate Risk: Role-Stratified Conformal Risk Control for LLM Tool Calls · Manuscript under review, 2026 · First author

When Detector-Based Grounding Metrics Measure Vocabulary: A Cautionary Audit of Entity Claims in Video-QA Reasoning Traces · Manuscript under review, 2026 · First author

Math-Encoded Jailbreaks Across Provider-Matched Models and Inference-Time Reasoning Configurations · Manuscript under review, 2026 · First author

Automated Detection of Diabetic Retinopathy Using Deep Residual Learning · International Journal of Computer Applications, 2020

Full publication list available on Google Scholar.

Selected Projects

Retouched.ai: Object Detection and Segmentation

Lead AI Engineer · Production

Developed production salient-object segmentation for background removal; improved segmentation quality by 17% on internal benchmarks, reduced processing time by 30%, supported uploads up to 257 MB, and achieved a 2.27-second average processing time across standard production workloads.
Scaled Retouched.ai to process 4.5M+ images globally for hundreds of customers using PyTorch, U²-Net-inspired salient-object segmentation, FastAPI, and Google Cloud Platform.

Omnimage.ai: AI Image and Video Generation

Lead AI Engineer · Production

Co-designed and launched production image- and video-generation APIs used across 200+ brands for creative and product-image workflows.
Built workflows for reference-image conditioning, asynchronous processing, prompt classification, intent routing, and automated model selection.

The Fitting Room: Cross-Brand Virtual Try-On Platform

Lead AI Engineer · In Development

Conceived and co-led a unified cross-brand virtual try-on platform covering 170+ brands.
Architected a Dockerized FastAPI/Nginx backend using SQL Server, Google Cloud Storage, Redis, recommendation services, and 2D virtual try-on pipelines.

Enterprise AI Catalog Audit and Image Quality Assurance Platform: Catalog, Image, and Content Quality Audit

Lead AI Engineer · In Development

Lead development of an AI catalog-audit and image quality assurance platform for a major US retail client.
Automate image QA, catalog validation, metadata checks, and content-health monitoring across DAM and CMS workflows using computer vision, NLP, PyTorch, and FastAPI.

CogniX: Proprietary Multimodal Image Generation and Editing

Lead AI Engineer · Active R&D

Leading early-stage R&D for a proprietary multimodal image-generation and editing system for product-visualization workflows.
Researching and prototyping multi-reference fusion and prompt-guided image-editing workflows for garment replacement, scene composition, product visualization, and style transfer using PyTorch, Diffusers, ComfyUI, and parameter-efficient fine-tuning (LoRA, QLoRA).

Experience

The KOW Company · Dhaka, Bangladesh

Lead AI Engineer · Current · Jan 2023 - Present

Lead a multidisciplinary team of 15+ ML engineers, software engineers, and researchers across architecture, planning, code review, and delivery; maintain 90-95% on-time delivery across AI engineering projects.
Oversee development of production AI systems across virtual try-on, generative media, 3D reconstruction, catalog audit, and audio QA.
Direct applied research on AI hallucination, prompt safety, visual grounding, and video understanding; last and corresponding author on an ICDAR 2026 oral paper.
Mentor and supervise four researchers on literature review, research planning, experimental design, reproducible implementation, result analysis, and academic writing.
Develop and scale production segmentation and image-processing systems, including Retouched.ai; publish reproducible research code, datasets, and model artifacts on GitHub and Hugging Face.

Senior Machine Learning Engineer · Jul 2021 - Dec 2022

Improved object detection and segmentation performance by 20-35% across internal evaluation benchmarks; the resulting models were later deployed through Retouched.ai.
Led 6+ client ML engagements from business requirements to technical delivery; built offline evaluation pipelines and production A/B testing workflows to validate model quality, inference performance, and production outcomes.

Machine Learning Engineer · Jul 2020 - Jun 2021

Built deep learning models for production object recognition, image segmentation, and background-removal workflows; developed scalable preprocessing, training, and A/B testing pipelines.

Smart Technologies (BD) Ltd · Dhaka, Bangladesh

Senior Software Engineer · Sep 2016 - Dec 2019

Led development of .NET and SQL Server supply-chain systems on a 4 TB database, reducing report generation time from 20 minutes to 40-54 seconds.
Achieved 70-75% process automation and 99.9% synchronization success for offline-capable enterprise workflows.

Proggasoft · Dhaka, Bangladesh

Software Engineer · Mar 2015 - Aug 2016

Developed ASP.NET MVC features, backend services, and database integrations for DevSkill.com.

Technical Skills

Multimodal and Vision-Language AI: Python · PyTorch · Vision-Language Models · Visual Grounding · Hallucination Evaluation · Multimodal Learning
Generative AI and LLMs: Diffusion Models · Large Language Models (Llama, Qwen) · Prompt Safety · Parameter-Efficient Fine-Tuning (LoRA, QLoRA) · Prompt-Guided Image Editing
Computer Vision: Object Detection · Segmentation · Pose Estimation · Image Quality Assurance
3D Vision: Structure from Motion · Multi-View Stereo · COLMAP · Open3D · Neural Radiance Fields (NeRF) · 3D Gaussian Splatting
Production ML and MLOps: Model Serving · Production Deployment · Offline Evaluation · Production A/B Testing · Model Evaluation · Dataset Design
Backend and Cloud: FastAPI · Docker · Nginx · Redis · Google Cloud Platform · Google Cloud Storage · SQL Server

Honors and Awards

Champion, BASIS National ICT Awards, 2020 (Retouched.ai)
Finalist, Asia Pacific ICT Alliance Awards, 2021
Invited workshop speaker, "Artificial Intelligence in Advertising", Daffodil International University
Magna Cum Laude; Top 3%; Merit Scholarship and Tuition Fee Waiver, AIUB

Contact

Open to research collaboration, speaking invitations, and professional inquiries in multimodal AI, vision-language models, and applied machine learning.