Hoang-Minh Tran

AI Engineer


Ho Chi Minh City, Vietnam

I’m Tran Hoang Minh, a recent graduate and freelance AI Engineer specializing in building end-to-end, production-ready AI systems. I am currently teaching myself Vision-Language Models (VLMs) and Vision-Language-Action (VLA) systems as I transition from Computer Vision, using the flexibility of freelance work to deepen my research expertise while continuing to ship production solutions.

What I do best: Transforming research-stage models into reliable, scalable AI services. I design model architectures, build MLOps pipelines, automate data flows, and implement monitoring for production stability. My tech stack covers Python, PyTorch, TensorFlow, YOLO, HuggingFace Transformers, LangChain, CI/CD with GitHub Actions, DVC/MLflow, and infrastructure with Terraform. I use AI coding agents to accelerate solution design and system orchestration, shortening the turnaround from prototype to deployment.


Background

B.Sc. in Artificial Intelligence — FPT University (2025)

Recent graduate with hands-on experience in computer vision and large language models, combining academic foundations with practical production deployment skills.


Work Experience

Freelance AI Engineer — Present
Building production AI systems and personal research projects focusing on VLM/VLA and embodied AI.

Junior AI Engineer — Rainscales (Feb 2025 – Sep 2025)
Developed and deployed AI solutions for production environments.

AI Engineer Intern — FPT Software, Quy Nhon (Sep 2023 – Dec 2023)
4-month internship working on AI/ML projects and gaining industry experience.


Current Focus

My research interests span production-scale multimodal AI and embodied intelligence systems:

  • Vision-Language Models (VLMs) & Vision-Language-Action (VLA) – Developing multimodal systems that combine visual understanding with language reasoning and actionable outputs for robotic and embodied AI applications

  • Physical AI & Robotics – Building AI systems with spatial reasoning, perception, and control capabilities for real-world embodied applications, with a focus on humanoid robotics infrastructure

  • Production AI Systems – Architecting end-to-end AI pipelines including multimodal serving, RAG systems, vector database optimization, and GPU-accelerated inference on production infrastructure

  • Infrastructure & Deployment – Designing robust MLOps systems for production workloads, including containerization, orchestration, monitoring, and cost-optimized compute management


Technical Expertise

Programming Languages

Python (Proficient) • C (Basic)

AI/ML Frameworks

PyTorch • TensorFlow • YOLO • HuggingFace Transformers • Scikit-learn • LangChain • LangGraph • Pandas • NumPy • OpenCV

MLOps & DevOps

Docker • Kubernetes (GKE) • Terraform • GitHub Actions • Prometheus • Grafana • Loki • MLflow • DVC • Helm • KServe • vLLM

Databases & Vector Stores

PostgreSQL • Microsoft SQL Server • Milvus • Qdrant

Cloud Platforms

Google Cloud Platform (GKE, GCS, Compute Engine) • NGINX • FastAPI


IntelliRAG System

Architected a cloud-native RAG platform on GKE with FastAPI, Kubernetes, and GPU-accelerated inference (vLLM/KServe). The platform uses the Qdrant vector database, LangChain, and Sentence Transformers for high-throughput document processing and low-latency query response, with MLOps monitoring via Prometheus, Grafana, and Evidently.

Action Retrieval from CCTV Footage

Evaluated text-video retrieval models (CLIP4Clip, Frozen in Time, InternVideo) using Recall@k and Precision@k to optimize an action retrieval system, and built data pipelines with PyTorch and OpenCV that ingest video-caption pairs into Milvus for low-latency retrieval.

Vietnamese Text Recognition

Built an end-to-end Vietnamese OCR system with fine-tuned PaddleOCR models and a Tkinter GUI, improving text recognition accuracy by 10% through synthetic data generated in diverse fonts and tailored to advertising plates and product packaging.


Research and Publications

  1. ICISN
    An Automated Pipeline for Constructing a Vietnamese VQA-NLE Dataset
    Truong-Binh Duong, Hoang-Minh Tran, Binh-Nam Le-Nguyen, and 1 more author
    In Proceedings of the Fifth International Conference on Intelligent Systems and Networks, 2026