The interview focused on my experience with large language models (LLMs), multimodal systems involving text and vision integration, and GPU-based training using PyTorch. I discussed a personal project where I built a wearable assistant combining YOLOv11 and depth estimation with LLM-based scene understanding. The interviewer asked in-depth questions about training pipelines and real-time inference.