FakeLLaVA

A LLaVA-style vision-language model (VLM): a CLIP ViT-B/32 vision encoder connected to a Qwen2.5-0.5B language model through a two-layer projection MLP.
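
The data flow described above can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the repository's actual code: the names `Projector` and `build_llm_inputs` are hypothetical, the GELU activation and the choice to prepend image tokens to the text embeddings follow common LLaVA-style practice rather than anything stated here, and the weights are random. The dimensions are the published hidden sizes of the two backbones (768 for the CLIP ViT-B/32 vision tower, 896 for Qwen2.5-0.5B), with 49 patch tokens assumed for a 224x224 input.

```python
import numpy as np

VISION_DIM = 768   # hidden size of the CLIP ViT-B/32 vision tower
LLM_DIM = 896      # hidden size of Qwen2.5-0.5B
NUM_PATCHES = 49   # (224 / 32) ** 2 patch tokens, assuming a 224x224 input


def gelu(x):
    """Tanh approximation of GELU (the activation itself is an assumption)."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


class Projector:
    """Hypothetical two-layer MLP mapping vision features into the LLM embedding space."""

    def __init__(self, in_dim=VISION_DIM, out_dim=LLM_DIM, seed=0):
        rng = np.random.default_rng(seed)
        # Random placeholder weights; a real model would load trained ones.
        self.w1 = rng.normal(0.0, 0.02, (in_dim, out_dim))
        self.b1 = np.zeros(out_dim)
        self.w2 = rng.normal(0.0, 0.02, (out_dim, out_dim))
        self.b2 = np.zeros(out_dim)

    def __call__(self, patch_features):
        # patch_features: (batch, num_patches, in_dim)
        hidden = gelu(patch_features @ self.w1 + self.b1)
        return hidden @ self.w2 + self.b2  # (batch, num_patches, out_dim)


def build_llm_inputs(image_tokens, text_embeddings):
    """Prepend projected image tokens to text embeddings along the sequence axis."""
    return np.concatenate([image_tokens, text_embeddings], axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    patches = rng.normal(size=(2, NUM_PATCHES, VISION_DIM))  # stand-in for CLIP output
    text = rng.normal(size=(2, 5, LLM_DIM))                  # stand-in for token embeddings
    projector = Projector()
    inputs = build_llm_inputs(projector(patches), text)
    print(inputs.shape)
```

The projector is the only component that changes the feature width, so the language model sees image patches as ordinary embedding vectors of its own hidden size.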