4D reasoning from demonstration data for VLA
Vision-Language-Action (VLA) models are typically trained through imitation learning, which teaches policies to reproduce demonstrated actions but provides limited supervision about the conditions that define task success. We propose a framework that automatically extracts executable 3D task verifiers from demonstrations and uses them to improve policy learning beyond imitation. Given …



