Label-Efficient Grasp Joint Prediction with Point-JEPA

2025-09-25 19:00 GMT · aimagpro.com

arXiv:2509.13349v2
Abstract: We study whether 3D self-supervised pretraining with Point-JEPA enables label-efficient grasp joint-angle prediction. Meshes are sampled to point clouds and tokenized; a ShapeNet-pretrained Point-JEPA encoder feeds a $K{=}5$ multi-hypothesis head trained with winner-takes-all and evaluated by top-logit selection. On a multi-finger hand dataset with strict object-level splits, Point-JEPA improves top-logit RMSE and Coverage@15$^{\circ}$ in low-label regimes (e.g., 26% lower RMSE at 25% data) and reaches parity at full supervision, suggesting JEPA-style pretraining is a practical lever for data-efficient grasp learning.
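
To make the multi-hypothesis head concrete, the following is a minimal sketch of winner-takes-all training and top-logit selection for a single sample. It is an illustration under stated assumptions, not the paper's implementation: the per-hypothesis error is assumed to be mean squared error over joint angles, the function names (`wta_loss`, `top_logit_select`, `top_logit_rmse`) are hypothetical, and the toy hypotheses, logits, and target are made up. How the logits themselves are trained is not described in the abstract and is left out.

```python
import math

def mse(pred, target):
    """Mean squared error between two equal-length joint-angle vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

def wta_loss(hypotheses, target):
    """Winner-takes-all: only the hypothesis closest to the target
    contributes to the loss (assumed error measure: MSE)."""
    errors = [mse(h, target) for h in hypotheses]
    winner = min(range(len(errors)), key=errors.__getitem__)
    return errors[winner], winner

def top_logit_select(hypotheses, logits):
    """At evaluation time, keep the hypothesis whose logit is highest."""
    best = max(range(len(logits)), key=logits.__getitem__)
    return hypotheses[best]

def top_logit_rmse(hypotheses, logits, target):
    """RMSE of the top-logit hypothesis against the ground-truth angles."""
    return math.sqrt(mse(top_logit_select(hypotheses, logits), target))

# Toy sample: K = 5 hypotheses over 3 joint angles in degrees (made-up values).
hypotheses = [
    [10.0, 20.0, 30.0],
    [12.0, 22.0, 33.0],
    [40.0, 5.0, 60.0],
    [0.0, 0.0, 0.0],
    [11.0, 21.0, 31.0],
]
logits = [0.1, 1.5, -1.0, -2.0, 0.3]  # hypothetical selection scores
target = [11.0, 21.0, 32.0]

loss, winner = wta_loss(hypotheses, target)          # training signal
selected = top_logit_select(hypotheses, logits)      # evaluation choice
rmse = top_logit_rmse(hypotheses, logits, target)
```

In this toy case the winner-takes-all winner (index 4) differs from the hypothesis chosen by the highest logit (index 1). That gap is why top-logit RMSE is a stricter measure than the error of the best hypothesis in hindsight.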