Result: Evaluating 3D human pose estimation in occluded multi-sensor scenarios: dataset and annotation approach

Title:
Evaluating 3D human pose estimation in occluded multi-sensor scenarios: dataset and annotation approach
Contributors:
Image Perception Interaction (LS2N - équipe IPI), Laboratoire des Sciences du Numérique de Nantes (LS2N), Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-NANTES UNIVERSITÉ - École Centrale de Nantes (Nantes Univ - ECN), Nantes Université (Nantes Univ)-Nantes Université (Nantes Univ)-Nantes université - UFR des Sciences et des Techniques (Nantes univ - UFR ST), Nantes Université - pôle Sciences et technologie, Nantes Université (Nantes Univ)-Nantes Université (Nantes Univ)-Nantes Université - pôle Sciences et technologie, Nantes Université (Nantes Univ)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique), Nantes Université (Nantes Univ), China University of Mining and Technology (CUMT), Robots and Machines for Manufacturing, Society and Services (LS2N - équipe RoMas), Institut universitaire de France (IUF), Ministère de l'Education nationale, de l’Enseignement supérieur et de la Recherche (M.E.N.E.S.R.), ANR-20-THIA-0011,AIby4,AI by / for Human, Health and Industry(2020)
Source:
2024 IEEE International Conference on Image Processing (ICIP). :2683-2689
Publisher Information:
CCSD; IEEE, 2024.
Publication Year:
2024
Collection:
collection:CNRS
collection:EC-NANTES
collection:UNAM
collection:LS2N
collection:LS2N-IPI
collection:LS2N-ROMAS
collection:INSTITUTS-TELECOM
collection:ANR
collection:NANTES-UNIVERSITE
collection:NANTES-UNIV
collection:AIBY4
collection:ANR-IA-20
collection:ANR-IA
Original Identifier:
HAL: hal-04812238
Document Type:
Conference conferenceObject
Conference papers
Language:
English
Relation:
info:eu-repo/semantics/altIdentifier/doi/10.1109/ICIP51287.2024.10647858
DOI:
10.1109/ICIP51287.2024.10647858
Rights:
info:eu-repo/semantics/OpenAccess
Accession Number:
edshal.hal.04812238v1
Database:
HAL

Further Information

Obtaining ground-truth annotations for 3D human pose estimation (3D HPE) typically depends on motion capture (Mocap) equipment, which is not only expensive but also impractical for widespread deployment. In contrast, triangulation can reconstruct 3D poses solely from multi-view 2D poses with known camera parameters, eliminating the need for Mocap. However, inherent noise in 2D pose predictions introduces uncertainty, compromising the reliability of the results. To obtain more reliable annotations from noisy input, we introduce an annotation approach for the 3D HPE task, driven by prior knowledge of the skeletal configuration. We split our approach into two steps: first, a parametric model is designed to enhance confidence predictions. Then, a differentiable weighted triangulation is employed to estimate the 3D pose in world space, using the predicted confidence scores as weights. The pipeline is trained using a bone length loss. Moreover, we collect a multi-view dataset for 3D HPE and annotate it using our proposed annotation tool. This dataset is characterized by more construction scenarios, including heavier occlusion cases, diverse viewing directions, and the integration of various optical sensors, setting it apart from existing datasets. Experiments on both our dataset and Human3.6M demonstrate the effectiveness of our method.
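
The two ingredients named in the abstract, confidence-weighted triangulation and a bone length loss, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a standard direct linear transform (DLT) in which each view's equations are scaled by its confidence, and a mean squared error between measured and reference bone lengths. The function names, the toy cameras, and the exact form of the loss are assumptions made for illustration only.

```python
import numpy as np


def weighted_triangulate(projections, points_2d, weights):
    """Triangulate one joint from several views with per-view confidences.

    projections: (V, 3, 4) camera projection matrices (assumed known).
    points_2d:   (V, 2) detected pixel coordinates of the joint.
    weights:     (V,) confidence scores; a low weight down-weights a view.
    Returns the 3D point in world coordinates, shape (3,).
    """
    rows = []
    for P, (u, v), w in zip(projections, points_2d, weights):
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(w * (u * P[2] - P[0]))
        rows.append(w * (v * P[2] - P[1]))
    A = np.stack(rows)
    # The least-squares solution is the right singular vector associated
    # with the smallest singular value of A.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]


def bone_length_loss(joints_3d, bones, ref_lengths):
    """Mean squared deviation of bone lengths from reference lengths.

    joints_3d:   (J, 3) triangulated joints.
    bones:       list of (parent, child) joint index pairs (hypothetical skeleton).
    ref_lengths: (B,) reference bone lengths (assumed given as a prior).
    """
    lengths = np.array(
        [np.linalg.norm(joints_3d[i] - joints_3d[j]) for i, j in bones]
    )
    return float(np.mean((lengths - np.asarray(ref_lengths)) ** 2))


def project(P, X):
    """Project a 3D point with a 3x4 projection matrix (helper for the demo)."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]


# Toy setup: three cameras with identity intrinsics and shifted centres.
cams = np.stack([
    np.hstack([np.eye(3), np.array([[0.0], [0.0], [0.0]])]),
    np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])]),
    np.hstack([np.eye(3), np.array([[0.0], [-1.0], [0.0]])]),
])
target = np.array([0.2, -0.1, 4.0])
obs = np.stack([project(P, target) for P in cams])

# Corrupt the third view (as an occlusion might) and give it zero confidence.
obs_noisy = obs.copy()
obs_noisy[2] += 0.5
recovered = weighted_triangulate(cams, obs_noisy, np.array([1.0, 1.0, 0.0]))
```

In this toy case the corrupted view is ignored entirely because its weight is zero, so the point is recovered from the two clean views. In a trainable pipeline the same operations would be written in an automatic-differentiation framework so that gradients of the loss can flow back to the confidence predictor.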