On March 15th we released a new version of our platform that includes an exciting new feature – a pre-trained model for pose estimation, another type of computer vision task that focuses on object localization.
This article will introduce you to its purpose, history and key benefits, and will guide you through its usage on the SentiSight.ai platform.
What is human pose estimation?
Human pose estimation is defined as the localization of major human joints such as elbows, knees and wrists. It remains one of the most popular research areas in computer vision.
It is a technique that estimates the pose of a person in an image or video by approximating the spatial locations of body and limb joints. It is important to note that pose estimation only estimates the positions of body joints; it is not used to recognize specific individuals.
The major breakthrough in human pose estimation
The field is relatively new: the first major paper on human pose estimation based on deep learning methods – DeepPose: Human Pose Estimation via Deep Neural Networks – was presented at the IEEE Conference on Computer Vision and Pattern Recognition in 2014. Motivated by deep neural networks’ exceptional results on classification and localization problems, the authors presented a cascade of deep neural network-based (DNN-based) regressors towards body joints that produced high-precision pose estimates.
Because the human body is highly flexible, the key difficulties in joint localization are occlusions, small or barely visible limbs, and the need to capture the context of the whole image. DeepPose combined location regression for each body joint with a cascade of DNN-based pose predictors, which significantly increased the precision of joint localization. It outperformed all previous approaches, showing strong results on the four most challenging limbs – lower and upper arms and legs – as well as on the mean confidence score across these limbs.
After this initial proposal of DNN-based regression towards body joints, other approaches emerged: some use a sliding-window detector to produce a rough heatmap of joint locations, while others estimate a pose and then iteratively correct it based on feedback, instead of predicting the output in one go.
Another state-of-the-art method uses deep convolutional neural networks that pass the input through a high-resolution subnetwork, then form additional stages by adding high-to-low-resolution subnetworks and connecting them in parallel.
How does human pose estimation work?
When it comes to the pose estimation process, there are two main approaches:
- Bottom-up: This approach first detects every key point in an image and then assembles them into individual people.
- Top-down: This approach first draws a bounding box around each person and only then estimates the key points within each region.
The process of human pose estimation depends on the complexity of input images, their quality, occlusion, clothing and lighting. Usually, input images are processed and an output of indexed keypoints is produced, along with a confidence score ranging from 0.0 to 1.0 for each keypoint. The confidence score refers to the likelihood of a body joint existing in that spot.
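To make the output format concrete, here is a minimal sketch of working with indexed keypoints and their confidence scores. The keypoint labels, coordinates and threshold below are hypothetical examples for illustration, not the platform's actual output schema:

```python
# Illustrative sketch: filtering pose keypoints by confidence score.
# Labels, coordinates and the threshold are made-up example values.

CONFIDENCE_THRESHOLD = 0.5  # keep only reasonably confident joints

# Example model output: indexed keypoints with (x, y) pixel coordinates
# and a confidence score between 0.0 and 1.0 for each joint.
keypoints = [
    {"label": "left_elbow",  "x": 214, "y": 180, "score": 0.93},
    {"label": "right_elbow", "x": 330, "y": 185, "score": 0.88},
    {"label": "left_knee",   "x": 240, "y": 390, "score": 0.41},  # occluded
    {"label": "right_knee",  "x": 320, "y": 395, "score": 0.97},
]

# Discard joints the model is unsure about (e.g. occluded limbs).
confident = [kp for kp in keypoints if kp["score"] >= CONFIDENCE_THRESHOLD]
for kp in confident:
    print(f'{kp["label"]}: ({kp["x"]}, {kp["y"]}), score {kp["score"]:.2f}')
```

In practice, the threshold is a trade-off: a higher value drops noisy detections from occluded or barely visible limbs, while a lower value keeps more joints at the cost of precision.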
Human pose estimation use cases
With pose estimation, we can track a person’s movements every step of the way. There is a myriad of exciting applications for human pose estimation technology across a wide variety of industries:
- Well-known, everyday examples include motion capture and motion tracking, as used in Apple’s Animoji and Microsoft’s Kinect technologies, with a ton of consumer Augmented Reality (AR) applications on the horizon.
- Enhanced surveillance in cases of recognizing a person’s gait or emergency systems that detect if someone has fallen or is sick.
- Pose estimation in devices that can recognize sign language for use in service institutions, such as airports, schools and banks. This would reduce the need for specialized sign language interpreters.
- Health and fitness: Applications developed to analyze and optimize performance in sports, dance technique, posture learning and personal fitness overall. We can already see some of these capabilities in the dance filter Sway and the virtual fitness trainer Onyx, both of which demonstrate the use of pose estimation. Another great example is the interactive basketball app HomeCourt, which uses pose estimation to analyse basketball players’ movements and help them improve.
- Home and nursing use, specifically training robots to walk and move more like a human would, and to recognize whether a person is in distress or in need of immediate assistance.
- Self-driving cars can learn to gauge a situation better and make better decisions about avoiding unexpected collisions with pedestrians by reacting to where they are on the road and where they are trying to go.
Human pose estimation using SentiSight.ai
SentiSight.ai offers a pre-trained pose estimation model that localizes human joints in an image and shows the kinematic pose in 2D. The model supports both single-person and multi-person pose estimation, meaning it can detect more than one person in an image.
Starting the pose estimation process is very simple – navigate to Pre-trained models, select Pose estimation from the drop-down list, upload your images and that’s it! The results appear right in front of you.
By ticking a check-box above your images you can manage the visibility of bounding boxes around the predictions. Finally, the results can then be downloaded as images with poses or in JSON format.
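As a rough idea of what working with the downloaded JSON might look like, here is a short sketch. The field names and structure below are hypothetical – consult the SentiSight.ai user guides for the actual schema:

```python
import json

# Illustrative sketch: reading downloaded pose predictions. The JSON
# structure shown here is a made-up example, not the real schema.
raw = '''
[
  {"person": 0, "keypoints": [
      {"label": "nose",       "x": 120, "y": 64,  "score": 0.98},
      {"label": "left_wrist", "x": 88,  "y": 210, "score": 0.76}]},
  {"person": 1, "keypoints": [
      {"label": "nose",       "x": 420, "y": 70,  "score": 0.95}]}
]
'''

predictions = json.loads(raw)
print(f"Detected {len(predictions)} people")
for person in predictions:
    for kp in person["keypoints"]:
        print(f'  person {person["person"]}: {kp["label"]} at '
              f'({kp["x"]}, {kp["y"]}), score {kp["score"]:.2f}')
```

Because the multi-person model returns one entry per detected person, iterating over the top-level list naturally handles images with any number of people.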
As with all of our pre-trained models, the pose estimation model is accessible via our SentiSight.ai web-interface or via REST API which is explained in more detail in our user guides.
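For orientation, a REST API call could be structured roughly as below. The endpoint URL, authentication header name and payload format here are placeholders, not the real SentiSight.ai API – the user guides document the actual values:

```python
import urllib.request

# Hypothetical sketch of calling a pose-estimation REST endpoint.
# ENDPOINT and the "X-Auth-token" header are illustrative placeholders.
API_TOKEN = "your-api-token"
ENDPOINT = "https://platform.example.com/api/predict/pose-estimation"

def build_request(image_path: str) -> urllib.request.Request:
    """Prepare a POST request carrying the raw image bytes."""
    with open(image_path, "rb") as f:
        image_bytes = f.read()
    return urllib.request.Request(
        ENDPOINT,
        data=image_bytes,
        headers={
            "X-Auth-token": API_TOKEN,              # assumed auth header
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )

# Actually sending the request (a network call) would look like:
# with urllib.request.urlopen(build_request("person.jpg")) as resp:
#     result = resp.read()  # keypoint predictions, e.g. as JSON
```

Separating request construction from sending, as above, makes the client easy to test without network access; the same request object works for any image file.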
Estimating human pose is advantageous in a wide variety of use cases, as mentioned above – whether it’s the health and fitness industry, where it improves movement technique and helps minimize injury, recognizing sign language in service institutions, training robots to recognize distress or pain in the medical field, plus many more. The technology will also be at the core of humanoid robots once it matures to the point where it can be widely adopted in practice.
You can start contributing to the future of technology by bringing the AI-fueled innovations introduced by SentiSight.ai into your life.