Torus AI blogs

ViTPose: A simple yet powerful transformer baseline for Human Pose Estimation

ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation. Although no specific domain knowledge is considered in the design, plain vision transformers have shown excellent performance in visual recognition tasks. However, little effort has been made to reveal the potential of such simple structures for pose estimation tasks. In this paper,

MultiFacetEval

Multifaceted Evaluation to probe LLMs in mastering medical knowledge. Yuxuan Zhou et al., June 2024. Large language models (LLMs) have achieved impressive results in medical evaluation benchmarks like MedQA. Despite this, there remains a significant gap between their reported performance and their practical effectiveness in real-world medical scenarios. In this

Depth Anything

Unleashing the Power of Large-Scale Unlabeled Data. Lihe Yang et al., January 2024. Monocular Depth Estimation (MDE): The goal is to estimate depth information from a single image (i.e. monocular). The main applications are robotics, autonomous driving, and VR. It has also been applied to healthcare and

Skeleton project with DWPose

Goal: The goal of this project is to predict a user's keypoints in real time and guide them so that another AI can have the best possible images as input. Among the models we have studied for this problem, DWPose is the SOTA for 2D whole-body pose estimation