Recently in AI | Jan 2024

January delivered a dense batch of AI research results, from annotation-free image segmentation to LLMs making verifiably correct mathematical discoveries. Here are the highlights.

Google and Georgia Tech introduce DiffSeg, an unsupervised zero-shot segmentation method that clusters the self-attention maps of a pretrained diffusion model, with no annotations required. On COCO-Stuff-27 it beats the prior unsupervised zero-shot state of the art by an absolute 26% in pixel accuracy and 17% in mean IoU.

https://sites.google.com/view/diffseg/home
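The core trick is easy to caricature. Here is a runnable toy of the attention-merging intuition: treat each spatial location's self-attention map as a probability distribution over the image and greedily group locations whose distributions are close in symmetric KL divergence. The shapes, the running-mean merge, and the threshold tau are illustrative assumptions, not the paper's exact algorithm or hyperparameters.

```python
import numpy as np

def kl(p: np.ndarray, q: np.ndarray) -> float:
    """KL divergence between two discrete distributions."""
    eps = 1e-8
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def merge_attention_maps(attn: np.ndarray, tau: float = 0.5) -> np.ndarray:
    """attn: (N, D) array, one attention distribution per anchor location.
    Assigns each anchor to the nearest existing segment (by symmetric KL
    against the segment's running-mean map) or starts a new one."""
    labels = -np.ones(len(attn), dtype=int)
    protos: list[np.ndarray] = []   # running mean attention map per segment
    counts: list[int] = []
    for i, a in enumerate(attn):
        best, best_d = -1, tau
        for c, p in enumerate(protos):
            d = 0.5 * (kl(a, p) + kl(p, a))    # symmetric KL
            if d < best_d:
                best, best_d = c, d
        if best == -1:                          # nothing close enough: new segment
            protos.append(a.copy())
            counts.append(1)
            labels[i] = len(protos) - 1
        else:                                   # merge into the closest segment
            counts[best] += 1
            protos[best] += (a - protos[best]) / counts[best]
            labels[i] = best
    return labels

# Toy usage: 16 anchor locations, each attending over 64 image positions.
rng = np.random.default_rng(0)
attn = rng.dirichlet(np.ones(64), size=16)
print(merge_attention_maps(attn))
```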

DeepMind's FunSearch pairs a pretrained Large Language Model (LLM) with an automated evaluator in an evolutionary loop: the LLM proposes candidate programs, the evaluator scores them, and the best survive as prompts for the next round. Per the Nature paper, this is the first time an LLM has produced verifiably correct discoveries for challenging open problems, including new constructions for the cap set problem and improved online bin-packing heuristics.

https://www.nature.com/articles/s41586-023-06924-6
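The loop behind the headline is compact enough to sketch end to end. In this runnable toy the "LLM" is just a mutation of one numeric literal and the "open problem" is maximizing a quadratic; the real system uses island-based evolution with few-shot prompting, but the propose-verify-evolve shape below is the one the paper describes.

```python
import random
import re

def score(program_src: str) -> float:
    """Automated evaluator: run the candidate and return a verifiable
    score (higher is better). Real deployments must sandbox this step;
    exec'ing generated code is unsafe."""
    try:
        ns: dict = {}
        exec(program_src, ns)            # candidate must define solve()
        return float(ns["solve"]())
    except Exception:
        return float("-inf")             # broken programs are discarded

def propose_program(parent_src: str) -> str:
    """Toy stand-in for the pretrained LLM: bump one numeric literal.
    FunSearch instead prompts an LLM with high-scoring earlier programs
    as few-shot context and asks for a new variant."""
    bump = lambda m: str(int(m.group()) + random.choice([-1, 1]))
    return re.sub(r"\d+", bump, parent_src, count=1)

def funsearch(seed_src: str, iterations: int = 300) -> str:
    population = [(score(seed_src), seed_src)]
    for _ in range(iterations):
        parent = max(population)[1]      # greedily evolve the best so far
        child = propose_program(parent)
        s = score(child)
        if s > float("-inf"):            # the evaluator filters out junk
            population.append((s, child))
    return max(population)[1]

seed = "def solve():\n    x = 0\n    return -(x - 42) ** 2\n"
best = funsearch(seed)
print(best, "score:", score(best))       # hill-climbs toward x = 42
```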

The Allen Institute for AI, with the University of Illinois Urbana-Champaign and the University of Washington, introduces Unified-IO 2, an autoregressive multimodal model that tokenizes images, text, audio, video, and actions into a single shared vocabulary. Trained from scratch on that unified stream, it posts strong results across more than 30 benchmarks, from image and text understanding to video, audio, and robotic manipulation.

https://arxiv.org/abs/2312.17172
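What makes a model like this possible is the framing: every modality is discretized into tokens drawn from one shared vocabulary, so a single autoregressive transformer can read and write any mix of them. The toy below illustrates that framing only; the offsets and "tokenizers" are trivial stand-ins for the real learned codecs (e.g. a VQ image codec), not Unified-IO 2's actual scheme.

```python
# Carve one shared vocabulary into per-modality ranges (illustrative).
VOCAB_OFFSETS = {"text": 0, "image": 50_000, "audio": 60_000, "action": 70_000}

def tokenize_text(s: str) -> list[int]:
    # Byte-level toy; real models use a learned subword tokenizer.
    return [VOCAB_OFFSETS["text"] + b for b in s.encode("utf-8")]

def tokenize_image(pixels: list[int]) -> list[int]:
    # 8 coarse intensity bins per pixel; real models use a discrete codec.
    return [VOCAB_OFFSETS["image"] + (p // 32) for p in pixels]

def tokenize_action(joint_angles: list[float]) -> list[int]:
    # Discretize continuous robot actions into 256 bins over [-1, 1].
    return [VOCAB_OFFSETS["action"] + int((a + 1) * 127.5) for a in joint_angles]

# One interleaved sequence a single transformer would model autoregressively:
sequence = (tokenize_text("pick up the red block")
            + tokenize_image([12, 200, 31, 255])
            + tokenize_action([0.1, -0.4]))
print(len(sequence), "tokens:", sequence[:8], "...")
```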

DeepMind's AlphaGeometry, a neuro-symbolic prover for Euclidean geometry, couples a symbolic deduction engine with a language model that proposes auxiliary constructions whenever deduction stalls. It solves 25 of the 30 olympiad geometry problems in the paper's benchmark, approaching the average performance of an International Mathematical Olympiad (IMO) gold medallist.

https://www.nature.com/articles/s41586-023-06747-5
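The control flow is worth seeing on its own: deduce everything possible from the premises, and if the goal is still out of reach, ask the language model for an auxiliary point or line and deduce again. The runnable schematic below keeps only that loop; the facts, rules, and "language model" are toy stand-ins, not DeepMind's DD+AR engine or trained model.

```python
def symbolic_closure(facts: set[str], rules) -> set[str]:
    """Forward chaining: apply rules until no new facts appear."""
    changed = True
    while changed:
        changed = False
        for rule in rules:
            new = rule(facts) - facts
            if new:
                facts |= new
                changed = True
    return facts

def lm_suggest(facts: set[str]) -> str:
    """Stand-in for the language model (trained on synthetic proofs in
    the paper) that proposes an auxiliary construction."""
    return "M is the midpoint of AB"

def prove(premises: set[str], goal: str, rules, max_constructions: int = 3) -> bool:
    facts = set(premises)
    for _ in range(max_constructions + 1):
        facts = symbolic_closure(facts, rules)
        if goal in facts:
            return True                      # pure deduction reached the goal
        facts.add(lm_suggest(facts))         # stuck: add a construction, retry
    return False

# Toy rules: the goal only becomes derivable after the auxiliary point M.
rules = [
    lambda f: {"AM = MB"} if "M is the midpoint of AB" in f else set(),
    lambda f: {"triangle AMC == triangle BMC"}
              if {"AM = MB", "CM = CM"} <= f else set(),
]
print(prove({"CM = CM"}, "triangle AMC == triangle BMC", rules))  # True
```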

Google's Articulate Medical Intelligence Explorer (AMIE) is an LLM optimized for diagnostic medical dialogue. In simulated text-based consultations with patient actors, it performed on par with, and on many evaluated axes better than, board-certified primary care physicians.

http://blog.research.google/2024/01/amie-research-ai-system-for-diagnostic_12.html
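The post credits much of AMIE's skill to a self-play training environment: a doctor agent and a patient simulator hold a consultation, a critic rates the transcript, and well-rated dialogues feed back into fine-tuning. Below is a minimal sketch of that loop with all three roles as toy stand-ins for LLM calls; the scenario, turn structure, and scoring are assumptions for illustration.

```python
from typing import Callable

Agent = Callable[[list[str]], str]

def simulate_consultation(doctor: Agent, patient: Agent, turns: int = 2) -> list[str]:
    """Alternate doctor/patient turns, conditioning each on the transcript."""
    transcript: list[str] = []
    for _ in range(turns):
        transcript.append("Doctor: " + doctor(transcript))
        transcript.append("Patient: " + patient(transcript))
    return transcript

# Toy agents; in self-play these are LLMs conditioned on a sampled
# clinical scenario plus the dialogue so far.
doctor: Agent = lambda t: "Can you describe the pain?" if t else "What brings you in today?"
patient: Agent = lambda t: "A dull ache in my lower back for two weeks."
critic = lambda t: 0.8   # stand-in for a rubric-based dialogue rating

transcript = simulate_consultation(doctor, patient)
if critic(transcript) > 0.5:                 # keep good dialogues for tuning
    print("\n".join(transcript))
```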

NVIDIA's ChatQA is a family of conversational QA models built with a two-stage instruction-tuning recipe: general supervised instruction tuning first, then context-enhanced tuning on conversational QA data. The largest model, ChatQA-70B, edges past GPT-4 on the authors' average score across conversational QA benchmarks.

https://arxiv.org/abs/2401.10225
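Since the recipe is the paper's main contribution, the two stages are worth spelling out: stage one teaches instruction following, stage two re-tunes on QA examples with the supporting documents prepended, teaching grounded multi-turn answering. The runnable skeleton below shows only that data flow; the datasets, formats, and finetune() are toy stand-ins, not NVIDIA's code.

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    stages: tuple = ()

def finetune(model: Model, examples: list[str], stage: str) -> Model:
    # Stand-in for a real SFT loop (next-token cross-entropy over the
    # rendered examples); here we only record that the stage happened.
    print(f"fine-tuning {model.name} on {len(examples)} examples: {stage}")
    return Model(model.name, model.stages + (stage,))

# Stage 1: plain instruction/response pairs.
stage1 = [f"Instruction: {q}\nAnswer: {a}"
          for q, a in [("Summarize X.", "X is ..."),
                       ("Translate Y.", "Y means ...")]]

# Stage 2: same idea, but each example prepends the documents the answer
# must be grounded in, plus the multi-turn dialogue history.
stage2 = [f"Context: {ctx}\nDialogue: {hist}\nAnswer: {a}"
          for ctx, hist, a in [("Doc: GPUs are ...",
                                "User: what is a GPU?",
                                "A GPU is ...")]]

model = Model("base-llm")
model = finetune(model, stage1, stage="general instruction tuning")
model = finetune(model, stage2, stage="context-enhanced QA tuning")
print(model.stages)
```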

Finally, researchers from UCLA, the University of Washington, and Microsoft unveil MathVista, a benchmark for mathematical reasoning in visual contexts. GPT-4V is the top performer at 49.9% accuracy but still trails humans (60.3%), with the gap concentrated in understanding complex figures and carrying out rigorous multi-step reasoning.

https://mathvista.github.io/
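For anyone wanting to reproduce the headline numbers, the public testmini split keeps the harness small. The sketch below is a hedged outline: the Hugging Face dataset id AI4Math/MathVista and the field names are drawn from the project page and should be verified against it, and predict() is a trivial baseline to replace with a real vision-language model call.

```python
from datasets import load_dataset

def predict(image, question: str, choices) -> str:
    # Placeholder "model": pick the first option if multiple choice, else "0".
    return choices[0] if choices else "0"

# testmini is the ~1,000-example public evaluation split.
ds = load_dataset("AI4Math/MathVista", split="testmini")

correct = sum(
    predict(ex["decoded_image"], ex["question"], ex["choices"]) == str(ex["answer"])
    for ex in ds
)
print(f"testmini accuracy: {correct / len(ds):.1%}")
```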