Biodiversity researchers tested vision systems on how well they could retrieve relevant nature images. More advanced models handled simple queries well but struggled with research-specific prompts.
Combining next-token prediction and video diffusion in computer vision and robotics
A new method can train a neural network to sort corrupted data while anticipating next steps. It can make flexible plans for robots, generate high-quality video, and help AI agents navigate digital environments.
AI pareidolia: Can machines spot faces in inanimate objects?
New dataset of “illusory” faces reveals differences between human and algorithmic face detection, links to animal face recognition, and a formula predicting where people most often perceive faces.
Precision home robots learn with real-to-sim-to-real
CSAIL researchers introduce a novel approach allowing robots to be trained in simulations of scanned home environments, paving the way for customized household automation accessible to anyone.
Looking for a specific action in a video? This AI-based method can find it for you
A new approach could streamline virtual training processes or aid clinicians in reviewing diagnostic videos.
AI generates high-quality images 30 times faster in a single step
Novel method makes tools like Stable Diffusion and DALL-E-3 faster by simplifying the image-generating process to a single step while maintaining or enhancing image quality.
Reasoning and reliability in AI
PhD students interning with the MIT-IBM Watson AI Lab look to improve natural language usage.
Multiple AI models help robots execute complex plans more transparently
A multimodal system uses models trained on language, vision, and action data to help robots develop and execute plans for household, construction, and manufacturing tasks.
Image recognition accuracy: An unseen challenge confounding today’s AI
“Minimum viewing time” benchmark gauges image recognition complexity for AI systems by measuring the time needed for accurate human identification.
A flexible solution to help artists improve animation
This new method draws on 200-year-old geometric foundations to give artists control over the appearance of animated characters.