How we reached 97% on top-level categories and 92% on bottom-level categories with a multilingual dataset for a customer of Pumice.ai.
December 20, 2024
We dive deep into the challenges we face in EMR data extraction and explain the pipelines, techniques, and models we use to solve them.
Width.ai created a new retail product image classification model that outperforms the SOTA results from CLIP and Fashion CLIP on the most popular dataset in the domain. These models are commonly used in product matching use cases where photos are taken with lower resolution and zero control for noise and image angles.
Explore the world of high-resolution image inpainting with the AOT-GAN model. This blog post delves into the challenges, solutions, and impressive results of the AOT-GAN model, providing a comprehensive guide on how to install, use, and evaluate this powerful tool for image inpainting.
Explore how the implementation of Fashion CLIP, a domain-specific AI model, improves product similarity search in e-commerce by understanding complex product attributes, enhancing image-text alignment, and providing more accurate search results.
Explore the intricacies of the Seamless Tile Inpainting extension, designed to make image tiles seamless through inpainting. This detailed guide covers installation, usage, comparison with the Asymmetric Tiling extension, and a deep dive into the underlying Python script. Ideal for anyone working in digital imaging or 3D modeling looking to create high-quality seamless tiles.
Explore our comprehensive guide on using Corridor Crawler Outpainting, a powerful extension for generating and animating intricate hallways. Learn about its installation, usage, and the Stable Diffusion model that powers it. Discover its wide range of applications in digital art, game development, architecture, VR/AR, and more.
Find out how you can improve content marketing and social media campaigns by creating unique, relevant images using image generation models like DALL-E Mini.
Discover the power of text-guided open-vocabulary segmentation using large language models like GPT-4 & ChatGPT for automating image and video processing tasks.
Learn how CLIPSeg segmentation, in combination with GPT-4 and ChatGPT, can enable diverse applications from medical image diagnosis to remote sensing.
A deep dive into how we reached SOTA accuracy in product similarity matching through a custom fine-tuning pipeline that refines the CLIP model for image similarity.
A deep look at how recurrent feature reasoning outperforms other image inpainting methods for difficult use cases and popular datasets.
Discover how transformer networks are revolutionizing image and video segmentation, and get insights on modern semantic segmentation vs. instance segmentation.
Discover how the state-of-the-art mask-aware transformer produces visually stunning and semantically meaningful images and how it stacks up against Stable Diffusion & DALL-E for large-hole inpainting.
Discover the capabilities of zero-shot object detection, which enables anyone to use a model out-of-the-box without any training and generate production-grade results.
What facial expression recognition is and which SOTA models are being used in production today.
Get a simple TensorFlow facial recognition model up & running quickly with this tutorial aimed at using it in your personal spaces on smartphones & IoT devices.
Explore accurate classification algorithms using the latest innovations in deep learning, computer vision, and natural language processing.
Learn what human activity recognition means, how it works, and how it’s implemented in various industries using the latest advances in artificial intelligence.
What image classification is and how we build production-level TensorFlow image classification systems for recognizing various products on a retail shelf.
How to build an image classification model in PyTorch with a real-world use case, and how you can perform product recognition with image classification.
Smart farming using computer vision and deep learning provides the most promising path forward in the slow-moving industry of agriculture.
Apply AI to your favorite sport with this guide. Learn how automated ball tracking can change the game for coaches and players.
Warehouse automation plays a crucial role across your supply chain. Learn how machine learning and AI software can be integrated into your warehouse automation stack.
A complete walkthrough of how to use visual search in ecommerce stores to create more sales, with real examples of companies already using it.
Automating the extraction of data from invoices can reduce the stress of your accountants by finding inaccuracies, digitizing paper invoices, and more.
Understand how to extract text from images via Python without Tesseract and how we execute robust text extraction and document understanding for your business.
Find out how machine learning in medical imaging is transforming the healthcare world and making it more efficient with three use cases.
Discover ways that machine learning in health care informatics has become indispensable. Review the results of two case studies and consider two key challenges.
5 ways you can use product matching software in ecommerce to create real value that raises your sales metrics and improves your workflow operations.
Product recognition software has tremendous potential to improve your profits and slash your costs in your retail business. Find out just how useful it is.
We built a custom ML pipeline to automate information extraction and fine-tuned it for the legal document domain.
Dlib is a versatile and widely adopted facial recognition library, with perhaps an ideal balance of resource usage, accuracy, and latency, suited for real-time face recognition in mobile app development. It's becoming a common and possibly even essential library in the facial recognition landscape and, even in the face of more recent contenders, is a strong candidate for your computer vision and facial recognition or detection framework.
Best Place For was looking for an image-recognition-based software solution that could detect and identify different food dishes, drinks, and menu items in images sourced from blogs and Instagram. The images would be pulled from restaurant locations on Instagram, and different menu items would be identified in the images. The solution had to handle both high- and low-quality images and still perform at the highest production level, while accounting for runtime as well as accuracy.
Inventory automation with computer vision: how to use computer vision in online retail to automate backend inventory processes.