We released a new computer vision model today. It has 71,286 taxa, up from 69,966. This new model (v2.1) was trained on data exported last month, on January 15th, and added 1,770 new taxa.
Why v2.1 and not v1.7? As we mentioned in August, we have been training our new computer vision models using a transfer learning strategy.
All of the models we have released since August (v1.1 through v1.6), one every month, were trained from the same source model. We call that source model v1.0. The v1.0 model started training in 2021 on 55,000 taxa and 27 million photos, and trained for about 4 months (80 epochs).
While we've been transfer learning new production models, we've also been working on a new source model. We call this new source model v2.0. The v2.0 model started training in 2022 on 60,000 taxa and 30 million photos, and trained for about 9 months (200 epochs). All of the additional data and training time have produced a better source model, which in turn is making better final production models. The model we released today (v2.1) is the first model based on the v2.0 source model.

Note from the figure below that v2.0 itself will never be released, since it was trained on data and a taxonomy that are now over 9 months out of sync; that is why we are releasing v2.1, trained (via transfer learning) on data and a taxonomy from January 15th. Also note that we trained v1.7 as a backup in case v2.1 didn't evaluate well. But since v2.1 performed significantly better, we won't be releasing v1.7 and will continue releasing models derived from the v2.0 base until we move to v3.0 in the next 9 to 12 months.
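For readers curious about what the transfer-learning step looks like in practice, here is a minimal, illustrative sketch in PyTorch. It is not iNaturalist's actual training code: the architecture (ResNet-50 as a stand-in), the hyperparameters, and the `fine_tune` helper are all assumptions used only to show the idea of reusing a slowly trained source model's features and fitting a new classification head to an updated taxonomy.

```python
# Illustrative sketch of transfer learning from a long-trained "source" model
# to a fresh production model on an updated taxonomy. Architecture and
# hyperparameters are assumptions, not iNaturalist's actual setup.
import torch
import torch.nn as nn
from torchvision import models

NUM_TAXA = 71_286  # size of the updated taxonomy (from this release)

# 1. Start from a source model trained long and slow on an older taxonomy
#    (the analogue of v2.0 in this post); here a pretrained ResNet-50 stands in.
source_model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)

# 2. Replace the classification head so it matches the current taxonomy,
#    keeping the learned feature extractor.
source_model.fc = nn.Linear(source_model.fc.in_features, NUM_TAXA)

# 3. Fine-tune on the fresh data export, typically for far fewer epochs
#    than the source model was trained for.
optimizer = torch.optim.SGD(source_model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

def fine_tune(loader, epochs=5):
    """Fine-tune the re-headed model; `loader` yields (image batch, taxon ids)."""
    source_model.train()
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(source_model(images), labels)
            loss.backward()
            optimizer.step()
```

Because only the fine-tuning step has to be repeated, each new production model can be turned around in weeks rather than the months it takes to train a source model from scratch.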
Thanks to NVIDIA for the generous hardware grant that made all of this training possible!
Taxa differences from the previous model
The charts below summarize these 1,465 new taxa using the same groupings we described in past release posts.
By category, most of these 1,465 new taxa were insects and plants.
Here are examples of new species added in each category:
Click on the links to view these taxa on the Explore page, rendered as species lists. Remember, to check whether a particular species is included in the currently live computer vision model, you can look at the “About” section of its taxon page.
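If you prefer to check programmatically, here is a small sketch against the public iNaturalist API. The `vision` field read below is an assumption about the /v1/taxa response format and may not be present; the “About” section of the taxon page remains the authoritative place to look.

```python
# Hedged sketch: fetch a taxon record from the public iNaturalist API and read
# a `vision` flag. The field name is an assumption; the taxon page's "About"
# section is the documented source of truth for model inclusion.
import requests

def taxon_in_vision_model(taxon_id: int):
    """Return the taxon's `vision` flag if the API exposes one, else None."""
    resp = requests.get(
        f"https://api.inaturalist.org/v1/taxa/{taxon_id}", timeout=10
    )
    resp.raise_for_status()
    results = resp.json().get("results", [])
    return results[0].get("vision") if results else None

# Replace 12345 with the taxon id you want to check.
print(taxon_in_vision_model(12345))
```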
We couldn't do it without you
Thank you to everyone in the iNaturalist community who makes this work possible! Sometimes the computer vision suggestions feel like magic, but it’s truly not possible without people. None of this would work without the millions of people who have shared their observations and the knowledgeable experts who have added identifications.
In addition to adding observations and identifications, here are other ways you can help:
- Share your Machine Learning knowledge: iNaturalist’s computer vision features wouldn’t be possible without learning from many colleagues in the machine learning community. If you have machine learning expertise, these are two great ways to help:
  - Participate in the annual iNaturalist challenges: Our collaborators Grant Van Horn and Oisin Mac Aodha continue to run machine learning challenges with iNaturalist data as part of the annual Computer Vision and Pattern Recognition (CVPR) conference. By participating you can help us all learn new techniques for improving these models.
  - Start building your own model with the iNaturalist data now: If you can’t wait for the next CVPR conference, thanks to the Amazon Open Data Program you can start downloading iNaturalist data to train your own models now (see the sketch at the end of this post). Please share what you’ve learned by contributing to iNaturalist on GitHub.
- Donate to iNaturalist: For the rest of us, you can help by donating! Your donations help offset the substantial staff and infrastructure costs associated with training, evaluating, and deploying model updates. Thank you for your support!