31 January 2023

Not an Unknown Proteaceae

@dianastuder commented "(not an Unknown protea)"
on: https://www.inaturalist.org/observations/11256049

Yes, but the "unknown/Proteaceae" project is an AI product, and this is a case of a misidentification.

It stays in the project for statistical reasons - specifically to calculate how often the AI is wrong.

So there is no need to worry or to mark up the error. The errors can be found simply with this search:
https://www.inaturalist.org/observations?project_id=152984&place_id=any&verifiable=any&captive=any&without_taxon_id=64517

Bear in mind that many of these are not errors, as:
** previous IDs may be blocking a Proteaceae ID
** there are probably Multiple Species in one Observation marked up as "Life"
** there are duplicates that should be deleted marked up as "Life"

Can anyone think of any other reasons why "Errors" might not be errors? And do we want to mark these up in some way?

For instance, if we add the project https://www.inaturalist.org/projects/multiple-species-per-observation to all observations with multiple species, then we can refine our URL to:
https://www.inaturalist.org/observations?project_id=152984&place_id=any&verifiable=any&captive=any&without_taxon_id=64517&not_in_project=multiple-species-per-observation
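The two search URLs above differ only in the last filter, so they can be assembled from their parts. A minimal sketch in Python, using only the parameters that appear in the URLs quoted in this post; the function name `build_search_url` is made up for illustration and is not part of iNaturalist:

```python
from urllib.parse import urlencode

BASE_URL = "https://www.inaturalist.org/observations"


def build_search_url(project_id, without_taxon_id, not_in_project=None):
    """Build an observation search URL like the ones quoted in this post.

    Hypothetical helper: it only strings together the query parameters
    already visible in the post's URLs.
    """
    params = {
        "project_id": project_id,              # 152984 in the post's URLs
        "place_id": "any",
        "verifiable": "any",
        "captive": "any",
        "without_taxon_id": without_taxon_id,  # 64517 in the post's URLs
    }
    if not_in_project is not None:
        # Optional refinement: leave out observations already in another project.
        params["not_in_project"] = not_in_project
    return f"{BASE_URL}?{urlencode(params)}"


errors_url = build_search_url(152984, 64517)
refined_url = build_search_url(152984, 64517, "multiple-species-per-observation")
print(errors_url)
print(refined_url)
```

Further exclusions (for example a project for duplicates, if one exists) would need their own parameters, which are not covered here.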

Posted on January 31, 2023 09:37 AM by tonyrebelo | 2 comments

25 December 2022

Getting help from the AI to find Proteas in the Unidentified Backlog

You might have noticed a few older observations of proteas popping out of the woodwork. Many of these are due to the AI identifying them, thanks to the ministrations of @jeanphilippeb and his projects, of which https://www.inaturalist.org/projects/unknown-proteaceae is pertinent to us here.

You can find more of @jeanphilippeb's ideas here:
https://www.inaturalist.org/journal/jeanphilippeb/73398-draft-for-creating-projects-for-unknown-observations

So is it working?
Before we evaluate, we have to admit that we have no idea how many Proteaceae are being missed (false negatives). So we can only evaluate the positive identifications, without any notion as to which species, features or situations are being missed.

OK: so to date 822 observations have been retrieved from the forgotten or trashed pile.
I have been through them (you can too, here: https://www.inaturalist.org/observations/identify?quality_grade=casual%2Cresearch%2Cneeds_id&verifiable=any&project_id=152984&place_id=any ) and we have:

  • 137 observations are still not identified as Proteaceae. But
  • 54 of these are Proteaceae and are either multiple-species observations (with different pictures of many species), or Proteaceae with correct identifications but conflicting community identifications due to alternative incorrect identifications, or clearly Proteaceae but with some other organism as the focus of the ID (e.g. a sunbird on a pincushion). You can help resolve these by clicking here
    So approximately 14% are False Positives: identifications as Proteaceae when they were not.

So 86% correct is quite something. (How many exams did you get 85% for at school?)

What were they?

For southern Africa, 300 observations comprised 117 species, with these dominating:
16 Leucadendron laureolum × salignum Safari Sunset and Similar Cultivars
12 Leucadendron salignum Common Sunshine Conebush
11 Brabejum stellatifolium Wild Almond
9 Protea caffra caffra Common Sugarbush
9 Leucadendron rubrum Spinning-top Conebush
8 Protea nitida Wagon Tree
7 Faurea saligna African Beechwood
7 Leucadendron laureolum Golden Conebush
7 Aulax umbellata Broadleaf Featherbush
7 Leucospermum × hybridum Pincushion Hybrids
6 Leucadendron argenteum Silvertree
6 Leucospermum cordifolium The Pincushion
6 Protea repens Common Sugarbush
5 Protea laurifolia Grey Sugarbush
5 Leucadendron xanthoconus Sickleleaf Conebush
5 Protea roupelliae Silver Sugarbush
5 Leucadendron galpinii Hairless Conebush

Note that these are among the most commonly recorded species. So is the AI only identifying the common species while the rarer species slip through the cracks? Probably not, given that we have 117 species. But it does suggest, perhaps, that the observations were overlooked due to workload and random issues, rather than that people are having difficulties with some species and thus "ignoring" them. (Of course, the AI has not been trained on the rare species, so they may still be in the unidentified pile, but hopefully as the AI gets trained, and as more records of species are received, these will be detected at a later date.)
Some 57 identifiers have been involved, but many of these were to "plant" so how many of these contributed anything valuable to the ultimate identification is difficult to evaluate. Remember that despite these higher IDs, these observations were not identified until they were rescued by the AI.
Some 286 (95%) are identified to species or lower, and most of the remainder require an agreement to move them to species level. So those that are proteas are easily identifiable - it is not that they are problematic observations for identification.

Outside of southern Africa, we have the complication that I don't really know the Australian species well. We also have the complication that the AI is trained on southern African Proteaceae, so won't detect the Australian species anyway. Still, we do have some species as alien invaders or as garden plants, so the AI is aware of those and will identify them.

We have 286 observations of 25 species.

  • 83 observations were only made to generic level as follows:
    65 Grevillea Grevilleas
    11 Banksia Banksias
    4 Macadamia Macadamias
    2 Stenocarpus Firewheels
    1 Telopea Waratahs

  • 203 were identified to species (by 4 identifiers)
    61 Leucadendron laureolum × salignum Safari Sunset and Similar Cultivars
    42 Leucospermum × hybridum Pincushion Hybrids
    32 Grevillea robusta Silky Oak
    15 Protea × hybrida Sugarbush Hybrids
    11 Leucospermum cordifolium × patersonii High Gold and Derived Cultivars
    7 Protea cynaroides King Protea
    4 Leucadendron argenteum Silvertree
    4 Hakea drupacea Sweet Needlebush
    4 Leucadendron discolor × gandogeri Cloudbank Jenny
    4 Leucospermum lineare × reflexum Brandi Dela Cruz
    4 Leucadendron × hybridum Conebush Hybrids
    3 Leucadendron laureolum × strobilinum Goldstrike
    2 Banksia ericifolia Heath-leaved Banksia
    1 each of Stenocarpus sinuatus Firewheel Tree, Adenanthos sericeus Woolly Bush, Embothrium coccineum Chilean Fire Bush, Leucospermum cordifolium The Pincushion, Protea laurifolia Grey Sugarbush, Serruria florida Blushing Bride, Aulax umbellata Broadleaf Featherbush, Leucadendron galpinii Hairless Conebush, Leucadendron eucalyptifolium Gumleaf Conebush, Leucospermum mundii Langeberg Pincushion
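
As a quick check, the tallies above can be added up to confirm that they match the 83 genus-level and 203 species-level observations, and the 286 total. A small sketch in Python; the counts are copied from the lists above, and the variable names are my own:

```python
# Genus-level counts, copied from the list above.
genus_counts = {
    "Grevillea": 65,
    "Banksia": 11,
    "Macadamia": 4,
    "Stenocarpus": 2,
    "Telopea": 1,
}

# Species-level counts in the order listed above (Safari Sunset first,
# Banksia ericifolia last), followed by the taxa recorded once each.
species_counts = [61, 42, 32, 15, 11, 7, 4, 4, 4, 4, 4, 3, 2]
singletons = 10  # the "1 each of" taxa

genus_total = sum(genus_counts.values())
species_total = sum(species_counts) + singletons
grand_total = genus_total + species_total

print(genus_total, species_total, grand_total)
```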

As above, this more or less mirrors the abundance of species recorded so far on iNaturalist, so there are no real surprises here. What is important to remember is that the AI is not trained on hybrids, so it is detecting the hybrids "in error" for other species in the family. Note how many hybrids feature near the top!
Note also that Grevilleas and Banksias are very popular.

253 of these were from the USA (235 82% from California), 9 from Europe, 8 from South America, 0 from Australia.

The paucity of identifiers is interesting. Proteas feature prominently in gardens, and there are hybridization schemes producing new cultivars in Hawaii (and California?), so it is surprising that there are so few identifiers. On the other hand, perhaps horticulturalists do not use iNaturalist and therefore won't be aware of the ID gap.

So all in all, a great fishing expedition! The AI tool is certainly most useful in pulling lost observations from oblivion, and I can see it becoming an essential and eventually a standard tool for assisting with identifications.

It is worth noting that 361 (62% of the 586 Proteaceae) are marked casual. It is thus not entirely unexpected that these were not identified, as they are not in the Needs ID queue and are thus easily overlooked.
Some 162 are Needs ID (28%) and 38 Research Grade (6%). You can help with getting observations to Research Grade here
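
The quality-grade percentages can be reproduced from the counts given here, taking the 586 Proteaceae as the base (the 300 southern African plus the 286 other observations reported above). A short sketch in Python; the dictionary keys are just labels of my own. The three grades add up to 561, so 25 observations are not covered by these figures, and the post does not say where they fall:

```python
total_proteaceae = 586  # 300 southern African + 286 from elsewhere

# Counts as reported in the post.
grades = {"casual": 361, "needs_id": 162, "research": 38}

for grade, count in grades.items():
    # Percentages rounded to the nearest whole number, as in the post.
    print(f"{grade}: {count} ({count / total_proteaceae:.0%})")

# Observations not accounted for by the three reported grades.
unaccounted = total_proteaceae - sum(grades.values())
print("unaccounted:", unaccounted)
```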

Posted on December 25, 2022 08:07 PM by tonyrebelo | 2 comments

Please help us identify our hidden Proteaceae backlog

Please use this link to help clear the backlog of Proteaceae requiring identification.

https://www.inaturalist.org/observations/identify?verifiable=any&project_id=152984&place_id=any

Any help at all will be appreciated.

If you need a refresher on how to use the Identify Tool, please see this tutorial:
https://vimeo.com/246153496

Posted on December 25, 2022 07:57 PM by tonyrebelo | 1 comment
