Applebot/0.1 does not fully obey robots.txt: it interprets allow entries for Googlebot as implied permission for Applebot.
Applebot - Not Obeying robots.txt
https://www.info-sec.ca/advisories/Applebot.html
Overview
"Applebot is the web crawler for Apple. Products like Siri and Spotlight Suggestions use Applebot. It respects customary robots.txt rules and robots meta tags, and it originates in the 17.0.0.0 net block."
(https://support.apple.com/en-us/HT204683)
Issue
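Applebot/0.1 does not fully obey robots.txt: it interprets allow entries for Googlebot as implied permission for Applebot. In the following example, Googlebot is allowed everywhere and every other crawler is disallowed, yet Applebot treats the Googlebot group as applying to itself: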
User-agent: Googlebot
Disallow:
User-agent: *
Disallow: /
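For comparison, a parser that matches groups strictly by user-agent name denies Applebot under the rules above. The following is a minimal sketch using Python's standard urllib.robotparser module. The helper name can_fetch and the path /private.html are hypothetical placeholders, and the sketch shows the behavior a site owner would likely expect, not how Applebot itself is implemented.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the example above.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:
User-agent: *
Disallow: /
"""


def can_fetch(agent: str, path: str = "/private.html") -> bool:
    """Return True if `agent` may fetch `path` under ROBOTS_TXT.

    `path` is a hypothetical placeholder, not a URL from the advisory.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for agent in ("Googlebot", "Applebot"):
        print(agent, can_fetch(agent))
```

With this parser, Googlebot matches its own group and is allowed, while Applebot falls through to the wildcard group and is denied.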
Impact
This behavior has potential privacy implications for site owners who want their content made available to Google but not to Apple.
Timeline
September 19, 2019 - Provided the details to Apple via product-security@apple.com
September 19, 2019 - Apple sent an automated acknowledgment
October 1, 2019 - Apple responded stating that they are investigating
October 29, 2019 - Asked for a status update
November 4, 2019 - Apple responded stating that they do not see any security implications
July 7, 2020 - Published an advisory to document the issue
Solution
If you do not want Apple to index your site, explicitly disallow Applebot in robots.txt, as in the following example:
User-agent: Applebot
Disallow: /
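One way to audit a site for this is to check whether its robots.txt names Applebot in any User-agent line, since an explicit group is what removes the ambiguity. The following is a rough sketch, assuming a simple line-based scan. The function name has_explicit_group and the sample strings are hypothetical and not part of any Apple tooling.

```python
def has_explicit_group(robots_txt: str, bot: str = "Applebot") -> bool:
    """Return True if some User-agent line names `bot` (case-insensitive)."""
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "user-agent":
            if value.strip().lower() == bot.lower():
                return True
    return False


# The ambiguous rules from the Issue section.
AMBIGUOUS = """\
User-agent: Googlebot
Disallow:
User-agent: *
Disallow: /
"""

# The same rules with the explicit Applebot group appended.
FIXED = AMBIGUOUS + """\
User-agent: Applebot
Disallow: /
"""

if __name__ == "__main__":
    print("ambiguous:", has_explicit_group(AMBIGUOUS))
    print("fixed:", has_explicit_group(FIXED))
```

The check only confirms that an Applebot group exists. It does not evaluate which paths that group allows or disallows.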