Skip to content

Commit 6b01e9e

Browse files
committed
[Radar] Add crawlers section to glossary
1 parent 349a6ac commit 6b01e9e

File tree

1 file changed

+8
-0
lines changed

1 file changed

+8
-0
lines changed

src/content/docs/radar/glossary.mdx

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -244,6 +244,14 @@ Each entry on the Verified Bots list exists because a corresponding IP address w
244244

245245
The data displayed on domain-specific geographic traffic patterns is based solely on data from our recursive DNS services. All data displayed is in accordance with our privacy policies and commitments. This data may include attack traffic and cross-origin requests.
246246

247+
## Web Crawlers
248+
249+
[Web crawlers](https://www.cloudflare.com/learning/bots/what-is-a-web-crawler/) are a type of bot that browses the Internet to collect and index website content. They are used by search engines like Google or Bing to make pages discoverable in search results.
250+
251+
They are also used by AI platforms, either to gather content for training large language models, or to retrieve up-to-date information for AI assistants. In both search and AI cases, crawlers work by following links from one page to another, creating a map of online content.
252+
253+
Radar's crawl-to-refer ratio metric is calculated by first mapping crawl requests for HTML pages based on the `User-Agent` header, and referral requests for HTML pages based on the `Referer` header, by platform (e.g., the ratio for Google is based on crawl requests from Googlebot, and referral requests from Google platforms). As with other traffic metrics on Radar, the aggregation resolution for the ratio data is based on the length of the selected timeframe. Additionally, note that traffic referred by native apps may not include a `Referer` header. As such, because the referral counts only include traffic from Web-based tools, these calculations may overstate the respective ratios, but it is unclear by how much.
254+
247255
## WHOIS
248256

249257
WHOIS is a standard for publishing the contact and nameserver information for all registered domains. Each registrar maintains their own WHOIS service. Anyone can query the registrar's WHOIS service to reveal the data behind a given domain.

0 commit comments

Comments
 (0)