Crawling

Crawling, often called spidering, is the automated process of systematically browsing the World Wide Web. Similar to how a spider navigates its web, a web crawler follows links from one page to another, collecting information. These crawlers are essentially bots that use pre-defined algorithms to discover and index web pages, making them accessible through search engines or for other purposes like data analysis and web reconnaissance.
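The link-following behaviour described above can be sketched as a breadth-first traversal. This is a minimal illustration, not a production crawler: the `crawl` function, the `LinkExtractor` class, and the in-memory `SITE` dictionary are all hypothetical names introduced here, and the `fetch` argument stands in for a real HTTP client so the example runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=50):
    """Breadth-first crawl from start_url, staying on the same host.

    `fetch` is a caller-supplied function mapping a URL to its HTML
    (or None if the page cannot be retrieved).
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    visited = []

    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)

        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            absolute = urljoin(url, href).split("#", 1)[0]
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)

    return visited


# Hypothetical three-page site used in place of real HTTP requests.
SITE = {
    "https://example.com/": '<a href="/about">About</a> <a href="/blog">Blog</a>',
    "https://example.com/about": '<a href="/">Home</a> <a href="https://other.example/">Out</a>',
    "https://example.com/blog": '<a href="/about#team">Team</a>',
}

if __name__ == "__main__":
    for page in crawl("https://example.com/", SITE.get):
        print(page)
```

The `seen` set keeps the crawler from revisiting pages that link back to each other, and the host check keeps it from wandering off the target site. A real crawler would also need to respect robots.txt, rate limits, and error handling, none of which are shown here.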


.Well-Known URIs

The .well-known standard, defined in RFC 8615, serves as a standardized directory within a website's root domain. This designated location, typically accessible via the /.well-known/ path on a web server, centralizes a website's critical metadata, including configuration files and information related to its services, protocols, and security mechanisms.

By establishing a consistent location for such data, .well-known simplifies discovery and access for web browsers, applications, and security tools. Clients can locate and retrieve a specific configuration file automatically by constructing the appropriate URL. For instance, to retrieve a website's security contact information, a client would request https://example.com/.well-known/security.txt.
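Constructing such a URL amounts to keeping the scheme and host of the site and replacing everything else with the /.well-known/ path. The following is a small sketch using Python's standard urllib.parse module; the helper name `well_known_url` is an illustrative choice, not part of any specification.

```python
from urllib.parse import urlsplit, urlunsplit


def well_known_url(site_url, suffix):
    """Return the /.well-known/ URL for `suffix` on the origin of `site_url`."""
    parts = urlsplit(site_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {site_url!r}")
    # Keep only scheme and host; path, query, and fragment are discarded
    # because well-known URIs always live at the root of the origin.
    path = "/.well-known/" + suffix.lstrip("/")
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


if __name__ == "__main__":
    # prints https://example.com/.well-known/security.txt
    print(well_known_url("https://example.com/blog/post?id=7", "security.txt"))
```

Any page URL on the site can be passed in, since only its origin is used.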

The Internet Assigned Numbers Authority (IANA) maintains a registry of .well-known URIs, each serving a specific purpose defined by various specifications and standards. Below is a table highlighting a few notable examples:

| URI Suffix | Description | Status | Reference |
| --- | --- | --- | --- |
| security.txt | Contains contact information for security researchers to report vulnerabilities. | Permanent | RFC 9116 |
| change-password | Provides a standard URL for directing users to a password change page. | Provisional | https://w3c.github.io/webappsec-change-password-url/#the-change-password-well-known-uri |
| openid-configuration | Defines configuration details for OpenID Connect, an identity layer on top of the OAuth 2.0 protocol. | Permanent | http://openid.net/specs/openid-connect-discovery-1_0.html |
| assetlinks.json | Used for verifying ownership of digital assets (e.g., apps) associated with a domain. | Permanent | https://github.com/google/digitalassetlinks/blob/master/well-known/specification.md |
| mta-sts.txt | Specifies the policy for SMTP MTA Strict Transport Security (MTA-STS) to enhance email security. | Permanent | RFC 8461 |
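As an illustration of what a client might do with one of these files once retrieved, the sketch below reads the "Name: value" lines of a security.txt file into a dictionary. The function name `parse_security_txt` and the `SAMPLE` contents are hypothetical and exist only for this example. The parser is deliberately simplified: it does not validate required fields or handle PGP-signed files.

```python
def parse_security_txt(text):
    """Parse security.txt content into {field_name: [values]}.

    Field names are lower-cased; a field may appear more than once
    (for example several Contact lines), so values are kept in lists.
    """
    fields = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        fields.setdefault(name.strip().lower(), []).append(value.strip())
    return fields


# Hypothetical file contents for illustration only.
SAMPLE = """\
# Security contact for example.com
Contact: mailto:security@example.com
Contact: https://example.com/report
Expires: 2030-01-01T00:00:00.000Z
Preferred-Languages: en
"""

if __name__ == "__main__":
    for name, values in parse_security_txt(SAMPLE).items():
        print(name, values)
```

Splitting on the first colon only is what keeps values such as mailto: and https: URLs intact.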
