Network Working Group                                         C. A. Wood
Internet-Draft                                                Cloudflare
Intended status: Standards Track                          16 August 2023
Expires: 17 February 2024

The Remote Rate Limiting Protocol
draft-wood-remote-rate-limiting-latest

Abstract

This document specifies the remote rate limiting protocol. It is designed to enable collaborative rate limiting between privacy proxy providers and target services. It is one mechanism amongst others for dealing with abusive traffic that negatively affects target services.

Discussion Venues

This note is to be removed before publishing as an RFC.

Source for this draft and an issue tracker can be found at https://github.com/chris-wood/draft-wood-remote-rate-limiting.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 17 February 2024.

Copyright Notice

Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.

Table of Contents

1. Introduction
2. Terminology
3. Threat Model
4. Overview
  4.1. Offline Registration
  4.2. Online Rate Limit Enforcement
    4.2.1. Authentication
    4.2.2. Validation and Enforcement
  4.3. Limitations
5. Applications
  5.1. OHTTP DoS
  5.2. Port Scanning DoS
  5.3. Volumetric DoS
6. Security Considerations
7. IANA Considerations
8. References
  8.1. Normative References
  8.2. Informative References
Appendix A. Comparison to DOTS
Appendix B. Acknowledgements
Author's Address

1. Introduction

Privacy proxy systems, such as those built on MASQUE [CONNECT-UDP], Oblivious HTTP [OHTTP], and WireGuard [WIREGUARD], provide one common feature: they mask a client's true IP address from the targets with which clients interact through these proxies. While this offers meaningful privacy benefits to clients, it complicates common operational security practices, such as using IP addresses to help identify and mitigate abusive traffic. Examples of abusive traffic include malicious or otherwise malformed application data sent to targets through the proxies, volumetric flooding attacks, and general (distributed) denial of service (DoS) attacks.

Naturally, absent some mechanism to apply granular rate limits to individual client connections, targets are left with broad sweeping mitigations that target the proxy service, such as IP-based rate limits, and therefore affect all of its clients, including those which do not engage in abusive behavior.

Proactively preventing abuse through (privacy-preserving) client authentication is one alternative solution that can help mitigate such abuse in practice.
In particular, proxies can admit service only to authenticated clients, or targets can use privacy-preserving authentication protocols such as Privacy Pass [PRIVACY-PASS] to admit client traffic. Another type of solution might be in the form of some "humanity check" such as a CAPTCHA, with the intent of making sure that some human is responsible for client traffic rather than an automated bot.

However, there are several important ways in which these proactive techniques can be inadequate in practice:

1. Authorization decisions based on client authentication do not attest to client behavior -- they only attest to the client identity. This means that authenticated clients can still engage in abusive behavior.

2. Authorization decisions based on humanity checks also do not attest to client behavior. Humans interacting with an application can intentionally initiate abusive traffic.

Reactive mitigation mechanisms complement proactive mechanisms. Reactive mechanisms allow targets and proxy systems to work together to take corrective action to minimize or remove abusive traffic.

This document describes a protocol that can implement one limited form of reactive mitigation, called the remote rate limiting (RRL) protocol. RRL builds on ACME to enable seamless registration and configuration between proxies and targets. Targets use authentication information from ACME to request rate limiting actions by the proxy.

2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

Client: An entity that interacts with remote services, called targets.
Target: A service or resource that clients interact with.
Proxy: An entity that sits between client and target.
Application proxy: A proxy that relays application messages between client and target, such as an OHTTP Oblivious Relay Resource.
Transport proxy: A proxy that relays end-to-end transport connections between client and target, such as a MASQUE proxy or WireGuard VPN proxy.

3. Threat Model

The remote rate limiting (RRL) protocol is based on the following threat model. Clients are either honest or malicious. Honest clients do not engage in abusive behavior, whereas malicious clients can carry out whatever behavior they wish, including abusive behavior. Targets can also be honest or malicious. An honest target will faithfully use the protocol to protect itself against abuse, whereas a malicious target will try to use the protocol to carry out the following goals.

1. Learn information about clients. The attacker aims to use the rate limiting protocol to learn information about the clients behind a proxy. This information might include, for example, the total number of clients behind a proxy, or any other information that might be useful in partitioning client anonymity sets.

2. Disproportionately and negatively impact honest clients. The attacker aims to misuse the rate limiting protocol to single out honest clients and cause service disruption for them.

Malicious clients can engage in abusive behavior with the intent of disrupting service for honest targets, or of negatively impacting the proxy service for other honest clients.

The proxy is assumed to be honest, since a malicious proxy could easily violate client privacy by revealing the client's IP address to targets.

4. Overview

Given the threat model in Section 3, the remote rate limiting (RRL) protocol is based on the following assumptions:

1. The definition of abuse varies widely and depends on the target service. In other words, targets are authoritative for what is considered abusive traffic that negatively affects the target.

2. Rate limiting rules can only be expressed in terms of behavior that can be validated by both proxy and target. Importantly, this means that targets can only express rules in terms of information that both parties know. In other words, targets cannot express rules in terms of information they do not know. As an example, it is not possible to express rules in terms of the number of requests per client if the target does not know how many clients are behind a particular proxy, nor if the proxy does not know the number of requests that a particular client is sending because the client's connection to the target is encrypted.

3. Proxies cannot trust targets which cannot authenticate themselves, as their messages can be spoofed by attackers (malicious targets). Moreover, authenticating a target does not necessarily mean the target is honest; an authenticated target can still engage in malicious behavior. As such, the rate limiting protocol cannot leak information to the privacy proxy that it does not already know. In particular, the protocol cannot depend on application data that is encrypted and unknown to the proxy. This ensures that the protocol cannot be misused by targets in an attempt to deanonymize clients.

4. IP addresses are not suitable for authentication and authorization decisions. In particular, this means that proxies cannot use target IP addresses to determine whether or not a particular target message is authenticated.

RRL assumes that proxies are public, i.e., that targets have some reliable means of discovering or learning about a proxy. RRL is therefore not applicable to deployment scenarios where the proxy is meant to be private or otherwise does not seek to make its presence known to targets.
The protocol is divided into two phases: an offline registration phase (Section 4.1), wherein targets obtain authentication material used for the online phase of the protocol, and an online phase (Section 4.2), wherein targets send rate limiting rules to the proxy for enactment. The relationship between these two phases is shown in Figure 1. Details about each phase are in the following sections.

+--------+         +-------+           +--------+
| Client |         | Proxy |           | Target |
+---+----+         +---+---+           +----+---+
    |                  |
(offline)              <=== (register) ===>
---------------------------------------------------
    |                  |                    |
(online)
    +======================= (abuse) =====>
    +==================|==>                 |
    + ...                <--- (rate limit) -+
    | ...              |                    |
    | ...              +----- 200 OK ----->
    +==================|==>                 |
    +=============> X (apply rate limit)

Figure 1: RRL interaction overview

4.1. Offline Registration

Registration is built on ACME, which is a protocol for obtaining authentication credentials in the form of a certificate. Targets run the ACME protocol with a proxy to obtain RRL authentication certificates. The certificate issued MUST have the Client Authentication EKU configured, as it will be used for authenticating the client. Targets then use these certificates in the online phase of the protocol.

[[NOTE: this is pretty straightforward -- what more would we actually need to say here?]]

4.2. Online Rate Limit Enforcement

The online phase of RRL is based on HTTP. Targets, as HTTP clients, send messages to a proxy Rule Resource to enact rate limit rules. Each rule is meant to limit the number of acceptable connections or requests in a given time window. Rules are expressed using the semantics in [RATE-LIMIT]. In particular, rate limits represent some limit, a policy (in terms of quota-units), and a time-based condition after which the limit resets.

Proxies are configured with a URL for their RRL Rule Resource, e.g., "https://proxy.example/.well-known/rrl-rules".
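As a rough, non-normative illustration of a target pushing one rule to the Rule Resource, the Python sketch below builds (but does not send) such an HTTP request. The field names follow Table 1 and the policy string is taken from Section 5.1. The helper name build_rule_request, the "application/json" content type, the reset value of 60, and the target name "target.example" are assumptions made for illustration only. Authentication (Section 4.2.1) is omitted.

```python
import json
import urllib.request

# Example Rule Resource URL from this section; a real deployment would differ.
RULE_RESOURCE_URL = "https://proxy.example/.well-known/rrl-rules"


def build_rule_request(limit, policy, reset, target=None):
    """Build (but do not send) a POST request carrying one RRL rule.

    Hypothetical helper: values are encoded as JSON strings, as Table 1
    describes. The Content-Type header is an assumption, not specified
    by the draft.
    """
    message = {
        "RateLimit-Limit": str(limit),
        "RateLimit-Policy": policy,
        "RateLimit-Reset": str(reset),
    }
    if target is not None:
        # "Target" is marked optional in Table 1.
        message["Target"] = target
    body = json.dumps(message).encode("utf-8")
    return urllib.request.Request(
        RULE_RESOURCE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# At most 100 requests per minute across all clients (see Section 5.1).
request = build_rule_request(
    100, "60; scope='total'; unit='requests'", 60, target="target.example"
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending the request would additionally require one of the authentication options in Section 4.2.1, for example a mutually authenticated TLS connection using the certificate obtained during registration.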
Targets send POST messages to the proxy Rule Resource with a JSON object ([RFC8259], Section 4). Section 4.2.1 describes the mechanism by which these requests are authenticated. Note that the reason that RRL relies on targets pushing messages to proxies rather than proxies pulling from targets is to enable on-demand application of rate limit rules.

[[NOTE: Pushing vs pulling rate limit rules is somewhat of an implementation detail -- the salient point is that these messages are authenticated]]

The contents of the Rule Resource message JSON object are defined in Table 1.

+==================+==========================================+
| Field Name       | Value                                    |
+==================+==========================================+
| Target           | Name of the target                       |
| (optional)       |                                          |
+------------------+------------------------------------------+
| RateLimit-Limit  | As defined in Section 5.1 of             |
|                  | [RATE-LIMIT] except that parameters are  |
|                  | not permitted, encoded as a JSON string. |
+------------------+------------------------------------------+
| RateLimit-Policy | As defined in Section 5.2 of             |
|                  | [RATE-LIMIT] except that parameters      |
|                  | other than "unit" and "scope" are not    |
|                  | permitted, encoded as a JSON string.     |
+------------------+------------------------------------------+
| RateLimit-Reset  | As defined in Section 5.4 of             |
|                  | [RATE-LIMIT] except that parameters are  |
|                  | not permitted, encoded as a JSON string. |
+------------------+------------------------------------------+

Table 1: RRL Rule Resource message

The "unit" parameter for the RateLimit-Policy field has the following permissible values:

* requests: This means the rate limit quota applies to HTTP requests. This is only enforceable by a proxy if it can see requests, e.g., if it is an OHTTP Relay Resource (see Section 2 of [OHTTP]).
* connections: This means the rate limit quota applies to the number of connections.
* bandwidth: This means the rate limit applies to the bandwidth consumed by a given connection or request.

The "scope" parameter for the RateLimit-Policy field has the following permissible values:

* total: This means the rate limit quota applies to all client traffic from the proxy to the target.
* single: This means the rate limit quota applies to individual client traffic from proxy to target.

Proxies MUST validate the values received in the Rule Resource message fields as described in Section 4.2.2. Proxies MAY ignore malformed Rule Resource messages and respond to them with a 400 error. Proxies that validate and accept Rule Resource messages respond to them with 200 OK messages.

Proxies enforce these rules sent to the Rule Resource as described in Section 4.2.2. Sample Rule Resource messages and the scenario to which they would apply are in Section 5.

4.2.1. Authentication

Rule Resource messages are authenticated using credentials obtained during the offline registration phase. There are several options for request authentication, including those below:

* Mutually authenticated TLS. In this option, targets establish a mutually authenticated TLS connection to the proxy, using their credentials, before sending any Rule Resource messages.
* Message signing. In this option, targets sign the content of the Rule Resource message using their credentials and produce a signature according to Section 3.1 of [MESSAGE-SIGNATURES]. Proxies verify the signature using the credentials according to Section 3.2 of [MESSAGE-SIGNATURES].

[[OPEN ISSUE: The HTTP message signature keyid needs to contain enough information for the proxy to obtain the credentials used for verifying the signature, so it's tightly bound to the way registration works. This is not specified now and needs more thought.]]

Proxies authenticate requests using one of these options (or something with similar properties).

4.2.2. Validation and Enforcement

Rule Resource message validity depends on the proxy's behavior and, in particular, whether the proxy is an application or transport proxy. Application proxies can observe the client request boundaries, but cannot view their contents. In contrast, transport proxies can only observe connection boundaries and cannot view request boundaries. As such, validation rules are different depending on the type of proxy, though there are some general Rule Resource message validation steps that apply to both. These common rules are as follows:

* Check that the RateLimit-Reset field is not too far in the future.
* Check that the RateLimit-Limit is not too high. [[OPEN ISSUE: what does too high even mean?]]
* Check that the RateLimit-Limit, RateLimit-Policy, and RateLimit-Reset fields do not contain any unexpected parameters.

Beyond these general validation rules, the validation rules for application proxies are as follows:

* Check that the RateLimit-Policy "unit" parameter is present and has the value "requests" if the "scope" parameter is "total", else the "unit" parameter has the value "bandwidth". This has the effect of limiting the total number of requests to the target or the size of any one request.

Likewise, beyond the general validation rules above, the validation rules for transport proxies are as follows:

* Check that the RateLimit-Policy "unit" parameter is present and has the value "connections" if the "scope" parameter is "total", else the "unit" parameter has the value "bandwidth". This has the effect of limiting the total number of connections to the target or the bandwidth consumed by any one connection.

If all checks pass, then the message is considered valid. Proxies can enforce valid Rule Resource messages but are not required to do so. Enforcing a message means enacting rate limit rules uniformly across all clients to the target; proxies MUST NOT apply any rate limit actions with "scope" equal to "total" on a per-client basis.

4.3. Limitations

The RRL protocol is limited in several important ways:

* RRL is only usable by targets which can authenticate themselves. This means that services which, for example, are not capable of running HTTPS because they have not yet implemented ACME support, will not be able to submit RRL messages.
* RRL does not support mitigation of attacks that span targets. This is because there is no straightforward way for proxies to authenticate and validate the legitimacy of rate limit requests from two independent targets.

5. Applications

This section contains example applications of RRL that may be used to mitigate attacks enabled or otherwise exacerbated by deployed proxy technologies.

5.1. OHTTP DoS

A rule for mitigating OHTTP attacks, which seek to overwhelm the target with too many requests, is below. In this example, the policy expresses that the target can handle at most 100 requests per minute.

   {
     "RateLimit-Limit": "100",
     "RateLimit-Policy": "60; scope='total'; unit='requests'"
   }

Similarly, a rule for mitigating OHTTP attacks due to excessively large messages (larger than 1024B) is below.

   {
     "RateLimit-Limit": "1024",
     "RateLimit-Policy": "60; scope='single'; unit='bandwidth'"
   }

Since OHTTP is an application proxy protocol, it is not possible to safely express rate limits that limit the number of requests from any one client, as this could be misused by malicious targets to de-anonymize clients.

5.2. Port Scanning DoS

A rule for mitigating port scanning attacks, which open many connections to the target server in a short amount of time, is shown below. In this example, the threshold for port scanning is determined to be more than 10 connections per minute.

   {
     "RateLimit-Limit": "10",
     "RateLimit-Policy": "60; scope='total'; unit='connections'"
   }

5.3. Volumetric DoS

A rule for mitigating volumetric attacks, which send excessive data to the target server in a short amount of time, is shown below.
In this example, the threshold for a volumetric attack is determined to be more than 65536 bytes per connection in a given minute.

   {
     "RateLimit-Limit": "65536",
     "RateLimit-Policy": "60; scope='single'; unit='bandwidth'"
   }

6. Security Considerations

The RRL protocol was motivated by the need to ensure that operational security does not regress in the name of client privacy. As such, the design of RRL intentionally restricts what sort of security mitigations can be enacted in practice. A consequence of this is that certain classes of attack may not be mitigated entirely by RRL. For example, in the case of OHTTP, it is not possible to limit the number of requests per any single client, since enforcing such a policy might be abused by malicious targets to de-anonymize clients. As such, RRL is complementary to other approaches for dealing with attacks from individual clients, such as Privacy Pass.

The RRL protocol is designed to allow any target which can authenticate itself to send rate limit rules to the proxy. Each rate limit rule requires the proxy to store state for enacting the rule. As such, absent restrictions, malicious targets could abuse this mechanism to exhaust resources on the proxy. In settings where this is a problem, proxies SHOULD apply some form of allow list for targets to ensure that state does not grow unbounded.

7. IANA Considerations

This document has no IANA actions.

8. References

8.1. Normative References

[DOTS-SIGNALS] Boucadair, M., Ed., Shallow, J., and T. Reddy.K, "Distributed Denial-of-Service Open Threat Signaling (DOTS) Signal Channel Specification", RFC 9132, DOI 10.17487/RFC9132, September 2021.

[MESSAGE-SIGNATURES] Backman, A., Richer, J., and M. Sporny, "HTTP Message Signatures", Work in Progress, Internet-Draft, draft-ietf-httpbis-message-signatures-19, 26 July 2023.

[RATE-LIMIT] Polli, R. and A. M. Ruiz, "RateLimit header fields for HTTP", Work in Progress, Internet-Draft, draft-ietf-httpapi-ratelimit-headers-07, 24 June 2023.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.

[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017.

[RFC8259] Bray, T., Ed., "The JavaScript Object Notation (JSON) Data Interchange Format", STD 90, RFC 8259, DOI 10.17487/RFC8259, December 2017.

8.2. Informative References

[CONNECT-UDP] Schinazi, D., "Proxying UDP in HTTP", RFC 9298, DOI 10.17487/RFC9298, August 2022.

[DOTS] Mortensen, A., Ed., Reddy.K, T., Ed., Andreasen, F., Teague, N., and R. Compton, "DDoS Open Threat Signaling (DOTS) Architecture", RFC 8811, DOI 10.17487/RFC8811, August 2020.

[OHTTP] Thomson, M. and C. A. Wood, "Oblivious HTTP", Work in Progress, Internet-Draft, draft-ietf-ohai-ohttp-09, 27 July 2023.

[OHTTP-RateLimit] Reddy.K, T., Wing, D., Boucadair, M., and R. Polli, "Oblivious Relay Feedback", Work in Progress, Internet-Draft, draft-rdb-ohai-feedback-to-proxy-09, 14 May 2023.

[PRIVACY-PASS] Davidson, A., Iyengar, J., and C. A. Wood, "The Privacy Pass Architecture", Work in Progress, Internet-Draft, draft-ietf-privacypass-architecture-14, 9 August 2023.

[WIREGUARD] "WireGuard: Next Generation Kernel Network Tunnel", n.d.

Appendix A. Comparison to DOTS

DDoS Open Threat Signaling (DOTS) is an architecture for establishing and maintaining Distributed DoS mitigations within and between domains on the Internet [DOTS]. The RRL protocol shares similarities with DOTS. For example, in some respects, targets act as DOTS clients, which detect attacks and request mitigation, and proxies act as DOTS servers, which are responsible for implementing mitigations. There are also some notable syntactical differences, e.g., the DOTS signaling protocol uses CoAP instead of HTTP.
Importantly, however, the RRL protocol differs from DOTS, and in particular the DOTS signal protocol in [DOTS-SIGNALS], in several semantically meaningful ways:

1. DOTS only signals information about targets under attack, e.g., the target domain name, IP address range, port range, etc., without specifying any sort of proxy mitigation behavior. This means that it would be possible for DOTS servers (proxies) to implement mitigations that could be abused to violate the privacy goals of the proxy system. In contrast, RRL is more explicit in terms of how mitigations are applied and, importantly, what mitigations cannot be applied.

2. DOTS requires an active session between client (target) and server (proxy) that is kept alive via heartbeat messages. Presumably this is done to ensure that the server (proxy) mitigation state is associated with the client session. In contrast, RRL does not use sessions to keep state and instead uses HTTP semantics for state management, i.e., the server maintains a resource that authenticated clients can update as needed without maintaining an active session.

Note that it may be the case that these differences are not actually meaningful in practice, or that these differences are based on a misunderstanding of DOTS.

Appendix B. Acknowledgements

This document was inspired by [OHTTP-RateLimit], which was focused on a variant of the problem addressed by this document and tailored specifically to work within OHTTP, rather than alongside it. This document was improved by contributions and feedback from Lucas Pardue and Tommy Pauly.

Author's Address

Christopher A. Wood
Cloudflare
101 Townsend St
San Francisco,
United States of America
Email: caw@heapingbits.net
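Non-normative illustration: the validation steps in Section 4.2.2 could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation. The bounds MAX_LIMIT and MAX_RESET_SECONDS are hypothetical local policy (the draft leaves "too high" and "too far in the future" open), the parser handles only the simple "value; key='value'" form used in Section 5, a missing RateLimit-Reset is tolerated because the Section 5 examples omit it, and the function names are invented for this example.

```python
# Hypothetical local bounds; the draft does not define these values.
MAX_LIMIT = 1_000_000
MAX_RESET_SECONDS = 3600

ALLOWED_POLICY_PARAMS = {"unit", "scope"}
ALLOWED_SCOPES = {"total", "single"}
# Unit required when scope is "total", per proxy type (Section 4.2.2).
TOTAL_UNIT = {"application": "requests", "transport": "connections"}


def parse_policy(policy):
    """Split "60; scope='total'; unit='requests'" into (value, params)."""
    head, *rest = [part.strip() for part in policy.split(";")]
    params = {}
    for part in rest:
        if part:
            key, _, value = part.partition("=")
            params[key.strip()] = value.strip().strip("'\"")
    return head, params


def check_bounded(message, field, bound, errors, required):
    """Check that a field is a bare non-negative integer within a bound."""
    if field not in message:
        if required:
            errors.append(f"{field} is missing")
        return
    raw = str(message[field]).strip()
    if not (raw.isascii() and raw.isdigit()):
        # Anything other than digits (e.g. "100; w=60") counts as a parameter.
        errors.append(f"{field} is not a bare integer")
    elif int(raw) > bound:
        errors.append(f"{field} exceeds the local bound")


def validate_rule(message, proxy_type):
    """Return a list of validation failures; an empty list means valid."""
    errors = []
    # General checks that apply to both proxy types.
    check_bounded(message, "RateLimit-Limit", MAX_LIMIT, errors, required=True)
    check_bounded(message, "RateLimit-Reset", MAX_RESET_SECONDS, errors,
                  required=False)
    _, params = parse_policy(str(message.get("RateLimit-Policy", "")))
    if set(params) - ALLOWED_POLICY_PARAMS:
        errors.append("RateLimit-Policy has unexpected parameters")
    scope, unit = params.get("scope"), params.get("unit")
    if scope not in ALLOWED_SCOPES:
        errors.append("RateLimit-Policy scope must be 'total' or 'single'")
    # Proxy-specific check: "total" limits requests or connections,
    # anything else must be expressed as bandwidth.
    expected = TOTAL_UNIT[proxy_type] if scope == "total" else "bandwidth"
    if unit != expected:
        errors.append(f"RateLimit-Policy unit must be '{expected}'")
    return errors


ohttp_rule = {
    "RateLimit-Limit": "100",
    "RateLimit-Policy": "60; scope='total'; unit='requests'",
}
print(validate_rule(ohttp_rule, "application"))
print(validate_rule(ohttp_rule, "transport"))
```

Under these assumptions the Section 5.1 request rule is accepted by an application proxy and rejected by a transport proxy, since a transport proxy cannot observe request boundaries. A proxy that accepts a rule is still free not to enforce it.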