Comments on X-Road Protocol for REST

My comments on the REST Draft v0.1 as RFC-ed by NIIS.

Oct 19, 2018

Comments below refer to the X-Road Message Protocol for REST draft v0.1, as RFC-ed by NIIS on Oct 3, 2018. For more accurate feedback, I'd also like to see the transport protocol change proposal and get to know some of the reasoning or discussion that led to any particular proposal. For example, I'm not sure whether the receiving proxy (see the note on terminology at the end) is also supposed to see the full path as presented below.

Message Format

While regular HTTP and REST depend on DNS and hostnames for routing and service locations, X-Road has its own routing hierarchy: instance -> organization -> subsystem. The proposed method of encoding that information along with the resource (the HTTP equivalent of SOAP's service and method call) is:

{http-method} /rest/{version}/{consumer-subsystem}/{provider-subsystem}/{service-id}

And an example of it with real values:

GET /rest/v1/ee/NGO/1234/mine/ee/GOV/5678/theirs/v1/widgets/42
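To make the layout concrete, here is a minimal Python sketch that splits the example path into its parts. It assumes each subsystem identifier occupies exactly four path segments, which is only my reading of the example above; the type and field names are hypothetical and not taken from the draft.

```python
from typing import NamedTuple


class SubsystemId(NamedTuple):
    # Field names are illustrative guesses based on the example path.
    instance: str
    member_class: str
    member_code: str
    subsystem: str


class RestRequestLine(NamedTuple):
    version: str
    consumer: SubsystemId
    provider: SubsystemId
    service_id: str


def parse_path(path: str) -> RestRequestLine:
    """Split a draft-style path into version, consumer, provider and service id."""
    segments = path.strip("/").split("/")
    # "rest" + version + 4 consumer + 4 provider + at least 1 service segment
    if len(segments) < 11 or segments[0] != "rest":
        raise ValueError(f"not a draft-style X-Road REST path: {path!r}")
    return RestRequestLine(
        version=segments[1],
        consumer=SubsystemId(*segments[2:6]),
        provider=SubsystemId(*segments[6:10]),
        service_id="/".join(segments[10:]),
    )


parsed = parse_path("/rest/v1/ee/NGO/1234/mine/ee/GOV/5678/theirs/v1/widgets/42")
print(parsed.consumer.subsystem)  # mine
print(parsed.provider.subsystem)  # theirs
print(parsed.service_id)          # v1/widgets/42
```

Note that the service's own path (v1/widgets/42) only starts at the eleventh segment, after two full subsystem identifiers.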

I'm not convinced it's optimal to encode both the consumer's and the provider's routing information in the URI path. Based on how the proxy server works today, I'm assuming the consumer's information is necessary for it to choose which certificate to sign with and to present to the provider's proxy or service. For the provider, however, it's entirely redundant: the same identity information, along with additional and possibly valuable fields, is in the message signature. For the consumer, you could argue it's equally redundant the majority of the time. In proxy setups that serve a single consumer organization, the organization's identity is implicit. In setups that use alternative components and bypass the client-side proxy, the identity is also already available and configured (a library set up to sign with a particular certificate, for example). If the consumer's identity information was implicitly expected to be passed to the provider's business-logic server by the provider's proxy, then I think that's better handled by unified X-headers (see below).

From an aesthetic and REST-ful angle, it also doesn't make much sense to speak about remote services "belonging" to a consumer.

A minor argument, but a result of the hierarchy mismatch, is that tying the consumer to the path slightly inconveniences code re-use. Most HTTP libraries permit setting (shared) headers for all requests. They don't usually make it easy to modify the path transparently to the code that constructed the request. In other words, a (shared) function that requests a resource can more easily be re-used if it doesn't have to concern itself with the consumer (sender).
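The re-use point can be sketched in a few lines of Python. This is an illustration only: the Request class is a stand-in for whatever an HTTP library provides, the X-Road-Consumer header name is hypothetical, and the consumer-free path is my own variant rather than anything in the draft.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Request:
    # Stand-in for an HTTP library's request object.
    method: str
    path: str
    headers: Dict[str, str] = field(default_factory=dict)


def get_widget(widget_id: int) -> Request:
    """Shared builder: knows the provider and resource, not the consumer."""
    return Request("GET", f"/rest/v1/ee/GOV/5678/theirs/v1/widgets/{widget_id}")


def send_as(request: Request, shared: Dict[str, str]) -> Request:
    """Add client-wide headers without touching the code that built the request."""
    return Request(request.method, request.path, {**shared, **request.headers})


def get_widget_path_style(consumer: str, widget_id: int) -> Request:
    """Draft-style builder: every caller has to thread the consumer through."""
    return Request(
        "GET", f"/rest/v1/{consumer}/ee/GOV/5678/theirs/v1/widgets/{widget_id}"
    )


# Header variant: the consumer is configured once, outside the builder.
shared_headers = {"X-Road-Consumer": "ee/NGO/1234/mine"}  # hypothetical header
print(send_as(get_widget(42), shared_headers).headers)

# Path variant: the consumer leaks into the builder's signature.
print(get_widget_path_style("ee/NGO/1234/mine", 42).path)
```

With the header variant, get_widget can be shared between consumers unchanged; with the path variant, each builder needs the consumer as an argument or as global state.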

X-Headers

Later sections suggest using X-headers for additional context data, such as the person-id field Estonian X-Road services occasionally expect to see. I think unifying consumer identification with other optional headers is a more elegant solution for two reasons: it can be omitted in single-organization proxy setups or with alternative transport components, and the receiver's proxy can fill it in for the servers it's proxying for. Should those servers require additional information from the signature (such as time), that could easily be added as further headers.
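As a rough sketch of the second reason, the receiving proxy could derive the identity headers from the verified signature and overwrite anything the sender put in those slots. All header names and the shape of the verified data here are assumptions of mine, not part of the draft.

```python
from typing import Dict, Mapping

# Hypothetical headers that only the receiving proxy is allowed to set.
PROXY_OWNED = {"x-road-consumer", "x-road-signed-at"}


def headers_for_backend(
    incoming: Mapping[str, str], verified: Mapping[str, str]
) -> Dict[str, str]:
    """Build the headers passed on to the business-logic server.

    `verified` holds values extracted from an already validated message
    signature; signature validation itself is out of scope for this sketch.
    """
    # Drop sender-supplied values for proxy-owned headers so they cannot be spoofed.
    forwarded = {k: v for k, v in incoming.items() if k.lower() not in PROXY_OWNED}
    forwarded["X-Road-Consumer"] = verified["consumer"]
    forwarded["X-Road-Signed-At"] = verified["signed_at"]
    return forwarded


incoming = {
    "Accept": "application/json",
    "X-Road-Person-Id": "EE12345678901",  # optional context header, passed through
    "x-road-consumer": "ee/GOV/0000/spoofed",  # untrusted, will be replaced
}
verified = {"consumer": "ee/NGO/1234/mine", "signed_at": "2018-10-19T12:00:00Z"}
print(headers_for_backend(incoming, verified))
```

A consumer behind a single-organization proxy would then send no identity header at all, and the provider's server would still see a consistent set of headers.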

While ideally I'd like to see no routing information in the path and see X-Road get out of the ad-hoc name-resolving business (use DNS rather than distributing the /etc/hosts equivalent in a megabyte of XML), I'd first like to hear the reasoning behind proposing the path route.

I realize the majority of X-Road users are using the proxy setup today, but I think it's beneficial to think about the protocol separately and then see how to reconcile that with optional proxying.

Errors

The RFC describes limiting the errors the proxy may respond with. It is unclear to me how users of the proxy are supposed to differentiate between proxy errors and remote server (not remote proxy) errors.

Examples

I know these were just examples, but I found it a little funny that they already differed from the school of REST that I subscribe to, highlighting either how easy it is to make mistakes or how there's no immediate consensus on how to implement REST. As it happens, the White House REST Best Practices, which the examples supposedly mimic, match what I'd expect, so I'm guessing the protocol examples were mistakes.

Specifically:

GET /pet/{petId} --- falls under the White House's example of a bad URL, in that "pet" is not plural.
PUT /pet --- Shouldn't update a single pet by the id in the body, but an entire collection (if plural) or a "singleton" pet identified by the URI /pet.
POST /pet/{petId}/uploadImage --- better to have it under PUT /pets/42/image. Think [idempotent] actions on resources, not function calls!
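The difference between the last two styles shows up when a client retries. The toy in-memory store below is a sketch, not a real server: the class and method names are made up, and the POST handler is deliberately written as a non-idempotent function call to show what a retry may do.

```python
class PetStore:
    """Toy in-memory store contrasting the two styles of image upload."""

    def __init__(self) -> None:
        self.images = {}   # pet id -> image bytes, i.e. the resource /pets/{id}/image
        self.uploads = []  # one entry per call, as a function-call style handler might keep

    def put_image(self, pet_id: int, data: bytes) -> None:
        # PUT /pets/{id}/image: replace the resource's state with the given one.
        self.images[pet_id] = data

    def post_upload_image(self, pet_id: int, data: bytes) -> None:
        # POST /pet/{id}/uploadImage: each call is a new action with a new side effect.
        self.uploads.append((pet_id, data))


store = PetStore()
image = b"\x89PNG..."

# A client that retries after a timeout sends the same request twice.
store.put_image(42, image)
store.put_image(42, image)
store.post_upload_image(42, image)
store.post_upload_image(42, image)

print(len(store.images))   # 1
print(len(store.uploads))  # 2
```

Repeating the PUT leaves the store in the same state, so a retry is safe; with the function-call style the client cannot tell whether a retry created a duplicate.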

I'll use the term "proxy" to refer to the "security server" to emphasize that there may be, and are, alternative components for performing message signing and transport security. I find that this distinction may give rise to more optimal protocol designs.