Secure API Design
General Guidelines for RESTful API Design
We recommend the following information sources about RESTful API Design (and assume their content to be known to the reader of this page):
- https://hackernoon.com/restful-api-designing-guidelines-the-best-practices-60e1d954e7c9
- https://restful-api-design.readthedocs.io/en/latest/
Motivation to think about Secure API Design
Depending on the Design and Implementation of an API, information might leak unintentionally.
Example:
The reaction of the server (e.g. returned HTTP Response Code) might reveal sensitive information (e.g., whether a certain asset exists).
About the “Response Status Code”
How should the backend react when someone requests a resource that exists but that the requester is not permitted to access? The obvious answer is: 403 if permissions are lacking, and 404 if the resource really does not exist.
But: Returning 403 reveals that the requested resource exists (because the status code is not 404) and that the requester is merely not allowed to access it (because the status code is 403). The leak is caused by the server responding in distinguishable ways depending on whether the resource does not exist or whether permission is lacking. To keep this information confidential, the server needs to respond in a way that does not let the requester distinguish the two cases. A simple solution is to use the same response status code for both cases.
Returning 404 whenever a resource is not visible or accessible to a requester hides whether the resource really does not exist or whether the requester is just not allowed to access it (“resource does not exist for the requester”).
An alternative to returning status code 404 in case of lacking permissions (to hide the existence of the resource) could be to return status code 403 whenever the requested resource does not exist (“requester does not have permission to access non-existing resources”).
https://developer.mozilla.org/en-US/docs/Web/HTTP/Status#client_error_responses notes about response status code 404 Not Found: “Servers may also send this response instead of 403 Forbidden to hide the existence of a resource from an unauthorized client”.
But: Usability of the API might suffer as there will be no signal to tell the requester that the resource does exist and that there is just a lack of permission that prevents the access.
Therefore: When designing an API and deciding which response codes to use for which cases, one needs to weigh the options and decide what information should be given to the requester and what should be hidden. Is it more important to hide the existence of a resource someone does not have access to, or to report that a lack of permission prevents the access?
In our Demo System, we use response status code 404 in case a requester does not have access to a resource.
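The rule described above can be illustrated with a minimal sketch. It is not the Demo System's implementation: the in-memory store `RESOURCES`, the `Resource` type, the ownership-based permission check, and the function name `get_resource` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from http import HTTPStatus


@dataclass
class Resource:
    owner: str
    payload: dict = field(default_factory=dict)


# Hypothetical in-memory store standing in for the real backend.
RESOURCES = {"my-field-1": Resource(owner="user-123", payload={"crop": "wheat"})}


def get_resource(resource_id: str, requester: str) -> tuple[int, dict]:
    """Return (status code, body) for a GET on a single resource."""
    resource = RESOURCES.get(resource_id)
    # Assumed permission model: only the owner may read a resource.
    if resource is None or resource.owner != requester:
        # Same status code AND same body for "missing" and "forbidden",
        # so the requester cannot tell the two cases apart.
        return HTTPStatus.NOT_FOUND.value, {"error": "not found"}
    return HTTPStatus.OK.value, resource.payload
```

Note that the response body is identical in both cases as well. A matching status code alone does not help if the error message differs between "missing" and "forbidden".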
About “Collection Resources” and “Query Parameters”
Frequently, RESTful APIs are structured using the concept of Collection Resources. Example:
- “/{collection-name}”, e.g., “/fields”
- The collection of all resources of type Field, returned as a list of Fields. The list might be empty.
- The requester might only be allowed to see some of the entries.
- => The returned list might contain only a subset of the existing elements (status 200 with a partial list), or the server might respond with 404/403 if the requester has no permission to access this type of resource at all.
- “/{collection-name}/{id-of-entry}”, e.g., “/fields/my-field-1”
- One specific Field, can exist (200) or not exist (404) or maybe the requester does not have access to it (403 or 404, see part about “Response Status Code”).
Frequently, the ids of the entries are generated by the backend system and cannot be controlled by the client.
Frequently, the client does not know what entries exist and therefore cannot request them directly/ individually by their id via GET /{collection-name}/{id-of-entry}. Instead they need to query the collection resource to find out what entries are available.
One could design the collection resource endpoint to support query parameters so the client can “filter” the result set / query only certain items meeting certain criteria instead of the whole collection with all its entries. Example:
- /fields?ownedBy=user-123
- Returns a List of Fields but only those that are owned by user-123.
- Question: How should the server react if the requester is not allowed to access / see the fields that are owned by user-123?
- Option 1: 404 response status code.
- Option 2: 403 response status code.
- Option 3: 200 response status code with empty list in response body.
- Question: How should the server react if user-123 does not exist?
- Option 1: 404 response status code.
- Option 2: 403 response status code.
- Option 3: 200 response status code with empty list in response body.
- => Behavior of the system might reveal information (see part about “Response Status Code”).
Arguments against using 404 response status code for the collection resource:
- The collection does exist: its URL is documented in the API Description and therefore commonly known, so 404 might be inappropriate.
- 200 response status code with an empty list in the response body is the behavior we expect when there is simply no data yet and no lack of permission => The collection exists, it is just empty.
- The returned content of the collection needs to be filtered according to the permissions of the requester, as the requester might not be allowed to see some of the entries. If there is no item in the collection that the requester has access to, the returned collection can just be empty (“Request processed successfully, therefore 200, there are just no items that I can show you”, either because there are no items at all or because there are no items that you have access to).
Suggestion: Do not use the 404 or 403 response status code for the collection resource GET /{collection-name}. Instead, favor the 200 response status code and return only the elements that the requester has access to and that match the filter query. If the requester has no access to any element, the returned list is just empty. That way, we do not reveal information, because the requester cannot distinguish whether there are no resources or whether there is just a lack of permission.
Possible drawback: An empty list does not tell the client whether the filter query was invalid, whether the user does not exist, whether the user has no data, or whether a permission issue is the cause.
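The suggested behavior for the collection resource can be sketched as follows. This is an illustration under stated assumptions, not the actual backend: the `Field` type, the sample data in `FIELDS`, the permission table `VISIBLE_OWNERS`, and the function name `list_fields` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Field:
    id: str
    owned_by: str


# Hypothetical sample data.
FIELDS = [
    Field("my-field-1", "user-123"),
    Field("my-field-2", "user-123"),
    Field("my-field-3", "user-456"),
]
# Hypothetical permission model: requester -> owners whose fields are visible.
VISIBLE_OWNERS = {
    "user-123": {"user-123"},
    "user-456": {"user-456", "user-123"},
}


def list_fields(requester: str, owned_by: Optional[str] = None) -> tuple[int, list[str]]:
    """Handle GET /fields[?ownedBy=...]: always 200, list filtered by permission."""
    allowed = VISIBLE_OWNERS.get(requester, set())
    # Permission filter first, then the optional query filter.
    visible = [f for f in FIELDS if f.owned_by in allowed]
    if owned_by is not None:
        visible = [f for f in visible if f.owned_by == owned_by]
    return 200, [f.id for f in visible]
```

With this sketch, `list_fields("user-123", owned_by="user-456")` (no permission) and `list_fields("user-123", owned_by="user-999")` (unknown user) both yield status 200 with an empty list, so the two cases look the same to the requester.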
Alternative Design Options if we want to have error codes returned: Instead of having global collection resources, we might model the API in a way to have collection resources as sub-resources of another resource (e.g., a specific user). Example:
- /users/user-123/fields
- /users/user-123/fields/my-field-1
Here we could use 404 or 403 to signal that the user does not exist or that the requester does not have sufficient permission to access the sub-resources of that user. The drawback is that the client can no longer access the data of multiple users with a single API call.
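A minimal sketch of the sub-resource variant, for comparison. The data in `FIELDS_BY_USER`, the permission table `MAY_ACCESS`, and the function name `list_user_fields` are assumptions for illustration only.

```python
# Hypothetical data: the fields of each existing user.
FIELDS_BY_USER = {"user-123": ["my-field-1"], "user-456": []}
# Hypothetical permission model: requester -> users whose sub-resources are accessible.
MAY_ACCESS = {"user-123": {"user-123"}, "user-456": {"user-456"}}


def list_user_fields(requester: str, user_id: str) -> tuple[int, list[str]]:
    """Handle GET /users/{user_id}/fields with explicit error codes."""
    if user_id not in FIELDS_BY_USER:
        return 404, []  # the parent resource (user) does not exist
    if user_id not in MAY_ACCESS.get(requester, set()):
        return 403, []  # the user exists, but the requester lacks permission
    return 200, list(FIELDS_BY_USER[user_id])
```

This variant gives the client clear error signals, but distinguishing 403 from 404 reveals whether a user exists. If that matters, both branches could return the same code, as discussed in the part about “Response Status Code”.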
In our Demo System we use a global collection resource for the data items with response status code 200 and a filtered list as response. In case there is no data or the requester has no permission to see any of the data, the returned list is just empty.
About “URL Patterns”
Incrementing or easy-to-guess resource identifiers (for example: “/somepath/customers/123456”) might be an issue. One could easily try to access other resources by manipulating the request URL, for example by requesting “/somepath/customers/123457”.
If there is no proper access control for the resources and one relies only on the assumption that requesters will not change, try out, or guess other paths or values, information might leak easily.
How to protect?
- Option 1: Enforce proper access control instead of hoping that no one will try out and find other valid URLs.
- Option 2: Use hard-to-guess, randomly generated resource identifiers (maybe in combination with some sort of expiration).
- E.g., a randomly generated download link that expires after some time.
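Option 2 can be sketched with Python's standard `secrets` module. The `/downloads/` path, the in-memory registry `_LINKS`, the default lifetime, and both function names are assumptions for illustration; a real system would persist the links and would still apply access control where possible.

```python
import secrets
import time
from typing import Optional

# Hypothetical in-memory registry: token -> (resource path, expiry timestamp).
_LINKS: dict[str, tuple[str, float]] = {}


def create_download_link(resource_path: str, ttl_seconds: float = 600.0,
                         now: Optional[float] = None) -> str:
    """Create a hard-to-guess link that is valid for ttl_seconds."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)  # 32 random bytes from a CSPRNG, URL-safe text
    _LINKS[token] = (resource_path, now + ttl_seconds)
    return f"/downloads/{token}"


def resolve_download_link(link: str, now: Optional[float] = None) -> Optional[str]:
    """Return the resource path, or None if the link is unknown or expired."""
    now = time.time() if now is None else now
    token = link.rsplit("/", 1)[-1]
    entry = _LINKS.get(token)
    if entry is None:
        return None
    resource_path, expires_at = entry
    if now >= expires_at:
        del _LINKS[token]  # expired links are removed and treated like unknown ones
        return None
    return resource_path
```

Unknown and expired links are handled identically here (both return `None`), in line with the idea of not letting the requester distinguish cases.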
Separated APIs for different types of clients
We provide each role/stakeholder with their own part of the API and make sure that only these roles/stakeholders can access that part (User Role Check and Token Scope Check in the KST Platform backend, see Authentication and Authorization).
Separated APIs for:
- Data Owner
- Account / Profile.
- Consent Management.
- Audit Logs.
- Data Access.
- Disease Warning Service Preferences.
- Data Providers
- Consent Management.
- Data Provisioning.
- Data Consumers
- Consent Management.
- Data Consumption.
- Data Prosumers
- Consent Management.
- Data Provisioning.
- Data Consumption.
- Data Catalog
- Supported Data Item Types.
Doing so helps ensure that each type of requester can only access the parts of the API they should have access to, and provides them with API endpoints tailored to their needs:
- Only the Data Owner can manage their account, consents, audit log, settings, … and only the KST Platform Owner-UI can request the corresponding Token Scope to reach those endpoints (Data Consumers / Providers / Prosumers cannot do that).
- Data Consumers can consume data and manage data consumption consents/consent requests (Data Consumers cannot provide data and cannot request the corresponding consents).
- Data Providers can provide data and manage data provisioning consents/consent requests (Data Providers cannot consume data and cannot request the corresponding consents).
- Data Prosumers can provide and consume data and manage the corresponding consents/consent requests.
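The combination of a User Role Check and a Token Scope Check per API part could look roughly like the sketch below. It does not reflect the actual KST Platform backend: the path prefixes, role names, scope names, the `Token` type, and the function `is_allowed` are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical mapping: API part (path prefix) -> (required role, required token scope).
API_PARTS = {
    "/owner": ("data-owner", "owner-api"),
    "/provider": ("data-provider", "provisioning"),
    "/consumer": ("data-consumer", "consumption"),
    "/prosumer": ("data-prosumer", "prosumption"),
}


@dataclass(frozen=True)
class Token:
    role: str
    scopes: frozenset


def is_allowed(path: str, token: Token) -> bool:
    """Allow a request only if BOTH the role and the token scope match the API part."""
    for prefix, (role, scope) in API_PARTS.items():
        if path == prefix or path.startswith(prefix + "/"):
            return token.role == role and scope in token.scopes
    return False  # unknown API part: deny by default
```

For example, a token with role `data-consumer` and scope `consumption` would pass for `/consumer/data` but be rejected for `/owner/consents` and `/provider/data`.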
For more details, please have a look at the API Specification of the KST Platform.