ATHN over https specification

Version: 0.1.1 (alpha)

Project ATHN is still very early in development. Suggestions and discussion are very welcome. Expect everything to change at any time.

Introduction

The ATHN system is a client-server system where the client first sends one read or write request and the server responds with one response.

While ATHN markup documents are a big part of the ATHN system, they can't do everything the system supports on their own. The ATHN over x protocol specifications are not only responsible for making transactions over the network but also these features:

This specification is for using ATHN over https, not http. Encryption is REQUIRED; ATHN transactions MUST NOT use http.

1 This is for testing purposes

This specification is temporary until we find a better solution. https is not suitable for ATHN in the long run for various reasons. The main reason for going with https for this temporary specification is its rich software ecosystem.

1.1 https is too complex

https has a lot of features, most of which aren't necessary or useful for ATHN. This complexity makes software unnecessarily hard to write, especially if you don't use libraries.

This doesn't line up with ATHN's philosophy of being as simple as possible.

1.2 https forces you to use X.509 certificates

Much like https, X.509 certificates are needlessly complicated and have many features that are useless for ATHN. All we need certificates for is encryption and access control. There's no need for the arbitrary metadata, the certificate chaining, or the rest of the bloat that X.509 has added on top of simple public key cryptography.

I would much rather use ssh keys than X.509, but https doesn't support that.

Because of all of this complexity, if you want to work with X.509 certificates you're more or less forced to use OpenSSL, which is a bloated, legacy program full of security and code quality problems. Attempts to replace OpenSSL have had little success because it relies on an overcomplicated set of standards to begin with, which is exactly what ATHN is made to avoid.

1.3 https can be used for tracking

It is possible to use https in a stateless way, without leaking any information that could be used to track you, and that is what this specification recommends. In practice this is extremely difficult because of the design of https (not to mention https software). The protocol may not have been designed with tracking in mind, but it wasn't designed to avoid tracking either. Over the years, https has been extended and used in ways that make it a privacy nightmare.

One of ATHN's main goals is to make privacy preservation the default, and to make privacy violations a hard mistake to make. https doesn't allow this.

2 Read requests

A client can send a read request to a server to ask for a document by issuing an https GET request. The server will respond with an error or the document that was requested. To preserve privacy, clients should avoid sending more information than necessary, and servers should avoid processing more information than necessary. A read request contains only 3 pieces of information:

  1. The document that's being requested
  2. Optionally: A list of preferred languages
  3. Optionally: A client certificate

Here is a curl command that shows how to properly send an ATHN read request.

curl -k --cert cert.p12 --cert-type p12 -H "Accept-Language: en, da" https://itzgoldenleonard.github.io/ATHN/index.athn
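
The same read request can be sketched with Python's standard library. This is only an illustration, not part of the specification: the function name build_read_request is hypothetical, and the request is constructed but never sent. Actually sending it with a client certificate would also need an ssl.SSLContext with load_cert_chain, which expects PEM files rather than the .p12 file that curl uses.

```python
from urllib.parse import urlsplit
from urllib.request import Request


def build_read_request(url, languages=None):
    """Build (but do not send) an ATHN read request as an https GET."""
    # Encryption is REQUIRED, so refuse anything that is not https.
    if urlsplit(url).scheme != "https":
        raise ValueError("ATHN transactions MUST NOT use plain http")
    headers = {}
    if languages:
        # Ordered list of preferred languages, most preferred first.
        headers["Accept-Language"] = ", ".join(languages)
    return Request(url, headers=headers, method="GET")


request = build_read_request(
    "https://itzgoldenleonard.github.io/ATHN/index.athn", ["en", "da"]
)
print(request.get_method(), request.full_url)
print(request.header_items())
```

Note that urllib adds a User-Agent header of its own when a request is actually sent, so a privacy-conscious client would want to override or remove it.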

2.1 Language selection

If a user has a language preference they may choose to provide an ordered list of preferred languages with the read request. This is encoded in the Accept-Language http header. It's up to the server to determine what language a document should be served in given any language preference string.
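
Since the selection strategy is left to the server, the following is just one possible approach, sketched in Python. The function choose_language and its parameters are hypothetical. It relies on list order only, ignores any ";q=" weights, and does not try to match region subtags such as "en-GB" against "en".

```python
def choose_language(accept_language, available, default):
    """Return the first preferred language the server can serve."""
    if not accept_language:
        return default
    for entry in accept_language.split(","):
        # Drop any ";q=..." parameter; this sketch relies on list order only.
        tag = entry.split(";")[0].strip().lower()
        if tag in available:
            return tag
    # None of the preferred languages is available.
    return default


print(choose_language("en, da", {"da", "de"}, "de"))  # prints "da"
```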

2.2 Authentication/authorization

Authentication and authorization are done with client certificates. https supports sending requests with an X.509 client certificate.

Software support for this feature is surprisingly good but documentation is not. This is an attempt to make it slightly less hard to agree on how to use this feature.

Client certificates MUST be self-signed. It's best to only use a client certificate if it's actually necessary. It's also best to use a different certificate for each domain.

If a resource requires a certificate and one hasn't been provided, the server should respond with a 401 status code.

If a resource is requested with an unauthorized certificate, the server should respond with a 403 status code.
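
The two rules above can be sketched as a small server-side decision function. This is a minimal sketch under assumptions: the specification does not say how a server identifies a certificate, so the use of a fingerprint string and a set of authorized fingerprints is hypothetical, as are the function and parameter names. 200 stands in for any success status.

```python
def access_status(requires_certificate, fingerprint, authorized):
    """Pick the status code for a request to a possibly protected resource."""
    if not requires_certificate:
        return 200
    if fingerprint is None:
        # The resource requires a certificate and none was provided.
        return 401
    if fingerprint not in authorized:
        # A certificate was provided but it is not authorized.
        return 403
    return 200


authorized = {"ab12cd"}
print(access_status(True, None, authorized))      # prints 401
print(access_status(True, "ffff00", authorized))  # prints 403
print(access_status(True, "ab12cd", authorized))  # prints 200
```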

These OpenSSL commands can be used to generate a client certificate for use with ATHN:

openssl req -x509 -newkey rsa:4096 -subj "/" -keyout client_key.pem -out client_cert.pem -days 365
openssl pkcs12 -export -in client_cert.pem -inkey client_key.pem -out client_certificate.p12

Deprecation warning

If you plan on using client certificates for anything serious that you don't want to risk losing access to, you should have a way to migrate your certificates to another standard. When this specification is deprecated it's quite unlikely that your existing X.509 certificates will be compatible with whatever we end up using for the real specifications.

2.3 Server certificate validation

There are 2 ways to validate server certificates: Trust On First Use and no validation.

From a security standpoint this is not the best, but this specification is for testing purposes, and this allows the use of self-signed server certificates, which are much easier to deal with.
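
Trust On First Use can be sketched as follows. This is an illustration under assumptions, not something the specification prescribes: the class name TofuStore, the choice of a SHA-256 fingerprint, and keeping the store in memory are all hypothetical. A real client would persist the store between runs.

```python
import hashlib


class TofuStore:
    """Remember the first certificate seen for each host."""

    def __init__(self):
        self._known = {}  # host -> hex fingerprint

    def check(self, host, der_certificate):
        """Return True if the certificate is trusted for this host."""
        fingerprint = hashlib.sha256(der_certificate).hexdigest()
        known = self._known.get(host)
        if known is None:
            # First use: trust the certificate and remember it.
            self._known[host] = fingerprint
            return True
        # Later uses: only the remembered certificate is accepted.
        return known == fingerprint


store = TofuStore()
print(store.check("example.org", b"first certificate"))   # prints True
print(store.check("example.org", b"first certificate"))   # prints True
print(store.check("example.org", b"other certificate"))   # prints False
```

In Python the DER bytes of the server certificate can be read from a connected ssl.SSLSocket with getpeercert(binary_form=True).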

3 Write requests

A client can send a write request to ask a server to do some action by issuing an https POST request. A write request contains 3 pieces of information:

  1. The url
  2. Optionally: Form data
  3. Optionally: A client certificate (handled in exactly the same way as with read requests)

Here is a curl command that shows how to properly send an ATHN write request.

curl -k --cert cert.p12 --cert-type p12 -X POST -d @data.json https://<URL>
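
A write request can be sketched in Python in the same spirit. The function build_write_request and the URL https://example.org/signup are hypothetical, and the request is built without being sent. The Content-Type header is an assumption, since this specification does not name one.

```python
import json
from urllib.request import Request


def build_write_request(url, form_data=None):
    """Build (but do not send) an ATHN write request as an https POST."""
    if not url.startswith("https://"):
        raise ValueError("ATHN transactions MUST NOT use plain http")
    headers = {}
    body = None
    if form_data is not None:
        body = json.dumps(form_data).encode("utf-8")
        # Assumption: the specification does not name a Content-Type.
        headers["Content-Type"] = "application/json"
    return Request(url, data=body, headers=headers, method="POST")


form = [{"id": "username", "type": "string", "value": "Example user"}]
request = build_write_request("https://example.org/signup", form)
print(request.get_method(), request.data.decode("utf-8"))
```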

3.1 The form data json encoding

A write request can be sent with form data. The form data is encoded in the following json format.

The entire form data is an array of input objects.

Each input is an object with the keys id, type and value.

If value is null, that means that the form field is optional and empty. If value is null but the form field is required, the client has made a big mistake and the server must be able to handle the error (the server should know the schema).

Lists can contain multiple children. The value field of a list type input is a list of form data json arrays, even if the list only has 1 child.

File type inputs are encoded with the base64 encoding.

Here is an example of a json encoded form data array:

[
    {
        "id": "username",
        "type": "string",
        "value": "Example user"
    },
    {
        "id": "age",
        "type": "int",
        "value": 18
    },
    {
        "id": "gender",
        "type": "string",
        "value": null
    },
    {
        "id": "interests",
        "type": "list",
        "value": [
            [
                {
                    "id": "interest",
                    "type": "string",
                    "value": "Computer science"
                },
                {
                    "id": "weight",
                    "type": "float",
                    "value": 0.8
                }
            ],
            [
                {
                    "id": "interest",
                    "type": "string",
                    "value": "Politics"
                },
                {
                    "id": "weight",
                    "type": "float",
                    "value": null
                }
            ]
        ]
    }
]
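
Form data in this shape can be built with a few small helpers. This is a minimal sketch: the helper names and the "avatar" field are hypothetical, and "file" as the type string for file inputs is an assumption, since the example above does not include a file input.

```python
import base64
import json


def make_input(input_id, input_type, value):
    """One input object with the keys id, type and value."""
    return {"id": input_id, "type": input_type, "value": value}


def make_file_input(input_id, content):
    # Assumption: "file" is the type string used for file inputs.
    encoded = base64.b64encode(content).decode("ascii")
    return make_input(input_id, "file", encoded)


def make_list_input(input_id, children):
    # Each child is itself a form data array, even if there is only one.
    return make_input(input_id, "list", [list(child) for child in children])


form = [
    make_input("username", "string", "Example user"),
    make_input("gender", "string", None),  # optional and empty
    make_list_input(
        "interests",
        [[make_input("interest", "string", "Politics"),
          make_input("weight", "float", None)]],
    ),
    make_file_input("avatar", b"\x89PNG"),
]
print(json.dumps(form, indent=4))
```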

3.2 Write responses

If the write request is successful, the server responds with an acknowledgement (a success status code) and optionally an ATHN document. This is the document that will be redirected to if the redirect property was applied to the submit button.

Server side form validation

The basic validation provided by form fields isn't always enough. Sometimes a form field needs to fulfill certain criteria that a client can't check. This is the purpose of server side form validation.

If the server deems a form field invalid it can send back a response with a 418 status code and a list of form validation error json objects.

The form validation error json object contains the following keys:

The id key is the ID of the erroneous form field

The message key contains the error message, which should be shown to the human. It is written in plain human language and has no meaning for a machine. More advanced "APIs" can choose to start each possible error message with a number (an error code) to make it machine readable. This is done on an endpoint-by-endpoint basis, and it's up to the administrator of the server to document the codes.

The idx (index) key is only used if the erroneous form field is a child of a list field, otherwise it's null. Since a list of inputs contains duplicate ids this index number is needed to specify which one of the inputs with that id is erroneous. idx is zero based. It's not necessary to specify the parent list field because no 2 different form fields in a single form are allowed to have the same ID.

Here is an example of the data of an unsuccessful write response:

[
    {
        "id": "username",
        "message": "The username is already taken",
        "idx": null
    },
    {
        "id": "interest",
        "message": "This site is strictly non political, discuss your political opinions elsewhere",
        "idx": 1
    }
]
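
A client can use id and idx to find the input that each error refers to. The following is a sketch, assuming the form data layout from section 3.1. The function name find_erroneous_input is hypothetical, and lists nested inside other lists are not handled.

```python
def find_erroneous_input(form, error):
    """Return the input object that a validation error refers to, or None."""
    for item in form:
        if error["idx"] is None:
            # Not a child of a list: ids are unique within a form.
            if item["id"] == error["id"]:
                return item
        elif item["type"] == "list" and item["value"]:
            # Child of a list: idx (zero based) selects the child array.
            if error["idx"] < len(item["value"]):
                for child in item["value"][error["idx"]]:
                    if child["id"] == error["id"]:
                        return child
    return None


form = [
    {"id": "username", "type": "string", "value": "Example user"},
    {"id": "interests", "type": "list", "value": [
        [{"id": "interest", "type": "string", "value": "Computer science"}],
        [{"id": "interest", "type": "string", "value": "Politics"}],
    ]},
]
errors = [
    {"id": "username", "message": "The username is already taken", "idx": None},
    {"id": "interest", "message": "This site is strictly non political", "idx": 1},
]
for error in errors:
    field = find_erroneous_input(form, error)
    print(field["value"], "->", error["message"])
```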

Appendix: Design decisions that need to be discussed