Add/update OAuth 2.0 flow documentation in IAP #10

Open
wants to merge 7 commits into
base: main
from

Conversation

jesusjuansalgado

Better support of oAuth and examples for:
OAuth Device Code Flow (RFC 8628) (for command line)
OAuth Authorization Code Flow (RFC 8252) (for rich desktop applications)

I have not added the token/token yet as we still need to discuss if we want to promote it further. I think we probably should, as some VO apps already use this.

@mbtaylor
Member

There are some details here that I don't fully follow, for instance:

  • X-VO-Auth-Error header (not a scheme parameter) - what are clients supposed to do with this value? Also it's only applicable to the ivoa_bearer scheme but there may be multiple challenges with different schemes. And it's optional which reduces its usefulness.

  • The ivoa://ivoa.net/sso#OAuth standard_id value - is this expected to be used with one of the schemes that defines the standard_id parameter (ivoa_cookie, ivoa_x509)? If so, it's not clear to me how this would work. If the user can get a bearer token it seems likely that they'd use it directly rather than exchange it for a cookie or X.509 certificate. I'd like to see this in action before adding it as an option here.

As far as the details of the exchanges go I'm not qualified to review it, at least before attempting an implementation, so it should also be reviewed by somebody else who understands OAuth2 (@aragilar?).

But in any case I think it's premature to make an OAuth2 update to the document. Before that happens I would like to see at least one prototype interoperating service and client implementation of the proposal. OAuth2 is complicated and there are quite a lot of moving parts here, so I think it would be wise to prove that the suggested scheme is sufficiently well-described to be implemented by services and clients, and that the authentication works in at least one example context, before absorbing it into the draft text of the document.

@jesusjuansalgado
Author

jesusjuansalgado commented Jun 13, 2025

X-VO-Auth-Error header was introduced to identify the problem related to the authorisation (it could describe that the token is not present but needed, the token is invalid, the token is expired... we can create a vocabulary). I think adding this header could be interesting in general, even for other standard_id (s). I was tempted to add it to the other methods but it is only required for OAuth for the time being.

About the ivoa://ivoa.net/sso#OAuth standard_id , I think OAuth is a different thing from cookies or certificates. It has its own path for development and client-server integration, and this is why it is making use of a different standard_id

About the prototypes, I think the path is usually the opposite. We propose a draft, we try the implementation, and we discover if it is implementable, if there are things fundamentally wrong, or if some feedback needs to be added. I think I can coordinate the implementation at the SRCNet (we will not use either cookies or certificates but bearer tokens), in particular in our astroquery module. We are already implementing RFC 8628, but we just need to add the understanding of the error at our server. The current draft, with cookies and certificates, cannot be used by us (and others)

Yes, we could add @aragilar to review it. I would propose finding engineering experts on OAuth to give feedback. I can identify some engineers at SKAO and other missions to take a look at it, but of course, we need the integration in a draft first (I can generate one from my branch), so they could provide feedback while trying to implement it

Please note that as chair of DSP, I would not be confident to allow the progress of the draft without OAuth and only providing support to cookies or certificates, technologies that are being deprecated. Most of the bigger new observatories need to have more secure systems that use federation, so a standard without this would not be very useful in the short term. We need to include OAuth in one way or another to progress with this standard, and this is why I have made the effort to start the writing of this part

@mbtaylor
Member

About the ivoa://ivoa.net/sso#OAuth standard_id , I think oAuth is a different thing than cookies or certificates. It has its own path, and this is why it is making use of a different standard_id

Agreed that cookies, certificates and OAuth are quite different. But the use of the standard_id parameter defined in AuthVO is, so far, only for use with the ivoa_cookie and ivoa_x509 schemes. Your description of the ivoa_bearer scheme doesn't use it (which is fine, I don't think it's appropriate there). If ivo://ivoa.net/sso#OAuth is a possible value for standard_id, it means that services can issue a challenge like:

   www-authenticate: ivoa_cookie access_url="https://example.org/login",
                                 standard_id="ivo://ivoa.net/sso#OAuth"

In that case the client has to do a lot of work to authenticate using OAuth, as you say "not directly specified by this identifier" to pick up a cookie or certificate for later use. Is this what you envisage? Unless we actually expect services to do that, I don't want to put it in the document since it adds a large implementation burden for clients. The way I'd expect clients to use OAuth is instead using something like the ivoa_oauth challenge you propose, which doesn't use the standard_id parameter.

About the prototypes, I think the path is usually the opposite. We propose a draft, we try the implementation, and we discover if it is implementable, if there are things fundamentally wrong, or if some feedback needs to be added. ... Current draft, with cookies and certificates, cannot be used by us

Also agreed that the existing (cookie, x509) text is no good for OAuth services. My reluctance to accept the PR at this stage is that there is a lot of untested detail there and I can imagine significant changes before we get it right. I think that it will be easier to manage such changes if we experiment using more informal descriptions of the requirements than PRs to standards text. This is generally how I've managed specifications that I've been involved with before (e.g. SAMP, VOParquet). If you disagree strongly with this approach maybe it could be discussed in the TCG.

@jesusjuansalgado
Author

jesusjuansalgado commented Jun 13, 2025

I think there is a misunderstanding (?)
Once the invocation of the service fails (e.g. when trying to invoke a data access URL coming from a data link, as we do in our SRCNet system), the error response returns the information needed to start OAuth. That means the client has to learn where the document with all the OAuth configuration for this service is located. This is not a login URL but something the client needs to parse, e.g. https://auth.example.org/.well-known/openid-configuration (because the negotiation is more complex)
Also, you need to understand the reason for the error to know, for example, whether the token is invalid or needs to be refreshed.
The rest of the example describes standard OAuth, so client developers could modify their existing clients to add the logic and incorporate OAuth (although most will just import an OAuth library)
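
As a rough sketch of the client-side step described above (not taken from the PR text): once the discovery document has been fetched and parsed, a client can read the endpoints from it and pick a flow. The sample document, its URLs and the function name choose_flow are illustrative assumptions; the field names are the ones defined for authorization server metadata in RFC 8414 and RFC 8628.

```python
import json

# Hypothetical discovery document, trimmed to the fields a client needs.
# A real client would download it from the URL given in the challenge.
SAMPLE_DISCOVERY = json.loads("""
{
  "issuer": "https://auth.example.org",
  "authorization_endpoint": "https://auth.example.org/authorize",
  "token_endpoint": "https://auth.example.org/token",
  "device_authorization_endpoint": "https://auth.example.org/devicecode",
  "grant_types_supported": [
    "authorization_code",
    "refresh_token",
    "urn:ietf:params:oauth:grant-type:device_code"
  ]
}
""")

DEVICE_GRANT = "urn:ietf:params:oauth:grant-type:device_code"


def choose_flow(discovery, interactive_desktop):
    """Return (flow_name, endpoint) for a flow the server advertises."""
    # RFC 8414: if grant_types_supported is omitted, this is the default.
    grants = discovery.get("grant_types_supported",
                           ["authorization_code", "implicit"])
    # Command-line tools prefer the device code flow when it is offered.
    if (not interactive_desktop and DEVICE_GRANT in grants
            and "device_authorization_endpoint" in discovery):
        return "device_code", discovery["device_authorization_endpoint"]
    # Desktop applications (or a fallback) use the authorization code flow.
    if "authorization_code" in grants and "authorization_endpoint" in discovery:
        return "authorization_code", discovery["authorization_endpoint"]
    raise ValueError("no supported flow advertised")


print(choose_flow(SAMPLE_DISCOVERY, interactive_desktop=False))
```

In practice most clients would delegate this to an OAuth library, as noted above; the sketch only shows what "parsing the configuration" amounts to.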

As said, as chair I would not be able to accept a standard without bearer tokens, as that would mean describing only protocols that are becoming obsolete and leaving major missions out of the loop

@mbtaylor
Member

Yes, I'm sure we're talking at cross purposes, apologies if I'm not expressing myself clearly. I certainly agree that OAuth2 has to be supported.

You are proposing an ivoa_bearer challenge, which tells the client where to find the configuration document, which sounds fine. But the standard_id parameter, which describes how to submit a username-password pair, is only for use in ivoa_cookie and ivoa_x509 challenges, so it doesn't need to have an OAuth-related value, unless you are saying that people should make cookie- or certificate-based authentication accessible by use of OAuth2. I expect that OAuth2-based authentication doesn't go near cookies or certificates, so that standard_id="ivo://ivoa.net/sso#OAuth" has no use.

@jesusjuansalgado
Author

jesusjuansalgado commented Jun 13, 2025

Ah, I understand your point now. However, I think the purpose of the standard is to allow the identification by clients of an error accessing a particular service due to authorisation problems. This client should be able to understand that this is an error produced by a service that expects a basic authentication (and the error provides the URL to do the login), a certificate authentication or oAuth (where the information on how to handle the authorisation is in one configuration file provided to the client). I have aligned the parameters for the oAuth part, but I think what is wrong now is to assume that all the client-server negotiations are just passing a username and a password (as said in the text). I think this part needs to be more ambiguous to cover more complex negotiations

I am trying to cover all with a similar approach, so this is why I added a new standard_id. The interpretation is different but parameters are the same and the behaviour of the server due to errors very aligned

@mbtaylor
Member

Agreed not all client-server negotiations are passing a username and password; the standard_id parameter is only for those protocols for which username+password is appropriate (and where there are multiple options of how to do that). For ivoa_bearer you don't want a username+password, and the nature of the client-server negotiation is implicit in the challenge type, so I don't think either standard_id or a differently-named parameter with similar meaning is required in this case.

@jesusjuansalgado
Author

I will continue with this branch to see if I can align it more with a username/password approach and I will let you know when this is ready

@mbtaylor
Member

I think that the parameters for the ivoa_bearer scheme need to be adjusted - it doesn't work in the same way as ivoa_cookie and ivoa_x509, so it shouldn't try to use the same parameters, which don't fit it very well.

For ivoa_cookie and ivoa_x509 the login protocol has the same requirements for both of them, i.e. send a username+password, so it makes sense to use the same parameter (standard_id) in both cases.

But ivoa_bearer has a different way of authenticating (not sending a username+password), so it doesn't make sense to use the same parameter with the same set of options. If one of the options for standard_id is ivo://ivoa.net/sso#OAuth then the client would have to figure out what to do if confronted with

   www-authenticate: ivoa_cookie access_url="https://example.com/login",
                                 standard_id="ivo://ivoa.net/sso#OAuth"

which I don't think has a sensible interpretation.

Moreover, the only value for standard_id that makes sense in your ivoa_bearer scheme is standard_id="ivo://ivoa.net/sso#OAuth", so specifying that parameter in that scheme is not doing any work.

Instead ivoa_bearer should come up with its own set of parameters that make sense for what it needs to do, not force usage of the parameters already in use for ivoa_cookie and ivoa_x509.

So you could have something like:

   www-authenticate: ivoa_bearer discovery_url="https://auth.example.org/.well-known/openid-configuration"

Using HTTP response headers like X-VO-Auth-Error and X-VO-Auth-User-Action for additional information about the authentication requirements is also problematic, since the server might issue multiple challenges and it's not clear which one(s) the other headers apply to. If that extra information is necessary (since it's optional I'm not yet convinced that it is) it might be better to supply those as additional challenge parameters, e.g.

   www-authenticate: ivoa_bearer
                     discovery_url="https://auth.example.org/.well-known/openid-configuration",
                     error="missing_token",
                     user_action="hello you have done something wrong"

But I'm not sure about that. What is usual practice in OAuth2 for transmitting information like the fact that the token has expired?
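
For comparison, the plain Bearer scheme (RFC 6750) carries this information as challenge parameters: its own example is error="invalid_token" with error_description="The access token expired". Below is a minimal sketch of how a client might read such parameters from an ivoa_bearer challenge. The parser handles one challenge with quoted-string parameters only, and the function names and the decision policy in next_step are assumptions made for illustration, not part of the proposal.

```python
import re

# key="value" pairs, allowing backslash-escaped characters inside the quotes.
_PARAM = re.compile(r'([A-Za-z_][A-Za-z0-9_-]*)\s*=\s*"((?:[^"\\]|\\.)*)"')


def parse_challenge(header_value):
    """Split a single challenge into (scheme, {param: value})."""
    parts = header_value.split(None, 1)
    if not parts:
        raise ValueError("empty challenge")
    scheme = parts[0].lower()
    rest = parts[1] if len(parts) > 1 else ""
    # Undo quoted-string escaping such as \" inside values.
    params = {key.lower(): re.sub(r"\\(.)", r"\1", value)
              for key, value in _PARAM.findall(rest)}
    return scheme, params


def next_step(params, have_refresh_token):
    """Hypothetical client policy driven by the 'error' parameter."""
    error = params.get("error")
    if error == "invalid_token" and have_refresh_token:
        return "refresh"                 # token expired or revoked
    if error == "insufficient_scope":
        return "reauthorize_with_scope"  # ask again with wider scope
    return "start_flow"                  # no token yet, or nothing to refresh


scheme, params = parse_challenge(
    'ivoa_bearer discovery_url="https://auth.example.org/.well-known/'
    'openid-configuration", error="missing_token"')
print(scheme, params["discovery_url"], next_step(params, False))
```

Because the error sits inside the challenge it belongs to, a response carrying several challenges stays unambiguous, which is the concern raised about the separate headers.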

@jesusjuansalgado
Author

jesusjuansalgado commented Jun 17, 2025

I'm not entirely sure. What the text is proposing is:

www-authenticate: ivoa_bearer access_url="https://auth.example.org/.well-known/openid-configuration",  
                                 standard_id="ivo://ivoa.net/sso#OAuth"

So, whenever the response includes an ivoa_bearer header, the access_url is interpreted as the discovery_url. Since they are conceptually the same (discovery_url provides all the access_url(s) and other metadata for the negotiation), I don’t see any contradictions. In a previous proposal, I used discovery_url, so changing it now would imply that access_url should become optional in the text. That’s why I reused it — to avoid turning everything into optional parameters.

Regarding X-VO-Auth-Error, I understand your point, but I think it could be useful across all methods. We just need to define a controlled vocabulary for possible values. Other methods might use different error statuses, and this could provide a unified way to handle them.

In summary, I think trying to reuse the same parameters across all authentication methods, where possible, could be beneficial. Of course, we could treat them differently if needed (adding as extra metadata in the www-authenticate line)— but in that case, no parameters would be universally required, only those mandated by each standard_id. Nothing would be compulsory in AuthVO

@mbtaylor
Member

It is normal for different authentication schemes to have different parameters, since they have different requirements. The Basic auth scheme (RFC7617) has the parameters "realm" and "charset", Bearer (RFC6750) has "realm" and "scope", and Digest (RFC7616) has parameters "realm", "charset", "domain", "nonce", "opaque", and a bunch of others. Where the meanings are the same they re-use the same ones, but if the meanings are different they use different parameters.

@mbtaylor
Member

I still see the issue that @aragilar pointed out in #6 (comment): a client_id is required for Device Authorization Grant, but this text doesn't say how to get one. James's proposal does say how to get a client_id as well as, I think, giving clear instructions for how to do the rest of the authentication, although I can't say for sure without attempting a client-side implementation. Is that proposal unsuitable for SRCNet operations?
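
To make the dependency on client_id concrete, here is a sketch of the two form-encoded request bodies that RFC 8628 defines for the Device Authorization Grant; both carry the client_id. The helper names and the example client_id and scope values are assumptions, and the bodies would be POSTed to the device authorization endpoint and the token endpoint respectively.

```python
from urllib.parse import parse_qs, urlencode

DEVICE_GRANT = "urn:ietf:params:oauth:grant-type:device_code"


def device_authorization_body(client_id, scope=None):
    """Body of the initial request that yields device_code and user_code."""
    fields = {"client_id": client_id}
    if scope:
        fields["scope"] = scope  # optional, space-separated list
    return urlencode(fields)


def token_poll_body(client_id, device_code):
    """Body of the polling request sent while the user approves the login."""
    return urlencode({
        "grant_type": DEVICE_GRANT,
        "device_code": device_code,
        "client_id": client_id,
    })


# Hypothetical values for illustration only.
first = device_authorization_body("ivoa-pyvo-cli", "openid")
poll = token_poll_body("ivoa-pyvo-cli", "example-device-code")
print(first)
print(parse_qs(poll)["grant_type"])
```

Whatever the outcome of the registration discussion, a client cannot build either request until it has obtained some client_id.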

@jesusjuansalgado
Author

jesusjuansalgado commented Jun 20, 2025

The preferred way to do it is with preregistered clients. At the SRCNet we use Indigo IAM, and the steps are:

  • Log in as IAM admin
  • Navigate to the Admin Console
  • Go to Client Management → "Add Client"
  • Set:
      • Client ID: e.g., ivoa-pyvo-cli
      • Client type: Public
      • Grant types: enable:
          • urn:ietf:params:oauth:grant-type:device_code
          • authorization_code (for desktop)
          • optionally refresh_token
      • Scopes: openid, email, profile, vo.read, etc.
      • Redirect URIs:
          • For Device Code: not required
          • For native apps: e.g., http://localhost:*
  • Enable Client credentials only if the client is trusted to act without a user
  • Save and document the client_id for external clients (e.g., for pyvo)

For Keycloak (we could migrate to it at some point) the steps are similar:

  • Go to Clients → Create
  • Enter:
      • client_id = ivoa-pyvo-cli
  • Enable public client (no client secret)
  • Enable device_authorization_grant
  • Set scopes (e.g., openid, profile, vo.read)

So, this is done by the admins and it depends on the IAM system.
There is a dynamic way to get a client_id, based on Dynamic Client Registration (RFC 7591), but I do not think this is secure enough and some providers might not accept it.
What can we do? We can maintain a list of VO client_ids and update our services accordingly. Something like:

client_ids for VO Clients

| Client / Tool         | client_id          | Notes                                                       |
|-----------------------|--------------------|-------------------------------------------------------------|
| TOPCAT / STILTS       | ivoa-stilts        | Covers both desktop GUI (TOPCAT) and CLI (STILTS) workflows |
| pyVO (CLI scripts)    | ivoa-pyvo-cli      | Headless Python client for SIA, TAP, and other VO protocols |
| Astroquery            | ivoa-astroquery    | Programmatic VO and archive access via Python               |
| Aladin Desktop        | ivoa-aladin        | Interactive VO image/data viewer                            |
| vo-cli                | ivoa-vo-cli        | General VO command-line interface                           |
| VO Notebook (Jupyter) | ivoa-vo-notebook   | For VO-enabled workflows in Jupyter or web notebooks        |
| GAVO DaCHS Portal     | ivoa-dachs-service | Internal DaCHS service/client communication                 |
| Web-based VO Portals  | ivoa-web-portal    | Browser-based VO tools and data explorers                   |
| Test Clients          | ivoa-test-client   | Used for sandbox, development, or interoperability testing  |

But in this standard, we should say that the clients should be preregistered and that the list is maintained by the IVOA (maybe, propose some concrete examples (?))

@mbtaylor
Member

I think you are proposing here a federated IVOA-wide list of client_ids accepted by all OAuth2-based providers in the VO that want to be accessible in this way. From a client point of view that would make things quite straightforward, though it could make it harder for new clients to use the system, unless there's a general-purpose ID like "ivoa-client" suitable for anybody to use.

Do you (or other readers) think that VO resource providers would accept client_id management done like this? My feeling is that RFC7591 dynamic client registration would be more palatable to resource providers, but I don't have server-side experience, so I might be wrong.
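
For readers weighing the two options, this is roughly what the client side of RFC 7591 dynamic registration involves: a JSON document POSTed to the registration_endpoint advertised in the server metadata, to which the server replies with a client_id. The sketch below only builds that document. The function name, the client names and the choice of grant types are assumptions; the metadata field names come from RFC 7591, and individual servers may require more fields or an initial access token.

```python
import json

DEVICE_GRANT = "urn:ietf:params:oauth:grant-type:device_code"


def registration_request(client_name, redirect_uris=()):
    """Build an RFC 7591 client metadata document for a public client."""
    metadata = {
        "client_name": client_name,
        # "none" marks a public client: no client secret is issued.
        "token_endpoint_auth_method": "none",
        "grant_types": [DEVICE_GRANT, "refresh_token"],
    }
    if redirect_uris:
        # Desktop applications using the authorization code flow also
        # declare where the authorization response may be delivered.
        metadata["redirect_uris"] = list(redirect_uris)
        metadata["grant_types"].insert(0, "authorization_code")
        metadata["response_types"] = ["code"]
    return json.dumps(metadata, indent=2)


# Hypothetical command-line client with no redirect URI.
print(registration_request("example-vo-cli"))
```

The cost to the client is one extra request and somewhere to store the returned client_id; the open question in this thread is the cost and risk on the provider side.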

@jesusjuansalgado
Author

jesusjuansalgado commented Jun 20, 2025

I think the resource providers will accept the preregistered approach, but they will be reluctant to accept dynamic client_id creation based on RFC 7591. RFC 7591 is not a good fit, in my view, for the IVOA.
In the IVOA, different services and identity providers are managed by different organizations, so allowing clients to register without approval creates trust and security problems. It becomes hard to tell which clients are legitimate, and there’s a higher risk of abuse or misconfiguration.
Most IVOA clients—like data tools, scripts, and web apps—are known ahead of time and don’t change often, so they can be registered manually and securely, and it would be easy to coordinate

@mbtaylor
Member

OK, let's ask around and see what the opinions are of OAuth2 VO service providers.

@jesusjuansalgado
Author

BTW, I have updated the text to go back to discovery_url, to carry the errors as error and error_message in the same WWW-Authenticate: definition (removing the other extra keywords), and to add a short sentence on the client_id that could be extended once we have an agreement with providers. Note that the "Common Challenge Parameters for VO Schemes" part is now inconsistent, as access_url is not present for OAuth. That could be fixed later once we have a common view

@andamian
Contributor

My 2 cents:

  1. I don't know why we talk about OAuth (authorization) when we actually want OpenID Connect (authentication). OAuth2 maybe? https://www.okta.com/identity-101/whats-the-difference-between-oauth-openid-connect-and-saml/. For OIDC we would only need the issuer URL, since .well-known/openid-configuration is part of the standard (or just .well-known, as suggested in "Support to bearer tokens" #6 (comment)).
  2. There are already mechanisms for dealing with authentication errors that the generic tools already understand, so I don't see the benefit of X-VO-Auth-Error. It won't be useful for certs, which fail during the SSL handshake. Well-behaved clients could also check the token for expiry and allowed_origins before sending the token.
  3. I'm not sure what the purpose of the client_id list is. I'm not an OIDC specialist, but clients (dynamic or pre-approved) are created to assign different security policies and limit impersonations (through matching of the redirect URLs in requests). Secure clients are also configured with a client secret, but that only works with service clients due to the security nature of the secret. Unless we can provide a reason for those clients' existence, I don't think they should be mentioned at all.
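
The client-side expiry check mentioned in point 2 could look like the sketch below. It assumes the access token happens to be a JWT, which OAuth2 does not guarantee; opaque tokens are simply treated as "cannot tell". The function names and the 30-second leeway are illustrative choices, and no signature is verified, since the aim is only to avoid sending a token that is already known to be stale.

```python
import base64
import json
import time


def jwt_expiry(token):
    """Return the 'exp' claim of a JWT-formatted token, or None."""
    parts = token.split(".")
    if len(parts) != 3:
        return None  # opaque token: nothing to inspect
    # base64url payloads are sent without padding, so restore it first.
    payload = parts[1] + "=" * (-len(parts[1]) % 4)
    try:
        claims = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError:
        return None  # not valid base64, UTF-8 or JSON
    exp = claims.get("exp") if isinstance(claims, dict) else None
    return exp if isinstance(exp, (int, float)) else None


def looks_expired(token, now=None, leeway=30):
    """True only when the token carries an 'exp' that has (nearly) passed."""
    exp = jwt_expiry(token)
    if exp is None:
        return False  # unknown: let the server decide
    current = time.time() if now is None else now
    return current >= exp - leeway


print(looks_expired("opaque-token"))
```

A client that finds the token expired can go straight to a refresh or a new flow instead of waiting for a 401 response.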

@jesusjuansalgado
Author

jesusjuansalgado commented Jun 21, 2025

Hi Adrian,
1- I understand we are describing authorisation (not only authentication). In all the examples we are trying to, e.g., access a particular data product, not only authenticate. We could mention OAuth2 instead of OAuth. Anyway, we reference specific RFCs throughout the text, so there are no ambiguities. About the link to the discovery URL: giving the issuer URL could be enough too, letting the client derive the OpenID configuration URL. But if the only reason to provide the issuer URL is to deduce where the OpenID configuration is located (assuming that the file is going to be in the expected place), that adds a complexity to the client workflow (already complex) that, in my view, is unnecessary. There are at least two files that might need to be discovered (/.well-known/openid-configuration and /.well-known/oauth-authorization-server) if we mention nothing but the issuer URL. I think we would do better to describe the expected configuration file location and the metadata inside (which creates a contract for client and server providers).
I even described the expected metadata inside the configuration file because many VO client developers would need to adapt them to be able to work with the new clients so, although it is complex, the simpler we can make the description, the better.

2- X-VO-Auth-Error was already removed and the errors were integrated into the header (see my comment just before yours and the new text). This is not a problem for the bearer token description, but I still think it would have been a good addition for all the methods. OAuth2 errors have a defined vocabulary, but this is not true in general for other methods, so I think it is a missed opportunity to standardise this at VO level. Yes, I know that there could be more than one challenge, but we could have more than one X-VO-Auth-Error in the response describing the reason for the error, so that clients could decide on the next step

3- Well, the client_id is part of the standard, so the original authors of OAuth2 considered it necessary for security reasons (maybe we do not understand it, but they did). I think we could use a default client_id and remove it from the text (this is what we are doing in the first version implemented at the SRCNet). However, reading the documentation:

  • In public flows like Device Code or Authorization Code with PKCE, the client_id is still required (these workflows are used by some clients)
  • client_ids (e.g., ivoa-stilts, ivoa-pyvo-cli) allow IdPs to apply security profiles per tool, like:
      • Acceptable scopes
      • Rate limits
      • Federation rules
      • ...

We could investigate more to understand the reasons but, as said, if this is part of the standard there should be a reason. We are not inventing this.
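
Point 1 of this comment notes that a client given only the issuer URL may have to probe two metadata locations. A small sketch of that derivation follows. The function name and the example issuers are assumptions; the two URL constructions follow OpenID Connect Discovery (suffix appended to the issuer) and RFC 8414 (well-known segment inserted before the issuer's path).

```python
from urllib.parse import urlsplit, urlunsplit


def metadata_candidates(issuer):
    """Return the two metadata URLs a client might try for an issuer."""
    parts = urlsplit(issuer)
    path = parts.path.rstrip("/")
    # OpenID Connect Discovery: append the suffix to the issuer URL.
    oidc = urlunsplit((parts.scheme, parts.netloc,
                       path + "/.well-known/openid-configuration", "", ""))
    # RFC 8414: insert the well-known segment between host and path.
    rfc8414 = urlunsplit((parts.scheme, parts.netloc,
                          "/.well-known/oauth-authorization-server" + path,
                          "", ""))
    return [oidc, rfc8414]


for url in metadata_candidates("https://auth.example.org/tenant1"):
    print(url)
```

The two rules differ as soon as the issuer has a path component, which is part of the argument above for publishing the full discovery URL in the challenge instead of the issuer alone.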

@aragilar

RFC 7591 I think is required for the VO to effectively use OAuth2/OIDC. There is no point relying on the client_id remaining secret, and all it does is encourage everyone to reuse an existing client_id they found on the internet (this is a well-known thing that all OAuth2/OIDC providers deal with). Let's rather save everyone that pain and do things right (and it's far better that providers assume they have to deal with hostile clients and take preventative measures than assuming that a client_id really does mean a specific client is being used).

For the VO, I think clients shouldn't care whether it's OAuth 2.0, OAuth 2.1, or OIDC, because clients shouldn't need any of the profile details to function (you don't need the user's full name or institute or email). The authorization and resource server can communicate however they wish (and OIDC makes sense to standardise on from a VO service provider's side), so I haven't really worried about the distinction between the versions.

@jesusjuansalgado
Author

jesusjuansalgado commented Jun 21, 2025

Hi @aragilar, I completely agree with your first point. Dynamic registration is a more secure (if correctly implemented) and scalable option, but it may introduce too much complexity for the current VO ecosystem. Pre-registering public client_ids can still be useful, for example, to scope or restrict their capabilities — e.g., ensuring that stilts can't request unrelated scopes like email (so, at policy level). Of course, authorisation servers must assume clients could be hostile (not just in the VO, but in OAuth in general). So yes, pre-registered public clients offer limited protection and shouldn’t be treated as trusted but they can still help to limit the damage of impersonation.

Dynamic registration (RFC 7591) would provide better isolation and tracking, but implementing it correctly, particularly with access policies and registration control, could be difficult for many VO service providers. They lose some control over the clients registered on the fly, so I think a bad server-side implementation facilitates attacks (from my partial understanding of the problem). Anyway, maybe this is something we could revisit later after talking to server providers. I do not have a strong position on this at all.

On your second point, I fully agree. I would avoid locking the standard to any specific OAuth version. The goal is to describe a discovery-based mechanism that lets clients adapt to the server’s authentication capabilities. The more generic and future-proof the top-level description is, the better. We should treat the examples (e.g., device code, authorisation code, etc.) as extensible, not exhaustive, ideally, avoiding full rewrites of the spec when new flows emerge in the IVOA ecosystem.

@aragilar

@jesusjuansalgado There is nothing stopping groups from using OAuth2 currently (the ESO archive uses it, and I've written a Python wrapper around it https://dev.aao.org.au/adacs/eso-downloader; Rubin uses it; Data Central uses it both internally and as an IdP for MWA, SkyMapper and CASDA, though we don't advertise it for VO client use, primarily because we don't have RFC 7591 configured as needed for the VO yet; and there are probably others I don't know about). Unlike the cookies/client certs (where there is not a standard, hence AuthVO), no-one needs a VO rec to implement it with pre-agreed client IDs; this has already happened. If we keep doing that, I expect we will end up with silos and with users unhappy that they cannot use their preferred tool against a specific VO service.

@jesusjuansalgado
Author

@aragilar As commented before, the SRCNet is in the same situation about client_id. We are implementing oAuth2 but we do not ask for client_ids. The original question from Mark is, should we do something in the context of VO about it?
The first answer is to use RFC 7591 (which, as I understand, nobody is using and which could add complexity). The second answer is preregistered clients (nobody is prevented from using the system, because some generic client_ids could be defined, like ivoa-command-line). The third answer is to ignore them (basically what we have been doing until now)

My vote is always for whatever is closest to "do not complicate things", as we are not defining OAuth2. We are just trying to define how to react to a failed authorisation and to guide the client, in the error response, on how to use the system to correct it by providing some basic info. So, based on this general criterion, I prefer option 3 or, if we want to promote something a little more controlled, option 2. Some preregistered clients could help to improve the access policies, but the protection is partial because they are public clients. Option 1, which nobody is implementing and which looks more complex for clients and servers, is something I do not like too much but, as said, this is not really the part we should focus on in this standard. If we use option 1 we just need to refer to the standard (and wait for clients and servers to implement it)

We all understand the difference between the 3 approaches so we just need to vote, I think. Unless there is extra reasoning that has not been discussed, the three options are fine


4 participants