CEVO Techforum 1 Hack-a-thon session: How do we manage transitions between major versions of protocols?

Tuesday 4 February 2020, 16:00-17:30

Participants: Markus Demleitner, Marco Molinaro, Gregory Mantelet, Gilles Landais, Mark Taylor

Minutes: Marco Molinaro, Markus Demleitner

Moving from one major revision of a protocol to another is a nontrivial operation in a distributed data system like the VO. We have not really had one of those yet (SIAPv2 doesn't really count, as it doesn't seem this will be a transition so much as long-term coexistence).

A good solution needs to avoid a flag day, i.e., a single date when all services and/or all clients must be updated, because that's something we can't do in the VO; clients have a half-life of perhaps two years, and services, at least at smaller providers, more than that.

This means:

1 -- version 1 clients should keep working for several years and show the services they've shown before. Users really hate it when their data disappears from their clients.

2 -- if services aren't updated immediately, they shouldn't disappear from clients when these are updated (provided the clients still contain support for the previous versions, which we ought to encourage).

3 -- clients supporting both version 1 and 2 need some way to figure out how to query each data collection exactly once (i.e., discard version 1 interfaces when there's a version 2 interface).

The concrete use case we were looking at is a possible major revision of Simple Cone Search, but the considerations are (mostly) general.

To satisfy point 3, a solution has to be put in place to find the unique ConeSearch interface needed by a specific client.

VOResource aspect: Right now, the plan for version-aware discovery is to have capabilities with standardIDs such as i://ConeSearch#query-2.0 (where i:// abbreviates ivo://ivoa.net/std/ in this document), i://ConeSearch#query-2.1, and i://ConeSearch#query-3.0.
Each of these will contain (essentially) one interface for the version corresponding to the minor-version-exact id. This was done to save clients the trouble of selecting interfaces in the simple cases we've had so far. However, interface selection is going to be necessary in a world of mirrors and auth anyway, and if you try this plan in discovery in the presence of different major versions, queries become really ugly, too.

So, alternative: Go back to the original VOResource plan of having multiple interfaces in a single capability::

[ignoring for the moment the question why anyone would want to offer two interfaces with different minor versions].

This would be bad for legacy clients that (at least currently) blindly grab a random interface from a capability, as they might get the 2.0 interface, which would then make them break.

So (also for equivalence of interfaces within a capability, and in order to let us change extension type schemas on major version updates), let's have one capability per major version, like this::

With this, here's a sketch of how this might look in RegTAP::

  rr.interface table          rr.capability table
  ivoid cap# std_version      ivoid cap# std_id
  id1   1    NULL             id1   1    i://ConeSearch
  id1   1    1.0              id1   2    i://ConeSearch2
  id1   1    1.1
  id1   2    2.0
  id1   2    2.1

On this, a subquery like::

  (SELECT TOP 1 ivoid, access_url
   FROM rr.interface NATURAL JOIN rr.capability
   WHERE standard_id IN ('i://ConeSearch', 'i://ConeSearch2')
   ORDER BY std_version DESC)

would just about give the access URL a modern client should use (ignoring auth and mirrors for now). This doesn't quite work yet because of the NULLs in std_version, whose position in the ordering isn't predictable (background: long story involving erroneous XSD defaulting in VOResource 1.0 and resulting constraints on VOResource 1.1). This probably needs to be solved and treated in RegTAP.

Now, for ConeSearch, all that doesn't work anyway.
This is because at this point 4.5k SCS Registry records have multiple ConeSearches per resource, mostly where a single paper has multiple tables. It can be argued that's wrong anyway because these different tables should have different metadata, but there's no way we can fix this.

The way around this *might* be that SCS2 clients retrieve all SCS1 and SCS2 capabilities and then discard all SCS1 records if at least one SCS2 record is present. Ouch.

An alternative for SCS evolution is to design SCS 1.2 (so, no new major version) such that it reacts to SCS 1.1 parameters (RA/DEC/SR) as SCS 1.1 does (including old UCDs and all that), and then have extra, non-1.1 parameters that are mutually exclusive with RA/DEC/SR. This would also let us introduce an optional "catalog" query param to let people retrieve results from multiple tables in one query.

Note that the concern about multiple capabilities with the same standardID is specific to cone search; there's one SIAP service that exhibits this that we'd need to look at, but that's it outside of SCS. And in general we really shouldn't bunch up non-equivalent (in some sense of the word) interfaces in different capabilities of the same service (cf. caproles).
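
To make the client-side workaround mentioned above more concrete (retrieve all SCS1 and SCS2 capabilities, then discard the SCS1 ones when an SCS2 one is present), here is a minimal Python sketch. It is not something the session agreed on: the function name `select_interfaces`, the row shape, the example ivoids and URLs are all made up for illustration, and it assumes that the "at least one SCS2" test is applied per resource.

```python
from collections import defaultdict

# "i://" abbreviates ivo://ivoa.net/std/ as elsewhere in these minutes.
# i://ConeSearch2 is the tentative standardID from the RegTAP sketch.
SCS1 = "i://ConeSearch"
SCS2 = "i://ConeSearch2"


def select_interfaces(rows):
    """Return {ivoid: [access_url, ...]}, keeping only the newest major version.

    rows is an iterable of (ivoid, standard_id, access_url) tuples; this
    shape is an assumption about what a client gets back from the Registry.
    """
    by_resource = defaultdict(list)
    for ivoid, standard_id, access_url in rows:
        by_resource[ivoid].append((standard_id, access_url))

    selected = {}
    for ivoid, interfaces in by_resource.items():
        # Assumption: the "at least one SCS2 record" test is done per resource.
        has_scs2 = any(std == SCS2 for std, _ in interfaces)
        wanted = SCS2 if has_scs2 else SCS1
        selected[ivoid] = [url for std, url in interfaces if std == wanted]
    return selected


# Made-up records: one resource that already offers SCS2, and one that is
# still SCS1-only with two tables (the multiple-ConeSearch case).
rows = [
    ("ivo://example/migrated", SCS1, "http://example.org/m/scs1?"),
    ("ivo://example/migrated", SCS2, "http://example.org/m/scs2?"),
    ("ivo://example/legacy", SCS1, "http://example.org/l/table1/scs?"),
    ("ivo://example/legacy", SCS1, "http://example.org/l/table2/scs?"),
]

selected = select_interfaces(rows)
for ivoid, urls in sorted(selected.items()):
    print(ivoid, urls)
```

Note that this drops every SCS1 interface of a resource as soon as a single SCS2 interface shows up, whether or not all of that resource's tables are reachable through SCS2; a client has no way to tell from these records alone.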