# Package Federation: Custom Registries

**Note: this is the feature as it was initially specified and does not necessarily reflect the current behavior.**

**Up-to-date documentation is available at [Registries](../users/registries.md).**

As it is now, vcpkg has over 1400 ports in the default registry (the `/ports` directory).
For the majority of users, this repository of packages is enough. However, many enterprises
need to more closely control their dependencies for one reason or another, and this document
lays out a method which we will build into vcpkg for exactly that reason.

## Background

A registry is simply a set of packages. In fact, there is already a registry in vcpkg: the default one.
Package federation, implemented via custom registries, allows one to add new packages,
edit existing packages, and have as much or as little control as one likes over the dependencies that one uses.
It gives the control over dependencies that an enterprise requires.

### How Does the Current Default Registry Work?

Of course, the existing vcpkg tool does have packages in the official,
default registry. The way we describe these packages is in the ports tree –
at the base of the vcpkg install directory, there is a directory named `ports`,
which contains on the order of 1300 directories, one for each package. Then,
in each package directory, there are at least two files: a `CONTROL` or
`vcpkg.json` file, which contains the name, version, description, and features
of the package; and a `portfile.cmake` file, which contains the information on
how to download and build the package. There may be other files in a package
directory, like patches or usage instructions, but only those two files are
needed.

### Existing vcpkg Registry-like Features

There are some existing features in vcpkg that act somewhat like a custom
registry. The most obvious feature that we have is overlay ports – this
feature allows you to specify any number of directories as "overlays", which
either contain a package definition directly, or which contain some number of
package directories; these overlays will be used instead of the ports tree
for packages that exist in both places, and are specified exclusively on the
command line. Unfortunately, if one installs a package from
overlay ports that does not exist in the ports tree, one must pass these
overlays to every subsequent vcpkg installation command.

There is also the less obvious "feature" which works by virtue of the ports
tree being user-editable: one can always edit the ports tree on one's own
machine, and can even fork vcpkg and publish one's own ports tree.
Unfortunately, this then means that any updates to the source tree require
merges, as opposed to being able to fast-forward to the newest sources.

### Why Registries?

There are many reasons to want custom registries; however, the most important reasons are:

* Legal requirements – a company like Microsoft or Google
  needs the ability to strictly control the code that goes into their products,
  making certain that they are following the licenses strictly.
  * There have been examples in the past where a library which is licensed under certain terms contains code
    which is not legally allowed to be licensed under those terms (see [this example][legal-example],
    where a person tried to merge Microsoft-owned, Apache-licensed code into the GPL-licensed libstdc++).
* Technical requirements – a company may wish to run their own tests on the packages they ship,
  such as [fuzzing].
* Other requirements – an organization may wish to strictly control its dependencies for a myriad of other reasons.
  * Newer versions – vcpkg may not necessarily always be up to date for all libraries in our registry,
    and an organization may require a newer version than we ship;
    they can very easily update this package and have the version that they want.
  * Port modifications – vcpkg has somewhat strict policies on port modifications,
    and an organization may wish to make different modifications than we do.
    It may allow that organization to make certain that the package works on triplets
    that our team does not test as extensively.
  * Testing – just like port modifications, if a team wants to do specific testing on triplets they care about,
    they can do so via their custom registry.

Then, there is the question of why vcpkg needs a new solution for custom registries,
beyond the existing overlay ports feature. There are two big reasons –
the first is to allow a project to define the registries that it uses for its dependencies,
and the second is the clear advantage in the user experience of the vcpkg tool.
If a project requires specific packages to come from specific registries,
it can say so without worrying that a user accidentally misses the overlay ports part of a command.
Additionally, beyond being a feature which makes overlay ports easier to use,
custom registries allow for more complex and useful infrastructure around registries.
In the initial custom registry implementation, we will allow overlay ports style paths,
as well as git repositories, which means that people can run and use custom registries
without writing their own infrastructure for distributing that registry.

It is the intention of vcpkg to be the most user-friendly package manager for C++,
and this allows us to fulfill that intention even further.
As opposed to having to write `--overlay-ports=path/to/overlay` for every command one runs,
or adding an environment variable `VCPKG_OVERLAY_PORTS`,
one can simply write `vcpkg install` and the registries will be taken care of.
As opposed to having to use git submodules, or custom registry code for every project,
one can write and run the infrastructure in one place,
and every project that uses that registry requires only a few lines of JSON.

[legal-example]: https://gcc.gnu.org/legacy-ml/libstdc++/2019-09/msg00054.html
[fuzzing]: https://en.wikipedia.org/wiki/Fuzzing

## Specification

We will be adding a new file that vcpkg understands: `vcpkg-configuration.json`.
The way that vcpkg will find this file is different depending on what mode vcpkg is in:
in classic mode, vcpkg finds this file alongside the vcpkg binary, in the root directory.
In manifest mode, vcpkg finds this file alongside the manifest. For the initial implementation,
this is all vcpkg will look for; however, in the future, vcpkg will walk the tree and include
configuration all along the way: this allows for overriding defaults.
The specific algorithm for applying this is not yet defined, since currently only one
`vcpkg-configuration.json` is allowed.

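The lookup rule for the initial implementation can be sketched in a few lines of Python. This is only an illustration of the two cases described above, not vcpkg's actual code; the function name `configuration_path` and the convention of passing `None` for classic mode are assumptions made for the example.

```python
from pathlib import Path
from typing import Optional

CONFIG_NAME = "vcpkg-configuration.json"


def configuration_path(vcpkg_root: Path, manifest_dir: Optional[Path]) -> Path:
    """Return the single location where the configuration file is expected.

    manifest_dir is the directory containing vcpkg.json in manifest mode,
    or None in classic mode (a convention assumed for this sketch).
    """
    # Classic mode: next to the vcpkg binary, in the root directory.
    # Manifest mode: next to the manifest.
    base = vcpkg_root if manifest_dir is None else manifest_dir
    return base / CONFIG_NAME
```

No walking up the directory tree is attempted, matching the statement that only one `vcpkg-configuration.json` is considered for now.
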
The only thing allowed in a `vcpkg-configuration.json` is a `<configuration>` object.

A `<configuration>` is an object:
* Optionally, `"default-registry"`: A `<registry-implementation>` or `null`
* Optionally, `"registries"`: An array of `<registry>`s

Since this is the first RFC to define this object,
these are, as of now, the only properties that can live in it.

A `<registry-implementation>` is an object matching one of the following:
* `<registry-implementation.builtin>`:
  * `"kind"`: The string `"builtin"`
* `<registry-implementation.directory>`:
  * `"kind"`: The string `"directory"`
  * `"path"`: A path
* `<registry-implementation.git>`:
  * `"kind"`: The string `"git"`
  * `"repository"`: A URI
  * Optionally, `"path"`: An absolute path into the git repository
  * Optionally, `"ref"`: A git reference

A `<registry>` is a `<registry-implementation>` object, plus the following properties:
* Optionally, `"scopes"`: An array of `<package-name>`s
* Optionally, `"packages"`: An array of `<package-name>`s

The `"packages"` and `"scopes"` fields of distinct registries must be disjoint,
and each `<registry>` must have at least one of the `"scopes"` and `"packages"` properties,
since otherwise there's no point.

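The two constraints on `<registry>` objects could be checked along the following lines. This is a hedged sketch, not vcpkg's validation code: the function name `validate_registries` is hypothetical, and it assumes "disjoint" means that no package name appears in the `"packages"` arrays of two different registries and no scope appears in the `"scopes"` arrays of two different registries.

```python
from typing import Any


def validate_registries(registries: list[dict[str, Any]]) -> list[str]:
    """Return a list of problems found; an empty list means the array is valid."""
    problems: list[str] = []
    # Maps each claimed name to the index of the first registry that claimed it.
    seen = {"packages": {}, "scopes": {}}
    for index, registry in enumerate(registries):
        if "scopes" not in registry and "packages" not in registry:
            problems.append(f"registry {index} has neither 'scopes' nor 'packages'")
        for field in ("packages", "scopes"):
            for name in registry.get(field, []):
                owner = seen[field].setdefault(name, index)
                if owner != index:
                    problems.append(
                        f"'{name}' appears in '{field}' of registries {owner} and {index}"
                    )
    return problems
```

For example, two registries that both list `"boost"` under `"scopes"` would yield one problem, as would a registry that carries only a `"kind"`.
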
As an example, a project which uses a different default registry, and a different registry for boost,
might look like the following:

```json
{
  "default-registry": {
    "kind": "directory",
    "path": "vcpkg-ports"
  },
  "registries": [
    {
      "kind": "git",
      "repository": "https://github.com/boostorg/vcpkg-ports",
      "ref": "v1.73.0",
      "scopes": [ "boost" ]
    },
    {
      "kind": "builtin",
      "packages": [ "cppitertools" ]
    }
  ]
}
```

With this configuration, a package that no registry lists explicitly, such as `fmt`,
is installed from the default registry at `<directory-of-vcpkg.json>/vcpkg-ports`;
`cppitertools` is installed from the registry that ships with vcpkg,
and any `boost` dependencies from `https://github.com/boostorg/vcpkg-ports`.
Notably, this does not replace behavior up the tree -- only the `vcpkg-configuration.json`s
for the current invocation do anything.

### Behavior

When a vcpkg command requires the installation of dependencies,
it will generate the initial list of dependencies from the package,
and then run the following algorithm on each dependency:

1. Figure out which registry the package should come from by doing the following:
    1. If there is a registry in the registry set which contains the dependency name in the `"packages"` array,
       then use that registry.
    2. For every scope, in order from most specific to least,
       if there is a registry in the registry set which contains that scope in the `"scopes"` array,
       then use that registry.
       (For example, for `"cat.meow.cute"`, check first for `"cat.meow.cute"`, then `"cat.meow"`, then `"cat"`).
    3. If the default registry is not `null`, use that registry.
    4. Else, error.
2. Then, add that package's dependencies to the list of packages to find, and repeat for the next dependency.

vcpkg will also rerun this algorithm whenever an install is run with a different configuration.

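The steps above can be sketched as follows. This is an illustrative Python rendering, not the vcpkg implementation: `select_registry`, `resolve_all`, and the `dependencies_of` callback are hypothetical names, registries are represented as plain dictionaries in the shape of the JSON above, and scopes are assumed to be the `.`-separated prefixes shown in the `"cat.meow.cute"` example.

```python
from typing import Callable, Iterable, Optional

Registry = dict


def select_registry(name: str, registries: list, default_registry: Optional[Registry]) -> Registry:
    """Pick the registry for one package name (step 1 of the algorithm)."""
    # 1.1: an explicit "packages" entry wins.
    for registry in registries:
        if name in registry.get("packages", []):
            return registry
    # 1.2: try scopes from most specific to least specific.
    parts = name.split(".")
    for length in range(len(parts), 0, -1):
        scope = ".".join(parts[:length])
        for registry in registries:
            if scope in registry.get("scopes", []):
                return registry
    # 1.3: fall back to the default registry unless it is null.
    if default_registry is not None:
        return default_registry
    # 1.4: otherwise, error.
    raise LookupError(f"no registry provides '{name}'")


def resolve_all(
    initial: Iterable[str],
    registries: list,
    default_registry: Optional[Registry],
    dependencies_of: Callable[[str, Registry], Iterable[str]],
) -> dict:
    """Resolve the initial dependencies and everything they pull in (step 2)."""
    resolved: dict = {}
    pending = list(initial)
    while pending:
        name = pending.pop()
        if name in resolved:
            continue
        registry = select_registry(name, registries, default_registry)
        resolved[name] = registry
        # dependencies_of stands in for reading the port's own dependency list.
        pending.extend(dependencies_of(name, registry))
    return resolved
```

With the example configuration from the previous section, `select_registry("cppitertools", ...)` returns the builtin registry, and a name such as `fmt` falls through to the default registry.
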
### How Registries are Laid Out

There are three kinds of registries, but they only differ in how the registry gets onto one's filesystem.
Once the registry is there, package lookup runs the same, with each kind having its own way of defining its
own root.

In order to find a port `meow` in a registry with root `R`, vcpkg first sees if `R/meow` exists;
if it does, then the port root is `R/meow`. Otherwise, see if `R/m-` exists; if it does,
then the port root is `R/m-/meow`. (Note: this algorithm may be extended further in the future.)

For example, given the following registry root:

```
R/
  abseil/...
  b-/
    boost/...
    boost-build/...
    banana/...
  banana/...
```

The port root for `abseil` is `R/abseil`; the port root for `boost` is `R/b-/boost`;
the port root for `banana` is `R/banana` (although this duplication is not recommended).

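A minimal sketch of this lookup, assuming the registry is already on disk. The function name `find_port_root` is hypothetical, and returning `None` when `R/m-` exists but `R/m-/meow` does not is an assumption; the text above does not say what happens in that case.

```python
from pathlib import Path
from typing import Optional


def find_port_root(registry_root: Path, port: str) -> Optional[Path]:
    """Locate the directory of `port` under `registry_root`, or None if absent."""
    # First choice: the port sits directly under the registry root.
    direct = registry_root / port
    if direct.is_dir():
        return direct
    # Second choice: the port sits in a bucket named after its first letter,
    # for example R/b-/boost. No port name ends in '-', so buckets never
    # collide with real ports.
    bucketed = registry_root / f"{port[0]}-" / port
    if bucketed.is_dir():
        return bucketed
    return None
```

Because the direct location is checked first, `banana` in the tree above resolves to `R/banana` even though `R/b-/banana` also exists.
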
The reason we are making this change to allow more levels in the ports tree is that ~1300
ports are hard to look through in a tree view, and this allows us to see only the ports we're
interested in. Additionally, no port name may end in a `-`, so this means that these port subdirectories
will never intersect with actual ports. Additionally, since we use only ASCII for port names,
we don't have to worry about graphemes vs. code units vs. code points -- in ASCII, they are equivalent.

Let's now look at how different registry kinds work:

#### `<registry.builtin>`

For a `<registry.builtin>`, there is no configuration required.
The registry root is simply `<vcpkg-root>/ports`.

#### `<registry.directory>`

For a `<registry.directory>`, it is again fairly simple.
Given `$path`, the value of the `"path"` property, the registry root is determined as follows:

* If `$path` is absolute, then the registry root is `$path`.
* If `$path` is drive-relative (only important on Windows), the registry root is
  `(drive of vcpkg.json)/$path`
* If `$path` is relative, the registry root is `(directory of vcpkg.json)/$path`

Note that the path to vcpkg.json is _not_ canonicalized; it is used exactly as it is seen by vcpkg.

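The three cases can be illustrated with Python's `pathlib`. This is a sketch under stated assumptions rather than vcpkg's implementation: the function name `directory_registry_root` is hypothetical, and `PureWindowsPath` is used so that the drive-relative case can be shown on any platform. Pure paths never touch the filesystem, so nothing is canonicalized.

```python
from pathlib import PureWindowsPath


def directory_registry_root(manifest_dir: str, path: str) -> PureWindowsPath:
    """Compute the registry root for a "directory" registry.

    manifest_dir is the directory of vcpkg.json exactly as vcpkg saw it;
    path is the value of the "path" property.
    """
    base = PureWindowsPath(manifest_dir)
    target = PureWindowsPath(path)
    if target.is_absolute():
        # Drive and root both present, e.g. D:/ports.
        return target
    if target.root:
        # Drive-relative, e.g. /ports: borrow the drive of vcpkg.json.
        return PureWindowsPath(base.drive + target.as_posix())
    # Plain relative path: resolve against the directory of vcpkg.json.
    return base / target
```

For a manifest in `C:/proj`, the path `vcpkg-ports` gives `C:\proj\vcpkg-ports`, while `/ports` gives `C:\ports`.
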
#### `<registry.git>`

This registry is the most complex. We would like to cache existing registries,
but we don't want to ignore new updates to the registry.
It is the opinion of the author that we should err on the side of finding more updates rather than fewer,
so we will update the registry whenever the `vcpkg.json` or `vcpkg-configuration.json`
is modified. We will do so by keeping a sha512 hash of the `vcpkg.json` and `vcpkg-configuration.json`
inside the `vcpkg-installed` directory.

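One way such a change check could work is sketched below. The details are assumptions made for illustration only: the stamp file name `registry-inputs.sha512`, the choice to store one digest per file, and treating a missing file as empty are not specified by this document.

```python
import hashlib
from pathlib import Path

TRACKED_FILES = ("vcpkg.json", "vcpkg-configuration.json")
STAMP_NAME = "registry-inputs.sha512"  # hypothetical file name


def configuration_fingerprint(manifest_dir: Path) -> str:
    """Return one 'digest  name' line per tracked file."""
    lines = []
    for name in TRACKED_FILES:
        file = manifest_dir / name
        # A missing file is hashed as empty content (an assumption of this sketch).
        data = file.read_bytes() if file.is_file() else b""
        lines.append(f"{hashlib.sha512(data).hexdigest()}  {name}")
    return "\n".join(lines)


def registries_need_update(manifest_dir: Path, installed_dir: Path) -> bool:
    """Compare against the stored fingerprint and refresh it when it differs."""
    stamp = installed_dir / STAMP_NAME
    current = configuration_fingerprint(manifest_dir)
    if stamp.is_file() and stamp.read_text() == current:
        return False
    installed_dir.mkdir(parents=True, exist_ok=True)
    stamp.write_text(current)
    return True
```

The first call for a project reports that an update is needed; later calls report `False` until either tracked file changes.
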
We will download the specific ref of the repository to a central location (and update as needed),
and the root will be either: `<path to repository>`, if the `"path"` property is not defined,
or else `<path to repository>/<path property>` if it is defined.
The `"path"` property must be written as an absolute path, without a drive,
and will be interpreted relative to the root of the repository. For example:

```json
{
  "kind": "git",
  "repository": "https://github.com/microsoft/vcpkg",
  "path": "/ports"
}
```

is the correct way to refer to the registry built in to vcpkg, at the latest version.

The following are all incorrect:

```json
{
  "$reason": "path can't be drive-absolute",
  "kind": "git",
  "repository": "https://github.com/microsoft/vcpkg",
  "path": "F:/ports"
}
```

```json
{
  "$reason": "path can't be relative",
  "kind": "git",
  "repository": "https://github.com/microsoft/vcpkg",
  "path": "ports"
}
```

```json
{
  "$reason": "path _really_ can't be relative like that",
  "kind": "git",
  "repository": "https://github.com/microsoft/vcpkg",
  "path": "../../meow/ports"
}
```
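The rules for the `"path"` property of a git registry could be enforced roughly as follows. This is a sketch, not vcpkg's code: the function name `git_registry_root` is hypothetical, the checkout location is passed in by the caller, and rejecting `..` components inside an otherwise absolute path is an extra assumption that the text above does not state.

```python
from pathlib import PurePosixPath, PureWindowsPath
from typing import Optional


def git_registry_root(repository_dir: PurePosixPath, path: Optional[str]) -> PurePosixPath:
    """Return the registry root inside a checked-out repository."""
    if path is None:
        # No "path" property: the repository itself is the registry root.
        return repository_dir
    if PureWindowsPath(path).drive:
        raise ValueError("path can't be drive-absolute")
    if not path.startswith("/"):
        raise ValueError("path can't be relative")
    components = PurePosixPath(path).parts[1:]  # drop the leading root
    if ".." in components:
        # Assumption of this sketch: never allow escaping the repository.
        raise ValueError("path can't contain '..'")
    return repository_dir.joinpath(*components)
```

Given a checkout at `/cache/vcpkg`, the path `/ports` resolves to `/cache/vcpkg/ports`, while each of the three incorrect examples raises `ValueError`.
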