Cédric Champeau
on 30 Nov 19
Introduce module metadata verification
This commit introduces verification of metadata files.
For this, another refactoring of the `ModuleSo… Show more
Introduce module metadata verification

This commit introduces verification of metadata files.

For this, another refactoring of the `ModuleSource` concept

was required. Ironically, before this commit, `ModuleSource`

used to be available for serialization in the module metadata

serializer. However, they weren't used in practice, because

all the required data could be reconstructed from the caches.

In particular, there was this "contentHash" which, because

not properly serialized, was actually set as a field on the

component resolve metadata itself, instead of being part

of the module source.

Now, this commit reintroduces serialization of module sources

but takes a different approach by splitting the module sources

in two distinct categories:

- module sources which can be reconstructed from known data,

such as the repository name and repository url

- module sources which have to be serialized alongside component

metadata, because they can't be reconstructed from sources

The latter category includes this "contentHash", serialized with

the descriptor hash module source. It also includes the information

about _which_ actual descriptor file was used to generate the

binary module descriptor (e.g, the source POM, Ivy or module

metadata file). This information does _not_ belong to the module

component resolve metadata itself, so it belongs to its sources.

For this purpose, serialization of module sources has been updated

so that instead of using Java serialization, module sources need

to provide a custom serializer, called a Codec. Those codecs are

uniquely identified by an id which is just an integer. This is

done for performance optimization, in order to avoid to serialize

the name of the codec and have to load it dynamically. Instead,

Gradle knows about the full set of serializers (and there's no

way for a user to add more because in any case it would require

an update of the module metadata store format).

This makes it much more efficient to serialize module sources

(because we can now have an optimized encoder), but it also

permits reconstructing module sources from incomplete information.

In particular, the module source which describes the file from

which a component resolve metadata was sourced contains a link

to the actual file in the local artifact store. However, in order

to be relocatable, we _don't_ want this file path to be stored

in the metadata cache. This means that instead of storing the

path, we actually store the artifact identifier and the hash

of the descriptor so that we can, when loaded from cache, find

its location back.

Currently, metadata verification is enabled for all components.

It's not possible to disable verification of metadata.

Show less