Index Format
The following defines the format of the index. New features are occasionally added, which are only understood starting with the version of Cargo that introduced them. Older versions of Cargo may not be able to use packages that make use of new features. However, the format for older packages should not change, so older versions of Cargo should be able to use them.
Index Configuration
The root of the index contains a file named config.json which contains JSON information used by Cargo for accessing the registry. This is an example of what the crates.io config file looks like:
{
    "dl": "https://crates.io/api/v1/crates",
    "api": "https://crates.io"
}
The keys are:
- dl: This is the URL for downloading crates listed in the index. The value may have the following markers which will be replaced with their corresponding value:
  - {crate}: The name of the crate.
  - {version}: The crate version.
  - {prefix}: A directory prefix computed from the crate name. For example, a crate named cargo has a prefix of ca/rg. See below for details.
  - {lowerprefix}: Lowercase variant of {prefix}.
  - {sha256-checksum}: The crate’s sha256 checksum.
  If none of the markers are present, then the value /{crate}/{version}/download is appended to the end.
- api: This is the base URL for the web API. This key is optional, but if it is not specified, commands such as cargo publish will not work. The web API is described below.
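The marker substitution described for the dl key can be sketched as follows. This is a minimal illustration and not Cargo's actual implementation: the function names dir_prefix and download_url are hypothetical, and the sketch assumes that {prefix} follows the same directory tiers as the index files, including for names shorter than four characters.

```python
MARKERS = ("{crate}", "{version}", "{prefix}", "{lowerprefix}", "{sha256-checksum}")


def dir_prefix(name):
    """Directory prefix for a crate name (assumed to follow the index tiers)."""
    if not name:
        raise ValueError("crate name must not be empty")
    if len(name) == 1:
        return "1"
    if len(name) == 2:
        return "2"
    if len(name) == 3:
        return "3/" + name[0]
    return name[:2] + "/" + name[2:4]


def download_url(dl, name, version, cksum):
    """Expand the `dl` value from config.json into a download URL."""
    if not any(marker in dl for marker in MARKERS):
        # No markers present: the default suffix is appended to the end.
        dl = dl + "/{crate}/{version}/download"
    values = {
        "{crate}": name,
        "{version}": version,
        "{prefix}": dir_prefix(name),
        "{lowerprefix}": dir_prefix(name.lower()),
        "{sha256-checksum}": cksum,
    }
    for marker, value in values.items():
        dl = dl.replace(marker, value)
    return dl
```

With the crates.io value shown above, download_url("https://crates.io/api/v1/crates", "cargo", "0.1.0", cksum) yields https://crates.io/api/v1/crates/cargo/0.1.0/download, because the value contains no markers.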
Download Endpoint
The download endpoint should send the .crate file for the requested package.
Cargo supports https, http, and file URLs, HTTP redirects, HTTP1 and HTTP2. The exact specifics of TLS support depend on the platform that Cargo is running on, the version of Cargo, and how it was compiled.
Index files
The rest of the index repository contains one file for each package, where the filename is the name of the package in lowercase. Each version of the package has a separate line in the file. The files are organized in a tier of directories:
- Packages with 1 character names are placed in a directory named 1.
- Packages with 2 character names are placed in a directory named 2.
- Packages with 3 character names are placed in the directory 3/{first-character} where {first-character} is the first character of the package name.
- All other packages are stored in directories named {first-two}/{second-two} where the top directory is the first two characters of the package name, and the next subdirectory is the third and fourth characters of the package name. For example, cargo would be stored in a file named ca/rg/cargo.
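The directory tiers above can be expressed as a small function. This is a sketch under the rules just listed; the name index_path is hypothetical, and the whole path is lowercased because index filenames are lowercase.

```python
def index_path(package_name):
    """Return the relative path of a package's index file."""
    name = package_name.lower()  # index filenames are always lowercase
    if not name:
        raise ValueError("package name must not be empty")
    if len(name) == 1:
        return "1/" + name
    if len(name) == 2:
        return "2/" + name
    if len(name) == 3:
        return "3/" + name[0] + "/" + name
    # First two characters, then the third and fourth characters.
    return name[:2] + "/" + name[2:4] + "/" + name
```

For example, index_path("cargo") gives ca/rg/cargo, matching the example in the list above.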
Note: Although the index filenames are in lowercase, the fields that contain package names in Cargo.toml and the index JSON data are case-sensitive and may contain upper and lower case characters.
The directory name above is calculated based on the package name converted to lowercase; it is represented by the marker {lowerprefix}. When the original package name is used without case conversion, the resulting directory name is represented by the marker {prefix}. For example, the package MyCrate would have a {prefix} of My/Cr and a {lowerprefix} of my/cr. In general, using {prefix} is recommended over {lowerprefix}, but there are pros and cons to each choice. Using {prefix} on case-insensitive filesystems results in (harmless-but-inelegant) directory aliasing. For example, crate and CrateTwo have {prefix} values of cr/at and Cr/at; these are distinct on Unix machines but alias to the same directory on Windows. Using directories with normalized case avoids aliasing, but on case-sensitive filesystems it’s harder to support older versions of Cargo that lack {prefix}/{lowerprefix}. For example, nginx rewrite rules can easily construct {prefix} but can’t perform case-conversion to construct {lowerprefix}.
Registries should consider enforcing limitations on package names added to their index. Cargo itself allows names with any alphanumeric, -, or _ characters. crates.io imposes its own limitations, including the following:
- Only allows ASCII characters.
- Only alphanumeric, -, and _ characters.
- First character must be alphabetic.
- Case-insensitive collision detection.
- Prevent differences of - vs _.
- Under a specific length (max 64).
- Rejects reserved names, such as Windows special filenames like “nul”.
Registries should consider incorporating similar restrictions, and consider the security implications, such as IDN homograph attacks and other concerns in UTR36 and UTS39.
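A registry that wants restrictions similar to the crates.io list above might start from something like the following sketch. It is not the crates.io implementation. The names is_valid_name, canonical_key, and collides are hypothetical, the reserved set contains only the single example given above, and comparing reserved names case-insensitively is an assumption.

```python
import re

MAX_NAME_LEN = 64
# Assumption: only "nul" is listed because it is the one example given above.
# A real registry would maintain a fuller list of reserved names.
RESERVED_NAMES = {"nul"}
# ASCII only: first character alphabetic, then alphanumeric, `-`, or `_`.
NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def is_valid_name(name):
    """Check the character, length, and reserved-name rules."""
    if len(name) > MAX_NAME_LEN:
        return False
    if NAME_RE.fullmatch(name) is None:
        return False
    return name.lower() not in RESERVED_NAMES


def canonical_key(name):
    """Key used for collision detection: ignores case and `-` vs `_`."""
    return name.lower().replace("-", "_")


def collides(name, existing_names):
    """True if `name` differs from an existing name only by case or `-`/`_`."""
    taken = {canonical_key(existing) for existing in existing_names}
    return canonical_key(name) in taken
```

A publish handler could reject a new package when is_valid_name returns False or collides returns True. This sketch does not address the Unicode concerns mentioned above, since it rejects non-ASCII names outright.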
Each line in a package file contains a JSON object that describes a published version of the package. The following is a pretty-printed example with comments explaining the format of the entry.
{
    // The name of the package.
    // This must only contain alphanumeric, `-`, or `_` characters.
    "name": "foo",
    // The version of the package this row is describing.
    // This must be a valid version number according to the Semantic
    // Versioning 2.0.0 spec at https://semver.org/.
    "vers": "0.1.0",
    // Array of direct dependencies of the package.
    "deps": [
        {
            // Name of the dependency.
            // If the dependency is renamed from the original package name,
            // this is the new name. The original package name is stored in
            // the `package` field.
            "name": "rand",
            // The SemVer requirement for this dependency.
            // This must be a valid version requirement defined at
            // https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html.
            "req": "^0.6",
            // Array of features (as strings) enabled for this dependency.
            "features": ["i128_support"],
            // Boolean of whether or not this is an optional dependency.
            "optional": false,
            // Boolean of whether or not default features are enabled.
            "default_features": true,
            // The target platform for the dependency.
            // null if not a target dependency.
            // Otherwise, a string such as "cfg(windows)".
            "target": null,
            // The dependency kind.
            // "dev", "build", or "normal".
            // Note: this is a required field, but a small number of entries
            // exist in the crates.io index with either a missing or null
            // `kind` field due to implementation bugs.
            "kind": "normal",
            // The URL of the index of the registry where this dependency is
            // from as a string. If not specified or null, it is assumed the
            // dependency is in the current registry.
            "registry": null,
            // If the dependency is renamed, this is a string of the actual
            // package name. If not specified or null, this dependency is not
            // renamed.
            "package": null
        }
    ],
    // A SHA256 checksum of the `.crate` file.
    "cksum": "d867001db0e2b6e0496f9fac96930e2d42233ecd3ca0413e0753d4c7695d289c",
    // Set of features defined for the package.
    // Each feature maps to an array of features or dependencies it enables.
    "features": {
        "extras": ["rand/simd_support"]
    },
    // Boolean of whether or not this version has been yanked.
    "yanked": false,
    // The `links` string value from the package's manifest, or null if not
    // specified. This field is optional and defaults to null.
    "links": null,
    // An unsigned 32-bit integer value indicating the schema version of this
    // entry.
    //
    // If this is not specified, it should be interpreted as the default of 1.
    //
    // Cargo (starting with version 1.51) will ignore versions it does not
    // recognize. This provides a method to safely introduce changes to index
    // entries and allow older versions of cargo to ignore newer entries it
    // doesn't understand. Versions older than 1.51 ignore this field, and
    // thus may misinterpret the meaning of the index entry.
    //
    // The current values are:
    //
    // * 1: The schema as documented here, not including newer additions.
    //      This is honored in Rust version 1.51 and newer.
    // * 2: The addition of the `features2` field.
    //      This is honored in Rust version 1.60 and newer.
    "v": 2,
    // This optional field contains features with new, extended syntax.
    // Specifically, namespaced features (`dep:`) and weak dependencies
    // (`pkg?/feat`).
    //
    // This is separated from `features` because versions older than 1.19
    // will fail to load due to not being able to parse the new syntax, even
    // with a `Cargo.lock` file.
    //
    // Cargo will merge any values listed here with the "features" field.
    //
    // If this field is included, the "v" field should be set to at least 2.
    //
    // Registries are not required to use this field for extended feature
    // syntax; they are allowed to include those in the "features" field.
    // Using this is only necessary if the registry wants to support cargo
    // versions older than 1.19, which in practice is only crates.io since
    // those older versions do not support other registries.
    "features2": {
        "serde": ["dep:serde", "chrono?/serde"]
    }
}
The JSON objects should not be modified after they are added except for the yanked field whose value may change at any time.
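To illustrate how a consumer might read a package file, here is a minimal sketch that parses one JSON object per line, skips entries whose v value is newer than it understands, and merges features2 into features. The function name load_versions is hypothetical, and the merge strategy (appending the features2 values to any existing list for the same feature) is an assumption about what merging means, not a statement of Cargo's exact behavior.

```python
import json

# Highest schema version this sketch understands (values 1 and 2 are documented).
MAX_SCHEMA_VERSION = 2


def load_versions(index_file_text):
    """Parse the contents of one index file into a list of version entries."""
    entries = []
    for line in index_file_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        entry = json.loads(line)
        # A missing `v` is interpreted as the default of 1.
        if (entry.get("v") or 1) > MAX_SCHEMA_VERSION:
            continue  # ignore entries from a newer, unknown schema
        features = {k: list(v) for k, v in (entry.get("features") or {}).items()}
        # Assumption: merge by appending `features2` values per feature name.
        for name, values in (entry.get("features2") or {}).items():
            features.setdefault(name, []).extend(values)
        entry["features"] = features
        entry.pop("features2", None)
        entries.append(entry)
    return entries
```

Real index lines are plain JSON without the explanatory comments shown in the pretty-printed example, so json.loads can parse them directly.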
Note: The index JSON format has subtle differences from the JSON format of the Publish API and cargo metadata. If you are using one of those as a source to generate index entries, you are encouraged to carefully inspect the documentation differences between them.

For the Publish API, the differences are:
- deps
  - name: When the dependency is renamed in Cargo.toml, the Publish API puts the original package name in the name field and the aliased name in the explicit_name_in_toml field. The index places the aliased name in the name field, and the original package name in the package field.
  - req: The Publish API field is called version_req.
- cksum: The Publish API does not specify the checksum; it must be computed by the registry before adding to the index.
- features: Some features may be placed in the features2 field. Note: This is only a legacy requirement for crates.io; other registries should not need to bother with modifying the features map. The v field indicates the presence of the features2 field.
- The Publish API includes several other fields, such as description and readme, which don’t appear in the index. These are intended to make it easier for a registry to obtain the metadata about the crate to display on a website without needing to extract and parse the .crate file. This additional information is typically added to a database on the registry server.

For cargo metadata, the differences are:
- vers: The cargo metadata field is called version.
- deps
  - name: When the dependency is renamed in Cargo.toml, cargo metadata puts the original package name in the name field and the aliased name in the rename field. The index places the aliased name in the name field, and the original package name in the package field.
  - default_features: The cargo metadata field is called uses_default_features.
  - registry: cargo metadata uses a value of null to indicate that the dependency comes from crates.io. The index uses a value of null to indicate that the dependency comes from the same registry as the index. When creating an index entry, a registry other than crates.io should translate a value of null to be https://github.com/rust-lang/crates.io-index and translate a URL that matches the current index to be null.
- cargo metadata includes some extra fields, such as source and path.
- The index includes additional fields such as yanked, cksum, and v.
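The Publish API differences for dependencies can be sketched as a small translation step. This is an illustration only: the function names are hypothetical, and the sketch assumes the remaining dependency fields can be copied through unchanged, which should be checked against the Publish API documentation.

```python
import hashlib


def publish_dep_to_index_dep(dep):
    """Translate one Publish API dependency object into index form."""
    # Copy everything except the two fields that are renamed or restructured.
    # Assumption: the other fields have the same meaning in both formats.
    skipped = ("version_req", "explicit_name_in_toml")
    out = {key: value for key, value in dep.items() if key not in skipped}
    out["req"] = dep["version_req"]
    alias = dep.get("explicit_name_in_toml")
    if alias:
        # Renamed dependency: the index stores the alias in `name`
        # and the original package name in `package`.
        out["name"] = alias
        out["package"] = dep["name"]
    else:
        out["package"] = None
    return out


def crate_checksum(crate_bytes):
    """SHA256 of the .crate file, which the registry computes itself."""
    return hashlib.sha256(crate_bytes).hexdigest()
```

For a dependency published as name rand with explicit_name_in_toml set to my_rand, the index entry produced here has name my_rand and package rand.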
Index Protocols
Cargo supports two remote registry protocols: git and sparse. The git protocol stores index files in a git repository and the sparse protocol fetches individual files over HTTP.
Git Protocol
The git protocol has no protocol prefix in the index URL. For example, the git index URL for crates.io is https://github.com/rust-lang/crates.io-index.
Cargo caches the git repository on disk so that it can fetch updates incrementally and efficiently.
Sparse Protocol
The sparse protocol uses the sparse+ protocol prefix in the registry URL. For example, the sparse index URL for crates.io is sparse+https://index.crates.io/.
The sparse protocol downloads each index file using an individual HTTP request. Since this results in a large number of small HTTP requests, performance is significantly improved with a server that supports pipelining and HTTP/2.
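As a rough illustration of how a client might derive the URL for a single file, the sketch below strips the sparse+ prefix and appends a path relative to the index root. The function name sparse_file_url is hypothetical, and the assumption that files are addressed relative to the remaining URL should be verified against the registry in use.

```python
SPARSE_PREFIX = "sparse+"


def sparse_file_url(registry_url, relative_path):
    """Build the HTTP URL for one file of a sparse index."""
    if not registry_url.startswith(SPARSE_PREFIX):
        raise ValueError("not a sparse registry URL: " + registry_url)
    base = registry_url[len(SPARSE_PREFIX):]
    # Join with exactly one slash between the base URL and the path.
    return base.rstrip("/") + "/" + relative_path.lstrip("/")
```

For example, sparse_file_url("sparse+https://index.crates.io/", "ca/rg/cargo") gives https://index.crates.io/ca/rg/cargo, and passing config.json gives the location of the configuration file at the index root.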
Caching
Cargo caches the crate metadata files, and captures the ETag or Last-Modified HTTP header from the server for each entry. When refreshing crate metadata, Cargo sends the If-None-Match or If-Modified-Since header to allow the server to respond with HTTP 304 “Not Modified” if the local cache is valid, saving time and bandwidth. If both ETag and Last-Modified headers are present, Cargo uses the ETag only.
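The header selection described above can be sketched as a pure function. The name conditional_headers is hypothetical, and the arguments are assumed to be the values captured from an earlier response, or None when the server did not send them.

```python
def conditional_headers(etag=None, last_modified=None):
    """Choose the conditional request header for a cached index file."""
    if etag:
        # When both validators were captured, only the ETag is used.
        return {"If-None-Match": etag}
    if last_modified:
        return {"If-Modified-Since": last_modified}
    return {}  # nothing cached: send an unconditional request
```

A client would add the returned headers to its request and keep its cached copy when the server answers with 304.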
Cache Invalidation
If a registry is using some kind of CDN or proxy which caches access to the index files, then it is recommended that registries implement some form of cache invalidation when the files are updated. If these caches are not updated, then users may not be able to access new crates until the cache is cleared.
Nonexistent Crates
For crates that do not exist, the registry should respond with a 404 “Not Found”, 410 “Gone” or 451 “Unavailable For Legal Reasons” code.
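A client or test harness might map response codes to outcomes along these lines. The function name and the returned labels are hypothetical, and treating every other status as an error is a simplification of this sketch.

```python
# Status codes a registry should use for crates that do not exist.
NOT_FOUND_CODES = (404, 410, 451)


def classify_index_response(status):
    """Map an HTTP status for an index file request to a coarse outcome."""
    if status == 200:
        return "updated"      # fresh content was returned
    if status == 304:
        return "cache-valid"  # the cached copy is still current
    if status in NOT_FOUND_CODES:
        return "not-found"    # the crate does not exist in this registry
    return "error"            # simplification: everything else is an error
```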
Sparse Limitations
Since the URL of the registry is stored in the lockfile, it’s not recommended to offer a registry with both protocols. Discussion about a transition plan is ongoing in issue #10964. The crates.io registry is an exception, since Cargo internally substitutes the equivalent git URL when the sparse protocol is used.
If a registry does offer both protocols, it’s currently recommended to choose one protocol as the canonical protocol and use source replacement for the other protocol.