Package resolution algorithm
Overview
Paket uses the paket.dependencies
file to specify
project dependencies. Usually only direct dependencies are specified and often a
broad range of package versions is allowed. During
paket install
Paket needs to figure out concrete
versions of the specified packages and their transitive dependencies. These
versions are then persisted to the paket.lock
file.
In order to figure out the concrete versions Paket needs to solve the following constraint satisfaction problem:
-
Select the latest version for each of the packages in the
paket.dependencies
file, plus all their transitive dependencies, such that all version constraints are satisfied.
Note:
- In general, more than one solution to this problem can exist and the solver will take the first solution that it finds.
- If you change the resolution strategy then Paket needs to find the oldest matching version
Getting data
The constraint satisfaction problem is covered by many scientific papers, but a big challenge for Paket's resolver is that it doesn't have the full constraints available. The algorithm needs to evaluate the package dependency graph along the way by retrieving data from the NuGet source feeds.
The two important API questions are:
- What versions of a package are available?
- Given a concrete version of a package, what further dependencies are required?
Answering these questions is a very expensive operation since it involves a HTTP request and therefore the resolver has to minimize these requests and only access the API when really needed.
Basic algorithm
Starting from the paket.dependencies
file we have a
set of package requirements. Every requirement specifies a version range and a
resolver strategy for a package:
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: |
|
The algorithm works as a Breadth-first search. In every step it selects a requirement from the set of open requirements and checks if the requirement can be satisfied. If no conflict arises then a package version gets selected and all it's dependencies are added to the open requirements. A set of closed requirements is maintained in order to prevent cycles in the search graph.
If the selected requirement results in a conflict then the algorithm backtracks in the search tree and selects the next version.
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: 29: 30: 31: 32: 33: 34: 35: 36: 37: 38: 39: 40: 41: 42: |
|
Sorting package requirements
In order to make progress in the search tree the algorithm needs to determine the next package. Paket uses a heuristic, which tries to process packages with small version ranges and high conflict potential first. Therefore, it orders the requirements based on:
- Is the version pinned?
- Is it a direct requirement coming from the dependencies file?
-
Is the
resolution strategy
Min
orMax
? - How big is the current package specific boost factor?
- How big is the specified version range?
- The package name (alphabetically) as a tie breaker.
Package conflict boost
Whenever Paket encounters a package version conflict in the search tree it increases a boost factor for the involved packages. This heuristic influences the package evaluation order and forces the resolver to deal with conflicts much earlier in the search tree.
Branch and bound
Every known resolution conflict is stored in a HashSet
. At every step Paket
will always check if the current requirement set (union of open requirements
and closed requirements) is a superset of a known conflict. In this case Paket
can stop evaluating that part of the search tree.
Caching and lazy evaluation
Since HTTP requests to NuGet are very expensive Paket tries to reduce these calls as much as possible:
-
The function
getAllVersionsFromNuget
will call the NuGet API at most once per package and Paket run. -
The function
getPackageDetails
will only call the NuGet API if package details are not found in the RAM or on disk.
The second caching improvement means that subsequent runs of paket update
can
get faster since package details are already stored on disk.
Error reporting
If the resolver can't find a valid resolution, then it needs to report an error to the user. Since the search tree can be very large and might contain lots of different kinds of failures, reporting a good error message is difficult. Paket will only report the last conflict that it can't resolve and also some information about the origin of this conflict.
If you need more information you can try the verbose mode by using the
--verbose
parameter.
Further reading
- Modular lazy search for Constraint Satisfaction Problems by T. Nordin and A. Tolmach (PDF)
- On The Forward Checking Algorithm by F. Bacchus and A. Grove (PDF)
- Structuring Depth-First Search Algorithms in Haskell by D. King and J. Launchbury (PDF)
- Qualified Goals in the Cabal Solver (Video)
val string : value:'T -> string
--------------------
type string = System.String
| DependenciesFile of string
| Package of PackageName * SemVerInfo
| Max
| Min
{Name: PackageName;
VersionRequirement: VersionRequirement;
ResolverStrategy: ResolverStrategy;
Parent: PackageRequirementSource;
Sources: PackageSource list;}
PackageRequirement.VersionRequirement: VersionRequirement
--------------------
type VersionRequirement = string
PackageRequirement.ResolverStrategy: ResolverStrategy
--------------------
type ResolverStrategy =
| Max
| Min
Orders the requirement set with a heuristic and selects the next requirement
module Set
from Microsoft.FSharp.Collections
--------------------
type Set<'T (requires comparison)> =
interface IComparable
interface IEnumerable
interface IEnumerable<'T>
interface ICollection<'T>
new : elements:seq<'T> -> Set<'T>
member Add : value:'T -> Set<'T>
member Contains : value:'T -> bool
override Equals : obj -> bool
member IsProperSubsetOf : otherSet:Set<'T> -> bool
member IsProperSupersetOf : otherSet:Set<'T> -> bool
...
--------------------
new : elements:seq<'T> -> Set<'T>
from Microsoft.FSharp.Collections
Calls the NuGet API and retrieves all versions for a package
Checks if the given version is in the specified version range
from Microsoft.FSharp.Core.Operators
Looks into the cache if the algorithm already selected that package
Puts all dependencies of the package into the open set
| Ok of Set<ResolvedPackage>
| Conflict of Set<PackageRequirement>
module List
from Microsoft.FSharp.Collections
--------------------
type List<'T> =
| ( [] )
| ( :: ) of Head: 'T * Tail: 'T list
interface IEnumerable
interface IEnumerable<'T>
member GetSlice : startIndex:int option * endIndex:int option -> 'T list
member Head : 'T
member IsEmpty : bool
member Item : index:int -> 'T with get
member Length : int
member Tail : 'T list
static member Cons : head:'T * tail:'T list -> 'T list
static member Empty : 'T list