Title: Combinatorial Efficient Global Optimization
Description: Model building, surrogate model based optimization and Efficient Global Optimization in combinatorial or mixed search spaces.
Authors: Martin Zaefferer [aut, cre]
Maintainer: Martin Zaefferer <[email protected]>
License: GPL (>= 3)
Version: 2.4.4
Built: 2025-03-10 04:49:26 UTC
Source: https://github.com/cran/CEGO
Combinatorial Efficient Global Optimization
Model building, surrogate model based optimization and Efficient Global Optimization in combinatorial or mixed search spaces. This includes methods for distance calculation, modeling and handling of indefinite kernels/distances.
Package: CEGO
Type: Package
Version: 2.4.3
Date: 2024-01-27
License: GPL (>= 3)
LazyLoad: yes
This work has been partially supported by the Federal Ministry of Education and Research (BMBF) under the grants CIMO (FKZ 17002X11) and MCIOP (FKZ 17N0311).
Martin Zaefferer [email protected]
Zaefferer, Martin; Stork, Joerg; Friese, Martina; Fischbach, Andreas; Naujoks, Boris; Bartz-Beielstein, Thomas. (2014). Efficient global optimization for combinatorial problems. In Proceedings of the 2014 conference on Genetic and evolutionary computation (GECCO '14). ACM, New York, NY, USA, 871-878. DOI=10.1145/2576768.2598282
Zaefferer, Martin; Stork, Joerg; Bartz-Beielstein, Thomas. (2014). Distance Measures for Permutations in Combinatorial Efficient Global Optimization. In Parallel Problem Solving from Nature - PPSN XIII (p. 373-383). Springer International Publishing.
Zaefferer, Martin and Bartz-Beielstein, Thomas (2016). Efficient Global Optimization with Indefinite Kernels. Parallel Problem Solving from Nature-PPSN XIV. Accepted, in press. Springer.
Interface of main function: optimCEGO
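The building blocks documented below (benchmark generators, distance measures, landscape generators, and the modelKriging surrogate) are typically combined via optimCEGO. The following minimal sketch illustrates that workflow; the control entries used here (creationFunction, distanceFunction, budget, model) and the result field xbest are assumptions based on the package design rather than taken from this reference.
# sketch only: control entry names and result fields are assumptions, not verified against optimCEGO's documentation
dF <- distancePermutationHamming        # distance measure for permutations
cF <- function() sample(5)              # random generator for candidate permutations
lF <- landscapeGeneratorUNI(1:5, dF)    # objective function with optimum at 1:5
res <- optimCEGO(x = NULL, fun = lF,
       control = list(creationFunction = cF, distanceFunction = dF,
                      budget = 15, model = modelKriging))
res$xbest                               # assumed field holding the best permutation found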
Creates a benchmark function for the Flow Shop Scheduling Problem (FSP).
benchmarkGeneratorFSP(a, n, m)
a |
matrix of processing times for each step and each machine |
n |
number of jobs |
m |
number of machines |
the function of type cost=f(permutation)
benchmarkGeneratorQAP, benchmarkGeneratorTSP, benchmarkGeneratorWT
n=10
m=4
#create a matrix of processing times
A <- matrix(sample(n*m,replace=TRUE),n,m)
#create FSP objective function
fun <- benchmarkGeneratorFSP(A,n,m)
#evaluate
fun(1:n)
fun(n:1)
Generates MaxCut problems, with binary decision variables. The MaxCut Problems are transformed to minimization problems by negation.
benchmarkGeneratorMaxCut(N, A)
N |
length of the bit strings |
A |
The adjacency matrix of the graph. Will be created at random if not provided. |
the function of type cost=f(bitstring). Returned fitness values will be negative, for the purpose of minimization.
fun <- benchmarkGeneratorMaxCut(N=6)
fun(c(1,0,1,1,0,0))
fun(c(1,0,1,1,0,1))
fun(c(0,1,0,0,1,1))
fun <- benchmarkGeneratorMaxCut(A=matrix(c(0,1,0,1,1,0,1,0,0,1,0,1,1,0,1,0),4,4))
fun(c(1,0,1,0))
fun(c(1,0,1,1))
fun(c(0,1,0,1))
Function that generates a NK-Landscapes.
benchmarkGeneratorNKL(N = 10, K = 1, PI = 1:K, g)
N |
length of the bit strings |
K |
number of neighbours contributing to fitness of one position |
PI |
vector, giving relative positions of each neighbour in the bit-string |
g |
set of fitness functions for each possible combination of string components. Will be randomly determined if not specified. Should have N rows, and 2^(K+1) columns. |
the function of type cost=f(bitstring). Returned fitness values will be negative, for the purpose of minimization.
fun <- benchmarkGeneratorNKL(6,2)
fun(c(1,0,1,1,0,0))
fun(c(1,0,1,1,0,1))
fun(c(0,1,0,0,1,1))
fun <- benchmarkGeneratorNKL(6,3)
fun(c(1,0,1,1,0,0))
fun <- benchmarkGeneratorNKL(6,2,c(-1,1))
fun(c(1,0,1,1,0,0))
fun <- benchmarkGeneratorNKL(6,2,c(-1,1),g=matrix(runif(48),6))
fun(c(1,0,1,1,0,0))
fun(sample(c(0,1),6,TRUE))
Creates a benchmark function for the Quadratic Assignment Problem.
benchmarkGeneratorQAP(a, b)
a |
distance matrix |
b |
flow matrix |
the function of type cost=f(permutation)
benchmarkGeneratorFSP, benchmarkGeneratorTSP, benchmarkGeneratorWT
set.seed(1)
n=5
#create a flow matrix
A <- matrix(0,n,n)
for(i in 1:n){
  for(j in i:n){
    if(i!=j){
      A[i,j] <- sample(100,1)
      A[j,i] <- A[i,j]
    }
  }
}
#create a distance matrix
locations <- matrix(runif(n*2)*10,,2)
B <- as.matrix(dist(locations))
#create QAP objective function
fun <- benchmarkGeneratorQAP(A,B)
#evaluate
fun(1:n)
fun(n:1)
Creates a benchmark function for the (Asymmetric) Travelling Salesperson Problem. Path (Do not return to start of tour. Start and end of tour not fixed.) or Cycle (Return to start of tour). Symmetry depends on supplied distance matrix.
benchmarkGeneratorTSP(distanceMatrix, type = "Cycle")
distanceMatrix |
Matrix that collects the distances between travelled locations. |
type |
Can be "Cycle" (return to start, default) or "Path" (no return to start). |
the function of type cost=f(permutation)
benchmarkGeneratorQAP, benchmarkGeneratorFSP, benchmarkGeneratorWT
set.seed(1)
#create 5 random locations to be part of a tour
n=5
cities <- matrix(runif(2*n),,2)
#calculate distances between cities
cdist <- as.matrix(dist(cities))
#create objective functions (for path or cycle)
fun1 <- benchmarkGeneratorTSP(cdist, "Path")
fun2 <- benchmarkGeneratorTSP(cdist, "Cycle")
#evaluate
fun1(1:n)
fun1(n:1)
fun2(n:1)
fun2(1:n)
Creates a benchmark function for the single-machine total Weighted Tardiness Problem.
benchmarkGeneratorWT(p, w, d)
p |
processing times |
w |
weights |
d |
due dates |
the function of type cost=f(permutation)
benchmarkGeneratorQAP, benchmarkGeneratorTSP, benchmarkGeneratorFSP
n=6
#processing times
p <- sample(100,n,replace=TRUE)
#weights
w <- sample(10,n,replace=TRUE)
#due dates
RDD <- c(0.2, 0.4, 0.6, 0.8, 1.0)
TF <- c(0.2, 0.4, 0.6, 0.8, 1.0)
i <- 1
j <- 1
P <- sum(p)
d <- runif(n,min=P*(1-TF[i]-RDD[j]/2),max=P*(1-TF[i]+RDD[j]/2))
#create WT objective function
fun <- benchmarkGeneratorWT(p,w,d)
fun(1:n)
fun(n:1)
Correcting, e.g., a distance matrix with chosen methods so that it becomes a CNSD matrix.
correctionCNSD(mat, method = "flip", tol = 1e-08)
mat |
symmetric matrix, which should be at least of size 3x3 |
method |
string that specifies method for correction: spectrum clip |
tol |
tolerance value. Eigenvalues between -tol and tol are treated as zero. |
the corrected CNSD matrix
Martin Zaefferer and Thomas Bartz-Beielstein. (2016). Efficient Global Optimization with Indefinite Kernels. Parallel Problem Solving from Nature-PPSN XIV. Accepted, in press. Springer.
x <- list(c(2,1,4,3),c(2,4,3,1),c(4,2,1,3),c(4,3,2,1),c(1,4,3,2))
D <- distanceMatrix(x,distancePermutationInsert)
is.CNSD(D) #matrix should not be CNSD
D <- correctionCNSD(D)
is.CNSD(D) #matrix should now be CNSD
D
# note: to fix the negative distances, use repairConditionsDistanceMatrix.
# Or else, use correctionDistanceMatrix.
Correcting a (possibly indefinite) symmetric matrix with chosen approach so that it will have desired definiteness type: positive or negative semi-definite (PSD, NSD).
correctionDefinite(mat, type = "PSD", method = "flip", tol = 1e-08)
mat |
symmetric matrix |
type |
string that specifies the type of correction: "PSD" or "NSD" |
method |
string that specifies method for correction: spectrum clip |
tol |
tolerance value. Eigenvalues between -tol and tol are treated as zero. |
list with
mat
corrected matrix
isIndefinite
boolean, whether original matrix was indefinite
lambda
the eigenvalues of the original matrix
lambdanew
the eigenvalues of the corrected matrix
U
the matrix of eigenvectors
a
the transformation vector
Martin Zaefferer and Thomas Bartz-Beielstein. (2016). Efficient Global Optimization with Indefinite Kernels. Parallel Problem Solving from Nature-PPSN XIV. Accepted, in press. Springer.
x <- list(c(2,1,4,3),c(2,4,3,1),c(4,2,1,3),c(4,3,2,1),c(1,4,3,2))
D <- distanceMatrix(x,distancePermutationInsert)
is.NSD(D) #matrix should not be NSD
D <- correctionDefinite(D,type="NSD")$mat
is.NSD(D) #matrix should now be NSD
# different example: PSD kernel
D <- distanceMatrix(x,distancePermutationInsert)
K <- exp(-0.01*D)
is.PSD(K)
K <- correctionDefinite(K,type="PSD")$mat
is.PSD(K)
Convert a (possibly non-Euclidean or non-metric) distance matrix with a chosen approach so that it becomes a CNSD matrix.
Optionally, the resulting matrix is enforced to have positive elements and a zero diagonal, via the repair parameter.
Essentially, this is a combination of correctionDefinite or correctionCNSD with repairConditionsDistanceMatrix.
correctionDistanceMatrix(mat, type = "NSD", method = "flip", repair = TRUE, tol = 1e-08)
mat |
symmetric distance matrix |
type |
string that specifies the type of correction (handled via correctionDefinite or correctionCNSD) |
method |
string that specifies method for correction: spectrum clip |
repair |
boolean, whether or not to use condition repair, so that elements are positive, and diagonal is zero. |
tol |
tolerance value. Eigenvalues between -tol and tol are treated as zero. |
list with corrected distance matrix mat, isCNSD (boolean, whether the original matrix was CNSD), and transformation matrix A.
Martin Zaefferer and Thomas Bartz-Beielstein. (2016). Efficient Global Optimization with Indefinite Kernels. Parallel Problem Solving from Nature-PPSN XIV. Accepted, in press. Springer.
correctionDefinite, correctionCNSD, repairConditionsDistanceMatrix
x <- list(c(2,1,4,3),c(2,4,3,1),c(4,2,1,3),c(4,3,2,1),c(1,4,3,2))
D <- distanceMatrix(x,distancePermutationInsert)
is.CNSD(D) #matrix should not be CNSD
D <- correctionDistanceMatrix(D)$mat
is.CNSD(D) #matrix should now be CNSD
D
Convert a non-PSD kernel matrix with a chosen approach so that it becomes a PSD matrix.
Optionally, the resulting matrix is enforced to have values between -1 and 1 and a diagonal of 1s, via the repair parameter.
That means it is (optionally) converted to a valid correlation matrix.
Essentially, this is a combination of correctionDefinite with repairConditionsCorrelationMatrix.
correctionKernelMatrix(mat, method = "flip", repair = TRUE, tol = 1e-08)
mat |
symmetric kernel matrix |
method |
string that specifies method for correction: spectrum clip |
repair |
boolean, whether or not to use condition repair, so that elements between -1 and 1, and the diagonal values are 1. |
tol |
tolerance value. Eigenvalues between -tol and tol are treated as zero. |
list with corrected kernel matrix mat, isPSD (boolean, whether the original matrix was PSD), transformation matrix A, the matrix of eigenvectors (U), and the transformation vector (a)
Martin Zaefferer and Thomas Bartz-Beielstein. (2016). Efficient Global Optimization with Indefinite Kernels. Parallel Problem Solving from Nature-PPSN XIV. Accepted, in press. Springer.
correctionDefinite, repairConditionsCorrelationMatrix
x <- list(c(2,1,4,3),c(2,4,3,1),c(4,2,1,3),c(4,3,2,1),c(1,4,3,2))
D <- distanceMatrix(x,distancePermutationInsert)
K <- exp(-0.01*D)
is.PSD(K) #matrix should not be PSD
K <- correctionKernelMatrix(K)$mat
is.PSD(K) #matrix should now be PSD
K
Generate test functions for the assessment of optimization algorithms with non-conditional or conditional simulation, based on real-world data.
For a more streamlined interface, see testFunctionGeneratorSim.
createSimulatedTestFunction(xsim, fit, nsim = 10, conditionalSimulation = TRUE, seed = NA)
xsim |
list of samples in input space, for simulation |
fit |
an object generated by modelKriging |
nsim |
the number of simulations, or test functions, to be created |
conditionalSimulation |
whether (TRUE) or not (FALSE) to use conditional simulation |
seed |
a random number generator seed. Defaults to NA; which means no seed is set. For sake of reproducibility, set this to some integer value. |
a list of functions, where each function is the interpolation of one simulation realization. The length of the list depends on the nsim parameter.
N. A. Cressie. Statistics for Spatial Data. JOHN WILEY & SONS INC, 1993.
C. Lantuejoul. Geostatistical Simulation - Models and Algorithms. Springer-Verlag Berlin Heidelberg, 2002.
Zaefferer, M.; Fischbach, A.; Naujoks, B. & Bartz-Beielstein, T. Simulation Based Test Functions for Optimization Algorithms Proceedings of the Genetic and Evolutionary Computation Conference 2017, ACM, 2017, 8.
modelKriging
, simulate.modelKriging
, testFunctionGeneratorSim
nsim <- 10
seed <- 12345
n <- 6
set.seed(seed)
#target function:
fun <- function(x){ exp(-20* x) + sin(6*x^2) + x }
# "vectorize" target
f <- function(x){sapply(x,fun)}
# distance function
dF <- function(x,y)(sum((x-y)^2)) #sum of squares
#start pdf creation
# plot params
par(mfrow=c(4,1),mar=c(2.3,2.5,0.2,0.2),mgp=c(1.4,0.5,0))
#test samples for plots
xtest <- as.list(seq(from=-0,by=0.005,to=1))
plot(xtest,f(xtest),type="l",xlab="x",ylab="Obj. function")
#evaluation samples (training)
xb <- as.list(runif(n))
yb <- f(xb)
# support samples for simulation
x <- as.list(sort(c(runif(100),unlist(xb))))
# fit the model
fit <- modelKriging(xb,yb,dF,control=list(
  algThetaControl=list(method="NLOPT_GN_DIRECT_L",funEvals=100),useLambda=FALSE))
fit
#predicted obj. function values
ypred <- predict(fit,as.list(xtest))$y
plot(unlist(xtest),ypred,type="l",xlab="x",ylab="Estimation")
points(unlist(xb),yb,pch=19)
##############################
# create test function non conditional
##############################
fun <- createSimulatedTestFunction(x,fit,nsim,FALSE,seed=1)
ynew <- NULL
for(i in 1:nsim) ynew <- cbind(ynew,fun[[i]](xtest))
rangeY <- range(ynew)
plot(unlist(xtest),ynew[,1],type="l",ylim=rangeY,xlab="x",ylab="Simulation")
for(i in 2:nsim){
  lines(unlist(xtest),ynew[,i],col=i,type="l")
}
##############################
# create test function conditional
##############################
fun <- createSimulatedTestFunction(x,fit,nsim,TRUE,seed=1)
ynew <- NULL
for(i in 1:nsim) ynew <- cbind(ynew,fun[[i]](xtest))
rangeY <- range(ynew)
plot(unlist(xtest),ynew[,1],type="l",ylim=rangeY,xlab="x",ylab="Conditional sim.")
for(i in 2:nsim){
  lines(unlist(xtest),ynew[,i],col=i,type="l")
}
points(unlist(xb),yb,pch=19)
dev.off()
Calculate the distance between all samples in a list, and return as matrix.
distanceMatrix(X, distFun, ...)
X |
list of samples, where each list element is a suitable input for |
distFun |
Distance function of type f(x,y)=r, where r is a scalar and x and y are elements whose distance is evaluated. |
... |
further arguments passed to distFun |
The distance matrix
x <- list(5:1,c(2,4,5,1,3),c(5,4,3,1,2), sample(5))
distanceMatrix(x,distancePermutationHamming)
The number of unequal elements of two vectors (which may be of unequal length), divided by the number of elements (of the larger vector).
distanceNumericHamming(x, y)
x |
first vector |
y |
second vector |
numeric distance value
, scaled to values between 0 and 1
#e.g., used for distance between bit strings
x <- c(0,1,0,1,0)
y <- c(1,1,0,0,1)
distanceNumericHamming(x,y)
p <- replicate(10,sample(c(0,1),5,replace=TRUE),simplify=FALSE)
distanceMatrix(p,distanceNumericHamming)
Longest common substring distance for two numeric vectors, e.g., bit vectors.
distanceNumericLCStr(x, y)
x |
first vector (numeric) |
y |
second vector (numeric) |
numeric distance value
, scaled to values between 0 and 1
#e.g., used for distance between bit strings
x <- c(0,1,0,1,0)
y <- c(1,1,0,0,1)
distanceNumericLCStr(x,y)
p <- replicate(10,sample(c(0,1),5,replace=TRUE),simplify=FALSE)
distanceMatrix(p,distanceNumericLCStr)
Levenshtein distance for two numeric vectors, e.g., bit vectors.
distanceNumericLevenshtein(x, y)
x |
first vector (numeric) |
y |
second vector (numeric) |
numeric distance value
, scaled to values between 0 and 1
#e.g., used for distance between bit strings
x <- c(0,1,0,1,0)
y <- c(1,1,0,0,1)
distanceNumericLevenshtein(x,y)
p <- replicate(10,sample(c(0,1),5,replace=TRUE),simplify=FALSE)
distanceMatrix(p,distanceNumericLevenshtein)
Bi-directional adjacency distance for permutations, depending on how often two elements are neighbours in both permutations x and y.
distancePermutationAdjacency(x, y)
x |
first permutation (integer vector) |
y |
second permutation (integer vector) |
numeric distance value
, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)
Sevaux, Marc, and Kenneth Soerensen. "Permutation distance measures for memetic algorithms with population management." Proceedings of 6th Metaheuristics International Conference (MIC'05). 2005.
Reeves, Colin R. "Landscapes, operators and heuristic search." Annals of Operations Research 86 (1999): 473-490.
x <- 1:5
y <- 5:1
distancePermutationAdjacency(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationAdjacency)
Chebyshev distance for permutations. Specific to permutations is only the scaling to values of 0 to 1: d(x, y) = max_i |x_i - y_i| / (n - 1), where n is the length of the permutations x and y.
distancePermutationChebyshev(x, y)
x |
first permutation (integer vector) |
y |
second permutation (integer vector) |
numeric distance value
, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)
x <- 1:5
y <- c(5,1,2,3,4)
distancePermutationChebyshev(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationChebyshev)
The Cosine distance for permutations is derived from the Cosine similarity measure which has been applied in fields like text mining. It is based on the scalar product of two vectors (here: permutations).
distancePermutationCos(x, y)
x |
first permutation (integer vector) |
y |
second permutation (integer vector) |
numeric distance value
, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)
Singhal, Amit (2001)."Modern Information Retrieval: A Brief Overview". Bulletin of the IEEE Computer Society Technical Committee on Data Engineering 24 (4): 35-43
x <- 1:5
y <- c(5,1,2,3,4)
distancePermutationCos(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationCos)
Euclidean distance for permutations, scaled to values between 0 and 1: d(x, y) = (1/r) * sqrt(sum_i (x_i - y_i)^2), where n is the length of the permutations x and y, and the scaling factor r is the largest possible Euclidean distance between two permutations of length n (attained when one permutation is the reverse of the other), i.e., r = sqrt(n*(n^2 - 1)/3).
distancePermutationEuclidean(x, y)
x |
first permutation (integer vector) |
y |
second permutation (integer vector) |
numeric distance value
, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)
x <- 1:5
y <- c(5,1,2,3,4)
distancePermutationEuclidean(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationEuclidean)
Hamming distance for permutations, scaled to values between 0 and 1. That is, the number of unequal elements of two permutations, divided by the permutations length.
distancePermutationHamming(x, y)
x |
first permutation (integer vector) |
y |
second permutation (integer vector) |
numeric distance value
, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)
x <- 1:5
y <- c(5,1,2,3,4)
distancePermutationHamming(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationHamming)
The Insert Distance is an edit distance. It counts the minimum number of delete/insert operations required to transform one permutation into another. A delete/insert operation shifts one element to a new position. All other elements move accordingly to make place for the element. E.g., the following shows a single delete/insert move that sorts the corresponding permutation: 1 4 2 3 5 -> 1 2 3 4 5.
distancePermutationInsert(x, y)
x |
first permutation (integer vector) |
y |
second permutation (integer vector) |
numeric distance value
, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)
Schiavinotto, Tommaso, and Thomas Stuetzle. "A review of metrics on permutations for search landscape analysis." Computers & operations research 34.10 (2007): 3143-3153.
Wikipedia contributors, "Longest increasing subsequence", Wikipedia, The Free Encyclopedia, 12 November 2014, 19:38 UTC, [accessed 13 November 2014]
x <- 1:5
y <- c(5,1,2,3,4)
distancePermutationInsert(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationInsert)
The interchange distance is an edit distance, counting how many edit operations (here: interchanges, i.e., transpositions of two arbitrary elements) have to be performed to transform permutation x into permutation y.
distancePermutationInterchange(x, y)
x |
first permutation (integer vector) |
y |
second permutation (integer vector) |
numeric distance value
, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)
Schiavinotto, Tommaso, and Thomas Stuetzle. "A review of metrics on permutations for search landscape analysis." Computers & operations research 34.10 (2007): 3143-3153.
x <- 1:5
y <- c(1,4,3,2,5)
distancePermutationInterchange(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationInterchange)
Distance of permutations. Based on the longest string of adjacent elements that two permutations have in common.
distancePermutationLCStr(x, y)
x |
first permutation (integer vector) |
y |
second permutation (integer vector) |
numeric distance value
, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)
Hirschberg, Daniel S. "A linear space algorithm for computing maximal common subsequences." Communications of the ACM 18.6 (1975): 341-343.
x <- 1:5
y <- c(5,1,2,3,4)
distancePermutationLCStr(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationLCStr)
Usually a string distance, with a slightly different definition here. Adapted to permutations as: d(x, y) = sum_i min(|x_i - y_i|, n - |x_i - y_i|), where n is the length of the permutations x and y.
distancePermutationLee(x, y)
x |
first permutation (integer vector) |
y |
second permutation (integer vector) |
numeric distance value
, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)
Lee, C., "Some properties of nonbinary error-correcting codes," Information Theory, IRE Transactions on, vol.4, no.2, pp.77,82, June 1958
x <- 1:5
y <- c(5,1,2,3,4)
distancePermutationLee(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationLee)
Levenshtein Distance, often just called "Edit Distance". The number of insertions, substitutions or deletions to turn one permutation (or string of equal length) into another.
distancePermutationLevenshtein(x, y)
x |
first permutation (integer vector) |
y |
second permutation (integer vector) |
numeric distance value
, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)
Levenshtein, Vladimir I. "Binary codes capable of correcting deletions, insertions and reversals." Soviet physics doklady. Vol. 10. 1966.
x <- 1:5
y <- c(1,2,5,4,3)
distancePermutationLevenshtein(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationLevenshtein)
This function calculates the lexicographic permutation distance, that is, the difference of the positions that the two permutations would receive in a lexicographic ordering. Note that this distance measure can quickly become inaccurate if the length of the permutations grows too large, since it is based on the factorial of the length. In general, permutations longer than 100 elements should be avoided.
distancePermutationLex(x, y)
x |
first permutation (integer vector) |
y |
second permutation (integer vector) |
numeric distance value
, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)
lexicographicPermutationOrderNumber
x <- 1:5
y <- c(1,2,3,5,4)
distancePermutationLex(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationLex)
Manhattan distance for permutations, scaled to values between 0 and 1: d(x, y) = (1/r) * sum_i |x_i - y_i|, where n is the length of the permutations x and y, and the scaling factor r is the largest possible Manhattan distance between two permutations of length n, i.e., r = (n^2 - 1)/2 if n is odd, or r = n^2/2 if n is even.
distancePermutationManhattan(x, y)
x |
first permutation (integer vector) |
y |
second permutation (integer vector) |
numeric distance value
, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)
x <- 1:5
y <- c(5,1,2,3,4)
distancePermutationManhattan(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationManhattan)
Position distance (or Spearman's correlation coefficient), scaled to values between 0 and 1.
distancePermutationPosition(x, y)
x |
first permutation (integer vector) |
y |
second permutation (integer vector) |
numeric distance value
, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)
Schiavinotto, Tommaso, and Thomas Stuetzle. "A review of metrics on permutations for search landscape analysis." Computers & operations research 34.10 (2007): 3143-3153.
Reeves, Colin R. "Landscapes, operators and heuristic search." Annals of Operations Research 86 (1999): 473-490.
x <- 1:5
y <- c(1,3,5,4,2)
distancePermutationPosition(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationPosition)
Squared position distance (or Spearman's footrule), scaled to values between 0 and 1.
distancePermutationPosition2(x, y)
x |
first permutation (integer vector) |
y |
second permutation (integer vector) |
numeric distance value
, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)
Schiavinotto, Tommaso, and Thomas Stuetzle. "A review of metrics on permutations for search landscape analysis." Computers & operations research 34.10 (2007): 3143-3153.
Reeves, Colin R. "Landscapes, operators and heuristic search." Annals of Operations Research 86 (1999): 473-490.
x <- 1:5
y <- c(1,3,5,4,2)
distancePermutationPosition2(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationPosition2)
R distance or unidirectional adjacency distance. Based on counting how often a two-element sequence in x also occurs in y, in the same order.
distancePermutationR(x, y)
x |
first permutation (integer vector) |
y |
second permutation (integer vector) |
numeric distance value
, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)
Sevaux, Marc, and Kenneth Soerensen. "Permutation distance measures for memetic algorithms with population management." Proceedings of 6th Metaheuristics International Conference (MIC'05). 2005.
Reeves, Colin R. "Landscapes, operators and heuristic search." Annals of Operations Research 86 (1999): 473-490.
x <- 1:5
y <- c(1,2,3,5,4)
distancePermutationR(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationR)
The swap distance is an edit distance, counting how many edit operations (here: swaps, i.e., transpositions of two adjacent elements) have to be performed to transform permutation x into permutation y.
Note: In v2.4.0 of CEGO and earlier, this function actually computed the swap distance on the inverted permutations (i.e., on the rankings rather than the orderings). As of v2.4.2, this is corrected by inverting the permutations x and y before computing the distance (i.e., computing the ordering first). The original behavior can be reproduced with distancePermutationSwapInv.
This issue was kindly reported by Manuel Lopez-Ibanez; the difference in behavior is discussed by Ekhine Irurozki and him (2021).
distancePermutationSwap(x, y)
x |
first permutation (integer vector) |
y |
second permutation (integer vector) |
numeric distance value
, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)
Schiavinotto, Tommaso, and Thomas Stuetzle. "A review of metrics on permutations for search landscape analysis." Computers & operations research 34.10 (2007): 3143-3153.
Irurozki, Ekhine and Lopez-Ibanez, Manuel. Unbalanced Mallows Models for Optimizing Expensive Black-Box Permutation Problems. In Proceedings of the Genetic and Evolutionary Computation Conference, GECCO 2021. ACM Press, New York, NY, 2021. doi: 10.1145/3449639.3459366
x <- 1:5
y <- c(1,2,3,5,4)
distancePermutationSwap(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationSwap)
The swap distance on the inverse of permutations x and y. See distancePermutationSwap for the non-inverted version.
distancePermutationSwapInv(x, y)
x |
first permutation (integer vector) |
y |
second permutation (integer vector) |
numeric distance value
, scaled to values between 0 and 1 (based on the maximum possible distance between two permutations)
x <- 1:5
y <- c(1,2,3,5,4)
distancePermutationSwapInv(x,y)
p <- replicate(10,sample(1:5),simplify=FALSE)
distanceMatrix(p,distancePermutationSwapInv)
The Euclidean distance for real vectors.
distanceRealEuclidean(x, y)
x |
first real vector |
y |
second real vector |
numeric distance value
x <- runif(5)
y <- runif(5)
distanceRealEuclidean(x,y)
Levenshtein distance for two sequences of numbers
distanceSequenceLevenshtein(x, y)
x |
first vector (numeric vector) |
y |
second vector (numeric vector) |
numeric distance value
#e.g., used for distance between integer sequences
x <- c(0,1,10,2,4)
y <- c(10,1,0,4,-4)
distanceSequenceLevenshtein(x,y)
p <- replicate(10,sample(1:5,3,replace=TRUE),simplify=FALSE)
distanceMatrix(p,distanceSequenceLevenshtein)
Number of unequal letters in two strings.
distanceStringHamming(x, y)
x |
first string (class: character) |
y |
second string (class: character) |
numeric distance value
distanceStringHamming("ABCD","AACC")
Distance between strings, based on the longest common substring.
distanceStringLCStr(x, y)
x |
first string (class: character) |
y |
second string (class: character) |
numeric distance value
distanceStringLCStr("ABCD","AACC")
Number of insertions, deletions and substitutions to transform one string into another
distanceStringLevenshtein(x, y)
x |
first string (class: character) |
y |
second string (class: character) |
numeric distance value
distanceStringLevenshtein("ABCD","AACC")
Calculate the distance between a single sample and all samples in a list.
distanceVector(a, X, distFun, ...)
a |
A single sample which is a suitable input for |
X |
list of samples, where each list element is a suitable input for |
distFun |
Distance function of type f(x,y)=r, where r is a scalar and x and y are elements whose distance is evaluated. |
... |
further arguments passed to distFun |
A numerical vector of distances
x <- 1:5
y <- list(5:1,c(2,4,5,1,3),c(5,4,3,1,2))
distanceVector(x,y,distancePermutationHamming)
This function calculates the "Expected Improvement" of candidate solutions, based on predicted means, standard deviations (uncertainty) and the best known objective function value so far.
infillExpectedImprovement(mean, sd, min)
mean |
predicted mean values |
sd |
predicted standard deviation |
min |
minimum of all observations so far |
Returns the negative logarithm of the Expected Improvement.
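A small usage sketch (assuming the function is vectorized over the predicted means and standard deviations, as suggested by its role of ranking candidate solutions; the numbers are made up):
# hypothetical surrogate predictions for three candidates
m <- c(1.2, 0.8, 1.0)    # predicted means
s <- c(0.2, 0.3, 0.05)   # predicted standard deviations (uncertainty)
best <- 0.9              # best (smallest) observation so far
infillExpectedImprovement(m, s, best)  # smaller values indicate larger expected improvement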
This function checks whether a symmetric matrix is Conditionally Negative Semi-Definite (CNSD). Note that this function does not check whether the matrix is actually symmetric.
is.CNSD(X, method = "alg1", tol = 1e-08)
X |
a symmetric matrix |
method |
a string, specifying the method to be used ("alg1", "alg2", or "eucl"; see the examples). |
tol |
tolerance value. Eigenvalues between -tol and tol are treated as zero. |
Symmetric CNSD matrices are, e.g., Euclidean distance matrices, which are required to produce Positive Semi-Definite correlation or kernel matrices. Such matrices are used in models like Kriging or Support Vector Machines.
boolean, which is TRUE if X is CNSD
Ikramov, K. and Savel'eva, N. Conditionally definite matrices, Journal of Mathematical Sciences, Kluwer Academic Publishers-Plenum Publishers, 2000, 98, 1-50
Glunt, W.; Hayden, T. L.; Hong, S. and Wells, J. An alternating projection algorithm for computing the nearest Euclidean distance matrix, SIAM Journal on Matrix Analysis and Applications, SIAM, 1990, 11, 589-600
# The following permutations will produce
# a non-CNSD distance matrix with Insert distance
# and a CNSD distance matrix with Hamming distance
x <- list(c(2,1,4,3),c(2,4,3,1),c(4,2,1,3),c(4,3,2,1),c(1,4,3,2))
D <- distanceMatrix(x,distancePermutationInsert)
is.CNSD(D,"alg1")
is.CNSD(D,"alg2")
is.CNSD(D,"eucl")
D <- distanceMatrix(x,distancePermutationHamming)
is.CNSD(D,"alg1")
is.CNSD(D,"alg2")
is.CNSD(D,"eucl")
This function checks whether a symmetric matrix is Negative Semi-Definite (NSD). That means, it is determined whether all eigenvalues of the matrix are non-positive. Note that this function does not check whether the matrix is actually symmetric.
is.NSD(X, tol = 1e-08)
X |
a symmetric matrix |
tol |
tolerance value. Eigenvalues between -tol and tol are treated as zero. |
Symmetric NSD matrices are, e.g., correlation or kernel matrices. Such matrices are used in models like Kriging or Support Vector regression.
boolean, which is TRUE if X is NSD
# The following permutations will produce
# a non-PSD kernel matrix with Insert distance
# and a PSD distance matrix with Hamming distance
# (for the given theta value of 0.01).
# The respective negative should be (non-) NSD
x <- list(c(2,1,4,3),c(2,4,3,1),c(4,2,1,3),c(4,3,2,1),c(1,4,3,2))
K <- exp(-0.01*distanceMatrix(x,distancePermutationInsert))
is.NSD(-K)
K <- exp(-0.01*distanceMatrix(x,distancePermutationHamming))
is.NSD(-K)
This function checks whether a symmetric matrix is Positive Semi-Definite (PSD). That means, it is determined whether all eigenvalues of the matrix are non-negative. Note that this function does not check whether the matrix is actually symmetric.
is.PSD(X, tol = 1e-08)
X |
a symmetric matrix |
tol |
tolerance value. Eigenvalues between -tol and tol are treated as zero. |
Symmetric PSD matrices are, e.g., correlation or kernel matrices. Such matrices are used in models like Kriging or Support Vector regression.
boolean, which is TRUE if X is PSD
# The following permutations will produce
# a non-PSD kernel matrix with Insert distance
# and a PSD distance matrix with Hamming distance
# (for the given theta value of 0.01)
x <- list(c(2,1,4,3),c(2,4,3,1),c(4,2,1,3),c(4,3,2,1),c(1,4,3,2))
K <- exp(-0.01*distanceMatrix(x,distancePermutationInsert))
is.PSD(K)
K <- exp(-0.01*distanceMatrix(x,distancePermutationHamming))
is.PSD(K)
Calculate the similarities between all samples in a list, and return as matrix.
kernelMatrix(X, kernFun, ...)
X |
list of samples, where each list element is a suitable input for |
kernFun |
Kernel function of type f(x,y)=r, where r is a scalar and x and y are elements whose similarity is evaluated. |
... |
further arguments passed to kernFun |
The similarity / kernel matrix
x <- list(5:1,c(2,4,5,1,3),c(5,4,3,1,2), sample(5))
kernFun <- function(x,y){
  exp(-distancePermutationHamming(x,y))
}
kernelMatrix(x,kernFun)
This function is loosely based on the Gaussian Landscape Generator by Bo Yuan and Marcus Gallagher. It creates a Gaussian landscape every time it is called. This landscape can be evaluated like a function. To adapt to combinatorial spaces, the Gaussians are here based on a user-specified distance measure. Due to the expected nature of combinatorial spaces and their lack of direction, the resulting Gaussians are much simplified in comparison to the continuous, vector-valued case (e.g., no rotation). Since the CEGO package is tailored to minimization, the landscape is inverted.
landscapeGeneratorGaussian(nGaussian = 10, theta = 1, ratio = 0.2, seed = 1, distanceFunction, creationFunction)
nGaussian |
number of Gaussian components in the landscape. Default is 10. |
theta |
controls width of Gaussian components as a multiplier. Default is 1. |
ratio |
minimal function value of the local minima. Default is 0.2. (Note: Global minimum will be at zero, local minima will be in range |
seed |
seed for the random number generator used before creation of the landscape. Generator status will be saved and reset afterwards. |
distanceFunction |
A distance function of type f(x,y)=r, where r is a scalar and x and y are elements whose distance is evaluated. |
creationFunction |
function to randomly generate the centers of the Gaussians, in form of their given representation. |
returns a function. The function requires a list of candidate solutions as its input, where each solution is suitable for use with the distance function.
B. Yuan and M. Gallagher (2003) "On Building a Principled Framework for Evaluating and Testing Evolutionary Algorithms: A Continuous Landscape Generator". In Proceedings of the 2003 Congress on Evolutionary Computation, IEEE, pp. 451-458, Canberra, Australia.
#rng seed
seed=101
# distance function
dF <- function(x,y)(sum((x-y)^2)) #sum of squares
#dF <- function(x,y)sqrt(sum((x-y)^2)) #euclidean distance
# creation function
cF <- function()runif(1)
# plot pars
par(mfrow=c(3,1),mar=c(3.5,3.5,0.2,0.2),mgp=c(2,1,0))
## uni-modal distance landscape
# set seed
set.seed(seed)
#landscape
lF <- landscapeGeneratorUNI(cF(),dF)
x <- as.list(seq(from=0,by=0.001,to=1))
plot(x,lF(x),type="l")
## multi-modal distance landscape
# set seed
set.seed(seed)
#landscape
lF <- landscapeGeneratorMUL(replicate(5,cF(),FALSE),dF)
plot(x,lF(x),type="l")
## glg landscape
#landscape
lF <- landscapeGeneratorGaussian(nGaussian=20,theta=1,
  ratio=0.3,seed=seed,dF,cF)
plot(x,lF(x),type="l")
This function generates multi-modal fitness landscapes based on distance measures. The fitness is the minimal distance to several reference individuals or centers. Hence, each reference individual is an optimum of the landscape.
landscapeGeneratorMUL(ref, distanceFunction)
ref |
list of reference individuals / centers |
distanceFunction |
Distance function, used to evaluate d(x,ref[[n]]), where x is an arbitrary new individual |
returns a function. The function requires a list of candidate solutions as its input, where each solution is suitable for use with the distance function. The function returns a numeric vector.
landscapeGeneratorUNI, landscapeGeneratorGaussian
fun <- landscapeGeneratorMUL(ref=list(1:7,c(2,4,1,5,3,7,6)),distancePermutationCos)
x <- 1:7
fun(list(x))
x <- c(2,4,1,5,3,7,6)
fun(list(x))
x <- 7:1
fun(list(x))
x <- sample(7)
fun(list(x))
## multiple solutions at once:
x <- append(list(1:7,c(2,4,1,5,3,7,6)),replicate(5,sample(7),FALSE))
fun(x)
This function generates uni-modal fitness landscapes based on distance measures. The fitness is the distance to a reference individual or center. Hence, the reference individual is the optimum of the landscape. This function is essentially a wrapper for landscapeGeneratorMUL.
landscapeGeneratorUNI(ref, distanceFunction)
ref |
reference individual |
distanceFunction |
Distance function, used to evaluate d(x,ref), where x is an arbitrary new individual |
returns a function. The function requires a list of candidate solutions as its input, where each solution is suitable for use with the distance function. The function returns a numeric vector.
Moraglio, Alberto, Yong-Hyuk Kim, and Yourim Yoon. "Geometric surrogate-based optimisation for permutation-based problems." Proceedings of the 13th annual conference companion on Genetic and evolutionary computation. ACM, 2011.
landscapeGeneratorMUL, landscapeGeneratorGaussian
fun <- landscapeGeneratorUNI(ref=1:7,distancePermutationCos)
## for single solutions, note that the function still requires list input:
x <- 1:7
fun(list(x))
x <- 7:1
fun(list(x))
x <- sample(7)
fun(list(x))
## multiple solutions at once:
x <- replicate(5,sample(7),FALSE)
fun(x)
This function returns the position-number that a permutation would receive in a lexicographic ordering. It is used in the lexicographic distance measure.
lexicographicPermutationOrderNumber(x)
x |
permutation (integer vector) |
numeric value giving position in lexicographic order.
lexicographicPermutationOrderNumber(1:5)
lexicographicPermutationOrderNumber(c(1,2,3,5,4))
lexicographicPermutationOrderNumber(c(1,2,4,3,5))
lexicographicPermutationOrderNumber(c(1,2,4,5,3))
lexicographicPermutationOrderNumber(c(1,2,5,3,4))
lexicographicPermutationOrderNumber(c(1,2,5,4,3))
lexicographicPermutationOrderNumber(c(1,3,2,4,5))
lexicographicPermutationOrderNumber(5:1)
lexicographicPermutationOrderNumber(1:7)
lexicographicPermutationOrderNumber(7:1)
Implementation of a distance-based Kriging model, e.g., for mixed or combinatorial input spaces. It is based on employing suitable distance measures for the samples in input space.
modelKriging(x, y, distanceFunction, control = list())
x |
list of samples in input space |
y |
column vector of observations for each sample |
distanceFunction |
a suitable distance function of type f(x1,x2), returning a scalar distance value, preferably between 0 and 1. Maximum distances larger than 1 are no problem, but may yield scaling bias when different measures are compared. Should be non-negative and symmetric. It can also be a list of several distance functions. In this case, Maximum Likelihood Estimation (MLE) is used to determine the most suited distance measure. The distance function may have additional parameters. For that case, see distanceParametersLower/Upper in the controls. If distanceFunction is missing, it can also be provided in the control list. |
control |
(list), with the options for the model building procedure. |
The basic Kriging implementation is based on the work of Forrester et al. (2008). For adaptation of Kriging to mixed or combinatorial spaces, as well as choosing distance measures with Maximum Likelihood Estimation, see the other two references (Zaefferer et al., 2014).
an object of class modelKriging, containing the options (see control parameter) and determined parameters for the model:
theta
parameters of the kernel / correlation function determined with MLE.
lambda
regularization constant (nugget) lambda
yMu
vector of observations y, minus MLE of mu
SSQ
Maximum Likelihood Estimate (MLE) of model parameter sigma^2
mu
MLE of model parameter mu
Psi
correlation matrix Psi
Psinv
inverse of Psi
nevals
number of Likelihood evaluations during MLE of theta/lambda/p
distanceFunctionIndexMLE
If a list of several distance measures (distanceFunction) was given, this parameter contains the index value of the measure chosen with MLE.
Forrester, Alexander I.J.; Sobester, Andras; Keane, Andy J. (2008). Engineering Design via Surrogate Modelling - A Practical Guide. John Wiley & Sons.
Zaefferer, Martin; Stork, Joerg; Friese, Martina; Fischbach, Andreas; Naujoks, Boris; Bartz-Beielstein, Thomas. (2014). Efficient global optimization for combinatorial problems. In Proceedings of the 2014 conference on Genetic and evolutionary computation (GECCO '14). ACM, New York, NY, USA, 871-878. DOI=10.1145/2576768.2598282
Zaefferer, Martin; Stork, Joerg; Bartz-Beielstein, Thomas. (2014). Distance Measures for Permutations in Combinatorial Efficient Global Optimization. In Parallel Problem Solving from Nature - PPSN XIII (p. 373-383). Springer International Publishing.
Zaefferer, Martin and Bartz-Beielstein, Thomas (2016). Efficient Global Optimization with Indefinite Kernels. Parallel Problem Solving from Nature-PPSN XIV. Accepted, in press. Springer.
# Set random number generator seed
set.seed(1)
# Simple test landscape
fn <- landscapeGeneratorUNI(1:5,distancePermutationHamming)
# Generate data for training and test
x <- unique(replicate(40,sample(5),FALSE))
xtest <- x[-(1:15)]
x <- x[1:15]
# Determine true objective function values
y <- fn(x)
ytest <- fn(xtest)
# Build model
fit <- modelKriging(x,y,distancePermutationHamming,
  control=list(algThetaControl=list(method="L-BFGS-B"),useLambda=FALSE))
# Predicted obj. function values
ypred <- predict(fit,xtest)$y
# Uncertainty estimate
fit$predAll <- TRUE
spred <- predict(fit,xtest)$s
# Plot
plot(ytest,ypred,xlab="true value",ylab="predicted value",
  pch=20,xlim=c(0.3,1),ylim=c(min(ypred)-0.1,max(ypred)+0.1))
segments(ytest, ypred-spred,ytest, ypred+spred)
epsilon = 0.02
segments(ytest-epsilon,ypred-spred,ytest+epsilon,ypred-spred)
segments(ytest-epsilon,ypred+spred,ytest+epsilon,ypred+spred)
abline(0,1,lty=2)
# Use a different/custom optimizer (here: SANN) for maximum likelihood estimation:
# (Note: Bound constraints are recommended, to avoid Inf values.
# This is really just a demonstration. SANN does not respect bound constraints.)
optimizer1 <- function(x,fun,lower=NULL,upper=NULL,control=NULL,...){
  res <- optim(x,fun,method="SANN",control=list(maxit=100),...)
  list(xbest=res$par,ybest=res$value,count=res$counts)
}
fit <- modelKriging(x,y,distancePermutationHamming,
  control=list(algTheta=optimizer1,useLambda=FALSE))
# One-dimensional optimizer (Brent). Note that Brent will not work when
# several parameters have to be set, e.g., when using nugget effect (lambda).
# However, Brent may be quite efficient otherwise.
optimizer2 <- function(x,fun,lower,upper,control=NULL,...){
  res <- optim(x,fun,method="Brent",lower=lower,upper=upper,...)
  list(xbest=res$par,ybest=res$value,count=res$counts)
}
fit <- modelKriging(x,y,distancePermutationHamming,
  control=list(algTheta=optimizer2,useLambda=FALSE))
This function builds an ensemble of Gaussian Process models, where each individual model is fitted to a partition of the parameter space. Partitions are generated by clustering the training data.
modelKrigingClust(x, y, distanceFunction, control = list())
x |
list of samples in input space |
y |
column vector of observations for each sample |
distanceFunction |
a suitable distance function of type f(x1,x2), returning a scalar distance value, preferably between 0 and 1. Should be non-negative and symmetric. |
control |
(list), with options for the model building procedure |
an object of class modelKrigingClust, which can be passed to the corresponding predict function (predict.modelKrigingClust)
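A minimal sketch of how such a clustered ensemble might be built and used for prediction, assuming the same data conventions as modelKriging and leaving the control options at their defaults:

#sketch: fit a clustered Kriging ensemble on permutation data (assumed defaults)
set.seed(1)
fn <- landscapeGeneratorUNI(1:5,distancePermutationHamming)
x <- unique(replicate(40,sample(5),FALSE))
xtest <- x[-(1:15)]
x <- x[1:15]
y <- fn(x)
fit <- modelKrigingClust(x,y,distancePermutationHamming)
ypred <- predict(fit,xtest)$y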
A simple linear model based on arbitrary distances. Comparable to a k nearest neighbor model, but potentially able to extrapolate into regions of improvement. Used as a simple baseline by Zaefferer et al.(2014).
modelLinear(x, y, distanceFunction, control = list())
x |
list of samples in input space |
y |
matrix, vector of observations for each sample |
distanceFunction |
a suitable distance function of type f(x1,x2), returning a scalar distance value, preferably between 0 and 1. Maximum distances larger than 1 are no problem, but may yield scaling bias when different measures are compared. Should be non-negative and symmetric. |
control |
currently unused, defaults to list() |
a fit (list, modelLinear), with the options and found parameters for the model which has to be passed to the predictor function:
x
samples in input space (see parameters)
y
observations for each sample (see parameters)
distanceFunction
distance function (see parameters)
Zaefferer, Martin; Stork, Joerg; Friese, Martina; Fischbach, Andreas; Naujoks, Boris; Bartz-Beielstein, Thomas. (2014). Efficient global optimization for combinatorial problems. In Proceedings of the 2014 conference on Genetic and evolutionary computation (GECCO '14). ACM, New York, NY, USA, 871-878. DOI=10.1145/2576768.2598282
#set random number generator seed
set.seed(1)
#simple test landscape
fn <- landscapeGeneratorUNI(1:5,distancePermutationHamming)
#generate data for training and test
x <- unique(replicate(40,sample(5),FALSE))
xtest <- x[-(1:15)]
x <- x[1:15]
#determine true objective function values
y <- fn(x)
ytest <- fn(xtest)
#build model
fit <- modelLinear(x,y,distancePermutationHamming)
#predicted obj. function values
ypred <- predict(fit,xtest)$y
#plot
plot(ytest,ypred,xlab="true value",ylab="predicted value",
  pch=20,xlim=c(0.3,1),ylim=c(min(ypred)-0.1,max(ypred)+0.1))
abline(0,1,lty=2)
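The distance argument is not restricted to the predefined permutation distances; any non-negative, symmetric function of two solutions can be plugged in. A minimal sketch (dFun and fitCustom are hypothetical names; x, y and xtest are reused from the example above):

#hypothetical user-defined distance: fraction of mismatching positions
dFun <- function(x1,x2) sum(x1!=x2)/length(x1)
fitCustom <- modelLinear(x,y,dFun)
ypredCustom <- predict(fitCustom,xtest)$y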
Implementation of a distance-based Radial Basis Function Network (RBFN) model, e.g., for mixed or combinatorial input spaces. It is based on employing suitable distance measures for the samples in input space. For reference, see the paper by Moraglio and Kattan (2011).
modelRBFN(x, y, distanceFunction, control = list())
x |
list of samples in input space |
y |
column vector of observations for each sample |
distanceFunction |
a suitable distance function of type f(x1,x2), returning a scalar distance value, preferably between 0 and 1. Maximum distances larger than 1 are no problem, but may yield scaling bias when different measures are compared. Should be non-negative and symmetric. |
control |
(list), with the options for the model building procedure:
|
a fit (list, modelRBFN), with the options and found parameters for the model which has to be passed to the predictor function:
SSQ
Variance of the observations (y)
centers
Centers of the RBFN model, samples in input space (see parameters)
w
Model parameters (weights) w
Phi
Gram matrix
Phinv
(Pseudo)-Inverse of Gram matrix
w0
Mean of observations (y)
dMax
Maximum observed distance
D
Matrix of distances between all samples
beta
See parameters
fbeta
See parameters
distanceFunction
See parameters
Moraglio, Alberto, and Ahmed Kattan. "Geometric generalisation of surrogate model based optimisation to combinatorial spaces." Evolutionary Computation in Combinatorial Optimization. Springer Berlin Heidelberg, 2011. 142-154.
#set random number generator seed
set.seed(1)
#simple test landscape
fn <- landscapeGeneratorUNI(1:5,distancePermutationHamming)
#generate data for training and test
x <- unique(replicate(40,sample(5),FALSE))
xtest <- x[-(1:15)]
x <- x[1:15]
#determine true objective function values
y <- fn(x)
ytest <- fn(xtest)
#build model
fit <- modelRBFN(x,y,distancePermutationHamming)
#predicted obj. function values
ypred <- predict(fit,xtest)$y
#plot
plot(ytest,ypred,xlab="true value",ylab="predicted value",
  pch=20,xlim=c(0.3,1),ylim=c(min(ypred)-0.1,max(ypred)+0.1))
abline(0,1,lty=2)
Given a population of bit-strings, this function mutates all individuals by randomly inverting one or more bits in each individual.
mutationBinaryBitFlip(population, parameters)
population |
List of bit-strings |
parameters |
list of parameters: parameters$mutationRate => mutation rate, specifying the number of bits flipped. Should be in the range between zero and one |
mutated population
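A minimal usage sketch (the population and mutation rate below are arbitrary illustrations, not package defaults):

pop <- list(c(TRUE,FALSE,TRUE,TRUE),c(FALSE,FALSE,TRUE,FALSE))
mutationBinaryBitFlip(pop,list(mutationRate=0.25))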
Given a population of bit-strings, this function mutates all individuals by inverting a whole block, randomly selected.
mutationBinaryBlockInversion(population, parameters)
population |
List of bit-strings |
parameters |
list of parameters: parameters$mutationRate => mutation rate, specifying the number of bits flipped. Should be in the range between zero and one |
mutated population
Given a population of bit-strings, this function mutates all individuals by cyclical shifting the string to the right or left.
mutationBinaryCycle(population, parameters)
population |
List of bit-strings |
parameters |
list of parameters: parameters$mutationRate => mutation rate, specifying the number of bits flipped. Should be in the range between zero and one |
mutated population
Given a population of bit-strings, this function mutates all individuals by randomly inverting exactly one bit in each individual. Due to this fixed mutation rate, it is computationally faster than mutationBinaryBitFlip.
mutationBinarySingleBitFlip(population, parameters)
population |
List of bit-strings |
parameters |
not used |
mutated population
Given a population of permutations, this function mutates all individuals by randomly selecting two indices. The element at index1 is moved to position index2; the other elements are shifted accordingly.
mutationPermutationInsert(population, parameters = list())
population |
List of permutations |
parameters |
list of parameters, currently only uses parameters$mutationRate, which should be between 0 and 1 (but can be larger than 1). The mutation rate determines the number of insertion moves performed, relative to the permutation length (N). 0 means none. 1 means N insertion moves. The default is 1/N. |
mutated population
Given a population of permutations, this function mutates all individuals by randomly interchanging two arbitrary elements of the permutation.
mutationPermutationInterchange(population, parameters = list())
population |
List of permutations |
parameters |
list of parameters, currently only uses parameters$mutationRate, which should be between 0 and 1 (but can be larger than 1). The mutation rate determines the number of interchanges performed, relative to the permutation length (N). 0 means none. 1 means N interchanges. The default is 1/N. |
mutated population
Given a population of permutations, this function mutates all individuals by randomly selecting two indices, and reversing the respective sub-permutation.
mutationPermutationReversal(population, parameters = list())
population |
List of permutations |
parameters |
list of parameters, currently only uses parameters$mutationRate, which should be between 0 and 1 (but can be larger than 1). The mutation rate determines the number of reversals performed, relative to the permutation length (N). 0 means none. 1 means N reversals. The default is 1/N. |
mutated population
Given a population of permutations, this function mutates all individuals by randomly interchanging two adjacent elements of the permutation.
mutationPermutationSwap(population, parameters = list())
population |
List of permutations |
parameters |
list of parameters, currently only uses parameters$mutationRate, which should be between 0 and 1 (but can be larger than 1). The mutation rate determines the number of swaps performed, relative to the permutation length (N). 0 means none. 1 means N swaps. The default is 1/N. |
mutated population
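A minimal usage sketch (arbitrary example population; the rate 1/N corresponds to the documented default):

N <- 5
pop <- list(sample(N),sample(N))
mutationPermutationSwap(pop,list(mutationRate=1/N))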
This mutation function selects an operator and mutationRate (provided in parameters$mutationFunctions) based on self-adaptive parameters chosen for each individual separately.
mutationSelfAdapt(population, parameters)
population |
List of permutations |
parameters |
list, contains the available single mutation functions ( |
optimEA
, recombinationSelfAdapt
seed=0
N=5
#distance
dF <- distancePermutationHamming
#mutation
mFs <- c(mutationPermutationSwap,mutationPermutationInterchange,
  mutationPermutationInsert,mutationPermutationReversal)
rFs <- c(recombinationPermutationCycleCrossover,recombinationPermutationOrderCrossover1,
  recombinationPermutationPositionBased,recombinationPermutationAlternatingPosition)
mF <- mutationSelfAdapt
selfAdaptiveParameters <- list(
  mutationRate = list(type="numeric", lower=1/N,upper=1, default=1/N),
  #1: swap, 2: interchange, 3: insert, 4: reversal mutation
  mutationOperator = list(type="discrete", values=1:4, default=expression(sample(4,1))),
  #1: CycleX, 2: OrderX, 3: PositionX, 4: AlternatingPosition
  recombinationOperator = list(type="discrete", values=1:4, default=expression(sample(4,1)))
)
#recombination
rF <- recombinationSelfAdapt
#creation
cF <- function()sample(N)
#objective function
lF <- landscapeGeneratorUNI(1:N,dF)
#start optimization
set.seed(seed)
res <- optimEA(,lF,list(parameters=list(mutationFunctions=mFs,recombinationFunctions=rFs),
  creationFunction=cF,mutationFunction=mF,recombinationFunction=rF,
  popsize=15,budget=100,targetY=0,verbosity=1,selfAdaption=selfAdaptiveParameters,
  vectorized=TRUE)) ##target function is "vectorized", expects list as input
res$xbest
Given a population of strings, this function mutates all individuals by randomly changing an element of the string.
mutationStringRandomChange(population, parameters = list())
population |
List of strings |
parameters |
list of parameters, with |
mutated population
This function
implements the alternating projection algorithm by Glunt et al. (1990) to calculate the nearest conditionally
negative semi-definite (CNSD) matrix (or: the nearest Euclidean distance matrix).
The function is comparable to the nearPD
function from the Matrix
package,
which implements a related algorithm for finding the nearest Positive Semi-Definite (PSD) matrix.
nearCNSD( x, eig.tol = 1e-08, conv.tol = 1e-08, maxit = 1000, conv.norm.type = "F" )
x |
symmetric matrix, to be turned into a CNSD matrix. |
eig.tol |
eigenvalue tolerance value. Eigenvalues between -eig.tol and eig.tol are assumed to be zero. |
conv.tol |
convergence tolerance value. The algorithm stops if the norm of the difference between two iterations is below this value. |
maxit |
maximum number of iterations. The algorithm stops if this value is exceeded, even if not converged. |
conv.norm.type |
type of norm, by default the F-norm (Frobenius). See |
list with:
mat
nearest CNSD matrix
normF
F-norm between original and resulting matrices
iterations
the number of performed iterations
rel.tol
the relative value used for the tolerance convergence criterion
converged
a boolean that records whether the algorithm converged
Glunt, W.; Hayden, T. L.; Hong, S. and Wells, J. An alternating projection algorithm for computing the nearest Euclidean distance matrix, SIAM Journal on Matrix Analysis and Applications, SIAM, 1990, 11, 589-600
nearPD
, correctionCNSD
, correctionDistanceMatrix
# example using Insert distance with permutations:
x <- list(c(2,1,4,3),c(2,4,3,1),c(4,2,1,3),c(4,3,2,1),c(1,4,3,2))
D <- distanceMatrix(x,distancePermutationInsert)
print(D)
is.CNSD(D)
nearD <- nearCNSD(D)
print(nearD)
is.CNSD(nearD$mat)
# or example matrix from Glunt et al. (1990):
D <- matrix(c(0,1,1,1,0,9,1,9,0),3,3)
print(D)
is.CNSD(D)
nearD <- nearCNSD(D)
print(nearD)
is.CNSD(nearD$mat)
# note, that the resulting values given by Glunt et al. (1990) are 19/9 and 76/9
Implementation of a Two-Opt local search.
optim2Opt(x = NULL, fun, control = list())
x |
start solution of the local search |
fun |
function that determines cost or length of a route/permutation |
control |
(list), with the options:
|
a list with:
xbest
best solution found
ybest
fitness of the best solution
count
number of performed target function evaluations
Wikipedia contributors. "2-opt." Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, 13 Jun. 2014. Web. 21 Oct. 2014.
optimCEGO
, optimEA
, optimRS
, optimMaxMinDist
seed=0
#distance
dF <- distancePermutationHamming
#creation
cF <- function()sample(5)
#objective function
lF <- landscapeGeneratorUNI(1:5,dF)
#start optimization
set.seed(seed)
res <- optim2Opt(,lF,list(creationFunction=cF,budget=100,
  vectorized=TRUE)) ##target function is "vectorized", expects list of solutions as input
res
Model-based optimization for combinatorial or mixed problems. Based on measures of distance or dissimilarity.
optimCEGO(x = NULL, fun, control = list())
x |
Optional initial design as a list. If NULL (default), |
fun |
target function to be minimized |
control |
(list), with the options of optimization and model building approaches employed:
|
a list:
xbest
best solution found
ybest
fitness of the best solution
x
history of all evaluated solutions
y
corresponding target function values f(x)
fit
model-fit created in the last iteration
fpred
prediction function created in the last iteration
count
number of performed target function evaluations
message
message string, giving information on termination reason
convergence
error/status code: -1
for termination due
to failed model building, 0
for termination due to depleted budget,
1
if attained objective value is equal to or below target (control$targetY
)
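For instance, the termination status of a finished run can be inspected as follows (a sketch; res denotes a result list as returned by optimCEGO):

if(res$convergence == 1)
  cat("target objective value reached:",res$ybest,"\n")
res$message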
Zaefferer, Martin; Stork, Joerg; Friese, Martina; Fischbach, Andreas; Naujoks, Boris; Bartz-Beielstein, Thomas. (2014). Efficient global optimization for combinatorial problems. In Proceedings of the 2014 conference on Genetic and evolutionary computation (GECCO '14). ACM, New York, NY, USA, 871-878. DOI=10.1145/2576768.2598282
Zaefferer, Martin; Stork, Joerg; Bartz-Beielstein, Thomas. (2014). Distance Measures for Permutations in Combinatorial Efficient Global Optimization. In Parallel Problem Solving from Nature - PPSN XIII (p. 373-383). Springer International Publishing.
modelKriging
, modelLinear
, modelRBFN
, buildModel
, optimEA
seed <- 0
#distance
dF <- distancePermutationHamming
#mutation
mF <- mutationPermutationSwap
#recombination
rF <- recombinationPermutationCycleCrossover
#creation
cF <- function()sample(5)
#objective function
lF <- landscapeGeneratorUNI(1:5,dF)
#start optimization
set.seed(seed)
res1 <- optimCEGO(,lF,list(
  creationFunction=cF,
  distanceFunction=dF,
  optimizerSettings=list(budget=100,popsize=10,
    mutationFunction=mF,recombinationFunction=rF),
  evalInit=5,budget=15,targetY=0,verbosity=1,model=modelKriging,
  vectorized=TRUE)) ##target function is "vectorized", expects list as input
set.seed(seed)
res2 <- optimCEGO(,lF,list(
  creationFunction=cF,
  distanceFunction=dF,
  optimizerSettings=list(budget=100,popsize=10,
    mutationFunction=mF,recombinationFunction=rF),
  evalInit=5,budget=15,targetY=0,verbosity=1,model=modelRBFN,
  vectorized=TRUE)) ##target function is "vectorized", expects list as input
res1$xbest
res2$xbest
A basic implementation of a simple Evolutionary Algorithm for Combinatorial Optimization. Default evolutionary operators aim at permutation optimization problems.
optimEA(x = NULL, fun, control = list())
x |
Optional start individual(s) as a list. If NULL (default), |
fun |
target function to be minimized |
control |
(list), with the options:
|
a list:
xbest
best solution found.
ybest
fitness of the best solution.
x
history of all evaluated solutions.
y
corresponding target function values f(x).
count
number of performed target function evaluations.
message
Termination message: Which stopping criterion was reached.
population
Last population.
fitness
Fitness of last population.
optimCEGO
, optimRS
, optim2Opt
, optimMaxMinDist
#First example: permutation optimization
seed=0
#distance
dF <- distancePermutationHamming
#mutation
mF <- mutationPermutationSwap
#recombination
rF <- recombinationPermutationCycleCrossover
#creation
cF <- function()sample(5)
#objective function
lF <- landscapeGeneratorUNI(1:5,dF)
#start optimization
set.seed(seed)
res <- optimEA(,lF,list(creationFunction=cF,mutationFunction=mF,recombinationFunction=rF,
  popsize=6,budget=60,targetY=0,verbosity=1,
  vectorized=TRUE)) ##target function is "vectorized", expects list as input
res$xbest
#Second example: binary string optimization
#number of bits
N <- 50
#target function (simple example)
f <- function(x){
  sum(x)
}
#function to create random Individuals
cf <- function(){
  sample(c(FALSE,TRUE),N,replace=TRUE)
}
#control list
cntrl <- list(
  budget = 100,
  popsize = 5,
  creationFunction = cf,
  vectorized = FALSE, #set to TRUE if f evaluates a list of individuals
  recombinationFunction = recombinationBinary2Point,
  recombinationRate = 0.1,
  mutationFunction = mutationBinaryBitFlip,
  parameters=list(mutationRate = 1/N),
  archive=FALSE #recommended for larger budgets. do not change.
)
#start algorithm
set.seed(1)
res <- optimEA(fun=f,control=cntrl)
res$xbest
res$ybest
This function is an interface fashioned like the optim
function.
Unlike optim, it collects a set of bound-constrained optimization algorithms
with local as well as global approaches. It is, e.g., used in the CEGO package
to solve the optimization problem that occurs during parameter estimation
in the Kriging model (based on Maximum Likelihood Estimation).
Note that this function is NOT applicable to combinatorial optimization problems.
optimInterface(x, fun, lower = -Inf, upper = Inf, control = list(), ...)
x |
is a point (vector) in the decision space of |
fun |
is the target function of type |
lower |
is a vector that defines the lower boundary of search space |
upper |
is a vector that defines the upper boundary of search space |
control |
is a list of additional settings. See details. |
... |
additional parameters to be passed on to |
The control list contains:
funEvals
stopping criterion, number of evaluations allowed for fun
(defaults to 100)
reltol
stopping criterion, relative tolerance (default: 1e-6)
factr
stopping criterion, specifying relative tolerance parameter factr for the L-BFGS-B method in the optim function (default: 1e10)
popsize
population size or number of particles (default: 10*dimension
, where dimension
is derived from the length of the vector lower
).
restarts
whether to perform restarts (Default: TRUE). Restarts will only be performed if some of the evaluation budget is left once the algorithm stopped due to some stopping criterion (e.g., reltol).
method
will be used to choose the optimization method from the following list:
"L-BFGS-B" - BFGS quasi-Newton: stats
Package optim
function
"nlminb" - box-constrained optimization using PORT routines: stats
Package nlminb
function
"DEoptim" - Differential Evolution implementation: DEoptim
Package
In addition to the above methods, several methods from the package nloptr
can be chosen.
The complete list of suitable nlopt methods (non-gradient, bound constraints) is:
"NLOPT_GN_DIRECT","NLOPT_GN_DIRECT_L","NLOPT_GN_DIRECT_L_RAND",
"NLOPT_GN_DIRECT_NOSCAL","NLOPT_GN_DIRECT_L_NOSCAL","NLOPT_GN_DIRECT_L_RAND_NOSCAL",
"NLOPT_GN_ORIG_DIRECT","NLOPT_GN_ORIG_DIRECT_L","NLOPT_LN_PRAXIS",
"NLOPT_GN_CRS2_LM","NLOPT_LN_COBYLA",
"NLOPT_LN_NELDERMEAD","NLOPT_LN_SBPLX","NLOPT_LN_BOBYQA","NLOPT_GN_ISRES"
All of the above methods use bound constraints.
For references and details on the specific methods, please check the documentation of the packages that provide them.
This function returns a list with:
xbest
parameters of the found solution
ybest
target function value of the found solution
count
number of evaluations of fun
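As an illustration, a continuous two-dimensional quadratic function could be minimized as follows (a sketch; the target function, start point and bounds are arbitrary):

fn <- function(x) sum(x^2)
res <- optimInterface(x=c(1,1),fun=fn,lower=c(-5,-5),upper=c(5,5),
  control=list(funEvals=200,method="L-BFGS-B"))
res$xbest
res$ybest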
One-shot optimizer: Create a design with maximum sum of distances, and evaluate. Best candidate is returned.
optimMaxMinDist(x = NULL, fun, control = list())
x |
Optional set of solution(s) as a list, which are added to the randomly generated solutions and are also evaluated with the target function. |
fun |
target function to be minimized |
control |
(list), with the options:
|
a list:
xbest
best solution found
ybest
fitness of the best solution
x
history of all evaluated solutions
y
corresponding target function values f(x)
count
number of performed target function evaluations
optimCEGO
, optimEA
, optimRS
, optim2Opt
seed=0
#distance
dF <- distancePermutationHamming
#creation
cF <- function()sample(5)
#objective function
lF <- landscapeGeneratorUNI(1:5,dF)
#start optimization
set.seed(seed)
res <- optimMaxMinDist(,lF,list(creationFunction=cF,budget=20,
  vectorized=TRUE)) ##target function is "vectorized", expects list as input
res$xbest
An optimization algorithm from the family of Evolution Strategies, designed to optimize mixed-integer problems: The search space is composed of continuous (real-valued) parameters, ordinal integers and categorical parameters. Please note that the categorical parameters need to be coded as integers (the type should not be a factor or character). It is an implementation (with a slight modification) of MIES as described by Li et al. (2013). Note that this algorithm always has a step size for each solution parameter; unlike Li et al., we did not include the option to change to a single step size for all parameters. Dominant recombination is used for solution parameters (the search space parameters), intermediate recombination for strategy parameters (i.e., step sizes). Mutation: self-adaptive, step sizes sigma are optimized alongside the solution parameters. Real-valued parameters are subject to variation based on independent, normally distributed random variables. Ordinal integers are subject to variation based on the difference of geometric distributions. Categorical parameters are changed at random, with a self-adapted probability. Note that a simpler bound constraint method is used: instead of the transformation Ta,b(x) described by Li et al., optimMIES simply replaces any value that exceeds the bounds by the respective boundary value.
optimMIES(x = NULL, fun, control = list())
x |
Optional start individual(s) as a list. If NULL (default), |
fun |
target function to be minimized. |
control |
(list), with the options:
|
The control variables types, lower, upper and levels are especially important.
a list:
xbest
best solution found.
ybest
fitness of the best solution.
x
history of all evaluated solutions.
y
corresponding target function values f(x).
count
number of performed target function evaluations.
message
Termination message: Which stopping criterion was reached.
population
Last population.
fitness
Fitness of last population.
Rui Li, Michael T. M. Emmerich, Jeroen Eggermont, Thomas Baeck, Martin Schuetz, Jouke Dijkstra, and Johan H. C. Reiber. 2013. Mixed integer evolution strategies for parameter optimization. Evol. Comput. 21, 1 (March 2013), 29-64.
optimCEGO
, optimRS
, optimEA
, optim2Opt
, optimMaxMinDist
set.seed(1)
controlList <- list(lower=c(-5,-5,1,1,NA,NA),upper=c(10,5,10,10,NA,NA),
  types=c("numeric","numeric","integer","integer","factor","factor"),
  levels=list(NA,NA,NA,NA,c(1,3,5),1:4),
  vectorized = FALSE)
objFun <- function(x){
  x[[3]] <- round(x[[3]])
  x[[4]] <- round(x[[4]])
  y <- sum(as.numeric(x[1:4])^2)
  if(x[[5]]==1 & x[[6]]==4) y <- exp(y) else y <- y^2
  if(x[[5]]==3) y<-y-1
  if(x[[5]]==5) y<-y-2
  if(x[[6]]==1) y<-y*2
  if(x[[6]]==2) y<-y * 1.54
  if(x[[6]]==3) y<- y +2
  if(x[[6]]==4) y<- y * 0.5
  if(x[[5]]==1) y<- y * 9
  y
}
res <- optimMIES(,objFun,controlList)
res$xbest
res$ybest
Random Search for mixed or combinatorial optimization. Solutions are generated completely at random.
optimRS(x = NULL, fun, control = list())
x |
Optional set of solution(s) as a list, which are added to the randomly generated solutions and are also evaluated with the target function. |
fun |
target function to be minimized |
control |
(list), with the options:
|
a list:
xbest
best solution found
ybest
fitness of the best solution
x
history of all evaluated solutions
y
corresponding target function values f(x)
count
number of performed target function evaluations
optimCEGO
, optimEA
, optim2Opt
, optimMaxMinDist
seed=0
#distance
dF <- distancePermutationHamming
#creation
cF <- function()sample(5)
#objective function
lF <- landscapeGeneratorUNI(1:5,dF)
#start optimization
set.seed(seed)
res <- optimRS(,lF,list(creationFunction=cF,budget=100,
  vectorized=TRUE)) ##target function is "vectorized", expects list as input
res$xbest
Predict with a model fit resulting from modelKriging
.
## S3 method for class 'modelKriging'
predict(object, x, ...)
object |
fit of the Kriging model (settings and parameters), of class |
x |
list of samples to be predicted |
... |
further arguments, not used |
Returned value depends on the setting of object$predAll
TRUE: list with function value (mean) object$y
and uncertainty estimate object$s
(standard deviation)
FALSE: object$y
only
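A short sketch (fit is assumed to be a modelKriging fit, e.g., as produced in the modelKriging example, and xtest a list of new samples):

fit$predAll <- TRUE  #also return the uncertainty estimate
pred <- predict(fit,xtest)
pred$y  #predicted mean
pred$s  #standard deviation (uncertainty)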
Predict with a model fit resulting from modelKrigingClust
.
## S3 method for class 'modelKrigingClust'
predict(object, newdata, ...)
object |
fit of the clustered Kriging model ensemble (settings and parameters), of class |
newdata |
list of samples to be predicted |
... |
further arguments, currently not used |
list with function value (mean) object$y
and uncertainty estimate object$s
(standard deviation)
Predict with a modelLinear fit.
## S3 method for class 'modelLinear'
predict(object, x, ...)
object |
fit of the linear distance-based model (settings and parameters), of class modelLinear |
x |
list of samples to be predicted |
... |
further arguments, not used |
numeric vector of predictions
Predict with a model fit resulting from modelRBFN
.
## S3 method for class 'modelRBFN'
predict(object, x, ...)
object |
fit of the RBFN model (settings and parameters), of class |
x |
list of samples to be predicted |
... |
further arguments, not used |
Returned value depends on the setting of object$predAll
TRUE: list with function value (mean) $y
and uncertainty estimate $s
(standard deviation)
FALSE: $y
only
Given a population of bit-strings, this function recombines each individual with another individual by randomly specifying a single position. Information before that position is taken from the first parent, the rest from the second.
recombinationBinary1Point(population, parameters)
population |
List of bit-strings |
parameters |
not used |
population of recombined offspring
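A minimal usage sketch (two arbitrary parent bit-strings; parameters is unused and can be an empty list):

parents <- list(rep(TRUE,6),rep(FALSE,6))
recombinationBinary1Point(parents,list())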
Given a population of bit-strings, this function recombines each individual with another individual by randomly specifying 2 positions. Information in-between is taken from one parent, the rest from the other.
recombinationBinary2Point(population, parameters)
population |
List of bit-strings |
parameters |
not used |
population of recombined offspring
Given a population of bit-strings, this function recombines each
individual with another individual by computing parent1 & parent2
(logical AND).
recombinationBinaryAnd(population, parameters)
population |
List of bit-strings |
parameters |
not used |
population of recombined offspring
Given a population of bit-strings, this function recombines each
individual with another individual by randomly picking bits from each parent.
Note, that optimEA
will not pass the whole population
to recombination functions, but only the chosen parents.
recombinationBinaryUniform(population, parameters)
population |
List of bit-strings |
parameters |
not used |
population of recombined offspring
Given a population of permutations, this function recombines each
individual with another individual.
Note, that optimEA
will not pass the whole population
to recombination functions, but only the chosen parents.
recombinationPermutationAlternatingPosition(population, parameters)
population |
List of permutations |
parameters |
not used |
population of recombined offspring
Given a population of permutations, this function recombines each
individual with another individual.
Note, that optimEA
will not pass the whole population
to recombination functions, but only the chosen parents.
recombinationPermutationCycleCrossover(population, parameters)
population |
List of permutations |
parameters |
not used |
population of recombined offspring
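A minimal usage sketch (two arbitrary parent permutations):

parents <- list(sample(6),sample(6))
recombinationPermutationCycleCrossover(parents,list())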
Given a population of permutations, this function recombines each
individual with another individual.
Note, that optimEA
will not pass the whole population
to recombination functions, but only the chosen parents.
recombinationPermutationOrderCrossover1(population, parameters)
population |
List of permutations |
parameters |
not used |
population of recombined offspring
Given a population of permutations, this function recombines each
individual with another individual.
Note, that optimEA
will not pass the whole population
to recombination functions, but only the chosen parents.
recombinationPermutationPositionBased(population, parameters)
population |
List of permutations |
parameters |
not used |
population of recombined offspring
This recombination function selects an operator (provided in parameters$recombinationFunctions) based on self-adaptive parameters chosen for each individual separately.
recombinationSelfAdapt(population, parameters)
population |
List of permutations |
parameters |
list, contains the available single mutation functions ( |
Given a population of strings, this function recombines each
individual with another random individual.
Note, that optimEA
will not pass the whole population
to recombination functions, but only the chosen parents.
recombinationStringSinglePointCrossover(population, parameters)
population |
List of strings |
parameters |
not used |
population of recombined offspring
This function repairs correlation matrices, so that the following two properties are ensured: the correlation values should be between -1 and 1, and the diagonal values should be one.
repairConditionsCorrelationMatrix(mat)
mat |
symmetric, PSD correlation or kernel matrix. If your matrix is not PSD, use |
repaired correlation matrix
Martin Zaefferer and Thomas Bartz-Beielstein. (2016). Efficient Global Optimization with Indefinite Kernels. Parallel Problem Solving from Nature-PPSN XIV. Accepted, in press. Springer.
correctionDefinite
, correctionDistanceMatrix
, correctionKernelMatrix
, correctionCNSD
, repairConditionsDistanceMatrix
x <- list(c(2,1,4,3),c(2,4,3,1),c(4,2,1,3),c(4,3,2,1),c(1,4,3,2))
D <- distanceMatrix(x,distancePermutationInsert)
K <- exp(-0.01*D)
K <- correctionDefinite(K,type="PSD")$mat
K
K <- repairConditionsCorrelationMatrix(K)
This function repairs distance matrices, so that the following two properties are ensured: the distance values should be non-zero and the diagonal should be zero. Other properties (conditionally negative semi-definite (CNSD), symmetric) are assumed to be given.
repairConditionsDistanceMatrix(mat)
mat |
symmetric, CNSD distance matrix. If your matrix is not CNSD, use |
repaired distance matrix
Martin Zaefferer and Thomas Bartz-Beielstein. (2016). Efficient Global Optimization with Indefinite Kernels. Parallel Problem Solving from Nature-PPSN XIV. Accepted, in press. Springer.
correctionDefinite
, correctionDistanceMatrix
, correctionKernelMatrix
, correctionCNSD
, repairConditionsCorrelationMatrix
x <- list(c(2,1,4,3),c(2,4,3,1),c(4,2,1,3),c(4,3,2,1),c(1,4,3,2))
D <- distanceMatrix(x,distancePermutationInsert)
D <- correctionCNSD(D)
D
D <- repairConditionsDistanceMatrix(D)
D
(Conditional) Simulate at given locations, with a model fit resulting from modelKriging
.
In contrast to prediction or estimation, the goal is to reproduce the covariance
structure, rather than the data itself. Note that the conditional simulation
also reproduces the training data, but
with an error twice as large as that of the Kriging predictor.
## S3 method for class 'modelKriging'
simulate( object, nsim = 1, seed = NA, xsim, conditionalSimulation = TRUE,
  returnAll = FALSE, ... )
object |
fit of the Kriging model (settings and parameters), of class |
nsim |
number of simulations |
seed |
random number generator seed. Defaults to NA, in which case no seed is set |
xsim |
list of samples in input space, to be simulated |
conditionalSimulation |
logical, if set to TRUE (default), the simulation is conditioned with the training data of the Kriging model. Else, the simulation is non-conditional. |
returnAll |
if set to TRUE, a list with the simulated values (y) and the corresponding covariance matrix (covar) of the simulated samples is returned. |
... |
further arguments, not used |
Returned value depends on the setting of object$simulationReturnAll
N. A. Cressie. Statistics for Spatial Data. JOHN WILEY & SONS INC, 1993.
C. Lantuejoul. Geostatistical Simulation - Models and Algorithms. Springer-Verlag Berlin Heidelberg, 2002.
modelKriging
, predict.modelKriging
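A minimal sketch of a call (fit is assumed to be a modelKriging fit, e.g., from the modelKriging example; the simulation locations xs are arbitrary permutations):

xs <- replicate(10,sample(5),simplify=FALSE)
ysim <- simulate(fit,nsim=2,seed=1,xsim=xs,conditionalSimulation=TRUE)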
Returns a function that generates random bit-strings of length N. Can be used to create individuals of NK-Landscapes or other problems with binary representation.
solutionFunctionGeneratorBinary(N)
N |
length of the bit-strings |
returns a function, without any arguments
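A usage sketch, analogous to the permutation and string generators below:

fun <- solutionFunctionGeneratorBinary(10)
fun()
fun()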
Returns a function that generates random permutations of length N. Can be used to generate individual solutions for permutation problems, e.g., Travelling Salesperson Problem
solutionFunctionGeneratorPermutation(N)
N |
length of the permutations returned |
returns a function, without any arguments
fun <- solutionFunctionGeneratorPermutation(10)
fun()
fun()
fun()
Returns a function that generates random strings of length N, composed of the given letters. Can be used to generate individual solutions for string-based problems.
solutionFunctionGeneratorString(N, lts = c("A", "C", "G", "T"))
N |
length of the strings returned |
lts |
letters allowed in the string |
returns a function, without any arguments
fun <- solutionFunctionGeneratorString(10,c("A","C","G","T"))
fun()
fun()
fun()
Generate test functions for assessment of optimization algorithms with non-conditional or conditional simulation, based on real-world data.
testFunctionGeneratorSim( x, y, xsim, distanceFunction, controlModel = list(), controlSimulation = list() )
x |
list of samples in input space, training data |
y |
column vector of observations for each sample, training data |
xsim |
list of samples in input space, for simulation |
distanceFunction |
a suitable distance function of type f(x1,x2), returning a scalar distance value, preferably between 0 and 1. Maximum distances larger than 1 are no problem, but may yield scaling bias when different measures are compared. Should be non-negative and symmetric. It can also be a list of several distance functions. In this case, Maximum Likelihood Estimation (MLE) is used to determine the most suited distance measure. The distance function may have additional parameters. For that case, see distanceParametersLower/Upper in the controls. If distanceFunction is missing, it can also be provided in the control list. |
controlModel |
(list), with the options for the model building procedure,
it will be passed to the |
controlSimulation |
(list), with the parameters of the simulation:
|
a list with the following elements: fun
is a list of functions, where each function is the interpolation of one simulation realization. The length of the list depends on the nsim parameter.
fit
is the result of the modeling procedure, that is, the model fit of class modelKriging
.
N. A. Cressie. Statistics for Spatial Data. JOHN WILEY & SONS INC, 1993.
C. Lantuejoul. Geostatistical Simulation - Models and Algorithms. Springer-Verlag Berlin Heidelberg, 2002.
Zaefferer, M.; Fischbach, A.; Naujoks, B. & Bartz-Beielstein, T. Simulation Based Test Functions for Optimization Algorithms Proceedings of the Genetic and Evolutionary Computation Conference 2017, ACM, 2017, 8.
modelKriging
, simulate.modelKriging
, createSimulatedTestFunction
nsim <- 10
seed <- 12345
n <- 6
set.seed(seed)
#target function:
fun <- function(x){
  exp(-20* x) + sin(6*x^2) + x
}
# "vectorize" target
f <- function(x){sapply(x,fun)}
dF <- function(x,y)(sum((x-y)^2)) #sum of squares
# plot params
par(mfrow=c(4,1),mar=c(2.3,2.5,0.2,0.2),mgp=c(1.4,0.5,0))
#test samples for plots
xtest <- as.list(seq(from=-0,by=0.005,to=1))
plot(xtest,f(xtest),type="l",xlab="x",ylab="Obj. function")
#evaluation samples (training)
xb <- as.list(runif(n))
yb <- f(xb)
# support samples for simulation
x <- as.list(sort(c(runif(100),unlist(xb))))
# fit the model and simulate:
res <- testFunctionGeneratorSim(xb,yb,x,dF,
  list(algThetaControl=list(method="NLOPT_GN_DIRECT_L",funEvals=100),
    useLambda=FALSE),
  list(nsim=nsim,conditionalSimulation=FALSE))
fit <- res$fit
fun <- res$fun
#predicted obj. function values
ypred <- predict(fit,as.list(xtest))$y
plot(unlist(xtest),ypred,type="l",xlab="x",ylab="Estimation")
points(unlist(xb),yb,pch=19)
##############################
# plot non-conditional simulation
##############################
ynew <- NULL
for(i in 1:nsim) ynew <- cbind(ynew,fun[[i]](xtest))
rangeY <- range(ynew)
plot(unlist(xtest),ynew[,1],type="l",ylim=rangeY,xlab="x",ylab="Simulation")
for(i in 2:nsim){
  lines(unlist(xtest),ynew[,i],col=i,type="l")
}
##############################
# create and plot test function, conditional
##############################
fun <- testFunctionGeneratorSim(xb,yb,x,dF,
  list(algThetaControl= list(method="NLOPT_GN_DIRECT_L",funEvals=100),
    useLambda=FALSE),
  list(nsim=nsim,conditionalSimulation=TRUE))$fun
ynew <- NULL
for(i in 1:nsim) ynew <- cbind(ynew,fun[[i]](xtest))
rangeY <- range(ynew)
plot(unlist(xtest),ynew[,1],type="l",ylim=rangeY,xlab="x",ylab="Conditional sim.")
for(i in 2:nsim){
  lines(unlist(xtest),ynew[,i],col=i,type="l")
}
points(unlist(xb),yb,pch=19)