License: CC BY-NC-ND 4.0
arXiv:2403.11307v1 [cs.NE] 17 Mar 2024

An upper bound of the mutation probability in the genetic algorithm for general 0-1 knapsack problem

Yang Yang Email: zhugemutian@outlook.com School of Mathematical Sciences, Xiamen University, Xiamen 361005, P.R. China
Abstract

As an important part of genetic algorithms (GAs), mutation operators is widely used in evolutionary algorithms to solve 𝒩𝒫𝒩𝒫\mathcal{NP}caligraphic_N caligraphic_P-hard problems because it can increase the population diversity of individual. Due to limitations in mathematical tools, the mutation probability of the mutation operator is primarily empirically set in practical applications.

In this paper, we propose a novel reduction method for the 0-1 knapsack problem(0-1 KP) and an improved mutation operator (IMO) based on the assumption 𝒩𝒫𝒫𝒩𝒫𝒫\mathcal{NP}\neq\mathcal{P}caligraphic_N caligraphic_P ≠ caligraphic_P, along with the utilization of linear relaxation techniques and a recent result by Dey et al. (Math. Prog., pp 569-587, 2022). We employ this method to calculate an upper bound of the mutation probability in general instances of the 0-1 KP, and construct an instance where the mutation probability does not tend towards 0 as the problem size increases. Finally, we prove that the probability of the IMO hitting the optimal solution within only a single iteration in large-scale instances is superior to that of the traditional mutation operator.

Keywords: genetic algorithm; mutation operator; mutation probability; upper bound; reduction method

1 Introduction

Definition 1.

[1, 2, 3, 4] Given a set N𝑁Nitalic_N consisting of n𝑛nitalic_n items, where the j𝑗jitalic_j-th item has a profit and a weight, and denoted by pjsubscript𝑝𝑗p_{j}italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT and wjsubscript𝑤𝑗w_{j}italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT, respectively. The goal of the 0-1 knapsack problem(0-1 KP) is to find an optimal subset of the set N𝑁Nitalic_N such that its total profit is maximized without exceeding the knapsack capacity C𝐶Citalic_C. The problem can be formulated as follows.

maxf(𝑿)=j=1nxjpj,𝑓𝑿superscriptsubscript𝑗1𝑛subscript𝑥𝑗subscript𝑝𝑗\displaystyle\max f(\textbf{X})=\sum\limits_{j=1}^{n}x_{j}p_{j},roman_max italic_f ( X ) = ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , (1.1)

s.t.

g(𝑿)=j=1nxjwjC,𝑔𝑿superscriptsubscript𝑗1𝑛subscript𝑥𝑗subscript𝑤𝑗𝐶\displaystyle g(\textbf{X})=\sum\limits_{j=1}^{n}x_{j}w_{j}\leq C,italic_g ( X ) = ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ≤ italic_C , (1.2)
xj{0,1},subscript𝑥𝑗01\displaystyle x_{j}\in\{0,1\},italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ∈ { 0 , 1 } , (1.3)

where xj=1subscript𝑥𝑗1x_{j}=1italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = 1 represents the j𝑗jitalic_j-th item is packed in the knapsack while xj=0subscript𝑥𝑗0x_{j}=0italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = 0 indicates not. The objective function (1.1) of the model aims to maximize the total profit of the selected items. Inequality (1.2) is a capacity constraint that ensures the total weight of selected items do not exceed the knapsack capacity C𝐶Citalic_C. The decision variables xjsubscript𝑥𝑗x_{j}italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT are binary, as indicated in (1.3).

A substantial number of commercial decision-making and industrial manufacturing problems can be formulated as the 0-1 KP, such as budget control and cargo loading. This has resulted in a significant focus on developing efficient algorithms to tackle these problems. However, the finding for a polynomial-time algorithm to exactly solve the 0-1 KP, which is widely recognized as an 𝒩𝒫𝒩𝒫\mathcal{NP}caligraphic_N caligraphic_P-hard problem[5, 6, 7], poses a substantial challenge. Existing algorithms like dynamic programming [8, 9] can only solve the problem exactly within a pseudo-polynomial time complexity of O(nC)𝑂𝑛𝐶O(nC)italic_O ( italic_n italic_C ), and even the fastest exact algorithms require O(2n/2)𝑂superscript2𝑛2O(2^{n/2})italic_O ( 2 start_POSTSUPERSCRIPT italic_n / 2 end_POSTSUPERSCRIPT ) [10, 11]. Additionally, a significant body of research indicates that unless one intends to demonstrate 𝒩𝒫=𝒫𝒩𝒫𝒫\mathcal{NP}=\mathcal{P}caligraphic_N caligraphic_P = caligraphic_P, we can assume 𝒩𝒫𝒫𝒩𝒫𝒫\mathcal{NP}\neq\mathcal{P}caligraphic_N caligraphic_P ≠ caligraphic_P [12, 13]. If 𝒩𝒫𝒫𝒩𝒫𝒫\mathcal{NP}\neq\mathcal{P}caligraphic_N caligraphic_P ≠ caligraphic_P, for a general instance of the 0-1 KP, there are items whose selection in the optimal solution cannot be determined without exhaustively enumerating all feasible solutions.

Coincidentally, in the 1960s, Holland proposed a genetic algorithm(GA) [14] that simulates biological evolution, consisting of selection, crossover, and mutation operators, primarily used to search the binary solution space. Genetic algorithms(GAs) do not require the solution space to be continuous or need to calculate the gradient of the objective function during iteration, and as a result, the algorithm has been widely applied to solving various 𝒩𝒫𝒩𝒫\mathcal{NP}caligraphic_N caligraphic_P-hard problems, such as the task scheduling problem [15], the Traveling Salesman Problem [16], and the Vehicle Routing Problems [17].

It is worth noting that the convergence analysis of GAs is primarily limited to convex problems or 𝒫𝒫\mathcal{P}caligraphic_P problems [18, 19, 20, 21, 22, 23]. However, there is a lack of theoretical results for 𝒩𝒫𝒩𝒫\mathcal{NP}caligraphic_N caligraphic_P-hard problems, especially for the optimal parameter control of operators such as the mutation operator (MO)[24, 25, 26]. Despite the widespread and efficient performance of GAs in practical industrial and application processes[27], there is still widespread skepticism among researchers regarding GA research[28, 29, 30, Seretal2019]. Moreover, the No Free Lunch (NFL) theorem states that there is no superior parameterization for every problem[27, 31, 32, 33, 34], but it is still important to determine if there exists an optimal parameterization for a given 0-1 KP instance.

1.1 Our results

In this paper, we build upon the assumption of 𝒩𝒫𝒫𝒩𝒫𝒫\mathcal{NP}\neq\mathcal{P}caligraphic_N caligraphic_P ≠ caligraphic_P [5, 6, 7], improve the recent theoretical results of the branch and bound algorithm (B&B) [35, 36], integrating generating function approaches[37, 38], introduce a new reduction method and an improved mutation operator (IMO). Simultaneously, the method is applicable in computing an upper bound of the mutation probability for the 0-1 KP, as well as in constructing a counterexample where the mutation probability does not tend toward 0 with an increasing number of decision variables [25]. The contributions of this paper can be summarized as follows.

  1. 1.

    We propose an improved reduction method for the 0-1 KP, and provide the upper bound of leaves in the search tree (Theorem 3.6).

  2. 2.

    We propose the IMO and provide a general mathematical formulation for the upper bound of the mutation probability in solving the 0-1 KP (Theorem 3.7).

  3. 3.

    We construct examples where the mutation probability of the IMO does not tend towards 0 with the increase in problem size (Theorem 3.9).

  4. 4.

    We demonstrate that for large-scale instances, the performance of the IMO is superior to that of the MO (Theorem 3.11).

1.2 Organization of remainder paper

The remainder of this paper is organized as follows. In Section 2, we introduce the preliminary and relevant concepts of GAs. Section 3 presents the theoretical results, including a novel reduction method, the IMO proposed from this reduction method, and the associated theoretical findings. Additionally, we provide a convergence analysis of both the IMO and the MO. Finally, in the last section, we summarize the work of this paper and discuss future research directions.

2 Preliminary and existing conclusions

2.1 Preliminary

From the objective function (1.1) and inequality (1.2), a primitive greedy algorithm can be easily identified: the likelihood of an item appearing in the optimal solution increases as its profit increases and its weight decreases. Consequently, we introduce the classic concept of profit density[1, 2] to characterize the return on profit per unit weight of the items.

Definition 2.

For jN𝑗𝑁j\in Nitalic_j ∈ italic_N, the profit density of item j𝑗jitalic_j is defined by

ej=pj/wj.subscript𝑒𝑗subscript𝑝𝑗subscript𝑤𝑗\displaystyle e_{j}=p_{j}/w_{j}.italic_e start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT / italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT . (2.4)

Without loss of generality, we always assume that the items are arranged in non-increasing order, denoted as e1e2ensubscript𝑒1subscript𝑒2subscript𝑒𝑛e_{1}\geq e_{2}\geq\dots\geq e_{n}italic_e start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ≥ italic_e start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ≥ ⋯ ≥ italic_e start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT. The greedy algorithm selects items according to their profit density in descending order until the first item that cannot be packed in the knapsack. The item is known as the break item b𝑏bitalic_b and the solution obtained employing this greedy algorithm is the break solution. The characteristics of the break item and the break solution are defined as follows.

Definition 3.

The break item b𝑏bitalic_b is defined to be the first item that cannot be packed in the knapsack with profit density descending order, that is,

j=1b1wjC𝑎𝑛𝑑j=1bwj>Cformulae-sequencesuperscriptsubscript𝑗1𝑏1subscript𝑤𝑗𝐶𝑎𝑛𝑑superscriptsubscript𝑗1𝑏subscript𝑤𝑗𝐶\displaystyle\sum\limits_{j=1}^{b-1}w_{j}\leq C\quad\text{and}\quad\sum\limits% _{j=1}^{b}w_{j}>C∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_b - 1 end_POSTSUPERSCRIPT italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ≤ italic_C and ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT > italic_C (2.5)

Correspondingly, the residual capacity r𝑟ritalic_r is defined as

r=Cj=1b1wj𝑟𝐶superscriptsubscript𝑗1𝑏1subscript𝑤𝑗\displaystyle r=C-\sum\limits_{j=1}^{b-1}w_{j}italic_r = italic_C - ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_b - 1 end_POSTSUPERSCRIPT italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT (2.6)
Definition 4.

The break solution is donated as 𝐗*=(x1*,x2*,,xn*){0,1}nsuperscript𝐗subscriptsuperscript𝑥1subscriptsuperscript𝑥2normal-…subscriptsuperscript𝑥𝑛superscript01𝑛\textbf{X}^{*}=(x^{*}_{1},x^{*}_{2},\dots,x^{*}_{n})\in\{0,1\}^{n}X start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT = ( italic_x start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_x start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , … , italic_x start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ) ∈ { 0 , 1 } start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT, where xj*=1subscriptsuperscript𝑥𝑗1x^{*}_{j}=1italic_x start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = 1 if j{1,2,,b1}𝑗12normal-…𝑏1j\in\{1,2,\dots,b-1\}italic_j ∈ { 1 , 2 , … , italic_b - 1 } and xj*=0subscriptsuperscript𝑥𝑗0x^{*}_{j}=0italic_x start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = 0 otherwise. For the sake of comparison, let Y be the optimal solution.

To further verify that items with higher profit density are more likely to be selected in the optimal solution, Pisinger[39] conducted a comparison between the optimal solution Y and the break solution 𝐗*superscript𝐗\textbf{X}^{*}X start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT for 1000 instances. Each instance had a data size of 1000, with the break item being the 500th. Pisinger found that the primary concentrations of differences between Y and 𝐗*superscript𝐗\textbf{X}^{*}X start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT were around the break item.

Based on empirical evidence derived from the greedy algorithm and a significant number of instances, it can be empirically observed that items with higher profit density tend to be more frequently selected for inclusion in the optimal solution Y. However, it is regrettable that this observation may be challenged by a classic counterexample, which can be shown as follows.

Example 2.1.

[2] Let n=2𝑛2n=2italic_n = 2 and C=M𝐶𝑀C=Mitalic_C = italic_M. The first item is given by w1=1subscript𝑤11w_{1}=1italic_w start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = 1 and p1=2subscript𝑝12p_{1}=2italic_p start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = 2 whereas the second item is defined by w2=p2=Msubscript𝑤2subscript𝑝2𝑀w_{2}=p_{2}=Mitalic_w start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = italic_p start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = italic_M, where M𝑀Mitalic_M is a large constant.

Clearly, the break solution is 𝐗*=(1,0)superscript𝐗10\textbf{X}^{*}=(1,0)X start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT = ( 1 , 0 ), but the optimal solution for the counterexample is 𝐘=(0,1)𝐘01\textbf{Y}=(0,1)Y = ( 0 , 1 ). From this counterexample, we can observe that the profit density of an item does not necessarily determine whether it will be selected in the optimal solution, and also implies that in the process of solving through GA, the optimal probability distribution for the MO cannot be determined for a given instance.

Although it is currently impossible to determine all decision variables in polynomial time complexity, with the continuous development of solving algorithms, a reduction algorithm has been discovered that can exactly solve a part of decision variables in polynomial time complexity. Given an instance 𝒫𝒫\mathscr{P}script_P, the reduction algorithm treats the solving process as a search tree, where each branch represents a decision variable in the problem. We employ the notation (𝒫|xj=α)conditional𝒫subscript𝑥𝑗𝛼(\mathscr{P}|x_{j}=\alpha)( script_P | italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = italic_α ) to denote the subproblem where the j𝑗jitalic_j-th decision variable is fixed at α{0,1}𝛼01\alpha\in\{0,1\}italic_α ∈ { 0 , 1 }. To facilitate subsequent discussions, v¯(𝒫|xj=α)¯𝑣conditional𝒫subscript𝑥𝑗𝛼\overline{v}(\mathscr{P}|x_{j}=\alpha)over¯ start_ARG italic_v end_ARG ( script_P | italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = italic_α ) is introduced as an upper bound for this subproblem. Furthermore, we introduce v¯(𝒫)¯𝑣𝒫\underline{v}(\mathscr{P})under¯ start_ARG italic_v end_ARG ( script_P ) as a lower bound for the original problem 𝒫𝒫\mathscr{P}script_P, typically computed using the greedy algorithm or a meta-heuristic algorithm.

Despite the subproblem (𝒫|xj=α)conditional𝒫subscript𝑥𝑗𝛼(\mathscr{P}|x_{j}=\alpha)( script_P | italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = italic_α ) is also known to be 𝒩𝒫𝒩𝒫\mathcal{NP}caligraphic_N caligraphic_P-hard, relaxing the solution space to the real field converts it into a convex problem[40]. Consequently, the optimal solution for the relaxed subproblem, v¯(𝒫|xj=α)¯𝑣conditional𝒫subscript𝑥𝑗𝛼\overline{v}(\mathscr{P}|x_{j}=\alpha)over¯ start_ARG italic_v end_ARG ( script_P | italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = italic_α ), can be obtained within polynomial time complexity. These findings lead to several significant conclusions.

Theorem 2.2.

For the 0-1 KP, if v¯(𝒫|xj=α)v¯(𝒫)normal-¯𝑣conditional𝒫subscript𝑥𝑗𝛼normal-¯𝑣𝒫\overline{v}(\mathscr{P}|x_{j}=\alpha)\leq\underline{v}(\mathscr{P})over¯ start_ARG italic_v end_ARG ( script_P | italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = italic_α ) ≤ under¯ start_ARG italic_v end_ARG ( script_P ), then xjsubscript𝑥𝑗x_{j}italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT is set to 1α1𝛼1-\alpha1 - italic_α in the optimal solution, where α{0,1}𝛼01\alpha\in\{0,1\}italic_α ∈ { 0 , 1 }.

The decision variable xjsubscript𝑥𝑗x_{j}italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT is fixed, and the branch xj=αsubscript𝑥𝑗𝛼x_{j}=\alphaitalic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = italic_α is pruned. To compute v¯(𝒫|xj=α)¯𝑣conditional𝒫subscript𝑥𝑗𝛼\overline{v}(\mathscr{P}|x_{j}=\alpha)over¯ start_ARG italic_v end_ARG ( script_P | italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = italic_α ), the most common approach is to relax the solution space from the integer field to the real field. By doing this, we can obtain the well-known Dantzig upper bound[41], which is denoted as U𝑈Uitalic_U and is given by

U=j=1b1pj+rpbw,𝑈superscriptsubscript𝑗1𝑏1subscript𝑝𝑗𝑟subscript𝑝𝑏𝑤\displaystyle U=\sum\limits_{j=1}^{b-1}p_{j}+\frac{rp_{b}}{w},italic_U = ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_b - 1 end_POSTSUPERSCRIPT italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT + divide start_ARG italic_r italic_p start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT end_ARG start_ARG italic_w end_ARG ,

where r𝑟ritalic_r represents the residual capacity. It is clear that the decision variables dominated by the Dantzig upper bound U𝑈Uitalic_U can be expressed as follows:

Theorem 2.3.

[39] For the j𝑗jitalic_j-th item, if ejebsubscript𝑒𝑗subscript𝑒𝑏e_{j}\geq e_{b}italic_e start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ≥ italic_e start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT(resp. ej<ebsubscript𝑒𝑗subscript𝑒𝑏e_{j}<e_{b}italic_e start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT < italic_e start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT) and pj>(wj+r)pb/wbsubscript𝑝𝑗subscript𝑤𝑗𝑟subscript𝑝𝑏subscript𝑤𝑏p_{j}>(w_{j}+r)p_{b}/w_{b}italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT > ( italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT + italic_r ) italic_p start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT / italic_w start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT(resp. pj<(wjr)pb/wbsubscript𝑝𝑗subscript𝑤𝑗𝑟subscript𝑝𝑏subscript𝑤𝑏p_{j}<(w_{j}-r)p_{b}/w_{b}italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT < ( italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT - italic_r ) italic_p start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT / italic_w start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT), then we have xj=1subscript𝑥𝑗1x_{j}=1italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = 1(resp. xj=0subscript𝑥𝑗0x_{j}=0italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = 0) in the optimal solution.

The time complexity of computing the Dantzig bound is O(n)𝑂𝑛O(n)italic_O ( italic_n ). It is evident that we can solve a decision variable exactly within polynomial time complexity. A significant amount of research has been conducted to improve the upper bounds obtained within the time complexity of O(n2)𝑂superscript𝑛2O(n^{2})italic_O ( italic_n start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) [39, 42, 43, 44, 45], thereby enabling the exact fixing of more decision variables within polynomial time complexity. However, in the worst-case scenario, the method cannot solve all decision variables. Consequently, for a preprocessed 𝒩𝒫𝒩𝒫\mathcal{NP}caligraphic_N caligraphic_P-hard problem, to the best of our knowledge, there is currently no relevant research available regarding which decision variables are more likely to be selected in the optimal solution. Moreover, if an 𝒩𝒫𝒩𝒫\mathcal{NP}caligraphic_N caligraphic_P-hard problem can be exactly solved within polynomial time complexity, heuristic algorithms, such as GAs, that can only provide locally optimal solutions within polynomial time complexity lose their significance. Based on the current state of research, we make the following assumptions.

Assuption 1.

𝒩𝒫𝒫𝒩𝒫𝒫\mathcal{NP}\neq\mathcal{P}caligraphic_N caligraphic_P ≠ caligraphic_P.

If Assumption 1 holds, then for an unpruned feasible solution, we cannot determine its status as an exact solution within polynomial time complexity. Consequently, each unpruned feasible solution has an equal probability of being the optimal solution.

2.2 Genetic algorithm Framework

With the prosperity of commercial activities, a vast number of business operations can be described as 𝒩𝒫𝒩𝒫\mathcal{NP}caligraphic_N caligraphic_P-hard problems. Considering Assumption 1, the development of exact algorithms for solving such problems has been slow. Meanwhile, heuristic algorithms have flourished [50, 51]. As a classic algorithm in the field of heuristics, GAs have been extensively applied to solve 𝒩𝒫𝒩𝒫\mathcal{NP}caligraphic_N caligraphic_P-hard problems, particularly the knapsack problem and its variants [46], over the past decades. In GAs, each solution in the search space can be represented by an individual composed of several genes. By simulating natural selection, genetic mutation, and crossover operations, GAs evolve and select individuals within the current population, aiming to find the optimal solution. The flowchart of the GA can be illustrated in Figure 1. It is worth noting that Rudolph pointed out, through Markov chain analysis, that the GA lacks global convergence, whereas the Elitist Genetic Algorithm (EGA), which selects the current population and the optimal individual from the previous generation, possesses the property of converging to the global optimum as time approaches infinity [22]. The main distinction between EGA and GA lies in the selection operators, while the same mutation operator is employed. Since this paper primarily focuses on the theoretical results of mutation operators, we will still adopt the representation of the GA in this study.

Initialize PopulationFitness EvaluationptCheckTerminationCriteriaptReturnSolutionCrossover OperatorMutation OperatorSelection OperatorptNextIterationNoYes
Figure 1: Flowchart of a basic genetic algorithm

The GA begins by initializing the population and generating initial solutions, and the fitness value of each individual is calculated by the objective function. The algorithm then checks if the termination condition is satisfied based on the specified number of iterations. If the termination condition is met, the results are output; otherwise, the next generation population is obtained by applying the crossover, mutation, and selection operators sequentially. Common crossover operators include single-point crossover, k-point crossover, and uniform crossover. Frequent mutation operators include flip bit mutation, swap mutation, and inversion mutation. Selection operators mainly consist of roulette wheel selection, stochastic universal sampling, and rank-based selection. Various works on these operators are available in the references [47, 48]. In this study, we utilize the widely employed single-point crossover, flip bit mutation, and roulette wheel selection operators, along with presenting pseudo-code for the GA.

Let Otk(j)superscriptsubscript𝑂𝑡𝑘𝑗O_{t}^{k}(j)italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ( italic_j ) represent the value of the j𝑗jitalic_j-th decision variable in the k𝑘kitalic_k-th individual of the t𝑡titalic_t-th (1kpop)1𝑘pop(1\leq k\leq\text{pop})( 1 ≤ italic_k ≤ pop ) generation where pop represents the population size, and let minf(Ot)𝑓subscript𝑂𝑡\min f(O_{t})roman_min italic_f ( italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) (resp. maxf(Ot)𝑓subscript𝑂𝑡\max f(O_{t})roman_max italic_f ( italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT )) denote the minimum (resp. maximum) fitness value of individuals in the t𝑡titalic_t-th generation population. Additionally, LOS refers to the local optimal solution, while I𝐼Iitalic_I and t𝑡titalic_t represent the maximum number of iterations and the current iteration number, respectively. Moreover, pmsubscript𝑝𝑚p_{m}italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT and pcsubscript𝑝𝑐p_{c}italic_p start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT represent the mutation probability and the crossover probability, respectively.

Algorithm 1 Genetic algorithm
  Input: n,W,P,C,pop,I,pc,pm𝑛𝑊𝑃𝐶pop𝐼subscript𝑝𝑐subscript𝑝𝑚n,W,P,C,\text{pop},I,p_{c},p_{m}italic_n , italic_W , italic_P , italic_C , pop , italic_I , italic_p start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT , italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT
  Output: LOS
  Generating initial population O1subscript𝑂1O_{1}italic_O start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and let t=1𝑡1t=1italic_t = 1.
  while tI𝑡𝐼t\leq Iitalic_t ≤ italic_I do
     //Crossover Operator
     for k=1:2:pop:𝑘12:popk=1:2:\text{pop}italic_k = 1 : 2 : pop do
        if randpcrandsubscript𝑝𝑐{\rm rand}\leq p_{c}roman_rand ≤ italic_p start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT then
           rt=rand*(n1)+1,T=Otkformulae-sequencesubscript𝑟𝑡rand𝑛11𝑇superscriptsubscript𝑂𝑡𝑘r_{t}=\text{rand}*(n-1)+1,T=O_{t}^{k}italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = rand * ( italic_n - 1 ) + 1 , italic_T = italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT,
           Otk(rt+1:n)=Otk+1(rt+1:n),O_{t}^{k}(r_{t}+1:n)=O_{t}^{k+1}(r_{t}+1:n),italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ( italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT + 1 : italic_n ) = italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k + 1 end_POSTSUPERSCRIPT ( italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT + 1 : italic_n ) ,
           Otk+1(rt+1:n)=T(rt+1:n).O_{t}^{k+1}(r_{t}+1:n)=T(r_{t}+1:n).italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k + 1 end_POSTSUPERSCRIPT ( italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT + 1 : italic_n ) = italic_T ( italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT + 1 : italic_n ) .
        end if
     end for
     //Mutation Operator
     Ot=|Ot[rand(pop,n)pm]|.subscript𝑂𝑡subscript𝑂𝑡delimited-[]randpop𝑛subscript𝑝𝑚O_{t}=|O_{t}-[{\rm rand}(\text{pop},n)\leq p_{m}]|.italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = | italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT - [ roman_rand ( pop , italic_n ) ≤ italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ] | .
     //Selection Operator
     The selection probability of the k𝑘kitalic_k-th individual is determined as f(Otk)minf(Ot)+1k=1popf(Otk)popminf(Ot)+pop𝑓superscriptsubscript𝑂𝑡𝑘𝑓subscript𝑂𝑡1superscriptsubscriptsuperscript𝑘1pop𝑓superscriptsubscript𝑂𝑡superscript𝑘pop𝑓subscript𝑂𝑡pop\frac{f(O_{t}^{k})-\min f(O_{t})+1}{\sum_{k^{\prime}=1}^{\text{pop}}f(O_{t}^{k% ^{\prime}})-\text{pop}\min f(O_{t})+\text{pop}}divide start_ARG italic_f ( italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ) - roman_min italic_f ( italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) + 1 end_ARG start_ARG ∑ start_POSTSUBSCRIPT italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT pop end_POSTSUPERSCRIPT italic_f ( italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT ) - pop roman_min italic_f ( italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) + pop end_ARG, and resulting in Ot+1subscript𝑂𝑡1O_{t+1}italic_O start_POSTSUBSCRIPT italic_t + 1 end_POSTSUBSCRIPT.
     t=t+1.𝑡𝑡1t=t+1.italic_t = italic_t + 1 .
  end while
  LOS=maxf(OI)LOS𝑓subscript𝑂𝐼\text{LOS}=\max f(O_{I})LOS = roman_max italic_f ( italic_O start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT )

The GA starts by generating initial solutions and then enters a loop. Next, the algorithm performs the crossover and mutation operators on the individuals determined by random numbers. These operations involve randomly generating crossover points, denoted as trsubscript𝑡𝑟t_{r}italic_t start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT, crossing two individuals from position trsubscript𝑡𝑟t_{r}italic_t start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT to n𝑛nitalic_n, and flipping their bits. Afterward, the GA calculates the probability of each individual being selected using the roulette wheel selection operator. Particularly, as problem size increases, the difference between the objective function values of sub-optimal and local optimal solutions decreases, increasing the likelihood of sub-optimal solutions being selected. To address this, we assign a value of 1 to the objective function value of the worst solution in the current generation, then subtract this value from the objective function values of the other solutions, and add 1. Finally, the algorithm terminates the loop after a limited number of iterations and returns the local optimal solution (LOS).

It is worth noting that GAs were initially perceived as a robust search algorithm, leading to a lack of extensive attention to parameter control in the early stages of its development [22]. As the research on algorithm theory progressed, however, numerous researchers acknowledged the impact of parameter control on search outcomes, and it has since become one of the most crucial issues in the field of evolutionary algorithms [49].

3 Mainly theoretical result

3.1 A novel reduction method

Recently, a new reduction method was proposed by Dey et al.[36] for analyzing the performance of the B&B in solving random integer linear programming problems. This method divides the decision variables into different regions and constrains the branching generated by decision variables within each region. In this section, we combine traditional reduction methods [39] with the approach by Dey et al. [36] to solve the 0-1 KP. We introduce a colored regions partitioning method for 0-1 KP, as demonstrated in Theorem 3.1. This method determines the minimum or maximum number of items that should be included in the optimal solution for each colored region.

Theorem 3.1.

Given a set of items Nsuperscript𝑁normal-′N^{\prime}italic_N start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT where the j𝑗jitalic_j-th item satisfies the inequality

pjwj+r/i>pbwbsubscript𝑝𝑗subscript𝑤𝑗𝑟𝑖subscript𝑝𝑏subscript𝑤𝑏\displaystyle\frac{p_{j}}{w_{j}+r/i}>\frac{p_{b}}{w_{b}}divide start_ARG italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG start_ARG italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT + italic_r / italic_i end_ARG > divide start_ARG italic_p start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT end_ARG start_ARG italic_w start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT end_ARG (3.7)

for i𝑖i\in\mathbb{N}italic_i ∈ blackboard_N, we can conclude that jN|xjyj|i1subscript𝑗superscript𝑁normal-′subscript𝑥𝑗subscript𝑦𝑗𝑖1\sum\limits_{j\in N^{\prime}}|x_{j}-y_{j}|\leq i-1∑ start_POSTSUBSCRIPT italic_j ∈ italic_N start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT | italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT - italic_y start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT | ≤ italic_i - 1, where Y represents the optimal solution.

It is evident that for the 0-1 KP, the number of leaves in the search tree for items in Nsuperscript𝑁N^{\prime}italic_N start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT is reduced from 2|N|superscript2superscript𝑁2^{|N^{\prime}|}2 start_POSTSUPERSCRIPT | italic_N start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | end_POSTSUPERSCRIPT to k=1i1(|N|k)superscriptsubscript𝑘1𝑖1binomialsuperscript𝑁𝑘\sum\limits_{k=1}^{i-1}\binom{|N^{\prime}|}{k}∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i - 1 end_POSTSUPERSCRIPT ( FRACOP start_ARG | italic_N start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | end_ARG start_ARG italic_k end_ARG ). Finding ways to effectively reduce the algorithmic complexity and improve the solving speed, particularly by exploring more efficient prune strategies between different colored regions, remains an intriguing question. Regrettably, existing research has not investigated the relationships between items within different colored regions. Therefore, we present the following definition based on Theorem 3.1.

Definition 5.

The item set N1={1,2,,b1}subscript𝑁112normal-…𝑏1N_{1}=\{1,2,\dots,b-1\}italic_N start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = { 1 , 2 , … , italic_b - 1 }, consisting of items with a higher profit density than the break item b𝑏bitalic_b, can be divided into m𝑚mitalic_m disjoint subsets, denoted as N1=i=1mN1isubscript𝑁1superscriptsubscriptsquare-union𝑖1𝑚subscript𝑁1𝑖N_{1}=\sqcup_{i=1}^{m}N_{1i}italic_N start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = ⊔ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT. Each N1isubscript𝑁1𝑖N_{1i}italic_N start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT subset is defined as follows:

N1i={j|rpbpjwbpbwj+1=i,ej>eb}.subscript𝑁1𝑖conditional-set𝑗formulae-sequence𝑟subscript𝑝𝑏subscript𝑝𝑗subscript𝑤𝑏subscript𝑝𝑏subscript𝑤𝑗1𝑖subscript𝑒𝑗subscript𝑒𝑏\displaystyle N_{1i}=\{j|\lfloor\frac{rp_{b}}{p_{j}w_{b}-p_{b}w_{j}}\rfloor+1=% i,e_{j}>e_{b}\}.italic_N start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT = { italic_j | ⌊ divide start_ARG italic_r italic_p start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT end_ARG start_ARG italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_w start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT - italic_p start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG ⌋ + 1 = italic_i , italic_e start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT > italic_e start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT } . (3.8)

If no item satisfies the formula (3.8) for any integer i(1im)𝑖1𝑖𝑚i(1\leq i\leq m)italic_i ( 1 ≤ italic_i ≤ italic_m ), then N1i=subscript𝑁1𝑖N_{1i}=\emptysetitalic_N start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT = ∅.

Theorem 3.2.

Let D1isubscript𝐷1𝑖D_{1i}italic_D start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT and s1isubscript𝑠1𝑖s_{1i}italic_s start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT denote the item set and the number of items that are not packed into the optimal solution in N1isubscript𝑁1𝑖N_{1i}italic_N start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT, respectively, where 1im1𝑖𝑚1\leq i\leq m1 ≤ italic_i ≤ italic_m. Furthermore, we have s1i=|D1i|=jN1i|yj1|subscript𝑠1𝑖subscript𝐷1𝑖subscript𝑗subscript𝑁1𝑖subscript𝑦𝑗1s_{1i}=|D_{1i}|=\sum\limits_{j\in N_{1i}}|y_{j}-1|italic_s start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT = | italic_D start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT | = ∑ start_POSTSUBSCRIPT italic_j ∈ italic_N start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT | italic_y start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT - 1 |. All s1isubscript𝑠1𝑖s_{1i}italic_s start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT form the vector S=(s1i)i=1m𝑆superscriptsubscriptsubscript𝑠1𝑖𝑖1𝑚S=(s_{1i})_{i=1}^{m}italic_S = ( italic_s start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT ) start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT. Therefore, we can conclude that:

i=1ms1ii1.superscriptsubscript𝑖1𝑚subscript𝑠1𝑖𝑖1\displaystyle\sum\limits_{i=1}^{m}\frac{s_{1i}}{i}\leq 1.∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT divide start_ARG italic_s start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT end_ARG start_ARG italic_i end_ARG ≤ 1 .
Proof.

Given a vector 𝐘superscript𝐘\textbf{Y}^{\prime}Y start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, we can determine the corresponding D1isubscriptsuperscript𝐷1𝑖D^{\prime}_{1i}italic_D start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT and S=(s1i)i=1msuperscript𝑆subscriptsuperscriptsubscriptsuperscript𝑠1𝑖𝑚𝑖1S^{\prime}=(s^{\prime}_{1i})^{m}_{i=1}italic_S start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = ( italic_s start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT of 𝐘superscript𝐘\textbf{Y}^{\prime}Y start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT for i=1,2,,m𝑖12𝑚i=1,2,\dots,mitalic_i = 1 , 2 , … , italic_m using the definitions of D1isubscript𝐷1𝑖D_{1i}italic_D start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT and S𝑆Sitalic_S.

Assume, for the sake of contradiction, that 𝐘superscript𝐘\textbf{Y}^{\prime}Y start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT is the optimal solution and satisfies

i=1ms1i/i>1.superscriptsubscript𝑖1𝑚subscriptsuperscript𝑠1𝑖𝑖1\displaystyle\sum\limits_{i=1}^{m}s^{\prime}_{1i}/i>1.∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT italic_s start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT / italic_i > 1 .

It is clear that the optimal solution 𝐘superscript𝐘\textbf{Y}^{\prime}Y start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT must not exceed the Dantzig bound. Therefore, we have

i=1mjD1ipj(i=1mjD1iwj+r)pb/wb.superscriptsubscript𝑖1𝑚subscript𝑗subscriptsuperscript𝐷1𝑖subscript𝑝𝑗superscriptsubscript𝑖1𝑚subscript𝑗subscriptsuperscript𝐷1𝑖subscript𝑤𝑗𝑟subscript𝑝𝑏subscript𝑤𝑏\displaystyle\sum\limits_{i=1}^{m}\sum\limits_{j\in D^{\prime}_{1i}}p_{j}\leq(% \sum\limits_{i=1}^{m}\sum\limits_{j\in D^{\prime}_{1i}}w_{j}+r)p_{b}/w_{b}.∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT ∑ start_POSTSUBSCRIPT italic_j ∈ italic_D start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ≤ ( ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT ∑ start_POSTSUBSCRIPT italic_j ∈ italic_D start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT + italic_r ) italic_p start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT / italic_w start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT . (3.9)

Using the formulas (3.8) and Theorem 3.1, we can deduce the following inequality:

i=1mjD1ipji=1mjD1i(wj+r/i)=i=1mjD1ipji=1mjD1iwj+ri=1ms1i/i>pbwbsuperscriptsubscript𝑖1𝑚subscript𝑗subscriptsuperscript𝐷1𝑖subscript𝑝𝑗superscriptsubscript𝑖1𝑚subscript𝑗subscriptsuperscript𝐷1𝑖subscript𝑤𝑗𝑟𝑖superscriptsubscript𝑖1𝑚subscript𝑗subscriptsuperscript𝐷1𝑖subscript𝑝𝑗superscriptsubscript𝑖1𝑚subscript𝑗subscriptsuperscript𝐷1𝑖subscript𝑤𝑗𝑟superscriptsubscript𝑖1𝑚subscriptsuperscript𝑠1𝑖𝑖subscript𝑝𝑏subscript𝑤𝑏\displaystyle\frac{\sum\limits_{i=1}^{m}\sum\limits_{j\in D^{\prime}_{1i}}p_{j% }}{\sum\limits_{i=1}^{m}\sum\limits_{j\in D^{\prime}_{1i}}(w_{j}+r/i)}=\frac{% \sum\limits_{i=1}^{m}\sum\limits_{j\in D^{\prime}_{1i}}p_{j}}{\sum\limits_{i=1% }^{m}\sum\limits_{j\in D^{\prime}_{1i}}w_{j}+r\sum\limits_{i=1}^{m}s^{\prime}_% {1i}/i}>\frac{p_{b}}{w_{b}}divide start_ARG ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT ∑ start_POSTSUBSCRIPT italic_j ∈ italic_D start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG start_ARG ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT ∑ start_POSTSUBSCRIPT italic_j ∈ italic_D start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT + italic_r / italic_i ) end_ARG = divide start_ARG ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT ∑ start_POSTSUBSCRIPT italic_j ∈ italic_D start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG start_ARG ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT ∑ start_POSTSUBSCRIPT italic_j ∈ italic_D start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT + italic_r ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT italic_s start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT / italic_i end_ARG > divide start_ARG italic_p start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT end_ARG start_ARG italic_w start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT end_ARG

and

i=1mjD1ipjsuperscriptsubscript𝑖1𝑚subscript𝑗subscriptsuperscript𝐷1𝑖subscript𝑝𝑗\displaystyle\sum\limits_{i=1}^{m}\sum\limits_{j\in D^{\prime}_{1i}}p_{j}∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT ∑ start_POSTSUBSCRIPT italic_j ∈ italic_D start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT >(i=1mjD1iwj+ri=1ms1i/i)pb/wbabsentsuperscriptsubscript𝑖1𝑚subscript𝑗subscriptsuperscript𝐷1𝑖subscript𝑤𝑗𝑟superscriptsubscript𝑖1𝑚subscriptsuperscript𝑠1𝑖𝑖subscript𝑝𝑏subscript𝑤𝑏\displaystyle>(\sum\limits_{i=1}^{m}\sum\limits_{j\in D^{\prime}_{1i}}w_{j}+r% \sum\limits_{i=1}^{m}s^{\prime}_{1i}/i)p_{b}/w_{b}> ( ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT ∑ start_POSTSUBSCRIPT italic_j ∈ italic_D start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT + italic_r ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT italic_s start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT / italic_i ) italic_p start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT / italic_w start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT
>(i=1mjD1iwj+r)pb/wb.absentsuperscriptsubscript𝑖1𝑚subscript𝑗subscriptsuperscript𝐷1𝑖subscript𝑤𝑗𝑟subscript𝑝𝑏subscript𝑤𝑏\displaystyle>(\sum\limits_{i=1}^{m}\sum\limits_{j\in D^{\prime}_{1i}}w_{j}+r)% p_{b}/w_{b}.> ( ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT ∑ start_POSTSUBSCRIPT italic_j ∈ italic_D start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT + italic_r ) italic_p start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT / italic_w start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT . (3.10)

Inequality (3.9) and inequality (3.10) contradict each other, proving the proposition. Hence, Theorem 3.2 is established. ∎

It is worth noting that s1isubscript𝑠1𝑖s_{1i}italic_s start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT represents the items in N1isubscript𝑁1𝑖N_{1i}italic_N start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT that are not selected by the optimal solution, so s1isubscript𝑠1𝑖s_{1i}italic_s start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT belongs to the set of natural numbers, i.e., s1isubscript𝑠1𝑖s_{1i}\in\mathbb{N}italic_s start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT ∈ blackboard_N. Even when the integer constraints of the original problem are relaxed, i.e., xj[0,1]subscript𝑥𝑗01x_{j}\in[0,1]italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ∈ [ 0 , 1 ] and s1isubscript𝑠1𝑖s_{1i}\in\mathbb{R}italic_s start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT ∈ blackboard_R, the conclusion still holds. This means that

Corollary 3.3.

Given a solution 𝐗[0,1]n𝐗superscript01𝑛\textbf{X}\in[0,1]^{n}X ∈ [ 0 , 1 ] start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT and s1i=jD1i|xj1|subscript𝑠1𝑖subscript𝑗subscript𝐷1𝑖subscript𝑥𝑗1s_{1i}=\sum\limits_{j\in D_{1i}}|x_{j}-1|italic_s start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_j ∈ italic_D start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT | italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT - 1 |. If

i=1ms1i/i>1,superscriptsubscript𝑖1𝑚subscript𝑠1𝑖𝑖1\displaystyle\sum\limits_{i=1}^{m}s_{1i}/i>1,∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT italic_s start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT / italic_i > 1 , (3.11)

then X is not the optimal solution.

Furthermore, we establish the following equivalent corollary to facilitate algorithmic computation, considering the presence of numerous empty sets N1isubscript𝑁1𝑖N_{1i}italic_N start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT in Theorem 3.2.

Corollary 3.4.

Consider a vector H={h1,h2,,hn}𝐻subscript1subscript2normal-…subscript𝑛H=\{h_{1},h_{2},\dots,h_{n}\}italic_H = { italic_h start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_h start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , … , italic_h start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT }, where each element is defined as:

hj={rpbpjwbpbwj+1,𝑖𝑓ej>eb,,𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒.subscript𝑗cases𝑟subscript𝑝𝑏subscript𝑝𝑗subscript𝑤𝑏subscript𝑝𝑏subscript𝑤𝑗1𝑖𝑓subscript𝑒𝑗subscript𝑒𝑏𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒\displaystyle\begin{split}h_{j}=\left\{\begin{array}[]{ll}\lfloor\frac{rp_{b}}% {p_{j}w_{b}-p_{b}w_{j}}\rfloor+1,&\text{if}\ e_{j}>e_{b},\\ \infty,&\text{otherwise}.\end{array}\right.\end{split}start_ROW start_CELL italic_h start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = { start_ARRAY start_ROW start_CELL ⌊ divide start_ARG italic_r italic_p start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT end_ARG start_ARG italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_w start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT - italic_p start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG ⌋ + 1 , end_CELL start_CELL if italic_e start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT > italic_e start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT , end_CELL end_ROW start_ROW start_CELL ∞ , end_CELL start_CELL otherwise . end_CELL end_ROW end_ARRAY end_CELL end_ROW

If X is the optimal solution, then the following inequality holds:

j=1n1xjhj1.superscriptsubscript𝑗1𝑛1subscript𝑥𝑗subscript𝑗1\displaystyle\sum\limits_{j=1}^{n}\frac{1-x_{j}}{h_{j}}\leq 1.∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT divide start_ARG 1 - italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG start_ARG italic_h start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG ≤ 1 .

Based on Corollary 3.4, we can derive the following corollary for items with a profit density lower than the break item b𝑏bitalic_b.

Corollary 3.5.

Consider a vector L={l1,l2,,ln}𝐿subscript𝑙1subscript𝑙2normal-…subscript𝑙𝑛L=\{l_{1},l_{2},\dots,l_{n}\}italic_L = { italic_l start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_l start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , … , italic_l start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT }, where each element is defined as:

lj={rpbpbwjpjwb+1,𝑖𝑓ej<eb,,𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒.subscript𝑙𝑗cases𝑟subscript𝑝𝑏subscript𝑝𝑏subscript𝑤𝑗subscript𝑝𝑗subscript𝑤𝑏1𝑖𝑓subscript𝑒𝑗subscript𝑒𝑏𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒\displaystyle\begin{split}l_{j}=\left\{\begin{array}[]{ll}\lfloor\frac{rp_{b}}% {p_{b}w_{j}-p_{j}w_{b}}\rfloor+1,&\text{if}\ e_{j}<e_{b},\\ \infty,&\text{otherwise}.\end{array}\right.\end{split}start_ROW start_CELL italic_l start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = { start_ARRAY start_ROW start_CELL ⌊ divide start_ARG italic_r italic_p start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT end_ARG start_ARG italic_p start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT - italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_w start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT end_ARG ⌋ + 1 , end_CELL start_CELL if italic_e start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT < italic_e start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT , end_CELL end_ROW start_ROW start_CELL ∞ , end_CELL start_CELL otherwise . end_CELL end_ROW end_ARRAY end_CELL end_ROW

If X is the optimal solution, then the following inequality holds:

j=1nxjlj1.superscriptsubscript𝑗1𝑛subscript𝑥𝑗subscript𝑙𝑗1\displaystyle\sum\limits_{j=1}^{n}\frac{x_{j}}{l_{j}}\leq 1.∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT divide start_ARG italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG start_ARG italic_l start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG ≤ 1 .

The number of leaves pruned in the search tree of the 0-1 KP by Corollaries 3.4 and 3.5 can be expressed using generating functions[37, 38] as follows.

Theorem 3.6.

Consider the number of leaves in the search tree, denoted as ω𝜔\omegaitalic_ω, and the number of items in N1isubscript𝑁1𝑖N_{1i}italic_N start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT, denoted as n1isubscript𝑛1𝑖n_{1i}italic_n start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT, respectively. Additionally, let 𝒳(λ)𝒳𝜆\mathcal{X}(\lambda)caligraphic_X ( italic_λ ) be the polynomial representing a potential optimal solution, given by:

𝒳(λ):=i=1m(j=0min{n1i,i}(n1ij)λj/i).assign𝒳𝜆superscriptsubscriptproduct𝑖1𝑚superscriptsubscript𝑗0subscript𝑛1𝑖𝑖binomialsubscript𝑛1𝑖𝑗superscript𝜆𝑗𝑖\displaystyle\mathcal{X}(\lambda):=\prod\limits_{i=1}^{m}\left(\sum\limits_{j=% 0}^{\min\{n_{1i},i\}}{n_{1i}\choose j}\lambda^{j/i}\right).caligraphic_X ( italic_λ ) := ∏ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT ( ∑ start_POSTSUBSCRIPT italic_j = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_min { italic_n start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT , italic_i } end_POSTSUPERSCRIPT ( binomial start_ARG italic_n start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT end_ARG start_ARG italic_j end_ARG ) italic_λ start_POSTSUPERSCRIPT italic_j / italic_i end_POSTSUPERSCRIPT ) . (3.12)

Then, the number of leaves ω𝜔\omegaitalic_ω is equal to the sum of the coefficients of terms with an exponent no greater than 1.

It is evident from Theorem 3.6 that Corollary 3.4 and Corollary 3.5 significantly enhance the efficiency of B&B algorithms in solving problems such as the 0-1 KP.

3.2 An improvement mutation operator and the upper bound of the mutation probability

As one of the most commonly used tools to enhance population diversity in various heuristic algorithms, the flip bit mutation operator has been widely applied due to its simplicity. Specifically, for Otk(j)superscriptsubscript𝑂𝑡𝑘𝑗O_{t}^{k}(j)italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ( italic_j ), the flip bit mutation compares the mutation probability pmsubscript𝑝𝑚p_{m}italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT (typically set as pm[0.001,0.01]subscript𝑝𝑚0.0010.01p_{m}\in[0.001,0.01]italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ∈ [ 0.001 , 0.01 ]) with a randomly generated number. If the generated random number is less than pmsubscript𝑝𝑚p_{m}italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT, the value of Otk(j)superscriptsubscript𝑂𝑡𝑘𝑗O_{t}^{k}(j)italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ( italic_j ) is flipped, changing 1 to 0 and 0 to 1. This procedure effectively increases the diversity of the population, enabling better exploration of the search space. To facilitate subsequent discussions, we first provide the mathematical representation of the flip bit mutation.

Otk(j)={1Otk(j),if rand<pm,Otk(j),otherwise,superscriptsubscript𝑂𝑡𝑘𝑗cases1superscriptsubscript𝑂𝑡𝑘𝑗if randsubscript𝑝𝑚superscriptsubscript𝑂𝑡𝑘𝑗otherwise\displaystyle O_{t}^{k}(j)=\begin{cases}1-O_{t}^{k}(j),&\text{if }\text{rand}<% p_{m},\\ O_{t}^{k}(j),&\text{otherwise},\end{cases}italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ( italic_j ) = { start_ROW start_CELL 1 - italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ( italic_j ) , end_CELL start_CELL if roman_rand < italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT , end_CELL end_ROW start_ROW start_CELL italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ( italic_j ) , end_CELL start_CELL otherwise , end_CELL end_ROW (3.13)

where rand is a randomly generated number within the interval [0,1]01[0,1][ 0 , 1 ].

From Theorem 3.6, it becomes evident that items positioned farther from the line formed by the break item and the origin are more likely to be selected in the optimal solution. Building upon this observation and combining it with a greedy algorithm, we propose the IMO. Given a mutation probability pmsubscript𝑝𝑚p_{m}italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT, the IMO initially employs the break item to partition the set of items into two parts. Mutation operations are then performed on items belonging to different parts based on the value of Otk(j)superscriptsubscript𝑂𝑡𝑘𝑗O_{t}^{k}(j)italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ( italic_j ). The mathematical representation of the IMO can be described using formula (3.14).

Otk(j)={1Otk(j),if ej>eb and rand<Otk(j)pm+(1Otk(j))(1pm),1Otk(j),if ejeb and rand<(1Otk(j))pm+Otk(j)(1pm),Otk(j),otherwise.superscriptsubscript𝑂𝑡𝑘𝑗cases1superscriptsubscript𝑂𝑡𝑘𝑗if subscript𝑒𝑗subscript𝑒𝑏 and randsuperscriptsubscript𝑂𝑡𝑘𝑗subscript𝑝𝑚1superscriptsubscript𝑂𝑡𝑘𝑗1subscript𝑝𝑚1superscriptsubscript𝑂𝑡𝑘𝑗if subscript𝑒𝑗subscript𝑒𝑏 and rand1superscriptsubscript𝑂𝑡𝑘𝑗subscript𝑝𝑚superscriptsubscript𝑂𝑡𝑘𝑗1subscript𝑝𝑚superscriptsubscript𝑂𝑡𝑘𝑗otherwise\displaystyle O_{t}^{k}(j)=\begin{cases}1-O_{t}^{k}(j),&\text{if }e_{j}>e_{b}% \text{ and }\text{rand}<O_{t}^{k}(j)p_{m}+(1-O_{t}^{k}(j))(1-p_{m}),\\ 1-O_{t}^{k}(j),&\text{if }e_{j}\leq e_{b}\text{ and }\text{rand}<(1-O_{t}^{k}(% j))p_{m}+O_{t}^{k}(j)(1-p_{m}),\\ O_{t}^{k}(j),&\text{otherwise}.\end{cases}italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ( italic_j ) = { start_ROW start_CELL 1 - italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ( italic_j ) , end_CELL start_CELL if italic_e start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT > italic_e start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT and roman_rand < italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ( italic_j ) italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT + ( 1 - italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ( italic_j ) ) ( 1 - italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ) , end_CELL end_ROW start_ROW start_CELL 1 - italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ( italic_j ) , end_CELL start_CELL if italic_e start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ≤ italic_e start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT and roman_rand < ( 1 - italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ( italic_j ) ) italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT + italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ( italic_j ) ( 1 - italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ) , end_CELL end_ROW start_ROW start_CELL italic_O start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ( italic_j ) , end_CELL start_CELL otherwise . end_CELL end_ROW (3.14)

For a given mutation probability pmsubscript𝑝𝑚p_{m}italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT, the IMO yields an expectation of pmn1isubscript𝑝𝑚subscript𝑛1𝑖p_{m}n_{1i}italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT italic_n start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT for the number of unselected items in each set N1isubscript𝑁1𝑖N_{1i}italic_N start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT. However, due to the constraint imposed by Corollary 3.3, there is a maximum limit on the number of unselected items among all sets. Solutions that do not meet this constraint cannot be considered optimal. Consequently, we can further reduce the ineffective search space of the mutation operator and determine an upper bound for the mutation probability pmsubscript𝑝𝑚p_{m}italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT.

Theorem 3.7.

Let p¯msubscriptnormal-¯𝑝𝑚\overline{p}_{m}over¯ start_ARG italic_p end_ARG start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT denote the upper bound of the mutation probability, which can be expressed as:

p¯m=min{1j=1n1/hj,1j=1n1/lj}.subscript¯𝑝𝑚1superscriptsubscript𝑗1𝑛1subscript𝑗1superscriptsubscript𝑗1𝑛1subscript𝑙𝑗\displaystyle\overline{p}_{m}=\min\{\frac{1}{\sum_{j=1}^{n}1/h_{j}},\frac{1}{% \sum_{j=1}^{n}1/l_{j}}\}.over¯ start_ARG italic_p end_ARG start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT = roman_min { divide start_ARG 1 end_ARG start_ARG ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT 1 / italic_h start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG , divide start_ARG 1 end_ARG start_ARG ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT 1 / italic_l start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG } . (3.15)
Proof.

Without loss of generality, we assume that 0<1j=1n1/hj1j=1n1/lj01superscriptsubscript𝑗1𝑛1subscript𝑗1superscriptsubscript𝑗1𝑛1subscript𝑙𝑗0<\frac{1}{\sum_{j=1}^{n}1/h_{j}}\leq\frac{1}{\sum_{j=1}^{n}1/l_{j}}0 < divide start_ARG 1 end_ARG start_ARG ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT 1 / italic_h start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG ≤ divide start_ARG 1 end_ARG start_ARG ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT 1 / italic_l start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG. It is evident that, according to Corollary 3.3 and Corollary 3.4, if p¯m>1j=1n1/hjsubscript¯𝑝𝑚1superscriptsubscript𝑗1𝑛1subscript𝑗\overline{p}_{m}>\frac{1}{\sum_{j=1}^{n}1/h_{j}}over¯ start_ARG italic_p end_ARG start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT > divide start_ARG 1 end_ARG start_ARG ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT 1 / italic_h start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG, the individual processed by the IMO cannot be an optimal solution in terms of mathematical expectation. The proof of Theorem 3.7 is thereby established. ∎

In the past decades, numerous researchers have conducted extensive research on the optimal mutation probability for convex functions or 𝒫𝒫\mathscr{P}script_P problems such as OneMax and LeadingOnes. The commonly obtained value for this probability is 1n1𝑛\frac{1}{n}divide start_ARG 1 end_ARG start_ARG italic_n end_ARG [18, 19, 20]. These studies have led to the following conclusions.

Proposition 1.

[25] With the increase in problem size, the mutation probability gradually tends to 0, i.e.,

limnpm=0.subscript𝑛subscript𝑝𝑚0\displaystyle\lim\limits_{n\rightarrow\infty}p_{m}=0.roman_lim start_POSTSUBSCRIPT italic_n → ∞ end_POSTSUBSCRIPT italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT = 0 .

As the problem size increases, it becomes apparent that for a general instance of the 0-1 KP, the number of items with profit density greater than the break item also increases. This raises an intriguing question of whether the upper bound p¯msubscript¯𝑝𝑚\overline{p}_{m}over¯ start_ARG italic_p end_ARG start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT for the mutation probability in the IMO gradually tends to 0.

Theorem 3.8.

Let R𝑅Ritalic_R be a constant such that R+𝑅superscriptR\in\mathbb{N}^{+}italic_R ∈ blackboard_N start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT and consider an instance ��𝒫\mathscr{P}script_P of the 0-1 KP with pj,wj{1,2,,R}subscript𝑝𝑗subscript𝑤𝑗12normal-…𝑅p_{j},w_{j}\in\{1,2,\dots,R\}italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ∈ { 1 , 2 , … , italic_R }. We can conclude that

limnp¯m=0.subscript𝑛subscript¯𝑝𝑚0\displaystyle\lim\limits_{n\rightarrow\infty}\overline{p}_{m}=0.roman_lim start_POSTSUBSCRIPT italic_n → ∞ end_POSTSUBSCRIPT over¯ start_ARG italic_p end_ARG start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT = 0 .
Proof.

For an instance of the 0-1 KP, without loss of generality, let the profit density of the break item be constant at pb/wbsubscript𝑝𝑏subscript𝑤𝑏p_{b}/w_{b}italic_p start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT / italic_w start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT, and let the residual capacity r𝑟ritalic_r also be constant. The knapsack capacity C𝐶Citalic_C can then be expressed as:

C=j=1b1wj+r.𝐶superscriptsubscript𝑗1𝑏1subscript𝑤𝑗𝑟\displaystyle C=\sum\limits_{j=1}^{b-1}w_{j}+r.italic_C = ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_b - 1 end_POSTSUPERSCRIPT italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT + italic_r .

Consider the item set N1subscript𝑁1N_{1}italic_N start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT consisting of items with profit density greater than the break item b𝑏bitalic_b, and let R𝑅Ritalic_R be any constant. It is evident that there exists a constant M+𝑀superscriptM\in\mathbb{N}^{+}italic_M ∈ blackboard_N start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT such that m<M𝑚𝑀m<Mitalic_m < italic_M and N1subscript𝑁1N_{1}italic_N start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT can be expressed as the disjoint union of sets N1isubscript𝑁1𝑖N_{1i}italic_N start_POSTSUBSCRIPT 1 italic_i end_POSTSUBSCRIPT, as defined in Definition 5. In other words, for any jN1𝑗subscript𝑁1j\in N_{1}italic_j ∈ italic_N start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT, we have:

pjwj+r/M>pbwb.subscript𝑝𝑗subscript𝑤𝑗𝑟𝑀subscript𝑝𝑏subscript𝑤𝑏\displaystyle\frac{p_{j}}{w_{j}+r/M}>\frac{p_{b}}{w_{b}}.divide start_ARG italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG start_ARG italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT + italic_r / italic_M end_ARG > divide start_ARG italic_p start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT end_ARG start_ARG italic_w start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT end_ARG .

Furthermore, by leveraging Corollary 3.4, we are able to acquire a vector H𝐻Hitalic_H that is associated with the given instance. Consequently, the following conclusion can be derived:

limnp¯mlimn1j=1n1/hjlimnMn=0.subscript𝑛subscript¯𝑝𝑚subscript𝑛1superscriptsubscript𝑗1𝑛1subscript𝑗subscript𝑛𝑀𝑛0\displaystyle\lim\limits_{n\rightarrow\infty}\overline{p}_{m}\leq\lim\limits_{% n\rightarrow\infty}\frac{1}{\sum_{j=1}^{n}1/h_{j}}\leq\lim\limits_{n% \rightarrow\infty}\frac{M}{n}=0.roman_lim start_POSTSUBSCRIPT italic_n → ∞ end_POSTSUBSCRIPT over¯ start_ARG italic_p end_ARG start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ≤ roman_lim start_POSTSUBSCRIPT italic_n → ∞ end_POSTSUBSCRIPT divide start_ARG 1 end_ARG start_ARG ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT 1 / italic_h start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG ≤ roman_lim start_POSTSUBSCRIPT italic_n → ∞ end_POSTSUBSCRIPT divide start_ARG italic_M end_ARG start_ARG italic_n end_ARG = 0 .

The proof of Theorem 3.8 is complete. ∎

It is notable that when the profit and weight values of items in a 0-1 KP instance are limited, as exemplified in Theorem 3.8, the instance can be solved within O(Rn2)𝑂𝑅superscript𝑛2O(Rn^{2})italic_O ( italic_R italic_n start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) time complexity by Dynamic Programming[8, 9]. Since R𝑅Ritalic_R is a constant, the instance belongs to the 𝒫𝒫\mathcal{P}caligraphic_P complexity class. Consequently, it becomes an interesting question of whether the upper bound of the mutation operator in the IMO still tends towards 0 when the instance of the 0-1 KP no longer limits the profit and weight values of items.

Theorem 3.9.

For any given constant θ𝜃\thetaitalic_θ within the interval (0,1)01(0,1)( 0 , 1 ), it is feasible to construct an instance of the 0-1 KP with pj,wj+subscript𝑝𝑗subscript𝑤𝑗superscriptp_{j},w_{j}\in\mathbb{N}^{+}italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ∈ blackboard_N start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT, and we have

limnp¯m=θ.subscript𝑛subscript¯𝑝𝑚𝜃\displaystyle\lim\limits_{n\rightarrow\infty}\overline{p}_{m}=\theta.roman_lim start_POSTSUBSCRIPT italic_n → ∞ end_POSTSUBSCRIPT over¯ start_ARG italic_p end_ARG start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT = italic_θ .
Proof.

Following the same approach as the proof of Theorem 3.8, we assume that the profit density of the break item remains constant at pb/wbsubscript𝑝𝑏subscript𝑤𝑏p_{b}/w_{b}italic_p start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT / italic_w start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT, while maintaining a constant residual capacity r𝑟ritalic_r. Furthermore, the knapsack capacity C𝐶Citalic_C must satisfy

C=j=1b1wj+r.𝐶superscriptsubscript𝑗1𝑏1subscript𝑤𝑗𝑟\displaystyle C=\sum\limits_{j=1}^{b-1}w_{j}+r.italic_C = ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_b - 1 end_POSTSUPERSCRIPT italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT + italic_r .

Without loss of generality, we set θ=12𝜃12\theta=\frac{1}{2}italic_θ = divide start_ARG 1 end_ARG start_ARG 2 end_ARG. Next, we construct the profit and weight of items in the instance, ensuring that their profit densities exceed that of the break item. Let pjsubscript𝑝𝑗p_{j}italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT and wjsubscript𝑤𝑗w_{j}italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT represent the profit and weight, respectively, of the j𝑗jitalic_j-th item, satisfying

rpbpjwbpbwj+1=2j1.𝑟subscript𝑝𝑏subscript𝑝𝑗subscript𝑤𝑏subscript𝑝𝑏subscript𝑤𝑗1superscript2𝑗1\displaystyle\lfloor\frac{rp_{b}}{p_{j}w_{b}-p_{b}w_{j}}\rfloor+1=2^{j-1}.⌊ divide start_ARG italic_r italic_p start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT end_ARG start_ARG italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_w start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT - italic_p start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG ⌋ + 1 = 2 start_POSTSUPERSCRIPT italic_j - 1 end_POSTSUPERSCRIPT . (3.16)

In other words, we have hj=2j1subscript𝑗superscript2𝑗1h_{j}=2^{j-1}italic_h start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = 2 start_POSTSUPERSCRIPT italic_j - 1 end_POSTSUPERSCRIPT. Since pj,wj+subscript𝑝𝑗subscript𝑤𝑗superscriptp_{j},w_{j}\in\mathbb{N}^{+}italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ∈ blackboard_N start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT, it is clear that there exist pjsubscript𝑝𝑗p_{j}italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT and wjsubscript𝑤𝑗w_{j}italic_w start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT that satisfy equality (3.16). Consequently, we can conclude that

limnp¯m=limn1j=1n1/hj=12=θ.subscript𝑛subscript¯𝑝𝑚subscript𝑛1superscriptsubscript𝑗1𝑛1subscript𝑗12𝜃\displaystyle\lim\limits_{n\rightarrow\infty}\overline{p}_{m}=\lim\limits_{n% \rightarrow\infty}\frac{1}{\sum_{j=1}^{n}1/h_{j}}=\frac{1}{2}=\theta.roman_lim start_POSTSUBSCRIPT italic_n → ∞ end_POSTSUBSCRIPT over¯ start_ARG italic_p end_ARG start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT = roman_lim start_POSTSUBSCRIPT italic_n → ∞ end_POSTSUBSCRIPT divide start_ARG 1 end_ARG start_ARG ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT 1 / italic_h start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG = divide start_ARG 1 end_ARG start_ARG 2 end_ARG = italic_θ .

The proof of Theorem 3.9 is complete. ∎

3.3 Comparison analysis

In 𝒩𝒫𝒩𝒫\mathcal{NP}caligraphic_N caligraphic_P-hard problems, given two solutions 𝐗superscript𝐗\textbf{X}^{\prime}X start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT and 𝐗′′superscript𝐗′′\textbf{X}^{\prime\prime}X start_POSTSUPERSCRIPT ′ ′ end_POSTSUPERSCRIPT, even if 𝐗𝐘<𝐗′′𝐘normsuperscript𝐗𝐘normsuperscript𝐗′′𝐘\|\textbf{X}^{\prime}-\textbf{Y}\|<\|\textbf{X}^{\prime\prime}-\textbf{Y}\|∥ X start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - Y ∥ < ∥ X start_POSTSUPERSCRIPT ′ ′ end_POSTSUPERSCRIPT - Y ∥, we cannot conclude that f(𝐗)f(𝐗′′)𝑓superscript𝐗𝑓superscript𝐗′′f(\textbf{X}^{\prime})\geq f(\textbf{X}^{\prime\prime})italic_f ( X start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ≥ italic_f ( X start_POSTSUPERSCRIPT ′ ′ end_POSTSUPERSCRIPT ), where Y is the optimal solution. In other words, solutions with a smaller Hamming distance to the optimal solution Y do not necessarily dominate solutions with a larger Hamming distance to Y. Therefore, the existing convergence analysis of GAs is limited to specific scenarios. For a general instance of 𝒩𝒫𝒩𝒫\mathcal{NP}caligraphic_N caligraphic_P-hard problems, the expected runtimes of the GA often exceed O(2n)𝑂superscript2𝑛O(2^{n})italic_O ( 2 start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT ), which lacks strong theoretical guarantees for the performance.

To demonstrate the performance of the IMO, we evaluate the algorithm by considering the probability τ(Algorithm)𝜏Algorithm\tau(\text{Algorithm})italic_τ ( Algorithm ) of hitting the optimal solution Y from the zero vector 0 within a single iteration for a given instance 𝒫𝒫\mathscr{P}script_P. This evaluation is conducted to assess the performance of the IMO and the MO on general large-scale instances.

Example 3.10.

Given an instance 𝒫𝒫\mathscr{P}script_P of the 0-1 KP with n𝑛nitalic_n items, let λ1subscript𝜆1\lambda_{1}italic_λ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and λ2subscript𝜆2\lambda_{2}italic_λ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT denote the number of selected and unselected items within the first b1𝑏1b-1italic_b - 1 items, respectively. Similarly, λ3subscript𝜆3\lambda_{3}italic_λ start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT and λ4subscript𝜆4\lambda_{4}italic_λ start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT represent the number of selected and unselected items after the (b1)𝑏1(b-1)( italic_b - 1 )-th item, respectively. We have λ1=j=1b1|yj1|subscript𝜆1superscriptsubscript𝑗1𝑏1subscript𝑦𝑗1\lambda_{1}=\sum\limits_{j=1}^{b-1}|y_{j}-1|italic_λ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_b - 1 end_POSTSUPERSCRIPT | italic_y start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT - 1 |, λ2=j=1b1|yj|subscript𝜆2superscriptsubscript𝑗1𝑏1subscript𝑦𝑗\lambda_{2}=\sum\limits_{j=1}^{b-1}|y_{j}|italic_λ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_b - 1 end_POSTSUPERSCRIPT | italic_y start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT |, λ3=j=bn|yj|subscript𝜆3superscriptsubscript𝑗𝑏𝑛subscript𝑦𝑗\lambda_{3}=\sum\limits_{j=b}^{n}|y_{j}|italic_λ start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_j = italic_b end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT | italic_y start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT |, and λ4=j=bn|yj1|subscript𝜆4superscriptsubscript𝑗𝑏𝑛subscript𝑦𝑗1\lambda_{4}=\sum\limits_{j=b}^{n}|y_{j}-1|italic_λ start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_j = italic_b end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT | italic_y start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT - 1 |.

Theorem 3.11.

If λ1<λ2subscript𝜆1subscript𝜆2\lambda_{1}<\lambda_{2}italic_λ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT < italic_λ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT and pm<0.5subscript𝑝𝑚0.5p_{m}<0.5italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT < 0.5, then τ(IMO)>τ(MO)𝜏normal-IMO𝜏normal-MO\tau({\rm IMO})>\tau({\rm MO})italic_τ ( roman_IMO ) > italic_τ ( roman_MO ).

Proof.
τ(IMO)τ(MO\displaystyle\frac{\tau({\rm IMO})}{\tau({\rm MO}}divide start_ARG italic_τ ( roman_IMO ) end_ARG start_ARG italic_τ ( roman_MO end_ARG =(1pm)λ1pmλ2pmλ3(1pm)λ4pmλ1+λ3(1pm)λ2+λ4absentsuperscript1subscript𝑝𝑚subscript𝜆1superscriptsubscript𝑝𝑚subscript𝜆2superscriptsubscript𝑝𝑚subscript𝜆3superscript1subscript𝑝𝑚subscript𝜆4superscriptsubscript𝑝𝑚subscript𝜆1subscript𝜆3superscript1subscript𝑝𝑚subscript𝜆2subscript𝜆4\displaystyle=\frac{(1-p_{m})^{\lambda_{1}}p_{m}^{\lambda_{2}}p_{m}^{\lambda_{% 3}}(1-p_{m})^{\lambda_{4}}}{p_{m}^{\lambda_{1}+\lambda_{3}}(1-p_{m})^{\lambda_% {2}+\lambda_{4}}}= divide start_ARG ( 1 - italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT italic_λ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_λ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_λ start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ( 1 - italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT italic_λ start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_ARG start_ARG italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_λ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_λ start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ( 1 - italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT italic_λ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT + italic_λ start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_ARG
=pmλ2+λ3(1pm)λ1+λ4pmλ1+λ3(1pm)λ2+λ4absentsuperscriptsubscript𝑝𝑚subscript𝜆2subscript𝜆3superscript1subscript𝑝𝑚subscript𝜆1subscript𝜆4superscriptsubscript𝑝𝑚subscript𝜆1subscript𝜆3superscript1subscript𝑝𝑚subscript𝜆2subscript𝜆4\displaystyle=\frac{p_{m}^{\lambda_{2}+\lambda_{3}}(1-p_{m})^{\lambda_{1}+% \lambda_{4}}}{p_{m}^{\lambda_{1}+\lambda_{3}}(1-p_{m})^{\lambda_{2}+\lambda_{4% }}}= divide start_ARG italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_λ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT + italic_λ start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ( 1 - italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT italic_λ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_λ start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_ARG start_ARG italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_λ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_λ start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ( 1 - italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT italic_λ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT + italic_λ start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_ARG
=(1pmpm)λ2λ1absentsuperscript1subscript𝑝𝑚subscript𝑝𝑚subscript𝜆2subscript𝜆1\displaystyle=(\frac{1-p_{m}}{p_{m}})^{\lambda_{2}-\lambda_{1}}= ( divide start_ARG 1 - italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT end_ARG start_ARG italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT end_ARG ) start_POSTSUPERSCRIPT italic_λ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT - italic_λ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT
>1.absent1\displaystyle>1.> 1 .

This completes the proof of Theorem 3.11. ∎

Theorem 3.11 demonstrates that when the number of items in the optimal solution exceeds the number of items not in the optimal solution before the break item, the probability of the IMO hitting the optimal solution in a single iteration is higher, which aligns with empirical experimental data. Moreover, since GAs are primarily employed for solving large-scale 𝒩𝒫𝒩𝒫\mathcal{NP}caligraphic_N caligraphic_P-hard problems, which rarely exhibit the scenario of Example 3.10 with λ1λ2subscript𝜆1subscript𝜆2\lambda_{1}\geq\lambda_{2}italic_λ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ≥ italic_λ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT, and the mutation probability pmsubscript𝑝𝑚p_{m}italic_p start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT does not exceed 0.50.50.50.5, the IMO generally outperforms the MO in large-scale instances.

4 Conclusion and future work

Heuristic algorithms with parameters exhibit significant performance differences depending on the value of the parameters. Due to limitations in mathematical tools, the problem of finding the optimal parameters for heuristic algorithms has received extensive attention in recent decades. Based on the assumption that 𝒩𝒫𝒫𝒩𝒫𝒫\mathcal{NP}\neq\mathcal{P}caligraphic_N caligraphic_P ≠ caligraphic_P, we propose a novel reduction method and apply it to compute the upper bound of the mutation probability.

Furthermore, our method not only proves that in the 0-1 KP, when the weight and profit of items are limited, the mutation probability tends toward 0, but also demonstrates that the mutation probability can tend toward a constant within the open interval (0,1)01(0,1)( 0 , 1 ) when the weight and profit values of items are unrestricted.

For future work, we can approach it from two perspectives. Firstly, the upper bound of the method deserves further improvement. Our research only utilized the linear relaxation technique, and it would be worthwhile to explore better upper bound computation methods, such as the Lagrangian relaxation technique. Secondly, Dey et al. have demonstrated that such reduction methods can apply to not only one-dimensional cases but also multidimensional problems. Therefore, the application scope of these reduction methods can be further expanded, such as multidimensional knapsack problems.

References

  • [1] S. Martello and P. Toth, Knapsack Problems: Algorithms and Computer Implementations, John Wiley & Sons, Chichester, 1990.
  • [2] H. Kellerer, U. Pferschy, and D. Pisinger, Knapsack Problems, Springer-Verlag, Berlin, Heidelberg, 2004.
  • [3] V. Cacchiani, M. Iori, A. Locatelli, and S. Martello, “Knapsack problems- An overview of recent advances. Part I: Single knapsack problems,” Computers & Operations Research, vol. 143, pp. 105692, 2022.
  • [4] V. Cacchiani, M. Iori, A. Locatelli, and S. Martello, “Knapsack problems-An overview of recent advances. Part II: Multiple, multidimensional, and quadratic knapsack problems,” Computers & Operations Research, vol. 143, pp. 105693, 2022.
  • [5] S. A. Cook, “The complexity of theorem-proving procedures,” in Proceedings of the 3rd Annual ACM Symposium on Theory of Computing(STOC), ACM, New York, NY, USA, 1971, pp. 151-158.
  • [6] R. M. Karp, “Reducibility among Combinatorial Problems,” in Complexity of Computer Computations, R. E. Miller, J. W. Thatcher, and J. D. Bohlinger, Eds. Boston, USA: Springer, 1972, pp. 85-103.
  • [7] M. R. Garey and D. S. Johnson, Computer and Intractablility: A Guide to the Theory of NP-Completeness, Freeman, San Francisco, CA, 1979.
  • [8] R. Bellman, Dynamic programming, Princeton University Press, Princeton, 1957.
  • [9] P. Toth, “Dynamic programming algorithms for the Zero-One Knapsack Problem,” Computing, vol. 25, pp. 29-45, 1980.
  • [10] E. Horowitz and S. Sahni, “Computing Partitions with Applications to the Knapsack Problem,” Journal of the ACM, vol. 21, no. 2, pp. 277-292, 1974.
  • [11] R. Schroeppel and A. Shamir, “A T=O(2n/2)𝑇𝑂superscript2𝑛2T=O(2^{n/2})italic_T = italic_O ( 2 start_POSTSUPERSCRIPT italic_n / 2 end_POSTSUPERSCRIPT ), S=O(2n/4)𝑆𝑂superscript2𝑛4S=O(2^{n/4})italic_S = italic_O ( 2 start_POSTSUPERSCRIPT italic_n / 4 end_POSTSUPERSCRIPT ) algorithm for certain NP-complete problems,” SIAM Journal on Computing, vol. 10, no. 3, pp. 456-464, 1981.
  • [12] M. V. Sapir, J.-C. Birget, and E. Rips, “Isoperimetric and isodiametric functions of groups,” Annals of Mathematics, vol. 156, no. 2, pp. 345-466, 2002.
  • [13] I. Dinur and S. Safra, “On the hardness of approximating vertex cover,” Annals of Mathematics, vol. 162, no. 1, pp. 439-485, 2005.
  • [14] J. H. Holland, Adaptation in Natural and Artificial Systems, MIT Press, 1975.
  • [15] H. Zhang, J. Xie, J. Ge, Z. Zhang, and B. Zong, “A hybrid adaptively genetic algorithm for task scheduling problem in the phased array radar,” European Journal of Operational Research, vol. 272, no. 3, pp. 868-878, 2019.
  • [16] Y. Nagata and S. Kobayashi, “A Powerful Genetic Algorithm Using Edge Assembly Crossover for the Traveling Salesman Problem,” INFORMS Journal on Computing, vol. 25, no. 2, pp. 346-363, 2012.
  • [17] T. Vidal, T. G. Crainic, M. Gendreau, N. Lahrichi, and W. Rei, “A Hybrid Genetic Algorithm for Multidepot and Periodic Vehicle Routing Problems,” Operations Research, vol. 60, no. 3, pp. 611-624, 2012.
  • [18] Y. Zhou and J. He, “A Runtime Analysis of Evolutionary Algorithms for Constrained Optimization Problems,” IEEE Transactions on Evolutionary Computation, vol. 11, no. 5, pp. 608-619, 2007.
  • [19] C. Qian, C. Bian, W. Jiang, and K. Tang, “Running Time Analysis of the (1 + 1)-EA for OneMax and LeadingOnes Under Bit-Wise Noise,” Algorithmica, vol. 81, pp. 749-795, 2019.
  • [20] W. Zheng, H. Chen, and X. Yao, “Analysis of Evolutionary Algorithms on Fitness Function with Time-linkage Property,” IEEE Transactions on Evolutionary Computation, vol. 25, no. 4, pp. 696-709, 2021.
  • [21] A. E. Eiben, E. H. L. Aarts, and K. M. Van Hee, “Global convergence of genetic algorithms: A markov chain analysis,” in Parallel Problem Solving from Nature. PPSN 1990, H.-P. Schwefel and R. Manner, Eds. Springer, Berlin, Heidelberg, 1991, pp. 507-516.
  • [22] G. Rudolph, “Convergence analysis of canonical genetic algorithms,” IEEE Transactions on Neural Networks, vol. 5, no. 1, pp. 96-101, 1994.
  • [23] G. Rudolph, “Finite Markov Chain Results in Evolutionary Computation: A Tour d’Horizon,” Fundamenta Informaticae, vol. 35, no. 1-4, pp. 67-89, 1998.
  • [24] G. Karafotias, M. Hoogendoorn, and A. E. Eiben, “Parameter Control in Evolutionary Algorithms: Trends and Challenges,” IEEE Transactions on Evolutionary Computation, vol. 19, no. 2, pp. 167-187, 2015.
  • [25] J. Hesser and R. Manner, “Towards an optimal mutation probability for genetic algorithms,” in Proceedings of the 1st Conference on Parallel Problem Solving from Nature, Springer, Dortmund, Germany, 1991, pp. 23-32.
  • [26] F. G. Lobo, C. F. Lima, and Z. Michalewicz, Parameter Setting in Evolutionary Algorithms, Springer-Verlag Berlin Heidelberg, 2007.
  • [27] O. Kramer, Genetic Algorithm Essentials, Springer, 2017.
  • [28] Z. Michalewicz, “Ubiquity symposium: Evolutionary computation and the processes of life: the emperor is naked: evolutionary algorithms for real-world applications,” Ubiquity, vol. 2012, no. November, pp. 1-13.
  • [29] Z. Michalewicz, “Quo Vadis, Evolutionary Computation?” in Advances in Computational Intelligence, J. Liu, C. Alippi, B. Bouchon-Meunier, G. W. Greenwood, and H. A. Abbass, Eds., vol. 7311, Springer, Berlin, Heidelberg, 2012.
  • [30] K. Sorensen, “Metaheuristics-the metaphor exposed,” International Transactions in Operational Research, vol. 22, pp. 3-18, 2015.
  • [31] D. H. Wolpert and W. G. Macready, “No free lunch theorems for optimization,” IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67-82, Apr. 1997.
  • [32] J. C. Culberson, “On the futility of blind search: An algorithmic view of ’No Free Lunch’,” Evolutionary Computation, vol. 6, no. 2, pp. 109-127, Jun. 1998.
  • [33] Y. C. Ho, Q. C. Zhao, and D. L. Pepyne, “The No Free Lunch Theorems: Complexity and Security,” IEEE Transactions on Automatic Control, vol. 48, no. 5, pp. 783-795, May 2003.
  • [34] G. J. Koehler, “Conditions that Obviate the No-Free-Lunch Theorems for Optimization,” INFORMS Journal on Computing, vol. 19, no. 2, pp. 273-279, May 2007.
  • [35] A. H. Land and A. G. Doig, “An automatic method of solving discrete programming problem,” Econometrica, vol. 28, no. 3, pp. 497-520, Jul. 1960.
  • [36] S. S. Dey, Y. Dubey, and M. Molinaro, “Branch-and-bound solves random binary IPs in poly(n𝑛nitalic_n)-time,” Mathematical Programming, vol. 200, pp. 569-587, 2023. Accepted for publication in SODA 2021.
  • [37] R. A. Brualdi, Introductory Combinatorics, Pearson, Fifth Edition, 2017.
  • [38] E. Y. Jin, “Symmetric generating functions and Euler-Stirling statistics on permutations,” Journal of Combinatorial Theory, Series A, vol. 197, p. 105752, Mar. 2023.
  • [39] D. Pisinger, “An expanding-core algorithm for the exact 0-1 Knapsack Problem,” European Journal of Operational Research, vol. 87, no. 1, pp. 175-187, Nov. 1995.
  • [40] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004.
  • [41] G. B. Dantzig, “Discrete-Variable Extremum Problems,” Operations Research, vol. 5, no. 2, pp. 266-288, Apr. 1957.
  • [42] E. Balas and E. Zemel, “An Algorithm for the Large Zero-One Knapsack Problems,” Operations Research, vol. 28, no. 5, pp. 1130-1154, Sep. 1980.
  • [43] D. Pisinger, “A Minimal Algorithm for the 0-1 Knapsack Problem,” Operations Research, vol. 46, no. 5, pp. 758-767, Oct. 1997.
  • [44] S. Martello, D. Pisinger, and P. Toth, “Dynamic programming and strong bounds for the 0-1 knapsack problem,” Management Science, vol. 45, no. 3, pp. 414-424, Mar. 1999.
  • [45] S. Martello, D. Pisinger, and P. Toth, “New trends in exact algorithms for the 0-1 knapsack problem,” European Journal of Operational Research, vol. 123, no. 2, pp. 325-332, Jun. 2000.
  • [46] X. Wang and Y. He, “Evolutionary Algorithms for Knapsack Problems,” Journal of Software, vol. 28, no. 1, pp. 1-16, Jan. 2017. (in Chinese)
  • [47] L. M. Schmitt, “Theory of genetic algorithms,” Theoretical Computer Science, vol. 259, no. 1-2, pp. 1-61, Nov. 2001.
  • [48] M. Mitchell, An Introduction to Genetic Algorithms, MIT Press, 1996.
  • [49] A. E. Eiben, R. Hinterding, and Z. Michalewicz, “Parameter Control in Evolutionary Algorithms,” IEEE Transactions on Evolutionary Computation, vol. 3, no. 2, pp. 124-141, Jul. 1999.
  • [50] B. Chopard and M. Tomassini, An Introduction to Metaheuristics for Optimization, Springer, 2018.
  • [51] M. Gendreau and J. Y. Potvin, Handbook of Metaheuristics, Springer, 2nd Edition, 2010.