Necessary and Sufficient Conditions of ε-Separability
Maria A. Ivanchuk^{1}, Igor V. Malyk^{2}
^{1}Department of Biological Physics and Medical Informatics, Bukovinian State Medical University, Chernivtsi, Ukraine
^{2}Department of System Analysis and Insurance and Financial Mathematics, Yuriy Fedkovych Chernivtsi National University, Chernivtsi, Ukraine
To cite this article:
Maria A. Ivanchuk, Igor V. Malyk. Necessary and Sufficient Conditions of ε-Separability. International Journal of Theoretical and Applied Mathematics, Vol. 2, No. 2, 2016, pp. 28-34. doi: 10.11648/j.ijtam.20160202.11
Received: September 13, 2016; Accepted: October 28, 2016; Published: November 21, 2016
Abstract: In this paper we propose a new approach to solving the classification problem, based on the theory of ε-nets. It is shown that to ε-separate two sets one can use their ε-nets in the range space with respect to halfspaces, which considerably reduces the complexity of the separating algorithm for large set sizes. The separation space, which contains the possible values of ε for the ε-nets of both sets, is considered. In the general case the separation space is quasi-convex. To check the necessary and sufficient conditions of ε-separability of two sets, one can solve an optimisation problem using the separation space as constraints. The lower bound of the separation space is convex for the exponential distribution and linear for the uniform distribution, so in these cases we obtain convex and linear optimisation problems, respectively.
Keywords: ε-nets, Separability, Quasiconvexity
1. Introduction
In 1987 D. Haussler and E. Welzl [1] introduced ε-nets. Since then, ε-nets have been used in computational and combinatorial geometry.
Let X be a (possibly infinite) set and R a family of subsets of X. The pair (X, R) is called a range space, with the elements of X its points and the elements of R its ranges. Typical examples of range spaces in a geometric context are the set of halfspaces in R^{d}, the set of balls in R^{d}, the set of convex sets in R^{d}, axis-parallel rectangles in the plane [2], the set of d-dimensional simplices in R^{d} [3], triangles in the plane [4], and the set of α-fat wedges in the plane [5].
Numerous works study ε-nets of a single set [2-6]. B. Aronov et al. showed the existence of ε-nets of size O((1/ε) log log(1/ε)) for planar point sets and axis-parallel rectangular ranges [2]. J. Kulkarni et al. obtained small ε-nets for α-fat wedges in the plane [5]. B. Gärtner proved the existence of ε-nets of size O((d/ε) log(1/ε)), where d is the Vapnik-Chervonenkis dimension [3]. J. Matoušek et al. showed that disks and pseudo-disks in the plane, as well as halfspaces in R^{3}, allow ε-nets of size only O(1/ε), which is best possible up to a multiplicative constant [6]. Unfortunately, the analogous questions for higher-dimensional spaces remain open.
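As a toy illustration of these size bounds, consider the one-dimensional case with halfline ("threshold") ranges: taking every ⌊εn⌋-th point of the sorted order gives an ε-net of size about 1/ε. The sketch below is our own construction and naming, not taken from the cited works; it builds such a net and verifies the ε-net property for interval ranges.

```python
import random

def epsilon_net_1d(points, eps):
    """Deterministic eps-net for threshold/interval ranges on the line:
    every range containing at least eps*n points must contain a net point.
    Taking every floor(eps*n)-th point of the sorted order suffices."""
    pts = sorted(points)
    n = len(pts)
    step = max(1, int(eps * n))
    # ranks step-1, 2*step-1, ...: any eps*n consecutive sorted points
    # include exactly one of these ranks
    return pts[step - 1::step]

def is_eps_net_for_intervals(points, net, eps):
    """Check the eps-net property for interval ranges."""
    pts = sorted(points)
    n = len(pts)
    need = int(eps * n)
    netset = set(net)
    # every window of `need` consecutive sorted points is an interval
    # range with >= eps*n points; it must be hit by the net
    for i in range(n - need + 1):
        if not any(p in netset for p in pts[i:i + need]):
            return False
    return True

random.seed(0)
sample = [random.random() for _ in range(1000)]
net = epsilon_net_1d(sample, 0.1)
print(len(net))                                  # 10, i.e. about 1/eps
print(is_eps_net_for_intervals(sample, net, 0.1))  # True
```

The one-dimensional case admits this simple deterministic net because sorted order linearises all ranges; in higher dimensions one falls back on random sampling or the cited constructions.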
In this paper we build ε-nets of two sets in order to solve the classification problem.
Consider two sets in the Euclidean space R^{d}, the first containing m points and the second containing n points. Denote by conv(·) the convex hull of a set.
Definition 1. The sets are called ε-separable if there exist subsets of them such that
(1)
and
(2)
The classification problem consists in finding a hyperplane which separates the space into two halfspaces, one assigned to each set.
Definition 2. A hyperplane is called ε-separating for the sets if
.
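If Definition 2 is read as requiring that at most an ε-fraction of the points of the two sets fall on the wrong side of the hyperplane, the check is a direct count. A minimal sketch under that reading (the sets, the hyperplane, and all names are ours, chosen for illustration):

```python
def misclassified_fraction(A, B, w, b):
    """Fraction of points of A and B on the wrong side of the hyperplane
    {x : <w, x> + b = 0}; A is expected in the non-positive halfspace,
    B in the positive one."""
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    bad = sum(dot(w, a) + b > 0 for a in A) + sum(dot(w, x) + b <= 0 for x in B)
    return bad / (len(A) + len(B))

def is_eps_separating(A, B, w, b, eps):
    return misclassified_fraction(A, B, w, b) <= eps

# hypothetical planar data: one outlier of A crosses the line x + y = 4
A = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0), (2.1, 2.1)]
B = [(3.0, 3.0), (4.0, 2.5), (2.5, 2.0), (3.5, 4.0)]
w, b = (1.0, 1.0), -4.0
print(misclassified_fraction(A, B, w, b))      # 0.125: only the outlier
print(is_eps_separating(A, B, w, b, eps=0.2))  # True
```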
Consider an infinite range space whose ranges are the closed halfspaces in R^{d} bounded by hyperplanes. In [7] the following theorem was proved.
Theorem 1. Two sets of points are ε-separable if and only if there exist values of ε and corresponding ε-nets in this range space such that
(3)
and
(4)
To find the values which satisfy inequality (3), we introduce the separation space.
Definition 3. The set
(5)
is called the separation space for the sets.
Let the sets be generated by general populations with given distribution functions.
Definition 4. The set
(6)
is called the separation space for the general populations.
Consider the Euclidean space. In [8] the following lemma was proved.
Lemma 1. Let the inverse distribution functions exist. Then the sets are separated by the curve
(7)
Let us consider the general case, when the distribution functions do not have inverses at some points. We will use the generalized inverse [9].
Definition 5. For an increasing function T: R → R with T(−∞) = lim_{x↓−∞} T(x) and T(∞) = lim_{x↑∞} T(x), the generalized inverse T^{−}: R → [−∞, ∞] of T is defined by
T^{−}(y) = inf{x ∈ R: T(x) ≥ y}, y ∈ R, (8)
with the convention that inf ∅ = ∞. If T = F is a distribution function, F^{−} is also called the quantile function of F.
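For an empirical distribution function the generalized inverse can be computed directly from the definition: F jumps by 1/n at each sorted sample point, so the infimum in (8) is attained at the ⌈yn⌉-th smallest point. A small sketch (the function name and data are ours):

```python
import math

def empirical_quantile(sample, y):
    """Generalized inverse F^-(y) = inf{x in R : F(x) >= y} for the
    empirical distribution function F of `sample`, with 0 < y <= 1.
    F jumps by 1/n at each sorted point, so the infimum is attained
    at the ceil(y*n)-th smallest sample point."""
    pts = sorted(sample)
    k = math.ceil(y * len(pts))       # smallest k with k/n >= y
    return pts[k - 1]

data = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]   # n = 8
print(empirical_quantile(data, 0.5))    # 4th smallest -> 3.0
print(empirical_quantile(data, 0.51))   # 5th smallest -> 4.0
print(empirical_quantile(data, 1.0))    # maximum -> 9.0
```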
In [8] the following lemma was proved.
Lemma 2. The sets are separated by the line
(9)
For the line which separates the sets, the following theorem was proved in [8].
Theorem 2. Let the following conditions be satisfied:
(1) The sets are generated by independent continuous random variables.
(2) The sets are separated by the line.
Then the following equality holds
,
where
.
In this paper we consider the separation curve for the sets and its properties.
2. Results and Discussion
2.1. Quasiconvexity of the Separation Curve
Consider the function defined by the separation curve for the sets. Let us show that in the general case this function is quasiconvex [10].
Definition 6. Let S be a convex subset of R^{n}. A function f: S → R is called quasiconvex if for any x, y ∈ S and λ ∈ [0, 1]
f(λx + (1 − λ)y) ≤ max{f(x), f(y)}.
Theorem 3. The function defined by the separation curve is quasiconvex.
Proof. A continuous function f is quasiconvex if and only if at least one of the following conditions holds: f is nondecreasing; f is nonincreasing; or there is a point c such that f is nonincreasing for x ≤ c and nondecreasing for x ≥ c [10]. Let us show that the function under consideration is nonincreasing.
(1) Consider the first component. It is nondecreasing, so its composition with a nondecreasing function is also nondecreasing, and the corresponding complement is nonincreasing.
(2) Consider the second component. It is nondecreasing, so the composition is also nondecreasing, and the corresponding complement is nonincreasing.
Since both components are nonincreasing, the resulting function is nonincreasing. Thus, Theorem 3 is proved.
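The max-characterisation of Definition 6 can also be checked numerically on a grid, a useful sanity test for a concrete separation curve. The sample curves in the sketch below are hypothetical, chosen only to exercise the check: a nonincreasing function passes (exactly the case used in the proof above), while a concave bump fails.

```python
def is_quasiconvex_on_grid(f, xs, lambdas):
    """Check f(l*x + (1-l)*y) <= max(f(x), f(y)) on all grid pairs.
    A numeric necessary check, not a proof."""
    tol = 1e-12
    for x in xs:
        for y in xs:
            m = max(f(x), f(y))
            for l in lambdas:
                if f(l * x + (1 - l) * y) > m + tol:
                    return False
    return True

# A nonincreasing (hence quasiconvex) curve, e.g. eps2 = (1 - eps1)**2
g = lambda e1: (1.0 - e1) ** 2
xs = [i / 50 for i in range(51)]          # grid on [0, 1]
lambdas = [i / 10 for i in range(11)]
print(is_quasiconvex_on_grid(g, xs, lambdas))   # True

# A non-quasiconvex function fails the same check
h = lambda x: -(x - 0.5) ** 2                   # concave bump at 0.5
print(is_quasiconvex_on_grid(h, xs, lambdas))   # False
```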
In particular cases the function may be convex or linear. Consider the corresponding examples.
2.2. Exponential Distribution
Let two random variables be exponentially distributed with given parameters. Then, according to Lemma 1, we have
.
Lemma 3. Let two random variables be exponentially distributed; then the function which separates the sets is convex.
Proof. Let us consider the following two possible cases.
(1) Let , then
We need to show that the function is convex. Since
as .
So, the function is convex in this case.
(2) Let , then
Again, we need to show that the function is convex. Since
as .
So, the function is convex in this case as well. Thus, Lemma 3 is proved.
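The convexity claim can be checked numerically. Below we assume, as our reading of Lemma 1, a separation curve of the form ε₂(ε₁) = F₂(F₁^{−1}(1 − ε₁)): cut the upper ε₁-tail of the first population at t = F₁^{−1}(1 − ε₁) and measure the mass of the second population below t. For exponential distributions with rates λ₁, λ₂ this gives ε₂ = 1 − ε₁^{λ₂/λ₁}. The rates in the sketch are hypothetical, chosen with λ₂ ≤ λ₁ so that this assumed form is convex; second differences confirm it.

```python
import math

def sep_curve_exp(eps1, lam1, lam2):
    """Assumed separation curve eps2(eps1) = F2(F1^{-1}(1 - eps1)) for
    exponential F_i(x) = 1 - exp(-lam_i * x); closed form is
    1 - eps1 ** (lam2 / lam1)."""
    t = -math.log(eps1) / lam1          # F1^{-1}(1 - eps1)
    return 1.0 - math.exp(-lam2 * t)

lam1, lam2 = 2.0, 1.0                   # hypothetical rates, lam2 <= lam1
xs = [0.01 + 0.98 * i / 200 for i in range(201)]
ys = [sep_curve_exp(x, lam1, lam2) for x in xs]

decreasing = all(a >= b for a, b in zip(ys, ys[1:]))
second_diffs = [ys[i - 1] - 2 * ys[i] + ys[i + 1] for i in range(1, len(ys) - 1)]
convex = all(d >= -1e-12 for d in second_diffs)   # nonneg. second differences
print(decreasing, convex)               # True True
```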
For fixed parameter values, the function is illustrated in Figure 1.
Fig. 1. The function for the exponential distribution.
2.3. Uniform Distribution
Let two random variables be uniformly distributed with given parameters. According to Lemma 1, the set is bounded below by the function
.
Lemma 4. Let two random variables be uniformly distributed; then the function which separates the sets is a decreasing linear function.
Proof. Consider the following two possible cases:
(1) Let for : , then
;
So, the function is linear; since its slope is negative, it is decreasing.
(2) Let for : , then
So, the function is linear; since its slope is negative, it is decreasing. Thus, Lemma 4 is proved.
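Under the same assumed form of the separation curve, ε₂(ε₁) = F₂(F₁^{−1}(1 − ε₁)), the uniform case is a straight line wherever the cut point stays inside both supports. The sketch below (the support endpoints are hypothetical) checks linearity via vanishing second differences:

```python
def sep_curve_unif(eps1, a1, b1, a2, b2):
    """Assumed curve eps2(eps1) = F2(F1^{-1}(1 - eps1)) for
    uniform U(a1, b1) and U(a2, b2); F2 is clipped to [0, 1]."""
    t = a1 + (1.0 - eps1) * (b1 - a1)   # F1^{-1}(1 - eps1)
    return min(1.0, max(0.0, (t - a2) / (b2 - a2)))

a1, b1, a2, b2 = 0.0, 4.0, 2.0, 6.0     # hypothetical overlapping supports
xs = [0.05 * i for i in range(1, 10)]   # 0.05 .. 0.45: cut stays in overlap
ys = [sep_curve_unif(x, a1, b1, a2, b2) for x in xs]

second_diffs = [ys[i - 1] - 2 * ys[i] + ys[i + 1] for i in range(1, len(ys) - 1)]
linear = all(abs(d) < 1e-12 for d in second_diffs)
decreasing = all(a > b for a, b in zip(ys, ys[1:]))
print(linear, decreasing)               # True True
```

For these parameters the curve is ε₂ = 0.5 − ε₁ on the tested range, matching the decreasing-linear conclusion of the lemma.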
For fixed parameter values, the function is illustrated in Figure 2.
Fig. 2. The function for the uniform distribution.
2.4. Normal Distribution
Let two random variables be normally distributed with given parameters. According to Lemma 1, the set is bounded below by the function
,
where
.
For fixed parameter values, the function is illustrated in Figure 3.
Fig. 3. The function for the normal distribution.
2.5. Checking the Necessary and Sufficient Conditions for ε-Separability
Obviously, equality (4) holds for any admissible values. So, to check the ε-separability of the sets it is enough to check whether there exist values for which inequality (3) holds. Consider the optimisation problem
(10)
(11)
Let the point be the solution of problem (10), (11). If inequality (3) does not hold for this solution, then it does not hold for any feasible point, which means the sets are not ε-separable. According to Theorem 2, constraint (11) can be replaced by
(12)
According to Theorem 3, in the general case the constraint set is quasiconvex. According to Lemma 3, the set is convex if the distributions are exponential; in this case problem (10), (12) is a convex programming problem. In the case of the uniform distribution, according to Lemma 4, problem (10), (12) is a linear programming problem.
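A coarse grid search makes the structure of such a problem concrete. In the sketch below both the objective, the combined removed fraction (mε₁ + nε₂)/(m + n), and the linear lower bound ε₂ ≥ 0.5 − ε₁ are illustrative assumptions standing in for (10) and (12), not the paper's exact formulas:

```python
def solve_grid(m, n, lower_bound, steps=500):
    """Grid search for min (m*eps1 + n*eps2) / (m + n)
    subject to eps2 >= lower_bound(eps1), 0 <= eps1, eps2 <= 1.
    Objective and bound are illustrative assumptions."""
    best = None
    for i in range(steps + 1):
        e1 = i / steps
        e2 = max(0.0, lower_bound(e1))      # cheapest feasible eps2
        if e2 > 1.0:
            continue
        val = (m * e1 + n * e2) / (m + n)
        if best is None or val < best[0]:
            best = (val, e1, e2)
    return best

# hypothetical linear bound of the uniform-distribution kind
bound = lambda e1: 0.5 - e1
val, e1, e2 = solve_grid(m=100, n=200, lower_bound=bound)
print(round(val, 4), e1, e2)    # 0.1667 0.5 0.0
```

With n > m the search pushes the removal toward the smaller set, ending at the corner ε₁ = 0.5, ε₂ = 0, the kind of vertex solution a linear program would also return.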
Let the point be the solution of the optimisation problem (10), (12).
In the paper [8] the following theorem was proved.
Theorem 4. Let
(1) the samples have the given distribution functions,
(2) the observations are independent random variables,
(3) the distribution functions have inverses and bounded densities.
Then we obtain the weak convergence
,
where
as the sample size tends to infinity,
According to Theorem 4, the point belongs to the neighbourhood
,
where the parameters involved are the probability and the t-distribution quantiles for the corresponding probabilities (Fig. 4).
If at the boundary point condition (3) of Theorem 1 does not hold, namely:
,
then the sets are not ε-separable.
Example 1. Consider two sets of points generated by uniformly distributed general populations with given parameters. Let us verify the ε-separability of the sets.
Build the set (Fig. 5) and consider the linear programming problem
(13)
(14)
(15)
The solution of the linear programming problem (13)-(15) is the point shown in Fig. 5. Let us find the confidence interval for the solution of the problem
(16)
(17)
with the given probability.
According to Theorem 4,
as ,
so the confidence interval for the solution of problem (16), (17) degenerates to a point.
Fig. 5. Lower bounds of the sets and the linear programming problem's solution.
Let us verify condition (3) of Theorem 1:
.
Condition (3) does not hold at this point, hence it does not hold anywhere in the feasible set. So, the sets are not ε-separable for the given ε.
Example 2. Consider two sets of points generated by exponentially distributed general populations with given parameters. Let us verify the ε-separability of the sets.
Let us build the set (Fig. 6) and consider the convex programming problem
(18)
(19)
(20)
The solution of the convex programming problem (18)-(20) is the point shown in Fig. 6.
Fig. 6. Lower bounds of the sets and the convex programming problem's solution.
Let us find the confidence interval for the problem's solution with the given probability
(21)
(22)
According to Theorem 4,
as ,
then
.
So, we obtain the confidence interval for the solution of problem (18)-(20).
Let us verify condition (3) at the boundary point:
.
Condition (3) does not hold at this point, hence it does not hold anywhere in the feasible set. So, the sets are not ε-separable for the given ε.
Example 3. Consider two sets of points generated by two general populations: one random variable uniformly distributed and the other normally distributed, with given parameters. Let us verify the ε-separability of the sets. Consider the convex programming problem
(23)
(24)
(25)
The solution of the convex programming problem is illustrated in Figure 7.
Fig. 7. Lower bounds of the sets and the convex programming problem's solution.
Let us find the confidence interval for the problem's solution with the given probability
(26)
(27)
As before, the confidence interval degenerates to a point. Let us verify condition (3).
The condition holds. So, the sets are ε-separable for the given ε.
3. Conclusions
Two sets are ε-separable if and only if there exist their ε-nets in the range space of halfspaces such that conditions (3)-(4) hold. To verify conditions (3)-(4) it is enough to solve the optimisation problem (10)-(11). If the solution of problem (10)-(11) does not satisfy condition (3), then the sets are not ε-separable. According to Theorem 2, one can use the theoretical separation space as the constraint set of the optimisation problem (10)-(11). Three examples show the verification of ε-separability for the uniform, exponential and normal distributions.
References