Abstract
In this study, we introduce and investigate fuzzy polynomial neurons (FPNs), regarded as generic processing units in neurofuzzy computing. The underlying topology of FPNs is formed through fuzzy rules, fuzzy inference, and polynomials. Each polynomial offers a nonlinear mapping and is centred around a modal value of the corresponding membership function defined in the input space of the neuron. The adjustable order of the polynomial is essential in addressing the level of nonlinearity to be handled in the approximation problem. We demonstrate that fuzzy polynomial neurons form a certain class of functional neurons and then discuss their properties and an overall design process. Finally, these neurons are discussed in the context of universal approximation and universal approximators.
Acknowledgments
This work was supported by the Korea Research Foundation Grant funded by the Korea Government (MOEHRD, Basic Research Promotion Fund) (M01-2004-000-20175-0). Support from the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Canada Research Chair (CRC) Program (W. Pedrycz) is gratefully acknowledged.
Appendices
Appendix 1
The adjustment of a modal value v i is done through standard gradient-based learning. We consider the squared Euclidean distance expressing the observed learning error

$$E_{p} = (y_{p} - \hat{y}_{p})^{2}$$

where E p is the error for the pth data point, y p is the pth target output (desired response), and \(\hat{y}_{p}\) stands for the pth actual output of the FPN for this specific data point.
Next we get

$$\frac{{\partial E_{p} }}{{\partial v_{i} }} = - 2(y_{p} - \hat{y}_{p})\frac{{\partial \hat{y}_{p} }}{{\partial v_{i} }}$$

which, with the FPN output \(\hat{y}_{p} = \sum\nolimits_{j} {A_{j}(x)\varphi _{j}(x)}\), leads to the detailed expression of the form

$$\frac{{\partial E_{p} }}{{\partial v_{i} }} = - 2(y_{p} - \hat{y}_{p})\frac{\partial }{{\partial v_{i} }}\sum\limits_{j} {A_{j}(x)\varphi _{j}(x)}$$
Depending upon the location of x, we distinguish between several cases over which different derivatives \(\frac{{\partial \hat{y}_{p} }}{{\partial v_{i} }}\) are formed:
i)
$$\begin{aligned} \, & {\text{For}}\;x < v_{1} \quad \left\{ {\begin{array}{*{20}l} {{A_{1} (x):1}} \\ {{{\rm others}:0}} \\ \end{array} } \right. \\ \, & \frac{{\partial \hat{y}_{p} }}{{\partial v_{1} }} = \frac{\partial }{{\partial v_{1} }}{\left({A_{1}(x) \cdot \varphi _{1}(x)} \right)} = \frac{{\partial \varphi _{1}(x)}}{{\partial v_{1} }} \\ \, & \therefore \frac{{\partial \hat{y}_{p} }}{{\partial v_{1} }} = \frac{{\partial \varphi _{1}(x)}}{{\partial v_{1} }} \\ \end{aligned} $$
ii)
$$\begin{aligned} & {\text{For}}\;v_{c} \leq x\quad \left\{ {\begin{array}{*{20}l} {{A_{c} (x):1}} \\ {{{\rm others}:0}} \\ \end{array} } \right. \\ & \frac{{\partial \hat{y}_{p} }}{{\partial v_{c} }} = \frac{\partial }{{\partial v_{c} }}{\left({A_{c}(x) \cdot \varphi _{c}(x)} \right)} = \frac{{\partial \varphi _{c} (x)}}{{\partial v_{c} }} \\ & \therefore \frac{{\partial \hat{y}_{p} }}{{\partial v_{c} }} = \frac{{\partial \varphi _{c}(x)}}{{\partial v_{c} }} \\ \end{aligned} $$
iii)
$$\begin{aligned} & {\text{For}}\;v_{i} \leq x < v_{{i + 1}} \left\{ {\begin{array}{*{20}l} {{A_{i} (x):\frac{{v_{i} - x}} {{v_{{i + 1}} - v_{i} }} + 1}} \\ {{A_{{i + 1}}(x):1 - A_{i}(x) = \frac{{x - v_{i} }}{{v_{{i + 1}} - v_{i} }}}} \\ {{{\rm others}:0}} \\ \end{array} } \right. \\ & \frac{{\partial \hat{y}_{p} }} {{\partial v_{i} }} = \frac{\partial }{{\partial v_{i} }}{\left({A_{i}(x)\varphi _{i}(x) + A_{{i + 1}}(x)\varphi _{{i + 1}}(x)} \right)} \\ & \quad \quad = \frac{{\partial A_{i}(x)}}{{\partial v_{i} }}\varphi _{i}(x) + \frac{{\partial \varphi _{i}(x)}}{{\partial v_{i} }}A_{i}(x) + \frac{{\partial A_{{i + 1}}(x)}}{{\partial v_{i} }}\varphi _{{i + 1}}(x) \\ & \frac{{\partial A_{i}(x)}} {{\partial v_{i} }} = \frac{\partial }{{\partial v_{i} }}{\left({\frac{{v_{i} - x}}{{v_{{i + 1}} - v_{i} }} + 1} \right)} = \frac{{(v_{{i + 1}} - v_{i}) + (v_{i} - x)}}{{(v_{{i + 1}} - v_{i})^{2} }} = \frac{{v_{{i + 1}} - x}}{{(v_{{i + 1}} - v_{i})^{2} }} = \frac{{A_{i} }}{{v_{{i + 1}} - v_{i} }} \\ & \because \frac{{v_{{i + 1}} - x}}{{v_{{i + 1}} - v_{i} }} = \frac{{v_{i} - x}}{{v_{{i + 1}} - v_{i} }} + 1 = A_{i} \\ & \frac{{\partial A_{{i + 1}}(x)}}{{\partial v_{i} }} = \frac{\partial }{{\partial v_{i} }}{\left({\frac{{x - v_{i} }}{{v_{{i + 1}} - v_{i} }}} \right)} = \frac{{ - (v_{{i + 1}} - v_{i}) + (x - v_{i})}}{{(v_{{i + 1}} - v_{i})^{2} }} = \frac{{x - v_{{i + 1}} }}{{(v_{{i + 1}} - v_{i})^{2} }} = - \frac{{A_{i} }}{{v_{{i + 1}} - v_{i} }} \\ & \because \frac{{x - v_{{i + 1}} }}{{v_{{i + 1}} - v_{i} }} = \frac{{x - v_{i} }}{{v_{{i + 1}} - v_{i} }} - 1 = - A_{i} \\ \end{aligned} $$
Therefore,

$$\frac{{\partial \hat{y}_{p} }}{{\partial v_{i} }} = {\left({\frac{{\varphi _{i}(x) - \varphi _{{i + 1}}(x)}}{{v_{{i + 1}} - v_{i} }} + \frac{{\partial \varphi _{i}(x)}}{{\partial v_{i} }}} \right)}A_{i}(x)$$
For any input, the process of learning involves only two modal values, v i and v i+1. For v i ≤ x < v i+1, \(\frac{{\partial\hat{y}_{p}}}{{\partial v_{{i+ 1}}}}\) comes in the form

$$\begin{aligned} & \frac{{\partial \hat{y}_{p} }}{{\partial v_{{i + 1}} }} = \frac{{\partial A_{i}(x)}}{{\partial v_{{i + 1}} }}\varphi _{i}(x) + \frac{{\partial A_{{i + 1}}(x)}}{{\partial v_{{i + 1}} }}\varphi _{{i + 1}}(x) + \frac{{\partial \varphi _{{i + 1}}(x)}}{{\partial v_{{i + 1}} }}A_{{i + 1}}(x) \\ & \frac{{\partial A_{i}(x)}}{{\partial v_{{i + 1}} }} = \frac{\partial }{{\partial v_{{i + 1}} }}{\left({\frac{{v_{i} - x}}{{v_{{i + 1}} - v_{i} }} + 1} \right)} = \frac{{x - v_{i} }}{{(v_{{i + 1}} - v_{i})^{2} }} = \frac{{A_{{i + 1}} }}{{v_{{i + 1}} - v_{i} }} \\ & \frac{{\partial A_{{i + 1}}(x)}}{{\partial v_{{i + 1}} }} = \frac{\partial }{{\partial v_{{i + 1}} }}{\left({\frac{{x - v_{i} }}{{v_{{i + 1}} - v_{i} }}} \right)} = - \frac{{x - v_{i} }}{{(v_{{i + 1}} - v_{i})^{2} }} = - \frac{{A_{{i + 1}} }}{{v_{{i + 1}} - v_{i} }} \\ \end{aligned}$$

Therefore,

$$\frac{{\partial \hat{y}_{p} }}{{\partial v_{{i + 1}} }} = {\left({\frac{{\varphi _{i}(x) - \varphi _{{i + 1}}(x)}}{{v_{{i + 1}} - v_{i} }} + \frac{{\partial \varphi _{{i + 1}}(x)}}{{\partial v_{{i + 1}} }}} \right)}A_{{i + 1}}(x)$$
Depending on the order of the polynomial, ∂φ i (x)/∂v i is specified as follows

$$\frac{{\partial \varphi _{i}(x)}}{{\partial v_{i} }} = - a_{{1i}} - 2a_{{2i}}(x - v_{i}) - 3a_{{3i}}(x - v_{i})^{2} - 4a_{{4i}}(x - v_{i})^{3} - 5a_{{5i}}(x - v_{i})^{4}$$

with the lower-order cases obtained by dropping the corresponding higher-order terms.
Here, φ i and φ i+1 denote the polynomials standing in the conclusions of the ith and (i+1)th rules, evaluated at the given x.
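The triangular memberships underlying cases i)–iii), together with the identity ∂A i /∂v i = A i /(v i+1 − v i ) derived above, can be checked numerically. The sketch below is ours, not from the paper; the function names are illustrative only.

```python
import numpy as np

def memberships(x, v):
    """Triangular membership degrees A_1..A_c over sorted modal values v;
    between v_i and v_{i+1} the two active degrees sum to 1."""
    A = np.zeros(len(v))
    if x < v[0]:            # case i): only A_1 is active
        A[0] = 1.0
    elif x >= v[-1]:        # case ii): only A_c is active
        A[-1] = 1.0
    else:                   # case iii): v_i <= x < v_{i+1}
        i = int(np.searchsorted(v, x, side="right")) - 1
        A[i] = (v[i] - x) / (v[i + 1] - v[i]) + 1.0
        A[i + 1] = 1.0 - A[i]
    return A

# Finite-difference check of dA_i/dv_i = A_i / (v_{i+1} - v_i)
v = np.array([0.0, 1.0, 2.0])
x, i, h = 0.4, 0, 1e-6
Ai = memberships(x, v)[i]
v_h = v.copy(); v_h[i] += h
numeric = (memberships(x, v_h)[i] - Ai) / h
analytic = Ai / (v[i + 1] - v[i])
print(abs(numeric - analytic) < 1e-4)   # True
```

The check perturbs only v i , so it exercises exactly the partial derivative computed in case iii).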
Finally, the expressions for Δv i are formed as
i)
For x < v 1 or for v c ≤ x, (the value of A i is 1 while others are equal to 0),
$$\Delta v_{i} = - \eta\frac{{\partial E_{p}}}{{\partial v_{i}}} = 2\eta(y_{p} - \hat{y}_{p})\frac{{\partial \varphi _{i}}}{{\partial v_{i}}}\quad (i=1\;\hbox{or}\;c)$$
ii)
For v i ≤ x < v i+1
$$\begin{aligned} \Delta v_{i} &= - \eta\frac{{\partial E_{p}}}{{\partial v_{i}}} = 2\eta(y_{p} - \hat{y}_{p}){\left({\frac{{\varphi _{i} - \varphi _{{i + 1}}}}{{v_{{i + 1}} - v_{i}}} + \frac{{\partial \varphi _{i}}}{{\partial v_{i}}}} \right)}A_{i} \\ \Delta v_{{i + 1}} &= - \eta\frac{{\partial E_{p}}}{{\partial v_{{i + 1}}}} = 2\eta(y_{p} - \hat{y}_{p}){\left({\frac{{\varphi _{i} - \varphi _{{i + 1}}}}{{v_{{i + 1}} - v_{i}}} + \frac{{\partial \varphi _{{i + 1}}}}{{\partial v_{{i + 1}}}}} \right)}A_{{i + 1}} \\ \end{aligned}$$
Quite commonly, to accelerate convergence, a momentum term is added to the learning formula. The complete update formulas combining the momentum component arise in the form
i)
For A i = 1 (x < v 1 or v c ≤ x, so i = 1 or i = c)
$$\Delta v_{i} (t + 1) = 2\eta(y_{p} - \hat{y}_{p})\frac{{\partial \varphi _{i}}}{{\partial v_{i}}} + \alpha\Delta v_{i} (t)$$
ii)
For v i ≤ x < v i+1
$$\begin{aligned} \Delta v_{i} (t + 1) &= 2\eta(y_{p} - \hat{y}_{p}){\left({\frac{{\varphi _{i} - \varphi _{{i + 1}}}}{{v_{{i + 1}} - v_{i}}} + \frac{{\partial \varphi _{i}}}{{\partial v_{i}}}} \right)}A_{i} + \alpha\Delta v_{i} (t)\\ \Delta v_{{i + 1}} (t + 1) &= 2\eta \cdot (y_{p} - \hat{y}_{p}){\left({\frac{{\varphi _{i} - \varphi _{{i + 1}}}}{{v_{{i + 1}} - v_{i}}} + \frac{{\partial \varphi _{{i + 1}}}}{{\partial v_{{i + 1}}}}} \right)}A_{{i + 1}} + \alpha\Delta v_{{i + 1}} (t)\\ \end{aligned}$$
where Δv i (t) = v i (t) − v i (t − 1), η is the learning rate, and α denotes the momentum coefficient; the values of both are confined to the unit interval.
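The complete learning step can be sketched in code. The following is a minimal single-input illustration of the update formulas above under our own naming (the paper gives no implementation); the toy rule base uses constant conclusions, so the polynomials degenerate to a zero-order case.

```python
import numpy as np

def update_modal_values(x, y, v, a, dv_prev, eta=0.1, alpha=0.5):
    """One momentum-based gradient step on the sorted modal values v of a
    single-input FPN whose ith conclusion is the local polynomial
    phi_i(x) = sum_k a[i][k] * (x - v[i])**k.  eta, alpha lie in (0, 1)."""
    phi = lambda i: sum(ak * (x - v[i]) ** k for k, ak in enumerate(a[i]))
    dphi = lambda i: -sum(k * ak * (x - v[i]) ** (k - 1)
                          for k, ak in enumerate(a[i]) if k >= 1)
    dv = np.zeros(len(v))
    if x < v[0] or x >= v[-1]:                 # boundary cases: a single A_i = 1
        i = 0 if x < v[0] else len(v) - 1
        dv[i] = 2 * eta * (y - phi(i)) * dphi(i)
    else:                                      # interior case: v_i <= x < v_{i+1}
        i = int(np.searchsorted(v, x, side="right")) - 1
        Ai = (v[i] - x) / (v[i + 1] - v[i]) + 1.0
        y_hat = Ai * phi(i) + (1.0 - Ai) * phi(i + 1)
        common = (phi(i) - phi(i + 1)) / (v[i + 1] - v[i])
        dv[i] = 2 * eta * (y - y_hat) * (common + dphi(i)) * Ai
        dv[i + 1] = 2 * eta * (y - y_hat) * (common + dphi(i + 1)) * (1.0 - Ai)
    return v + dv + alpha * dv_prev, dv

# Toy rule base with constant conclusions; the step pulls the output toward y
v_new, dv = update_modal_values(0.4, 1.0, np.array([0.0, 1.0, 2.0]),
                                [[0.0], [1.0], [2.0]], np.zeros(3), alpha=0.0)
```

As the appendix notes, only the two modal values bracketing x receive nonzero updates; here both shift left, which enlarges A i+1 and moves the output toward the target.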
Appendix 2
The determination of the parameters of the conclusion is completed through the gradient-based learning and follows a general scheme similar to that outlined in Appendix 1.
For (12), we have
For the polynomial φ i (x)=a 0i +a 1i (x − v i ) +a 2i (x − v i )2+⋯+a 5i (x − v i )5, the following relationship holds

$$\frac{{\partial \varphi _{i}(x)}}{{\partial a_{{ki}} }} = (x - v_{i})^{k}, \quad k = 0, 1, \ldots, 5$$
Depending upon the order of the polynomial, the detailed expressions are obtained
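Following the same gradient scheme as in Appendix 1, with E p = (y p − ŷ p )² and ∂φ i /∂a ki = (x − v i )^k, the coefficient step becomes Δa ki = 2η(y p − ŷ p )A i (x)(x − v i )^k. A hypothetical sketch (function name and step size are ours):

```python
def update_coefficients(x, y, y_hat, A, v, a, eta=0.05):
    """One gradient step on the conclusion coefficients a[i][k] of
    phi_i(x) = sum_k a[i][k] * (x - v[i])**k.
    Since d(phi_i)/d(a_ki) = (x - v_i)**k, the step is
    delta a_ki = 2 * eta * (y - y_hat) * A_i(x) * (x - v_i)**k."""
    a_new = [row[:] for row in a]
    for i, Ai in enumerate(A):
        if Ai == 0.0:
            continue  # rule i is inactive for this input; its conclusion is untouched
        for k in range(len(a[i])):
            a_new[i][k] += 2.0 * eta * (y - y_hat) * Ai * (x - v[i]) ** k
    return a_new

# One fully active rule with a linear conclusion a = [a0, a1]
a_new = update_coefficients(x=0.5, y=1.0, y_hat=0.0,
                            A=[1.0], v=[0.0], a=[[0.0, 0.0]])
```

Because (x − v i )^k vanishes quickly near the modal value, higher-order coefficients are adjusted mostly by data points away from v i , which is consistent with the local character of each polynomial.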
About this article
Cite this article
Park, BJ., Pedrycz, W. & Oh, SK. Fuzzy polynomial neurons as neurofuzzy processing units. Neural Comput & Applic 15, 310–327 (2006). https://doi.org/10.1007/s00521-006-0033-2