System identification and control with ANN

Hi,

This set of questions is related to system identification of a nonlinear system with an artificial neural network (ANN), followed by its control.

Our system consists of two input vectors, namely I1 and I2, and an output vector O. 'I1(T)' is the external control input to the system at time 'T', 'I2(T)' is the state of the system at time 'T', and 'O(T)' is the state of the system at time 'T'. The system can be defined as a function 'F':

O(T+1) = F( I1(T), I2(T) )

which means that the next state of the system is a function of its current state and control inputs, where I2(T) = O(T-1), i.e.

O(T+1) = F( I1(T), O(T-1) )   or   O(T) = F( I1(T-1), I2(T-1) )

We have a logged data set of I1(T), I2(T) and O(T) from observations of the actual system.

Can someone suggest a particular ANN type that can be trained to learn O(T+1) = F( I1(T), I2(T) ) (i.e. system identification with an ANN)?

Can we use this ANN to find I1(T) given I2(T) and O(T+1)? That is, how can we use this ANN to find the required I1(T) if we know the current state I2(T) and the desired next state O(T+1) of the system?

I1(T) = X( I2(T), O(T+1) )  ==>  X = ?

I1(T) ====>|   |
           | F |===> O(T+1)
I2(T) ====>|   |

O(T+1) ====>|   |
            | X |===> I1(T)
I2(T)  ====>|   |
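For concreteness, here is a minimal sketch of this kind of one-step-ahead identification with a small MLP. The scalar plant, network size and learning rate below are invented purely for illustration; they are not part of the original question:

```python
import numpy as np

# Hypothetical plant standing in for the unknown F (illustrative only):
# O(T+1) = 0.8*O(T) + 0.5*tanh(I1(T))
rng = np.random.default_rng(0)
T = 500
i1 = rng.uniform(-1, 1, T)              # external control input I1(T)
o = np.zeros(T + 1)                     # state O(T)
for t in range(T):
    o[t + 1] = 0.8 * o[t] + 0.5 * np.tanh(i1[t])

# Training pairs: inputs (I1(T), I2(T) = O(T)) -> target O(T+1)
X = np.column_stack([i1, o[:T]])
y = o[1:]

# One-hidden-layer tanh MLP trained with full-batch gradient descent
H = 16
W1 = rng.normal(0, 0.5, (2, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 0.5, (H, 1)); b2 = np.zeros(1)
lr = 0.1
for _ in range(3000):
    h = np.tanh(X @ W1 + b1)
    err = (h @ W2 + b2).ravel() - y
    gW2 = h.T @ err[:, None] / T        # gradient of MSE w.r.t. W2
    gb2 = err.mean(keepdims=True)
    dh = (err[:, None] @ W2.T) * (1 - h**2)   # backprop through tanh
    gW1 = X.T @ dh / T
    gb1 = dh.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

# One-step-ahead prediction error on the training histories
mse = np.mean(((np.tanh(X @ W1 + b1) @ W2 + b2).ravel() - y) ** 2)
```

The same structure carries over to vector-valued I1, I2 and O by widening the input and output layers.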

Reply to
Phil

You can try, but model-free system identification typically doesn't work very well. If you can come up with a rough model that just needs to be tuned, the odds of getting a usable model are much better.

I know, when you don't have a clue what's going on, it's tempting to throw an ANN at the problem and expect it to solve it. But ANNs are actually rather dumb. They really deal with nonlinearity piecewise, after all. If there's a square or log relationship between some input and output, an ANN will not match it well. Try linearizing the problem first.
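To illustrate the linearization advice: if you suspect, say, a log relationship between an input and an output, feeding log(x) as a feature turns the fit into one that even a linear model handles exactly. The data below are made up for illustration:

```python
import numpy as np

# Hypothetical data with a log relationship: y = 3*log(x) + 1
rng = np.random.default_rng(1)
x = rng.uniform(1.0, 100.0, 200)
y = 3.0 * np.log(x) + 1.0

def linear_fit_mse(feature, y):
    """Least-squares fit y ~ a*feature + b; return mean squared error."""
    A = np.column_stack([feature, np.ones_like(feature)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return np.mean(resid**2)

raw_mse = linear_fit_mse(x, y)          # fit on raw x: poor
lin_mse = linear_fit_mse(np.log(x), y)  # fit on log(x): exact up to rounding
```

An ANN fed the raw x faces the same curvature that defeats the raw linear fit; it just hides it behind many piecewise-sigmoid segments.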

John Nagle


Reply to
John Nagle

The notation and terminology are confusing. Years ago the characterization would have been of the form

'I(T)' is the external control inputs to the system at time 'T', 'X(T)' is the state of the system at time 'T', 'O(T)' is the *output* of the system at time 'T'

with corresponding state and output equations

X(T+1) = G( X(T), I(T) )
O(T)   = H( X(T), I(T) )

so that all delays are in terms of the state. Using the unit delay operator, D,

X(T) = D X(T+1)

so that the corresponding block diagram is

    |----------------------------------|
    |                                  v
I(T)--->[G]-->X(T+1)-->[D]-->X(T)--->[H]-->O(T)
         ^                     |
         |                     |
         |---------------------|

Given the time histories of I, X and O, G and H can each be represented by an NN, and the two NNs can be trained independently.
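A sketch of that split, using linear least-squares fits as stand-ins for the two NNs (the plant coefficients below are invented for illustration): because the state history is logged, G and H each reduce to an ordinary static regression problem.

```python
import numpy as np

# Hypothetical linear plant used only to generate logged histories:
#   X(T+1) = 0.7*X(T) + 0.3*I(T)     (G)
#   O(T)   = 2.0*X(T) + 0.1*I(T)     (H)
rng = np.random.default_rng(2)
T = 300
I = rng.uniform(-1, 1, T)
X = np.zeros(T + 1)
O = np.zeros(T)
for t in range(T):
    X[t + 1] = 0.7 * X[t] + 0.3 * I[t]
    O[t] = 2.0 * X[t] + 0.1 * I[t]

A = np.column_stack([X[:T], I])               # regressors (X(T), I(T))

# G: (X(T), I(T)) -> X(T+1), fit independently of H
g_coef, *_ = np.linalg.lstsq(A, X[1:], rcond=None)

# H: (X(T), I(T)) -> O(T), fit independently of G
h_coef, *_ = np.linalg.lstsq(A, O, rcond=None)
```

For a nonlinear plant, each `lstsq` call would simply be replaced by training an NN on the same input/target pairs.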

However, there may be more modern approaches available.

In my notation you have

O(T+1) = F( X(T), I(T) )
X(T)   = O(T-1)

Therefore the state and output equations have to be written in the *noncausal* form

X(T+1) = D F( X(T), I(T) )
O(T)   = X(T+1)

A corresponding block diagram is

I(T)--->[DF]---->X(T+1)------>O(T)
          ^                    |
          |--------------------|

Either an MLP or an Elman network should work.

Train the NN K:

  |--[D]--|
  |       v
O(T)----->[K]--->I(T)
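A minimal sketch of fitting such an inverse model K, assuming a hypothetical *linear* plant so the result can be checked by hand (a real nonlinear F would need an NN in place of the least-squares fit):

```python
import numpy as np

# Hypothetical linear plant, for illustration: o[t+1] = 0.8*o[t] + 0.5*i[t]
rng = np.random.default_rng(3)
T = 400
i = rng.uniform(-1, 1, T)
o = np.zeros(T + 1)
for t in range(T):
    o[t + 1] = 0.8 * o[t] + 0.5 * i[t]

# Inverse model K: (current state o[t], desired next state o[t+1]) -> i[t]
A = np.column_stack([o[:T], o[1:]])
k_coef, *_ = np.linalg.lstsq(A, i, rcond=None)

# Use K as a one-step controller: drive the state toward a setpoint.
# Caution: K may demand inputs outside the range seen in training,
# which is exactly where a trained NN extrapolates poorly.
target = 1.0
state = 0.0
for _ in range(20):
    u = k_coef @ np.array([state, target])
    state = 0.8 * state + 0.5 * u
```

For this linear plant K recovers the exact algebraic inverse, so the controller reaches the setpoint in one step; with an NN inverse, validity is limited to the region covered by the training data.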

Hope this old-fashioned view helps.

Greg

Reply to
Greg Heath

# John Nagle

Would you please explain what you mean by "piecewise"?

Reply to
Toby Newman

Try training an ANN to approximate

y = x^2

and see what happens.
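For anyone who wants to try that experiment, here is a rough numpy version (network size and training schedule are arbitrary choices): a small tanh MLP fits y = x^2 well inside the training interval [-1, 1], but its saturating units cannot track the parabola outside it.

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(-1, 1, 200)[:, None]    # training inputs span [-1, 1]
y = (x**2).ravel()

# One-hidden-layer tanh MLP, full-batch gradient descent
H = 16
W1 = rng.normal(0, 1.0, (1, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 1.0, (H, 1)); b2 = np.zeros(1)
lr = 0.1
for _ in range(5000):
    h = np.tanh(x @ W1 + b1)
    err = (h @ W2 + b2).ravel() - y
    gW2 = h.T @ err[:, None] / len(x)
    gb2 = err.mean(keepdims=True)
    dh = (err[:, None] @ W2.T) * (1 - h**2)
    gW1 = x.T @ dh / len(x)
    gb1 = dh.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

def net(z):
    return (np.tanh(np.atleast_2d(z) @ W1 + b1) @ W2 + b2).ravel()

train_mse = np.mean((net(x) - y) ** 2)
inside = net(0.5)[0]    # interpolation: close to 0.25
outside = net(3.0)[0]   # extrapolation: nowhere near 9, tanh units saturate
```

The point is not that the fit inside [-1, 1] is bad, but that the extrapolated value bears no resemblance to x^2.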

John Nagle


Reply to
John Nagle

Part of your problem seems to be conceptual and terminological. Therefore I suggest a reformulation of your problem.

Unfortunately, I can only help with system concepts and terminology that I taught decades ago. I have not kept up with system theory since then. In addition, most of my experience has been with continuous-time linear systems; so take my discrete-time nonlinear conversions with a grain of salt.

The system is characterized by the following state and output equations

X(T+1) = G( X(T), I(T) )
O(T)   = H( X(T), I(T) )

where

I(T) -- Input at time T
X(T) -- State at time T
O(T) -- Output at time T

Notice that all delays are in terms of the state. Using the unit delay operator, D,

X(T) = D X(T+1)

so that the corresponding block diagram is

    |----------------------------------|
    |                                  v
I(T)--->[G]-->X(T+1)-->[D]-->X(T)--->[H]-->O(T)
         ^                     |
         |                     |
         |---------------------|

SYSTEM IDENTIFICATION (SYSTEM REALIZATION)

Given time histories of I and O, estimate the parameters of G and H to minimize MSE.

MINIMAL-DIMENSIONAL REALIZATION (IRREDUCIBLE REALIZATION)

A {G,H} realization is irreducible if and only if the state X has minimal dimensionality.

CONTROLLABILITY

A system is controllable if and only if given initial and final states {X(0),X*} there is a *finite* sequence of inputs {I(0),I(1),...I(T*)} that will cause the system to make this transition.

A system is completely controllable if and only if the transition can be accomplished with a single input I(0).

Notice that the definition of controllability does not involve the output.

OBSERVABILITY

A system is observable at time T0 if for any state X(T0), there exist *finite* input and output sequences {I(T0), I(T0+1),...I(T1)} and {O(T0), O(T0+1),...O(T1)} that suffice to determine X(T0).

CONDITIONS FOR A MINIMAL-DIMENSIONAL REALIZATION

A realization of a system is minimal-dimensional if and only if it is both controllable and observable.
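For *linear* systems these definitions reduce to the classical rank tests, which are easy to state in code. The two-state example below (a discrete double integrator with position-only output) is chosen purely for illustration:

```python
import numpy as np

# Linear system: x[t+1] = A x[t] + B u[t],  y[t] = C x[t]
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])   # discrete double integrator
B = np.array([[0.0],
              [1.0]])        # force enters through velocity
C = np.array([[1.0, 0.0]])   # observe position only

n = A.shape[0]

# Controllability matrix [B, AB, ..., A^(n-1) B]
ctrb = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])

# Observability matrix [C; CA; ...; C A^(n-1)]
obsv = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])

controllable = np.linalg.matrix_rank(ctrb) == n
observable = np.linalg.matrix_rank(obsv) == n
```

For nonlinear systems identified by NNs there is no such clean test, which is part of the difficulty noted below.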

FINAL COMMENTS

I have no idea how the finite sequences in the definitions of controllability and observability can be obtained.

Typically, it is easier to overfit an NN and then use one of the mitigation techniques described in the FAQ to avoid overtraining.

I'm not sure how overfitting-mitigation techniques fit into the scheme of things when controllability and observability are an issue.

Hope this helps.

Greg

Reply to
Greg Heath

have a look at

formatting link
This software can also self-organize dynamic systems of equations from observational data, including providing the analytical equations. But always be careful when using such models for control purposes.

frank

Reply to
Frank Lemke

Consider three training cases at -100, -20 and 60. Using a Gaussian basis function with a standard deviation of 8.49 and an NMSE goal of 0.1, MATLAB obtained a two-hidden-node RBF solution that achieved NMSE = 0.000038 over the interval [-400, 400] using 81 evenly spaced test cases.

If the point you were trying to make is that the performance *always* degrades rather fast when x is outside the interval spanned by the training data, you need a more convincing multimodal example. Maybe something like

y = (x^2 - 625) * (x^2 - 5625)

would better illustrate the point. Hope this helps.

Greg

Reply to
Greg Heath

# Greg Heath

If your training data has larger numerical limits than the ultimate data that will be used after training, then the problem will be reduced, no?

Reply to
Toby Newman

Yes.

A well trained NN should interpolate well within the convex hull of the training data, provided there is enough training data.

No matter how much training data is available and how well a NN interpolates within the convex hull, there is no guarantee that it will extrapolate well.
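One practical way to act on this, assuming SciPy is available: test whether a query point lies inside the convex hull of the training inputs by checking feasibility of a small linear program, and distrust the NN's output whenever it does not.

```python
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(points, q):
    """Check whether q is a convex combination of the rows of `points`.

    Feasibility LP: find lambda >= 0 with sum(lambda) = 1 and
    points.T @ lambda = q. Feasible <=> q is inside the hull.
    """
    n, d = points.shape
    A_eq = np.vstack([points.T, np.ones(n)])
    b_eq = np.append(q, 1.0)
    res = linprog(np.zeros(n), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * n, method="highs")
    return res.success

# Illustrative 2-D training inputs: the unit square's corners
train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
inside = in_convex_hull(train, np.array([0.5, 0.5]))    # interpolation
outside = in_convex_hull(train, np.array([2.0, 2.0]))   # extrapolation
```

This only flags extrapolation; it says nothing about whether there is *enough* training data inside the hull for good interpolation.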

Hope this helps.

Greg

Reply to
Greg Heath
