Basic Principles of Mathematical Modelling of "Genotype-Environment" Interaction

Twenty major tasks, identified by V.A. Dragavtsev [6] for future quantitative technologies of eco-genetic improvement of plant productivity and yield, are investigated. The authors offer principles of mathematical modelling for each of these tasks.


Introduction
Further development of the genetic improvement of quantitative traits in plants is largely constrained by the inadequate theoretical level of all branches of modern genetics. The list of unsolved problems in this area is quite extensive, but common to them all is the lack of a quantitative theory of eco-genetic processes based on mathematical models. Here we are not interested in models in general, but only in those models that address specific breeding objectives. Despite the large number of possible problem statements, modern cybernetics includes only three classes of problems, which are solved by mathematical models that differ in appearance and form. These are: problems of optimal control of the state of dynamical systems, which use predictive dynamic models (PDM); problems of optimal design, which use both dynamic and static models of design (MD) that explicitly contain the parameters to be chosen on the basis of the objectives of system operation; and problems of the study of systems in which the natural experiment is replaced by a computer experiment, and which use detailed models reflecting the fundamental "physics" of phenomena and processes (FM).
G. Kekser, in his famous work "Kinetic models of development and heredity", came to the conclusion that theories, laws and hypotheses are logically equivalent to a model. Two types of models are often distinguished: speculative and heuristic. The latter allow more or less precise statements to be formulated, leading to further experiments. Such models need not be mathematical.
A genetic model, such as Mendel's model, predicts the patterns of segregation of hereditary traits. In this model, the genotype is postulated solely on the basis of the analysis of genetic experiments [1, p.47].
A mathematical model of the inheritance of quantitative traits identifies the key processes (essential variables) necessary for predicting the behaviour of biological systems at different levels of organization, from the molecular to the biocenotic. For a model of the use of the natural resources of the biosphere, there are common variables: segregation, predation, competition, parasitism, the spread of diseases, and the impact of environmental factors on the distribution of plants and animals. In modelling, these processes are linked by common mathematical properties of the schemes: lag effects, cumulative effects, threshold values, a large number of variables, and their complex interaction.
The papers [2,3,4,5,6,7] introduced the concept of the eco-genetic organization of quantitative traits, which allows genetic theory to be developed along the following directions, each of which may be regarded as a variant of a research problem. For each direction we also indicate the type of mathematical model by which it can be implemented:
1. Understanding the mechanisms of the reactions of the genotype to the limiting factors of the environment (FM).
2. Understanding the mechanisms of the development of a quantitative trait in ontogeny (FM).
3. Understanding the mechanisms and predicting the "reaction norms" and homeostasis of productivity under different environmental conditions (FM).
4. Understanding the mechanisms of the "shifts" of dominance for quantitative characters in different environments (FM).
5. Understanding the nature of ecologically dependent heterosis and predicting its appearance under ecological or competitive limitation of growth processes (PDM).
6. Understanding the nature of the "shifts" and predicting changes in the signs and levels of genotypic, genetic and environmental correlations (PDM).
7. Controlling the amplitude of genetic variability of a quantitative character in the population by changing external constraints (PDM).
8. Controlling the number of genes that determine the level and genetic variability of a trait in the population by changing the limiting factors (PDM).
9. Choosing the characters by which to conduct selection in specific environmental conditions (MD).
10. Designing ideal background characters (with zero genetic variance) to identify genotypes by phenotypes in real time (MD).
11. Understanding the mechanisms of transgressions and predicting their occurrence in specific hybrid populations (PDM and MD).
12. Understanding the mechanisms and predicting the effects of genotype-environment interaction (GEI) (PDM).
13. Understanding the nature of pleiotropy for quantitative traits and predicting the "breaking" of pleiotropic complexes in different environmental conditions (PDM).
14. Creating new principles for selecting parental pairs for solving breeding tasks (MD).
15. Developing the theory and practice of improved seed production (MD).
16. Creating common science-based technologies of breeding for productivity, using artificial climate systems to study the adaptive traits of genotypes in different phases of development and under different limiting factors (MD and PDM).
17. Assessing any one of the three basic characteristics (the adaptive properties of varieties, the dynamics of the limiting factors, and the genetic parameters of populations) from the other two (PDM).
18. Predicting the configurations of Hayman's graphs and the rank order of Hayman's parameters without diallel crosses (PDM).
19. Establishing a quantitative theory of plant ontogenesis on the basis of eco-genetic control (PDM).
20. Genetic inventory of plant populations (PDM).
Thus, of the 20 tasks listed, 12 are based on the use of predictive dynamic models (PDM), 4 on models of design (MD), and the same number on physical models (FM). Evidently, from the viewpoint of the authors of the concept, the most urgent tasks are those that use predictive dynamic models, which largely corresponds to the very essence of the concept of "eco-genetic response - quantitative character".
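The tally above depends on how the two mixed tasks (11 and 16) are counted. A minimal sketch of the count, assuming that any task mentioning PDM is counted as a PDM task and that task 14 is an MD task; the dictionary and function names are illustrative only:

```python
# Model types per task, as listed above. Task 14 is assumed to be an MD task.
TASK_MODELS = {
    1: {"FM"}, 2: {"FM"}, 3: {"FM"}, 4: {"FM"},
    5: {"PDM"}, 6: {"PDM"}, 7: {"PDM"}, 8: {"PDM"},
    9: {"MD"}, 10: {"MD"}, 11: {"PDM", "MD"}, 12: {"PDM"},
    13: {"PDM"}, 14: {"MD"}, 15: {"MD"}, 16: {"MD", "PDM"},
    17: {"PDM"}, 18: {"PDM"}, 19: {"PDM"}, 20: {"PDM"},
}

def tally(task_models):
    """Count tasks per model type; mixed PDM+MD tasks are counted as PDM."""
    counts = {"PDM": 0, "MD": 0, "FM": 0}
    for models in task_models.values():
        if "PDM" in models:
            counts["PDM"] += 1
        elif "MD" in models:
            counts["MD"] += 1
        else:
            counts["FM"] += 1
    return counts

print(tally(TASK_MODELS))  # {'PDM': 12, 'MD': 4, 'FM': 4}
```

Under this convention the counts agree with the 12 / 4 / 4 split stated in the text.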
Each of the above problems involves constructing a forecast of the changes of quantitative traits in response to changes in the parameters of the environment. The great majority of quantitative traits are states of the production process. Therefore, the new model based on the concept of "genotype-environment" interaction will largely be a further development of methods for modelling the production process, and it should include all the main features and characteristics of a particular genotype and phenotype.

Conceptual Scheme of the Model
The most important step in constructing any mathematical model of a complex system, which of course includes a living organism, is the selection of the significant variables that describe its complete state. Despite the difficulties of selecting essential variables in biological systems, here too it is expedient to adhere to the system-wide principle. It consists in the following: the variables should be chosen so that the optimality criteria used to solve specific problems can be calculated or assessed from them, either directly or through auxiliary variables. Therefore it does not make sense to talk about a set of essential variables until the specific tasks of forecasting and management have been finally stated. For now we restrict ourselves to the most general approach to choosing the type and structure of the model and start from the modular structure developed by the authors of the concept.
We distinguish the main features of the simulated processes and phenomena: most quantitative traits are the main parameters of state (hereinafter simply states); other, less common characters are resultant outputs and are combinations of states (often pairwise); there are many dynamic and static characteristics of external factors, namely randomness, the existence of an inverse effect from the plants themselves (competition), the intensity and duration of exposure, and the magnitude and timing of the minima and maxima of the action (shocks); the phases of ontogeny influence the dynamic characteristics; and there are spectra of genes that control quantitative traits, whose number depends on environmental factors and which cause significant changes in the dynamics of quantitative traits. Fig. 1 shows the conceptual scheme of the model, which defines the dual effect of environmental factors: directly on the evolution of quantitative traits (the signal link) and on the spectra of genes (the parametric link), through which it is possible to significantly affect the distribution of the contributions of states in the module of quantitative traits, and thus the final result. In addition, the scheme reflects the inverse effect of the plants themselves on the environment, which is especially noticeable in the soil.
According to this scheme, to construct a mathematical model of "genotype-environment" interaction we must address the following main tasks: modelling the evolution of the states and outputs of the modules of quantitative traits; simulating the change of the spectra of genes in the genotype, or of the state of the genetic-physiological systems, depending on the perturbation of the environment, together with their influence on the parameters of the model of the evolution of states; and imitating the changes of environmental factors. Rather than modelling the large number of genes in the spectra that determine the evolution of quantitative traits in the modules, it is helpful to simulate the seven genetic-physiological systems: attraction (providing, during the filling period, the pumping of plastic substances out of stems and leaves into the ear); micro-distribution of attracted matter between grain and chaff in cereals, husk and kernel in sunflower, etc.; adaptation (resistance to climatic and chemical stressors of the environment); polygenic resistance; susceptibility (response) to doses of soil nutrition; tolerance to density; and genetic variability of ontogenetic phases. Fig. 2 shows the structure of a mathematical model that contains the basic modules and the components of the outputs of quantitative traits, forming several layers.

The Modular Structure of the Model
In accordance with the above structural scheme, we give the canonical form of the model, equations (1)-(8). It includes a block of core modules that define the evolution of the continuum states, i.e. the component characters (biomass, the mass of the commodity parts or organs of the plant, linear dimensions, etc.); a block of counting states (the number of fruits, grains in the ear, ears, stems, internodes, and others); the block of outputs; and the model of quantitative traits in vector-matrix form.
In models (1)-(8) the following notation is used: $i = 1, \dots, I$ and $n = 1, \dots, N$ index the modules; $l$ is the index of the state of the genetic-physiological system, denoted by $\varphi$; $s$ indexes the phenophases; $a_{11}(\varphi_{ij})$ to $a_{22}(\varphi_{ij})$ and $A_{is}(\varphi_{ij})$ are the dynamic parameters of the module models, perturbed by the corresponding genetic-physiological systems; $d_{11}$ to $d_{22}$ and $D_{is}$ are the parameters of cross-linking of the modules; $f_1, f_2, f_3, F$ are uncontrollable external factors (climatic factors); $u_1, u_2, u_3, U$ are controllable external factors (nutrition); $c_{11}(\varphi_{ij})$ to $c_{23}(\varphi_{ij})$ and $C_{is}(\varphi_{ij})$ are the parameters that define the sensitivity of the system to external uncontrollable impacts and are perturbed by the relevant genetic-physiological systems; $b_{11}(\varphi_{ij})$ to $b_{22}(\varphi_{ij})$ and $B_{is}(\varphi_{ij})$ are the parameters that define the sensitivity of the system to external control actions and are perturbed by the relevant genetic-physiological systems; $v_i$ are the outputs of the first level; $v_n$ are the outputs of subsequent levels; $k_{1i}, k_{2i}$; $k_{ni}, k_{n2}$ are the parameters and matrices, of increasing dimension, of the output modules; $X_N$ is the total state vector; $\zeta_1, \zeta_2$; $Z$ are random perturbations of the states (noise of the quantitative traits), which are Gaussian processes with zero mean and covariances $\theta_1, \theta_2$; $\Theta$.
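The equations themselves are not reproduced above, so the following is only a minimal sketch of one linear form that is compatible with the listed notation: dX/dt = A X + B U + C F + Z with outputs V = K X. The linear structure, the Euler-Maruyama discretization, the two-state module and all numerical values are assumptions made for illustration, not the authors' equations (1)-(8).

```python
import random

def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def step(x, A, B, C, u, f, theta, dt, rng):
    """One Euler-Maruyama step of dX = (A X + B U + C F) dt + noise."""
    drift = [a + b + c for a, b, c in
             zip(mat_vec(A, x), mat_vec(B, u), mat_vec(C, f))]
    # theta holds the noise variances per unit time (diagonal of Theta).
    return [xi + dt * di + rng.gauss(0.0, (th * dt) ** 0.5)
            for xi, di, th in zip(x, drift, theta)]

def simulate(x0, A, B, C, K, controls, climate, theta, dt=1.0, seed=0):
    """Return the state trajectory and the outputs V = K X."""
    rng = random.Random(seed)
    x = list(x0)
    states, outputs = [x], [mat_vec(K, x)]
    for u, f in zip(controls, climate):
        x = step(x, A, B, C, u, f, theta, dt, rng)
        states.append(x)
        outputs.append(mat_vec(K, x))
    return states, outputs

# Hypothetical two-state module: x1 = biomass, x2 = mass of the commodity part.
A = [[-0.02, 0.00], [0.05, -0.01]]          # dynamic parameters a11..a22
B = [[0.4, 0.0], [0.0, 0.1]]                # sensitivity to controls b11..b22
C = [[0.3, 0.1, 0.0], [0.0, 0.1, 0.05]]     # sensitivity to climate c11..c23
K = [[1.0, 1.0]]                            # one output combining both states
days = 10
states, outputs = simulate([1.0, 0.0], A, B, C, K,
                           [[1.0, 0.5]] * days, [[0.8, 0.6, 0.4]] * days,
                           theta=[0.01, 0.01])
print(len(states), outputs[-1])
```

In the scheme described in the text, the entries of A, B and C would not be constants but functions of the states of the genetic-physiological systems; here they are fixed to keep the sketch short.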
In the above model there are two main sources of uncertainty (information noise): direct dynamic random perturbations from the external environment, which generate the covariances of the states of any phenotype; and the covariances of the model parameters, which include both genetic variation and inaccuracy in the parameters for individual phenotypes.
Applying the mathematical expectation operator to the above model and then separating the centred component from it reduces the model to two dynamic blocks of sufficient statistics: the expectation of the state vector and the covariance matrix of the states.
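For a linear model driven by Gaussian noise, these two blocks obey the standard moment equations dm/dt = A m + drive and dP/dt = A P + P Aᵀ + Θ, where m is the expectation of the state vector and P its covariance matrix. The sketch below integrates them with a plain Euler step; the linear form, the lumped `drive` term (standing for B U + C F) and the function names are assumptions of this example.

```python
def mat_mul(A, B):
    """Matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def mat_add(*Ms):
    """Element-wise sum of any number of equally sized matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*Ms)]

def scale(M, c):
    return [[c * x for x in row] for row in M]

def moment_step(m, P, A, drive, Theta, dt):
    """Euler step of dm/dt = A m + drive and dP/dt = A P + P A^T + Theta."""
    Am = [sum(a * x for a, x in zip(row, m)) for row in A]
    m_next = [mi + dt * (ai + di) for mi, ai, di in zip(m, Am, drive)]
    dP = mat_add(mat_mul(A, P), mat_mul(P, transpose(A)), Theta)
    P_next = mat_add(P, scale(dP, dt))
    return m_next, P_next

# Hypothetical two-state example, starting from a known state (zero covariance).
A = [[-0.5, 0.0], [0.2, -0.1]]
Theta = [[0.04, 0.0], [0.0, 0.01]]
m, P = [1.0, 0.0], [[0.0, 0.0], [0.0, 0.0]]
for _ in range(50):
    m, P = moment_step(m, P, A, drive=[0.3, 0.0], Theta=Theta, dt=0.1)
print(m, P)
```

The off-diagonal entries of P that build up through the coupling in A are the covariances between states induced by the environment, which is the first source of uncertainty named above.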

State Model of Genetic-Physiological Systems
Because the eco-genetic approach replaces "genetic analysis" with the study of the characteristics of the genetic-physiological systems through which a species can be improved, it does not make sense to model the dynamics of a large number of gene spectra. Instead, we directly assess the influence of these systems on the state of the output modules through the parameters of the basic modules, as indicated in the diagram in Fig. 1. However, unlike the signal effect of the external environment on the state of the basic modules, for which only the magnitude and duration of exposure matter, here the form and the time of application of the exposure are also informative.
We introduce the shape parameters of the external influence (the limiting factor), which for convenience we combine into the vectors $\Pi_l$, where $l = 1, \dots, 7$ indexes the genetic-physiological systems. The choice of an adequate set of shape parameters for each state (the informative factors) is precisely the subject of the study of genetic-physiological systems. Once this set of parameters has been identified, it is possible to construct the state model of the genetic-physiological systems.
In models (9), (10) the following notation is used: $\Phi_l$ is the vector of state parameters of the $l$-th genetic-physiological system; $A_l$ is the dynamic matrix; $B_l$ is the matrix of parameters of the influence of the shape of the external influence on the state of the $l$-th genetic-physiological system; $P_l$ is the vector of parameters of the $n$-th module of the model of quantitative traits that are perturbed by the $l$-th genetic-physiological system; and $D_{nl}$ is the matrix of parameters connecting the $n$-th module of the model of quantitative traits with the state of the $l$-th genetic-physiological system.
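For orientation, the notation above is compatible with a linear state model of the following kind. This is an assumed form, offered only as an illustration of how the listed quantities could fit together; the linear structure and the explicit time argument are assumptions and should not be read as the original equations (9), (10).

```latex
\begin{align}
\frac{d\Phi_l(t)}{dt} &= A_l\,\Phi_l(t) + B_l\,\Pi_l(t), \qquad l = 1,\dots,7, \\
P_l(t) &= D_{nl}\,\Phi_l(t), \qquad n = 1,\dots,N .
\end{align}
```

In this reading, the shape parameters $\Pi_l$ of the limiting factor drive the state $\Phi_l$ of each genetic-physiological system, and $D_{nl}$ maps that state onto the parameters of the $n$-th module of quantitative traits.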
We distinguish the possible state parameters of the basic genetic-physiological systems, in order:
1. Attraction (providing, during the filling period, the pumping of plastic substances out of stems and leaves into the ear): the weights of stem and spike (the economic and non-economic parts of the plant) and the content of carbohydrates and protein in the commodity and non-commodity parts.
2. Micro-distribution of attracted matter between grain and chaff in cereals, husk and kernel in sunflower, etc.: the weights of the grain and of the non-grain part of the ear (chaff, awns, etc.).
3. Adaptation (resistance to climatic and chemical stressors of the environment): the degree of slowing of growth processes under the influence of environmental parameters, and the speed and time of restoring the normal dynamics of growth processes.
4. Polygenic resistance: resistance of plants to pathogens and the development of plant stability mechanisms.
5. Susceptibility (response) to the elements of soil nutrition: the parameters of sensitivity of productivity.
6. Tolerance to density: the parameters of sensitivity of productivity indicators to plant density.
7. Variability of the periods of ontogenesis: the parameters of covariance of the durations of the inter-phase periods with environmental factors.

Imitation of External Disturbances
Of the three modelling problems mentioned above, it remains to consider the principles of imitating external influences. Without solving this problem, neither the operational forecasting of the state of agricultural crops nor the solution of design and technology-planning tasks is possible. Operational forecasts are understood here as forecasts with a lead time from one week to 3-4 months. In view of the local nature of the task and the limited information available, we do not use models of atmospheric physics; instead, each climatic factor is treated as an observed random vector process. The results of the forecasting model can cover an area bounded by the size of a single field or a farm.
However, even this simplified approach faces the serious problem of choosing a mathematical model of the dynamics of individual climatic factors whose prognostic properties are sufficient for the problem at hand. Studies of the monthly fluctuations of each factor over several years have shown that no single a priori model, even one tuned very precisely to the measured values, has the property of regularity, and the a posteriori forecasts derived from it do not meet the specified standards of accuracy and reliability.
This has made it necessary to resort to special adaptive algorithms with variable structure. These algorithms are based on representing the changes of weather factors as a complex multi-scenario dynamic process in which each of the dominant scenarios is a random event and is described by its own model with a priori unknown parameters. The task of identifying the process then consists in determining the number of dominant scenarios for each forecast period, their parametric form and the a priori probabilities of their occurrence, and in refining these probabilities and parameters in real time.
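The real-time refinement of scenario probabilities can be illustrated with a recursive Bayes update, in which each scenario's probability is reweighted by how well its model predicted the latest observation. This is a generic multiple-model sketch, not the authors' variable-structure algorithm; the Gaussian error model, the two scenario predictors and all numbers are assumptions.

```python
import math

def gaussian_pdf(x, mean, sigma):
    """Density of a normal distribution with the given mean and std. deviation."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def update_scenario_probs(priors, predictions, observation, sigma):
    """Bayes update: posterior_s is proportional to prior_s * p(observation | scenario s)."""
    weights = [p * gaussian_pdf(observation, pred, sigma)
               for p, pred in zip(priors, predictions)]
    total = sum(weights)
    if total == 0.0:          # observation far from every scenario: keep the priors
        return list(priors)
    return [w / total for w in weights]

# Hypothetical scenarios for a temperature series: linear trend vs. constant level.
scenario_models = [lambda t: 12.0 + 0.2 * t, lambda t: 15.0]
probs = [0.5, 0.5]
observations = [12.3, 12.1, 12.9, 12.4, 13.1]     # illustrative values only
for t, y in enumerate(observations):
    probs = update_scenario_probs(probs, [m(t) for m in scenario_models], y, sigma=1.0)
print(probs)
```

With observations that follow the trend, the probability mass moves toward the first scenario as data accumulate.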
In this representation, the model of each of the three climatic factors mentioned above can be written, for any given period, in terms of the following components: a deterministic basis of the $s$-th scenario of the $j$-th climatic factor, which represents the expectation of the process; a centred component of the scenario, represented by an autoregressive process with memory depth $p$; and the simulation error (model noise) $\varepsilon(t)$, taken to be a Gaussian random process with zero mean and residual variance $\sigma_\varepsilon^2$. Here $H_{sj}$ and $L_{sj}$ are the vectors of model parameters of the deterministic basis and of the centred part of the scenario, $\varphi_{sj}(t)$ are the basis functions of time of the deterministic basis of the scenario, and $P_s(t)$ is the a priori probability of occurrence of the scenario, determined by the region of permissible prediction errors.
A preliminary study of the changes of temperature ($j = 1$) in separate months of the growing season (May-September) revealed possible scenarios that include a linear trend ($s = 1$).
To estimate the shape parameters of the external influence $\Pi_l$ used in the block that models the genetic-physiological systems, the scenario processes are simulated by the Monte Carlo method together with the probabilities of scenario change. This first yields unified (cross-scenario) processes of change of the external disturbances, from which the shape parameters (the elements of the vectors $\Pi_l$) are then estimated.
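The Monte Carlo procedure described above can be sketched as follows. Each run draws a scenario according to its a priori probability, simulates a path as a deterministic basis plus a centred AR(p) component driven by Gaussian noise, and extracts shape parameters from the path. The choice of shape parameters (magnitude and timing of extrema), the second "constant level" scenario and all numerical values are hypothetical, introduced only for this example.

```python
import random

def simulate_scenario(H, basis, L, sigma_eps, n_steps, rng):
    """One path: deterministic basis sum(H_k * phi_k(t)) plus a centred AR(p) part."""
    lags = [0.0] * len(L)                      # most recent centred value first
    path = []
    for t in range(n_steps):
        centred = sum(l * x for l, x in zip(L, lags)) + rng.gauss(0.0, sigma_eps)
        lags = ([centred] + lags)[:len(L)]
        trend = sum(h * phi(t) for h, phi in zip(H, basis))
        path.append(trend + centred)
    return path

def shape_parameters(path):
    """Magnitude and timing of the extrema of one simulated path."""
    t_max = max(range(len(path)), key=path.__getitem__)
    t_min = min(range(len(path)), key=path.__getitem__)
    return {"max": path[t_max], "t_max": t_max, "min": path[t_min], "t_min": t_min}

def monte_carlo_shape(scenarios, probs, n_steps, n_runs, seed=0):
    """Average the shape parameters over runs with randomly drawn scenarios."""
    rng = random.Random(seed)
    totals = {"max": 0.0, "t_max": 0.0, "min": 0.0, "t_min": 0.0}
    for _ in range(n_runs):
        sc = rng.choices(scenarios, weights=probs, k=1)[0]
        path = simulate_scenario(sc["H"], sc["basis"], sc["L"], sc["sigma"], n_steps, rng)
        for key, value in shape_parameters(path).items():
            totals[key] += value
    return {key: value / n_runs for key, value in totals.items()}

# Scenario s = 1 is a linear trend; the constant-level scenario is hypothetical.
scenarios = [
    {"H": [12.0, 0.2], "basis": [lambda t: 1.0, lambda t: float(t)], "L": [0.6], "sigma": 1.0},
    {"H": [15.0], "basis": [lambda t: 1.0], "L": [0.6], "sigma": 1.0},
]
print(monte_carlo_shape(scenarios, [0.7, 0.3], n_steps=30, n_runs=200))
```

Averaging is only the simplest summary; the same loop could collect the full empirical distribution of each shape parameter.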

Identification of the Model and of Genotypes by Phenotypes
Of all 20 problems listed above, the one of greatest interest to researchers and geneticists is the identification of genotypes from observations of phenotypes. This problem is directly connected with the general task of identifying the mathematical model: without a sufficiently reliably identified model, reliable identification of the genotype is impossible, because the genotype is identified through the model. What is the difference between these two acts of identification? To clarify it, Figs. 3, 4 and 5 show block diagrams of these procedures.
Identification of the mathematical model consists in estimating all its parameters and in correcting the model structure if reliable estimates and regularity in terms of forecasting accuracy cannot be obtained with the current structure. Parameter estimation uses arbitrary changes in the external factors and the reaction of the investigated genotype to them, as reflected in the dynamics of the state vector. The quality of the estimates is verified by the accuracy of forecasting.
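The estimate-then-verify loop can be illustrated on the simplest possible case. The sketch below fits a scalar discrete-time module x[k+1] = a·x[k] + c·f[k] by ordinary least squares and then checks the one-step forecast error on data not used for fitting. The scalar model, the least-squares estimator and the synthetic data are assumptions chosen for brevity, not the authors' procedure.

```python
def fit_module(states, factors):
    """Least-squares estimate of (a, c) in x[k+1] = a*x[k] + c*f[k]."""
    sxx = sxf = sff = sxy = sfy = 0.0
    for x, f, y in zip(states[:-1], factors, states[1:]):
        sxx += x * x
        sxf += x * f
        sff += f * f
        sxy += x * y
        sfy += f * y
    det = sxx * sff - sxf * sxf
    if abs(det) < 1e-12:
        raise ValueError("changes in the external factor are not informative enough")
    a = (sxy * sff - sfy * sxf) / det       # solution of the 2x2 normal equations
    c = (sfy * sxx - sxy * sxf) / det
    return a, c

def forecast_rmse(a, c, states, factors):
    """Root-mean-square error of one-step-ahead forecasts."""
    errors = [(a * x + c * f - y) ** 2
              for x, f, y in zip(states[:-1], factors, states[1:])]
    return (sum(errors) / len(errors)) ** 0.5

# Synthetic record generated with a = 0.9, c = 0.5 (illustrative values only).
factors = [1.0, 0.0, 2.0, 1.0, 0.0, 3.0, 1.0, 2.0, 0.0, 1.0]
states = [1.0]
for f in factors:
    states.append(0.9 * states[-1] + 0.5 * f)

a_hat, c_hat = fit_module(states[:7], factors[:6])       # estimation segment
rmse = forecast_rmse(a_hat, c_hat, states[6:], factors[6:])  # verification segment
print(a_hat, c_hat, rmse)
```

If the verification error exceeded the required accuracy, the next step in the scheme described above would be to change the model structure rather than to re-tune the same parameters.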
In contrast to model identification, the task of identifying the genotype is essentially a pattern recognition problem, and it can be solved by many methods and techniques. Depending on the conditions of their implementation, these methods can be divided into two large groups. The first group of methods, which can be called deterministic, applies standard (reference) impacts to the studied plants. The set of these impacts contains all the variants of the shape parameters included in the vectors $\Pi_l$ of the models (9), (10) of the genetic-physiological systems. Naturally, to the set of reference impacts there corresponds a set of standard responses in the modules of quantitative traits, which allow the genotypes to be identified. Clearly, such methods of identification require a specialized phytotron protected from the effects of uncontrolled environmental factors, as well as a database of the standard responses of different genotypes.
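In its simplest form, deterministic identification reduces to matching an observed response against the stored standard responses. A minimal sketch, assuming that each response is a vector of quantitative-trait values measured under the same reference impact and that closeness is measured by Euclidean distance; the genotype labels and values are hypothetical.

```python
def identify_genotype(response, reference_db):
    """Return the genotype whose stored standard response is closest, and the distance."""
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    best = min(reference_db, key=lambda g: distance(response, reference_db[g]))
    return best, distance(response, reference_db[best])

# Hypothetical database of standard responses to one reference impact.
reference_db = {
    "G1": [4.2, 1.1, 0.35],
    "G2": [3.6, 1.4, 0.41],
    "G3": [5.0, 0.9, 0.30],
}
print(identify_genotype([3.7, 1.35, 0.40], reference_db))
```

In practice the traits would need to be scaled to comparable units before distances are computed, and a distance threshold could be used to reject responses that match no stored genotype.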
Closer to the real conditions of breeding are stochastic methods of recognition, in which all external factors are uncontrollable and may vary randomly. An important condition here is observability, which makes it possible to construct the a posteriori probability distributions of quantitative traits. The availability of such information allows us to apply Bayesian classification methods or the method of maximum likelihood [9,10]. However, implementing the methods of the second group requires a large amount of prior information about the state of genotypes and phenotypes and imposes strict requirements on storing all the information of the breeding process, which involves a large number of variants and replications. All this information must be combined into breeders' knowledge bases, which is not currently the case.
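A Bayesian classifier of this kind can be sketched as follows, assuming that for each genotype the distribution of every quantitative trait is Gaussian with known mean and variance and that traits are treated as independent. These distributional assumptions, the genotype labels and the numbers are illustrative only; with equal priors the rule coincides with maximum likelihood.

```python
import math

def log_gaussian(x, mean, var):
    """Log-density of a normal distribution with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def classify(phenotype, genotype_stats, priors):
    """Posterior probability of each genotype given an observed phenotype vector."""
    log_post = {}
    for g, stats in genotype_stats.items():
        log_lik = sum(log_gaussian(x, m, v) for x, (m, v) in zip(phenotype, stats))
        log_post[g] = math.log(priors[g]) + log_lik
    top = max(log_post.values())                 # subtract the max for numerical stability
    weights = {g: math.exp(lp - top) for g, lp in log_post.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}

# Hypothetical per-genotype (mean, variance) of two traits, e.g. grain mass and height.
genotype_stats = {
    "G1": [(4.0, 0.25), (90.0, 16.0)],
    "G2": [(5.0, 0.25), (80.0, 16.0)],
}
priors = {"G1": 0.5, "G2": 0.5}
print(classify([4.3, 88.0], genotype_stats, priors))
```

The means and variances used here are exactly the kind of prior information that the text says must be accumulated in breeders' knowledge bases.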

Conclusions
The basic principles of modelling "genotype-environment" interaction consist in the joint use of a dynamic model of the production process and seven models of genetic-physiological systems: attraction (providing, during the filling period, the pumping of plastic substances out of stems and leaves into the ear); micro-distribution of attracted matter between grain and chaff in cereals, husk and kernel in sunflower, etc.; adaptation (resistance to climatic and chemical stressors of the environment); polygenic resistance; susceptibility (response) to doses of soil nutrition; tolerance to density; and genetic variability of ontogenetic phases.
In this model, the production process is complemented by multi-unit output interface modules for each phase of ontogenesis. The inputs of each module are connected to the outputs of the model of the production process and of the genetic-physiological systems, which model the change of the spectra of a huge number of genes and their influence on the quantitative characters, depending on the limiting factors of the environment.
In accordance with these principles, methods are proposed and substantiated for estimating the parameters of the models of productivity and of the state of genetic-physiological systems, and for identifying genotypes by their phenotypes.