SYSTEM AND METHOD FOR SUGARCANE YIELD ESTIMATION
A combination of yield prediction models is usable to predict the yield of a crop, such as sugarcane, from land. The model combination includes at least first and/or second models. The first model may be a structured or unstructured model that models season dependent effects on yield. If structured, the first model may be a linear, non-linear, or polynomial representation. The second model may be a structured or unstructured model that models age dependent effects on yield. If structured, the second model may be a linear, non-linear, or polynomial representation. Additional models that model weather and/or soil dependent effects on yield may also be used in the model combination.
A method and/or apparatus is disclosed herein for estimating yield of a crop, such as sugarcane, from a given area, such as an acre, of land.
BACKGROUND
Sugarcane is a member of the grass family and is valued chiefly for the juices (especially sucrose) that can be extracted from its stems. The raw sugar that is produced from these juices is later refined into white granular sugar.
Sugarcane, which is the raw material for the production of sugar, is a perennial crop. One planting of sugarcane generally results in three to six annual harvests before replanting is necessary. The very first harvest after the planting is called “Plant Cane,” while the subsequent harvests before the next replanting are called “Stubble” or “Ratoon.” The first stubble or ratoon is the first harvest following the plant cane harvest, the second stubble or ratoon is the second harvest following the plant cane harvest, and so on.
As a sugarcane plant matures throughout the growing season, the weight of total sugarcane per acre increases. The production and health of a sugarcane crop depends on several factors, which vary seasonally as well as annually, and the interactions among these factors are very complex.
A typical sugar industry buys sugarcane from various farmers, and the farmers have a contract with sugar industries. Each sugar industry knows the planting date of each crop along with the plant variety for every farm. The farmers are paid based on the weight and quality of the sugarcane harvested from an individual field.
The amount of sugarcane per unit area (e.g., acre) is called yield. The yield of sugarcane depends on various factors such as plant variety (also called cultivar), maturity of the sugarcane (age from the date of planting for plant cane or from the date of last harvest for ratoon), weather conditions, soil conditions, diseases, harvesting conditions, and the amount of trash incorporated into the harvested sugarcane. This trash can be defined as the amount or quantity of leaves, tops, dead stalks, roots, soil, etc. delivered together with sugarcane.
The long-term viability of the sugar industry depends upon finding ways to produce sugar more economically through production management decisions which can reduce production costs and/or increase returns. Harvest scheduling, deciding when to harvest which sugarcane variety (or cultivar) of what age, is one such practice which has a direct impact on net farm returns. The net farm return is the total sugar (in weight) obtained from a given planting.
In this application, a methodology for predicting sugarcane yield is presented. The modeling of sugarcane yield captures the dynamic effects of vegetative growth (which depends on variety, age, weather conditions, soil condition, farming practices, etc.) during growing and harvesting seasons. This model can then be used to determine when specific sugarcane farms should be harvested in order to maximize the sugarcane yield so as to help improve net farm returns.
Yield, which is the amount of sugarcane that is harvested from the fields, is distinguished from recovery, which is the amount of sugar that is recovered from the sugarcane. A methodology has been developed in co-pending U.S. patent application Ser. No. 11/445,053 filed on Jun. 1, 2006 for the estimation of sugar recovery (e.g., amount of sugar in sugarcane). The estimation of recovery can be used along with estimated yield values to assist in daily harvest scheduling at the individual farm level so as to maximize total farm net returns.
As indicated above, sugarcane yield (the amount of sugarcane that can be harvested from a field) is affected by a combination of deterministic parameters (e.g., variety, age, and season) and stochastic parameters (e.g., weather conditions, soil type, and farming practices). Hence, modeling of the sugarcane yield is a complex task.
A method is disclosed herein to estimate sugarcane yield in a systematic way. When the nature of the relationship between the individual parameters (like yield and sugarcane age or yield and season) is known a priori, then modeling can be achieved using a compact structured representation. However, in some practical scenarios, the nature of this relationship may not be known a priori. In these latter scenarios, modeling can be achieved using unstructured (non-parsimonious) models. Based on the situation and information at hand, either or both of the models can be used for yield prediction. Both of these approaches are discussed below in detail.
Features, aspects, and advantages of the present invention will become better understood when the following detailed description is read with reference to the accompanying drawings in which:
Yield prediction may, for example, involve a training stage and a use stage. During the training stage, a model for yield prediction is generated, values for the parameters of the model are determined based on training data, and the values are inserted for the parameters in the model to complete the yield prediction model. Model generation and parameter value determination are based on equations (1)-(48) described below. During the use stage, the completed crop yield prediction model is used to predict the yield of a crop such as sugarcane.
As an example, a sample sugar processing facility may have two harvest seasons in a year. The first (or Main) season can be from December to July, while the second (or Special) season can run from August to October. In this example, the harvest data (training data) of both seasons from the last few years can be used to find the relationship between sugarcane yield, age at harvest, seasonal effects, variety, crop class or ratoon number, weather conditions, and/or soil type.
A representative set of a portion of the training data for this example over the last few years is shown in the following table.
The planting regions can be classified into different zones based on weather and soil conditions. The difference between the planting date and the harvest date indicates the sugarcane age at the time of harvest. The ratoon code P indicates Plant Cane, while the ratoon code R represents the ratoon after first harvest. Net plot area indicates the harvested area of the farm, while weight indicates the actual weight of the sugarcane harvested from the net farm area. In the main and special seasons over the last few years, there are many entries similar to those shown in the above table.
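By way of illustration only, the derivation of age and yield from one harvest entry can be sketched in Python as follows. The record layout and field names (variety, ratoon_code, planting_date, harvest_date, net_plot_area, weight) are hypothetical and are not taken from the training table; the units are likewise assumed.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class HarvestRecord:
    """One harvest load entry; all field names here are hypothetical."""
    variety: str
    ratoon_code: str       # "P" for plant cane, "R" for ratoon
    planting_date: date    # or the date of the previous harvest for a ratoon
    harvest_date: date
    net_plot_area: float   # harvested area, assumed to be in acres
    weight: float          # harvested cane weight, assumed to be in tons

    def age_days(self) -> int:
        # Age at harvest: days elapsed between planting and harvest.
        return (self.harvest_date - self.planting_date).days

    def yield_per_area(self) -> float:
        # Yield: harvested weight per unit of harvested area.
        return self.weight / self.net_plot_area


# Illustrative entry with made-up values.
record = HarvestRecord("V1", "P", date(2004, 1, 10), date(2005, 2, 8), 2.0, 80.0)
```

In this sketch, age is the plain difference in days; an inclusive count would add one day.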
The overall yearly distribution (considering the last few years of harvest data) of the harvest load (weight percentage) as a function of harvest month is plotted in
With this information in hand, the sugarcane loads may be classified in a number (such as 23 as shown in
It can be seen from the age distribution diagram of
The example training data may indicate a usage of different sugarcane varieties as shown in
Sugarcane yield is a function of various factors or effects such as variety (cultivar), age of sugarcane loads at harvest, seasonal conditions, soil type, and/or weather conditions. Weather conditions include rainfall, maximum temperature, difference between maximum and minimum temperature, and/or relative humidity. Seasonal conditions are captured herein using julian date, which is a numerical equivalent of the date. August 1 is julian date 1 and the following July 31 is julian date 365 (or 366 for leap year). However, the selection of August 1 as the reference julian date is arbitrary and is merely used for the sake of illustration of growing conditions in a region such as south India. This reference date can change for the particular country, or its part, which is being modeled, and is based primarily on the weather cycle.
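As a minimal sketch of the julian date convention described above, the following Python function numbers the days of a crop year starting from a configurable reference date. The function name and its parameters are illustrative assumptions; the default of August 1 mirrors the example reference date only.

```python
from datetime import date


def season_julian_date(day: date, ref_month: int = 8, ref_day: int = 1) -> int:
    """Day count within the crop year, with the reference date numbered 1."""
    ref = date(day.year, ref_month, ref_day)
    if day < ref:
        # Before this year's reference date: count from last year's.
        ref = date(day.year - 1, ref_month, ref_day)
    return (day - ref).days + 1
```

A different region would pass its own ref_month and ref_day, chosen from the local weather cycle.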
Modeling all desired effects in a single model in one step makes it difficult to determine the contributions or exact relationships between yield and the individual effects. Therefore, an individual approach, for example, may be followed to develop the mathematical formulation (model) for sugarcane yield prediction. This individual modeling approach is shown in
This individual modeling approach starts with inputting harvest and production training data at a block 10. This data includes data on sugarcane yield by variety, by age, by seasonal date (e.g., Julian Date), and also includes a yield database for other effects such as weather, soil, and/or irrigation conditions. The weather effects, for example, may include the effects of rainfall, maximum temperature, difference between maximum and minimum temperature, and/or relative humidity by actual date.
As is shown in
Based on these analyses, variety wise season and age modeling and parameter estimation are performed at 26 to produce an initial sugarcane yield model. This model is refined at 28 based on weather effects 30 and soil effects 32. As shown in
The equations and associated discussion below illustrate an example of one manner in which the flow of
It should be understood from
The independent models so obtained for julian date and age effects are then merged together to produce a combined model which considers variety, julian date, and age effects while predicting sugarcane yield. An independent dynamic model is obtained at 28 for weather effects, such as rainfall, maximum and minimum temperature information, and/or relative humidity, on sugarcane yield. The dynamic weather model so obtained is then combined with the static combined model for julian date and age effects.
Finally, a soil model is developed considering the soil type of the farm based on soil nutrients and/or irrigation data specific to the farm.
The unified model 34 produced at 28 is obtained in the final step by combining all of the above mentioned effects.
Modeling Season (Julian Date) Effect
As discussed above, each effect on sugarcane yield is modeled independently of the other effects. As a first step, the seasonal effect on sugarcane yield is captured. The seasonal effect is represented in the form of julian date, although other representations could be used. In order to understand the seasonal effect on sugarcane yield, variations in sugarcane yield are first analyzed. In order to analyze variations in sugarcane yield due to the seasonal effect, the age of the sugarcane is fixed, such as to the 390-400 day range, in order to minimize the age effect on sugarcane yield variations, as indicated by the constant age notation in
The same exercise is repeated by considering variety specific data. Generally, it has been found that the trend in the overall sugarcane yield variation with respect to harvest month so as to capture the seasonal effect is substantially the same as the trend in sugarcane yield variations of at least the dominant varieties with respect to harvest month. However, variety wise, the sugarcane yield versus harvest data curves can move up (for a rich sugarcane yield variety) or down (for a poor sugarcane yield variety).
The relationship between the average sugarcane yield and harvest month (expressed in terms of julian date) is generally polynomial in nature. The following equation captures the polynomial relationship between sugarcane yield and julian date:
$$\hat{y}^{JD}_{v,d} = \alpha^{y}_{v,p}(JD_d)^{p} + \alpha^{y}_{v,p-1}(JD_d)^{p-1} + \dots + \alpha^{y}_{v,1}(JD_d)^{1} + \alpha^{y}_{v,0} \qquad (1)$$
for all v and d, where d represents harvesting day, v represents variety, p represents the order of the polynomial, $\hat{y}^{JD}_{v,d}$ is a variable representing the predicted sugarcane yield for variety v on day d because of only the julian date effect, $\alpha^{y}_{v,p}$ is a parameter for variety v to model the julian date effect for a polynomial of order p on sugarcane yield, and $JD_d$ is a parameter representing the julian date for harvesting day d.
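For illustration, the polynomial of equation (1) can be evaluated for a single variety as sketched below in Python. The function name and the ordering of the coefficient list are assumptions made for this sketch, and the coefficient values used in any call are placeholders rather than fitted parameters.

```python
from typing import Sequence


def seasonal_yield(julian_date: float, alpha: Sequence[float]) -> float:
    """Evaluate equation (1) for one variety.

    alpha lists the coefficients from the highest order p down to order 0.
    """
    result = 0.0
    for coefficient in alpha:
        # Horner's rule: fold in one coefficient per step.
        result = result * julian_date + coefficient
    return result
```

Horner's rule is used here only to avoid computing large powers of the julian date explicitly; it yields the same value as the expanded form of equation (1).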
Modeling Age Effect
The relationship between the average sugarcane yield and age of the sugarcane is generally quadratic in nature. Typically, sugarcane yield increases with the age of the sugarcane as maturity adds mass to the sugarcane. However, this trend reverses after a certain age, as evaporation dries out the sugarcane mass. This domain understanding also suggests the quadratic relationship between sugarcane yield and its age at harvest.
Further, to confirm this understanding, the variation of average sugarcane yield (considering all the varieties) as function of age of the sugarcane can be studied with the training data. As indicated above, the training data is actual data accumulated with respect to past harvests and is used as described herein to determine the parameters of the sugarcane yield prediction models.
It is important that this variation in average sugarcane yield as a function of age should only be due to the age effect so that the underlying conclusion related to this relationship is unbiased. Therefore, the analysis of the effect of age on sugarcane yield should be carried out for those sugarcane entries which are harvested in the same month of a given year, as indicated by the constant harvest month notation in
The quadratic relationship between sugarcane yield and the age effect is given in the following equation:
$$\hat{y}^{A}_{v,a} = c^{y}_{v}\,a^{2} + d^{y}_{v}\,a + e^{y}_{v} \qquad (2)$$
for all v and a, where a represents age group, $c^{y}_{v}$, $d^{y}_{v}$, and $e^{y}_{v}$ are parameters for variety v to model the age effect on sugarcane yield, and $\hat{y}^{A}_{v,a}$ represents the predicted yield for age a of variety v because of the age effect only. The quadratic nature of the age effect on yield of the crop is illustrative only, and the relationship can be a polynomial of higher order.
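A minimal Python sketch of the quadratic age model of equation (2) follows. The function names are hypothetical. The peak age helper relies only on the standard vertex formula for a quadratic and assumes that the leading coefficient is negative, which corresponds to yield rising with age and then declining.

```python
def age_yield(age: float, c: float, d: float, e: float) -> float:
    """Evaluate equation (2): c * age**2 + d * age + e."""
    return c * age * age + d * age + e


def peak_age(c: float, d: float) -> float:
    """Age at which the quadratic of equation (2) reaches its maximum."""
    if c >= 0:
        raise ValueError("a maximum exists only when c is negative")
    return -d / (2.0 * c)
```

With placeholder values c = -0.001 and d = 0.8, for example, the modeled yield peaks at an age of about 400 days.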
Combined Modeling of Julian Date and Age
Above, the individual seasonal and age effects were modeled separately. Such models are useful for analyzing the impact of the individual effects. However, to make these models better suited to an optimization framework, they can be combined in order to address both the seasonal and age effects simultaneously. The combined model is given by the following equation:
$$\hat{y}^{(JD,A)}_{v,d,a} = \hat{y}^{JD}_{v,d} + \hat{y}^{A}_{v,a} + \delta^{y}_{v} \qquad (3)$$
for all v, d, and a, where $\hat{y}^{(JD,A)}_{v,d,a}$ is the predicted yield for age a of variety v on day d combining julian date and age effects, $\hat{y}^{JD}_{v,d}$ is given by equation (1), $\hat{y}^{A}_{v,a}$ is given by equation (2), and $\delta^{y}_{v}$ is a bias term for variety v to model the julian date and age effects on yield. The individual values of $\hat{y}^{JD}_{v,d}$ and $\hat{y}^{A}_{v,a}$ are such that they each are close to the actual yield value of the sugarcane for the given variety, julian date, and age group. Hence, the combined sum of these two terms falls in the range of twice the actual yield value. The bias term $\delta^{y}_{v}$ adjusts $\hat{y}^{(JD,A)}_{v,d,a}$ to the realistic range.
Using equations (1) and (2), equation (3) can be expanded as given by the following equation:
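$$\hat{y}^{(JD,A)}_{v,d,a} = \alpha^{y}_{v,p}(JD_d)^{p} + \alpha^{y}_{v,p-1}(JD_d)^{p-1} + \dots + \alpha^{y}_{v,1}(JD_d)^{1} + c^{y}_{v}\,a^{2} + d^{y}_{v}\,a + \gamma^{y}_{v} \qquad (4)$$
for all v, d, and a, where
$$a = d - \mathit{pd} + 1 \qquad (5)$$
and $\gamma^{y}_{v}$ is an aggregation of all constant terms as given by the following equation:
$$\gamma^{y}_{v} = \alpha^{y}_{v,0} + e^{y}_{v} + \delta^{y}_{v} \qquad (6)$$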
for all v.
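The combined model of equations (4) to (6) can be sketched in Python as follows. The function signature is an assumption made for illustration: planting_day is taken to be the quantity subtracted in equation (5), alpha holds only the non-constant seasonal coefficients, and gamma is the aggregated constant of equation (6). All numeric values passed to it would be placeholders until fitted.

```python
from typing import Sequence


def combined_yield(julian_date: float, harvest_day: int, planting_day: int,
                   alpha: Sequence[float], c: float, d: float,
                   gamma: float) -> float:
    """Evaluate equation (4) for one variety.

    alpha lists the seasonal coefficients from order p down to order 1;
    the order-0 coefficient is already folded into gamma by equation (6).
    """
    age = harvest_day - planting_day + 1  # equation (5)
    seasonal = 0.0
    for coefficient in alpha:
        # Horner's rule without a constant term.
        seasonal = (seasonal + coefficient) * julian_date
    return seasonal + c * age * age + d * age + gamma
```

Keeping the constant terms in a single gamma avoids fitting three constants that the training data cannot distinguish from one another.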
The model represented by equations (4) and (6) may be fitted to production and harvest training data during the training or modeling phase in order to estimate optimal values for the parameters $\alpha^{y}_{v,p}$ to $\alpha^{y}_{v,1}$, $c^{y}_{v}$, $d^{y}_{v}$, and $\gamma^{y}_{v}$. The estimation problem is solved as an optimization problem. The optimization problem is stated as an objective function by the following equation:
where NE represents the total number of harvest load entries containing the training data (i.e., the size of the data set representing all harvest load entries in the harvest training database). It should be noted that this modeling scheme is more generic and does not fix the sugarcane age and harvest month as was done in connection with equations (1) and (2). The value εnabs represents the absolute error between predicted yield and actual yield and is constrained as given by the following inequalities:
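$$\varepsilon^{abs}_{n} \ge Y_n - \hat{y}^{(JD,A)}_{n} \qquad (8)$$
$$\varepsilon^{abs}_{n} \ge -\left(Y_n - \hat{y}^{(JD,A)}_{n}\right) \qquad (9)$$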
for all entries n, and where Yn is the actual yield for entry n in the training database, ŷn(JD,A) is the predicted yield for harvest load entry n using julian and age effects, and
for all n, and where Nv is the set of varieties, NVn,v is a binary matrix indicating to which variety v harvest load entry n belongs, such that NVn,v is equal to one when the harvest load of entry n belongs to variety v and otherwise is equal to zero, and
for all v, where An is a parameter indicating the age of the harvest entry n at harvest, HDn is a parameter indicating the harvest date of harvest entry n, PDn is a parameter indicating the planting date of harvest entry n, ŷv,HD
$$\varepsilon^{abs}_{n} \le Y_n \qquad (13)$$
for all n.
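The role of inequalities (8) and (9) can be illustrated with a short Python sketch. The smallest value that satisfies both inequalities is the absolute value of the residual, which is why minimizing the error variables in a linear program minimizes absolute error. The function names are hypothetical, and the sketch assumes, for illustration only, that the objective accumulates the errors as a plain sum over all entries.

```python
from typing import Sequence


def tightest_abs_error(actual: float, predicted: float) -> float:
    """Smallest epsilon satisfying inequalities (8) and (9) for one entry."""
    residual = actual - predicted
    # (8) requires epsilon >= residual; (9) requires epsilon >= -residual.
    return max(residual, -residual)


def total_abs_error(actuals: Sequence[float],
                    predictions: Sequence[float]) -> float:
    """Assumed objective: sum of the per-entry absolute errors."""
    return sum(tightest_abs_error(y, y_hat)
               for y, y_hat in zip(actuals, predictions))
```

Because both inequalities are linear in the model parameters, the fitting problem stays a linear program even though the error measure is an absolute value.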
A few additional linear programming tightening constraints obtained by using the domain knowledge about the relationship between age and yield are given as follows:
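$$Y_n - 5 \le \hat{y}^{(JD,A)}_{n} \le Y_n + 3 \quad \forall n : (A_n \le 340 \text{ or } A_n \ge 440) \qquad (14)$$
$$Y_n - 3 \le \hat{y}^{(JD,A)}_{n} \le Y_n + 5 \quad \forall n : A_n \in \{341, \dots, 439\} \qquad (15)$$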
The numbers 3, 5, 340, and 440 are illustrative only and may change depending on geography and the training data selected for creating the models described herein.
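As a sketch of how the tightening constraints (14) and (15) select a permitted prediction window, the following Python function returns the lower and upper bounds for one entry. The function and parameter names are hypothetical, and the default values simply repeat the illustrative numbers 3, 5, 340, and 440.

```python
from typing import Tuple


def prediction_bounds(actual_yield: float, age_days: int,
                      narrow: float = 3.0, wide: float = 5.0,
                      low_age: int = 340, high_age: int = 440
                      ) -> Tuple[float, float]:
    """Permitted (lower, upper) range for the predicted yield of one entry."""
    if age_days <= low_age or age_days >= high_age:
        # Constraint (14): wider margin below the actual yield.
        return actual_yield - wide, actual_yield + narrow
    # Constraint (15): wider margin above the actual yield.
    return actual_yield - narrow, actual_yield + wide
```

For an actual yield of 40, an entry harvested at 300 days is bounded to the range 35 to 43, while an entry harvested at 400 days is bounded to the range 37 to 45.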
It should be noted that the constraints given by equations (13) to (15) are optional constraints. However, these constraints help make the optimization search space more compact. The following additional constraints may be imposed:
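$$\text{(lower bound)} \le \alpha^{y}_{v,p} \le \text{(upper bound)} \qquad (16)$$
$$\text{(lower bound)} \le c^{y}_{v} \le \text{(upper bound)} \qquad (17)$$
$$\text{(lower bound)} \le d^{y}_{v} \le \text{(upper bound)} \qquad (18)$$
$$\text{(lower bound)} \le \gamma^{y}_{v} \le \text{(upper bound)} \qquad (19)$$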
for all v and p. The upper and lower bounds on the above parameters in equations (16) to (19) can be obtained using the results from modeling the julian date and age effects separately. It should be noted that these ranges on parameters are very specific to the training data used for modeling and need to be pre-estimated for other industries' harvest data, as sugarcane is a weather sensitive crop. The linear optimization problem with the objective function given by equation (7), subject to the constraints given by equations (8)-(19), is solved to estimate the optimal values for the parameters.
Once the optimal parameter values are computed, the residual error is calculated according to the following equation:
$$err_n = Y_n - \hat{y}^{(JD,A)}_{n} \qquad (20)$$
for all n. Assuming that a residual analysis is tabulated using a set of training data, it will be apparent that there can be significant modeling errors in the combined (julian and age) model predictions and that additional variables like weather and/or soil effects need to be considered to improve the model and thereby the quality of predictions.
Modeling Weather Effect
As discussed earlier, sugarcane yield is influenced by stochastic effects like weather and/or soil conditions. The weather effect is highly complex and poorly characterized in practice. It comprises rainfall, temperature, humidity, wind, and/or sunshine related effects. These individual effects have a dynamic impact on sugarcane yield. For example, sugarcane yield is dependent on the pattern of rainfall on the sugarcane crop throughout its lifetime. Therefore, weather related effects should be modeled within a dynamic framework.
The weather model captures a significant amount of the residual error $err_n$ (of the combined model of julian date and age effects) using weather information such as rainfall, maximum temperature, and the difference between maximum and minimum temperatures (delta temperature). Although only temperature and rainfall data are used herein to model weather effects, the weather model, in general, can be more generic, comprising other variables such as humidity, sunshine hours, etc.
It may be assumed that the yield contribution due to weather related effects on harvest day d in planting zone z is denoted as ŷd,zW. As can be seen, the weather contribution factor is a function of the harvest date (d) of the sugarcane crop. Once the harvest date of the crop is known, the related weather information experienced by the sugarcane crop prior to harvest can be computed and put in the model. The yield contribution ŷd,zW can be modeled in accordance with the following equation:
$$\hat{y}^{W}_{d,z} = \hat{y}^{RF}_{d,z} + \hat{y}^{MT}_{d,z} + \hat{y}^{\Delta T}_{d,z} \qquad (21)$$
for all d and z. The first term on the right hand side of equation (21) is the rain fall (RF) model that considers past rainfall information (such as rainfall in the last eight months), while the second and last right hand terms indicate dynamic models for maximum and delta temperatures which consider past temperature effects (such as temperature effects in the last six months). Equation (21) also expresses weather variation across different planting zones z.
Modeling Rainfall Effect
The rainfall model is dynamic in nature and can comprise, for example, two terms as given in the following equation:
ŷd,zRF=ŷd,zRF
for all d and z.
The first term ŷd,zRF
for all d and z, where rfiy is a parameter useful in modeling the rainfall effect on yield, and RFd,z is the rainfall on day d in zone z. The groupings used in equation (23) may be different in number and size. There are six rainfall parameters rfiy in equation (23), which will be determined while predicting the effect on yield of rainfall over the last two months (in slots of 10 days each).
The second term in equation (22) captures the effect of rainfall during the last 61 to 240 days in groups of 30 days. Hence, there are six distinct groups. This second term is given by the following equation:
for all d and z. The groupings used in equation (24) also may be different in number and size.
Hence, equations (23) and (24) over a past period of time (such as 8 months) consider the rainfall effect on yield. There are a total of twelve parameters rfiy in the dynamic rainfall model of equations (23) and (24) to predict the effect of rainfall on yield.
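The grouping of past rainfall into twelve slots can be sketched in Python as shown below. The function names and the input layout are hypothetical. The sketch assumes that the rainfall within each slot is aggregated by a plain sum and that each slot total is weighted by one rainfall parameter; the aggregation actually used within a slot may differ.

```python
from typing import List, Sequence


def rainfall_groups(rain_before_harvest: Sequence[float]) -> List[float]:
    """Twelve slot totals of rainfall preceding the harvest day.

    rain_before_harvest[k - 1] is the rainfall k days before harvest,
    for k = 1 to 240.
    """
    if len(rain_before_harvest) < 240:
        raise ValueError("need at least 240 days of rainfall history")
    groups = []
    for start in range(0, 60, 10):      # days 1-60 in six 10-day slots
        groups.append(sum(rain_before_harvest[start:start + 10]))
    for start in range(60, 240, 30):    # days 61-240 in six 30-day slots
        groups.append(sum(rain_before_harvest[start:start + 30]))
    return groups


def rainfall_contribution(rain_before_harvest: Sequence[float],
                          rf: Sequence[float]) -> float:
    """Weighted sum of the slot totals, one parameter per slot."""
    groups = rainfall_groups(rain_before_harvest)
    if len(rf) != len(groups):
        raise ValueError("expected one rainfall parameter per slot")
    return sum(weight * total for weight, total in zip(rf, groups))
```

Finer slots close to the harvest date and coarser slots further back let recent rainfall be resolved in more detail with only twelve parameters.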
Modeling Maximum Temperature Effect
The model to predict the effect of maximum temperature on yield is given by way of example by the following equation:
ŷd,zMT=yd,zMT
for all d and z, where ŷd,zMT is the maximum temperature dependent yield on day d in zone z.
The first term ŷd,zMT
for all d and z, and where mtiy are parameters to model the effect of maximum temperature on yield, and MTd,z is the maximum temperature in zone z on day d.
The second term ŷd,zMT
for all d and z.
Hence, there are in total 10 parameters mtiy (six from equation (26) and four from equation (27)) in dynamically modeling the effect of maximum temperature on yield prediction. The groupings used in equation (26) and (27) also may be different in number and size.
Modeling Delta Temperature Effect
The dynamic model to capture the effect of delta temperature on yield is very similar to that used for modeling the maximum temperature effect. Hence, the dynamic model to capture the effect of delta temperature on yield is given by the following equation:
ŷd,zΔT=ŷd,zΔT
for all d and z.
The first term ŷd,zΔT
for all d and z, and where δtiy are parameters to model the effect of temperature difference on yield, and ΔTd,z is the temperature difference in zone z on day d.
The second term ŷd,zΔT
for all d and z.
Hence, there are in total 10 parameters δtiy (six from equation (29) and four from equation (30)) in dynamically modeling the effect of delta temperature on yield prediction. The groupings used in equation (29) and (30) also may be different in number and size.
The combined dynamic model to predict the effect of weather conditions on yield comprises a total of 32 parameters (12 for rainfall and 10 each for maximum temperature and delta temperature). The weather model represented by equations (21) to (30) is variety independent. In other words, it assumes that all the varieties show similar sensitivity to weather conditions. However, it is straightforward to develop a weather model that considers variety dependency in a similar manner.
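The parameter bookkeeping of the weather model can be sketched in Python as follows. The names are hypothetical. The variety dependent case assumes, purely for illustration, that a variety dependent model would carry one full parameter set per variety, which is only one possible way of adding variety dependency.

```python
# Parameter counts stated for the dynamic weather model.
WEATHER_PARAMETERS = {
    "rainfall": 6 + 6,           # equations (23) and (24)
    "max_temperature": 6 + 4,    # equations (26) and (27)
    "delta_temperature": 6 + 4,  # equations (29) and (30)
}


def weather_parameter_count(num_varieties: int = 1) -> int:
    """Total weather parameters.

    num_varieties > 1 models the hypothetical variety dependent
    extension with one parameter set per variety.
    """
    if num_varieties < 1:
        raise ValueError("num_varieties must be at least 1")
    return num_varieties * sum(WEATHER_PARAMETERS.values())


def weather_contribution(rain_term: float, max_temp_term: float,
                         delta_temp_term: float) -> float:
    """Equation (21): sum of the three weather contributions."""
    return rain_term + max_temp_term + delta_temp_term
```

Under that assumption, the parameter count grows linearly with the number of varieties, so a variety dependent model needs correspondingly more training data per variety.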
Modeling Combined Effects of Julian Date, Age and Weather
The weather model of equation (21) is merged with the combined model of the effects of julian date and age, and the optimal values for all parameters (julian date, age, and weather model parameters) of the global model are obtained using an optimization framework. This combined model is given by the following equation:
for all harvest load entries n in the training database, where ŷn(JD,A,W) is the predicted yield of harvest load entry n using the julian date, age, and weather effects, Nz is the set of all zones, and NZn,z is a binary matrix indicating to which zone z farm a harvest load entry n belongs.
The linear optimization problem presented by the objective function given by equation (7) is solved, using the sample training data to estimate the optimal values for all parameters, by subjecting the objective function to the constraints given by equations (8)-(19), by replacing ŷn(JD,A) with ŷn(JD,A,W) and by replacing equation (10) with the following equation:
for all n.
Once the optimal parameter values are computed as described above, the residual error is calculated in accordance with the following equation:
$$err_n = Y_n - \hat{y}^{(JD,A,W)}_{n} \qquad (33)$$
for all n. Assuming that a residual analysis is tabulated using the set of training data, it will be apparent that there can be modeling errors in the combined (julian and age and weather) model predictions and that an additional variable for soil effects can be considered to improve the model and thereby the quality of predictions.
Modeling Soil and Irrigation Effects
The sugarcane yield of a farm will naturally depend on its soil quality, irrigation, and farming practices adopted by the farmer. Soil quality represents the quantum of nutrients (like Nitrogen (N), Phosphorus (P), Potassium (K), etc.) available in the soil. Irrigation practices represent the availability of water for the farm field. Farming practices are related to the practices adopted by the farmer at various stages, varying from seed sowing to crop harvest, and include, for example, seed quality, sowing and harvesting methods, fertilizers, pesticides, etc.
Based on these practices, the training data relating to sugarcane yields can be classified into a number of different zones using a sample sugarcane variety of fixed age and fixed harvest month. It can be concluded from such training data that the selected zones represent a gross level classification of soil types. In a zone, various kinds of sugarcane yields can be produced. These variations are related to different farming and irrigation practices.
Training data will show that soil, farming, and irrigation related effects are difficult to quantify and are complex in nature. Therefore, these effects can be modeled in a rule based manner. Based on soil, farming, and irrigation training data, the soil effects can be ranked from rich soil to poor soil. Rich soil indicates an improved sugarcane yield over the average sugarcane yield predicted by the above models, and poor soil indicates sugarcane yield that is under the average sugarcane yield.
The gradations can be made with ranks varying from, for example, one to ten, where one indicates poor quality sugarcane yield (adverse soil and related effects) and ten represents the best possible sugarcane yield (favorable soil and related effects).
Corresponding to each soil rank, a contribution factor is assigned so as to amend the above models in accordance with the following equation:
$$\hat{y}_{n} = \hat{y}^{(JD,A,W)}_{n} + \hat{y}^{S}_{n} \qquad (34)$$
for all n, where ŷnS predicts the sugarcane yield contribution due to soil effects for harvest load entry n, and where
for all n, where NST is the set of all soil types, NSn,st is a binary matrix indicating to which soil type st the farm of load entry n belongs and which is equal to one when the soil of the farm corresponding to entry n belongs to soil type st and is otherwise zero, and δv,stS is the sugarcane yield contribution factor of variety v for soil type st (representing soil and irrigation effects). Because the soil and irrigation effect related contribution factor δv,stS is dependent on the farm and plant variety, different values of the contribution factor can be obtained for the same farm but planted with different varieties. Optimal values of δv,stS can be obtained using soil nutrients, irrigation, and farming practice related data of a field (i.e. domain knowledge), or can be obtained using optimization techniques.
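A minimal Python sketch of the soil adjustment of equation (34) follows. It assumes that, because the binary indicators select exactly one variety and one soil type per entry, the soil contribution reduces to a lookup of the factor for that pair. The function name, the soil type labels, and the factor values are hypothetical placeholders.

```python
from typing import Dict, Tuple


def soil_adjusted_yield(base_prediction: float, variety: str, soil_type: str,
                        soil_factor: Dict[Tuple[str, str], float]) -> float:
    """Equation (34): add the soil contribution for one harvest entry.

    soil_factor maps (variety, soil type) to a contribution factor,
    positive for rich soil and negative for poor soil.
    """
    return base_prediction + soil_factor[(variety, soil_type)]


# Placeholder factors; real values would come from domain knowledge
# or from the optimization described above.
example_factors = {("V1", "rich"): 2.5, ("V1", "poor"): -3.0}
```

Keying the table on both variety and soil type reflects the statement that the same farm can receive different contribution factors when planted with different varieties.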
The Unified Model for Yield Prediction
The soil model can be combined with the combined julian date, age, and weather model. Hence, the linear optimization problem with the objective function given by equation (7) and the constraints given by equations (8)-(19) is given by the following equations:
for all n, and where the unified model is given by the following equation:
for all n.
The set of constraints still includes the constraints given by equations (11)-(19) along with the parameter ranges calculated as discussed above. When the unified model (aggregating all effects) is applied to the sample harvest and production data during training, the non-modeled variation (the error between predicted and actual sugarcane yield) falls within reasonable limits. It is concluded that the unified model, comprising julian date, age, weather, and soil related effects, accurately predicts sugarcane yield. The non-modeled variation remaining after the unified model is applied is attributed to complex effects such as plant diseases, sunshine hours, etc.
Unstructured Modeling of Sugarcane Yield
As indicated above, the relationship between sugarcane yield, age at harvest, and harvest date (defining the seasonal effects) was developed by applying structured models. The relationship between sugarcane yield and its age at harvest was assumed to be a second order polynomial relationship, and the relationship between sugarcane yield and harvest julian date was assumed to be a polynomial relationship of order p. However, in actual practice, the order of the polynomial relationship may not be known a priori. Moreover, it may not be a good practice to assume a very high polynomial order because such an assumption may lead to overfitting of the data. These drawbacks can be addressed by assuming an unstructured yield model rather than the structured polynomial yield model discussed above.
For example, the structured yield model taking into account the effects of julian date and age, ignoring the bias term, is derived from equation (3) and is given by the following equation:
$$\hat{y}^{(JD,A)}_{v,d,a} = \hat{y}^{JD}_{v,d} + \hat{y}^{A}_{v,a} \qquad (40)$$
for all v, d, and a.
For an unstructured yield model, the age effect captured by the term $\hat{y}^{A}_{v,a}$ in the model of equation (40) can be replaced by an unstructured age effect model given by an age term $\delta^{A}_{v,a}$. The age term $\delta^{A}_{v,a}$ indicates the deviation from nominal yield of variety v of age a. The nominal yield value is for a particular harvest age (such as 300 days) and can be different for different harvest dates as indicated by a julian date term $\eta^{JD}_{v,d}$ indicating the nominal yield of variety v on harvest day d. The nominal yield value $\eta^{JD}_{v,d}$ captures the seasonal (julian date) effects in an unstructured way. It should be noted that the term $\delta^{A}_{v,a}$ can take on negative values. The unstructured model given by the following equation results:
$$\hat{y}^{(JD,A)}_{v,d,a} = \eta^{JD}_{v,d} + \delta^{A}_{v,a} \qquad (41)$$
for all v, d, and a.
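The unstructured model of equation (41) amounts to two lookup tables rather than two polynomials, as the following Python sketch illustrates. The function name, the table keys, and the numeric values are hypothetical placeholders and are not fitted results.

```python
from typing import Dict, Tuple


def unstructured_yield(variety: str, harvest_day: int, age_group: str,
                       nominal: Dict[Tuple[str, int], float],
                       deviation: Dict[Tuple[str, str], float]) -> float:
    """Equation (41): nominal yield for the day plus the age deviation."""
    return (nominal[(variety, harvest_day)]
            + deviation[(variety, age_group)])


# Placeholder tables with one free value per (variety, day) and
# per (variety, age group); deviations may be negative.
example_nominal = {("V1", 120): 38.0}
example_deviation = {("V1", "390-400"): -1.5, ("V1", "300-310"): 0.0}
```

Each table entry is a free parameter, so the shape of the seasonal and age relationships is determined by the training data instead of by a preselected polynomial order.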
The advantage of the above modeling representation is that the order of the model polynomial is not required to be known a priori. Instead, the data guides the nature of the relationship. In order to compute the optimal values of $\eta_{v,d}^{JD}$ and $\delta_{v,a}^{A}$ in equation (41), the following optimization equation can be used:
[Equation (42), the objective function, is missing from the source text.]
for all v, d, and a, where $N_A$ is the set of all age groups and $NE_{v,d,a}$ is the number of entries in the training data for variety v on day d for age group a. The following equation expresses the relationship between $NE_{v,d,a}$ and NE:
[Equation missing from the source text.]
The error $\varepsilon_{v,d,a}^{abs}$ is the difference between the average yield [remainder of sentence and equation missing from the source text]
for all v, d, and a.
The deviation in yield from its nominal value as indicated by $\delta_{v,a}^{A}$ in equation (41) may be bounded as follows:
[Equation missing from the source text.]
where [definition missing from the source text]
The predicted average yield [symbol and equation missing from the source text]
for all v and d, where $AE_{v,d,a}$ represents the total area of the loads of variety v harvested on day d for an age range given by group a. In other words, it is the total area of the entries counted by $NE_{v,d,a}$. These predicted average yield values [remainder of sentence missing from the source text; the surviving fragment of the constraint contains the factor (0.85)]
for all v and d, where [definition missing from the source text]
The optimization problem defined by equations (41)-(48), with the objective function given by equation (42), is solved using the training data in order to determine the optimal values for $\eta_{v,d}^{JD}$ and $\delta_{v,a}^{A}$. Once the model for the predicted yield $\hat{y}_{v,d,a}^{(JD,A)}$ is established using this approach, the weather and soil models can be combined with it in the same manner as described above to obtain the unified model.
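The fitting step can be illustrated in code, with an important caveat: the exact objective function and constraints of equations (42)-(48) are not available in this text, so the sketch below swaps in a different, simpler technique. It minimizes a weighted squared error by alternating updates of the seasonal and age terms, pins the deviation of a reference age group to zero, and optionally clamps each deviation to a symmetric bound. The function `fit_unstructured`, its parameters (`records`, `ref_age`, `delta_bound`, `sweeps`), the use of entry counts as weights, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def fit_unstructured(records, ref_age, delta_bound=None, sweeps=200):
    """Fit eta[d] and delta[a] so that eta[d] + delta[a] approximates yield.

    records:     iterable of (harvest_day, age_group, mean_yield, n_entries).
    ref_age:     age group whose deviation is pinned to 0 (the nominal age).
    delta_bound: optional cap on |delta[a]| (an assumed form of the bound).
    sweeps:      number of alternating update passes.
    """
    records = list(records)
    eta = {d: 0.0 for d, _, _, _ in records}
    delta = {a: 0.0 for _, a, _, _ in records}
    for _ in range(sweeps):
        # Update each day's nominal yield given the current age deviations.
        num, den = defaultdict(float), defaultdict(float)
        for d, a, y, n in records:
            num[d] += n * (y - delta[a])
            den[d] += n
        for d in eta:
            eta[d] = num[d] / den[d]
        # Update each age deviation given the current nominal yields.
        num, den = defaultdict(float), defaultdict(float)
        for d, a, y, n in records:
            num[a] += n * (y - eta[d])
            den[a] += n
        for a in delta:
            if a == ref_age:
                continue  # the reference age keeps a deviation of zero
            value = num[a] / den[a]
            if delta_bound is not None:
                value = max(-delta_bound, min(delta_bound, value))
            delta[a] = value
    return eta, delta


# Hypothetical training summary for one variety: (day, age group, yield, count).
sample = [
    (1, "young", 28.0, 1), (1, "ref", 30.0, 1), (1, "old", 33.0, 1),
    (2, "young", 32.0, 1), (2, "ref", 34.0, 1), (2, "old", 37.0, 1),
]
eta_fit, delta_fit = fit_unstructured(sample, ref_age="ref")
```

Pinning the reference age group removes the ambiguity that would otherwise allow a constant to be shifted between the seasonal and age terms. The alternating updates converge more slowly as the number of age groups grows, so `sweeps` may need to be increased for larger problems.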
Thus, the unified model may be used to predict the yield trends for the dominant crop varieties. The yield trends, which express average predicted yield as a function of harvest age or harvest month, are of special interest for validating the results against domain knowledge. This analysis also helps in understanding the relative ranking of the varieties with respect to sugarcane yield. Because the unified model is modular, it can also be used to understand the relative contributions of age and seasonal effects to the yield of a given variety.
According to the program 50, the effects on yield of the harvest season are modeled at 52 in accordance with equation (1) and the training data relating yield to plant variety and harvest date stored in a database 54. Also, the effects on yield due to age of the crop at the time of harvest are modeled at 56 in accordance with equation (2) and the training data relating yield to plant variety and age stored in the database 54. These models produced at 52 and 56 are combined at 58 in accordance with equations (3)-(20) to produce a model that is capable of predicting yield based on the effects of harvest season and crop age at harvest.
As indicated above, the modeling performed at 52, 56, and 58 is based on a structured representation of the season and age effects on yield and, in particular, a polynomial representation of the season and age effects. Alternatively, the modeling performed at 52, 56, and 58 can instead be based on an unstructured yield model rather than a structured yield model. Accordingly, the modeling performed at 52, 56, and 58 can instead be based on equations (41)-(48).
The combined model (structured or unstructured) produced at 58 does not capture other effects on yield. Accordingly, at 60, the effects of weather on yield are modeled in accordance with equations (21)-(30) and the training data relating yield to harvest date and zone stored in the database 54. At 62, the model produced at 60 is combined with the combined model produced at 58 in accordance with equations (31)-(33) to produce a model that is capable of predicting yield based on the effects of harvest season, crop age, and weather.
The combined model produced at 62 may still fail to capture other effects on yield. Accordingly, at 64, the effects of soil on yield are modeled in accordance with equations (34)-(35) and the training data relating yield to soil type and plant variety stored in the database 54. The model produced at 64 is combined at 66 with the combined model produced at 62 in accordance with equations (36)-(39) to produce a global model that is capable of predicting yield based on the effects of harvest season, crop age, weather, and soil conditions.
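The staged flow through steps 58, 62, and 66 can be pictured as repeated composition of effect models. The sketch below assumes, purely for illustration, that each stage contributes an additive term; the actual combination rules are those of equations (3)-(20), (31)-(33), and (36)-(39), which are not restated here and may differ. The names `combine`, `Record`, `Effect`, the four stand-in effect functions, and every numeric value are hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]             # hypothetical feature record for one field
Effect = Callable[[Record], float]  # one effect model: record -> yield term


def combine(effects: List[Effect]) -> Effect:
    """Return a model whose prediction is the sum of the given effect terms."""
    def combined(record: Record) -> float:
        return sum(effect(record) for effect in effects)
    return combined


# Hypothetical stand-ins for the fitted models (illustrative values only).
def season_effect(r: Record) -> float:
    return {"early": 30.0, "late": 28.0}[r["season"]]

def age_effect(r: Record) -> float:
    return {"300-329": 0.0, "330-359": 1.5}[r["age_group"]]

def weather_effect(r: Record) -> float:
    return {"north": -0.5, "south": 0.5}[r["zone"]]

def soil_effect(r: Record) -> float:
    return {"loam": 1.0, "clay": -1.0}[r["soil"]]


season_age = combine([season_effect, age_effect])      # step 58
with_weather = combine([season_age, weather_effect])   # step 62
global_model = combine([with_weather, soil_effect])    # step 66

field = {"season": "early", "age_group": "330-359", "zone": "north", "soil": "loam"}
print(global_model(field))
```

Because each stage only wraps the previous one, a stage can be refitted or replaced (for example, swapping a structured season model for an unstructured one) without touching the others.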
The program corresponding to the flow charts of
Certain modifications of the present invention have been discussed above. Other modifications of the present invention will occur to those practicing in the art of the present invention. For example, the present invention has been described above in connection with sugarcane crops. However, the present invention could be used in connection with other crop classes. Further to this example, included within the crop classes for which the above described yield prediction can be of benefit are crop classes that have the property of synchronized group maturity. The term “synchronized group maturity” indicates a similar perceivable growth (vegetative or non-vegetative) level at any point of time within and across the crop entities planted simultaneously within a defined geographical territory for the crop classes. Such crop classes may include, for example, wheat, rice, oil seed, etc.
As another example, the order in which seasonal, age, weather, and/or soil effects on yield are modeled may be varied.
As still another example, those skilled in the art will recognize that the various models disclosed above use a polynomial function for the season effect and a quadratic function for the age effect. However, the season effect can instead be captured using linear functions or non-linear functions such as exponential or logarithmic functions. Similarly, the age effect can be captured using polynomial, linear, or non-linear functions.
Accordingly, the description of the present invention is to be construed as illustrative only and is for the purpose of teaching those skilled in the art the best mode of carrying out the invention. The details may be varied substantially without departing from the spirit of the invention, and the exclusive use of all modifications which are within the scope of the appended claims is reserved.
Claims
1. A method implemented by a computer for generating a crop yield prediction model for predicting yield of a crop, the method comprising:
- generating a first structured model that models a first dependent effect on the yield of the crop, wherein the first dependent effect comprises a season dependent effect;
- generating a second structured model that models a second dependent effect on the yield of the crop, wherein the second dependent effect comprises an age dependent effect;
- combining the first and second structured models; and,
- determining parameters for the first and second structured models based on training data relating yield to plant variety, harvesting season, and age at harvest.
2. The method of claim 1 further comprising:
- generating a third model that models a third dependent effect on the yield of the crop, wherein the third dependent effect comprises an effect other than the season dependent effect and the age dependent effect;
- combining the first and second structured models and the third model; and,
- determining parameters for the third model based on the training data.
3. The method of claim 1 further comprising:
- generating a third model that models a third dependent effect on the yield of the crop, wherein the third dependent effect comprises a weather dependent effect;
- combining the first and second structured models and the third model; and,
- determining parameters for the third model based on the training data relating yield to plant variety, harvesting season, age at harvest, and weather conditions during growing and/or harvesting, wherein weather conditions include at least one of temperature, rainfall, humidity, and sunshine.
4. The method of claim 1 further comprising:
- generating a third model that models a third dependent effect on the yield of the crop, wherein the third dependent effect comprises a soil dependent effect;
- combining the first and second structured models and the third model; and,
- determining parameters for the third model based on the training data relating yield to plant variety, harvesting season, age at harvest, and soil conditions during growing and/or harvesting, wherein soil conditions include at least one of soil type, irrigation practice, and soil nutrients.
5. The method of claim 1 wherein the training data includes actual yield, and wherein the determining of parameters for the first and second structured models comprises optimizing the parameters by minimizing an error between predicted yield predicted by the first and second structured models and the actual yield.
6. The method of claim 1 wherein the first structured model comprises a first linear model, and wherein the second structured model comprises a second linear model.
7. The method of claim 1 wherein the first structured model comprises a first non-linear model, and wherein the second structured model comprises a second non-linear model.
8. A method implemented by a computer for generating a crop yield prediction model for predicting yield of a crop, the method comprising:
- generating a first unstructured model that models a first dependent effect on the yield of the crop, wherein the first dependent effect comprises a season dependent effect;
- generating a second unstructured model that models a second dependent effect on the yield of the crop, wherein the second dependent effect comprises an age dependent effect;
- combining the first and second unstructured models; and,
- determining parameters for the first and second unstructured models based on training data relating yield to plant variety, harvesting season, and age at harvest.
9. The method of claim 8 further comprising:
- generating a third model that models a third dependent effect on the yield of the crop, wherein the third dependent effect comprises an effect other than the season dependent effect and the age dependent effect;
- combining the first and second unstructured models and the third model; and,
- determining parameters for the third model based on the training data.
10. The method of claim 8 further comprising:
- generating a third model that models a third dependent effect on the yield of the crop, wherein the third dependent effect comprises a weather dependent effect;
- combining the first and second unstructured models and the third model; and,
- determining parameters for the third model based on the training data relating yield to plant variety, harvesting season, age at harvest, and weather conditions during growing and/or harvesting, wherein weather conditions include at least one of temperature, rainfall, humidity, and sunshine.
11. The method of claim 8 further comprising:
- generating a third model that models a third dependent effect on the yield of the crop, wherein the third dependent effect comprises a soil dependent effect;
- combining the first and second unstructured models and the third model; and,
- determining parameters for the third model based on the training data relating yield to plant variety, harvesting season, age at harvest, and soil conditions during growing and/or harvesting, wherein soil conditions include at least one of soil type, irrigation practice, and soil nutrients.
12. The method of claim 8 further comprising:
- generating a third model that models a third dependent effect on the yield of the crop, wherein the third dependent effect comprises a weather dependent effect;
- generating a fourth model that models a fourth dependent effect on the yield of the crop, wherein the fourth dependent effect comprises a soil dependent effect;
- combining the first and second unstructured models and the third and fourth models; and,
- determining parameters for the fourth model based on the training data relating yield to plant variety, harvesting season, age at harvest, and soil conditions during growing and/or harvesting.
13. The method of claim 8 wherein the training data includes actual yield, and wherein the determining of parameters for the first and second unstructured models comprises optimizing the parameters by minimizing an error between predicted yield predicted by the first and second unstructured models and the actual yield.
14. A method implemented by a computer of indicating yield of a crop comprising:
- predicting the yield of the crop assuming that the crop is harvested on day d based on season and age data corresponding to the crop, wherein the predicting is performed based on a combination of first and second structured models, wherein the first structured model models a first dependent effect on the yield of the crop, wherein the first dependent effect comprises a season dependent effect, wherein the second structured model models a second dependent effect on the yield of the crop, and wherein the second dependent effect comprises an age dependent effect; and,
- providing the yield of the crop as an output of the computer implemented method.
15. The method of claim 14 wherein the predicting is performed based on a combination of the first and second structured models and a third model, wherein the third model models a third dependent effect on the yield of the crop, and wherein the third dependent effect comprises an effect other than the season dependent effect and the age dependent effect.
16. The method of claim 14 wherein the predicting is performed based on a combination of the first and second structured models and a third model, wherein the third model models a third dependent effect on the yield of the crop, and wherein the third dependent effect comprises a weather dependent effect.
17. The method of claim 14 wherein the predicting is performed based on a combination of the first and second structured models and a third model, wherein the third model models a third dependent effect on the yield of the crop, and wherein the third dependent effect comprises a soil dependent effect.
18. The method of claim 14 wherein the first structured model comprises a first linear model, and wherein the second structured model comprises a second linear model.
19. The method of claim 14 wherein the first structured model comprises a first non-linear model, and wherein the second structured model comprises a second non-linear model.
20. The method of claim 14 wherein one of the first and second structured models comprises a linear model, and wherein the other of the first and second structured models comprises a non-linear model.
21. A method implemented by a computer of indicating yield of a crop comprising:
- predicting the yield of the crop assuming that the crop is harvested on day d based on season and age data corresponding to the crop, wherein the predicting is performed based on a combination of first and second unstructured models, wherein the first unstructured model models a first dependent effect on the yield of the crop, wherein the first dependent effect comprises a season dependent effect, wherein the second unstructured model models a second dependent effect on the yield of the crop, and wherein the second dependent effect comprises an age dependent effect; and,
- providing the yield of the crop as an output of the computer implemented method.
22. The method of claim 21 wherein the predicting is performed based on a combination of the first and second unstructured models and a third model, wherein the third model models a third dependent effect on the yield of the crop, and wherein the third dependent effect comprises an effect other than the season dependent effect and the age dependent effect.
23. The method of claim 21 wherein the predicting is performed based on a combination of the first and second unstructured models and a third model, wherein the third model models a third dependent effect on the yield of the crop, and wherein the third dependent effect comprises a weather dependent effect.
24. The method of claim 21 wherein the predicting is performed based on a combination of the first and second unstructured models and a third model, wherein the third model models a third dependent effect on the yield of the crop, and wherein the third dependent effect comprises a soil dependent effect.
25. The method of claim 21 wherein the predicting is performed based on a combination of the first and second unstructured models and third and fourth models, wherein the third model models a third dependent effect on the yield of the crop, and wherein the third dependent effect comprises a weather dependent effect, wherein the fourth model models a fourth dependent effect on the yield of the crop, and wherein the fourth dependent effect comprises a soil dependent effect.
Type: Application
Filed: Oct 16, 2007
Publication Date: Apr 16, 2009
Inventors: Mangesh D. Kapadi (Bangalore), Jinendra K. Gugaliya (Bangalore), Lingathurai Palanisamy (Bangalore)
Application Number: 11/872,999
International Classification: G06F 19/00 (20060101); G01W 1/00 (20060101);