‘We used data science to improve potato breeding’
- Student project
Growing potatoes can be quite challenging. The plants need to withstand all kinds of weather, from extreme rain to drought, and farmers need to find the right amount of fertilizer to apply if they want the most bountiful harvest. Companies that sell potato seedlings, like HZPC, are always looking for ways to help farmers by selling them the most robust potato with the highest yield. Data Science for Life Science students Rose Hazenberg and Simone van Lunsen used data science to help HZPC find these potatoes.
In this specific project, HZPC was interested in knowing how different amounts of nitrogen in the soil affected the growth of the potatoes. 'They had already gathered a lot of data on this subject, ranging from drone images to yield numbers', says Simone. 'We had to use multivariate analysis and machine learning to make sense of all this data, and see if we could find patterns.'
This turned out to be quite a challenge. 'There were so many different datasets that it was hard at first to see how they compared to each other', Rose explains. 'But in the end we found ways to combine all the data, through dimensionality reduction, regression models and multivariate analyses.' Unfortunately, they did not find many clear differences. 'There were many different concentrations of nitrogen in the field, but we did not see many changes in yield in correlation with this.'
The code they wrote did, however, help HZPC with the second goal of the project: finding a way to breed certain qualities into their potatoes faster than with classical breeding techniques. Simone: 'Right now it takes about 10 years to get to a new genotype available for cultivation. That costs a lot of time and money, and it remains challenging to improve multiple breeding targets simultaneously. With our model, we were able to find some interesting targets that might speed up this process.'
One of the targets they found was the concentration of minerals in the plants. Simone: 'We found that mineral concentrations in plants strongly correlate with vegetation features derived from drone images, such as plant height and chlorophyll content. These features, in turn, correlate with important agronomic traits like yield. Because mineral concentrations can be measured early in the plant’s development, identifying them as predictors for yield could replace time-consuming yield measurements at the end of the growing season, and speed up selection in breeding programs.'
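The chain of correlations Simone describes can be illustrated with a minimal Python sketch. This is not the students' code: the variable names (`mineral`, `plant_height`, `tuber_yield`) and all numbers are made up for illustration, and the data are randomly generated rather than taken from HZPC's measurements.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Synthetic data for illustration only; none of these values come from HZPC.
random.seed(0)
n_plots = 200

# Hypothetical early-season mineral concentration per plot (standardised units).
mineral = [random.gauss(0, 1) for _ in range(n_plots)]
# Hypothetical drone-derived feature that partly follows the mineral level.
plant_height = [m + random.gauss(0, 0.5) for m in mineral]
# Hypothetical end-of-season yield that partly follows the drone feature.
tuber_yield = [h + random.gauss(0, 0.5) for h in plant_height]

r_mineral_height = pearson(mineral, plant_height)
r_height_yield = pearson(plant_height, tuber_yield)
r_mineral_yield = pearson(mineral, tuber_yield)

print(f"mineral vs. plant height: r = {r_mineral_height:.2f}")
print(f"plant height vs. yield:   r = {r_height_yield:.2f}")
print(f"mineral vs. yield:        r = {r_mineral_yield:.2f}")
```

The third correlation is computed directly because correlation is not transitive in general: two strong links in a chain do not guarantee that the early measurement itself tracks yield, so that relationship has to be checked on the data.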
Both students are happy that they could work on such a practical project in agriculture. Rose: 'I find agriculture very interesting, and it was fun to see the workings of a company that works in this field.' Simone hopes she can stay in this field: 'I really like the intricacies of soil and everything that is going on in there. And data science can really make a difference in understanding these processes, so I hope I can find a job in this field.'