New opportunities and challenges for managing big data in grain production

Author: Mark Pawsey, SST Software Australia | Date: 23 Feb 2016

Introduction

Before we begin talking about Big Data as an industry concept, I want to talk a little about our story as a small company with a 20-year journey in agricultural data management. In the mid-1990s SST was providing geo-spatial software to farm service providers, mainly in the Corn Belt of the USA. Our investment in site-specific management was a hit and we quickly gained significant market share, including internationally in 23 countries. But we began to see flaws in an open system: unstructured data and outputs meant it could not scale.

In the early 2000s the concept of the SST Information lab emerged. Now service providers could build scale and efficiency within an organisation and create consistent outcomes for farmers via internal standards. The gains of this model were, however, eroded by the challenges of manual data processing, skills mobility and silos emerging within a single organisation.

The result, through collaboration with our customers and a quest for our own productivity gains, came in the mid-2000s when an investment was made in a single, global, synchronised database platform. From that platform blossomed a suite of innovations such as machine-automated processing, secure synchronisation, stable tractor interaction, software innovation and other gains, now supporting 120 million acres of farm land. Now our attention is starting to turn towards Big Data or, as we are calling it, Analytics.

One key principle we believe is indisputable is that agriculture is inherently spatial and requires site-specific data. We plant a seed at a specific point on the earth and make decisions around it which need to be tied to that location, meaning GIS data management is essential. For example, try taking data off a tractor without using a GIS software system!

Big Data in agriculture is being promoted as the panacea for global food security and sustainability, and it quite possibly could be. The potential value in helping identify relationships and support predictive decision making is enormous. But I ask you, is Big Data really the end goal for agriculture? Or is the real goal a digital economy generating lots of standardised, small, geo-referenced data?

The first assumption of Big Data is that we have lots of data together in one place, and probably in a cloud (which means it's sitting on someone's computer). The path to Big Data in agriculture lies not in the cloud, but in the economic, technical, and social foundations of a structured digital agriculture economy.

The idea of Big Data in agriculture is suffering from the old sales adage of “selling the sizzle and not the steak”. There is probably no more complex production system than agriculture, and the variety of decisions and data types that make up the agricultural production system is frightening. Bringing together structured and unstructured data is hard enough, let alone when the industry is rife with proprietary file formats. Most of the discussion about Big Data in agriculture is theoretical and idealistic, with little thought given to the realities of collecting data from disparate systems and creating economic and technical pathways to aggregate it. Hence, most current Big Data initiatives involve weather data and models, which can be useful for risk management and other decisions but will never deliver the hard-core metrics and understanding we need for true practice change on farm.

So, to break that down, two paths to Big Data in agriculture exist, which can be defined as a top-down or a bottom-up approach. Top-down is the aggregation of remote data such as weather data, satellite imagery, and potentially sensor data from emerging networks. This is the most scalable approach and is the domain of modelling and inferred outcomes, which allow for decision making prior to the event. The problem with this data is that it lacks a feedback loop of what actually occurred to fine-tune the model, and asking farmers to enter that actual data is unlikely to be successful!

The bottom-up approach of aggregated farm data is more traditionally defined as benchmarking. But as geo-spatial data from the farm itself becomes more available, the ability to query and identify relationships becomes a reality. This is the domain of data collected from farm software, tractors and machinery sensors. This data opens up the ability to understand and predict outcomes using more intensive field-level data combined with remote weather and other data. This “small data” analysis and the relationships found within a farm are just as valuable as those found in a regional data set; it's just that they only benefit the one farmer. Attempts are underway to integrate the two camps, which will be essential as neither tells the whole story alone. The point of integration, however, is where the breakdown exists.

So if what we really need is the field-level geo-spatial data, how do we collect it in a way that's useful? A term born in the late 1990s among my colleagues for data collected other than for running the day-to-day farming operation was “recreational data keeping”. Data needs to be collected first for business use, then made available for other applications, so we need to make sure it has business value first. Secondly, farmers need to be willing to invest in machine-generated data, such as from tractors and other equipment. It's not just the financial decision but the decision to actually use the data. In many cases the yield monitor comes factory fitted but the value proposition to use it has not been well defined. Without clearly defined value propositions, the field-level data we need for Big Data will never exist.

Ultimately, it leads us back to the farmer and the farm service providers who walk side by side with the farmer every day. Many farmers can embrace technology and practice change independently; however, many more benefit from services support. If Big Data is agriculture's panacea for global food security, then we need to focus on value-driven farm-level services to encourage adoption and implement practice change. Offering a “pot of data gold” at the end of a rainbow as the reward for collecting data, or worse, cash for contributing that data, is not the answer. The only sustainable model for farm services and extension is in commercial agronomic service providers who live in our regional communities.

So maybe what we really need when we break it down is an investment in a digital agriculture exchange framework starting with digital agronomy and extending through to digital value exchanges and data co-ops. Data collected for agronomic and business practice on farm is the starting point which is where our farm services model becomes so important. Once data exists a structured digital supply chain can offer value for farmers sharing that data.

Three models for digital agriculture

Digital agriculture is emerging as a high-profile market. As the opportunity for innovation and disruption attracts more attention, three potential models have emerged. The first model is the 'one company does it all' approach. This is the domain of large multi-nationals who want to be a one-stop shop for the farmer. The chance of one company being the best at every aspect of agriculture is low, even through aggressive acquisition of other companies. Even if one company were able to provide quality expertise and service across the board, concerns about privacy and constrained competition would hamper progress.

The second model, and the one that is the most popular today, is the interconnected API approach. In this model, companies stand up data transfer connections, or pipes, between themselves. The problem is that the data inside the pipe is not standardised. While this approach is manageable for a few relationships, it will never stand up to the multiplying number of connections required, each of which needs to keep data and data edits current.
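The scaling concern can be put in rough numbers. The short Python sketch below is illustrative only: it assumes every pair of systems needs its own direct connection, and compares that with the case where each system keeps a single connection to one shared hub. The function names are made up for this example.

```python
def point_to_point_links(n_systems: int) -> int:
    """Distinct pairwise connections if every system links directly to every other."""
    if n_systems < 0:
        raise ValueError("n_systems must be non-negative")
    return n_systems * (n_systems - 1) // 2


def hub_links(n_systems: int) -> int:
    """Connections if every system links only to one shared hub."""
    if n_systems < 0:
        raise ValueError("n_systems must be non-negative")
    return n_systems


if __name__ == "__main__":
    for n in (3, 10, 50):
        print(f"{n} systems: {point_to_point_links(n)} direct links, "
              f"{hub_links(n)} hub links")
```

Under these assumptions, 10 systems need 45 direct links and 50 systems need 1,225, while the hub count only ever equals the number of systems. Real integrations are rarely fully connected, so treat the pairwise figure as an upper bound.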

The third model is the centralised data repository approach. In this model, data synchronisation occurs seamlessly because all compliant applications have adopted common database schemas. This industry-specific cloud can be thought of as a centrally secured “cloud data vault”. The data vault is able to access, receive, and create data from many different sources, such as software and machinery loggers. Data can be shared seamlessly and securely between users, while administration rights protect against unauthorised use.
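As a minimal sketch of what a common schema buys, the Python below normalises yield records from two vendor formats into one shared record type. Everything here is hypothetical: the `YieldRecord` fields, the two vendor layouts, and the sample values are invented for illustration and are not the agX schema or any real vendor's format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class YieldRecord:
    """Hypothetical shared schema: one geo-referenced yield observation."""
    field_id: str
    latitude: float
    longitude: float
    yield_t_per_ha: float


def from_vendor_a(row: dict) -> YieldRecord:
    # Hypothetical vendor A: separate lat/lon keys, yield in kg/ha.
    return YieldRecord(
        field_id=row["fld"],
        latitude=float(row["lat"]),
        longitude=float(row["lon"]),
        yield_t_per_ha=float(row["yld_kg_ha"]) / 1000.0,
    )


def from_vendor_b(row: dict) -> YieldRecord:
    # Hypothetical vendor B: position as a "lat,lon" string, yield in t/ha.
    lat_text, lon_text = row["position"].split(",")
    return YieldRecord(
        field_id=row["FieldName"],
        latitude=float(lat_text),
        longitude=float(lon_text),
        yield_t_per_ha=float(row["YieldTonnesPerHa"]),
    )


def mean_yield(records: list) -> float:
    """Average yield in t/ha over records already in the shared schema."""
    if not records:
        raise ValueError("no records supplied")
    return sum(r.yield_t_per_ha for r in records) / len(records)


if __name__ == "__main__":
    # Illustrative values only, not measured data.
    records = [
        from_vendor_a({"fld": "North", "lat": "-27.5", "lon": "151.9",
                       "yld_kg_ha": "3000"}),
        from_vendor_b({"FieldName": "South", "position": "-27.6,151.8",
                       "YieldTonnesPerHa": "4.0"}),
    ]
    print(mean_yield(records))
```

Each vendor writes one adapter to the shared record, and anything downstream, such as `mean_yield` here, only ever deals with the one schema.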

This model is currently executed on a platform known as agX®, and is a service model (PaaS) available to all players in the agri-business industry. agX is emerging in numerous tractor controllers, farm software applications, farm apps, data services and on-demand data providers such as aerial and satellite imagery providers. As agX continues to gain adoption, a marketplace of third-party, best-of-breed products is evolving.

agX® is the result of tens of millions of dollars of investment, a decade of development, and deep domain knowledge. As new innovations arise, companies can develop new products more cheaply and bring them to market faster by utilising the agX Platform. Furthermore, being agX compliant is extremely valuable because it integrates products into a larger pool of users and markets that are commonly aligned.

This allows a farmer the option of using software and apps from many different vendors and migrating data seamlessly between apps. A service provider can differentiate their offerings to farmers by investing in customised solutions that add value. Software vendors can be nimble and aggressive in developing solutions to market opportunities without having to reinvent the wheel each time. The supply chain can develop value exchanges whereby the farmer can willingly share valuable farm data with those who can provide security of supply, access to markets, and improved risk management solutions. Corporate ERP systems can become integrated solutions all the way down to the farm gate, because structured database schemas exist that conform to corporate requirements. Irrespective of Big Data, what is emerging is a marketplace and ecosystem of interconnected and interoperable systems driven by proprietary IP and commercial market forces. In other words, two mates can get together and write an app that focuses on one feature and talks to the data a farmer already has in the cloud data vault. They don't have to compete to be that farmer's one software system, as now it's an ecosystem.

Big Data is now a reality, and the opportunity exists to share data via trusted entities and to create regional geo-spatial data sets that identify relationships and trends. Allowing farmers to compare results for common management practices and soil types, to fast-track agronomic and business practice change, becomes a real prospect. Ag retailers now have access to business intelligence tools, and researchers can engage with farm-level outcomes on a regional scale to compare with R&D outcomes and run models. The ideal “trust entity” to represent farmers' data in aggregate still needs to be identified, but the agX system itself does most of the work of managing security, privacy and engagement pathways.

If we can collaborate around an industry framework without creating an artificial construct that needs ongoing public funding, or disrupting a commercial digital agriculture marketplace, we can start creating farm-level value that will drive a digital supply chain. Public policy, public R&D, industry bodies, the farm services sector, the farm input supply sector, private investment in technology, and farmers themselves can all achieve their individual competitive goals when focused around a common framework. If we focus on the small data on farm and upstream into the supply chain, the rest will take care of itself.

Contact details

Mark Pawsey
SST Software Australia
91 Commercial Rd, Teneriffe, QLD, 4005
Ph: 07 3854 2340
Email: mpawsey@sstsoftware.com

® Registered Trademark