The Internet of Things on AWS – Official Blog

Streamlining agriculture operations with serverless anomaly detection using AWS IoT

Introduction

Honeybees live in swarms of tens of thousands, gathering nectar. In this process, they carry pollen from one flowering plant to another, pollinating them.

"Close to 75 percent of the world's crops producing fruits and seeds for human use depend, at least in part, on pollinators." [1]

As well as being one of nature’s key pollinators, bees transform nectar into honey. With the help of beekeepers, like David Gerber from Switzerland, this delicious honey is made available for global consumption.


Figure 1: David Gerber’s IoT enabled beehives (Neuchatel, Switzerland)

Bees live in hives. These hives are often located in remote places, such as forests or high mountain pastures, which makes monitoring the health of the bees challenging. However, by creating connected solutions using cloud-based services, such as AWS IoT Core and AWS Lambda, beekeepers can implement near real-time monitoring tools to track the health parameters of a beehive. AWS IoT Core is a fully managed cloud service that lets you connect Internet of Things (IoT) devices and route their messages to AWS without managing infrastructure. AWS Lambda is a serverless compute service that allows you to deploy code without provisioning or managing servers. In this blog post, we walk through an IoT architecture and provide a hands-on example of how to create and test your own serverless anomaly detector to improve your operations.

Prerequisites

For this walkthrough, you should have the following prerequisites:

The hands-on example is written in Java and the CDK infrastructure code is written in TypeScript. It's not required to have deep knowledge of either to deploy and run the example. This solution can run entirely within the AWS Free Tier for one or even several executions. Clean-up instructions are provided at the end of this post.

Gaining insights into hive health

We gain insights by measuring and sending IoT events. Choosing what to measure about a hive is important: the right metric allows us to gain insights into the lives of the bees. In Figure 2, we can see the variation of a hive's weight as the days go by. At first glance, the data appears quite chaotic; however, a closer look reveals a wealth of information.


Figure 2: Weight of hive over two weeks

From Figure 2, we can chart a hive’s major events over 24 hours.

  1. Bees make honey by reducing the nectar's water content. Bees fan their wings to create airflow within the hive, causing the water in the nectar to evaporate. This results in a steady reduction in the hive's weight overnight.
  2. At sunrise, the bees leave for their day's work, causing a sudden drop in the hive's weight.
  3. Over the day, bees return to the hive carrying nectar with them, causing a steady increase in the weight of the hive.
  4. At sunset, all the bees return to the hive with their remaining nectar, resulting in a sudden increase in the hive's weight.
  5. Finally, by comparing the hive’s weight, at the same time of day 24 hours apart, we can tell how productive the hive has been.

Figure 3: A hive's major events over 24 hours
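
The comparison in point 5 can be sketched in a few lines of Java. The class, method, and synthetic readings below are purely illustrative and are not part of the solution's code base: given hourly weight readings, the daily gain is simply the difference between two samples taken 24 hours apart.

```java
import java.util.ArrayList;
import java.util.List;

public class DailyProductivity {
    // hourlyWeights.get(i) is the hive's weight in grams at hour i.
    // Returns the weight change over 24 hours at the given hour of day.
    static int dailyGain(List<Integer> hourlyWeights, int hour) {
        return hourlyWeights.get(hour + 24) - hourlyWeights.get(hour);
    }

    public static void main(String[] args) {
        List<Integer> weights = new ArrayList<>();
        // Two synthetic days: the second day repeats the first day's pattern,
        // shifted up by 800 g of new honey.
        for (int day = 0; day < 2; day++) {
            for (int hour = 0; hour < 24; hour++) {
                int base = 60000 + day * 800;
                int intraday = (hour < 6) ? -20 * hour       // overnight evaporation
                             : (hour == 6) ? -1500           // bees leave at sunrise
                             : 100 * (hour - 6) - 1500;      // nectar returning over the day
                weights.add(base + intraday);
            }
        }
        System.out.println(dailyGain(weights, 22)); // prints 800
    }
}
```

Comparing the same hour on consecutive days cancels out the intraday pattern, leaving only the productivity signal.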

Detecting anomalies

Returning to the original dataset in Figure 2, we can see that the first week was in fact very productive:

  1. The bees benefit from excellent conditions with a daily increase in the hive’s weight.
  2. The beekeeper extracts approximately 10 kg of honey at the end of the week.

Figure 4: Daily increase in hive's weight

However, not every week is as good, and at the start of the second week, shown in Figure 5, we can see things get off to a more difficult start.

  1. The bees do not leave the hive; this could be due to a shortage of nectar in the area, a sign to consider moving the hive.
  2. Or it could be just temporary bad weather, which passes and allows the bees to continue collecting nectar later on in the week.

Figure 5: First week of June

After taking a series of measurements, an anomaly is a value that deviates from what we've previously seen; it is unexpected. Bad weather can be detected as an anomalous event, but little can be done about it; unfortunately, both bees and humans have to live with it. However, several other anomalous events are beneficial to detect in remote hives.

  1. A sudden increase in the quantity of nectar available for bees to collect results in a significant rise in honey production, called a honeyflow.
    1. During a honeyflow, the weight of a hive can increase by a kilogram a day, letting the beekeeper know it's time to add additional space to the hive.
    2. Conversely, a stagnation in the weight increase allows the beekeeper to confirm the end of the honeyflow. The honey will be available to harvest a few days later, once its moisture content has been reduced.
  2. A sudden increase in the daily nectar collected over a 24-hour period lets the beekeeper know it's time to collect the honey and free up space to allow the bees to continue working.
  3. When a hive grows, it will eventually split in two by swarming, with half the hive deciding to leave (a sudden decrease in weight but not at sunrise) with the old queen. Typically, this swarm will settle in a temporary location and can be recaptured by the beekeeper if detected in time.
  4. A significant reduction in weight of tens of kilos implies someone other than the beekeeper is collecting the honey, leading to potential operational losses for beekeepers.
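
As a rough illustration of how these events differ in their weight signatures, here is a naive, threshold-based classifier in Java. The class name and threshold values are hypothetical and hand-picked for illustration; the actual solution uses the RCF algorithm rather than fixed thresholds, precisely because fixed thresholds break down as the hive's baseline weight changes.

```java
public class HiveEventClassifier {
    // Illustrative, hand-picked thresholds in grams -- a real deployment
    // would learn what is "normal" from the data rather than hard-code it.
    static String classify(int weightDeltaGrams, boolean isSunrise) {
        if (weightDeltaGrams <= -10000) return "POSSIBLE_THEFT";   // tens of kilos removed at once
        if (weightDeltaGrams <= -1500 && !isSunrise)
            return "POSSIBLE_SWARM";                               // sudden drop, but not the morning departure
        if (weightDeltaGrams >= 1000) return "HONEYFLOW";          // ~1 kg daily increase
        return "NORMAL";
    }

    public static void main(String[] args) {
        System.out.println(classify(-12000, false)); // prints POSSIBLE_THEFT
        System.out.println(classify(-2000, false));  // prints POSSIBLE_SWARM
        System.out.println(classify(1200, false));   // prints HONEYFLOW
    }
}
```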

Solution overview


Figure 6: The overall AWS architecture of the solution

Figure 6 shows the overall AWS architecture of the solution. The solution uses IoT sensors deployed under each beehive to regularly send the hive's weight in an IoT event. These IoT sensors communicate using the LoRaWAN protocol. LoRaWAN is ideally suited for the delivery of IoT events in hard-to-reach locations: it trades a severely restricted message payload size for the ability to deliver that payload over kilometers with minimal power consumption. The beehive's IoT sensor sends the event to a The Things Network (TTN) gateway. TTN democratizes access to an IoT network, allowing participants to set up their own gateways. This gateway is the communication link between the IoT sensor and AWS IoT Core for LoRaWAN. AWS IoT Core for LoRaWAN provides access to a fully managed LoRaWAN Network Server (LNS), eliminating the need to develop, maintain, or operate a separate server. You can find further details on integrating TTN and AWS IoT Core here.

Using the AWS IoT Core Rules Engine, you can automatically route messages to Amazon Simple Queue Service (Amazon SQS). This decouples AWS IoT Core from AWS Lambda, allowing the IoT events to be processed asynchronously. AWS Lambda allows the anomaly detection code to be deployed in a serverless fashion, again eliminating the need to manage your infrastructure, and will scale horizontally to meet any increase in IoT traffic. The first of two Lambda functions persists the event, allowing all previous events to be sorted on retrieval. Retrieving events in chronological order is essential in determining whether an event is anomalous.
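
The chronological-ordering requirement can be sketched as follows. The `HiveEvent` record and `sortByTime` helper are hypothetical stand-ins; in the deployed solution, DynamoDB returns the events already ordered via its sort key.

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class OrderedEvents {
    // Hypothetical event shape; the real solution persists these in DynamoDB
    // with the timestamp as the sort key, so retrieval is already ordered.
    record HiveEvent(String hiveId, Instant timestamp, int weightGrams) {}

    // Sort events by timestamp before feeding them to the anomaly detector.
    static List<HiveEvent> sortByTime(List<HiveEvent> events) {
        return events.stream()
                .sorted(Comparator.comparing(HiveEvent::timestamp))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<HiveEvent> unordered = List.of(
                new HiveEvent("1", Instant.parse("2023-05-02T04:00:00Z"), 64000),
                new HiveEvent("1", Instant.parse("2023-05-01T04:00:00Z"), 63000));
        System.out.println(sortByTime(unordered).get(0).weightGrams()); // prints 63000
    }
}
```

Out-of-order processing would make a perfectly normal reading look anomalous (or vice versa), since each score depends on the sequence of weights that preceded it.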

The anomaly detection code running in AWS Lambda lies at the heart of the solution. It relies on an implementation of the Random Cut Forest (RCF) [2] algorithm written by AWS. RCF is a machine learning algorithm capable of detecting anomalies in an unsupervised manner. The algorithm constructs collections of random binary trees. An anomaly score reflects how far a point is from the others in the tree. Outlying data points are less likely to be consistent with other data points in the tree, leading to higher anomaly scores. RCF is designed to process streamed multi-dimensional data efficiently, making it perfect for our scenario of streamed IoT messages containing the beehive’s weight. Finally, the beekeeper can be notified of anomalous events using Amazon Simple Notification Service.

Hands-on setup architecture


Figure 7: Simulation architecture

To test the anomaly detection solution more easily from our laptops, we’ve created a third Lambda function, which will simulate the creation of IoT events during May (see Figure 7).


Figure 8: Simulation data

Figure 8 visualizes the synthetic data used for the simulation. The data shows a gradual increase in the hive's weight over thirty days, starting from the 1st of May. The hive's weight peaks each evening and gradually reduces overnight, with a sudden dip as the bees depart at sunrise. The weight then slowly recovers during the day with the return of nectar-laden bees. The data set contains 720 data points (30 days times 24 hours). Only one data point is unusual: on the 8th of May, the hive's weight is unexpectedly reduced by more than 1.5 kg. This example shows the power of the RCF algorithm; a simple threshold value will not suffice because of the hive's increasing weight. Indeed, the anomalous weight recorded on the 8th of May would have been a perfectly valid reading on the morning of the 4th of May.
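
The shape of this synthetic data can be approximated with a short Java generator. The constants below are illustrative and only loosely mirror the real sample file (hive-sample-events.json); they are not the values used by the simulation.

```java
import java.util.ArrayList;
import java.util.List;

public class SyntheticHiveData {
    // Generates 30 days x 24 hourly weights (grams), loosely mirroring the
    // pattern described above. All constants are illustrative.
    static List<Integer> generate() {
        List<Integer> weights = new ArrayList<>();
        for (int day = 0; day < 30; day++) {
            for (int hour = 0; hour < 24; hour++) {
                int weight = 60000 + day * 300;            // gradual increase over the month
                weight += (hour < 6) ? -15 * hour          // overnight evaporation
                        : (hour == 6) ? -1200              // sudden dip at sunrise
                        : 80 * (hour - 6) - 1200;          // recovery during the day
                if (day == 7 && hour == 4) weight -= 1600; // the injected 8th of May anomaly
                weights.add(weight);
            }
        }
        return weights;
    }

    public static void main(String[] args) {
        System.out.println(generate().size()); // prints 720
    }
}
```

Note how the anomalous value on day 7 still lies within the range seen a few days earlier, which is exactly why a fixed threshold cannot catch it.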

Simulation execution and results

The goal of the simulation is to correctly identify the one anomalous IoT event (on the 8th of May at 04:00) among the 719 other events. Please refer to the beehive-anomaly-detection-simulation git repository for more details on environment setup and instructions on how you can run the simulation from your laptop.

  1. Before we deploy any infrastructure, we first have to compile and package the Java Lambda functions by running the following commands:
git clone https://github.com/aws-samples/iot-beehive-anomaly-detection-simulation-blog-source-code.git
cd iot-beehive-anomaly-detection-simulation-blog-source-code
mvn clean install
  2. The infrastructure for this simulation is described using the AWS Cloud Development Kit (CDK). CDK allows you to define each infrastructure component as code, in our case using TypeScript.
const iotEventsSQSQueue = new sqs.Queue(this, 'IoTEventsSQSQueue', {
    visibilityTimeout: cdk.Duration.seconds(120),
    queueName: 'iot-events'
});

new iot.TopicRule(this, 'IoTEventsSQSQueueRule', {
    topicRuleName: 'ioTEventsSQSQueue',
    description: 'invokes the lambda function',
    sql: iot.IotSql.fromStringAsVer20160323("SELECT * FROM 'iot/beehive'"),
    actions: [new actions.SqsQueueAction(iotEventsSQSQueue)],
});

For example, in the code snippet above, we describe the creation of an SQS queue named iot-events and an AWS IoT Core rule that forwards IoT events from the iot/beehive MQTT topic to the SQS queue. Similarly, all the remaining infrastructure components (the three Lambda functions and one DynamoDB table) are defined in infrastructure/lib/infrastructure-stack.ts.

We deploy the infrastructure using the following CDK commands. If this is the first time you have deployed infrastructure with CDK, you will need to bootstrap. CDK bootstrapping sets up permissions policies, an AWS CloudFormation stack, and an S3 bucket to store deployment assets; it is required only once per account and Region.

Run the following commands to deploy our infrastructure:

cd infrastructure
npm install
cdk bootstrap
cdk deploy
  3. Now, we can begin the simulation proper by invoking the IoTBeehiveEventsSimulatorLambda. At the core of this Lambda, we create an AWSIotDataAsyncClient, a client for accessing the AWS IoT data plane asynchronously. For every element in the iot-beehive-events-simulator-lambda/src/main/resources/hive-sample-events.json array, an IoT event is sent to the MQTT topic iot/beehive. The quality of service (QoS) is set to 1, ensuring each event is sent at least once. As we cannot guarantee exactly-once event delivery in distributed systems, the choice is between not receiving an event and receiving an event multiple times. However, we can ensure exactly-once processing by making each Lambda idempotent: they return the same result whether called once or many times.
AWSIotData iotClient = AWSIotDataAsyncClientBuilder.defaultClient();

for (HiveEvent hiveEvent : hiveEvents) {
    PublishRequest publishRequest = new PublishRequest()
            .withQos(1)
            .withTopic("iot/beehive")
            .withPayload(ByteBuffer.wrap(hiveEvent.toString().getBytes(StandardCharsets.UTF_8)));
    iotClient.publish(publishRequest);
}

Run the following command to begin the simulation:

aws lambda invoke --function-name IoTBeehiveEventsSimulatorLambda --cli-binary-format raw-in-base64-out --payload '{"hiveID":"1"}' response.json

We can confirm that all IoT events have been persisted successfully by running a full scan of the DynamoDB table with the following command and ensuring the result is 720.

aws dynamodb scan --table-name HIVE_EVENTS --select "COUNT"

Note: Feel free to call IoTBeehiveEventsSimulatorLambda multiple times, confirming each unique event is processed exactly once.
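
The idempotency idea can be illustrated in plain Java. The `putIfAbsent` call below stands in for a DynamoDB conditional write (e.g. a put guarded by attribute_not_exists on the event key); the class is a hypothetical sketch, not the solution's actual persistence code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class IdempotentProcessor {
    // Stand-in for a DynamoDB conditional put: an event is only processed
    // the first time its ID is seen; duplicates are silently dropped.
    private final ConcurrentMap<String, Integer> processed = new ConcurrentHashMap<>();

    // Returns true if the event was processed, false if it was a duplicate.
    boolean process(String eventId, int weightGrams) {
        return processed.putIfAbsent(eventId, weightGrams) == null;
    }

    public static void main(String[] args) {
        IdempotentProcessor p = new IdempotentProcessor();
        System.out.println(p.process("hive1#2023-05-08T04:00", 64650)); // prints true
        System.out.println(p.process("hive1#2023-05-08T04:00", 64650)); // prints false (duplicate)
    }
}
```

With at-least-once delivery upstream, this guard is what turns "delivered one or more times" into "processed exactly once".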

  4. Finally, it's time to determine if any IoT events are anomalous by running IoTAnomalyDetectionLambda. The anomaly detection Lambda reads the IoT events from a DynamoDB table. DynamoDB is essential in ensuring no events are lost and allows the IoT events to be processed in order (according to their timestamp). Whether the hive's weight at any particular point in time is as expected can only be determined by ordered processing of the previous events.

Run the following commands to begin the anomaly detection. The results are stored in the response.json file:

aws lambda invoke --function-name IoTAnomalyDetectionLambda --cli-binary-format raw-in-base64-out --payload '{"hiveID": "1"}' response.json
less response.json | jq

Sample Response:

[
  {
    "datetime": "2023-05-08 04:00:00.0 +0200",
    "weight": 64650,
    "anomalyGrade": 1.0,
    "anomalyScore": 1.2257463093204803,
    "expectedValue": 66195,
    "isEventAnomalous": true
  }
]

An anomaly score represents how likely an event is to be an outlier, with a threshold value of 1.0 typically used to signify an anomaly. The model's score and its (inverse-transformed) inference are reported separately; hence, we also have an anomaly grade. In our case, the transformation is a normalization of the event stream in which the linear increase in the hive's weight is factored out. An anomaly grade ranges from 0 to 1, where a value greater than 0 indicates a likely anomaly.
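
To build intuition for why the trend must be factored out before scoring, here is a deliberately simplified stand-in (not the RCF algorithm itself): it fits a linear trend to recent history, extrapolates one step, and scores a candidate point by its deviation from that trend relative to the residual spread. All names and constants are illustrative.

```java
public class DetrendedScore {
    // Simplified stand-in for RCF's preprocessing: remove the linear trend,
    // then score a point by how far it sits from the expected next value.
    static double score(double[] history, double candidate) {
        int n = history.length;
        double meanX = (n - 1) / 2.0, meanY = 0;
        for (double y : history) meanY += y / n;
        // Least-squares slope over indices 0..n-1.
        double num = 0, den = 0;
        for (int i = 0; i < n; i++) {
            num += (i - meanX) * (history[i] - meanY);
            den += (i - meanX) * (i - meanX);
        }
        double slope = num / den;
        double predicted = meanY + slope * (n - meanX); // trend extrapolated one step ahead
        // Spread of the residuals around the fitted trend.
        double sumSq = 0;
        for (int i = 0; i < n; i++) {
            double residual = history[i] - (meanY + slope * (i - meanX));
            sumSq += residual * residual;
        }
        double stdDev = Math.sqrt(sumSq / n);
        return Math.abs(candidate - predicted) / (stdDev + 1e-9);
    }

    public static void main(String[] args) {
        double[] steadyRise = {100, 110, 120, 130, 140, 150};
        System.out.println(score(steadyRise, 160) < 1);  // on-trend value: low score, prints true
        System.out.println(score(steadyRise, 120) > 1);  // off-trend value: high score, prints true
    }
}
```

Note that 120 is a perfectly valid value earlier in the series; only after detrending does it stand out, mirroring how the 8th of May reading would have been normal on the 4th.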


Figure 9: Successful detection of anomaly

In Figure 9, we can see that the CloudWatch metrics reported by the anomaly detection algorithm show that, indeed, only a single anomaly has been detected. Furthermore, the response confirms the anomalous event occurred at 04:00 on the 8th of May.

Calculating an event's anomaly score by reprocessing the previous events stored in DynamoDB adds several seconds of latency. However, this allows the solution to remain entirely serverless, making it an acceptable trade-off. For latency-sensitive workloads, streaming the events using Amazon Managed Service for Apache Flink could be an alternative solution.

Cleaning up

Infrastructure created with CDK can be very easily torn down. Simply run the following commands from a terminal.

cd infrastructure
cdk destroy

Conclusion

This blog post showed how IoT can help solve exciting and important challenges in the natural world. The architecture we presented is entirely serverless, keeping costs and infrastructure maintenance efforts low. Finally, we walked through a hands-on example where you can dive into the code and run the simulations yourself. If you want to work on your own IoT projects, check out TTN and AWS IoT.

References

[1] https://www.fao.org/world-bee-day/en/

[2] https://assets.amazon.science/d2/71/046d0f3041bda0188021395b8f48/robust-random-cut-forest-based-anomaly-detection-on-streams.pdf


David Gerber

David works with his customers' teams on their full software development lifecycle, from initial concepts right through to production. He is passionate about software development, IoT and … beekeeping.


Kevin Nash

Kevin is a Senior Solutions Architect at Amazon Web Services (AWS), based in Switzerland. He has a background in distributed systems and many years of experience building for customers. He is passionate about technology, understanding how systems work, and helping customers bring their solutions into the Cloud.