2025 Challenge and Winners

Winners

Congratulations to the 2025 Winners!

First Prize Award Winner ($5000): Sean Williams, California, United States
Second Prize Award Winner ($3000): Prashanti Rao, Seattle, United States
Third Prize Award Winner ($1000): Justin Collins, Maine, United States

2025 US Applied AI Olympiad Challenge: Last-Mile Delay Prediction Challenge

Challenge Overview

In this challenge, you will build a machine learning model that predicts delivery delay time (in minutes) using real-world operational signals such as distance, prep time, traffic, weather, and courier experience.

Your task is to train a model that accurately predicts delays before they happen, enabling smarter dispatch, pricing, and staffing decisions.

This is a pure applied AI problem. Strong feature engineering, modeling choices, and validation strategy matter more than deep theory.

The Problem

Given historical delivery data, predict:

delay_minutes – the total delivery delay beyond the expected time.

Each row represents a single delivery order. The data contains a mix of numerical and categorical features that reflect real operational complexity.


Dataset

You are provided with:

  • train.csv – labeled training data (features + target)
  • test.csv – unlabeled evaluation data (released later)

Target Variable

  • delay_minutes (continuous, regression task)

Features (Data Dictionary)

Feature – Description

  • order_id – Unique identifier
  • distance_km – Delivery distance
  • prep_time_min – Restaurant preparation time
  • courier_experience_months – Courier tenure
  • weather – Weather condition
  • traffic_level – Traffic congestion
  • time_of_day – Meal period
  • day_of_week – 0=Monday … 6=Sunday
  • order_value_usd – Order price
  • restaurant_rating – Average rating
  • items_count – Number of items
  • past_delay_rate – Courier’s historical delay frequency
  • surge_multiplier – Dynamic pricing multiplier
  • zone – Delivery zone type
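
As a starting point for handling the mix of numerical and categorical features listed above, the following is a minimal sketch using pandas. It assumes that weather, traffic_level, time_of_day, and zone are stored as text categories; the sample rows and category labels are hypothetical and not taken from train.csv. It one-hot encodes those columns and maps day_of_week onto a circle so that Sunday (6) ends up adjacent to Monday (0).

```python
import numpy as np
import pandas as pd

# Hypothetical rows for illustration; the category labels below
# ("Rain", "High", ...) are assumptions, not values from train.csv.
df = pd.DataFrame(
    {
        "order_id": [1, 2, 3],
        "distance_km": [2.5, 7.1, 4.0],
        "weather": ["Clear", "Rain", "Clear"],
        "traffic_level": ["Low", "High", "Medium"],
        "time_of_day": ["Lunch", "Dinner", "Dinner"],
        "zone": ["Urban", "Suburban", "Urban"],
        "day_of_week": [0, 3, 6],  # 0=Monday ... 6=Sunday
        "delay_minutes": [4.0, 18.5, 9.0],
    }
)

# Assumed to be the categorical columns; check dtypes in the real data.
categorical_cols = ["weather", "traffic_level", "time_of_day", "zone"]

# order_id is an identifier rather than a signal, so it is kept out of
# the feature matrix; delay_minutes is the target.
features = df.drop(columns=["order_id", "delay_minutes"])
target = df["delay_minutes"]

# One-hot encode the categorical columns.
features = pd.get_dummies(features, columns=categorical_cols)

# Encode day_of_week as a point on a circle (period of 7 days).
angle = 2 * np.pi * features["day_of_week"] / 7
features["dow_sin"] = np.sin(angle)
features["dow_cos"] = np.cos(angle)
features = features.drop(columns=["day_of_week"])

print(features.columns.tolist())
```

Tree-based libraries such as LightGBM and CatBoost can also consume categorical columns directly, so one-hot encoding is one option among several rather than a requirement.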

Objective

Build a model that minimizes prediction error on unseen deliveries.

Evaluation Metric

Mean Absolute Error (MAE)
Lower MAE = better performance.
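
For reference, MAE is the average absolute difference between actual and predicted delays: MAE = (1/n) Σ |y_i − ŷ_i|. A minimal sketch in plain Python follows; the function name and the sample values are illustrative assumptions, not part of any official scoring script.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |actual - predicted| over all deliveries, in minutes."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("need at least one prediction")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical values, for illustration only (not taken from train.csv).
actual_delays = [10.0, 0.0, 25.0, 5.0]
predicted_delays = [12.0, 3.0, 20.0, 5.0]

mae = mean_absolute_error(actual_delays, predicted_delays)
print(f"MAE: {mae:.2f} minutes")
```

Because MAE weights every minute of error equally, a single badly missed delivery affects the score less than it would under a squared-error metric.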


Rules

  • Any ML framework allowed (sklearn, XGBoost, LightGBM, CatBoost, PyTorch, etc.)
  • External datasets not allowed
  • Feature engineering is allowed
  • Ensembles are allowed
  • No manual labeling or leakage from test data

What We’re Looking For

  • Clean data handling
  • Thoughtful feature engineering
  • Solid validation strategy
  • Practical model choices
  • Clear reasoning

This challenge mirrors problems faced by logistics companies, delivery platforms, and operations teams in the real world.
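
One possible way to approach the validation strategy mentioned above is k-fold cross-validation against a simple baseline. The sketch below uses only the Python standard library and a constant-median predictor (the median is the constant that minimizes MAE), which gives a reference score any real model should beat. The helper names and the sample delays are hypothetical, and a plain shuffled split is an assumption that may not suit every dataset.

```python
import random
import statistics


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle row indices once and split them into k nearly equal folds."""
    if k < 2 or k > n_rows:
        raise ValueError("k must be between 2 and the number of rows")
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_mae(targets, k=5, seed=0):
    """Estimate the MAE of a constant-median baseline with k-fold CV."""
    fold_scores = []
    for held_out in k_fold_indices(len(targets), k, seed):
        held_out_set = set(held_out)
        train_targets = [y for i, y in enumerate(targets) if i not in held_out_set]
        # "Fit" on the training folds only, never on the held-out fold.
        prediction = statistics.median(train_targets)
        errors = [abs(targets[i] - prediction) for i in held_out]
        fold_scores.append(sum(errors) / len(errors))
    return sum(fold_scores) / len(fold_scores)


# Hypothetical delay values in minutes, for illustration only.
delays = [0.0, 3.5, 12.0, 7.0, 1.0, 25.0, 9.5, 4.0, 15.0, 6.0]
baseline_mae = cross_validated_mae(delays, k=5)
print(f"Baseline CV MAE: {baseline_mae:.2f} minutes")
```

The same loop structure applies to a real model: fit every preprocessing step and the model on the training folds, then score on the held-out fold, so that no information from the evaluation rows leaks into training.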

Deliverables

Participants must submit the following:

Google Colab Notebook:

Content:

  • Preprocessing of the dataset.
  • Exploratory data analysis (EDA).
  • Implementation of a regression model to predict delay_minutes.
  • Evaluation of model performance using Mean Absolute Error (MAE).
  • Insights drawn from the data and model results.

Submission Format: The notebook must be submitted as an HTML file (.html).

Predictions file

  • CSV with columns: order_id, delay_minutes
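
A minimal sketch of producing a file in this two-column layout with Python's csv module is shown below. The function name, the sample IDs and predictions, and the output filename in the trailing comment are assumptions for illustration; the header row uses the two column names given above.

```python
import csv
import io


def write_predictions(rows, file_obj):
    """Write (order_id, delay_minutes) pairs as CSV with a header row."""
    writer = csv.writer(file_obj, lineterminator="\n")
    writer.writerow(["order_id", "delay_minutes"])
    for order_id, delay in rows:
        writer.writerow([order_id, delay])


# Hypothetical IDs and predictions, for illustration only.
predictions = [(1001, 12.4), (1002, 0.0), (1003, 7.85)]

# Write to an in-memory buffer so the result can be inspected.
buffer = io.StringIO()
write_predictions(predictions, buffer)
print(buffer.getvalue())

# To write a real file instead (the filename is an assumption):
# with open("predictions.csv", "w", newline="") as f:
#     write_predictions(predictions, f)
```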

Business Presentation:

Audience: Tailor your presentation for a Data Science Lead in the logistics and delivery industry.

Content:

  • Business Overview: Clearly define the problem and explain your solution approach.
  • Key Findings: Highlight insights derived from the data and model results that drive business decisions.
  • Business Recommendations: Provide actionable steps to improve delivery operations using your model.
  • Potential Benefits: Explain how the solution can lead to smarter dispatch, pricing, and staffing decisions.

Submission Format: The presentation must be submitted as a PDF file (.pdf). Avoid copying code into the slides unless necessary to illustrate key points.

Submission Guidelines

  • Challenge Opens: Nov 15, 2025
  • Submission Deadline: Nov 30, 2025

Submit your deliverables through the competition portal. Ensure that all files are error-free and adhere to the submission formats.

Evaluation Criteria

Submissions will be judged on the following:

  • Model Accuracy: How accurately your model predicts delay_minutes, as measured by MAE.
  • Insights and Analysis: The depth and clarity of insights derived from the data.
  • Business Recommendations: Relevance, feasibility, and impact of your proposed actions.
  • Presentation Quality: Engaging and clear storytelling tailored to a business audience.

Best Practices for Success

  • Use Google Colab’s GPU runtime for faster training.
  • Document your notebook with inline comments and markdown cells for better readability.
  • Ensure the notebook runs sequentially without warnings or errors.
  • Keep your presentation concise, with a focus on business impact and actionable insights.

Why Participate?

  • Gain hands-on experience with real-world datasets and deep learning techniques.
  • Develop and showcase your data science and business communication skills.
  • Be part of the growing movement to integrate AI into logistics and delivery operations.

Join us in transforming last-mile delivery with AI. Your innovations today can shape the logistics of tomorrow!