π What Customers Are Really Saying β And What ExpressWay Logistics Can Do About It

Summary: This project applies sentiment analysis to customer reviews for a logistics company. Using natural language processing and large language models (LLMs), it classifies text into positive or negative sentiment, helping the company monitor service quality and customer experience.
π Full code & notebook: GitHub β Expressway Logistics Sentiment Analysis
In the fast-moving world of logistics, customer trust is everything. When packages arrive late, get lost, or show up damaged, the impact on brand reputation can be more lasting than the delay itself. In this case study β the final capstone project required to complete the Generative AI for Business with Microsoft Azure OpenAI Program from Great Learning β I analyzed customer reviews from ExpressWay Logistics to uncover what customers were really saying about their delivery experiences, and what the business could learn from it.

The project involved examining real customer feedback to understand where the company was falling short, and more importantly, what it could do to improve. The end result was a set of clear, evidence-backed recommendations for reducing delivery errors, reinforcing customer satisfaction, and turning feedback into action.
π From Complaints to Clarity
The customer reviews were raw, emotional, and unpredictable β just like real feedback tends to be. Some were glowing, but many expressed dissatisfactions. While the content was unstructured, the message was consistent: customers had experienced service failures, and they wanted to be heard.
To make sense of the feedback at scale, I used a Large Language Model (LLM) accessed through Azureβs OpenAI API, along with a carefully designed classification pipeline that sorted reviews into positive or negative sentiment. This enabled the extraction of recurring themes and actionable insights from a messy stream of natural language.
βοΈ A Technically Rigorous Foundation
While the business goals were central, the project also had clear technical objectives β as part of a structured capstone exercise. Specifically, I evaluated:
- How to use Azureβs OpenAI API to query a production-grade LLM
- The effectiveness of two prompt engineering techniques β zero-shot and few-shot β in classifying text reliably
- Whether the outputs of prompt-based sentiment analysis could support practical business decisions
This dual focus ensured both robust methodology and business-aligned outcomes, even in a simulated environment.
π What the Data Revealed
Several important patterns stood out:
- β Negative reviews commonly mentioned packages marked as βdeliveredβ that never arrived β revealing issues with backend tracking
- π Complaints often cited slow or unhelpful customer service β suggesting gaps in training or support capacity
- π Positive reviews consistently praised friendly, respectful delivery staff β a human asset worth protecting and promoting
These patterns helped isolate both risks and strengths in ExpressWayβs last-mile delivery process. While only a few key examples are discussed here, the full notebook contains a much broader set of operational and strategic recommendations derived from the model outputs.
π‘ Recommendations That Drive Action
The following recommendations emerged from the analysis:
- π Audit delivery status workflows to catch discrepancies between system status and actual outcomes
- π§ Introduce a sentiment-aware support queue, giving priority to customers flagged by the model as unsatisfied
- π Use high-performing drivers (as revealed in positive reviews) as internal benchmarks for hiring or training
Each recommendation ties directly to patterns in the data β not assumptions β making them easy to justify and test in practice.

I would like to invite the reader to explore Section 5 β Observations, Insights, and Business Perspective in the notebook, especially Section 5.3 β Business Recommendations, which contains a broader set of insights across five strategic areas.
π§βπ Sample Code Output
To give a concrete example of how the evaluation process was implemented, here is a verified code snippet from the ExpressWay sentiment analysis notebook. It shows how zero-shot and few-shot prompts were evaluated over multiple runs using gold-standard labeled examples:
# Running the evaluations
for _ in tqdm(range(num_eval_runs)):
# For each run create a new sample of examples
examples = create_examples(cs_examples_df)
# Assemble the zero shot prompt with these examples
zero_shot_prompt = [{'role':'system', 'content': zero_shot_system_message}]
# Assemble the few shot prompt with these examples
few_shot_prompt = create_prompt(few_shot_system_message, examples, user_message_template)
# Evaluate zero shot prompt accuracy on gold examples
zero_shot_micro_f1 = evaluate_prompt(zero_shot_prompt, gold_examples, user_message_template)
# Evaluate few shot prompt accuracy on gold examples
few_shot_micro_f1 = evaluate_prompt(few_shot_prompt, gold_examples, user_message_template)
zero_shot_performance.append(zero_shot_micro_f1)
few_shot_performance.append(few_shot_micro_f1)
π Final Takeaway
This project demonstrates how Large Language Models, used thoughtfully, can surface operational truth from noisy, human-written reviews. It also shows that the design of your prompt β the bridge between model and task β can make or break your results. With the right framing, even a generic LLM can deliver insights that matter.
π Full code & notebook in Github: https://github.com/musicwil/Generative-AI-projects/tree/main/sentiment-analysis
π Evaluator Feedback
"I am thoroughly impressed with your performance on this assignment. Your demonstration of proficiency shines through in the flawless execution of all necessary steps. The clarity and coherence with which you have aligned your concepts with the assignment criteria are exemplary, showcasing a deep understanding of the material. Achieving a perfect score of 30 reflects your exceptional attention to detail and commitment to excellence. Your dedication to mastering the subject matter is evident and serves as a model of academic achievement."
β Evaluator feedback from a capstone project in the Generative AI for Business with Azure program with Great Learning.