Over the weekend, American Airlines started boasting about its APEX Five-Star rating on Twitter, and the internet had a breakdown. Is this rating out of 100? What criteria were used? Because American Airlines certainly hasn’t had the best reputation among frequent flyers in the past year. This summer was especially rough as cancellations, maintenance disputes, and indifferent customer service agents sent everything down the tubes.
All the stars! We've been named a Five Star Global Airline by @theAPEXassoc — ratings are based on feedback by more than one million customers in five categories: seat comfort, cabin service, food and beverage, entertainment and Wi-Fi. Way to go #AATeam! pic.twitter.com/Z0NCNZNJ1P
— americanair (@AmericanAir) September 13, 2019
Just one personal example: My sister had to change her honeymoon flight on British Airways because of its pilot strike and got rebooked on American Airlines. She was thrilled. I told her to be careful what she wished for. Lo and behold, she later texted me from Philadelphia after cascading delays and a closed airport made her never want to fly again.
But her opinion will never be included in the APEX survey. Read on to learn why.
Can We Trust APEX Ratings That Are “Verified, Certified, and Validated”?
The most immediate concern you should have about APEX ratings is that the airlines being rated are also passing out the awards. True, the responses come from their customers, but the airlines have a lot of control over how the data are collected and interpreted.
As Mark Twain said: “There are lies, damned lies, and statistics.” Anyone can justify anything when it comes to data analysis, whether by adding more data or taking some out. Just because the inputs are real and the math is correct doesn’t mean the results are meaningful or useful. This is why any scientific manuscript spends the bulk of its text describing the methodology. If you only want the results, there’s usually a one-paragraph summary at the start.
Is APEX Using the Best Data?
Voting theory is among the most interesting parts of political science because the first thing you learn is that there is no single best way to choose a government. And choosing a government is the most important way that citizens (customers) indicate their satisfaction with a candidate (product).
Sometimes you get weird outcomes, like Clinton winning the popular vote while Trump won the Electoral College. Britain recently chose Boris Johnson as its new prime minister by polling only Conservative Party members. New Zealand uses an uncommon (but my favorite) method known as ranked-choice voting, which redistributes ballots to voters’ second choices when no candidate wins a majority of first-choice votes, so the eventual winner satisfies more of the total electorate.
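If the mechanics of ranked-choice counting are unfamiliar, here is a toy sketch in Python. The ballots and candidates are invented, and this illustrates instant-runoff tallying in general rather than any country’s official procedure:

```python
from collections import Counter

def instant_runoff(ballots):
    """Repeatedly eliminate the candidate with the fewest first-place votes
    and transfer those ballots to each voter's next surviving choice, until
    one candidate holds a majority of the ballots still in play."""
    ballots = [list(b) for b in ballots]
    while True:
        # Count each ballot toward its highest-ranked surviving candidate.
        tally = Counter(b[0] for b in ballots if b)
        total = sum(tally.values())
        leader, votes = tally.most_common(1)[0]
        if votes * 2 > total:
            return leader
        # Drop the candidate with the fewest first-place votes this round.
        loser = min(tally, key=tally.get)
        ballots = [[c for c in b if c != loser] for b in ballots]

# Invented ballots: "A" leads on first choices, but "B" wins once the
# eliminated "C" voters' second choices are counted.
ballots = [("A", "C", "B")] * 4 + [("B", "C", "A")] * 3 + [("C", "B", "A")] * 2
print(instant_runoff(ballots))  # -> B
```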
APEX only considers feedback from actual passengers, which is verified through Concur and TripIt (a company owned by Concur). I suspect that most people who use Concur or TripIt are business travelers, so there is already a risk of excluding the leisure segment. My sister may use Concur for her business travel needs, but this flight was booked as an award ticket through British Airways. She certainly doesn’t use TripIt for either personal or business travel.
APEX also effectively excludes any customers who are dissatisfied with a carrier and decline to travel with it anymore, because these people will never show up as “verified” passengers. My sister may have been a customer for this particular flight but definitely won’t book American again.
Finally, APEX excludes any passengers on a delayed flight because that negative event might impact their experience. Wait, what? Surely a plane that takes off and lands on time matters more to the customer experience than the type of Champagne served in the air. And as Ben points out, some of the criteria being scored may not even apply to every airline.
As a result, the APEX ratings are kind of like polling people for their approval of President Trump but asking only members of the Republican Party who actually voted in 2016. It’s true that the president’s own party represents his strongest political base, but one hopes that he is interested in the opinion of all citizens.
One further issue is that people who buy a product (or choose a candidate) don’t like to admit when they made a mistake. Ratings from verified customers could be inflated, intentionally or not.
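To put a rough number on that, here is a small simulation with entirely made-up figures (nothing here comes from APEX’s actual data) showing how the average rating creeps up once you survey only on-time, repeat, “verified” passengers. In this toy model the surveyed subset scores well above the full population even though the underlying service is identical:

```python
import random

random.seed(42)

# Invented population of trips: each has a true 1-5 satisfaction score,
# plus flags for whether the flight was delayed, whether the traveler is
# a repeat customer, and whether they booked through a "verifiable" tool.
trips = []
for _ in range(50_000):
    delayed = random.random() < 0.25
    # Delays drag satisfaction down in this toy model.
    score = max(1.0, min(5.0, random.gauss(4.0 - (1.5 if delayed else 0.0), 0.8)))
    # Unhappy travelers are far less likely to come back as repeat customers.
    repeat_customer = random.random() < (0.2 if score < 2.5 else 0.8)
    verified = random.random() < 0.5  # e.g. booked via a corporate travel tool
    trips.append((score, delayed, repeat_customer, verified))

def avg(sample):
    return sum(t[0] for t in sample) / len(sample)

everyone = trips
# On-time, repeat, "verified" passengers only -- roughly who gets surveyed.
surveyed = [t for t in trips if not t[1] and t[2] and t[3]]

print(f"all passengers:  {avg(everyone):.2f}")
print(f"surveyed subset: {avg(surveyed):.2f}")  # noticeably higher
```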
Are APEX Members Suppressing Bad Ratings?
Out of 72 airlines that earned four- or five-star ratings, 63 are members of APEX. This by itself is not too surprising. I would expect most airlines—especially large ones—to be members of this association, and I would expect most airlines—especially large ones—to have an interest in passenger satisfaction whether or not they are being rated by APEX.
But of the 80 airlines that are members of APEX, 62 earned four- or five-star ratings. The remaining 18 do not appear on the award list, which means they either did not qualify for consideration (perhaps Air Canada is not a “major airline”?) or received a low rating that was not published. This outcome is more suspicious since it implies that member airlines are able to suppress negative results.
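Some rough back-of-the-envelope arithmetic makes the contrast stark. I’m using the figures quoted above plus the roughly 600 rated airlines that APEX’s CEO cites in the tweet near the end of this post, so treat these as approximations rather than audited numbers:

```python
# Rough comparison using the figures quoted in this post. The member/award
# overlap is reported as 62 or 63 depending on which direction you count,
# and the ~600 total is approximate, so treat the output as illustrative.
apex_members           = 80    # airlines that belong to APEX
members_with_award     = 62    # members that earned four or five stars
award_winners_total    = 72    # all airlines with four or five stars
airlines_rated_total   = 600   # approximate figure from the APEX CEO's tweet

non_members_rated      = airlines_rated_total - apex_members
non_members_with_award = award_winners_total - members_with_award

member_rate     = members_with_award / apex_members
non_member_rate = non_members_with_award / non_members_rated

print(f"members earning 4-5 stars:     {member_rate:.0%}")      # ~78%
print(f"non-members earning 4-5 stars: {non_member_rate:.1%}")  # ~2%
```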
I should also note there are some inconsistencies in how APEX publishes its results. There are four categories: Global, Major, Regional, and Low Cost. However, only the five-star ratings are broken out into separate categories for Major and Regional, while the four-star ratings are lumped together in a single “Major Regional” category. The Low Cost carriers do not have any five-star-rated airlines, but this in itself is not surprising given their business strategy. I don’t see any evidence that they are disqualified from a five-star rating.
I am not claiming here, nor have I ever claimed, that airlines are paying APEX for favorable ratings. I actually think there are enough flaws in the system that airlines don’t have to pay. If an airline still manages to earn a bad rating, it just gets swept under the rug.
Do APEX Ratings Impact Customer or Business Decisions?
Finally, let’s get to the heart of the issue. APEX is doing all this measuring and coming up with ratings, but do they actually matter? Will an airline change its practices to earn a higher rating, or will a customer choose to fly on an airline with a higher rating than a competitor?
I just mentioned that nearly all APEX members earned four- or five-star ratings. It’s really difficult to see how these airlines can distinguish themselves when everyone gets the same score.
This is just like grade inflation in school. I had a professor who told the class that if anyone got 100% on an exam, he would throw it away and never use it again. That exam was clearly useless since it had failed to adequately measure how much we learned. If more than one person scored 100%, then there was no way to determine who had learned more. How useful is an airline rating system to customers when nearly everyone earns a 5/5?
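Here is a trivial illustration of the same problem with made-up numbers: once you collapse a continuous score into a coarse star band, two carriers with meaningfully different averages become indistinguishable.

```python
# Invented average satisfaction scores on a 1-5 scale for two hypothetical carriers.
carriers = {"Airline X": 4.55, "Airline Y": 4.95}

def star_band(score, threshold=4.5):
    """Collapse a continuous score into a coarse label, the way a
    four-or-five-star award does."""
    return "Five Star" if score >= threshold else "Four Star"

for name, score in carriers.items():
    print(f"{name}: underlying {score:.2f} -> {star_band(score)}")
# Both print "Five Star", so the published rating can no longer tell them apart.
```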
I also mentioned the problem with limiting your dataset to likely business travelers who didn’t suffer any delays and had already committed to paying for your service. This isn’t going to help anyone in management spot a problem, grow the customer base, or expand into different customer segments.
Finally, I have an issue with how APEX categorizes these airlines into Global, Major, Regional, or Low Cost. This is the least important point I’m going to make, but I’ve written so much it’s worth throwing in at the end. Just how much do these categories matter anyway, and are they relevant to how the data were collected or how customers view the airline?
For example, I’ve had plenty of people complain to me when I call Southwest a low-cost carrier. They say it shouldn’t be in the same bucket as Allegiant or Spirit. That’s a perception issue. But I think there is another reason to view the categories as capricious and subjective.
If APEX is going to use verified passengers to collect its data, then it makes sense that it should use some related measure of passenger volume or seat capacity to categorize the airlines being ranked. That would at least correlate with the number of passengers who can contribute their opinions to the data set.
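For what it’s worth, a capacity-based grouping would be almost trivial to define. The carriers, seat counts, and cutoffs below are invented placeholders, purely to show the shape of the idea:

```python
# Hypothetical seat-capacity figures (placeholders, not anyone's actual data),
# bucketed by fixed cutoffs instead of subjective Global/Major/Regional labels.
fleet_seats = {
    "Carrier A": 250_000,
    "Carrier B": 90_000,
    "Carrier C": 12_000,
}

def capacity_tier(seats):
    """Assign a tier purely from seat capacity, so the grouping tracks how
    many passengers could plausibly contribute opinions to the survey."""
    if seats >= 150_000:
        return "Tier 1 (largest)"
    if seats >= 40_000:
        return "Tier 2"
    return "Tier 3"

for carrier, seats in fleet_seats.items():
    print(f"{carrier}: {seats:,} seats -> {capacity_tier(seats)}")
```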
OAG, an aviation data and analytics group, recently provided data to Business Insider on airlines with the most available seats in their fleet (a product of both the number of aircraft and the size of each aircraft). Since this kind of data is often behind paywalls, I can’t access it myself, but I’d be glad to take a look at other resources if you want to share them in the comments.
The 20th largest airline in the world is one close to my own heart: Alaska Airlines. APEX lists it as a Major airline, presumably because it doesn’t have the global reach of a carrier like Aeroflot, the 19th largest airline. (Both got five stars, by the way.)
Air Canada, coming in at #16, is nowhere to be found on the APEX ratings list. This supports my suspicion that it scored three stars or fewer, since I think most people would argue it is either a Global or Major airline. And yet this is doubly perplexing because, in my own opinion, Air Canada is a pretty good carrier. Other airlines missing from APEX:
- IndiGo (#13)
- Air China (#10)
- EasyJet (#8)
- Ryanair (#5)
I’m not saying this methodology of using seat capacity to categorize airlines is better. I’m just saying it’s another option and should make you ask if APEX’s methodology is the right choice. Remember, anyone can justify anything.
Conclusion
And that brings us back to where we started. The largest airline in the world by seat capacity is American Airlines. APEX apparently believes they are a five-star airline, one of the best of the best as far as passenger experience is concerned. When the carrier started promoting this on Twitter, I and others found it amusing and a sign that the ratings in general were not to be trusted. The CEO of APEX responded, suggesting we can trust the methodology because he is a “Ph.D. and analytical purist.”
TONS of them. Of nearly 600 airlines rated worldwide, a single digit percentage of them reach Four Stars. An even smaller single digit percentage of them reach Five Stars. I am a PhD and analytical purist that believes in our responsibility for truth from verified passengers.
— Dr. Joe Leader (@joepleader) September 14, 2019
Well I’ve got news for you: I’m also a Ph.D., and my day job is campaign reporting and analytics for a major corporation. I create and track KPIs. My goals are to make sure (1) we trust the data that goes into each metric, (2) we can take action on the outcome of each metric, and (3) these metrics cannot be manipulated either intentionally or unintentionally.
I don’t trust that the APEX ratings satisfy any of those criteria. There are big gaps in the data they use, which are biased toward happy customers. There is a big gap in the ratings produced, which are too homogeneous and fail to call out struggling carriers. And there is an overall lack of transparency in the process that makes me skeptical of everything in between.