How To Get Top 5 Reasons For Each Airline?

I need only the top 5 reasons for each airline. I managed to build a crosstab for all airlines, but it is not sorted and it displays every reason. How can I narrow down my results?

Solution 1:

This is not the best solution, but it does the job.

top_n = 5
gb = df.groupby(['airline', 'negativereason']).size().reset_index(name='freq')
df_tops = gb.groupby('airline').apply(lambda x: x.nlargest(top_n, ['freq'])).reset_index(drop=True)

It takes two steps: first, count how often each negativereason occurs per airline; second, keep the top_n reasons for each airline, ranked by frequency.
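The two steps can be tried on a small made-up DataFrame. This is only a sketch: the column names airline and negativereason come from the answer, but the rows are invented for illustration, and it uses sort_values followed by groupby(...).head(top_n) rather than apply with nlargest, which should select the same rows when there are no ties.

```python
import pandas as pd

# Hypothetical toy data: column names follow the answer, the rows are invented.
df = pd.DataFrame({
    "airline": ["American"] * 6 + ["Delta"] * 6,
    "negativereason": [
        "Late Flight", "Late Flight", "Late Flight",
        "Lost Luggage", "Lost Luggage", "Bad Flight",
        "Bad Flight", "Bad Flight", "Bad Flight",
        "Late Flight", "Late Flight", "Can't Tell",
    ],
})

top_n = 2

# Step 1: frequency of each reason per airline.
freq = (
    df.groupby(["airline", "negativereason"])
      .size()
      .reset_index(name="freq")
)

# Step 2: order by airline, then by descending frequency,
# and keep the first top_n rows of each airline.
df_tops = (
    freq.sort_values(["airline", "freq"], ascending=[True, False])
        .groupby("airline")
        .head(top_n)
        .reset_index(drop=True)
)

print(df_tops)
```

With real data, ties at the cut-off are kept or dropped according to row order, so it is worth checking the rows just below the cut-off if the exact membership matters.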

Solution 2:

I managed to get the count of each negativereason for each airline, but I still can't get the top 5 results for each airline, sorted from highest to lowest.

count = df.groupby(['airline','negativereason']).size()
print(count) 

airline         negativereason
American        Bad Flight                      87
                Can't Tell                     198
                Cancelled Flight               246
                Customer Service Issue         768
                Damaged Luggage                 12
                Flight Attendant Complaints     87
                Flight Booking Problems        130
                Late Flight                    249
                Lost Luggage                   149
                longlines                       34
Delta           Bad Flight                      64
                Can't Tell                     186
                Cancelled Flight                51
                Customer Service Issue         199
                Damaged Luggage                 11
                Flight Attendant Complaints     60
                Flight Booking Problems         44
                Late Flight                    269
                Lost Luggage                    57
                longlines                       14
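One possible way to finish from here is to take the largest values within each airline group of the count Series. The sketch below rebuilds count by hand from the American and Delta figures printed above (an assumption made so the snippet runs without the original df), then calls nlargest per group; group_keys=False is used so the airline level is not repeated in the result's index.

```python
import pandas as pd

# Counts copied from the printed output above; only American and Delta are shown there.
reasons = [
    "Bad Flight", "Can't Tell", "Cancelled Flight", "Customer Service Issue",
    "Damaged Luggage", "Flight Attendant Complaints", "Flight Booking Problems",
    "Late Flight", "Lost Luggage", "longlines",
]
american = [87, 198, 246, 768, 12, 87, 130, 249, 149, 34]
delta = [64, 186, 51, 199, 11, 60, 44, 269, 57, 14]

# Rebuild a Series shaped like the result of
# df.groupby(['airline', 'negativereason']).size()
index = pd.MultiIndex.from_product(
    [["American", "Delta"], reasons],
    names=["airline", "negativereason"],
)
count = pd.Series(american + delta, index=index)

# Top 5 reasons within each airline, sorted from highest to lowest.
top5 = count.groupby(level="airline", group_keys=False).nlargest(5)

print(top5)
```

With the original data, the same last step should apply directly to the count variable from the snippet above.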

Solution 3:

One approach:

Dataset

Airline,Bad Flight,Cant Tell,Cancelled Flight,Customer Service Issue,Damaged Luggage,Flight Attendant Complaints,Flight Booking Problems,Late Flight,Lost Luggage,Longlines
American,87,198,246,768,12,87,130,249,149,34
Delta,64,186,51,199,11,60,44,269,57,14
Southwest,90,159,162,391,14,38,61,152,90,29
US Airways,104,246,189,811,11,123,122,453,154,50
United,216,379,181,681,22,168,144,525,269,48

Code

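import pandas as pd

# Load the table above, using the first column (the airline name) as the index
air = pd.read_csv("airlines.csv", index_col=0)
print(air)
print(" ")
# Sort American's counts in descending order and keep the five largest
american5 = air.loc["American"].sort_values(ascending=False).head(5)
print(american5)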

Output

            Bad Flight  Cant Tell  Cancelled Flight  Customer Service Issue  Damaged Luggage  Flight Attendant Complaints  Flight Booking Problems  Late Flight  Lost Luggage  Longlines
Airline
American            87        198               246                     768               12                           87                      130          249           149         34
Delta               64        186                51                     199               11                           60                       44          269            57         14
Southwest           90        159               162                     391               14                           38                       61          152            90         29
US Airways         104        246               189                     811               11                          123                      122          453           154         50
United             216        379               181                     681               22                          168                      144          525           269         48

Customer Service Issue    768
Late Flight               249
Cancelled Flight          246
Cant Tell                 198
Lost Luggage              149
Name: American, dtype: int64
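The same idea can be extended from American to every airline by looping over the rows of the table. The following is a sketch under a few assumptions: the dataset above is inlined through io.StringIO instead of being read from airlines.csv, the first column is named Airline, and the name top5 is introduced only for illustration.

```python
import io

import pandas as pd

# The dataset from this answer, inlined so the snippet runs without airlines.csv.
csv_text = """Airline,Bad Flight,Cant Tell,Cancelled Flight,Customer Service Issue,Damaged Luggage,Flight Attendant Complaints,Flight Booking Problems,Late Flight,Lost Luggage,Longlines
American,87,198,246,768,12,87,130,249,149,34
Delta,64,186,51,199,11,60,44,269,57,14
Southwest,90,159,162,391,14,38,61,152,90,29
US Airways,104,246,189,811,11,123,122,453,154,50
United,216,379,181,681,22,168,144,525,269,48
"""

air = pd.read_csv(io.StringIO(csv_text), index_col=0)

# For each airline (one row of the table), keep its five largest counts.
# nlargest returns them sorted from highest to lowest.
top5 = {airline: row.nlargest(5) for airline, row in air.iterrows()}

for airline, counts in top5.items():
    print(airline)
    print(counts.to_string())
    print()
```

When two reasons tie at the cut-off, as Bad Flight and Lost Luggage do for Southwest (90 each), nlargest keeps the one that appears first in the row by default.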
