This project analyzes the Ford GoBike bike-sharing system data to uncover trends in user behavior, trip duration, peak usage times, and more. The dataset includes details on ride duration, user type, gender, birth year, and trip timestamps.
-
duration_sec: the duration of the trip in seconds
-
start_time: the date and time when the trip started
-
end_time: the date and time when the trip ended
-
start_station_id: The unique identifier for the station where the trip began
-
start_station_name: the name of the station where the trip began
-
start_station_latitude: The latitude coordinate of the start station
-
start_station_longitude: The longitude coordinate of the start station
-
end_station_id: the unique identifier for the station where the trip ended
-
end_station_name: The name of the station where the trip ended
-
end_station_latitude: The latitude coordinate of the end station
-
end_station_longitude: The longitude coordinate of the end station
-
bike_id: The unique identifier for the bike used during the trip
-
user_type: The type of user: "Subscriber" for members or "Customer" for casual users
-
member_birth_year: The birth year of the user
-
member_gender: The gender of the user
-
bike_share_for_all_trip: Indicates whether the trip was part of the "Bike Share for All" program which offers discounted memberships to low income residents
- Trip Duration: Most trips last under 16 minutes, but outliers extend beyond 83 minutes.
- User Types: The majority of riders are subscribers (regular users), while customers (occasional riders) take longer trips on average.
- Gender Distribution: Most riders are male, followed by female. A smaller group chose 'Other', showing higher trip duration variation.
- Age Distribution: Most users were born between 1980 and 2000, but some data entries suggest birth years before 1900, likely errors.
- Peak Usage Patterns:
- Weekdays: Peaks at 8 AM and 5-6 PM, indicating commuter behavior.
- Weekends: Higher activity from 12 PM to 4 PM, suggesting leisure trips.
- Trip Duration by User Type:
- Customers tend to have longer trips.
- Subscribers have consistent trip durations.
- Daily Usage Patterns:
- Subscribers ride more on weekdays (especially Tuesday-Thursday).
- Customers have more balanced usage, with slightly higher rides on Saturdays.
The dataset used is the 201902-fordgobike-tripdata.csv, which contains:
- Trip details (start/end time, duration, start/end station)
- User information (user type, gender, birth year)
- Bike-sharing participation
- Python π
- Pandas π
- Matplotlib & Seaborn π¨
- NumPy π’
This analysis reveals distinct biking patterns in the Ford GoBike system, differentiating between subscriber and customer behaviors. Peak times align with work commutes, while weekends show increased leisure activity.
- Enhance data cleaning to handle unrealistic birth years.
- Include weather data to assess its impact on bike usage.
- Develop a predictive model for trip duration based on user behavior.
This project was conducted as part of the DECI and Udacity Data & AI Track.
- Clone this repository:
git clone https://github.com/AmmarYasserAI-2/FordGoBike-Analysis.git
- Install dependencies:
pip install numpy pandas matplotlib seaborn
- Run the Jupyter Notebook or Python scripts.