For two key reasons, public transportation accessibility has become a hot topic for scholars and transit organizations. First, improved transit accessibility promotes active transportation (such as walking and bicycling) while decreasing private vehicle use. As a result, it will enhance public health and reduce GHG emissions –. Second, transit-dependent people primarily rely on public transportation to reach key services (e.g., workplace, university, and shopping center). Thus, transit accessibility is crucial to attaining socioeconomic fairness, . Also, transit accessibility research may help guide decisions about transportation investment and land use development.
GTFS stands for General Transit Feed Specification. It was developed by TriMet and Google in Portland . Google announced transit feed specs in 2007, enabling transit agencies to develop and publish transit data online as open sources using the GTFS format. The feed rapidly became the most extensively used standard for exchanging static transit data in Canada and the United States  . Additionally, the transit sector has embraced the GTFS format as a standard for communicating schedule data due to its expanding popularity. Subsequently, software such as OpenTripPlanner, GoogleMaps, and Bing Maps was developed and updated to use GTFS and provide services like, stop locations, timetables, and route planning. The trip (or route) planner is essential to these applications since it examines GTFS data for possible routes between two places, which is the most demanded service.
Handling a user trip planning request requires two actions or steps. First, identifying all feasible pathways or routes between two places as candidate solutions; second, filtering and validating these solutions according to the user's schedule and start time. Route planning is more complex than identifying a path or route in a graph since it considers journeys, directions, and intermediate transfers between bus stops or stations. The shortest path of a graph is a frequent issue; various algorithms have been developed to solve it –. Depending on the factors used to calculate the weight of graph edges, the algorithm may be bi- or multi-criteria. For instance, if the graph edges depict roads, the road weight may be a bi-criterion, taking distance and cost into account, or a multi-criterion, taking other parameters into account. Numerous techniques attempt to solve the shortest path issue by decreasing the considered factors to a single value; these algorithms fall into types like two-phases algorithms , kth shortest-path , label correction, and setting algorithms–, and others –. A subset of these techniques is used in certain studies  to locate pathways in local transportation networks. We have already introduced a trip planning algorithm variation this algorithm ignores the weight criteria while checking for all possible next transitions from the current and uses a limit for the number of made transitions to find the best route (trip plan). This trip planning algorithm can find the possible trip plans without considering the timing factor. However, some plans may be rejected due to time conflict between the plan trips and the GTFS trips timetable or trip unavailability at the time the user determines to start the journey. Therefore, we must calculate and find the trips time and transit accessibility as a next step.
To calculate transit accessibility in spatiotemporal dimensions, trip time for station pairs must be calculated at any particular time of day, which is practically impossible with a standard computer as it is time-consuming and needs high computation powerAlthough previous studies  introduce algorithms that try to calculate the trip time and transit accessibility while reducing the time complexity and computational power, there is still a need to find an approach to simplify the complexity of such problems solution and reduce the required time and resources, what is the aim of this paper. Algorithms enhancement is a common research topic–. The contribution of this paper has two parts. First, we introduce a new time validation algorithm that can find the timing information for a trip plan or reject the plan if there is a time conflict according to the trips timetable in the GTFS. Second, we go beyond the algorithm enhancement and propose RMH (Range Mapping Hash), which is a new method that can find and extract the timing information for any trip using GTFS data with O(2) time complexity. Our new approach (RMH) eliminates the need for an algorithm to search the GTFS timing records. We use Redis NoSQL Hash to create RMH. Thus we provide a solution by turning the problem of simplifying the existing algorithms into a simple database interaction that can run even on a stander computer. The idea is that for a route going through a station, at any minute between the last going bus and the next bus, the answer for the question "when is the next bus time" will be the time of the next bus. The RMH is applicable not only for the GTFS timing data but also for improving the performance of similar problems, as we describe later. We experiment with the performance of RMH and compare it to run-time search algorithm performance using arbitrary search input for 30 pairs of origin and destination stops using the GTFS data of Debrecen and Budapest. We implement the algorithm and RMH as an open-source project using C# and Redis available on https://github.com/mustafamajid/GTFS-csharp. The project also includes our published trip planning algorithm.
Next, we review our route planning algorithm and its output data structure. Then we introduce the time validation algorithm, which will use this output to provide the trip's timing information. Later, we introduce the RMH approach and the Redis implementation. Finally, we list our experiment's results and performance evaluation.
2- GTFS and Trip Planning
The GTFS data is a set of tables, usually in CSV file format. There are three main objects in the GTFS: route, trips, and stop . The routes represent the pathway used by the vehicle, a bus, tram, train, etc., and are usually denoted by the vehicle name. The route visits a set of stops in a specific sequence where the stop can be a bus or tram stop or a train or subway station.
Planning a trip between two locations (the start and destination) requires finding all possible single or combinations of trips that can take the passenger from start to destination points. Finding trip plans can be divided into two steps. First, find all possible routes that can connect the start to the destination point, and these will be the candidate solutions list. Then find the trip's timing information and check for any time conflict in the candidate's plans. To understand the problem, we use Figure 1 as an example of the GTFS data. The figure shows three routes, A, B, and C going through a set of stops denoted by circles with numbers. We consider that the user wants to start the trip from stop 7 at 11:10:00 going to stop 6. Therefore, the candidate solutions will be as follows: first, the user takes route C to stop 9 and then takes route A from stop 9 to stop 6. The second solution is that the user walks from stop 7 to stop 3 and then takes route B to stop 6 if the distance is walkable . The next step is to validate the solution according to the timetable. For the first solution, as in the figure, if the user starts at 11:07:00, 11:17:00, or 11:27:00, he will arrive to stop nine at 11:35:00, 11:35:00, or 11:45:00, respectively. Thus, the user will take the trip at 11:17:00 because it is the earlier trip and then arrive at stop 9 at 11:35:00, where the next trip using route A will be at 11:45:00 and reach the destination at 12:05:00. In the same way, we can find the time information for the second solution. As we mentioned earlier, a solution can be rejected if there is a time conflict; for example, the first solution may be rejected if there is no outgoing trip using route A from stop 9 any time after 11:35:00.
Fig 1. GTFS routes and stops example.
3- Find Trip Time Information
3-1- Data Structure
Finding trip plans according to the time is a complex problem and needs computation power that is not provided by standard computers . For this work, we use the output structure (the candidate solutions) provided by our previously proposed trip routes planning algorithm  as an input to introduce our new trip timing algorithm. Figure 2 shows the UML design of the trip routes planning algorithm output with additional fields to store the time data. The structure contains three main objects Solution_LIST, PATH, and MOVE. Each MOVE represents a single transition from a start-stop to an end-stop using a route (e.g., a bus), and PATH denotes a trip plan or solution containing at least one or more transitions (MOVE) stored in a list called Way. The algorithm's final output is a list of PATH called the Solution_List. The MOVE_WITH_TIME class was inherited from the MOVE class and contained the arrival and departure time fields. Finally, a list of MOVE_WITH_TIME is added to the PATH class called Way_With_Time. The task of the next trip timing algorithm is to validate the PATH by checking every MOVE object in its Way list. If time conflict is found in any MOVE, the whole PATH will be rejected; Otherwise, a new MOVE_WITH_TIME object will be created from the current MOVE by adding the timing fields. The newly created list of MOVE_WITH_TIME objects will form the Way_With_Time list.
Fig 2. Algorithm data structure.
The stoptimes.txt file list a set of records for each trip; each record contains stop ID, trip ID, trip arrival, and departure time at that stops. The set of trips records is present in the file ordered by trip ID and the arrival time. Thus, if the file starts to list a trip that visits ten stops at row number N, then the row N contains timing data about the first stop, row N + 9 shows the data about the last stop that the trip visits, and row N + 10 will list data for the first stop of another new trip if any. Every PATH must be checked by examining the MOVEs in its WAY list using T's time. The stoptimes.txt file record is checked sequentially to find the trip with the closest time to T. Initially, T is set to the time determined by the user (USER_TIME) to start the trip, and during the next MOVEs check, T is set to the arrival time at the last checked MOVE end stop. The check starts from the first record in the stoptimes.txt until finding the first record (i) with stop_id equal to the MOVE start_stop_id and with the same route used by the MOVE and the departure time is greater than T and one of the next record (i + j) in the same trip with stop_id equal to the MOVE end_stop_id. Where j is the number of intermediate stops, if such records are found, a MOVE_WITH_TIME object is created using the examined MOVE and record (i) departure time as Start_time and record (i + j) arrival_time as arrive time for the new MOVE_WITH_TIME object and as the new T value for the next MOVE check. If no such record is found in the stoptimes.txt file, then the whole PATH is rejected and mentioned as an unacceptable solution. The new resulting MOVE_WITH_TIME objects are used to form a WAY_WITH_TIME list. Figure 3 shows the Trim timing algorithm that validates the MOVE according to the trip's timing information. The algorithm input is the start-stop from which the MOVE starts, the end-stop where the MOVE ends, the route used to make that MOVE, and the user's time. The algorithm output must be the trip on that route with the nearest time to T.
Input: WAY a List of MOVE, USER_TIME.
OutPut: WAY_WITH_TIME as List of MOVE_WITH_TIME objects
Step1: T=USER_TIME , WAY_WITH_TIME = empty
Step2: ForEach MOVE M in WAY
Do Step4 To Step5
Step3: Set M_WITH_TIME =NULL,
Step4: For (i=0 ; i< Stoptimes.Length-2 ; i++)
IF(Stoptimes[i].stop_id == M.start_stop_id && Stoptimes[i].Route == M.route_id && Stoptimes[i].Departure_time >T ) Then:
For (j=i+1 ; i< Stoptimes.Length-1 ; j++)
IF (Stoptimes[j].stop_id == M. end_stop_id)
M_WITH_TIME = MOVE_WITH_TIME ( M,Stoptimes[i].Derparture_time Stoptimes[j].Arraival_time).
ADD M_WITH_TIME to WAY_WITH_TIME,
Goto Step2 check the next MOVE
Step5: IF M_WITH_TIME ==NULL Then:
Return FALS and Exit Else
Step6: Return WAY_WITH_TIME
Fig 3. Trip timing algorithm.
3-3- Time Complexity
Searching the stoptimes.txt file record is a time-consuming process as any linear search, the algorithm performs a linear search for timing information. Let N is the number of records present in the stoptimes file. Then, the best case is if the stop with time greater than T is at the stoptimes file's first record. The worst case is O (N-1), and the average case is O((N-1)/2). Thus, the algorithm time is increased by increasing the number of records. We ignore the number of records between the start and the end stops, as this number is minimal compared to N.
Redis is a high-performance in-memory NoSQL database written with C and worked on most POSIX platforms . Redis is a message broker and session manager that stores data in key-value pairs. An HTML page including its resources may be serialized to a string and saved in a Redis to enable a high-speed page load. Thus, software organizations prefer Redis for its fast performance and scalability. Strings, Lists, Sets, Hashes, and Sorted Sets are the five data structures available in Redis. In this research, we will depend only on the Hash structure. All our object data will be converted to a string and concatenated before being stored in the Redis Hashes. A wide range of programming languages supports Redis. Each language has its libraries and packages for communicating with and manipulating the Redis server. In this project, we utilized StackExchange.Redis, which can be installed using NuGet Package Manager.
5- Range Mapping Hash (RMH)
We propose the RMH as a Redis model to avoid the time-consuming liner data scanning by mapping the input parameters to the output directly without any liner search or scan using the power of hash structure in Redis. For each route between any two stops, we need to map a route and two stops ID and time T to the ID of the trip with the nearest time to T on that route, the trip departure time at the start-stop, and trip arrival time at the end stop. The RMH consists of two structures. Both structures querying results are combined to form the timing answer. We use a Redis Hash structure for the implementation. The Hash structure syntax has three-part, the KEY, which refers to the Hash name, the FIELD that uniquely identifies a row in the hash; and the VALUE. The HGET and HSET commands are used to retrieve and insert data into Redis Hash .
5-1- RMH Trip Departure Time Structure
The first structure is used to map the start-stop ID, route ID, and Time (T) into the next trip's ID and time at this stop using that route. For this structure, we create a Redis hash for each stop route pair to store all trip visiting time at the stop. The Hash KEY part will mention the stop ID and the route ID separated by the "__" string. The FIELD will contain the time to be examined and denote it as T. The hash VALUE part holds the time of the next coming trip according to T, and the trip ID, separated by the "__" string. Figure 4 shows the trip time structure.
All this information is available in the stoptimes file except the time (T). Any possible T value belongs to the set of sharp minutes in the day. Thus a maximum of 1440 entries is needed to cover all the possibilities. For each stop ID, route ID pair from the stoptimes data, and a time T entry, the value field contains the time of the next coming trip and the trip ID separated by "___". For example, in Figure 4, we take route 10 and stop 1100905. Three trips, 208,209, and 210 are listed in the stoptimes file with departure times 11:44:00, 11:54:00, and 12:04:00. At the first hash entry, where the T value is 11:43:00, the answer (the Hash value field) shows the time 11:44:00 and trip ID 208. For any value of T equal to or greater than 11:44:00, the answer will be trip 209 as its time is 11:54:00. When T is greater or equal to 11:54:00, the answer will be trip 210 and its time 12:04:00.
Fig 4. RMH trip departure time structure.
5-2- RHM Arrival Time Structure
After using the trip departure time structure, we have the departure time from the start-stop and the trip ID. Finally, we can find the end stop's arrival time using the end-stop ID and the trip ID using the arrival time structure. Figure 5 shows the arrival time structure, a single Redis Hash for each stop, to list all the trip arrival time combinations. The KEY part of the Redis Hash is used to refer to stop using the stop ID; the FIELD is used to refer to the trip ID, where the VALUE stores the arrival time.
For example, in Figure 5, the trip planning algorithm results in a MOVE with start-stop 1100905, end stop 1002315, and route 10. If the T value is 11:45:00, the timing validation will work as follow: HGET statement using the trip departure time structure is used to retrieve the trip ID and the departure time from the start-stop. The HGET KEY will be "1100905__10" (start-stop ID and the route ID); the FIELD part is "11:45:00" (T). the returned value from this HGET statement will be "11:54:00__209" (the departure time and the trip ID), as shown in figure 8. Now, the trip is known, and the end stop's arrival time is the only missing part. Another HGET statement with KEY is "1002315", and FIELD "209" is used with the arrival time structure; the return value from this statement will be "12:02:00 (the arrival time at the end stop) as shown in Figure 9. If any of the two structures did not return a match for the HGET command, the MOVE is rejected and the whole PATH (path). Thus the total time complexity of RMH formed by two Hash table read operations only. As the time complexity for reading from a Hash structure is O(1), then RMH total complexity is O(2).
Fig 5. RHM arrival time structure.
6- Experiment and Results
6-1- Experiment Tool
We implement the trip timing algorithm as a C# project. The project is a WinForm application (GUI) containing a set of classes: GTFSData for loading and preprocessing the GTFS data, Algorithm class contains the implementation of our previously published trip planning algorithm, TimeCalculator class contains the implementation of the trip timing algorithm (trip timing), Redis action class include the code for connecting to Redis database load the data to Redis and retrieve the solution and other classes. Figure 6 shows the UML design of the main classes in the project.
We used the GetMovesWithTime( ) function from the TimeCalculator class for this experiment, which takes a MOVE list and start-time as a parameter and returns a MOVE_WITH_TIME list. We calculate the execution time for this function and compare it with the execution time for reviving the timing information using the RMH using the GetSolFromRedis( ) function from the RedisAction class.
The GetMovesWithTime() illustrates the implementation of the trip timing algorithm given earlier, whereas the GetSolFromRedis() represents the interaction with the Redis RMH model, which is implemented as mentioned before. This function will retrive the same result returned by the GetMovesWithTime() function (for the same parameters). The Redis RMH serves in a similar way to a data warehouse model where redundant data is stored and utilized to serve the application purpose.