Transit agencies seek to offer high-quality, door-to-door travel options to people with limited mobility. Paratransit operations define a complex and costly middle ground between on-demand ride hailing and fixed-route transit, as pickup-and-delivery systems with advance demand visibility and employed drivers. This work addresses the strategic paratransit planning problem by optimizing workforce schedules that are robust to trip cancellation uncertainty. We contribute a two-stage stochastic program that efficiently integrates two hard combinatorial optimization problems: workforce planning and downstream dial-a-ride operations. To solve the formulation, we develop a scalable, exact, and generalizable solution algorithm based on Benders decomposition. We solve subproblems that are restricted to selected first-stage driver itineraries, and then use a primal-dual approach to construct full subproblem solutions that yield valid global optimality cuts. Computational results show that our solution algorithm consistently outperforms benchmark algorithms in case studies at scale. We reduce a 45.3% average optimality gap in 1 hour to 0% in 10 minutes over standard Benders decomposition. Preliminary practical insights demonstrate that optimized itineraries significantly reduce costs over historical shifts by leveraging pickup time windows and contractor outsourcing.