Ultra-dense 5G and beyond deployments are placing a significant burden on cellular networks, especially on wireless backhauls. Careful planning of the wireless backhaul is therefore more critical than ever. In this letter, we study the hierarchical wireless backhaul topology design problem. We introduce a Deep Reinforcement Learning (DRL) based algorithm that solves the problem efficiently. We compare the quality of the solutions obtained by our DRL approach with the optimal solution, obtained from the Integer Linear Program (ILP) formulation in our previous work. Simulation using practical channel propagation scenarios and different network densities shows that our DRL-based algorithm provides sub-optimal solutions with different levels of resiliency. Our DRL algorithm is further shown to scale to larger instances of the problem.