Privileged Reinforcement and Communication Learning for Distributed, Bandwidth-limited Multi-robot Exploration