1. The document proposes a system for high-accuracy mapping from street-level video, built on convolutional neural networks and labeled image data derived from municipal maps.
2. Labeled image data from municipal maps would be used to train the convolutional neural networks to detect and map objects such as trees, lampposts, and traffic signs in street-level panoramic images.
3. A prototype system can detect some objects that are absent from municipal maps, such as driveways and gardens, but accurately mapping 3D objects from 2D images remains a challenge. The system would allow inexpensive, on-demand, up-to-date mapping of cities from street-level panoramic video captured from vehicles.
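
The geometry behind items 2 and 3 can be illustrated with a minimal sketch. It is not the method of the summarized document; it only shows one plausible way a mapped object could be projected into a panorama to produce a training label, and how two detections from different vehicle positions could be intersected to recover a map position. Everything here is an assumption made for illustration: the function names are hypothetical, positions are in a local flat coordinate system (metres east, metres north), the panorama is equirectangular with its centre column facing the vehicle heading, and object height and lens distortion are ignored.

```python
import math


def bearing_deg(cam_xy, obj_xy):
    """Compass bearing (degrees clockwise from north) from camera to object."""
    dx = obj_xy[0] - cam_xy[0]  # east offset in metres
    dy = obj_xy[1] - cam_xy[1]  # north offset in metres
    return math.degrees(math.atan2(dx, dy)) % 360.0


def panorama_column(cam_xy, heading_deg, obj_xy, width_px):
    """Pixel column of a mapped object in an equirectangular panorama.

    Assumes the centre column faces heading_deg and the image spans 360 degrees.
    """
    # Angle of the object relative to the heading, wrapped into [-180, 180).
    rel = (bearing_deg(cam_xy, obj_xy) - heading_deg + 180.0) % 360.0 - 180.0
    return (rel / 360.0 + 0.5) * width_px


def triangulate(cam_a, bearing_a, cam_b, bearing_b):
    """Intersect two bearing rays; return (east, north), or None if no fix."""
    ax, ay = math.sin(math.radians(bearing_a)), math.cos(math.radians(bearing_a))
    bx, by = math.sin(math.radians(bearing_b)), math.cos(math.radians(bearing_b))
    # Solve cam_a + t * a = cam_b + s * b for t using Cramer's rule.
    det = bx * ay - ax * by
    if abs(det) < 1e-12:
        return None  # rays are parallel, so the position is undetermined
    dx, dy = cam_b[0] - cam_a[0], cam_b[1] - cam_a[1]
    t = (bx * dy - by * dx) / det
    if t < 0:
        return None  # intersection lies behind the first camera
    return (cam_a[0] + t * ax, cam_a[1] + t * ay)


if __name__ == "__main__":
    tree = (10.0, 10.0)                      # hypothetical map position of a tree
    cam_a, cam_b = (0.0, 0.0), (0.0, 10.0)   # two vehicle positions, heading north
    print(panorama_column(cam_a, 0.0, tree, 3600))
    print(triangulate(cam_a, bearing_deg(cam_a, tree),
                      cam_b, bearing_deg(cam_b, tree)))
```

The first function pair covers the labeling direction (map to image), and `triangulate` covers the mapping direction (image to map). A single panorama yields only a bearing, which is one reason positions are hard to recover from 2D images; at least two views with sufficiently different bearings are needed for a fix in this simplified model.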