Abstract: A method and system for positioning autonomously operating entities are disclosed. A positioning system receives a current location of an entity capable of autonomous operation and generates a 3D virtual construct by splitting a spatial volume associated with the current location into a plurality of voxels. The positioning system receives spatial data that corresponds to the current location and is generated by at least one sensor associated with the entity, and uses the spatial data to determine an occupancy status of one or more voxels. A voxel map is configured from the 3D virtual construct based on the occupancy status of the one or more voxels. The positioning system generates a 3D map by overlaying visual semantic data onto the voxel map. The visual semantic data is derived from image frames that correspond to the current location and are captured by one or more imaging devices. The 3D map can be used to position the entity autonomously.
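The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the disclosed implementation: the cubic volume centred on the current location, the rule that a voxel is occupied if any sensor point falls inside it, the string labels standing in for visual semantic data, and all names (`VoxelMap`, `build_voxel_map`, `integrate_points`, `overlay_labels`) are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


@dataclass
class VoxelMap:
    """Hypothetical voxel map: occupied voxel index -> optional semantic label."""
    origin: Point        # minimum corner of the spatial volume
    size: float          # edge length of the (assumed cubic) volume
    voxel_size: float    # edge length of a single voxel
    occupied: Dict[Voxel, Optional[str]] = field(default_factory=dict)

    def index_of(self, p: Point) -> Optional[Voxel]:
        """Return the index of the voxel containing p, or None if p is outside."""
        n = int(round(self.size / self.voxel_size))  # voxels per axis
        idx = tuple(int((c - o) // self.voxel_size)
                    for c, o in zip(p, self.origin))
        return idx if all(0 <= i < n for i in idx) else None

    def integrate_points(self, points: Iterable[Point]) -> None:
        """Assumed occupancy rule: a voxel is occupied if any point lies in it."""
        for p in points:
            idx = self.index_of(p)
            if idx is not None:
                self.occupied.setdefault(idx, None)

    def overlay_labels(self, labelled: Iterable[Tuple[Point, str]]) -> None:
        """Attach a semantic label to occupied voxels; other voxels are skipped."""
        for p, label in labelled:
            idx = self.index_of(p)
            if idx in self.occupied:
                self.occupied[idx] = label


def build_voxel_map(location: Point, half_extent: float,
                    voxel_size: float) -> VoxelMap:
    """Split a cube centred on the current location into voxels."""
    origin = tuple(c - half_extent for c in location)
    return VoxelMap(origin=origin, size=2 * half_extent, voxel_size=voxel_size)


# Usage: a 4 m cube around the entity, split into 0.5 m voxels.
vm = build_voxel_map((0.0, 0.0, 0.0), half_extent=2.0, voxel_size=0.5)
vm.integrate_points([(0.1, 0.1, 0.1), (0.2, 0.3, 0.4), (5.0, 0.0, 0.0)])
vm.overlay_labels([((0.1, 0.1, 0.1), "wall"), ((-1.9, -1.9, -1.9), "tree")])
```

In this sketch only occupied voxels are stored, so the map stays sparse. A real system would likely use a probabilistic occupancy update and would project per-pixel semantic labels from the image frames into the voxel grid rather than receive pre-labelled 3D points.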