Abstract: Audio sensors collaborate for geo-location and tracking of multiple users. Different users can be independently geo-located and tracked within an AI environment. A location is determined from two or more AI clients at known locations that detect an event, such as a human voice command to connect a call with a specific user. In response to classification of the event in view of the estimated location, a command for an AI action, such as connecting a call between users, is received so that the AI clients that detected the event, or other AI clients, can respond to the event.
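The abstract does not state how a location is computed from the detecting clients. As an illustration only, the following minimal sketch assumes a level-weighted centroid: each AI client reports its known 2D position and a hypothetical linear signal level for the detected event, and the estimate is pulled toward the clients that heard the event more strongly. The names `Detection`, `estimate_location`, and the `level` field are assumptions made for this example and do not come from the source.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Detection:
    """One AI client's report of a detected event (hypothetical structure)."""
    client_id: str
    x: float      # known client position, arbitrary units
    y: float
    level: float  # assumed linear signal level of the event, must be > 0


def estimate_location(detections: Sequence[Detection]) -> Tuple[float, float]:
    """Estimate an event location as the level-weighted centroid of the clients.

    This is an assumed stand-in technique, not the method of the source.
    """
    if len(detections) < 2:
        raise ValueError("need detections from at least two clients")
    if any(d.level <= 0 for d in detections):
        raise ValueError("signal levels must be positive")

    total = sum(d.level for d in detections)
    x = sum(d.x * d.level for d in detections) / total
    y = sum(d.y * d.level for d in detections) / total
    return (x, y)


if __name__ == "__main__":
    reports = [
        Detection("kitchen", 0.0, 0.0, level=1.0),
        Detection("hallway", 4.0, 0.0, level=3.0),
    ]
    print(estimate_location(reports))
```

A weighted centroid was chosen here because it works with as few as two clients; a real system might instead use time differences of arrival or another technique, which the abstract leaves open.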