AUDIO-BASED NOTIFICATIONS

An example operation may include a method comprising one or more of recording data, by a device, wherein the data is one or more of a location, a video, and an audio, sending the data to a server, splitting, by the server, the data into at least one participant, determining, by the server, an interaction by matching the at least one participant, and a group of stored data, and notifying the device, by the server, of the match.

Description
TECHNICAL FIELD

This application generally relates to audio, and more particularly to providing audio-based notifications.

BACKGROUND

In the current application, a system and method are introduced wherein a device detects incoming audio such that the audio is a voiceprint of a person, and the person, henceforth referred to as an “acquaintance”, has some relationship to the user of the current application, henceforth referred to as the “user”. In certain embodiments, processing in the current application determines recent interactions between the user and the acquaintance and whether the user may further interact with the acquaintance regarding an issue, wherein the issue may be known to both, to only one, or to neither of the user and the acquaintance.

SUMMARY

An example operation may include a method comprising one or more of recording data, by a device, wherein the data is one or more of a location, a video, and an audio, sending the data to a server, splitting, by the server, the data into at least one participant, determining, by the server, an interaction by matching the at least one participant, and a group of stored data, and notifying the device, by the server, of the match.

Another example operation may include a system comprising a device which contains a processor and memory, wherein the processor is configured to perform one or more of record data, by a device, wherein the data is one or more of a location, a video, and an audio, send the data to a server, split, by the server, the data into at least one participant, determine, by the server, an interaction by a match of the at least one participant, and a group of stored data, and notify the device, by the server, of the match.

A further example operation may include a non-transitory computer readable medium comprising instructions, that when read by a processor, cause the processor to perform one or more of recording data, by a device, wherein the data is one or more of a location, a video, and an audio, sending the data to a server, splitting, by the server, the data into at least one participant, determining, by the server, an interaction by matching the at least one participant, and a group of stored data, and notifying the device, by the server, of the match.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of one embodiment of the current application.

FIG. 2 is another block diagram of a computer system of one embodiment of the current application.

FIG. 3 is a diagram of speaker diarization of one embodiment of the current application.

FIG. 4 is a message flow of a notification of one embodiment of the current application.

FIG. 5A is a snapshot of a GUI notifying users of an interaction for a calendar in one embodiment of the current application.

FIG. 5B is another snapshot of a GUI notifying users of interaction for a calendar in one embodiment of the current application.

FIG. 6A is a snapshot of a GUI notifying users of an interaction for a project plan in one embodiment of the current application.

FIG. 6B is another snapshot of a GUI notifying users of an interaction for a project plan in one embodiment of the current application.

FIG. 7 is a flowchart showing embodiments of the current application.

FIG. 8 is a message flow of one embodiment of the current application.

FIG. 9 is a second system diagram of one embodiment of the current application.

FIG. 10 is a diagram of a transport seat of one embodiment of the current application.

FIG. 11 is a flowchart of the system modifying a transport's speakers in one embodiment of the current application.

DETAILED DESCRIPTION

It will be readily understood that the instant components and/or steps, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of at least one of a method, system, component and non-transitory computer readable medium, as represented in the attached figures, is not intended to limit the scope of the application as claimed but is merely representative of selected embodiments.

The instant features, structures, or characteristics as described throughout this specification may be combined in any suitable manner in one or more embodiments. For example, the usage of the phrases “example embodiments”, “some embodiments”, or other similar language, throughout this specification refers to the fact that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment. Thus, appearances of the phrases “example embodiments”, “in some embodiments”, “in other embodiments”, or other similar language, throughout this specification do not necessarily all refer to the same group of embodiments, and the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.

In addition, while the term “message” may have been used in the description of embodiments, the application may be applied to many types of network data, such as, packet, frame, datagram, etc. The term “message” also includes packet, frame, datagram, and any equivalents thereof. Furthermore, while certain types of messages and signaling may be depicted in exemplary embodiments they are not limited to a certain type of message, and the application is not limited to a certain type of signaling.

Referring to FIG. 1, a block diagram of one embodiment of the current application 100 is illustrated in accordance with the present disclosure. The system includes at least one client device 102. A client device may be at least one of a mobile device (102a, 102b), a tablet, a laptop device, and/or a personal desktop computer. The client device is communicably coupled to the network 104. It should be noted that other types of devices might be used with the present application. For example, a PDA, an MP3 player, or any other wireless device, a gaming device (such as a hand held system or home based system), any computer wearable device, and the like (including a P.C. or other wired device) that may transmit and receive information may be used with the present application. The client device may execute a user browser used to interface with the network 104, an email application used to send and receive emails, a text application used to send and receive text messages, and many other types of applications. Communication may occur between the client device and the network 104 via applications executing on said device, which may be applications downloaded via an application store or may reside on the client device by default. Additionally, communication may occur on the client device wherein the client device's operating system performs the logic to communicate without the use of either an inherent or downloaded application.

The system 100 includes a network 104 (e.g., the Internet or Wide Area Network (WAN)). The network may be the Internet or any other suitable network for the transmitting of data from a source to a destination.

A server 106 exists in the system 100, communicably coupled to the network 104, and may be implemented as multiple instances wherein the multiple instances may be joined in a redundant network or may be singular in nature. Furthermore, the server may be connected to database 108, wherein tables in the database are utilized to contain the elements of the stored data in the current application and may be accessed via queries, such as Structured Query Language (SQL) queries, for example. The database may reside remotely to the server coupled to the network 104 and may be redundant in nature.

Referring to FIG. 2, a block diagram illustrates a computer system 200 upon which embodiments of the current application may be implemented. The computer system 200 may include a bus 206 or other communication mechanism for communicating information, and a hardware processor 205 coupled with bus 206 for processing information. Hardware processor 205 may be, for example, a general purpose microprocessor.

Computer system 200 may also include main memory 208, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 206 for storing information and instructions to be executed by a processor 205. Main memory 208 also may be used for storing temporary variables or other intermediate information during the execution of instructions to be executed by a processor 205. Such instructions, when stored in the non-transitory storage media accessible to processor 205, may render computer system 200 into a special-purpose machine that is customized to perform the operations specified in the previously stored instructions.

Computer system 200 may also include a read only memory (ROM) 207 or other static storage device, which is coupled to bus 206 for storing static information and instructions for processor 205. A storage device 209, such as a magnetic disk or optical disk, may be provided and coupled to bus 206, which stores information and instructions.

Computer system 200 may also be coupled via bus 206 to a display 212, such as a cathode ray tube (CRT), a light-emitting diode (LED), etc., for displaying information to a computer user. An input device 211, such as a keyboard including alphanumeric and other keys, is coupled to bus 206 and communicates information and command selections to processor 205. Other types of user input devices may be present, including cursor control 210, such as a mouse, a trackball, or cursor direction keys, which communicates direction information and command selections to processor 205 and controls cursor movement on display 212.

According to one embodiment, the techniques herein are performed by computer system 200 in response to a processor 205 executing one or more sequences of one or more instructions which may be contained in main memory 208. These instructions may be read into main memory 208 from another storage medium, such as storage device 209. Execution of the sequences of instructions contained in main memory 208 may cause processor 205 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry or embedded technology may be used in place of or in combination with software instructions.

The term “storage media” as used herein refers to any non-transitory media that may store data and/or instructions causing a machine to operate in a specific fashion. These storage media may comprise non-volatile media and/or volatile media. Non-volatile media may include, for example, optical or magnetic disks, such as storage device 209. Volatile media may include dynamic memory, such as main memory 208. Common forms of storage media include, for example, a hard disk, a solid state drive, magnetic tape or other magnetic data storage medium, a CD-ROM or any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, and any other memory chip or cartridge.

Various forms of media may be involved in carrying one or more sequences of one or more instructions to processor 205 for execution. For example, the instructions may initially be carried on a magnetic disk or solid state drive of a remote computer. The remote computer may load the instructions into its dynamic memory and send the instructions over a medium such as the Internet 202.

Computer system 200 may also include a communication interface 204 coupled to bus 206. The communication interface may provide two-way data communication coupling to a network link, which is connected to a local network 201.

A network link typically provides data communication through one or more networks to other data devices. For example, the network link may provide a connection through local network 201 to data equipment operated by an Internet Service Provider (ISP) 202. ISP 202 provides data communication services through the worldwide packet data communication network now commonly referred to as the “Internet” 202. Local network 201 and Internet 202 both use electrical, electromagnetic, or optical signals that carry digital data streams. The signals through the various networks and the signals on network link and through communication interface 204, carrying the digital data to and from computer system 200, are example forms of transmission media.

Computer system 200 can send messages and receive data, including program code, through the network(s) 202, the network link, and the communication interface 204. In the Internet example, a server 203 may transmit a requested code for an application program through Internet 202, local network 201, and communication interface 204.

Processor 205 can execute the received code as it is received, and/or stored in storage device 209, or other non-volatile storage for execution at a later time.

Every action or step described herein is fully and/or partially performed by at least one of any element depicted and/or described herein.

In the current application, a device detects incoming audio wherein the audio is a voiceprint of a person, and the person, henceforth referred to as an “acquaintance”, has some relationship to the user of the current application, henceforth referred to as the “user”. In certain embodiments, processing in the current application determines recent interactions between the user and the acquaintance and whether the user may further interact with the acquaintance regarding an issue, wherein the issue may be known to both, to only one, or to neither of the user and the acquaintance.

For example, the user of the current application may be in a conversation with an acquaintance or be near the acquaintance wherein the acquaintance's voice is detected by the device executing the current application. The device receives the incoming audio where analysis is performed such that the incoming audio is determined to be a voiceprint of the said acquaintance. The current application, knowing the voiceprint is of the acquaintance, seeks to determine current outstanding issues between the user and the acquaintance wherein a discussion between the two may be beneficial. The current application may then inform the user of this via a notification on the device of the current application.

As another example, a third person referred to as the “remote acquaintance” may update some data, a project plan for example, without the knowledge of the user and the acquaintance, which may have some impact to the user and the acquaintance. This update in data may trigger the current application to request a meeting between the user and the acquaintance without either of them being aware of the situation.

Technology developed in today's marketplace utilizes voiceprints, as further depicted herein. The use of voiceprints in the current application allows incoming audio to be detected and the received audio to be compared against a database storing recorded samples of users in the current environment.

The current application requires a voiceprint of the users in the environment, for example a business environment. The voiceprint recordings, in one embodiment, are received when the user joins the environment, such as when an employee is hired. For employees already in the environment, voiceprints may be requested, for example by having them send in a voice recording of themselves speaking a particular sentence or sentences.

These recordings are stored in a database, such as a corporate database 108 which may be queried via server 106 as needed.
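
As an illustrative sketch only, such voiceprint samples could be stored in and retrieved from database 108 over JDBC; the table name, column names, and connection details below are assumptions made for illustration rather than part of the described system.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Minimal sketch of storing and retrieving voiceprint samples.
// The "voiceprints" table and its columns are hypothetical.
public class VoiceprintStore {

    private final Connection conn;

    public VoiceprintStore(String jdbcUrl, String user, String pass) throws SQLException {
        this.conn = DriverManager.getConnection(jdbcUrl, user, pass);
    }

    // Store a recorded voiceprint sample for a user in the environment.
    public void saveVoiceprint(String userId, byte[] audioSample) throws SQLException {
        String sql = "INSERT INTO voiceprints (user_id, sample) VALUES (?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, userId);
            ps.setBytes(2, audioSample);
            ps.executeUpdate();
        }
    }

    // Retrieve the stored voiceprint for a user so incoming audio can be compared against it.
    public byte[] loadVoiceprint(String userId) throws SQLException {
        String sql = "SELECT sample FROM voiceprints WHERE user_id = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, userId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getBytes("sample") : null;
            }
        }
    }
}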

To record audio from a device, the software initiates recording functionality. For example, in the Android operating system, the following Java code is utilized to initiate the recording of audio on the device:

MediaRecorder md = new MediaRecorder();
md.setAudioSource(MediaRecorder.AudioSource.MIC);      // capture from the built-in microphone
md.setOutputFormat(MediaRecorder.OutputFormat.MPEG_4); // MPEG-4 container
md.setOutputFile(recordFile);                          // path to the destination file
md.setAudioEncoder(MediaRecorder.AudioEncoder.AAC);    // AAC audio encoding
md.prepare();
md.start();

In one embodiment, the recorded media is stored in the device 102 locally and then sent to a server 106 for processing. In another embodiment, the recorded media is stored only in the device 102.

In one embodiment, the device 102 initiates the recording of audio in the background, without specific interaction with the device from the user. This allows the device to record audio during conversations that occur during a workday. To record audio on the device in the background, the current application may utilize the built-in microphone present on many devices, such as mobile phones.

In the mobile device-programming environment, a service is needed to allow an application to execute in the background. As an example, one popular mobile operating system utilizes a service that runs in the background without direct interaction with the user. A service has no user interface and is not bound to the lifecycle of an activity. The activity of recording audio is well suited to the use of a service. These services run with a higher priority than inactive or invisible activities, and therefore it is less likely that the operating system terminates them.

The use of asynchronous processing in a service allows for the execution of resource intensive tasks in the background. A new thread is created and executed in the service wherein the service (and the thread) may be restarted automatically if the operating system reboots.
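
A minimal sketch of such a background recording service on the Android platform is shown below; the class name, output path, and error handling are illustrative assumptions, and the MediaRecorder calls mirror the recording code shown above. The RECORD_AUDIO permission is assumed to have been granted.

import android.app.Service;
import android.content.Intent;
import android.media.MediaRecorder;
import android.os.IBinder;

// Illustrative sketch of a background recording service.
public class AudioCaptureService extends Service {

    private MediaRecorder recorder;

    @Override
    public int onStartCommand(Intent intent, int flags, int startId) {
        // Run the recorder setup off the main thread so the service stays responsive.
        new Thread(() -> {
            recorder = new MediaRecorder();
            recorder.setAudioSource(MediaRecorder.AudioSource.MIC);
            recorder.setOutputFormat(MediaRecorder.OutputFormat.MPEG_4);
            recorder.setOutputFile(getFilesDir() + "/capture.m4a"); // illustrative output path
            recorder.setAudioEncoder(MediaRecorder.AudioEncoder.AAC);
            try {
                recorder.prepare();
                recorder.start();
            } catch (Exception e) {
                stopSelf();
            }
        }).start();
        // START_STICKY asks the operating system to restart the service if it is terminated.
        return START_STICKY;
    }

    @Override
    public void onDestroy() {
        if (recorder != null) {
            recorder.stop();
            recorder.release();
        }
        super.onDestroy();
    }

    @Override
    public IBinder onBind(Intent intent) {
        return null; // not a bound service
    }
}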

Recorded audio samples at the client device 102a may be sent to a server, such as server 106, wherein the recorded audio is processed to compare the recorded audio to voiceprints of users in the environment to determine who is speaking. The interval at which to cut the recording and send the audio samples may be hardcoded to a value, such as 10 seconds.

In one embodiment, the recorded audio is sent to the server 106 for processing. The server first splits the audio into sections wherein each portion of audio corresponds to a speaker, a process called diarization.

Speaker diarization is the process of partitioning an input audio stream into homogeneous segments according to the speaker identity. Speaker diarization is a combination of speaker segmentation and speaker clustering. The first aims at finding speaker change points in an audio stream. The second aims at grouping together speech segments on the basis of speaker characteristics.

There are many open-source initiatives addressing the speaker diarization problem:

ALIZE Speaker Diarization—an open-source platform for speaker recognition. The purpose of this project is to provide a set of low-level and high-level frameworks that will allow anybody to develop applications handling the various tasks in the field of speaker recognition: verification/identification, segmenting, etc.

SpkDiarization—software dedicated to speaker diarization (i.e., speaker segmentation and clustering). It is written in Java and includes the most recent developments in the domain.

Audioseg—a toolkit dedicated to audio segmentation and classification of audio streams.

SHoUT—a software package developed at the University of Twente to aid speech recognition research.

pyAudioAnalysis—Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications

FIG. 3 shows a sample of speaker diarization in one embodiment of the current application. Each portion of audio is split from the received audio at server 106, wherein analysis is performed on each section to attempt to identify the speaker. In FIG. 3, the input audio stream is split into three speakers: Speaker A, Speaker B, and Speaker C, henceforth referred to as the “speaker sections”.

Having obtained the split of the input audio into sections containing the speakers, the server 106 then compares the audio sections against a database (such as database 108) of voiceprints to determine who the speaker is as further disclosed below.

The identity of the speakers is determined by first splitting the incoming audio stream into the respective speakers, a process called diarization, further disclosed herein. The individual audio streams per speaker are referred to henceforth as audioPerSpeaker.

Once the audio is split, the server 106 processes each audioPerSpeaker. For example, the split audio portions are stored in an array: audioPerSpeaker [n]. The server loops through the array, comparing the stored audio in each array element against a library of voiceprints, wherein each voiceprint is an audio sample of a user associated with the organization. The comparison of audio is accomplished via commonly used logic in computer science with an outcome assigned to the matching of the audio, for example a range of [0-10] where 0 is no match at all and 10 equates to a perfect match.

In one embodiment, if the match is 60% or better, that is, an outcome equal to 6 or higher, the match is validated, while any lower value reflects that the audio samples do not match.
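
A sketch of this comparison loop is shown below; the similarityScore() helper is a hypothetical placeholder for the actual audio-comparison logic and is assumed to return a value in the range of 0 to 10 as described above.

import java.util.Map;

// Sketch of matching one diarized audio section against the voiceprint library.
public class VoiceprintMatcher {

    public static final int MATCH_THRESHOLD = 6; // 60%, as described above

    // Returns the identifier of the matched user, or null when no voiceprint matches.
    public String matchSpeaker(byte[] audioSection, Map<String, byte[]> voiceprintLibrary) {
        String bestUser = null;
        int bestScore = 0;
        for (Map.Entry<String, byte[]> entry : voiceprintLibrary.entrySet()) {
            int score = similarityScore(audioSection, entry.getValue());
            if (score > bestScore) {
                bestScore = score;
                bestUser = entry.getKey();
            }
        }
        return bestScore >= MATCH_THRESHOLD ? bestUser : null;
    }

    // Placeholder comparison; a real implementation would extract audio features
    // and compare them, yielding 0 (no match) through 10 (perfect match).
    private int similarityScore(byte[] sample, byte[] voiceprint) {
        return 0;
    }
}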

Referring to FIG. 4, a message flow notifies users of an interaction in one embodiment of the current application 400. The message flow walks through matching a user via a voiceprint, then determining possible interactions, and notifying the users of said interaction.

FIG. 4 is a message flow depicting embodiments in one implementation of the current application 400. The client device of the user 102a is placed in a mode wherein audio of the current environment is being received and recorded.

In another embodiment, the client device 102a is recording and storing video wherein the video being recorded is of the environment. For example, the device may be a wearable device, such as computer-equipped glasses, or a device on a user's clothing capturing video. The device also records audio along with the video.

In another embodiment, the recorded data is stored not in the client device 102, but in a remote server, such as server 106 and/or a database such as database 108. The data is sent to the server via the network 104, and routed to the server, and optionally to the database.

A device of the system processes the incoming data to determine the people speaking, wherein the device may be the client device 102 or the server 106. The video is analyzed using facial recognition. Facial recognition is a common problem with established solutions in today's computer science environment. There are many products that tackle the facial recognition problem, including both open-source and proprietary solutions.

Three technologies are commonly utilized:

    • The Eigenface algorithm
    • Elastic matching
    • Classification nets

The Eigenface method encodes the statistical variation among face images using a form of dimensionality reduction, such as Principal Component Analysis (PCA), where the resulting characteristic differences in the feature space do not necessarily correspond to isolated facial features such as eyes, ears, and noses (the indispensable components of the feature vector are not pre-determined).

Elastic matching generates nodal graphs (i.e. wireframe models) corresponding to specific contour points of a face, such as the eyes, chin, tip of the nose, etc. Recognition is based on a comparison of image graphs against a known database. Since image graphs may be rotated during the matching process, this system tends to be more robust to large variation in the images.

Classification net recognition utilizes the same geometric characteristics as elastic matching, but fundamentally differs by being a supervised machine learning technique, often involving the use of support vector machines.

Although Eigenface detection may underperform the other methods when variation in lighting or facial alignment is large, it has many benefits, including being:

    • easy to implement
    • computationally efficient
    • able to recognize faces in an unsupervised manner

Therefore, Eigenface detection tends to be a de-facto standard. Many state-of-the-art detection techniques also rely on some form of dimensionality reduction prior to recognition, even if feature vector extraction is handled in a different manner.

An acquaintance 402 is determined by the system. This is accomplished via facial recognition of received image/video media, wherein the face data obtained in the received image/video is compared to a bank of facial images stored locally in the client device or in a remote database such as database 108, or by matching received voice audio with a voiceprint stored in the client device 102 or remotely in a database, such as database 108.

In another embodiment, previous communications between the user and other people with whom the user has interactions are recorded either locally in the client device, or in a remote location such as server 106 or database 108. The voices and/or facial recognition data of the interactions are stored as future acquaintances, wherein this data may be used to determine acquaintances in future interactions.

In another embodiment, the geographic locations of individuals or groups of people with whom the user interacts are used to identify possible acquaintances. If the acquaintance is normally encountered in a similar geographic area as the user, such as in a nearby office or cube, it is then determined that this person is an acquaintance of the user, and the facial/voiceprint data of that individual is stored as a future acquaintance of the user.

An acquaintance 402 is nearby such that the voice of the acquaintance is being received at the client device 102a and recorded 404. The audio is sent to server 106 for processing 406.

The server splits the received audio into speakers, a process called diarization that is further discussed herein 408.

Once the audio is split into speakers, the speakers' received audio is compared against voiceprints 414 to obtain the identity of the speaker 410, also further disclosed herein.

Having obtained the identity of the speaker in the received audio, the server 106 then performs logic to determine any interaction(s) between the originator and the speaker 412. A database, such as database 108 may be queried via APIs for example to obtain possible interactions on projects, calendar events, etc. 416.

If at least one interaction is encountered, notifications 418 and 420 are sent from the server 106 to the respective client devices of the users 102a and 102b.

In another embodiment, the audio is processed at the client device 102a such that the voiceprints of the users in the environment exist on the client device. In this embodiment, the audio sample message 406 is not present (not depicted).

In another embodiment, the audio is processed at the client device 102a such that the voiceprints exist on a remote database, such as database 108, and messaging occurs between the client device 102a and the database 108 to query the stored voiceprints at the database 108. In this embodiment, the audio sample message 406 is not present (not depicted).

In another embodiment, the database as pictured in the above message flow may be multiple databases wherein the databases may also be remote databases such that interactions between the server 106 and said database(s) occur via messaging between the server and the database(s).

The current application seeks to determine interactions between users wherein the users have been detected via a voiceprint matching incoming audio. Other methods are utilized by the system to determine speakers, such as facial recognition performed on incoming data from a camera that the user is wearing, for example. The location of the speaker is also considered when determining a speaker, such as when an interaction occurs at the same or a similar geographic location, for example at a cube outside of the user's cube, a nearby office, etc.

Below are two examples of interaction with data (calendar data and project plan data), but one versed in programming design will easily be able to use the examples and apply the methods introduced to other types of data with other types of interactions without deviating from the scope of the current application.

For example, if the user of the current application (henceforth referred to as User A) is either speaking to User B, or User B is in the proximity of User A's device 102a, such that processing of the incoming audio from the device 102a matches User B's voiceprint, the current application seeks to determine interactions between User A and User B, the client device of User B being 102b.

The current application executing on the client device 102a queries the calendar of User A via interactions with either the calendar application on the current device 102a or a remote server, such as server 106, containing the calendar data of User A, wherein messaging occurs between said remote server and the client device 102a.

For example, to search calendar/event data, the Java code below shows how to retrieve an event from a user's calendar using a popular Calendar API:

import com.google.api.services.calendar.Calendar;
import com.google.api.services.calendar.model.Event;

// ...

// Initialize the Calendar service with valid OAuth credentials
Calendar service = new Calendar.Builder(httpTransport, jsonFactory, credentials)
    .setApplicationName("applicationName")
    .build();

// Retrieve an event from the user's primary calendar
Event event = service.events().get("primary", "eventId").execute();

The event returned is of type “Event”, containing the specific details of the event. Included in the Event data is an attendees array:

"attendees": [
  {
    "id": string,
    "email": string,
    "displayName": string,
    "organizer": boolean,
    "self": boolean,
    "resource": boolean,
    "optional": boolean,
    "responseStatus": string,
    "comment": string,
    "additionalGuests": integer
  }
]

The attendees array above is part of the data returned in the Event data from the Calendar API. The Event data will contain an array with all of the attendees in the event, along with details of each attendee, including the email and name.

The Event data also contains the event's creator data including the creator's email and name.

The current application obtains the events of the originator and the events of the recipient(s), that is, the event(s) that either of them has created. Using the returned Event data, it then determines whether any of the created events contain the other user's name or email. If there is a match, the originator (User A) and User B share an event. Furthermore, knowing the data of the event, if the event is scheduled within a particular time period (e.g., within 3 business days), a match has been found and a notification is sent to both parties, as sketched below.
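
The following sketch illustrates one way this check could be written against the Calendar API's Event model; the helper class name is illustrative, and a calendar-day window is used in place of business days to keep the example short.

import com.google.api.services.calendar.model.Event;
import com.google.api.services.calendar.model.EventAttendee;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Sketch of the shared-event check: does one of User A's events list User B as an
// attendee, and does it start within the next few days?
public class SharedEventCheck {

    private static final long WINDOW_MILLIS = TimeUnit.DAYS.toMillis(3); // example window

    public static boolean sharesUpcomingEvent(List<Event> eventsOfUserA, String userBEmail) {
        long now = System.currentTimeMillis();
        for (Event event : eventsOfUserA) {
            List<EventAttendee> attendees = event.getAttendees();
            if (attendees == null) {
                continue;
            }
            for (EventAttendee attendee : attendees) {
                if (!userBEmail.equalsIgnoreCase(attendee.getEmail())) {
                    continue;
                }
                if (event.getStart() == null || event.getStart().getDateTime() == null) {
                    continue;
                }
                long start = event.getStart().getDateTime().getValue();
                if (start > now && start - now <= WINDOW_MILLIS) {
                    return true; // a notification should be sent to both parties
                }
            }
        }
        return false;
    }
}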

Referring to FIG. 5A, a snapshot notifying users of interaction on a display of a client device in one embodiment of the current application 500. In the notification 502, an upcoming calendar event has been found by the server 106 wherein the user of the client device 102a and the acquaintance 102b are both attendees. The notification may be displayed on top of other applications in a similar fashion as normal notifications received on a mobile device, for example.

In this scenario, the user, Bill Dewitz, is interacting with another person who is determined to be an acquaintance (determined via the client device 102 having received data such as an audio recording of a conversation or a video recording from a camera on Bill's person). The system sends the data to the server 106, wherein the data is compared against either voiceprints or facial data to determine the identity of Jim Brisk, who is an acquaintance of Bill. The comparative data is optionally stored in the database 108.

The notification is received at the client device, in this scenario both 102a and 102b, but the server 106 may send different notifications according to the interaction determined 412, wherein the notification 502 to Bill, for example, would have only Jim's name and the notification to Jim would have only Bill's name.

In another embodiment, the words “that person” may be substituted with the name of the user.

In another embodiment, the meeting details may be presented in the message of the notification 502, including the title of the event, the event start time, the duration of the event, the attendees of the event, the location of the event, etc.

Referring to FIG. 5B, a GUI snapshot notifying users of interaction—calendar in one embodiment of the current application 510. FIG. 5B depicts a display on a client device 102 showing a notification in one implementation of the current application 510. A calendar meeting button 512 is displayed wherein the user may interact with the GUI of the current application by using a pointing device to press the button.

When the button is pressed, a message is sent to the calendar API of the calendar application on the client device, or a query is made for remote calendar data on a remote database, such as database 108. The resulting action is that either a new window pops up on the display with the details of the calendar data, or the calendar data is added to the current message text.

The current application executing on the client device 102 interacts with the data in an environment, such as project plan data in an enterprise environment, via a project plan Application Program Interface (API) to obtain data pertaining to the users of the current application who have been determined to be within audio range of each other, for example User A 102a and User B 102b.

Understanding a project associated with the user, and the role the user has in the project, may aid in determining current interactions between User A and User B. For example, having access to the project management software, such as through an Application Programming Interface (API), the current application queries the project management software to obtain possible interactions between users.

Project management software may be available online, existing entirely in the cloud, or on client machines wherein a remote database stores the live changes to the project. Many popular project management applications include an API, allowing other applications to query and retrieve project-management-specific data pertaining to stored projects.

For example, a popular project management application is the Basecamp software. Basecamp allows for simple communication and collaboration amongst the users in a project. It is also implemented as simple JavaScript Object Notation (JSON) over the Hypertext Transfer Protocol (HTTP), as shown in the examples below.

Through the use of the Basecamp API, it is possible to return all people in a project as well as obtain data pertaining to each person's role in a project.

For example, using the Basecamp API, it is possible to return a specified person:

GET /people/1.json

{
  "id": 149087659,
  "identity_id": 982871737,
  "name": "Jason Fried",
  "email_address": "jason@basecamp.com",
  "admin": true,
  "trashed": false,
  "avatar_url": "https://asset0.37img.com/global/e70b2ea21efeff72c1/avatar.96.gif?r=3",
  "fullsize_avatar_url": "https://asset0.37img.com/global/b2ea21efeff72c1/original.gif?r=3",
  "created_at": "2012-03-22T16:56:51-05:00",
  "updated_at": "2012-03-23T13:55:43-05:00",
  "events": {
    "count": 19,
    "updated_at": "2012-03-23T13:55:43-05:00",
    "url": "https://basecamp.com/999999999/api/v1/people/149087659-jason-fried/events.json",
    "app_url": "https://basecamp.com/999999999/people/149087659-jason-fried/events"
  },
  "assigned_todos": {
    "count": 80,
    "updated_at": "2013-06-26T16:22:05.000-04:00",
    "url": "https://basecamp.com/999999999/api/v1/people/149087659-jason-fried/assigned_todos.json",
    "app_url": "https://basecamp.com/999999999/people/149087659-jason-fried/assigned_todos"
  }
}

In this example, the user's data is returned including their name, email address, current events associated with the user and any assigned tasks as well as other information.

Furthermore, to get the projects a person has access to, the following API call may be used:

GET /people/1/projects.json

[
  {
    "id": 605816632,
    "name": "BCX",
    "description": "The Next Generation",
    "updated_at": "2012-03-23T13:55:43-05:00",
    "url": "https://basecamp.com/999999999/api/v1/projects/605816632.json",
    "template": false,
    "archived": false,
    "starred": true,
    "trashed": false,
    "draft": false,
    "is_client_project": false,
    "color": "3185c5"
  }
]

This will return a list of all projects a person has access to, including draft, template, archived, and deleted projects, as well as the date that each project was last updated. Projects that the requesting user does not have access to will not appear in the project list.

To obtain the people who have access to a particular project, the following API call may be used:

GET /projects/1/accesses.json

{
  "id": 149087659,
  "identity_id": 982871737,
  "name": "Bee Stanley",
  "email_address": "stanley@basecamp.com",
  "admin": true,
  "is_client": false,
  "trashed": false,
  "avatar_url": "https://asset0.37img.com/global/4feff72c1/avatar.96.gif",
  "fullsize_avatar_url": "https://asset0.37img.com/gobal1/original.gif?r=3",
  "created_at": "2012-03-22T16:56:48-05:00",
  "updated_at": "2012-03-22T16:56:48-05:00",
  "url": "https://basecamp.com/999999999/api/v1/people/149087659-bee-stanley.json",
  "app_url": "https://basecamp.com/999999999/people/149087659-bee-stanley"
},

The result includes all the people with access to the project, including the data pertaining to each user.

Using these and similar API function calls, it is possible to determine the other people on a project and obtain their contact information. Through this, it is also possible to automatically determine the date on which a project of which both User A and User B are members was last updated, as sketched below.
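
The sketch below illustrates one way the shared-project check could be performed on the parsed projects.json responses; the Project type and the surrounding HTTP/JSON layer are hypothetical stand-ins rather than part of the Basecamp API itself, and the two-day window is the example value used below.

import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Sketch of the shared-project check built on the projects.json responses shown above.
public class SharedProjectCheck {

    // Minimal view of the fields used from the projects.json response (hypothetical type).
    public static class Project {
        public long id;
        public Instant updatedAt; // parsed from the "updated_at" field
    }

    private static final Duration RECENT_WINDOW = Duration.ofDays(2); // "recently updated" example value

    // Returns true when User A and User B share a project that was updated recently.
    public static boolean haveRecentSharedProject(List<Project> projectsOfA, List<Project> projectsOfB) {
        Instant cutoff = Instant.now().minus(RECENT_WINDOW);
        for (Project a : projectsOfA) {
            for (Project b : projectsOfB) {
                if (a.id == b.id && a.updatedAt != null && a.updatedAt.isAfter(cutoff)) {
                    return true;
                }
            }
        }
        return false;
    }
}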

Referring to FIG. 6A, a GUI snapshot notifying users of an interaction—Project Plan in one embodiment of the current application 600. In the notification 602, project plan data has been found via the server 106, wherein both the user of client device 102a and the user of client device 102b (the user of client device 102b being the person matched in the received audio at server 106) are members of a project, and the project has recently been updated. “Recently updated” may reflect a hardcoded value in the code of the current application, such as 2 business days. The notification may be displayed on top of other applications in a similar fashion as normal notifications received on a mobile device, for example.

Referring to FIG. 6B, a second GUI snapshot notifying users of interaction in one implementation of the current application 610. A project plan button 612 is displayed wherein the user may interact with the GUI of the current application by using a pointing device to press the button.

When the button is pressed, a message is sent to the project plan API of the project plan application on the client device, or a query is made for remote project plan data on a remote database, such as database 108. The resulting action is that either a new window pops up on the display with the details of the project plan data, or the project plan data is added to the current message text.

In another embodiment, the current data of the project plan is displayed upon the pressing of the project plan button.

Referring to FIG. 7, a flowchart showing many of the embodiments of the current application at the server 106 in one implementation 700. An audio stream is received from a client device 102. The audio stream may consist of many speakers; thus, it is necessary to first split the audio into each respective speaker 702. The process of splitting audio into speakers is called diarization, and there are many previously depicted applications that may be used to perform this activity. Each audio section per speaker is stored in a local array speakers[n], where the index n runs from 0 to the total number of speakers in the incoming audio stream minus 1.

A local variable i is set to zero 704. This is a counting variable for looping through the speakers array.

A check is made as to whether the array has items remaining 706. If there are remaining items, the current item in the speakers array is compared against the stored voiceprints. These voiceprints are a collection of audio from each of the users in the environment and may be stored locally at the server 106, or may be stored remotely, such as in a database 108. This process is further disclosed herein. If a speaker is found to match the speakers[i] audio, then the person from the voiceprint data is used as the current matched person 710, herein referred to as speaker[i].user.

A check is made to determine if speaker[i].user is the originator 712 wherein speaker[i].user is the person from the voiceprint data that matched the audio section speaker[i] and originator is the owner of the initial client device sending the incoming audio stream 102a.

If speaker[i].user is the same as the originator, then that portion of recorded audio belongs to the user of client device 102, the i variable is incremented 720, and the process loops back.

If speaker[i].user is not the same as the originator, then an attempt is made to determine whether an outstanding issue is present between the originator and speaker[i].user 714, as further depicted herein.

If no issue is found 716, then the process loops and the i variable is incremented 720 to continue processing other speakers in the received audio stream.

If an issue is found 716, then notifications are sent 718 to at least one of the originator and/or speaker[i].user client device(s) 102a/102b. The process then loops and the i variable is incremented 720 to continue attempting to determine users for the remaining speaker[i] audio segments.
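
The following sketch summarizes the FIG. 7 loop in Java; diarize(), matchVoiceprint(), findOutstandingIssue(), and notifyDevices() are hypothetical helper methods standing in for the diarization, voiceprint comparison, data lookup, and notification steps described herein.

import java.util.Collections;
import java.util.List;

// Sketch of the server-side loop in FIG. 7.
public class InteractionProcessor {

    public void process(byte[] incomingAudio, String originator) {
        // Split the incoming stream into one audio section per speaker (diarization, 702).
        List<byte[]> speakers = diarize(incomingAudio);

        for (int i = 0; i < speakers.size(); i++) {
            String user = matchVoiceprint(speakers.get(i));          // 710: identify the speaker
            if (user == null || user.equals(originator)) {
                continue;                                             // 712: skip the device owner or unmatched audio
            }
            String issue = findOutstandingIssue(originator, user);    // 714: calendar, project plan, etc.
            if (issue != null) {
                notifyDevices(originator, user, issue);               // 718: notify both client devices
            }
        }
    }

    // Hypothetical helpers standing in for the steps described in the text.
    private List<byte[]> diarize(byte[] audio) { return Collections.emptyList(); }
    private String matchVoiceprint(byte[] section) { return null; }
    private String findOutstandingIssue(String a, String b) { return null; }
    private void notifyDevices(String a, String b, String issue) { }
}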

Referring to FIG. 8, a message flow depicting one embodiment of a remote acquaintance updating data that indirectly pertains to the user of the client device 102a and the acquaintance 102b in one implementation of the current application 800.

A remote acquaintance 802 updates data wherein the updated data pertains to both the user 102a and the acquaintance 102b. The remote acquaintance may be another user in the environment and may not personally know the users 102a and 102b.

Data modification message 806 is sent to the server 106, and also sent 808 to database 108 where the data is updated. The data may be a project plan, code pertaining to both users 102a and 102b, or any other data that is able to be updated by various personnel within an organization.

The client device of the user 102a is placed in a mode wherein audio of the current environment is being received and recorded.

An acquaintance 810 is nearby such that the voice of the acquaintance is being received at the client device 102a and recorded 812. The audio is sent to server 106 for processing 814.

Processing continues as previously depicted, wherein notifications are presented to the client devices of the user 102a and the acquaintance 102b. The notifications contain text similar to that of the previously depicted notifications.

The users may not be aware of the data update by the remote acquaintance, yet the current embodiment illustrates the intuitiveness of the current application: a user may receive a notification pertaining to data whose update neither the user nor the acquaintance was aware of, and is made aware of via the current application.

The notion of movement of a device, for example a mobile device, has been used to determine a user's intended action with said device. For example, smartphones in the market today have functionality that automatically answers an incoming call when the device is raised to the user's ear.

Other movements of a device are used to automatically perform particular actions on the device. In some implementations, if a device is turned over on a surface, the functionality of the device is altered according to this action. The device automatically determines that the user is in a meeting and desires to silence the device. As such, the volume of the device is silenced, and the haptic feedback (vibration feedback) is turned on. In other implementations, if the device is shaken up and down, this movement allows the software of the device to perform some functionality, such as erasing data (such as in a game) or turning on another application such as a flashlight.

In many transports, speed-sensitive volume is an included feature. This feature raises the volume of the speakers in the transport according to the speed-sensitive-volume setting: the faster the transport is traveling, the more the volume of the speakers is raised to compensate for road noise. Many transports also allow the driver or occupants to adjust the amount of modification by selecting one of three categories: Low, Medium, and High, wherein the higher the setting, the more the output to the speakers is modified, as sketched below.
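
As an illustration only, the adjustment could be computed as a gain that grows with speed and with the selected setting; the specific gain values below are assumptions and are not taken from any particular transport.

// Sketch of a speed-sensitive-volume adjustment.
public class SpeedSensitiveVolume {

    public enum Setting { LOW, MEDIUM, HIGH }

    // Returns an additional gain (in dB) to apply on top of the base volume.
    public static double volumeBoostDb(double speedMph, Setting setting) {
        double perTenMph;
        switch (setting) {
            case LOW:    perTenMph = 0.5; break;
            case MEDIUM: perTenMph = 1.0; break;
            default:     perTenMph = 1.5; break; // HIGH
        }
        // The faster the transport travels, the more the speaker output is raised.
        return (speedMph / 10.0) * perTenMph;
    }
}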

For example, if an occupant's head turns toward another occupant, it may be determined that the occupant wishes to carry on a conversation with the other occupant. In this scenario, it would be beneficial for the speakers in the transport to be modified such that the speakers near the occupant's head are lowered to allow for conversation, then automatically be returned to a previous level once it was determined that the conversation is complete.

The transport may be an automobile, airplane, train, bus, boat, or any type of vehicle that normally transports people from one place to another.

Referring to FIG. 9, a system diagram of the current application 900. At least one device is located in a transport 902, which communicates with a network 908. The network communicates with a server 910. The transport 902 contains a device such as an in-transport navigation system or entertainment system, or any device including a processor and memory 904, henceforth referred to as the transport system, which acts as the main communication device for the current application, and/or a client device 906, and an image/video camera 907, which communicates with the transport system 904. The client device may communicate with the transport system 904 or may connect directly with the network 908. Transport system 904 contains a processor and memory. The processor receives input, analyzes/parses the input, and provides an output to one or more systems, modules, and/or devices.

The client device may be at least one of a mobile device, a tablet, or a laptop device. It should be noted that other types of devices might be used with the present application. For example, a PDA, an MP3 player, or any other wireless device, a gaming device (such as a hand held system or home based system), any computer wearable device, and the like (including a personal computer or other wired device) that may transmit and receive information may be used with the present application. The client device and/or the in-transport navigation system may execute a user browser used to interface with the network 908, an email application used to send and receive emails, a text application used to send and receive text messages, and many other types of applications. Communication may occur between the client device and/or the in-transport navigation system and the network 908 via applications executing on said device, which may be applications downloaded via an application store or may reside on the client device by default. Additionally, communication may occur on the client device wherein the client device's operating system performs the logic to communicate without the use of either an inherent or downloaded application.

A server 910 exists in the system, communicably coupled to the network 908, and may be implemented as multiple instances wherein the multiple instances may be joined in a redundant configuration or may be singular in nature. Furthermore, the server may be connected to a database (not depicted) wherein tables in the database are utilized to contain the elements of the system and may be accessed via queries to the database, such as Structured Query Language (SQL) queries, for example. The database may reside remotely to the server coupled to the network 908 and may be redundant in nature.

Each seat in the transport has a pair of speakers near the occupant's head, henceforth referred to as “headrest speakers”. Headrest speakers are on the left and right side of the occupant's head when the occupant is sitting in the seat.

Referring to FIG. 10, a transport seat with speakers 1000 in one embodiment of the current application. The headrest speakers 1002 are placed in the seat behind the occupant's left and right ears.

In another embodiment, the headrest speakers are mounted along a track 1004 wherein they may be moved vertically to accommodate shorter or taller occupants. A lever 1006 placed along the side of the seat allows each headrest speaker to be moved inside the seat. The seat lining over the speaker area is made of mesh such that, regardless of where the speaker is placed along the track, it is able to produce full sound due to the construction of the mesh covering.

A problem arises when an occupant in the transport desires to interact with another occupant in the transport, or with a party on a phone call or the like. In this scenario, to carry on a quality conversation with another party, it is necessary to lower the radio, wherein the sound from the radio is lowered for all occupants, even occupants in the rear (for example) who are not part of the conversation and do not wish to be part of said conversation.

The current transport system 904 receives data from a source, such as a transport camera 907 and detects through the analysis of received images and/or video that an occupant's head has turned toward another occupant. The transport system 904 alters the headrest speakers 1000 by at least one of the following:

    • lowering the volume of both (right and left) headrest speakers
    • lowering the volume of the headrest speaker where the occupant's ear is closer
    • moving the headrest speaker to follow the occupant's ears as the occupant's head turns toward the other occupant.

This embodiment allows the system to adjust the volume of the headrest speakers for conversation.

In another embodiment, as the user turns the head to address another occupant, for example the driver turning to speak to a passenger, the headrest speakers 1000 are lowered temporarily for both occupants.

In another embodiment, the monitoring camera 907 tracks the conversation such that the conversation is recognized, and the speakers are returned to the volume in effect before the conversation when the transport system determines that the conversation is complete, for example after a 10-second time period expires since an occupant last spoke in the conversation.

The monitoring camera 907 utilizes tracking of the occupants' mouths to determine whether conversation is ongoing or has stopped, as sketched below.
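
A sketch of this conversation-end timeout is shown below; restoreVolume() is a hypothetical hook into the transport system's audio control, and the 10-second value is the example given above.

import java.util.Timer;
import java.util.TimerTask;

// Sketch of the conversation-end timeout: each report of mouth movement restarts the
// timer; once 10 seconds pass with no activity, the previous speaker volume is restored.
public class ConversationMonitor {

    private static final long SILENCE_TIMEOUT_MS = 10_000;
    private final Timer timer = new Timer(true);
    private TimerTask pending;

    // Called whenever the camera analysis detects an occupant speaking.
    public synchronized void onSpeechActivity() {
        if (pending != null) {
            pending.cancel();
        }
        pending = new TimerTask() {
            @Override
            public void run() {
                restoreVolume(); // conversation considered complete
            }
        };
        timer.schedule(pending, SILENCE_TIMEOUT_MS);
    }

    private void restoreVolume() {
        // Return the headrest speakers to the volume in effect before the conversation.
    }
}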

In another embodiment, particular functionality of the transport may allow the transport system 904 to override the headrest speaker modification for conversation. For example, if the transport system receives notification that the transport's right turn signal is on, this action will override the modification of the headrest speaker, as the driver will be looking to the right ahead of the right turn.

In yet another embodiment, other input from the transport may override the functionality of the current application such as the current speed of the transport. The current transport speed is accounted for when determining the modification of headrest speakers 1000. If the speed is below a threshold value, then modifications are not performed on the headrest speakers.

For example, if the driver of the transport is looking for a parking space, or is on a highway where an accident has occurred, the driver may turn to look around for either a parking space, or “rubberneck” to view an accident. In both scenarios, the speed of the transport would most probably be below a particular value, such as 10 miles per hour.

The transport system 904 checks the current speed of the transport before performing the functionality to alter the headrest speaker(s) 1000.

In another embodiment, the movement of the torso is examined to alter the headrest speaker(s). The monitoring camera 907 captures the movement of the occupant's torso, and the volume and/or the direction of the headrest speaker(s) may be altered based on the movement of the torso.

Referring to FIG. 11, a flow of the system modifying the transport's speakers in an implementation of one embodiment of the current application 1100.

The transport system 904 receives data from a source, such as the monitoring camera 907 indicating an occupant in the transport has made a gesture to begin a conversation with another occupant, such as a head turn 1102. The data is analyzed utilizing head tracking software, further disclosed herein.

The current speed of the transport is determined by the transport system interacting with the transport's computer through an Application Program Interface (API). If the current speed is below a threshold speed 1104, the process ends, and the headrest speakers 1000 are not modified. The threshold speed is a value hardcoded in the transport system 904 and is determined to be a speed at which modification of the speakers is not necessary, for example 10 miles per hour.

If the current speed is above the threshold speed, a check is made as to whether a blinker is on in the transport. The transport system 904 checks this by interacting with the transport's internal computer 1106. If the blinker is on, the process ends, as speaker modifications are not needed if the user has turned a blinker on for an upcoming turn. If the transport's blinker is not on, then the speaker is modified 1108, as further depicted herein.
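
The sketch below mirrors these checks; getCurrentSpeedMph() and isBlinkerOn() are hypothetical stand-ins for the transport system's API calls into the transport's internal computer, and the 10 mile-per-hour threshold is the example value given above.

// Sketch of the FIG. 11 gating checks before any headrest speaker modification.
public class HeadTurnHandler {

    private static final double SPEED_THRESHOLD_MPH = 10.0; // example threshold from the text

    public void onHeadTurnDetected() {
        if (getCurrentSpeedMph() < SPEED_THRESHOLD_MPH) {
            return; // 1104: below the threshold, leave the headrest speakers alone
        }
        if (isBlinkerOn()) {
            return; // 1106: the driver is preparing to turn, do not modify the speakers
        }
        modifyHeadrestSpeakers(); // 1108: lower or redirect the headrest speakers
    }

    // Hypothetical stand-ins for calls into the transport's internal computer.
    private double getCurrentSpeedMph() { return 0.0; }
    private boolean isBlinkerOn() { return false; }
    private void modifyHeadrestSpeakers() { }
}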

Claims

1. A method, comprising:

recording data, by a device, wherein the data is one or more of a location, a video, and an audio;
sending the data to a server;
splitting, by the server, the data into at least one participant;
determining, by the server, an interaction by matching the at least one participant, and a group of stored data; and
notifying the device, by the server, of the match.

2. The method of claim 1, wherein the interaction is determined by a calendar event data wherein at least one of notes of the calendar event data, locations of the calendar event data, and participants of the calendar event data are used.

3. The method of claim 2, wherein the notification includes one or more of:

a name of the at least one participant; and
the calendar event data.

4. The method of claim 1, wherein the interaction is based on a matching of owners of at least one document.

5. The method of claim 1, comprising determining the interaction by matching one or more of a voiceprint of the data, a facial recognition of the data, and a location of the data.

6. The method of claim 1, wherein the group of stored data is previously recorded data.

7. The method of claim 1, wherein the group of stored data is in at least one database.

8. A system, comprising:

a device which contains a processor and memory, wherein the processor is configured to perform: record data, by a device, wherein the data is one or more of a location, a video, and an audio; send the data to a server; split, by the server, the data into at least one participant; determine, by the server, an interaction by a match of the at least one participant, and a group of stored data; and notify the device, by the server, of the match.

9. The system of claim 8, wherein the interaction is determined by a calendar event data wherein at least one of notes of the calendar event data, locations of the calendar event data, and participants of the calendar event data are used.

10. The system of claim 9, wherein the notification includes one or more of:

a name of the at least one participant; and
the calendar event data.

11. The system of claim 8, wherein the interaction is based on a match of owners of at least one document.

12. The system of claim 8, comprising determine the interaction by a match of one or more of a voiceprint of the data, a facial recognition of the data, and a location of the data.

13. The system of claim 8, wherein the group of stored data is previously recorded data.

14. The system of claim 8, wherein the group of stored data is in at least one database.

15. A non-transitory computer readable medium comprising instructions, that when read by a processor, cause the processor to perform:

recording data, by a device, wherein the data is one or more of a location, a video, and an audio;
sending the data to a server;
splitting, by the server, the data into at least one participant;
determining, by the server, an interaction by matching the at least one participant, and a group of stored data; and
notifying the device, by the server, of the match.

16. The non-transitory computer readable medium of claim 15, wherein the interaction is determined by a calendar event data wherein at least one of notes of the calendar event data, locations of the calendar event data, and participants of the calendar event data are used.

17. The non-transitory computer readable medium of claim 16, wherein the notification includes one or more of:

a name of the at least one participant; and
the calendar event data.

18. The non-transitory computer readable medium of claim 15, wherein the interaction is based on a matching of owners of at least one document.

19. The non-transitory computer readable medium of claim 15, comprising determining the interaction by matching one or more of a voiceprint of the data, a facial recognition of the data, and a location of the data.

20. The non-transitory computer readable medium of claim 15, wherein the group of stored data is previously recorded data.

Patent History
Publication number: 20190362318
Type: Application
Filed: Jun 17, 2019
Publication Date: Nov 28, 2019
Inventor: David Gerard Ledet (Allen, TX)
Application Number: 16/442,804
Classifications
International Classification: G06Q 10/10 (20060101); H04L 29/08 (20060101);