METHOD AND SYSTEM FOR TRAINING USERS TO PERFORM ACTIVITIES
This disclosure relates to a method and system for training users to perform physical activities. The method includes capturing real-time video of the user performing the activity based on an activity option selected by the user; extracting an AI model based on the activity option; processing in real-time, by the AI model, the real-time video of the user to determine a set of user performance parameters based on current activity performance of the user; overlaying, by the AI model, the user in the real-time video with a pose skeletal model; comparing, by the AI model, the set of user performance parameters with a set of target activity performance parameters; generating, by the AI model, feedback for the user based on comparison of the set of user performance parameters with the set of target activity performance parameters; and rendering, by the AI model, the feedback on the rendering device.
This application claims priority benefits under 35 U.S.C. § 119(e) to U.S. Provisional Application No. 63/074,539 filed on Sep. 4, 2020, which is hereby incorporated by reference in its entirety.
TECHNICAL FIELD
This disclosure relates generally to activity training, and more particularly to a method and system for training users to perform physical activities.
BACKGROUND
In an era of rapid urbanization and a fast-paced life, in many places, people may find it difficult to dedicate a regular time daily for physical well-being and find the right work-life balance. Moreover, in times of a pandemic, when lockdowns are imposed, gyms and parks are usually closed for public access. In such cases, it is much more convenient for a person to work out at home.
Many exercise machines and training methods are monitored and configured with adjustable parameter settings based on capabilities, goals, and specific training methods desired by a user. However, for best results and to reduce the chance of muscle damage and injuries, many exercises require correct performance of complex actions by the user during an exercise routine and skilled adjustment of weights or force resistance.
A drawback associated with a conventional workout machine is that the setting parameters used and the variations over time during the workout session can be different. Another drawback is that the exercise machine does not remember the customization settings or preferences entered earlier. Another problem is that the workout history is not preserved. Also, a conventional exercise machine may contain a set of pre-configured programs that may not be appropriate for all users. Another problem associated with a conventional exercise machine is that its pre-configured programs do not consider other parameters such as pose, movement of the body and other parts.
Further, at most gyms, there is typically a set of mirrors that allow a person to view and confirm or adjust their pose and movements to account for the proper pose and movements. However, unless the person has an expert to analyze the pose and movements, the person may perform with improper pose and movements which can result in a potential injury. Further, an exercise trainer cannot be present every time while exercising. Also, it is not possible for a trainer to monitor and guide every exerciser at the same time while performing group exercises. Further, most of the time, trainers cannot be available to motivate/encourage the exercisers.
Further, it is known that the sensors record a variety of information about the human body. For example, electromyography (EMG) electrodes can measure electrical activity generated by a person's muscles. Similarly, there are motion sensors that record the motion/movement of the person. Hence, in relation to training of individuals, and especially in relation to self-training or personal training or remote training, current technologies do not enable coaching/training entities to monitor physiological states of individuals they are coaching/training and/or efficiently manage exercise regimens for individuals in a personalized and real-time manner. Since individuals may have personalized needs in relation to improving performance, it is desirable for systems to automatically tailor metrics and instruction by taking into account physiological states.
On the other hand, since monitoring and evaluating the exercise/fitness, matching the exercise sequence, counting the sequence, tracking the real time progress of the exercise through physical presence of the instructor can be time consuming, and reliability of the results may be low according to the subjective evaluation criteria of the instructor, it is beneficial to use a video display, Artificial Intelligence (AI), and Augmented Reality (AR) technology to solve such a problem.
Therefore, interactive exercise machines using sensors for tracking the pose and body movement of the user and further providing an interactive rendering device for displaying and managing the exercise are highly desirable in terms of health and fitness for many users. Besides fitness, the interactive rendering device may also be desirable in various other scenarios, for example, rehab, physiotherapy, Yoga, dance, theatre, and other activities in which a feedback on composure and observation are important.
SUMMARY
In one embodiment, a method for training users to perform physical activities is disclosed. In one example, the method includes rendering, via a Graphical User Interface (GUI) of a rendering device, a plurality of activity type options to a user. Each of the plurality of activity type options includes a plurality of activities. The method further includes receiving an activity option selected by the user from the plurality of activity options in response to a user input. The user input includes at least one of a gesture, a touch, or an audio command. The method further includes capturing, by at least one camera, a real-time video of the user performing the activity based on the selected activity option. Each of the at least one camera captures the real-time video of the user from associated predefined angles. The real-time video includes a stream of poses and movements made by the user to perform the activity. The method further includes extracting an AI model based on the activity option selected by the user. The AI model is configured to determine a deviation of the user from a plurality of correct movements associated with an activity corresponding to the activity option based on target activity performance of an activity expert. The method further includes processing in real-time, by the AI model, the real-time video of the user to determine a set of user performance parameters based on current activity performance of the user. The method further includes overlaying, by the AI model, the user in the real-time video with a pose skeletal model. The pose skeletal model includes a plurality of key points based on the activity. Each of the plurality of key points is overlayed over a corresponding joint of the user in the real-time video. The method further includes comparing, by the AI model, the set of user performance parameters with a set of target activity performance parameters. 
The set of target activity performance parameters corresponds to the activity expert. The method further includes generating, by the AI model, feedback for the user based on comparison of the set of user performance parameters with the set of target activity performance parameters. The feedback includes at least one of corrective actions or alerts. The feedback includes at least one of visual feedback, aural feedback, or haptic feedback. The method further includes rendering, by the AI model, the feedback on the rendering device. Rendering the feedback includes overlaying one of the at least one corrective actions over the pose skeletal model overlayed on the real-time video of the user. Rendering the feedback further includes displaying the alerts on the GUI of the rendering device. Rendering the feedback further includes outputting the aural feedback to the user, via a speaker.
In one embodiment, a rendering device for training users to perform physical activities is disclosed. In one example, the rendering device includes a display. The display includes a GUI configured to render a plurality of activity type options to a user. Each of the plurality of activity type options includes a plurality of activities. The GUI is further configured to receive an activity option selected by the user from the plurality of activity options in response to a user input. The user input includes at least one of a gesture, a touch, or an audio command. The rendering device further includes at least one camera configured to capture a real-time video of the user performing the activity based on the selected activity option. Each of the at least one camera captures the real-time video of the user from associated predefined angles. The real-time video includes a stream of poses and movements made by the user to perform the activity. The rendering device further includes a processor and a memory communicatively coupled to the processor. The memory stores processor-executable instructions, which when executed by the processor, cause the processor to extract an AI model based on the activity option selected by the user. The AI model is configured to determine a deviation of the user from a plurality of correct movements associated with an activity corresponding to the activity option based on target activity performance of an activity expert. The processor-executable instructions, on execution, further cause the processor to process in real-time, by the AI model, the real-time video of the user to determine a set of user performance parameters based on current activity performance of the user. The processor-executable instructions, on execution, further cause the processor to overlay, by the AI model, the user in the real-time video with a pose skeletal model. The pose skeletal model includes a plurality of key points based on the activity. 
Each of the plurality of key points is overlayed over a corresponding joint of the user in the real-time video. The processor-executable instructions, on execution, further cause the processor to compare, by the AI model, the set of user performance parameters with a set of target activity performance parameters. The set of target activity performance parameters corresponds to the activity expert. The processor-executable instructions, on execution, further cause the processor to generate, by the AI model, feedback for the user based on comparison of the set of user performance parameters with the set of target activity performance parameters. The feedback includes at least one of corrective actions or alerts. The feedback includes at least one of visual feedback, aural feedback, or haptic feedback. The processor-executable instructions, on execution, further cause the processor to render, by the AI model, the feedback on the rendering device. Rendering the feedback includes overlaying one of the at least one corrective actions over the pose skeletal model overlayed on the real-time video of the user. Rendering the feedback further includes displaying the alerts on the GUI of the rendering device. Rendering the feedback further includes outputting the aural feedback to the user, via a speaker.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles.
Exemplary embodiments are described with reference to the accompanying drawings. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the spirit and scope of the disclosed embodiments. It is intended that the following detailed description be considered as exemplary only, with the true scope and spirit being indicated by the following claims.
Referring now to
The rendering device 100 may include a display 106, at least one camera 108, at least one external camera 110, one or more processors (not shown in figure), and a memory (not shown in figure) communicatively coupled with the one or more processors. The display 106 may include a Graphical User Interface (GUI). It may be noted that the at least one camera 108 may be positioned at center, along an edge, or at bottom of the smart mirror 100. The rendering device 100 may train users to perform physical activities in real-time using real-time video captured by the at least one camera 108 and the at least one external camera 110.
As will be described in greater detail in conjunction with
The memory may include the AI model. Further, the memory may store instructions that, when executed by the one or more processors, cause the one or more processors to train the user 102 to perform physical activities in real-time, in accordance with aspects of the present disclosure. The memory may also store various data (for example, real-time video AI model data, a plurality of activity types, a plurality of activities, real-time video, set of user performance parameters, target activity performance data, and the like) that may be captured, processed, and/or required by the rendering device 100.
The rendering device 100 may interact with the user 102 via the GUI accessible via the display 106. By way of an example, the display 106 may be a Liquid crystal display (LCD), a Light-emitting diode (LED) backlit LCD, a Thin-Film Transistor (TFT) LCD, an LED display, an Organic LED (OLED) display, an Active Matrix Organic LED (AMOLED) display, a Plasma Display Panel (PDP) display, a Quantum Dot LED (QLED) display, or the like. The rendering device 100 may also include one or more external devices (not shown in figure). In some embodiments, the rendering device 100 may interact with the one or more external devices over a communication network (for example, a Universal Serial Bus (USB) data cable, a High Definition Multimedia Interface (HDMI) cable, Wireless Fidelity (Wi-Fi), Light Fidelity (Li-Fi), Bluetooth, and other similar communication networks) for sending or receiving various data. The external devices may include, but may not be limited to, a remote server, a digital device, or another computing system.
The GUI renders a plurality of activity types for the user 102. Each of the plurality of activity types includes a plurality of activities. The user 102 may select at least one of an activity type from the plurality of activity types and an activity from the plurality of activities associated with the activity type via a user command. Further, the rendering device 100 initiates the activity. The GUI displays a target activity performance 112 corresponding to an activity expert on the screen. The target activity performance 112 may be a video recording of the activity expert, a 3-Dimensional (3D) model of the activity expert, a 2-Dimensional (2D) model of the activity expert, or a 4-Dimensional (4D) model of the activity expert. The user 102 may follow the target activity performance 112 with a current activity performance. The display 106 shows a real-time video of the current activity performance of the user 102.
Further, the at least one camera 108 and the at least one external camera 110 capture real-time video associated with the current activity performance of the user 102. The at least one external camera 110 enhances accuracy of determination of pose, movement, gaze, and orientation of head of the user 102. Additionally, the at least one camera 108 may be used for facial recognition of the user 102. Facial data corresponding to the user 102 is associated with a user profile. The user profile is stored in the database and may be associated with current and historical user data such as, but not limited to, history, custom settings, messages from the activity expert, profile data, and other similar data.
Further, the rendering device 100 extracts an AI model based on the activity option selected by the user 102. The AI model is configured to determine a deviation of the user 102 from a plurality of correct movements associated with an activity corresponding to the activity option based on target activity performance 112 of an activity expert. Further, the AI model receives the real-time video and processes the real-time video to determine a set of user performance parameters based on current activity performance of the user 102. In an embodiment, the rendering device 100 is configured to automatically adjust its angle based on the estimated future orientation of the head of the user 102. For example, when the user 102 is performing the current activity in a lying down position, the rendering device 100 may rotate by about 90 degrees to provide improved tracking of the current activity performance.
Further, the AI model overlays the user 102 in the real-time video with a pose skeletal model 114. In some configurations, the AI model may be an AI predictive model. In an embodiment, the pose skeletal model 114 may be determined based on the estimated future pose and motion of the user 102. The pose skeletal model 114 includes a plurality of key points based on the activity type and the activity. Each of the plurality of key points corresponds to a joint of the user 102. Additionally, the plurality of key points may be connected with lines representing bones of the user 102 to complete the pose skeletal model 114.
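By way of a non-limiting illustration, the pose skeletal model described above (key points at joints, connected by lines representing bones) may be sketched in Python as follows. The joint names, the bone list, and the helper `skeleton_segments` are hypothetical and are not part of the disclosure; the sketch only assumes that a pose estimator has already produced pixel coordinates for each detected joint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyPoint:
    name: str  # joint label, e.g. "left_elbow" (hypothetical naming)
    x: float   # horizontal pixel coordinate in the video frame
    y: float   # vertical pixel coordinate in the video frame


# Hypothetical bone list: each pair names the two joints that a line connects.
BONES = [
    ("left_shoulder", "left_elbow"),
    ("left_elbow", "left_wrist"),
    ("left_shoulder", "left_hip"),
]


def skeleton_segments(key_points, bones=BONES):
    """Return ((x1, y1), (x2, y2)) line segments for bones whose two joints were both detected."""
    by_name = {kp.name: kp for kp in key_points}
    segments = []
    for a, b in bones:
        if a in by_name and b in by_name:
            segments.append(((by_name[a].x, by_name[a].y), (by_name[b].x, by_name[b].y)))
    return segments


# Example frame in which the hip was not detected, so the torso bone is skipped.
frame_key_points = [
    KeyPoint("left_shoulder", 100.0, 80.0),
    KeyPoint("left_elbow", 120.0, 130.0),
    KeyPoint("left_wrist", 150.0, 160.0),
]
segments = skeleton_segments(frame_key_points)
```

The resulting segments could then be drawn over the video frame by whatever rendering layer the device uses; that drawing step is outside the scope of this sketch.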
The display 106 of the rendering device 100 shows the pose skeletal model 114 overlayed on top of the real-time video of the user 102, the target activity performance 112 of the activity expert overlayed on the real-time video of the user 102, a set of user performance parameters associated with the current activity performance, and a set of target activity parameters associated with the target activity performance 112. It may be noted that the pose skeletal model 114 is automatically adjusted and normalized with respect to the real-time video of the user 102 based on an estimated future distance of the user 102 relative to the rendering device 100 and the estimated future pose and motion of the user 102. In some embodiments, transparency of the pose skeletal model 114 may be adjustable by the user 102. In an embodiment, the pose skeletal model 114 is completely transparent and invisible to the user 102. In such an embodiment, the pose skeletal model 114 may be used by the AI model solely for computational purposes.
The AI model compares the set of user performance parameters with a set of target activity performance parameters. Further, the AI model generates a feedback for the user 102 based on comparison of the set of user performance parameters with the set of target activity performance parameters. The feedback includes at least one of corrective actions or alerts. The feedback may be at least one of visual feedback, aural feedback, or haptic feedback. Further, the AI model renders the feedback. The rendering may include overlaying one of the at least one corrective actions over the pose skeletal model 114 overlayed on the real-time video of the user 102. Further, the rendering may include displaying the alerts on the GUI of the rendering device 100. Further, the rendering may include outputting the aural feedback to the user 102, via a speaker configured with the rendering device 100. The feedback may include generating a warning to the user 102 including indication for correcting the current pose of the user 102, indication for correcting user motion associated with the current pose of the user 102, and indication for correcting the current position of the user 102, when the user 102 is at least partially outside a field of view of the at least one camera 108.
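By way of a non-limiting illustration, the comparison of user performance parameters with target activity performance parameters may be sketched in Python as follows. The sketch assumes, purely for illustration, that each performance parameter is a joint angle in degrees and that a fixed tolerance decides when a corrective action is generated; the disclosure does not limit the parameters or the comparison to this form, and the function names are hypothetical.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b formed by the segments b->a and b->c."""
    ang = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0]) - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    ang = abs(ang)
    return 360.0 - ang if ang > 180.0 else ang


def compare_parameters(user, target, tolerance_deg=10.0):
    """Return corrective-action strings for parameters that deviate beyond the tolerance."""
    feedback = []
    for name, target_value in target.items():
        if name not in user:
            feedback.append(f"{name}: not visible")
            continue
        diff = user[name] - target_value
        if abs(diff) > tolerance_deg:
            direction = "decrease" if diff > 0 else "increase"
            feedback.append(f"{name}: {direction} angle by {abs(diff):.0f} degrees")
    return feedback


# Hypothetical key points (shoulder, elbow, wrist) forming a right angle at the elbow.
elbow_angle = joint_angle((0.0, 0.0), (1.0, 0.0), (1.0, 1.0))
user_parameters = {"left_elbow": elbow_angle, "left_knee": 165.0}
target_parameters = {"left_elbow": 120.0, "left_knee": 170.0}
feedback = compare_parameters(user_parameters, target_parameters)
```

In this example only the elbow exceeds the tolerance, so a single corrective action is produced; the knee, which is within 10 degrees of the target, generates no feedback.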
In
In another exemplary scenario, the user 102 fails to follow the target activity performance 112 correctly as shown on the display 106. The current activity performance of the user 102, and the associated pose skeletal model 114, shows a deviation from the target activity performance 112 of the activity expert. In such a scenario, the rendering device 100 may generate feedback for the user 102 to ensure that the current activity performance is in accordance with the target activity performance 112. When the deviation of the current activity performance is above a predefined threshold performance and continues for a predefined threshold time, the rendering device 100 may pause the display of the set of user performance parameters and the target activity performance 112.
In
In another exemplary scenario, the user 102 fails to follow the target activity performance 112 correctly as shown on the display 106. The real-time video 116 of the current activity performance of the user 102, and the associated pose skeletal model 114, shows a deviation from the target activity performance 112 of the activity expert. In such a scenario, the rendering device 100 may generate feedback for the user 102 to ensure that the current activity performance is in accordance with the target activity performance 112. When the deviation of the current activity performance is above a predefined threshold performance and continues for a predefined threshold time, the rendering device 100 may pause the display of the set of user performance parameters and the target activity performance 112.
In
In another exemplary scenario, the user 102 fails to follow the target activity performance 112 correctly as shown on the display 106. When the current activity performance of the user 102 deviates from the target activity performance 112 above a predefined performance threshold, the rendering device 100 may generate a feedback (video, graphical, aural, or haptic) for the user 102 to ensure that the current activity performance is in accordance with the target activity performance 112. When the deviation of the current activity performance is above a predefined threshold performance and continues for a predefined threshold time, the rendering device 100 may pause the display of the set of user performance parameters and the target activity performance 112.
In
In
Referring now to
Further, the display 206 is configured to display the real-time video of the user 202. The GUI module 218 is accessible to the user 202 via the display 206. The GUI module 218 provides a plurality of activity types to the user 202. By way of an example, the plurality of activity types may include, but may not be limited to, physical exercises, guided meditations, Yoga, physiotherapy, flower arranging, origami, dance, theatre, any form of performing arts, martial arts, speech therapy, rehab, drawing, painting, physical therapy and rehabilitation, CrossFit, Les Mills, F45, Zumba, Bikram Yoga, Orange Theory, or the like. Each of the plurality of activity types includes a plurality of activities. The user 202 may select, via a user command, at least one of an activity type from the plurality of activity types and an activity from the plurality of activities associated with the activity type. The user command may be at least one of a voice command (received via the microphone 210), a touch gesture, an air gesture, an eye gesture, or a signal generated by an input device (for example, a mouse, a touch pad, a stylus, a keyboard, an associated connected device or controller (such as a gaming controller), or the like). The rendering device 204 may include a plurality of displays and a plurality of cameras to handle multiple users simultaneously.
Further, the camera 208 captures in real-time, the real-time video of current activity performance of the user 202 corresponding to the activity type and the activity. In some embodiments, the rendering device 204 may include one or more additional cameras (such as, the at least one external camera 110). The real-time video received from the camera 208 is stored in the database 222. In some embodiments, the user may edit the real-time video based on one or more user commands. The user command may be at least one of a text command, voice command, touch command, or a visual gesture. The one or more user commands include at least one of setting a start point of the real-time video, setting an end point of the real-time video, removing background from the real-time video, assigning one or more tags to the real-time video, and sharing the real-time video with a set of other users.
Further, the rendering device 204 may extract the AI model 220 based on the activity option selected by the user 202. The AI model 220 is configured to determine a deviation of the user 202 from a plurality of correct movements associated with an activity corresponding to the activity option based on target activity performance of an activity expert. Further, the AI model 220 receives the real-time video from the camera 208 through the processor 214 and processes the real-time video to determine a set of user performance parameters based on current activity performance of the user 202.
Further, the AI model 220 overlays the user 202 in the real-time video with a pose skeletal model (such as, the pose skeletal model 114). The pose skeletal model includes a plurality of key points based on the activity type and the activity. Each of the plurality of key points corresponds to a joint or a feature of the user 202 in the real-time video. Additionally, the plurality of key points may be connected with lines representing bones of the user 202 to complete the pose skeletal model. In an embodiment, a 3D rendering of the user 202 may be generated as the pose skeletal model. It may be noted that the pose skeletal model is automatically adjusted and normalized with respect to the real-time video of the user 202 based on a current pose and estimated future distance and viewing position of the user 202 relative to the rendering device 204, the current pose and estimated future field of view, and the current pose and estimated future pose and motion of the user 202. In some embodiments, transparency of the pose skeletal model 114 may be adjustable by the user 202. In an embodiment, the pose skeletal model is completely transparent and invisible to the user 202. In such an embodiment, the pose skeletal model may be used by the AI model 220 solely for computational purposes.
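By way of a non-limiting illustration, one way to make key points independent of the distance between the user and the camera is to translate them to a reference joint and divide by a reference body length. The Python sketch below assumes this particular normalization (hip-centered, scaled by hip-to-neck length); the disclosure states only that the pose skeletal model is adjusted and normalized, so the choice of reference joints and the function name are assumptions.

```python
import math


def normalize_key_points(points, origin="mid_hip", ref_a="mid_hip", ref_b="neck"):
    """Translate key points so `origin` is (0, 0) and scale so the ref_a-to-ref_b length is 1."""
    ox, oy = points[origin]
    scale = math.dist(points[ref_a], points[ref_b])
    if scale == 0:
        raise ValueError("reference joints coincide; cannot normalize")
    return {name: ((x - ox) / scale, (y - oy) / scale) for name, (x, y) in points.items()}


# The same pose captured close to the camera and at twice the distance (hypothetical pixels).
near = {"mid_hip": (200.0, 400.0), "neck": (200.0, 200.0), "left_wrist": (300.0, 300.0)}
far = {"mid_hip": (100.0, 200.0), "neck": (100.0, 100.0), "left_wrist": (150.0, 150.0)}

near_normalized = normalize_key_points(near)
far_normalized = normalize_key_points(far)
```

After normalization the two captures yield the same coordinates, so a comparison against target parameters would not be affected by how far the user stands from the rendering device.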
The AI model 220 compares the set of user performance parameters with a set of target activity performance parameters. Further, feedback for the user is generated via pose and AI deviation process, based on comparison of the set of user performance parameters with the set of target activity performance parameters. The feedback includes at least one of corrective actions or alerts. The feedback may be at least one of visual feedback, aural feedback, or haptic feedback. Further, the AI model 220 renders the feedback. The rendering may include overlaying one of the at least one corrective actions over the real-time video of the user 202 on the rendering device 204. Further, the rendering may include displaying the alerts on the GUI of the rendering device 204. Further, the rendering may include outputting the aural feedback to the user 202, via the speaker 212. In an embodiment, the speaker 212 may be a directional speaker to provide a more personalized training experience for the user 202. In some embodiments, the system 200 includes a plurality of speakers (for example, a home theatre system) installed in different regions of the room. In some embodiments, the rendering device 204 is configured to output audio feedback via a Bluetooth headset or speaker. The feedback may include generating a warning to the user 202 including indication for correcting the current pose of the user 202, indication for correcting user motion associated with the current pose of the user 202, and indication for correcting the current position of the user 202, when the user 202 is at least partially outside a field of view of the camera 208. In some embodiments, the AI model 220 may include a plurality of submodules functioning in combination to execute the aforementioned steps for training the user to perform physical activities in real-time. 
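By way of a non-limiting illustration, the warning generated when the user is at least partially outside the field of view of the camera may be sketched in Python as follows. The sketch assumes that each key point carries pixel coordinates and a detection confidence, and that a joint counts as not visible when it lies outside the frame or falls below a confidence threshold; the threshold value, the message text, and the function names are hypothetical.

```python
def joints_out_of_view(key_points, frame_width, frame_height, min_confidence=0.5):
    """Return sorted names of joints outside the frame or detected with low confidence."""
    missing = []
    for name, (x, y, confidence) in key_points.items():
        inside = 0 <= x < frame_width and 0 <= y < frame_height
        if not inside or confidence < min_confidence:
            missing.append(name)
    return sorted(missing)


def position_warning(key_points, frame_width, frame_height):
    """Return a warning string, or None when every joint is visible."""
    missing = joints_out_of_view(key_points, frame_width, frame_height)
    if not missing:
        return None
    return "Please step back into view. Not visible: " + ", ".join(missing)


# Hypothetical detections as (x, y, confidence) for a 640x480 frame.
detected = {
    "nose": (320.0, 60.0, 0.98),
    "left_wrist": (-15.0, 240.0, 0.40),
    "right_ankle": (400.0, 495.0, 0.91),
}
warning = position_warning(detected, frame_width=640, frame_height=480)
```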
In an embodiment, the AI model 220 of the rendering device 204 may include a recommendation engine for providing recommendations of activities based on performance data of the user 202. By way of an example, the performance data may include exercises performed, circuits performed, duration of exercises, activity performance, personal goals, user profile, age, weight, Body Mass Index (BMI), and the like.
The rendering device 204 may continue to display the target activity performance as long as the current activity performance is in accordance with the target activity performance. However, when the current activity performance of the user 202, and the associated pose skeletal model, shows a deviation from the target activity performance of the activity expert, the rendering device 204 may generate feedback for the user 202 to ensure that the current activity performance is in accordance with the target activity performance. When the deviation of the current activity performance is above a predefined threshold performance and continues for a predefined threshold time, the rendering device 204 may pause the display of the set of user performance parameters and the target activity performance.
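By way of a non-limiting illustration, the pause condition described above (deviation above a predefined threshold that continues for a predefined threshold time) may be sketched in Python as a small stateful monitor. The class name, the scalar deviation score, and the numeric thresholds are assumptions made for the example only.

```python
class DeviationMonitor:
    """Signals a pause once the deviation has stayed above a threshold for a minimum duration."""

    def __init__(self, threshold, hold_seconds):
        self.threshold = threshold
        self.hold_seconds = hold_seconds
        self._since = None  # timestamp at which the current run of excess deviation began

    def update(self, deviation, timestamp):
        """Record one sample; return True when the display should be paused."""
        if deviation <= self.threshold:
            self._since = None  # back within tolerance, so the run is reset
            return False
        if self._since is None:
            self._since = timestamp
        return timestamp - self._since >= self.hold_seconds


# Hypothetical samples as (timestamp in seconds, deviation score).
monitor = DeviationMonitor(threshold=0.3, hold_seconds=2.0)
samples = [(0.0, 0.1), (0.5, 0.4), (1.5, 0.5), (2.0, 0.2), (2.5, 0.6), (4.5, 0.7)]
results = [monitor.update(deviation, t) for t, deviation in samples]
```

In this example the first run of excess deviation is interrupted at 2.0 seconds, so no pause is signaled until the second run has lasted the full 2.0 seconds at the final sample.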
As may be appreciated, the feedback based on the activity being performed by the user 102 may not be limited to instructions to perform corrective actions. The feedback may also include biometric feedback or warnings, for example, any irregularity or issues in one or more of pulse rate or heartbeat of the user 102, body temperature of the user 102, spasms in muscles, pupil dilation, and other similar health issues. In some embodiments, feedback may be in the form of motivation or encouragement provided to the user 102 while performing the activity or after completion of the activity. By way of an example, in the form of audio feedback, messages like: “great job,” “you are awesome,” “great going,” “perfectly done,” “done like a pro,” “you are the best,” “that's the best I have seen,” and other similar messages, may be provided to the user 102. The sound of clapping, cheers, or various exclamations may also be provided to the user 102 as feedback. These messages may also be provided in the form of visual feedback, such that the messages may be displayed in textual form on the GUI of the rendering device 100. Additionally, or alternatively, graphic elements, for example, bursting crackers, flying balloons, the sound of stadium crowd, or avatars of cheerleader, instructor, famous people (for example, Kai Greene, Phil Heath, Ronnie Coleman, Arnold, and other famous personalities), may also be displayed to the user 102. In some configurations, gamification of the activities performed by the user and a rewarding mechanism may also be used as feedback provided to the user. As a result of such feedback, the user 102 may be constantly motivated and may not feel that he/she is performing any given activity in isolation.
In some configurations, the user 102 may also be able to set goals related to various activities. In such case, the feedback may include status regarding percentage of goals achieved by the user 102.
In some embodiments, in order to provide feedback to the user 102 on their personal smart devices, i.e., third party smart devices, the rendering device 100 may be configured with an open Application Programming Interface (API), which may enable such integration seamlessly. Moreover, data received from the third party smart devices may also be ingested into the rendering device 100, via the open API, and may further be provided to the user 102 via the rendering device 100 using visual elements (such as, graphs or charts), verbal and audio cues, or haptic cues. The data may also correspond to warnings and alerts generated by the third party smart devices. By way of an example, a smart watch that is configured to sense blood pressure of the user 102 may send data regarding the user 102 having high blood pressure to the rendering device 100. Accordingly, the rendering device 100 may render the message “Your blood pressure is too high, please relax and take a break” to the user 102, orally or visually. Thus, the rendering device 100 may act as a collator of feedback and a single point smart device for viewing all feedback. In other words, since the smart mirror 100 generates feedback on its own and also receives feedback from other smart devices, the rendering device 100 assimilates all feedback, refines it, and then presents it to the user 102 via the rendering device 100. Thus, the user does not have to rely on multiple devices to receive various types of feedback.
Further, the user 102 may want to share activity performance with friends on various social networks or with other remote users that may also use smart mirrors 100. To this end, the rendering device 100 may be configured to integrate with various social media applications. Examples of these social media applications may include, but are not limited to, FACEBOOK™, WHATSAPP™, YOUTUBE™, and/or INSTAGRAM™. In some embodiments, the smart mirror 100 may have these social media applications already installed therein. There may also be a social media application that is specific to the rendering device 100 and is configured to only connect users of other rendering devices 100 and/or smart mirrors.
Thus, by way of integration with these social media applications, the user performance may be posted and published on one or more of these social media platforms and may be made available as online content for other users to access. The rewarding mechanism as discussed before may also be shared or used on social media platforms. In some configurations, scores related to user activities may be presented on a leader board as points for various users who use smart mirrors 100 and/or display devices 200. Badges may also be assigned to various users based on level of activities performed by them and may be displayed on social media platforms. Additionally, records related to exercises performed may also be displayed. Moreover, goals set by various users for activities and respective percentage completion of goals may also be displayed on social media platforms. As may be appreciated, feedback provided to users may also be shared within group of users on social media, including friends, social circles, and classes that may be connected in real-time.
In some embodiments, the rendering device 204 may generate audio messages (such as, the number of reps in audio form, the aural feedback to the user 202, new achievements, personal best, messages from other users, advertising, challenges, errors, warnings, or the like) for the user 202 in an audio form via the speaker 212. It may be noted that when generating the audio messages, timing and duration of an audio output may be important. For example, when the user 202 is exercising at a high speed, some of the audio messages may become obsolete before generation. Moreover, some of the audio messages may become repetitive and unnatural. Additionally, some of the audio messages may be of a higher priority (for example, warnings and errors). In such scenarios, the audio messages may be generated through a mechanism based on priority queues. In an embodiment, the audio messages may use an AI-based approach to generate more natural dialogues. It may be noted that pose and exercise matching through AI may be performed on a remote server while recognition of key points may be performed on an edge node. Thus, transfer of heavy video data to the server can be avoided. Additionally, overall security may be enhanced since the pose and exercise matching is not known on an end edge device.
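By way of a non-limiting illustration, the priority-queue mechanism for audio messages described above might be sketched as follows. The class names, the convention that a lower number means a higher priority, and the time-to-live parameter are assumptions made for this sketch and are not specified in the disclosure.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class AudioMessage:
    priority: int                       # lower value = more urgent (assumed convention)
    seq: int                            # tie-breaker that preserves insertion order
    text: str = field(compare=False)
    expires_at: float = field(compare=False)  # after this time the message is obsolete


class AudioMessageQueue:
    """Hypothetical queue that serves urgent messages first and drops stale ones."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()
        self._last_spoken = None

    def push(self, text, priority, now, ttl):
        """Queue a message that stays valid for ttl seconds after now."""
        message = AudioMessage(priority, next(self._counter), text, now + ttl)
        heapq.heappush(self._heap, message)

    def pop(self, now):
        """Return the next message to speak, or None if nothing valid remains."""
        while self._heap:
            message = heapq.heappop(self._heap)
            if message.expires_at < now:
                continue  # obsolete before it could be spoken
            if message.text == self._last_spoken:
                continue  # avoid repeating the same message back to back
            self._last_spoken = message.text
            return message.text
        return None
```

In such a sketch, a warning pushed with priority 0 would be spoken before a rep count pushed with priority 2, and a rep count whose time-to-live has elapsed would be silently discarded.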
It should be noted that all such aforementioned modules 206-224 may be represented as a single module or a combination of different modules. Further, as will be appreciated by those skilled in the art, each of the modules 206-224 may reside, in whole or in parts, on one device or multiple devices in communication with each other. In some embodiments, each of the modules 206-224 may be implemented as dedicated hardware circuit comprising custom application-specific integrated circuit (ASIC) or gate arrays, off-the-shelf semiconductors such as logic chips, transistors, or other discrete components. Each of the modules 206-224 may also be implemented in a programmable hardware device such as a field programmable gate array (FPGA), programmable array logic, programmable logic device, and so forth. Alternatively, each of the modules 206-224 may be implemented in software for execution by various types of processors (e.g., processor 214). An identified module of executable code may, for instance, include one or more physical or logical blocks of computer instructions, which may, for instance, be organized as an object, procedure, function, or other construct. Nevertheless, the executables of an identified module or component need not be physically located together, but may include disparate instructions stored in different locations which, when joined logically together, include the module and achieve the stated purpose of the module. Indeed, a module of executable code could be a single instruction, or many instructions, and may even be distributed over several different code segments, among different applications, and across several memory devices over internet, cloud, and in parallel.
As will be appreciated by one skilled in the art, a variety of processes may be employed for training the user to perform physical activities. For example, the exemplary system 200 and the associated rendering device 204 may train the user to perform physical activities in real-time by the processes discussed herein. In particular, as will be appreciated by those of ordinary skill in the art, control logic and/or automated routines for performing the techniques and steps described herein may be implemented by the system 200 and the associated rendering device 204 either by hardware, software (such as, neural networks or other computational models), or combinations of hardware and software. For example, suitable code may be accessed and executed by the one or more processors on the rendering device 204 to perform some or all of the techniques described herein. Similarly, application specific integrated circuits (ASICs) configured to perform some or all of the processes described herein may be included in the one or more processors on the rendering device 204.
Referring now to
Further, the process 300 includes receiving an activity option selected by the user from the plurality of activity options in response to a user input, at step 304. The user input includes at least one of a gesture, a touch, an audio command, or a signal generated by an input device (such as, keyboard, mouse, stylus, graphic pen, or the like). In an embodiment, the audio command may be received by the microphone 210.
Further, the process 300 includes capturing, by at least one camera (for example, the at least one camera 108), a real-time video of the user performing the activity based on the selected activity option, at step 306. Each of the at least one camera captures the real-time video of the user from associated predefined angles. The real-time video includes a stream of poses and movements made by the user to perform the activity. In some embodiments, the user may edit the real-time video based on one or more user commands. The user command is at least one of a text command, voice command, touch command, or a visual gesture. By way of an example, the one or more user commands include at least one of setting a start point of the real-time video, setting an end point of the real-time video, removing background from the real-time video, assigning one or more tags to the real-time video, and sharing the real-time video with the activity expert or a set of other users. Alternately, the activity expert may record and edit real-time video corresponding to the target activity performance to be shared with the user.
Further, the process 300 includes extracting an AI model (such as, the AI model 220) based on the activity option selected by the user, at step 308. The AI model is configured to determine a deviation of the user from a plurality of correct movements associated with an activity corresponding to the activity option based on target activity performance of an activity expert. Further, the process 300 includes processing in real-time, by the AI model, the real-time video of the user to determine a set of user performance parameters based on current activity performance of the user, at step 310.
Further, the process 300 includes overlaying, by the AI model, the user in the real-time video with a pose skeletal model (for example, the pose skeletal model 114), at step 312. The pose skeletal model includes a plurality of key points based on the activity. Each of the plurality of key points is overlayed over a corresponding joint of the user in the real-time video. In an embodiment, the AI model 220 is configured to generate the pose skeletal model based on the real-time video corresponding to the current activity performance of the user 202. In such an embodiment, the AI model 220 is configured to identify various joints or features of the user 202 in real-time and assign a key point to each of the joints or features. In some embodiments, the AI model 220 is configured to estimate a future distance of the user 202 relative to the rendering device 204 and a future pose and motion of the user. Further, the step 312 of the process 300 includes automatically adjusting and normalizing the pose skeletal model based on the current pose and estimated future distance of the user relative to the rendering device and the current pose and estimated future pose and motion of the user, at step 314.
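By way of a non-limiting illustration, one possible way to normalize the key points of a pose skeletal model, so that the model is comparable regardless of the distance of the user from the rendering device, is sketched below. The key point names, the choice of the hip midpoint as origin, and the use of torso length as the scale are assumptions made for this sketch; the disclosure does not prescribe a particular normalization.

```python
import math


def _midpoint(p, q):
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)


def normalize_pose(keypoints):
    """Center key points on the hip midpoint and scale by torso length.

    keypoints maps a (hypothetical) joint name to an (x, y) pixel position.
    """
    hip_mid = _midpoint(keypoints["left_hip"], keypoints["right_hip"])
    shoulder_mid = _midpoint(keypoints["left_shoulder"], keypoints["right_shoulder"])
    torso = math.dist(hip_mid, shoulder_mid)
    if torso == 0:
        raise ValueError("degenerate pose: torso length is zero")
    return {
        name: ((x - hip_mid[0]) / torso, (y - hip_mid[1]) / torso)
        for name, (x, y) in keypoints.items()
    }
```

Under these assumptions, the same pose captured close to the camera and far from the camera would map to approximately the same normalized coordinates.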
Further, the process 300 includes comparing, by the AI model, the set of user performance parameters with a set of target activity performance parameters, at step 316. The set of target activity performance parameters corresponds to the activity expert. By way of an example, the set of user performance parameters includes, but is not limited to, speed of the current activity performance, number of repetitions completed, overall completion of an activity circuit, third-party smart device information, pulse rate of the user, blood pressure of the user, and motion of the user. By way of an example, the set of target activity performance parameters includes, but is not limited to speed of the target activity performance, target number of repetitions, target pulse rate of the user, and target motion of the user.
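By way of a non-limiting illustration, the comparison of the set of user performance parameters with the set of target activity performance parameters might be sketched as a relative-deviation check. The parameter names and the 10% tolerance are assumptions made for this sketch only.

```python
def compare_parameters(user_params, target_params, tolerance=0.10):
    """Return {name: relative deviation} for parameters outside the tolerance band.

    A positive deviation means the user value exceeds the target value.
    """
    out_of_tolerance = {}
    for name, target in target_params.items():
        if name not in user_params or target == 0:
            continue  # nothing meaningful to compare against
        deviation = (user_params[name] - target) / target
        if abs(deviation) > tolerance:
            out_of_tolerance[name] = deviation
    return out_of_tolerance
```

Parameters that are reported for the user but have no target counterpart are simply ignored in this sketch.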
Further, the process 300 includes generating, by the AI model, feedback for the user based on comparison of the set of user performance parameters with the set of target activity performance parameters, at step 318. The feedback includes at least one of corrective actions or alerts. The feedback includes at least one of visual feedback, aural feedback, or haptic feedback. The feedback may include generating a warning to the user including indication for correcting the current pose of the user, indication for correcting user motion associated with the current pose of the user, and indication for correcting the current position of the user, when the user is at least partially outside a field of view of the at least one camera.
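By way of a non-limiting illustration, the warning generated when the user is at least partially outside the field of view of the camera might be derived from the detected key points as sketched below. The key point format (x, y, confidence), the confidence threshold, and the wording of the warning are assumptions made for this sketch.

```python
def out_of_view_keypoints(keypoints, frame_width, frame_height, min_confidence=0.3):
    """Return the sorted names of key points that fall outside the frame
    or were detected with low confidence.

    keypoints maps a (hypothetical) joint name to (x, y, confidence).
    """
    missing = []
    for name, (x, y, confidence) in keypoints.items():
        inside = 0 <= x < frame_width and 0 <= y < frame_height
        if not inside or confidence < min_confidence:
            missing.append(name)
    return sorted(missing)


def position_warning(keypoints, frame_width, frame_height):
    """Return a warning string, or None when the whole body is visible."""
    missing = out_of_view_keypoints(keypoints, frame_width, frame_height)
    if not missing:
        return None
    return "Please step back into view: " + ", ".join(missing) + " not visible"
```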
Further, the process 300 includes rendering, by the AI model, the feedback on the rendering device, at step 320. Further, the step 320 of the process 300 includes overlaying one of the at least one corrective actions over the pose skeletal model overlayed on the real-time video of the user, at step 322. Further, the step 320 of the process 300 includes displaying the alerts on the GUI of the rendering device, at step 324. Further, the step 320 of the process 300 includes outputting the aural feedback to the user, via a speaker, at step 326.
In some embodiments, the process 300 includes pausing the display of the set of user performance parameters and the target activity performance when the current activity performance varies from the target activity performance above a predefined threshold performance for a predefined threshold time based on the comparing. Further, in such embodiments, the process 300 includes generating, by the AI model, feedback for the user based on comparison of the set of user performance parameters with the set of target activity performance parameters. It should be noted that the predefined threshold performance and the predefined threshold time are correlated. For example, even when the user is out of alignment from the target activity performance for a short time interval, the smart mirror 100 may pause the display. In some embodiments, the AI model dynamically determines values for each of the predefined threshold performance and the predefined threshold time based on user's skill level. In some embodiments, the AI model may use various parameters other than performance and time.
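By way of a non-limiting illustration, the pausing condition described above, in which the deviation must exceed a threshold performance for at least a threshold time, might be tracked as sketched below. The class name, the scalar deviation measure, and the use of timestamps in seconds are assumptions made for this sketch.

```python
class DeviationPauseMonitor:
    """Hypothetical monitor that signals a pause after a sustained deviation."""

    def __init__(self, threshold_performance, threshold_time):
        self.threshold_performance = threshold_performance
        self.threshold_time = threshold_time
        self._exceeded_since = None  # timestamp at which the deviation first exceeded the threshold

    def update(self, deviation, timestamp):
        """Return True when the display should be paused."""
        if deviation <= self.threshold_performance:
            self._exceeded_since = None  # user is back within tolerance
            return False
        if self._exceeded_since is None:
            self._exceeded_since = timestamp
        return timestamp - self._exceeded_since >= self.threshold_time
```

A skill-dependent configuration, as contemplated above, could be obtained in this sketch by constructing the monitor with different threshold values for different users.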
Further, the process 300 includes displaying the set of user performance parameters, the set of target activity performance parameters, and the target activity performance of the activity expert through the GUI of the rendering device, at step 328. In an embodiment, real-time video may be received from the activity expert corresponding to the target activity performance in real-time in response to the current activity performance of the user. In such an embodiment, the set of user performance parameters and the target activity performance of the activity expert are displayed in real-time through the GUI via the rendering device. The target activity performance is overlayed over the real-time video of the current activity performance of the user. Further, the process 300 includes overlaying, in real-time, the target activity performance over the current activity performance of the user in the real-time video, at step 330.
Referring now to
Referring now to
Further, the process 500 includes comparing at least one of the trainer avatar and the user avatar corresponding to the target activity performance with the current activity performance of the user based on the activity type and the activity, at step 504. Further, the process 500 includes displaying the set of user performance parameters, at least one of the trainer avatar and the user avatar corresponding to the target activity performance, and the current activity performance of the user through the GUI via the rendering device, at step 506. The current activity performance is overlayed over at least one of the trainer avatar and the user avatar corresponding to the target activity performance in real-time. It may be noted that the steps 502-506 may be iteratively performed throughout the current activity performance of the user.
Some embodiments of the present disclosure may be employed in a gymnasium, rehab, physiotherapy inside a hospital, dance studios, theatre, or any other use case scenario. The gymnasium may include, for example, multiple exercise machines and equipment for performing multiple activities by the user. The user may use a rendering device (for example, the rendering device 100) or any other display device (for example, a smart mirror) to select an activity from the activity categories and may correspondingly select an activity attribute that is associated with the activity. The multiple cameras may capture the activity of the user and may provide relevant instructions and feedback to the user for improving the activities being performed. The cameras may also be used for facial recognition of the users to identify the users and provide history, customized settings, messages from trainers, profile data, and other similar data to the users. In a gymnasium, a single rendering device may include multiple screens and GUIs to provide exercise training to a plurality of users. Further, the rendering device 100 is configured to output audio feedback via a Bluetooth headset or speaker.
The cameras may be used to track and record the activity of the user in the gymnasium as the user moves from one area or from one machine to another for performing various activities. The rendering device 100 may track current progress of the user using the cameras as the user moves from one area of the gymnasium to another. The cameras may allow continuity of the user's context and information across the monitor. The gymnasium (or any other use case scenario) may have various types of activity experts, such as personal coaches, trainers, gym activity experts, physical therapist, occupational therapist, physical education teachers, martial arts teachers, dance and choreography, sports personalities, team coach, and demonstrators and other trainers in health and fitness.
The activities performed in the gymnasium and the goals achieved by the user may be shared by the activity expert or the user with one or more other users practicing in the gymnasium or with one or more remote users.
In addition, the user performance may be posted and published on social media platforms and may be made available as online content for the one or more remote users to access. This may be done via gamification of the activities performed by the user and using a rewarding mechanism. Scores related to user activities may be presented on a leader board as points. Badges may be assigned to the user based on level of activities performed. In addition, records related to activities performed that may include, for example, accuracy, total count of exercises performed, breaks between the exercises and the like may be provided to the one or more users and may also be displayed on the rendering devices and smart mirrors. Further, the rendering device 100 may include additional features such as unlocking features, additional activities, designs, and other similar features.
In an embodiment, the rendering device 100 may be used to create content media and may share the content media comprising information related to, for example, current health status of the user, exercising routine, exercising capacity and previous records and earned rewards for the user on social media platforms. This may be done through an application programming interface (API) integrated with the rendering device 100.
The rendering device 100 may be used as a recording tool for creating new fitness content and associated instructions for the activity to be performed by the user as received through voice based input. Further, both the display device and smart mirror as used in the gymnasium (or any other use case scenario) may be used for editing and reviewing new content related to activities being selected by the user and may be used for reviewing the user's session by the activity expert. In addition, the rendering device 100 may be connected to a health and fitness application using which the users may log into the rendering device 100. Feedback as received on the health and fitness application may be shared on the social platforms for social engagement and provide activity related data to other socially connected parties or groups in form of leaderboards.
Further, the rendering device 100 may be used as a recording device by the user or the activity expert. As may be appreciated, the user and the activity expert using the rendering device 100 may crop, highlight, add voice, voice-to-text feedback on the smart mirror. Additionally, the user and the activity expert may be permitted to add or remove background image as used in the smart mirror. Additionally, the activity expert may create metadata, instructions, threshold parameters, and combinations thereof. The recorded videos may be shared with other users. Additionally, the parameters collected by the rendering device may be processed to create metadata, instructions, threshold parameters, and combinations thereof.
The rendering device 100 may use the one or more cameras and one or more other sensors to capture position of the user while performing the activity. The feedback based on the activity being performed by the user is not limited to instructions related to slowing down visual and other media components, such as a video of the target motion, but may also include other feedback such as beats and rhythm audio cues, such as a metronome. To generate correct and timely feedback, movement of the user may be tightly coupled to the provided performance guidance cues, target movements, media, and voice and audio feedback. The rendering device 100 may map and synchronize media and information as provided with actual movement of the user and may thus provide relevant corresponding feedback.
In an embodiment, a multi-language voice based interface may be provided for enabling the user to navigate, select, schedule and sequence an activity from the plurality of activity categories. The voice based input may be used to create and save playlists, add metadata to the playlists, add comments using speech-to-text mechanism and audio feedback to the playlists and the activities, record a new activity category, edit and clip the activity to be performed, tag an exercise with hashtags, for example, type of exercise, muscle groups or level of difficulty, replace an exercise clip with an alternative version, share playlists and exercises with other users, dictate a message for another user when sharing the playlists.
Some embodiments of the present disclosure may be implemented as an AI-based health and fitness system training method. The method includes detecting a user, determining pose and body movement of a user using a camera, further sensing motion, position, and/or movement of the body or user using a sensor(s), directing and monitoring the exercise through the pose determination and the body movement of the user on the smart mirror, overlaying the pose and the movement over a mirror reflection of the user and providing real time feedback, overlaying the pose over a video stream of an activity expert and showing the pose position in conjunction with the training video and user, tracking exercise in real time, automatic rep counting, guiding and target correlation and accuracy in an exercise sequence and real time social media sharing to groups, friends and others.
In an alternate embodiment, the Artificial Intelligence based health and fitness training method includes a camera for determining pose and body movement of a user; a microphone for listening for the user's voice instructions; and a speaker for providing feedback to the user on their movements, whereby the method provides for determining the pose and the body movement of the user to track exercise in real time, for automatic rep counting, and to guide and target correlation in an exercise sequence.
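By way of a non-limiting illustration, automatic rep counting from the determined pose might be sketched by tracking a joint angle with two thresholds. The choice of a single joint angle, the threshold values of 90 and 160 degrees, and the function and class names are assumptions made for this sketch; the disclosure does not specify how repetitions are counted.

```python
import math


def joint_angle(a, b, c):
    """Return the angle in degrees at vertex b formed by points a, b, and c."""
    angle = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0]) - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    angle = abs(angle) % 360
    return 360 - angle if angle > 180 else angle


class RepCounter:
    """Hypothetical counter: one rep is a bend below down_angle followed by
    an extension above up_angle. The gap between the two thresholds keeps
    small jitter around a single threshold from being counted as reps."""

    def __init__(self, down_angle=90.0, up_angle=160.0):
        self.down_angle = down_angle
        self.up_angle = up_angle
        self.count = 0
        self._is_down = False

    def update(self, angle):
        """Feed the latest joint angle and return the running rep count."""
        if not self._is_down and angle <= self.down_angle:
            self._is_down = True
        elif self._is_down and angle >= self.up_angle:
            self._is_down = False
            self.count += 1
        return self.count
```

For a squat, for example, the angle could be computed at the knee from the hip, knee, and ankle key points of each video frame and passed to the counter.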
In an embodiment, real-time live feedback may be provided using voice controlled instructions, visuals including video or textual or graphic elements based on instructions for the exercise, scripts, sequence of the exercise and/or performance of the user while performing the activity. The rendering device 100 may include information related to, for example, the user's account access, the user's workout history, and other relevant information related to the user. The rendering device 100 may determine pose and body movement of the user, overlay the pose and movement of the user over a mirror reflection of the user, and also overlay the pose over a video stream of the activity expert, showing the pose position in conjunction with the training video and the user. Based on the AI model, the rendering device 100 may provide live feedback to the user based on instructions related to, for example, exercise/activity, scripts, sequence of the exercise and/or performance of the user during exercise through audio feedback/video feedback/textual feedback/graphic feedback.
In an embodiment, for efficient display of the guidance steps, visual information related to the guidance steps may be placed and presented on a screen of the rendering device. The rendering device GUI may be adjusted based on eye position of the user so that the information is placed relative to the reflection of the user, rather than relative to a captured video stream of the user or in some fixed position.
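By way of a non-limiting illustration, the placement of information relative to the reflection of the user, based on the eye position, might be computed with planar mirror geometry as sketched below. The coordinate convention (mirror surface in the plane z = 0, user in front of the mirror at z > 0, all lengths in the same unit) and the function name are assumptions made for this sketch.

```python
def reflection_point_on_mirror(eye, body_point):
    """Return the (x, y) location on the mirror surface at which the eye
    sees the reflection of body_point.

    The virtual image of body_point lies at the same (x, y) but at depth -z
    behind the mirror; the returned location is where the line of sight
    from the eye to that virtual image crosses the mirror plane z = 0.
    """
    ex, ey, ez = eye
    px, py, pz = body_point
    if ez <= 0 or pz < 0:
        raise ValueError("eye and body point must be in front of the mirror")
    t = ez / (ez + pz)  # fraction of the way from the eye to the virtual image
    return (ex + t * (px - ex), ey + t * (py - ey))
```

In this geometry, when the eye and the body point are at the same distance from the mirror, the reflection appears halfway between them on the mirror surface, which is why an overlay aligned to the captured video frame would generally not line up with the reflection.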
Further, a 3D model of pose and movement of the user on the smart mirror is provided. This is accompanied by overlaying the provided 3D model over reflection and then the 3D model may be rendered along with analysis for display through the smart mirror. As may be appreciated, the one or more cameras capturing the user's pose and motion mid-way along the long side of the mirror may be adjusted for allowing a better aspect ratio of the user's pose.
The video recording may be created using at least one camera placed at distributed locations. The rendering device 100 may include a recording device for creating new training content, for recording content to be reviewed later by an activity expert, physiotherapist, teacher, choreographer and for real-time sharing of the activity expert's or user's live stream.
As will be also appreciated, the above described techniques may take the form of computer or controller implemented processes and apparatuses for practicing those processes. The disclosure can also be embodied in the form of computer program code containing instructions embodied in tangible media, such as floppy diskettes, solid state drives, CD-ROMs, hard drives, or any other computer-readable storage medium, wherein, when the computer program code is loaded into and executed by a computer or controller, the computer becomes an apparatus for practicing the invention. The disclosure may also be embodied in the form of computer program code or signal, for example, whether stored in a storage medium, loaded into and/or executed by a computer or controller, or transmitted over some transmission medium, such as over electrical wiring or cabling, through fiber optics, or via electromagnetic radiation, wherein, when the computer program code is loaded into and executed by a computer, the computer becomes an apparatus for practicing the invention. When implemented on a general-purpose microprocessor, the computer program code segments configure the microprocessor to create specific logic circuits.
The disclosed methods and systems may be implemented on a conventional or a general-purpose computer system, such as a personal computer (PC) or server computer. Referring now to
The computing system 1200 may also include a memory 1206 (main memory), for example, Random Access Memory (RAM) or other dynamic memory, for storing information and instructions to be executed by the processor 1202. The memory 1206 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by the processor 1202. The computing system 1200 may likewise include a read only memory (“ROM”) or other static storage device coupled to bus 1204 for storing static information and instructions for the processor 1202.
The computing system 1200 may also include storage devices 1208, which may include, for example, a media drive 1210 and a removable storage interface. The media drive 1210 may include a drive or other mechanism to support fixed or removable storage media, such as a hard disk drive, a floppy disk drive, a magnetic tape drive, an SD card port, a USB port, a micro USB, an optical disk drive, a CD or DVD drive (R or RW), or other removable or fixed media drive. Storage media 1212 may include, for example, a hard disk, magnetic tape, flash drive, or other fixed or removable medium that is read by and written to by the media drive 1210. As these examples illustrate, the storage media 1212 may include a computer-readable storage medium having stored therein particular computer software or data.
In alternative embodiments, the storage devices 1208 may include other similar instrumentalities for allowing computer programs or other instructions or data to be loaded into the computing system 1200. Such instrumentalities may include, for example, a removable storage unit 1214 and a storage unit interface 1216, such as a program cartridge and cartridge interface, a removable memory (for example, a flash memory or other removable memory module) and memory slot, and other removable storage units and interfaces that allow software and data to be transferred from the removable storage unit 1214 to the computing system 1200.
The computing system 1200 may also include a communications interface 1218. The communications interface 1218 may be used to allow software and data to be transferred between the computing system 1200 and external devices. Examples of the communications interface 1218 may include a network interface (such as an Ethernet or other NIC card), a communications port (such as for example, a USB port, a micro USB port), Near field Communication (NFC), and other similar communications interface. Software and data transferred via the communications interface 1218 are in the form of signals which may be electronic, electromagnetic, optical, or other signals capable of being received by the communications interface 1218. These signals are provided to the communications interface 1218 via a channel 1220. The channel 1220 may carry signals and may be implemented using a wireless medium, wire or cable, fiber optics, or other communications medium. Some examples of the channel 1220 may include a phone line, a cellular phone link, an RF link, a Bluetooth link, a network interface, a local or wide area network, and other communications channels.
The computing system 1200 may further include Input/Output (I/O) devices 1222. Examples may include, but are not limited to a display, keypad, microphone, audio speakers, vibrating motor, LED lights, and other similar I/O devices. The I/O devices 1222 may receive input from a user and also display an output of the computation performed by the processor 1202. In this document, the terms “computer program product” and “computer-readable medium” may be used generally to refer to media such as, for example, the memory 1206, the storage devices 1208, the removable storage unit 1214, or signal(s) on the channel 1220. These and other forms of computer-readable media may be involved in providing one or more sequences of one or more instructions to the processor 1202 for execution. Such instructions, generally referred to as “computer program code” (which may be grouped in the form of computer programs or other groupings), when executed, enable the computing system 1200 to perform features or functions of embodiments of the present invention.
In an embodiment where the elements are implemented using software, the software may be stored in a computer-readable medium and loaded into the computing system 1200 using, for example, the removable storage unit 1214, the media drive 1210 or the communications interface 1218. The control logic (in this example, software instructions or computer program code), when executed by the processor 1202, causes the processor 1202 to perform the functions of the invention as described herein.
As will be appreciated by those skilled in the art, the techniques described in the various embodiments discussed above are not routine, or conventional, or well understood in the art. The techniques discussed above provide for training users to perform physical activities. The techniques first render, via a Graphical User Interface (GUI) of a rendering device, a plurality of activity type options to a user. Each of the plurality of activity type options includes a plurality of activities. The techniques may then receive an activity option selected by the user from the plurality of activity options in response to a user input. The user input includes at least one of a gesture, a touch, or an audio command. The techniques may then capture, by at least one camera, a real-time video of the user performing the activity based on the selected activity option. Each of the at least one camera captures the real-time video of the user from associated predefined angles. The real-time video includes a stream of poses and movements made by the user to perform the activity. The techniques may then extract an AI model based on the activity option selected by the user. The AI model is configured to determine a deviation of the user from a plurality of correct movements associated with an activity corresponding to the activity option based on target activity performance of an activity expert. The techniques may then process in real-time, by the AI model, the real-time video of the user to determine a set of user performance parameters based on current activity performance of the user. The techniques may then overlay, by the AI model, the user in the real-time video with a pose skeletal model. The pose skeletal model includes a plurality of key points based on the activity. Each of the plurality of key points is overlayed over a corresponding joint of the user in the real-time video.
The techniques may then compare, by the AI model, the set of user performance parameters with a set of target activity performance parameters. The set of target activity performance parameters corresponds to the activity expert. The techniques may then generate, by the AI model, feedback for the user based on comparison of the set of user performance parameters with the set of target activity performance parameters. The feedback includes at least one of corrective actions or alerts. The feedback includes at least one of visual feedback, aural feedback, or haptic feedback. The techniques may then render, by the AI model, the feedback on the rendering device. Rendering the feedback includes overlaying one of the at least one corrective actions over the pose skeletal model overlayed on the real-time video of the user. Rendering the feedback further includes displaying the alerts on the GUI of the rendering device. Rendering the feedback further includes outputting the aural feedback to the user, via a speaker.
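By way of illustration only, the comparison of the set of user performance parameters with the set of target activity performance parameters, and the resulting feedback, may be sketched as follows. The sketch assumes that each parameter is a single numeric value and that deviation is measured relative to the target value. The Feedback class, the function compare_parameters, and the tolerance and alert_threshold values are hypothetical and are not taken from the disclosed embodiments.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Feedback:
    parameter: str
    kind: str      # "corrective_action" or "alert"
    message: str


def compare_parameters(
    user: Dict[str, float],
    target: Dict[str, float],
    tolerance: float = 0.10,        # assumed: deviations up to 10% are acceptable
    alert_threshold: float = 0.30,  # assumed: deviations beyond 30% raise an alert
) -> List[Feedback]:
    """Compare user parameters with target parameters and build feedback items."""
    feedback: List[Feedback] = []
    for name, target_value in target.items():
        # Skip parameters that were not measured or cannot be normalized.
        if name not in user or target_value == 0:
            continue
        deviation = (user[name] - target_value) / abs(target_value)
        if abs(deviation) <= tolerance:
            continue
        direction = "above" if deviation > 0 else "below"
        kind = "alert" if abs(deviation) > alert_threshold else "corrective_action"
        feedback.append(
            Feedback(name, kind, f"{name} is {abs(deviation):.0%} {direction} target")
        )
    return feedback
```

In this sketch, a moderate deviation yields a corrective action and a large deviation yields an alert. How each feedback item is rendered, whether as visual, aural, or haptic feedback, is left to the rendering device.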
In light of the above-mentioned advantages and the technical advancements provided by the disclosed method and system, the claimed steps as discussed above are not routine, conventional, or well understood in the art, as the claimed steps enable the solutions described above to the existing problems in conventional technologies. Further, the claimed steps clearly bring an improvement in the functioning of the device itself as the claimed steps provide a technical solution to a technical problem.
The specification has described a method and system for training users to perform physical activities. The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, and deviations, of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments.
Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
It is intended that the disclosure and examples be considered as exemplary only, with a true scope and spirit of disclosed embodiments being indicated by the following claims.
Claims
1. A method for training users to perform physical activities, the method comprising:
- rendering, via a Graphical User Interface (GUI) of a rendering device, a plurality of activity type options to a user, wherein each of the plurality of activity type options comprises a plurality of activities;
- receiving an activity option selected by the user from the plurality of activity options in response to a user input, wherein the user input comprises at least one of a gesture, a touch, an audio command, or a signal generated from an input device;
- capturing, by at least one camera, a real-time video of the user performing the activity based on the selected activity option, wherein each of the at least one camera captures the real-time video of the user from associated predefined angles, and wherein the real-time video comprises a stream of poses and movements made by the user to perform the activity;
- extracting an AI model based on the activity option selected by the user, wherein the AI model is configured to determine a deviation of the user from a plurality of correct movements associated with an activity corresponding to the activity option based on target activity performance of an activity expert;
- processing in real-time, by the AI model, the real-time video of the user to determine a set of user performance parameters based on current activity performance of the user;
- overlaying, by the AI model, the user in the real-time video with a pose skeletal model, wherein the pose skeletal model comprises a plurality of key points based on the activity, and wherein each of the plurality of key points is overlayed over a corresponding joint of the user in the real-time video;
- comparing, by the AI model, the set of user performance parameters with a set of target activity performance parameters, wherein the set of target activity performance parameters corresponds to the activity expert;
- generating, by the AI model, feedback for the user based on comparison of the set of user performance parameters with the set of target activity performance parameters, wherein the feedback comprises at least one of corrective actions or alerts, and wherein the feedback comprises at least one of visual feedback, aural feedback, or haptic feedback; and
- rendering, by the AI model, the feedback on the rendering device, wherein rendering the feedback comprises: overlaying one of the at least one corrective actions over the pose skeletal model overlayed on the real-time video of the user; displaying the alerts on the GUI of the rendering device; and outputting the aural feedback to the user, via a speaker.
2. The method of claim 1, wherein overlaying the user in the real-time video with the pose skeletal model comprises:
- automatically adjusting and normalizing the pose skeletal model based on a current pose and estimated future distance of the user relative to the rendering device and a current pose and estimated future pose and motion of the user.
3. The method of claim 2, wherein the feedback comprises generating a warning to the user comprising:
- indication for correcting the current pose of the user;
- indication for correcting user motion associated with the current pose of the user; and
- indication for correcting the current position of the user, when the user is at least partially outside a field of view of the at least one camera.
4. The method of claim 1, further comprising displaying the set of user performance parameters, the set of target activity performance parameters, and the target activity performance of the activity expert through the GUI of the rendering device.
5. The method of claim 4, further comprising overlaying, in real-time, the target activity performance over the current activity performance of the user in the real-time video.
6. The method of claim 1, wherein the set of user performance parameters comprises speed of the current activity performance, number of repetitions completed, overall completion of an activity circuit, third-party smart device information, pulse rate of the user, blood pressure of the user, and motion of the user, and wherein the set of target activity performance parameters comprises speed of the target activity performance, target number of repetitions, target pulse rate of the user, and target motion of the user.
7. The method of claim 1, further comprising:
- rendering at least one of a trainer avatar corresponding to the target activity performance of the activity expert and a user avatar corresponding to the current activity performance of the user upon capturing the real-time video through the at least one camera, wherein the trainer avatar is a 3-Dimensional (3D) model of the activity expert and the user avatar is a multidimensional model of the user.
8. The method of claim 7, further comprising:
- comparing at least one of the trainer avatar and the user avatar corresponding to the target activity performance with the current activity performance of the user based on the activity type and the activity; and
- displaying the set of user performance parameters, at least one of the trainer avatar and the user avatar corresponding to the target activity performance, and the current activity performance of the user through the GUI via the rendering device, wherein the current activity performance is overlayed over at least one of the trainer avatar and the user avatar corresponding to the target activity performance in real-time.
9. The method of claim 1, further comprising:
- storing the real-time video received from the at least one camera in a database; and
- editing the real-time video based on one or more user commands of the user, wherein the user command is at least one of a text command, voice command, touch command, or a visual gesture, wherein the one or more user commands comprise at least one of: setting a start point of the real-time video; setting an end point of the real-time video; removing background from the real-time video; assigning one or more tags to the real-time video; and sharing the real-time video with a set of other users.
10. The method of claim 1, further comprising:
- receiving real-time video from the activity expert corresponding to the target activity performance in real-time in response to the current activity performance of the user; and
- displaying the set of user performance parameters and the target activity performance of the activity expert in real-time through the GUI via the rendering device, wherein the target activity performance is overlayed over the current activity performance of the user in the real-time video.
11. The method of claim 1, further comprising:
- pausing the display of the set of user performance parameters and the target activity performance when the current activity performance deviates from the target activity performance above a predefined threshold performance for a predefined threshold time based on the comparing; and
- generating, by the AI model, feedback for the user based on comparison of the set of user performance parameters with the set of target activity performance parameters.
12. The method of claim 1, further comprising:
- detecting, via the at least one camera, an initial position of the user;
- determining whether the detected initial position of the user matches an initial position mapped to the at least one activity; and
- instructing the user to correct the initial position, when the detected initial position fails to match the initial position.
13. The method of claim 1, wherein the user command comprises at least one of a voice command, a touch gesture, an air gesture, eye gesture, or a signal generated by an input device.
14. A rendering device for training users to perform physical activities, the rendering device comprising:
- a display comprising a Graphical User Interface (GUI), wherein the GUI is configured to: render a plurality of activity type options to a user, wherein each of the plurality of activity type options comprises a plurality of activities; and receive an activity option selected by the user from the plurality of activity options in response to a user input, wherein the user input comprises at least one of a gesture, a touch, an audio command, or a signal generated from an input device;
- at least one camera configured to capture a real-time video of the user performing the activity based on the selected activity option, wherein each of the at least one camera captures the real-time video of the user from associated predefined angles, and wherein the real-time video comprises a stream of poses and movements made by the user to perform the activity;
- a processor; and
- a memory communicatively coupled to the processor, wherein the memory stores processor instructions, which when executed by the processor, cause the processor to: extract an AI model based on the activity option selected by the user, wherein the AI model is configured to determine a deviation of the user from a plurality of correct movements associated with an activity corresponding to the activity option based on target activity performance of an activity expert; process in real-time, by the AI model, the real-time video of the user to determine a set of user performance parameters based on current activity performance of the user; overlay, by the AI model, the user in the real-time video with a pose skeletal model, wherein the pose skeletal model comprises a plurality of key points based on the activity, and wherein each of the plurality of key points is overlayed over a corresponding joint of the user in the real-time video; compare, by the AI model, the set of user performance parameters with a set of target activity performance parameters, wherein the set of target activity performance parameters corresponds to the activity expert; generate, by the AI model, feedback for the user based on comparison of the set of user performance parameters with the set of target activity performance parameters, wherein the feedback comprises at least one of corrective actions or alerts, and wherein the feedback comprises at least one of visual feedback, aural feedback, or haptic feedback; and render, by the AI model, the feedback on the rendering device, wherein rendering the feedback comprises: overlaying one of the at least one corrective actions over the pose skeletal model overlayed on the real-time video of the user; displaying the alerts on the GUI of the rendering device; and outputting the aural feedback to the user, via a speaker.
15. The rendering device of claim 14, wherein to overlay the user in the real-time video with the pose skeletal model, the processor instructions, on execution, cause the processor to:
- automatically adjust and normalize the pose skeletal model based on a current pose and estimated future distance of the user relative to the rendering device and a current pose and estimated future pose and motion of the user, wherein the feedback comprises generating a warning to the user comprising: indication for correcting the current pose of the user; indication for correcting user motion associated with the current pose of the user; and indication for correcting the current position of the user, when the user is at least partially outside a field of view of the at least one camera.
16. The rendering device of claim 14, wherein the processor instructions, on execution, cause the processor to:
- display the set of user performance parameters, the set of target activity performance parameters, and the target activity performance of the activity expert through the GUI of the rendering device; and
- overlay, in real-time, the target activity performance over the current activity performance of the user in the real-time video.
17. The rendering device of claim 14, wherein the processor instructions, on execution, cause the processor to:
- render at least one of a trainer avatar corresponding to the target activity performance of the activity expert and a user avatar corresponding to the current activity performance of the user upon capturing the real-time video through the at least one camera, wherein the trainer avatar is a 3-Dimensional (3D) model of the activity expert and the user avatar is a multidimensional model of the user;
- compare at least one of the trainer avatar and the user avatar corresponding to the target activity performance with the current activity performance of the user based on the activity type and the activity; and
- display the set of user performance parameters, at least one of the trainer avatar and the user avatar corresponding to the target activity performance, and the current activity performance of the user through the GUI via the rendering device, wherein the current activity performance is overlayed over at least one of the trainer avatar and the user avatar corresponding to the target activity performance in real-time.
18. The rendering device of claim 14, wherein the processor instructions, on execution, cause the processor to:
- store the real-time video received from the at least one camera in a database; and
- edit the real-time video based on one or more user commands of the user, wherein the user command is at least one of a text command, voice command, touch command, or a visual gesture, wherein the one or more user commands comprise at least one of: setting a start point of the real-time video; setting an end point of the real-time video; removing background from the real-time video; assigning one or more tags to the real-time video; and sharing the real-time video with a set of other users.
19. The rendering device of claim 14, wherein the processor instructions, on execution, cause the processor to:
- receive real-time video from the activity expert corresponding to the target activity performance in real-time in response to the current activity performance of the user; and
- display the set of user performance parameters and the target activity performance of the activity expert in real-time through the GUI via the rendering device, wherein the target activity performance is overlayed over the current activity performance of the user in the real-time video.
20. The rendering device of claim 14, wherein the processor instructions, on execution, cause the processor to:
- detect, via the at least one camera, an initial position of the user;
- determine whether the detected initial position of the user matches an initial position mapped to the at least one activity; and
- instruct the user to correct the initial position, when the detected initial position fails to match the initial position.
Type: Application
Filed: Sep 6, 2021
Publication Date: Mar 10, 2022
Inventor: Rajiv Trehan (Bangkok)
Application Number: 17/467,381