M2TD is a web app that allows a user to create music through body movements. Specific motions are mapped to distinct sounds, ensuring that the user is always in rhythm.
In M2TDRevolution, boxes appear on the screen, and moving the desired body part into them activates a sound.
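The box-trigger idea reduces to a point-in-rectangle check per tracked body part. Below is a minimal illustrative sketch; the `Box` class and `check_boxes` helper are hypothetical names for this write-up, not taken from the repo.

```python
# Illustrative sketch of the M2TDRevolution box trigger: a sound fires when a
# tracked body part's normalized (x, y) position falls inside a target box.
from dataclasses import dataclass

@dataclass
class Box:
    # Target region in normalized screen coordinates (0.0 - 1.0).
    x1: float
    y1: float
    x2: float
    y2: float
    body_part: str   # e.g. "left_wrist" (illustrative label)
    sound: str       # path to the wav file to trigger

def check_boxes(landmarks, boxes):
    """Return the sounds to play for every box whose body part is currently inside it."""
    hits = []
    for box in boxes:
        if box.body_part not in landmarks:
            continue
        x, y = landmarks[box.body_part]   # normalized (x, y) of the tracked part
        if box.x1 <= x <= box.x2 and box.y1 <= y <= box.y2:
            hits.append(box.sound)
    return hits
```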
Github
https://github.com/betcherj/M2TD
Stack
- Capture video from webcam → OpenCV
- Object recognition + motion capture → BlazePose
    - https://github.com/CMU-Perceptual-Computing-Lab/openpose/ → 2D human skeleton motion capture
    - https://github.com/freemocap/freemocap → open source motion capture library
    - MultiPerson3DPoseEstimation.pdf
- Audio output
    - simpleaudio python package for playing wav files downloaded from freesound
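As a rough sketch of how these pieces fit together, the loop below grabs frames with OpenCV, runs MediaPipe's Pose solution (which uses BlazePose under the hood), and plays a wav with simpleaudio when a chosen landmark is visible. The wav path, the choice of landmark, and the 0.5 visibility threshold are placeholder assumptions, not values from the repo.

```python
# Minimal end-to-end sketch of the stack: OpenCV captures webcam frames, MediaPipe's
# Pose solution (BlazePose) finds landmarks, and simpleaudio plays a wav when a
# chosen body part is confidently in frame.
import cv2
import mediapipe as mp
import simpleaudio as sa

pose = mp.solutions.pose.Pose(model_complexity=0)       # lightweight BlazePose model
wave = sa.WaveObject.from_wave_file("sounds/beep.wav")   # hypothetical sound file
play_obj = None

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB input; OpenCV captures BGR.
    results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.pose_landmarks:
        wrist = results.pose_landmarks.landmark[mp.solutions.pose.PoseLandmark.LEFT_WRIST]
        # Trigger the sound when the wrist is visible and nothing is already playing.
        if wrist.visibility > 0.5 and (play_obj is None or not play_obj.is_playing()):
            play_obj = wave.play()                       # non-blocking playback
    cv2.imshow("M2TD", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```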
Progress
| Sprint | Goal | Notes | Blockers | Review |
| --- | --- | --- | --- | --- |
| Jan 11th - Jan 25th | ~~Decide on stack~~ <br> ~~Get streaming to beep when I wave~~ <br> Run a lightweight OpenPose on a webcam that plays downloaded wav files asynchronously when certain body parts appear on screen | Installed the full version of OpenPose, which runs at 0.2 FPS on my laptop <br> Added the ability to read wav files from config | Need a good way to define and match motions (combinations of frames) in real time | Was able to combine pose tracking via webcam with computer audio so that sounds are mapped to specific body parts appearing in the frame. The current pose tracking is imperfect and works best when more of the body is in frame. |
| Jan 25th - Feb 8th | Design and create a method for tracking movements (i.e. rolling windows of poses) <br> Draw boxes on the screen that indicate targets; move a body part through a box to activate its sound effect <br> Switch movement tracking to BlazePose for better performance <br> Devise a way of mapping on-screen recordings to vectors | Concerned that code added to the same thread as the pose tracking will cause a computation slowdown (it is important for this to be real time) <br> Is it possible to define human movement as vectors? | Need to determine a way to normalize the motion vector matching based on distance from the camera → having access to the 3D pose could help here <br> Need to overlay sounds on top of each other, possibly with pydub | Added the ability to store vectorized movements during a screen recording and map them to a specific sound. The switch from OpenPose to BlazePose increased the frame rate to near real-time speeds. The look-back period for vector tracking takes some tweaking, but set correctly it can create a smooth experience. Going forward, the goal is to map more complex movements instead of simple vectors. |
| Feb 8th - 22nd | Implement a similarity algorithm to compare the captured movements to the recorded movements (should be doable with numpy vectors; a rough sketch follows the table) <br> Figure out a good way to break down and download the component parts of a song <br> Find an easier way to load sounds | | | |
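A rough sketch of the rolling-window tracking and numpy similarity matching described in the sprints above. The window length, the similarity threshold, and the function names are illustrative assumptions for this write-up, not the repo's actual implementation.

```python
# Sketch of rolling-window movement matching: keep the last N landmark positions,
# reduce the window to a displacement vector, and compare it against pre-recorded
# movement vectors with cosine similarity.
from collections import deque
import numpy as np

WINDOW = 10          # number of recent frames to look back over (assumed)
THRESHOLD = 0.9      # minimum cosine similarity to count as a match (assumed)

history = deque(maxlen=WINDOW)   # each entry: np.array([x, y]) for one body part

def movement_vector(positions):
    """Displacement of the body part over the window (end minus start)."""
    return positions[-1] - positions[0]

def best_match(current, recorded):
    """Return (name, similarity) of the recorded movement closest in direction."""
    best_name, best_sim = None, -1.0
    for name, vec in recorded.items():
        denom = np.linalg.norm(current) * np.linalg.norm(vec)
        if denom == 0:
            continue
        sim = float(np.dot(current, vec) / denom)
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name, best_sim

# Per frame: history.append(np.array([wrist.x, wrist.y])); once the window is full,
# compute movement_vector(list(history)) and play the matched sound if the best
# similarity exceeds THRESHOLD.
```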
Todo
Will need to figure out how to map wav files to movements / positions
- Want to check whether the sound is still playing and decide whether to restart the audio or ignore the movement (see the sketch after this list)
- In general, the audio should be quick enough that the movement cannot be repeated more rapidly than it plays
- Want to identify movements rather than just body parts
- Let's start with a change over a certain magnitude triggering a sound
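A hedged sketch of the two ideas above: gate retriggering on whether the sound is still playing, and trigger on frame-to-frame motion above a magnitude threshold. The threshold value and the `TriggeredSound` class name are assumptions for illustration, not part of the repo.

```python
# Sketch: only retrigger a sound once it has finished, and trigger on motion whose
# magnitude exceeds a threshold rather than on a body part merely being present.
import numpy as np
import simpleaudio as sa

MIN_MAGNITUDE = 0.05   # normalized-coordinate movement needed to trigger (assumed)

class TriggeredSound:
    def __init__(self, wav_path):
        self.wave = sa.WaveObject.from_wave_file(wav_path)
        self.play_obj = None
        self.last_pos = None

    def update(self, pos):
        """Call once per frame with the tracked body part's (x, y) position."""
        if self.last_pos is not None:
            moved = np.linalg.norm(pos - self.last_pos)
            still_playing = self.play_obj is not None and self.play_obj.is_playing()
            # Ignore the movement while the sound is playing; otherwise retrigger it.
            if moved > MIN_MAGNITUDE and not still_playing:
                self.play_obj = self.wave.play()
        self.last_pos = pos
```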
Notes
OpenPose installation