Automating Video

OpenCV as KNN


As an implementation of KNN, I wanted to use the face coordinates from OpenCV's facial detection as the query points. After experimenting with Sam Levine's video library VidPy, I first tried applying this to a video clip by accessing the pixel data with a nested for loop, where the KNN categories would change with every frame. I took a step back to better understand the process and settled on working with images. I'm going to open a GitHub issue with cv2 after George Harrison wasn't recognized.
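The core idea can be sketched with numpy alone: treat each detected face center as a category and assign every pixel to its nearest one (1-NN). The face centers below are hard-coded stand-ins for what you'd get from a detector like cv2.CascadeClassifier's detectMultiScale; the function name is mine.

```python
import numpy as np

def label_pixels_by_nearest_face(h, w, face_centers):
    """Assign every pixel in an h x w image to its nearest face center (1-NN).

    face_centers: list of (x, y) points, e.g. centers of the bounding
    boxes a face detector returns (that upstream step is not shown).
    Returns an (h, w) array of category indices.
    """
    ys, xs = np.mgrid[0:h, 0:w]                        # pixel coordinates
    pts = np.stack([xs, ys], axis=-1).astype(float)    # (h, w, 2)
    centers = np.asarray(face_centers, dtype=float)    # (k, 2)
    # squared distance from every pixel to every face center
    d2 = ((pts[:, :, None, :] - centers[None, None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=-1)                          # (h, w) labels

# two "faces": one at the top-left corner, one at the bottom-right
labels = label_pixels_by_nearest_face(4, 6, [(0, 0), (5, 3)])
```

With per-frame detections, re-running this each frame is exactly why the categories shifted constantly in the video version.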

Screen Shot 2017-11-20 at 2.19.51 PM.png

That's more like it

frame999932 copy.jpg

Music Video Battle


After seeing VidPy's "chroma" feature I knew I wanted to experiment with overlaps and color masks. This project took several detours as I explored a few approaches, and it turned into an exercise (really, a lesson) in thinking before doing.

I first played around with OpenCV with the mindset that portraits would open some interesting avenues. I was excited to combine K Nearest Neighbors with OpenCV (where each face would be a point of interest) and have the chroma effects tied to each person. Sure enough, simply figuring out numpy arrays was challenging enough, and I settled on manually creating a chroma effect by reading and manipulating the pixel values of a video.
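A hand-rolled chroma key really comes down to one numpy comparison: mark every pixel whose color is within some distance of the key color. This is a minimal sketch of that idea (the function name and tolerance value are illustrative, not VidPy's API):

```python
import numpy as np

def manual_chroma_mask(frame, key_rgb, tol=40):
    """Return a boolean mask of pixels within `tol` of the key color.

    A hand-rolled stand-in for a chroma key: compare each pixel's
    Euclidean RGB distance to the key color instead of calling a
    built-in filter.
    """
    diff = frame.astype(int) - np.asarray(key_rgb, dtype=int)
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist < tol

frame = np.zeros((2, 2, 3), dtype=np.uint8)   # all-black test frame
frame[0, 0] = (0, 255, 0)                     # one pure green pixel
mask = manual_chroma_mask(frame, (0, 255, 0))
```

Masked pixels can then be overwritten (or made transparent) to let another layer show through.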

I decided to tie the rewriting of pixel values to the volume of the audio from a music video. In each music video, the volume at any given frame determines how much pixel information is masked in red. In a separate script, using VidPy's chroma feature, I then removed those masked areas and combined multiple videos, hoping there would be a push and pull between the videos as they fight over space. It was an interesting experiment, but the resulting video falls flat.
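The volume-to-mask step can be sketched as: given a loudness value normalized to 0..1 (e.g. per-frame RMS of the audio track, extraction not shown), paint that fraction of the frame's pixels pure red. Masking the darkest pixels first is my illustrative choice here, not necessarily the ordering the original script used.

```python
import numpy as np

def red_mask_by_volume(frame, volume):
    """Mask a volume-proportional fraction of a frame's pixels in red.

    `volume` is assumed pre-normalized to 0..1. The darkest pixels are
    masked first (an illustrative choice); the red areas are what the
    chroma pass later knocks out.
    """
    out = frame.reshape(-1, 3).copy()
    brightness = out.mean(axis=-1)
    k = int(round(volume * brightness.size))     # how many pixels to mask
    out[np.argsort(brightness)[:k]] = (255, 0, 0)
    return out.reshape(frame.shape)

frame = np.full((4, 4, 3), 200, dtype=np.uint8)
frame[0, 0] = frame[0, 1] = (0, 0, 0)            # two dark pixels
masked = red_mask_by_volume(frame, 2 / 16)       # mask 2 of 16 pixels
```

Running this per frame, louder passages eat more of the image, which is what sets up the fight over space when the chroma'd videos are layered.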

Screen Shot 2017-11-16 at 10.54.38 AM.png
Screen Shot 2017-11-16 at 12.54.20 PM.png
Screen Shot 2017-11-16 at 2.45.18 PM.png
frame143 copy.jpg
frame65 copy.jpg

ASL Translator


While playing around with OpenCV I was interested in hand gestures and looked to develop a tool that would recognize/translate American Sign Language gestures. I started with a simple script that counted digits by applying a brightness threshold (to find the hand blob) and counting the negative spaces.
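The threshold-then-count idea can be reduced to a toy: threshold one scanline of a grayscale image and count the bright runs, which stand in for fingers crossing that line. The real script worked on full frames with contours; this 1-D version (function name mine) just shows the principle.

```python
import numpy as np

def count_fingers_scanline(gray_row, thresh=128):
    """Count bright runs on one grayscale scanline.

    A toy version of the digit counter: pixels above `thresh` are
    'hand', and each contiguous bright run is counted as one finger,
    with the dark gaps between runs acting as the negative spaces.
    """
    fg = (gray_row > thresh).astype(int)
    # a run starts wherever the signal steps 0 -> 1 (pad a 0 on the left)
    starts = np.flatnonzero(np.diff(np.concatenate(([0], fg))) == 1)
    return len(starts)

# three bright "finger" runs separated by dark gaps
fingers = count_fingers_scanline(np.array([0, 200, 200, 0, 200, 0, 0, 200, 200, 200]))
```

The full-frame version is the same counting problem in 2-D, which is where contour analysis takes over.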

This was a little limiting, and I soon found a project by Shubham Gupta that recognized multiple hand shapes by training an ML model on a dataset of hundreds of images. It worked pretty well under exactly the right lighting conditions, but its blob tracker was based on pixel color (a skin tone) and didn't isolate the hand as well. So after updating some of the script to work with the latest version of OpenCV, I brought in the blob tracker from the digit counter so it would track based on brightness threshold instead.

Now that I have this up and running I'm looking to retrain the model on other gestures, maybe as controls for something else.

Mandelbrot 360


With the goal of creating an immersive environment, I set out to show recursion in a 360 video. I found a fairly common fractal zoom of the Mandelbrot set as my source video file and began writing a function that mirrored the clip into four quadrants. The resulting file became the input to the same function, so the original clip was recursively multiplied into many smaller, symmetrical copies of itself. The transitions between clips felt a little sudden, so I began animating the clips to blend into the top-left quadrant (where the footage is the same).
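Per frame, the mirroring step can be sketched like this: shrink the frame to half size, then tile it left/right and top/bottom as mirror images. Feeding the output back in as the next input is the recursion. The slicing downscale here is a crude stand-in for a proper resize, and the function name is mine.

```python
import numpy as np

def mirror_quadrants(frame):
    """Tile a frame into four mirrored quadrants at half size.

    Applying this function to its own output recursively multiplies
    the image into smaller symmetric copies of itself.
    """
    small = frame[::2, ::2]                      # naive half-size downscale
    top = np.hstack([small, small[:, ::-1]])     # original + horizontal mirror
    return np.vstack([top, top[::-1, :]])        # plus the vertical mirror

frame = np.arange(16).reshape(4, 4)
once = mirror_quadrants(frame)    # same 4x4 size, four mirrored copies
twice = mirror_quadrants(once)    # recursion: the copies multiply again
```

Because the top-left quadrant is un-mirrored footage, blending each generation into that quadrant is the natural place to hide the cut.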

Screen Shot 2017-11-02 at 10.30.40 AM.png