I’ve released a module for rendering your
gym environments in Google Colab. Since Colab runs on a VM instance, which doesn’t include any sort of display, rendering in the notebook is difficult. After looking through the various approaches, I found that the moviepy library worked best for rendering video in Colab. So I built a wrapper class for this purpose, called Recorder. To install it in a Colab notebook:
apt-get install -y xvfb python-opengl ffmpeg > /dev/null 2>&1
pip install -U colabgymrender
Wrap a gym environment in the Recorder class:

import gym
from colabgymrender.recorder import Recorder

env = gym.make("CartPole-v0")
env = Recorder(env, <directory>, <fps>)
If you specify a frame…
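For context, a full recording run might look something like the sketch below. I’m using a random policy and the old 4-tuple gym step API, and I’m assuming the wrapper exposes a play() method to display the recorded video in the notebook; "./video" and 30 fps are arbitrary choices for illustration.

import gym
from colabgymrender.recorder import Recorder

env = gym.make("CartPole-v0")
env = Recorder(env, "./video", fps=30)  # directory and fps chosen arbitrarily

observation = env.reset()
terminal = False
while not terminal:
    # A random policy, just to have something worth recording
    action = env.action_space.sample()
    observation, reward, terminal, info = env.step(action)

# Assumed method for displaying the recorded video inline in the notebook
env.play()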
In this project, we’ll write code to crop images of your eyes each time you click the mouse. Using this data, we can train a model to work in the reverse direction: predicting the position of the mouse from an image of your eyes.
We’ll need a few libraries:

# For monitoring the web camera and performing image manipulations
import cv2

# For performing array operations
import numpy as np

# For creating and removing directories
import shutil

# For recognizing and performing actions on mouse presses
from pynput.mouse import Listener
Let’s first learn how pynput’s Listener works.
pynput.mouse.Listener creates a background thread that records mouse…
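As a rough sketch of the idea (not the finished project code; the eye-detection and cropping step is left out), a Listener can grab a webcam frame each time the mouse is pressed:

import cv2
from pynput.mouse import Listener

# Open the default webcam; index 0 is an assumption about your setup
camera = cv2.VideoCapture(0)

def on_click(x, y, button, pressed):
    # The callback fires on both press and release; only act on the press
    if pressed:
        ok, frame = camera.read()
        if ok:
            # Save the raw frame, tagged with the cursor position; in the real
            # project the frame would be cropped to the eyes before saving
            cv2.imwrite(f"frame_{x}_{y}.png", frame)

# The Listener runs in a background thread; join() blocks until it is stopped
with Listener(on_click=on_click) as listener:
    listener.join()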
Here, I’ll walk through an introductory project on tabular Q-learning. We’ll train a simple RL agent to evaluate tic-tac-toe positions and return the best move, learning by playing against itself over many games.
First, let’s import the required libraries
Note that tabular Q-learning only works for environments that can be represented by a reasonably small number of states and actions. Tic-tac-toe has 9 squares, each of which can be an X, an O, or empty, so there are at most 3⁹ = 19,683 states (and 9 actions, of course). Therefore, we have a table…
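To make that concrete, here’s a minimal sketch of what such a table and its update might look like, assuming states are stored as 9-element tuples and using the standard Q-learning update rule. The names and encodings here are illustrative, not the article’s actual code.

import numpy as np

# Hypothetical table: map each board state (a tuple of 9 cells)
# to an array of 9 action values, one per square.
q_table = {}

def get_q(state):
    # state: tuple of 9 ints, e.g. 0 = empty, 1 = X, 2 = O
    if state not in q_table:
        q_table[state] = np.zeros(9)
    return q_table[state]

def update(state, action, reward, next_state, alpha=0.1, gamma=0.9):
    # Standard tabular Q-learning update:
    # Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
    target = reward + gamma * np.max(get_q(next_state))
    get_q(state)[action] += alpha * (target - get_q(state)[action])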
Here, I’ll walk through, in a tutorial-like manner, a machine learning project I recently did: an approach to generating full images in an artistic style from line drawings.

I trained on 10% of the ImageNet dataset, which is commonly used for benchmarking computer vision models. ImageNet is not openly available; access is restricted to researchers who need it to compute performance benchmarks for comparison with other approaches, so you are typically required to submit a request form. But if you are just using it casually, it is available…
This paper proposes Wav2Lip, an adaptation of the SyncNet model, which outperforms all prior speaker-independent approaches to the task of lip-syncing video to audio.
The authors note that, while prior approaches typically fail to generalize when presented with video of speakers not present in the training set, Wav2Lip is capable of producing accurate lip movements with a variety of speakers.
They go on to summarize the primary intentions of the paper:
I am a high-school student studying and doing research in ML. I work on projects in Python, mostly using TensorFlow.