Rendering Breakout-v0 in Google Colab with colabgymrender

I’ve released a module for rendering your gym environments in Google Colab. Since Colab runs on a VM instance, which doesn’t include any sort of display, rendering in the notebook is difficult. After looking through the various approaches, I found that using the moviepy library was best for rendering video in Colab. So I built a wrapper class for this purpose, called colabgymrender.

Installation
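A minimal Colab install cell, assuming the PyPI package shares the module’s name and that you also want gym’s Atari extras for Breakout-v0:

```python
# Install the wrapper (package name assumed to match the module) and gym's Atari extras
!pip install colabgymrender
!pip install gym[atari]
```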

Example

Output

Qbert-v0

Usage

Wrap a gym environment in the Recorder object.
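Here’s a sketch of what that looks like; the Recorder constructor and play() call follow the pattern in the library’s README, so treat the exact signature as an assumption:

```python
import gym
from colabgymrender.recorder import Recorder

env = gym.make("Breakout-v0")
env = Recorder(env, "./video")  # wrap the environment; episode video is written to this directory

observation = env.reset()
done = False
while not done:
    action = env.action_space.sample()                  # random policy, just to produce frames
    observation, reward, done, info = env.step(action)

env.play()  # render the recorded episode inline in the Colab notebook
```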

If you specify a frame…


Mouse automatically navigating to a coordinate according to eye position (Image by author)

A Machine Learning approach to eye pose estimation from just a single front-facing perspective as input

In this project, we’ll write code to crop images of your eyes each time you click the mouse. Using this data, we can train a model to do the reverse: predict the position of the mouse from an image of your eyes.

We’ll need a few libraries
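A plausible import cell (pynput is introduced below; OpenCV and NumPy are assumptions about what a webcam-cropping project like this would use):

```python
import cv2                # webcam capture and eye cropping (assumed)
import numpy as np        # array handling (assumed)
from pynput import mouse  # mouse listener, introduced below
```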

Let’s first learn how pynput’s Listener works.

pynput.mouse.Listener creates a background thread that records mouse…
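For instance, a minimal listener that reacts to clicks might look like this; the on_click callback signature is pynput’s standard one, and the actual eye-cropping step is only hinted at in a comment:

```python
from pynput import mouse

def on_click(x, y, button, pressed):
    # Called from the listener's background thread on every press and release
    if pressed:
        print(f"Mouse clicked at ({x}, {y})")
        # ...this is where we'd grab a webcam frame and crop the eye region

listener = mouse.Listener(on_click=on_click)
listener.start()  # returns immediately; events are handled on the background thread
listener.join()   # keep the main thread alive so the listener keeps running
```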


Here, I’ll walk through an introductory project on tabular Q-learning. We’ll train a simple RL agent to evaluate tic-tac-toe positions and return the best move, learning by playing against itself over many games.

First, let’s import the required libraries
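The exact import list isn’t shown in this excerpt; a tabular Q-learning script like this typically needs little more than NumPy and the standard library, so the following is an assumption:

```python
import random                        # epsilon-greedy exploration and self-play moves (assumed)
import numpy as np                   # Q-value arrays (assumed)
from collections import defaultdict  # the Q-table itself (assumed)
```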

Note that tabular Q-learning only works for environments that can be represented by a reasonable number of actions and states. Tic-tac-toe has 9 squares, each of which can be either an X, an O, or empty, so there are at most 3⁹ = 19,683 states (and 9 actions, of course). Therefore, we have a table…
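Concretely, one common way to hold such a table is a dictionary keyed by a string encoding of the board, mapping to the 9 action values. The encoding below is illustrative, not necessarily the one used in the full article:

```python
import numpy as np
from collections import defaultdict

# Each state is the board flattened to a 9-character string of 'X', 'O', or '-';
# each entry holds one Q-value per square (9 possible actions).
Q = defaultdict(lambda: np.zeros(9))

state = "X-O------"                     # X in the top-left, O in the top-right corner
best_action = int(np.argmax(Q[state]))  # greedy move for this position
```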


Generating Images From Line Drawings With ML

Here, I’ll walk through, in a tutorial-like manner, a machine learning project I recently completed: an approach to generating full images in an artistic style from line drawings.

Dataset

I trained on 10% of the ImageNet dataset, which is commonly used for benchmarks in computer vision tasks. The full ImageNet dataset is not openly available; access is restricted to researchers who need it to compute performance benchmarks for comparison with other approaches, so you are typically required to submit a request form. But if you are just using it casually, it is available…


Wav2Lip Model Architecture (https://arxiv.org/pdf/2008.10010v1.pdf)

This paper proposes Wav2Lip, an adaptation of the SyncNet model, which outperforms all prior speaker-independent approaches to the task of lip-syncing video to audio.

The authors note that, while prior approaches typically fail to generalize when presented with video of speakers not present in the training set, Wav2Lip is capable of producing accurate lip movements with a variety of speakers.

They go on to summarize the primary intentions of the paper:

  1. Identify the cause of prior approaches failing to generalize to a variety of speakers.
  2. Resolve these issues by incorporating a powerful lip-sync discriminator.
  3. Propose new benchmarks for evaluating the performance of approaches…

Ryan Rudes

I am a high-school student studying and doing research in ML. I work on projects in Python, mostly using TensorFlow.
